AI & ML

Multimodal AI with Gemini & GPT-4V

Build AI that sees, reads, and understands

Instructor Shweta Deshpande
Duration 2 hours
Platform Live on Zoom
Level Advanced
Spots Available 10

About This Workshop

Build multimodal applications that process images, text, audio, and video together, using Gemini Pro Vision, GPT-4V, and CLIP. You will also build an AI that reads handwritten notes and generates reports from them.

About Shweta Deshpande

Applied AI Researcher at Google India. Worked on Gemini multimodal capabilities for Indian languages and scripts.

Price ₹99