AI & ML
Multimodal AI with Gemini & GPT-4V
Build AI that sees, reads, and understands
About This Workshop
Build multimodal applications that process images, text, audio, and video together, using Gemini Pro Vision, GPT-4V, and CLIP. You will also build an AI that reads handwritten notes and generates reports from them.
About Shweta Deshpande
Applied AI Researcher at Google India. Worked on Gemini's multimodal capabilities for Indian languages and scripts.
₹99