OpenAI has finally rolled out real-time video capabilities for ChatGPT, a feature first demoed nearly seven months ago. The update, part of Advanced Voice Mode, lets users interact with the AI through their phone's camera. ChatGPT can now "see" and respond to the real world in real time, much like a video call with a very knowledgeable friend.
Subscribers to ChatGPT Plus, Team, and Pro can now access this feature in the iOS and Android apps. By tapping the voice icon and then the video icon, users can point their cameras at objects or scenes, and ChatGPT will provide insights, suggestions, or answers in real time. This includes identifying objects and even remembering people who introduce themselves. In a demo, OpenAI showed ChatGPT guiding a user through brewing coffee.
Screen sharing is also now available, allowing users to share their device screens with ChatGPT. This is particularly useful for tasks like troubleshooting technical issues or getting help with a math problem. To activate it, users tap the three-dot menu and select "Share Screen". The feature extends ChatGPT beyond its own app to whatever is on the user's screen, including the browser, similar to Microsoft's Copilot Vision and Google's Project Astra.
The rollout began this week, but access will be staggered. ChatGPT Enterprise and Edu subscribers will have to wait until January, and users in the EU, Switzerland, Iceland, Norway, and Liechtenstein will not have access. The feature was delayed multiple times, reportedly because it was announced before it was ready. The update makes ChatGPT more versatile and interactive, and it may point toward closer collaboration between enterprises and AI agents.
In a fun twist, OpenAI has also launched a "Santa Mode" for the holiday season. Users can access Santa's voice by tapping the snowflake icon next to the prompt bar, although chats with Santa will not be saved in the chat history.
Key Takeaways
This integration of visual capabilities enhances ChatGPT's utility, moving it closer to a truly multimodal AI assistant, similar to how Apple Intelligence integrates with ChatGPT.