ChatGPT’s Advanced Voice Mode, which enables real-time spoken conversations, could soon gain visual capabilities, according to code discovered in the platform’s latest beta build. OpenAI hasn’t officially confirmed the feature’s release, but code in the ChatGPT v1.2024.317 beta, spotted by Android Authority, suggests a “live camera” feature may be imminent.
Early Demonstrations and User Experiences
OpenAI first showcased Advanced Voice Mode’s visual potential in a demo during the May alpha launch. Working from a phone’s camera feed, the system identified a dog, recognized it from previous interactions, recognized its ball, and even inferred the relationship between the two (playing fetch).
Alpha testers quickly embraced the feature. X user Manuel Sainsily effectively used it to answer questions about his new kitten based on the camera’s video feed.
Trying #ChatGPT’s new Advanced Voice Mode that just got released in Alpha. It feels like face-timing a super knowledgeable friend, which in this case was super helpful — reassuring us with our new kitten. It can answer questions in real-time and use the camera as input too! pic.twitter.com/Xx0HCAc4To
— Manuel Sainsily (@ManuVision) July 30, 2024
Beta Release and Competitive Landscape
Advanced Voice Mode entered beta for Plus and Enterprise subscribers in September, albeit without the visual component. Even so, users extensively explored the feature’s vocal capabilities. According to OpenAI, Advanced Voice “offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions.”
Adding visual input would set Advanced Voice Mode apart from competitors such as Google’s Gemini Live, which offers multilingual capabilities, and Meta’s Natural Voice Interactions, which launched at Connect 2024. Neither currently uses camera input.
Expanding Availability
OpenAI recently announced that Advanced Voice Mode is now also available on desktop for ChatGPT Plus subscribers, expanding access beyond the initial mobile-only availability.
Conclusion
If the “live camera” feature ships, Advanced Voice Mode would be able to respond to what the camera sees as well as to what the user says, as in the early dog and kitten demonstrations. That could make conversations with ChatGPT more intuitive and further distinguish it from competitors that remain voice-only.