OpenAI held a webcast Monday to roll out a new version of its ChatGPT app, which sounds and acts a lot like the AI in the 2013 Spike Jonze film, Her.
The experience, available on desktop and mobile, is powered by GPT-4o (“GPT-four-oh”), a new version of OpenAI’s GPT-4 large language model. OpenAI says the new model returns answers much faster than GPT-4 and improves on its text, vision, and audio capabilities.
The model is a showcase for OpenAI’s development of multimodal AI. GPT-4o can receive and reason about text, audio, and visual inputs, then deliver outputs in natural language and natural-sounding voice.
OpenAI researcher Mark Chen showed off the new model’s impressive conversational capabilities in a live demo. He told the chatbot that he was nervous about the demo and asked it for advice on calming down. Chen then mock-hyperventilated into his phone, to which the app responded, “Mark! You’re not a vacuum cleaner.” The AI was spontaneous and funny, much like the voice assistant (voiced by Scarlett Johansson) in Her, which has become a North Star for people developing consumer AI.
The app was asked to tell a story with various levels of “drama” in its voice, which it did, convincingly. The AI then told the same story in a stereotypical robot’s voice, and then again in sing-song fashion.
Chen also demonstrated how he could interrupt the AI voice, and it would quickly stop talking. ChatGPT, in other words, is getting more “emotionally” intelligent. This is very similar to what Inflection.ai was developing with its Pi AI app. But Inflection.ai was effectively absorbed by Microsoft, which hired most of its staff and which also reportedly holds a stake of nearly half in OpenAI’s for-profit arm.
The ChatGPT app also has the ability to “see” things and reason about them. Through the phone camera, the app was shown a math problem written on a whiteboard and asked for help in working it out. It was then asked to explain some computer code. The app also did a live translation from Italian to English and back.
The new features in the ChatGPT app will roll out to users of ChatGPT Plus over the next few weeks. OpenAI says it’s also making GPT-4o available to developers through its API. OpenAI’s live-streamed announcement Monday seemed timed to steal some thunder from Google, which is expected to make a series of AI-related announcements at its I/O developer conference Tuesday.