OpenAI announced on Monday that it is releasing GPT-4o, a much faster version of its AI model, to the public.
The o in GPT-4o stands for omni, referring to the model’s ability to handle text, speech, and video. Over the next few weeks, it will be rolled out across the company’s developer and consumer-facing products.
In a live-streamed event, OpenAI's chief technology officer, Mira Murati, said that GPT-4o provides "GPT-4-level" intelligence but improves on GPT-4's capabilities across multiple modalities and media. The new model allows ChatGPT to handle 50 languages with improved speed and quality.
Here is a breakdown of some of GPT-4o's capabilities:
- It has been touted as the first multimodal AI model capable of reasoning about text, audio, and vision in real time.
- GPT-4o enables real-time translation, making it easier to communicate across languages without learning them (see the sketch after this list).
- It can serve as a tutor, observing actions and offering real-time guidance through conversations.
- The model's text-to-image capabilities are a significant improvement on what is currently available.
- It can sing and harmonise with another GPT-4o.
- It can be used to provide customer service support, which is likely to shake up the industry.
- It is strong at maths and is reportedly better at solving maths problems than humans. It can also count very fast.
- It can help job seekers prepare for interviews.
- Through the new ChatGPT desktop app, GPT-4o can participate in meetings and summarise them.
- It can interact with pets.
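On the developer side of the rollout mentioned above, GPT-4o is arriving in OpenAI's API. As a rough illustration of the translation use case, the snippet below is a minimal sketch using OpenAI's Python SDK; it assumes the model is exposed under the identifier gpt-4o and that an OPENAI_API_KEY environment variable is set, and it is not an official example from the announcement.

```python
# Minimal sketch: asking GPT-4o for a translation via OpenAI's Python SDK.
# Assumptions: the model identifier "gpt-4o" and an OPENAI_API_KEY
# environment variable; illustrative only, not OpenAI's published example.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Translate 'Good morning, how are you?' into French."}
    ],
)

# Print the model's reply text.
print(response.choices[0].message.content)
```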
In a blog post, Sam Altman, chief executive officer of OpenAI, wrote that the new GPT-4o feels like AI from the movies because of how real it is.
“Getting to human-level response times and expressiveness turns out to be a big change. The original ChatGPT showed a hint of what was possible with language interfaces; this new thing feels viscerally different. It is fast, smart, fun, natural, and helpful,” he said.