OpenAI explores text-to-video AI model, Sora

OpenAI, the maker of ChatGPT, has expanded its artificial intelligence capabilities with the launch of Sora, its text-to-video AI model.

Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt, the firm said. It can create complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background, OpenAI explained.

While announcing the launch of Sora on X, Sam Altman, OpenAI’s Chief Executive Officer, told his followers, “We’d like to show you what Sora can do, please reply with captions for videos you’d like to see, and we’ll start making some.”

In a blog post, OpenAI noted that the model is available to red teamers, who will assess critical areas for harms or risks, and to several visual artists, designers, and filmmakers, who will provide feedback on how to advance the model.

The firm also acknowledged the model’s limitations. “The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterwards, the cookie may not have a bite mark,” it said.

Sora is not yet available in OpenAI’s products, with the firm still working with domain experts in areas like misinformation, hateful content, and bias.

Since OpenAI debuted its still-image generator Dall-E in 2021 and its generative AI chatbot ChatGPT in November 2022, it has accrued over 100 million users (helped primarily by ChatGPT’s success). The launch of Sora follows in the footsteps of Google and Meta, which are also working on generative video tools.

Aside from Sora, the firm disclosed on Wednesday that it is experimenting with giving ChatGPT a deeper memory so that it can remember more of its users’ chats.