OpenAI, the artificial intelligence startup behind the chatbot ChatGPT, unveiled a text-to-video model called Sora on Friday (the 16th). The model lets users generate videos up to one minute long from text prompts.
Sora can generate complex scenes with multiple characters, specific types of motion, and precise details of both subject and background. OpenAI showcased several Sora-generated videos on its official website, along with the text prompts used to create them.
OpenAI also said that Sora can generate video from an existing still image, animating the image's contents accurately and with attention to fine detail. The model can also take an existing video and extend it or fill in missing frames.
However, OpenAI warned that the current model has limitations. For example, it may struggle to accurately simulate the physics of a complex scene or to understand specific instances of cause and effect. The model may also confuse spatial details in a prompt, such as mixing up left and right.
Sam Altman, the CEO of OpenAI, said on X that the company has started “red-teaming” exercises for Sora and has granted access to a limited number of creators. Altman had earlier shared several Sora-generated videos on the platform.