OpenAI has introduced Sora, a text-to-video AI model that can generate realistic videos up to one minute long. Sora handles reflections and shadows adeptly and outputs video at up to 1080p resolution. There is speculation that Sora was trained on Unreal Engine simulations, but OpenAI has not confirmed this. Sora is not yet available to regular users, as OpenAI is working with experts to assess the model for bias, risks, and harms.
Just as Google unveiled its next-gen Gemini 1.5 Pro model, OpenAI surprised the industry with Sora, a new text-to-video AI model. Compared with earlier models such as Runway’s Gen-2 and Pika, Sora marks a clear step up in video generation. Here’s what you should know about OpenAI’s Sora.
Sora Can Generate Videos Up to 1 Minute
Sora, OpenAI’s text-to-video AI model, creates detailed videos (up to 1080p) from text inputs. It faithfully interprets user prompts, simulating motion in the physical world. Notably, Sora produces AI videos up to one minute in length, surpassing the few seconds typical of existing models.
OpenAI has published numerous demo videos to show what Sora can do. The creators of ChatGPT say that Sora has a deep understanding of language and can generate “compelling characters that express vibrant emotions.” It can also include multiple shots in a single video while keeping characters and scenes consistent throughout.
However, Sora does have limitations. At present, it does not fully understand real-world physics. OpenAI illustrates this with an example: “A person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.”
As for its architecture, Sora is a diffusion model built on the transformer architecture. It uses the recaptioning technique introduced with DALL-E 3, which generates highly descriptive captions for the visual training data so that the model follows user prompts more faithfully. In addition to text-to-video generation, Sora can animate still images, extend existing videos, and fill in missing frames.
OpenAI appears to have achieved another breakthrough with Sora. In the closing remarks of its blog post, the company frames the model as a step in its pursuit of Artificial General Intelligence (AGI).
Regular users do not have access to Sora at present. OpenAI is working with experts to assess the model for potential risks and harms. It is also giving access to filmmakers, designers, and artists to gather feedback and refine the model before a public release.