NewsWorld

OpenAI’s Sora: A New Frontier for Text-to-Video Generation

OpenAI, the research organization behind the popular chatbot ChatGPT and the text-to-image generator DALL-E, has unveiled its latest AI model: Sora.

Sora is a text-to-video model that can create realistic videos up to a minute long based on text prompts. The model can also generate videos from still images or extend existing videos with new content.

Sora is a breakthrough in the field of generative AI, as it can simulate the physical world in motion and produce videos that match the user’s instructions on both subject and style. For example, from just a few typed words, Sora can generate a video of woolly mammoths walking in the snow, a movie trailer featuring a spaceman, or a cooking tutorial.

OpenAI’s CEO Sam Altman has also shared some examples of Sora-generated videos on X, in response to users’ prompts.

According to OpenAI, the model is trained on a large corpus of videos that are either publicly available or licensed from copyright owners. OpenAI’s blog post describes Sora as a diffusion model built on a transformer architecture, similar to the one behind its GPT models.

“The model has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions.”

Generation starts from a video that looks like static noise, and the model gradually removes that noise over many steps, guided by the text prompt. Videos and images are represented as collections of smaller units of data called patches, which OpenAI compares to the tokens used in GPT.

Sora is not perfect, however. The model may struggle to capture the physics or spatial details of more complex scenes, which can lead to illogical or unnatural results. For instance, it may generate a person running in the wrong direction on a treadmill, morph a subject in unnatural ways, or make it disappear altogether.

Moreover, Sora may raise ethical and social issues, such as the potential for misuse, abuse, and disinformation. OpenAI says it will be “taking several important safety steps ahead of making Sora available in OpenAI’s products.” The company says it is working with red teamers, described as “domain experts in areas like misinformation, hateful content, and bias,” who “will be adversarially testing the model,” and that it is “building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora.”

The company also says that should it choose to build the model into a public-facing product, it will ensure that provenance metadata is included in the generated outputs.
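
To give a rough sense of what provenance metadata can look like, here is a minimal Python sketch. It is an illustration only, not OpenAI's implementation: the function names, the field names, and the "example-video-model" label are all hypothetical. Real provenance standards such as C2PA rely on cryptographically signed manifests, which this toy version leaves out.

```python
import hashlib
import json


def build_manifest(video_bytes: bytes, generator: str) -> dict:
    """Build a toy provenance record for a generated video (hypothetical format)."""
    return {
        "generator": generator,   # hypothetical label for the model that made the clip
        "ai_generated": True,     # flag that downstream tools could display
        "sha256": hashlib.sha256(video_bytes).hexdigest(),  # fingerprint of the exact file
    }


def matches_manifest(video_bytes: bytes, manifest: dict) -> bool:
    """Return True if these bytes are the ones the manifest was issued for."""
    return hashlib.sha256(video_bytes).hexdigest() == manifest["sha256"]


if __name__ == "__main__":
    # Placeholder bytes standing in for a real video file.
    clip = b"placeholder bytes standing in for a video file"
    manifest = build_manifest(clip, "example-video-model")

    print(json.dumps(manifest, indent=2))
    print(matches_manifest(clip, manifest))               # the untouched clip
    print(matches_manifest(clip + b"edited", manifest))   # a modified copy
```

In this sketch, any change to the file alters its hash, so a modified copy no longer matches the record. Without a signature, however, nothing stops someone from issuing a fresh record for the altered file, which is the gap that signed manifests are meant to close.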

Sora is currently only available to a few researchers, filmmakers, and video creators, who will provide feedback and test the model’s capabilities and limitations.
