OpenAI, known for ChatGPT, unveiled its debut AI-driven text-to-video model, named Sora, on Thursday. According to the company, Sora can produce videos lasting up to one minute.
Sora surpasses competitors like Google’s Lumiere by offering longer video generation capabilities. It’s currently accessible to red teamers—experts who adversarially test software for flaws and potential for misuse—and select content creators.
OpenAI also intends to integrate Coalition for Content Provenance and Authenticity (C2PA) metadata into Sora’s output once the model is deployed as part of its product lineup.
In a post on X (formerly known as Twitter), the company announced the AI video generator, stating, “Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions.”
Remarkably, Sora’s claimed video length is more than ten times that of its competitors. While Google’s Lumiere produces 5-second videos, Runway AI and Pika 1.0 offer even shorter durations of 4 seconds and 3 seconds, respectively.
OpenAI’s X account, along with CEO Sam Altman, shared numerous videos generated by Sora alongside the prompts that guided their creation.
The resulting videos showcase remarkable detail and fluid motion, qualities that set them apart from other video generators currently available in the market.
According to the company, Sora can create intricate scenes featuring multiple characters, various camera angles, specific motions, and precise details of both subjects and backgrounds.
This capability stems from the text-to-video model’s utilization of both prompts and a comprehensive understanding of “how these things exist in the physical world.”
Sora operates as a diffusion model, using a transformer architecture similar to GPT models. Accordingly, it processes and generates data using “patches,” a concept analogous to tokens in text-based models.
According to the company, patches are small chunks of visual data into which videos and images are broken down, much as text is split into tokens.
OpenAI trained the video generation model using this visual data across various durations, resolutions, and aspect ratios.
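The idea of turning visual data into patches can be illustrated with a minimal sketch. The patch sizes, tensor layout, and function name below are illustrative assumptions, not details OpenAI has published about Sora:

```python
import numpy as np

def extract_patches(video, patch_t=2, patch_h=16, patch_w=16):
    """Split a video tensor (frames, height, width, channels) into
    spacetime patches, each flattened into a vector -- loosely analogous
    to tokenizing text. Sizes here are hypothetical, not Sora's."""
    f, h, w, c = video.shape
    # Trim so each dimension divides evenly by its patch size.
    f, h, w = f - f % patch_t, h - h % patch_h, w - w % patch_w
    video = video[:f, :h, :w]
    patches = video.reshape(
        f // patch_t, patch_t,
        h // patch_h, patch_h,
        w // patch_w, patch_w, c,
    ).transpose(0, 2, 4, 1, 3, 5, 6)
    # One row per patch: the model's "token" sequence.
    return patches.reshape(-1, patch_t * patch_h * patch_w * c)

# A 16-frame, 64x64 RGB clip yields (16/2)*(64/16)*(64/16) = 128 patches.
video = np.zeros((16, 64, 64, 3), dtype=np.float32)
tokens = extract_patches(video)
print(tokens.shape)  # (128, 1536)
```

Because the trimming step handles arbitrary input sizes, a scheme like this can accept clips of varying durations, resolutions, and aspect ratios, which matches how OpenAI describes training on diverse visual data.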
Beyond text-to-video generation, Sora possesses the capability to transform a still image into a dynamic video.
OpenAI acknowledged on its website that “The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterwards, the cookie may not have a bite mark.”
To prevent misuse of the AI tool for creating deceptive content like deepfakes, OpenAI is building tools to detect misleading content.
Additionally, the company intends to incorporate C2PA metadata into the generated videos, following a similar approach recently implemented for its DALL-E 3 model.
OpenAI is collaborating with red teamers, particularly domain experts in misinformation, hateful content, and bias, to adversarially test the model before wider release.
Currently, Sora is accessible exclusively to red teamers and a select group of visual artists, designers, and filmmakers. This limited availability enables OpenAI to gather valuable feedback on the product’s performance and usability.