OpenAI, the unicorn behind the boom of the Generative AI industry, has unveiled DALL-E 3, the latest iteration of its image generator. Driven through ChatGPT, the upgrade produces top-quality images in response to natural-language prompts, and it launches with ethical controls in place.
Image generators like DALL-E, MidJourney, and Stable Diffusion have opened new creative frontiers for artists and casual users since the AI boom late last year. By translating text prompts into stunning visuals, they offer glimpses of machine interpretations of human creativity. Now, OpenAI aims to push boundaries further with DALL-E 3, a model that could put it back in direct competition with other leaders of the industry.
Unveiled early today, DALL-E 3 demonstrates massive improvements in accurately depicting detailed textual descriptions. Unlike previous iterations, it adheres closely to complex prompts without requiring extensive prompt engineering or other complicated tricks. The new system also excels at capturing relationships between objects and generating photorealistic human details like hands and reflections.
When outputs from the same prompts in DALL-E 2 and DALL-E 3 are compared, the latter produces markedly sharper and more precise images. It can render extremely realistic depictions of scenes while getting textures, lighting, and backgrounds right. It also appears capable of generating text and integrating it into its images, something that remains a problem for even the most powerful AI image generators to date.
DALL-E 3 is built on top of ChatGPT, allowing users to iteratively refine prompts through conversational exchanges. Early leaked samples hint at blazingly fast iteration capabilities. As Decrypt previously reported, YouTuber MattVidPro called an earlier beta of DALL-E 3 “insane” and asserted that not even MidJourney’s upcoming version could compete.
However, availability remains tightly limited to around 400 testers, and OpenAI says its new model will be released “soon.”
For now, users can create images with DALL-E 2 using plugins with ChatGPT Plus. Those who don’t pay for a subscription will have to deal with restrictions.
The journey to this point has not been without its bumps. During its beta testing phase, the model was noted for its uncensored nature, capable of generating content that ranged from nudity to gore and violence. This raised eyebrows and stirred concerns about the potential misuse of such technology. But OpenAI seems to have taken these concerns to heart, implementing features in DALL-E 3 that prevent the generation of content that could be considered violent, adult, or hateful, ensuring a safer user experience.
One such measure is the assembly of a team of experts “to help inform our risk assessment and mitigation efforts in areas like propaganda and misinformation.”
Concerns around AI art persist, especially regarding inappropriate or unethical content. While OpenAI removed filters during testing, the company is exploring strategies to prevent misuse in public versions. It will also make it easier to identify images generated with its tool. This could curb the spread of deepfakes and potentially help identify the origin of an image if someone bypasses the model’s native censorship.
OpenAI is also aware of concerns over the legal use of human artwork for training its model and has proposed measures aimed at a more ethical generator. DALL-E 3 won’t reproduce content when asked to mimic living artists, and OpenAI will enable opt-outs for creators. This addresses backlash from artists like Greg Rutkowski, who argue that AI copying their style without consent is unethical.
Major lawsuits have also been filed, including one from author George R.R. Martin that accuses OpenAI of improper use of copyrighted material.
OpenAI didn’t immediately respond to a request for comment from Decrypt.