OpenAI & Google Text-To-Image AI Generator

Just a day ago, news broke about the newest competitor to OpenAI's DALL-E 2: Google's Imagen. Below is a high-level overview of what DALL-E 2 and Imagen are.

Let me just share how much I love the name DALL-E real quick....

Personally, I am IN LOVE with the name DALL-E. It's a mix of Salvador Dalí and WALL-E. Dalí had a unique art style spanning surrealism, cubism, Dada, futurism, modernism, and more. Pair that very avant-garde artist with the cute robot WALL-E, who finds more meaning in life over the course of his movie.

What is Text-To-Image AI?

  • AI that creates an image from scratch based on a text description

What is DALL-E 2? What happened to the original DALL-E?

  • Both are text-to-image AI models; DALL-E 2 is simply a new and improved version (higher accuracy, realism, resolution, etc.)
  • DALL-E debuted in January 2021; DALL-E 2 debuted in April 2022
  • DALL-E 2 can also edit photos (I'm unsure whether Google's Imagen can, as it only just hit the news)

What is Google's Imagen?

  • Google's text-to-image AI model. According to Google, it outperforms DALL-E 2 in precision, accuracy, resolution, etc.
  • A scientific paper has been published about it:
  • Currently not open to the public. Tweaking still needs to be done, as the AI has to be ethical, inclusive, unbiased, etc.
  • Humans are very curious creatures... and disgusting... and harmful... Well, you have to consider the worst-case scenario: people taking advantage of a poor text-to-image AI bot that is trying its best to behave ethically (learn more below about the data these models absorb)

To learn more:

Imagen: Google introduces DALL-E 2 competition
With Imagen, Google follows OpenAI and shows that artificial intelligence can produce credible and useful images.
  • Good read with great visuals + charts
The dark secret behind those cute AI-generated animal images
Google Brain has revealed its own image-making AI, called Imagen. But don’t expect to see anything that isn’t wholesome.
  • High-level overview of text-to-image AI and the data these models take in


  • "DALL·E is a 12-billion parameter version of GPT-3 trained to generate images from text descriptions, using a dataset of text–image pairs" - OpenAI
  • DALL-E is a smaller version of GPT-3 (12 billion parameters versus GPT-3's 175 billion), trained specifically to generate images
Transformers, Explained: Understand the Model Behind GPT-3, BERT, and T5
A quick intro to Transformers, a new neural network transforming SOTA in machine learning.
  • More information than you need to know. Perhaps I'll cover this sometime

Other Sources:

Google claims its text-to-image AI delivers ‘unprecedented photorealism’ | Engadget
Imagen is the company’s version of OpenAI’s DALL-E, but it isn’t available to the public.
OpenAI’s DALL-E AI image generator can now edit pictures, too
OpenAI hopes to release it publicly after testing.
Imagen: Text-to-Image Diffusion Models
Google’s image generator rivals DALL-E in shiba inu drawing – TechCrunch
Google Research has publicized Imagen, a text-to-image diffusion-based generator built on large transformer language models.
DALL·E: Creating Images from Text
We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language.
DALL·E 2 is a new AI system that can create realistic images and art from a description in natural language.