How ChatGPT Images 2.0 Works | ChatGPT Image Generation

DALL-E 3 (Retired May 2026)

Architecture: Separate diffusion model called by ChatGPT

Text in images: Frequently garbled or misspelled

Multi-turn consistency: Each generation is independent

Instruction following: Often ignored complex constraints

Context awareness: Only sees the prompt ChatGPT wrote

In-context learning: Not supported

ChatGPT Images 2.0 (Current)

Architecture: Built natively into the language model

Text in images: Dramatically improved; accurate and readable even at small sizes

Multi-turn consistency: Characters and styles persist across turns

Instruction following: Reliably follows multi-constraint prompts

Context awareness: Sees the full conversation, uploads, and history

In-context learning: Learns from uploaded reference images

Images with consistency (Thinking Mode)

Objects per scene (max reliable)

New Pro tier (5x Plus)

Free tier (Instant Mode)

DALL-E 2 and DALL-E 3 have been retired

As of May 12, 2026, DALL-E 2 and DALL-E 3 are no longer available. All image generation in ChatGPT now uses ChatGPT Images 2.0 (gpt-image-2). Developers with existing DALL-E API integrations need to migrate to gpt-image-2.
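
For developers planning that migration, the first step is usually repointing existing request payloads at the new model. The sketch below is a minimal illustration of that step only, not an official migration tool. The model identifier gpt-image-2 comes from the notice above, and dall-e-2 and dall-e-3 are the identifiers used by existing DALL-E API integrations. The helper name migrate_request and its drop_keys parameter are hypothetical, introduced here for illustration. Whether other request parameters carry over unchanged is not stated here, so check the current API reference before deciding what to drop.

```python
from typing import Any

# Identifiers of the retired models and the replacement named in the notice above.
RETIRED_MODELS = {"dall-e-2", "dall-e-3"}
REPLACEMENT_MODEL = "gpt-image-2"


def migrate_request(
    params: dict[str, Any],
    drop_keys: frozenset[str] = frozenset(),
) -> dict[str, Any]:
    """Return a copy of an image-request payload that targets gpt-image-2.

    Hypothetical helper: it only swaps the model name and removes any keys
    the caller lists in drop_keys. It makes no assumption about which
    parameters the new model accepts; that is left to the caller.
    """
    # Copy the payload, skipping keys the caller has asked to drop.
    migrated = {k: v for k, v in params.items() if k not in drop_keys}
    # Repoint retired model names; leave any other model value untouched.
    if migrated.get("model") in RETIRED_MODELS:
        migrated["model"] = REPLACEMENT_MODEL
    return migrated


# Example payload as an existing DALL-E 3 integration might build it.
old_request = {"model": "dall-e-3", "prompt": "a red bicycle", "n": 1}
new_request = migrate_request(old_request)
print(new_request["model"])
```

The function returns a new dictionary rather than mutating its input, so the old payload stays available for comparison while both code paths are tested side by side.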