Model Versions

Varies by model and plan

What is AI Video Generation?

AI video generation creates short video clips from text prompts or reference images. You describe a scene and specify a style, and the AI generates a clip (typically 5 to 20 seconds long) with realistic motion, lighting, and physics. The technology is less mature than image generation, but it is advancing rapidly and is already useful for short-form content, prototyping ideas, and visual storytelling.

Common use cases
  • Creating short promotional clips, social media content, or animated explainers
  • Prototyping video concepts before investing in full production
  • Generating visual effects, transitions, or artistic video content

Veo Release Timeline

Veo 2

2024

First generation. API access at $0.35/sec. No native audio.

Veo 3

Mid 2025

Native audio, 4K resolution, cinematic physics. Available in Whisk.

Veo 3.1

Late 2025

Improved audio and visual fidelity. Powers Flow and Gemini Ultra.

Veo 3.1 Fast

Early 2026

Speed optimized for everyday use in Gemini Plus and Pro.

Feature          Veo 2           Veo 3                       Veo 3.1                  Veo 3.1 Fast
Native audio     No              Yes                         Yes (improved)           Yes (improved)
Max resolution   1080p           4K                          4K                       Up to 1080p
Speed            Moderate        Slower                      Slower                   Fast
Where available  API only        Whisk                       Flow, Gemini (Ultra)     Gemini (Plus/Pro)
Best for         API developers  High-quality creative work  Professional filmmaking  Everyday video generation
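
The timeline lists Veo 2 API access at $0.35/sec, so the cost of a clip scales directly with its length. The sketch below is a rough illustration of that arithmetic only. It assumes flat per-second billing with no minimums, tiers, or plan discounts, and the function name `estimate_clip_cost` is hypothetical and not part of any Google API.

```python
# Rate taken from the Veo 2 timeline entry; treated here as a flat per-second price.
# Assumption: no minimum charge, volume tiers, or plan-based discounts apply.
VEO2_USD_PER_SECOND = 0.35


def estimate_clip_cost(duration_seconds: float,
                       rate_usd_per_second: float = VEO2_USD_PER_SECOND) -> float:
    """Return the estimated cost in USD of one generated clip (hypothetical helper)."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Round to cents so the result reads like a price.
    return round(duration_seconds * rate_usd_per_second, 2)


if __name__ == "__main__":
    # Clip lengths spanning the typical 5 to 20 second range mentioned earlier.
    for seconds in (5, 10, 20):
        print(f"{seconds:>2} s clip: ${estimate_clip_cost(seconds):.2f}")
```

Under these assumptions, a 5 second clip comes to $1.75 and a 20 second clip to $7.00. Actual charges vary by model and plan, so treat the figures as an order-of-magnitude guide.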