Video Generation (Veo 3.1) | Gemini

What is AI Video Generation?

AI video generation creates short video clips from text prompts or reference images. You describe a scene, specify a style, and the AI generates a video clip (typically 5 to 20 seconds long) with realistic motion, lighting, and physics. This technology is still in its early stages compared to image generation, but it is advancing rapidly and is already useful for creating short-form content, prototyping ideas, and visual storytelling.

Common use cases

Creating short promotional clips, social media content, or animated explainers
Prototyping video concepts before investing in full production
Generating visual effects, transitions, or artistic video content

8 sec per generation (extendable via API)

Max resolution (Veo 3.1 Full)

Step 1: Describe your video

Type a detailed description including scene, action, lighting, camera movement, and mood

Step 2: Generation

Gemini creates an 8-second clip with synchronized audio

Step 3: Iterate

Describe changes like "make the camera pan left" or "add rain to the scene"

Step 4: Extend for longer narratives

Generate multiple clips and describe how they connect
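
The four steps above can be sketched in a few lines of Python. This is only an illustrative helper for organizing prompts before pasting them into Gemini chat; it does not call Gemini or any API. The names VideoPrompt, revise, and clips_needed are hypothetical, and the 8-second clip length is taken from this guide.

```python
from dataclasses import dataclass
from math import ceil

CLIP_SECONDS = 8  # per-generation clip length stated in this guide


@dataclass
class VideoPrompt:
    """Hypothetical container for the five elements named in Step 1."""
    scene: str
    action: str
    lighting: str
    camera: str
    mood: str

    def render(self) -> str:
        # Join the five elements into one descriptive prompt string.
        return (
            f"{self.scene}. {self.action}. "
            f"Lighting: {self.lighting}. "
            f"Camera: {self.camera}. "
            f"Mood: {self.mood}."
        )


def revise(prompt_text: str, change: str) -> str:
    """Step 3: append a plain-language change request to an earlier prompt."""
    return f"{prompt_text} Change: {change}."


def clips_needed(target_seconds: int, clip_seconds: int = CLIP_SECONDS) -> int:
    """Step 4: number of clips required to cover a target running time."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    return ceil(target_seconds / clip_seconds)


prompt = VideoPrompt(
    scene="A lighthouse on a rocky coast at dusk",
    action="Waves crash against the rocks while the lamp begins to turn",
    lighting="warm golden hour fading to blue",
    camera="slow aerial push-in",
    mood="calm and cinematic",
)
text = prompt.render()
print(text)
print(revise(text, "add rain to the scene"))
print(clips_needed(30))  # 30 s at 8 s per clip -> 4 clips
```

Keeping the five elements as separate fields makes it easy to change one of them (for example, the camera movement) between iterations without rewriting the whole description.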

Video generations per month by plan (Veo 3.1 Fast)

Veo 3.1 (Gemini)

Max duration: 8 sec (extendable via API)
Resolution: Up to 1080p
Native audio: Yes
Storyboard mode:
Minimum plan: AI Plus ($7.99/mo)
Standalone app: No (in Gemini chat)

Sora 2 (ChatGPT)