Google Veo 3.1 audio-enabled video example: city close-up

Google Veo 3.1 · Text to video · 8s · 16:9 · Audio

This Google Veo 3.1 text-to-video example shows a city close-up sequence. It highlights audio-enabled output in an 8-second, 16:9 render.

Prompt

Shot 1 (0–3 s): macro close-up of one earbud rotating slowly on a wooden desk, shallow depth of field, warm desk lamp glow. Shot 2 (3–6 s): medium shot of a young professional putting the earbuds in before stepping onto a busy city street, subtle bokeh lights. Shot 3 (6–8 s): close-up of the charging case clicking shut next to a laptop, soft logo reflection in the lid. Camera: smooth dolly moves between shots, handheld feel but not shaky. Lighting: evening, warm indoors transitioning to cool street light, gentle film grain. Audio: city ambience low in the mix, soft electronic music bed, short VO line: “Block the noise, keep the focus.” No subtitles. Negative: no brand names, no on-screen text, no extreme wide angles.
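
The prompt above follows a pattern: timed shots first, then labelled sections such as Camera, Lighting, Audio, and Negative. As a rough sketch, a prompt of that shape could be assembled and its shot timings checked before submission. The `Shot` class and `build_prompt` helper here are hypothetical names for illustration, not part of any Veo tooling, and the shot descriptions are abbreviated.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    start: int  # start time in seconds
    end: int    # end time in seconds
    description: str


def build_prompt(shots, extras, total_seconds):
    """Join timed shots and labelled sections into one prompt string.

    Raises ValueError if the shots leave a gap, overlap, or do not
    end exactly at total_seconds.
    """
    expected_start = 0
    for shot in shots:
        if shot.start != expected_start or shot.end <= shot.start:
            raise ValueError(f"bad timing for shot {shot.start}-{shot.end} s")
        expected_start = shot.end
    if expected_start != total_seconds:
        raise ValueError(f"shots end at {expected_start} s, expected {total_seconds} s")

    parts = [
        f"Shot {i} ({s.start}–{s.end} s): {s.description}."
        for i, s in enumerate(shots, start=1)
    ]
    parts += [f"{label}: {text}." for label, text in extras.items()]
    return " ".join(parts)


shots = [
    Shot(0, 3, "macro close-up of one earbud rotating slowly on a wooden desk"),
    Shot(3, 6, "medium shot of a young professional putting the earbuds in"),
    Shot(6, 8, "close-up of the charging case clicking shut next to a laptop"),
]
extras = {
    "Camera": "smooth dolly moves between shots",
    "Negative": "no brand names, no on-screen text",
}
prompt = build_prompt(shots, extras, total_seconds=8)
print(prompt)
```

Checking that the shots tile the full 8 seconds catches a common slip when remixing duration: changing the clip length without updating the per-shot timings.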

Render details

Workflow

Text-to-video workflow

8-second render in 16:9

Audio-enabled output

Close-up framing

Cinematic styling

Engine

Google Veo 3.1

Veo 3.1 now handles text prompts, single-image animation, multi-reference guidance, first/last-frame bridging, and clip extension in one engine.

Text prompts
Reference mode
Audio native

Specs

Engine: Google Veo 3.1
Mode: Text to video
Duration: 8s
Aspect ratio: 16:9
Audio: Enabled
Render cost: $4.16
Created: 2025-11-22
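
For budgeting, the listed render cost of $4.16 for an 8-second clip works out to $0.52 per second. The short sketch below does that arithmetic. The `estimate_cost` helper is hypothetical, and it assumes cost scales linearly with duration at the same settings, which this page does not confirm.

```python
from decimal import Decimal

# Values taken from the specs for this render.
render_cost = Decimal("4.16")   # US dollars
duration_seconds = 8

cost_per_second = render_cost / duration_seconds
print(cost_per_second)  # 0.52


def estimate_cost(seconds):
    """Rough estimate, ASSUMING linear per-second pricing (not confirmed)."""
    return cost_per_second * seconds


print(estimate_cost(4))  # 2.08
```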

Recreate

Load this render in the workspace

Start from the same prompt and settings, then remix duration, aspect ratio, references, or audio.
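
One way to think about a remix is as a copy of the original settings with a single field changed. The sketch below models that with a frozen dataclass. `RenderSettings` and its field names are hypothetical and do not reflect the workspace's actual data model, and the remixed values are illustrative only, so check which options the engine supports.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RenderSettings:
    # Hypothetical record; field names are illustrative only.
    engine: str
    mode: str
    duration_seconds: int
    aspect_ratio: str
    audio: bool
    prompt: str


base = RenderSettings(
    engine="Google Veo 3.1",
    mode="Text to video",
    duration_seconds=8,
    aspect_ratio="16:9",
    audio=True,
    prompt="Shot 1 (0–3 s): macro close-up of one earbud ...",  # abbreviated
)

# Same prompt and engine, vertical framing, no audio (illustrative values).
remix = replace(base, aspect_ratio="9:16", audio=False)
print(remix)
```

Because the record is frozen, the original settings stay untouched and each remix is a separate, comparable variant.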