Duration & Output
- Durations: 4 s, 6 s, 8 s
- Output resolution: 720p (1280x720) or 1080p (1920x1080)
- Frame rate: 24 fps cinematic cadence
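As a rough illustration, the limits listed above can be checked before a job is queued. This is a minimal sketch, not MaxVideoAI's API: `RenderSettings` and its fields are hypothetical names, and only the durations, resolutions and frame rate come from the list above.

```python
from dataclasses import dataclass

# Values copied from the spec list on this page.
ALLOWED_DURATIONS_S = (4, 6, 8)
RESOLUTIONS = {"720p": (1280, 720), "1080p": (1920, 1080)}
FRAME_RATE_FPS = 24


@dataclass(frozen=True)
class RenderSettings:
    """Hypothetical settings container (an assumption, not a MaxVideoAI object)."""

    duration_s: int
    resolution: str

    def __post_init__(self) -> None:
        # Reject values outside the published limits as early as possible.
        if self.duration_s not in ALLOWED_DURATIONS_S:
            raise ValueError(f"duration must be one of {ALLOWED_DURATIONS_S} seconds")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")

    @property
    def frame_count(self) -> int:
        # Clip length in frames at the fixed 24 fps cadence.
        return self.duration_s * FRAME_RATE_FPS

    @property
    def pixel_size(self) -> tuple:
        # (width, height) in pixels for the chosen resolution label.
        return RESOLUTIONS[self.resolution]
```

An 8-second clip at 24 fps comes to 192 frames, so `RenderSettings(8, "1080p").frame_count` is 192.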
Generate short, cinematic videos with Google DeepMind's Veo 3.1 directly inside your MaxVideoAI workspace. You get text-to-video, image-to-video, framing presets, and native audio, all with transparent per-second pricing.
Describe the scene, choose 4, 6 or 8 seconds, pick 16:9, 9:16 or 1:1, decide whether you want native audio, and let Veo 3.1 deliver polished footage for ads, explainers, campaigns, and client work.
Veo 3.1 – AI Text-to-Video & Image-to-Video in MaxVideoAI (720p/1080p, 4–8s)
Cinematic 8-second TV commercial in 16:9 with sound. From a tiny FPV-style camera flying indoors, we explore a bright, modern apartment. At…
Why Veo 3.1 is powerful inside MaxVideoAI:
Best use cases
On paper, Veo 3.1 is DeepMind’s latest short-form video model with richer audio and tighter prompt adherence.
In MaxVideoAI, Veo 3.1 is exposed as a controlled, production-ready engine:
MaxVideoAI wraps all of this in a simple flow:
These specs describe Veo 3.1 exactly as you can use it today via MaxVideoAI – not theoretical lab demos.
Veo 3.1 in MaxVideoAI gives you framing presets, native audio control, seeds and Extend—so it behaves like a directable camera, not a black box.
See live Veo 3.1 renders powered by the same settings you have in MaxVideoAI.

Google Veo 3.1 · 8s
Shot 1 (0–3 s): macro close-up of one earbud rotating slowly on a wooden desk, shallow depth of field, warm desk lamp…
Google Veo 3 Fast · 8s
Cinematic 8-second TV commercial in 16:9 with sound. From a tiny FPV-style camera flying indoors, we explore a bright, modern apartment. At…
Write prompts like a short director’s note, built around cinematography, subject, action, context and style.
Medium shot of [subject] in [environment], [clear action] over 8 seconds. Camera [movement], 16:9 at 1080p, cinematic look with [lighting and color]. Audio: [ambience] + [music/VO cue], no subtitles.
Drop that into MaxVideoAI, choose Veo 3.1, set duration/orientation, and you’re ready to render.
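If you reuse the template often, filling the bracketed slots programmatically helps avoid leaving one blank. The sketch below is only an illustration: the slot names (`subject`, `environment`, and so on) and the `build_prompt` helper are assumptions made for this example, and the sample values are invented.

```python
from string import Formatter

# The template from above, with each [bracketed slot] turned into a named field.
PROMPT_TEMPLATE = (
    "Medium shot of {subject} in {environment}, {action} over 8 seconds. "
    "Camera {movement}, 16:9 at 1080p, cinematic look with {lighting_and_color}. "
    "Audio: {ambience} + {music_or_vo}, no subtitles."
)

# Collect the slot names straight from the template so the two never drift apart.
REQUIRED_FIELDS = sorted(
    {name for _, name, _, _ in Formatter().parse(PROMPT_TEMPLATE) if name}
)


def build_prompt(**fields: str) -> str:
    """Fill the template, refusing to render if any slot is missing or empty."""
    missing = [n for n in REQUIRED_FIELDS if not str(fields.get(n, "")).strip()]
    if missing:
        raise ValueError("Missing prompt fields: " + ", ".join(missing))
    return PROMPT_TEMPLATE.format(**fields)


# Example values are placeholders, not a recommended prompt.
prompt = build_prompt(
    subject="a barista",
    environment="a sunlit corner cafe",
    action="pours latte art into a ceramic cup",
    movement="slowly pushes in",
    lighting_and_color="warm morning light and muted earth tones",
    ambience="quiet cafe chatter",
    music_or_vo="soft acoustic guitar",
)
```

The resulting string can be pasted into the prompt field as-is.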
Pair Veo 3.1 with Nano Banana to lock style and iterate on motion.
Veo 3.1 can compress a mini-sequence into a single 6- or 8-second clip when you write a structured prompt.
Use seeds and Extend to keep framing consistent across beats.
Demo: One Sequenced Prompt (with Native Audio)
8-second cinematic product story for wireless earbuds (16:9, 1080p)
Shot 1 (0–3 s): macro close-up of one earbud rotating slowly on a wooden desk, shallow depth of field, warm desk lamp glow.
Shot 2 (3–6 s): medium shot of a young professional putting the earbuds in before stepping onto a busy city street, subtle bokeh lights.
Shot 3 (6–8 s): close-up of the charging case clicking shut next to a laptop, soft logo reflection in the lid.
Camera: smooth dolly moves between shots, handheld feel but not shaky.
Lighting: evening, warm indoors transitioning to cool street light, gentle film grain.
Audio: city ambience low in the mix, soft electronic music bed, short VO line: “Block the noise, keep the focus.” No subtitles.
Negative: no brand names, no on-screen text, no extreme wide angles.
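A sequenced prompt like the one above is easy to get wrong when shot timings overlap or fall short of the clip length. The following sketch assembles the same layout and checks that the shots tile the full duration. The `Shot` class and `build_sequence_prompt` function are hypothetical helpers written for this example, not part of MaxVideoAI or Veo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    start_s: int
    end_s: int
    description: str


def build_sequence_prompt(title, duration_s, shots, directions):
    """Return a multi-shot prompt; raise if the shots do not cover 0..duration_s."""
    expected_start = 0
    for shot in shots:
        # Each shot must begin exactly where the previous one ended.
        if shot.start_s != expected_start or shot.end_s <= shot.start_s:
            raise ValueError(f"shot timing breaks at {shot.start_s}-{shot.end_s} s")
        expected_start = shot.end_s
    if expected_start != duration_s:
        raise ValueError(f"shots end at {expected_start} s, clip is {duration_s} s")

    lines = [title]
    for index, shot in enumerate(shots, start=1):
        lines.append(f"Shot {index} ({shot.start_s}–{shot.end_s} s): {shot.description}")
    # Extra direction lines such as Camera, Lighting, Audio, Negative.
    for label, text in directions.items():
        lines.append(f"{label}: {text}")
    return "\n".join(lines)
```

Passing three shots timed 0–3, 3–6 and 6–8 s with a duration of 8 reproduces the structure of the demo prompt; a gap or overlap raises a `ValueError` before anything is rendered.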
Lean into these constraints and Veo 3.1 becomes a repeatable, directable tool instead of a slot machine.
These guardrails keep Veo 3.1 usable and compliant for professional work.
Yes. MaxVideoAI routes Veo jobs through licensed DeepMind endpoints, so you can render from Europe, the UK and most supported regions without separate Veo contracts.
Yes. Veo 3.1 supports 16:9, 9:16 and 1:1. Choose 9:16 for Reels/TikTok/Shorts and keep key action centered.
Yes. Start from a single still (Image→Video) or use 1–4 reference images to guide a Text→Video clip.
Base clips are 4/6/8 s. Use Extend and chain clips in timelines; treat each extension as another 4–8 s block.
Use reference stills (Nano Banana or your brand library), keep character/setting descriptions consistent, and call out palette/lighting.
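To make the clip-length answer above concrete (base clips of 4/6/8 s, extended in further 4–8 s blocks), here is one way to plan how many blocks a longer piece needs. It is a sketch under an assumption: extension blocks are taken to use the same 4, 6 or 8 s lengths as base clips, which the page does not state explicitly. `plan_blocks` is a hypothetical helper.

```python
# Assumed block lengths: base durations from this page, reused for extensions.
BLOCK_LENGTHS_S = (4, 6, 8)


def plan_blocks(target_s: int) -> list:
    """Return block lengths (base clip first, then extensions) covering target_s."""
    if target_s <= 0:
        raise ValueError("target duration must be positive")
    longest = max(BLOCK_LENGTHS_S)
    blocks = []
    remaining = target_s
    # Use the longest block while more than one block is still needed.
    while remaining > longest:
        blocks.append(longest)
        remaining -= longest
    # Finish with the shortest block that still covers what is left.
    blocks.append(min(b for b in BLOCK_LENGTHS_S if b >= remaining))
    return blocks
```

For example, `plan_blocks(20)` returns `[8, 8, 4]`: one 8 s base clip followed by two extensions. A 15 s target returns `[8, 8]`, so the timeline is trimmed by one second in the edit.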
Compare pricing, latency, and output options across other engines available in MaxVideoAI.
Use Veo 3.1 Fast for affordable, fast AI video generation. Up to 8-second clips with optional native audio—ideal for social formats and iterative testing.
Upload starting and ending frames, write a brief, and let Veo 3.1 animate seamless transitions with optional native audio. Swap to Fast mode for cheaper iterations.
Create rich AI-generated videos from text or image prompts using Sora 2. Native voice-over, ambient effects, and motion sync via MaxVideoAI.
Veo 3.1 in MaxVideoAI gives you direct, pay-as-you-go access to DeepMind’s most controllable short-form video engine.
Framing and audio controls make it feel like a virtual camera, not just another black-box generator.