Multimodal input stack
Seedance 2.0 accepts text, image, audio, and video references.
- Text instructions + multimodal references
- Up to 9 image references
- Up to 3 video references
- Up to 3 audio references
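The reference caps above can be enforced before submitting a job. This is a minimal pre-flight sketch: the limits come from the published input caps, but the payload shape and field names are illustrative assumptions, not an official API.

```python
# Hypothetical pre-flight check for a Seedance 2.0 request payload.
# The caps (9 images, 3 videos, 3 audio) come from the published specs;
# the dict keys below are illustrative, not an official schema.
LIMITS = {"images": 9, "videos": 3, "audios": 3}

def validate_references(payload: dict) -> list[str]:
    """Return a list of violations; an empty list means the payload fits the caps."""
    errors = []
    for kind, cap in LIMITS.items():
        count = len(payload.get(kind, []))
        if count > cap:
            errors.append(f"{kind}: {count} supplied, max {cap}")
    return errors
```

Running the check locally avoids burning a paid generation on a request the service would reject.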
Seedance 2.0 is ByteDance's next-gen AI video model focused on cinematic motion, multi-shot continuity, and native audio generated in sync with visuals. This page is a pre-launch overview: specs, best use cases, and prompt templates so you can plan workflows before release.
Best for:
- Action sequences with more believable physics and interaction
- Multi-shot ads with consistent product and scene continuity
- Music-led visuals with synced ambience and SFX cues
Pay-as-you-go on MaxVideoAI · Price shown before you generate (at launch)

What makes Seedance 2.0 different
The limits that shape your renders.
Model messaging emphasizes cinematic control and multi-shot outputs.
Seedance 2.0 tends to reward shot timing, camera verbs, and audio cues. Keep dialogue short, pin SFX to visible actions, and use references to lock continuity.
Tip: duration + aspect ratio are set in the UI — your prompt controls subject, action, camera, lighting, style, and sound.
Fast ideation with one cinematic beat.
Quick mode: iterate on concept and mood.
Template (copy/paste)
[Subject] in [setting]. Camera: [move + lens feel]. Lighting: [style]. Action: [one clear beat]. Audio: [ambience + one SFX cue]. (<=15s; choose aspect ratio in the UI.)

Example: Handheld UGC unboxing at a kitchen table. Slow push-in, natural daylight. She peels the seal, smiles, turns the bottle to camera. Room tone + packaging crinkle + soft click when cap opens.
Example
Handheld smartphone UGC clip of a woman unboxing a new skincare bottle at a kitchen table. She peels the seal, smiles, and turns the bottle toward camera. Soft window daylight, natural colors, subtle room tone + packaging crinkle.
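For batch ideation, the template slots can be filled programmatically. This is a plain string-formatting sketch whose field names simply mirror the template's bracketed slots; it is not an official SDK.

```python
# Minimal sketch: fill the copy/paste template from named fields.
# Slot names mirror the template; nothing here is an official API.
TEMPLATE = (
    "{subject} in {setting}. Camera: {camera}. Lighting: {lighting}. "
    "Action: {action}. Audio: {audio}."
)

def build_prompt(**fields: str) -> str:
    """Render one prompt from the template slots."""
    return TEMPLATE.format(**fields)

prompt = build_prompt(
    subject="Handheld UGC unboxing",
    setting="a kitchen table",
    camera="slow push-in, natural lens feel",
    lighting="soft window daylight",
    action="she peels the seal, smiles, turns the bottle to camera",
    audio="room tone + packaging crinkle + soft click when cap opens",
)
```

Keeping each slot to one clear idea matches the model's preference for a single cinematic beat per clip.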
Not sure if Seedance 2.0 is the best fit for your shot? These side-by-side comparisons break down the tradeoffs — price per second, resolution, audio, speed, and motion style — so you can pick the right engine fast.
Each page includes real outputs and practical best-use cases.
openai
Create rich AI-generated videos from text or image prompts using Sora 2. Native voice-over, ambient effects, and motion sync via MaxVideoAI.
Compare Seedance 2.0 vs OpenAI Sora 2 →
pika
Generate stylized AI video from prompts or animate uploaded stills using Pika 2.2. Perfect for short-form loops without audio via MaxVideoAI.
Compare Seedance 2.0 vs Pika 2.2 Text & Image to Video →
bytedance
Generate Seedance 1.5 Pro clips with cinematic motion, camera lock, and native audio. Supports text-to-video or image-to-video up to 12s.
Compare Seedance 2.0 vs Seedance 1.5 Pro →

What is Seedance 2.0?
Seedance 2.0 is ByteDance's AI video model focused on cinematic motion, multi-shot continuity, and native audio generation.
How long can generations be?
Launch materials highlight outputs up to 15 seconds per generation.
Does it generate native audio?
Yes. Native audio is highlighted in official messaging, with audio generated alongside video for better sync.
When will it be available, and at what price?
This page is pre-launch. Availability and pricing will be confirmed at launch (official date TBA).
Can I generate with Seedance 2.0 today?
No. The model page can be indexed for discovery, but runtime remains locked until release.