Wan 2.5 Text & Image to Video
Audio enabled

Wan 2.5 Text & Image to Video audio-enabled video example: Cinematic medieval…

This Wan 2.5 Text & Image to Video example is a text-to-video render of a cinematic medieval cliffside at night, framed vertically. It demonstrates audio-enabled output in a 5-second clip at a 9:16 aspect ratio.

Wan 2.5 Text & Image to Video · Text to video · 5s · 9:16 · Audio enabled · $0.65

Prompt breakdown

Text-to-video prompt used to generate this render.


Subject

Cinematic medieval cliffside at night, vertical 9:16. A lone ranger in a weathered leather cloak stands against a windswept ridge, illuminated by cool moonlight with subtle rim-lighting on his silhouette. Camera perform…

Workflow

Text to video

Camera

Slow forward dolly (steady, 35mm lens)

Output

5s · 9:16

Estimated price

$0.65

Audio

Enabled

Constraints

Text to video, audio enabled

Full prompt

Cinematic medieval cliffside at night, vertical 9:16. A lone ranger in a weathered leather cloak stands against a windswept ridge, illuminated by cool moonlight with subtle rim-lighting on his silhouette. Camera performs a slow forward dolly (steady, 35mm lens) while soft choral voices echo with natural outdoor reverb. His eyes lift to the horizon as he murmurs: ‘Every journey begins somewhere.’ Dust and parchment scraps drift through the air, captured with shallow depth of field and realistic film-grain.

Prompt improvement notes

Note 1

Keep the subject, camera move, lighting, duration, aspect ratio and audio requirement grouped so the render has one clear production brief.
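One way to keep those elements grouped is to hold them in a single structure and build the prompt text from it. The Python sketch below is only an illustration: `RenderBrief` and its field names are hypothetical and are not part of any Wan 2.5 API. The field values are taken from the prompt on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderBrief:
    """Hypothetical container for one production brief (not a real API type)."""
    subject: str
    lighting: str
    camera: str
    audio: str
    duration_s: int
    aspect_ratio: str

    def to_prompt(self) -> str:
        # Join the descriptive parts into one prompt string, in a fixed order.
        parts = [self.subject, self.lighting, self.camera, self.audio]
        return " ".join(p.strip() for p in parts if p.strip())


brief = RenderBrief(
    subject="A lone ranger in a weathered leather cloak stands against a windswept ridge.",
    lighting="Cool moonlight with subtle rim-lighting on his silhouette.",
    camera="Camera performs a slow forward dolly (steady, 35mm lens).",
    audio="Soft choral voices echo with natural outdoor reverb.",
    duration_s=5,
    aspect_ratio="9:16",
)
prompt_text = brief.to_prompt()
```

Keeping duration and aspect ratio as separate fields, rather than burying them in the prose, makes it easier to see at a glance what the brief asks for.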

Note 2

Change one variable at a time when cloning this prompt: model, duration, camera motion or reference input. That makes quality and price differences easier to compare.
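The one-variable-at-a-time approach can be sketched in Python as follows. This is a minimal illustration, assuming settings are kept in a plain dictionary; the key names and the helper `single_change_variants` are hypothetical, and no render API is called.

```python
# Baseline settings taken from this example; the key names are hypothetical.
baseline = {
    "model": "Wan 2.5 Text & Image to Video",
    "duration_s": 5,
    "aspect_ratio": "9:16",
    "audio": True,
}


def single_change_variants(base, alternatives):
    """Return copies of `base` that each differ from it in exactly one key."""
    variants = []
    for key, values in alternatives.items():
        for value in values:
            if value != base[key]:
                variants.append({**base, key: value})
    return variants


# Candidate values to try; each variant changes only one of them.
variants = single_change_variants(
    baseline,
    {"duration_s": [5, 10], "audio": [True, False]},
)
for variant in variants:
    changed = [k for k in baseline if variant[k] != baseline[k]]
    print(changed, variant)
```

Because every variant differs from the baseline in a single key, any change in quality or price can be attributed to that key.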

Note 3

Add a short negative prompt if you need to block text overlays, logos, distorted hands, face warping or unwanted camera shake.
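As a rough sketch, a short negative prompt can be assembled from a list of unwanted elements. Whether a render request accepts a separate negative prompt field, and in what format, is an assumption here; `build_negative_prompt` is a hypothetical helper that only builds the text.

```python
# Items to block, taken from the note above.
BLOCKED = ["text overlays", "logos", "distorted hands", "face warping", "camera shake"]


def build_negative_prompt(items, max_items=5):
    """Join a short, de-duplicated list of unwanted elements into one string."""
    seen = []
    for item in items:
        cleaned = item.strip().lower()
        if cleaned and cleaned not in seen:
            seen.append(cleaned)
    # Cap the list so the negative prompt stays short.
    return ", ".join(seen[:max_items])


negative_prompt = build_negative_prompt(BLOCKED)
```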

Compare this model

Review this example beside nearby engines before choosing a render path.

Why Wan 2.5 Text & Image to Video fits this shot

Wan 2.5 handles 5 or 10 second clips with optional background audio plus prompt expansion when you need extra detail.

Audio option

5s or 10s

480p–1080p
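The limits listed above can be checked before a render is submitted. The Python sketch below is an assumption-laden illustration: `check_settings` is a hypothetical helper, and it treats 480p to 1080p as a continuous range because this page does not list the individual resolutions.

```python
ALLOWED_DURATIONS_S = {5, 10}            # "5s or 10s" on this page
MIN_HEIGHT_P, MAX_HEIGHT_P = 480, 1080   # "480p-1080p" on this page


def check_settings(duration_s, resolution):
    """Return a list of problems; an empty list means the settings fit the listed limits."""
    problems = []
    if duration_s not in ALLOWED_DURATIONS_S:
        problems.append(f"duration {duration_s}s is not 5s or 10s")
    try:
        # "720p" -> 720; anything that is not a number raises ValueError.
        height = int(resolution.lower().rstrip("p"))
    except ValueError:
        problems.append(f"unrecognized resolution {resolution!r}")
    else:
        if not MIN_HEIGHT_P <= height <= MAX_HEIGHT_P:
            problems.append(f"resolution {resolution} is outside 480p-1080p")
    return problems


print(check_settings(5, "720p"))
print(check_settings(7, "2160p"))
```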

Key frames

Opening frame
Motion beat
Final shot
