Wan 2.5 Text & Image to Video
Audio enabled

Wan 2.5 Text & Image to Video audio-enabled video example: city camera move

This Wan 2.5 Text & Image to Video example is a text-to-video render of a city camera move. It shows audio-enabled output in a 10-second, 9:16 clip.

Wan 2.5 Text & Image to Video · Text to video · 10s · 9:16 · Audio enabled · $0.65

Prompt breakdown

Text-to-video prompt used to generate this render.

Subject

Ultra-realistic walking selfie shot filmed with a smartphone held in one hand. The person is speed-walking through a busy urban street in daylight. Camera movement is dynamic: fast steps, sudden micro-shakes, quick tilt…

Workflow

Text to video

Camera

Audio Enabled

Output

10s · 9:16

Estimated price

$0.65

Audio

Enabled

Constraints

Text To Video, Audio Enabled, Camera Move

Full prompt

Ultra-realistic walking selfie shot filmed with a smartphone held in one hand. The person is speed-walking through a busy urban street in daylight. Camera movement is dynamic: fast steps, sudden micro-shakes, quick tilts as the person avoids people and obstacles. Natural motion blur, realistic stabilization drift, shifting sunlight and shadows on their face. High-detail skin texture, real reflections in the eyes. The person speaks extremely fast, slightly out of breath, trying to explain something urgently while walking. Lip-sync must perfectly match the following rapid line: “Okay listen, I don’t have much time but everything’s happening way faster than I expected and I swear I’ll explain everything once I get there!” Audio: realistic city ambience (footsteps, passing cars, faint horns), wind hitting the phone mic, breath sounds, occasional clothing rustle. Keep the phone-mic quality: compressed, slightly distorted on loud peaks. Mood: energetic, chaotic, spontaneous. No filters, no beautification. Keep it raw and real.

Prompt improvement notes

Note 1

Keep the subject, camera move, lighting, duration, aspect ratio and audio requirement grouped so the render has one clear production brief.

Note 2

Change one variable at a time when cloning this prompt: model, duration, camera motion or reference input. That makes quality and price differences easier to compare.
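
One way to follow Note 2 is to generate test variants programmatically, so each clone differs from the original in exactly one setting. The sketch below is illustrative only: the field names and the "static tripod" alternative are hypothetical and are not taken from any real render API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RenderSettings:
    # Hypothetical field names for illustration; not a real API schema.
    model: str
    duration_s: int
    camera_motion: str
    reference_input: Optional[str] = None


def single_variable_variants(base, alternatives):
    """Return copies of `base` that each differ from it in exactly one field."""
    variants = []
    for field_name, values in alternatives.items():
        for value in values:
            # Skip values identical to the baseline; they would not be a change.
            if value != getattr(base, field_name):
                variants.append(replace(base, **{field_name: value}))
    return variants


# Baseline taken from this example: 10-second text-to-video render.
base = RenderSettings(
    model="Wan 2.5 Text & Image to Video",
    duration_s=10,
    camera_motion="handheld walking selfie",
)

variants = single_variable_variants(
    base,
    {
        "duration_s": [5, 10],
        # "static tripod" is an assumed alternative, used only as an example.
        "camera_motion": ["handheld walking selfie", "static tripod"],
    },
)
```

Rendering the baseline and each variant separately keeps any quality or price difference attributable to the one setting that changed.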

Note 3

Add a short negative prompt if you need to block text overlays, logos, distorted hands, face warping or unwanted camera shake.
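
As a rough sketch of Note 3, the helper below assembles a short negative prompt from a list of unwanted artifacts. It only builds the string; how a negative prompt is passed to the renderer depends on the interface you use, and the function name and `keep` parameter are assumptions made for illustration. Because this example's prompt deliberately asks for handheld micro-shakes, "camera shake" is excluded from the blocked terms.

```python
def build_negative_prompt(blocked_terms, keep=()):
    """Join blocked terms into a comma-separated string, skipping terms listed in `keep`."""
    kept = {term.strip().lower() for term in keep}
    seen = set()
    terms = []
    for term in blocked_terms:
        cleaned = term.strip()
        key = cleaned.lower()
        # Drop empty entries, duplicates, and anything the prompt wants to keep.
        if key and key not in kept and key not in seen:
            seen.add(key)
            terms.append(cleaned)
    return ", ".join(terms)


negative_prompt = build_negative_prompt(
    ["text overlays", "logos", "distorted hands", "face warping", "camera shake"],
    keep=["camera shake"],  # this prompt wants handheld shake, so do not block it
)
print(negative_prompt)
```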

Compare this model

Review this example beside nearby engines before choosing a render path.

Why Wan 2.5 Text & Image to Video fits this shot

Wan 2.5 renders 5- or 10-second clips, with optional background audio and prompt expansion when you need extra detail.

Audio option

5s or 10s

480p–1080p
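
The limits listed above can be checked before submitting a render. The following is a minimal sketch that assumes resolution is expressed as a vertical pixel count and that any value between 480 and 1080 is acceptable; the page only states the range, so the exact set of supported resolutions is an assumption, and the function name is hypothetical.

```python
ALLOWED_DURATIONS_S = (5, 10)      # from the page: 5s or 10s
MIN_RES_P, MAX_RES_P = 480, 1080   # from the page: 480p to 1080p (range check is an assumption)


def check_settings(duration_s, resolution_p):
    """Return a list of human-readable problems; an empty list means the settings fit."""
    problems = []
    if duration_s not in ALLOWED_DURATIONS_S:
        problems.append(f"duration {duration_s}s is not one of {ALLOWED_DURATIONS_S}")
    if not MIN_RES_P <= resolution_p <= MAX_RES_P:
        problems.append(
            f"resolution {resolution_p}p is outside {MIN_RES_P}p to {MAX_RES_P}p"
        )
    return problems


# The 10-second duration matches this example; 1080p is assumed for illustration.
print(check_settings(10, 1080))
print(check_settings(8, 2160))
```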

Key frames

Opening frame
Motion beat
Final shot
