Wan 2.5 Text & Image to Video audio-enabled video example: street

Wan 2.5 Text & Image to Video · Text to video · 10s · 26:15 · Audio

This Wan 2.5 Text & Image to Video example is a text-to-video render of a street scene. It is a 10-second clip at a 26:15 aspect ratio with audio enabled.

Prompt

Ultra-realistic handheld selfie shot, filmed on a modern smartphone. A 30-year-old person stands in natural daylight, holding the phone at arm’s length. Slight camera shake, natural breathing motion, soft shadows on the face, detailed skin texture. The background is a real location: a quiet street with parked cars and warm evening light. The person speaks directly to the camera with a casual, natural tone. Audio: include real recorded ambience (soft wind, distant cars), realistic microphone pickup from a phone’s front mic. Lip sync must match the following line: “I’ve had a long month, but today feels different. I’m ready for a fresh start.” Mood: grounded, authentic, documentary-style realism. No filters, no smoothing, no beauty enhancement.

Render details

Workflow

Text-to-video workflow

10-second render at a 26:15 aspect ratio

Audio-enabled output

Realistic styling

Scene focus: street

Engine

Wan 2.5 Text & Image to Video

Wan 2.5 renders 5- or 10-second clips with optional background audio, plus prompt expansion when you need extra detail.

Audio option
5s or 10s
480p–1080p
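
The engine limits listed above (5s or 10s clips, 480p–1080p, optional audio) can be sketched as a small settings check. This is only an illustration: the `RenderSettings` class, its field names, and the `problems` helper are hypothetical and do not mirror any real Wan 2.5 API, and treating 480p–1080p as a continuous range of heights is an assumption.

```python
from dataclasses import dataclass

# Limits copied from the engine summary; everything else here is hypothetical.
ALLOWED_DURATIONS_S = (5, 10)
MIN_HEIGHT_P, MAX_HEIGHT_P = 480, 1080  # assumed to be an inclusive range


@dataclass(frozen=True)
class RenderSettings:
    mode: str            # e.g. "text-to-video"
    duration_s: int      # clip length in seconds
    height_p: int        # vertical resolution, e.g. 480 for 480p
    audio: bool = False  # audio is optional on this engine

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the settings fit."""
        issues = []
        if self.duration_s not in ALLOWED_DURATIONS_S:
            issues.append(f"duration must be one of {ALLOWED_DURATIONS_S} seconds")
        if not MIN_HEIGHT_P <= self.height_p <= MAX_HEIGHT_P:
            issues.append(f"height must be {MIN_HEIGHT_P}p to {MAX_HEIGHT_P}p")
        return issues


# Settings resembling this render. The page does not list a resolution,
# so 480 is a placeholder rather than a value taken from the source.
this_render = RenderSettings("text-to-video", duration_s=10, height_p=480, audio=True)
print(this_render.problems())
```

A check like this is mainly useful before remixing a render, since changing duration or resolution is where a setting is most likely to fall outside what the engine accepts.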

Specs

Engine: Wan 2.5 Text & Image to Video

Mode: Text to video

Duration: 10s

Aspect ratio: 26:15

Audio: Enabled

Render cost: $0.65

Created: 2025-11-16
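
The 26:15 aspect ratio looks unusual, but it is what a frame such as 832×480 reduces to. The sketch below shows that arithmetic, along with the average cost per second implied by the specs. The 832×480 frame size is an assumption (the page does not state the pixel dimensions), the function names are hypothetical, and the per-second figure is a simple average of this one render, not a published rate.

```python
from decimal import Decimal
from fractions import Fraction


def aspect_ratio(width: int, height: int) -> str:
    """Reduce a pixel size to its simplest width:height ratio."""
    ratio = Fraction(width, height)  # Fraction normalizes to lowest terms
    return f"{ratio.numerator}:{ratio.denominator}"


def cost_per_second(total_cost: str, duration_s: int) -> Decimal:
    """Average cost per second of output; Decimal avoids float rounding."""
    return Decimal(total_cost) / duration_s


# Assumed frame size: 832x480 is one size that reduces to 26:15.
print(aspect_ratio(832, 480))       # 26:15
# Render cost and duration as listed in the specs.
print(cost_per_second("0.65", 10))  # 0.065
```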

Recreate

Load this render in the workspace

Start from the same prompt and settings, then remix duration, aspect ratio, references, or audio.