Wan 2.6 Text & Image to Video
Audio enabled

Wan 2.6 Text & Image to Video camera movement example: studio push-in

This text-to-video example, rendered with Wan 2.6 Text & Image to Video, shows a slow studio push-in. It demonstrates audio-enabled output and camera motion control in a 10-second, 16:9, 720p render.

Wan 2.6 Text & Image to Video · Text to video · 10s · 16:9 · Audio enabled · $1.30

Prompt breakdown

Text-to-video prompt used to generate this render.

Subject: Wide 16:9 full-body unboxing video in a clean studio/kitchen setting. A person is fully visible (head-to-toe or at least head-to-knees) standing behind a minimalist tabletop. They unbox a small generic gadget from a pla…
Workflow: Text to video
Camera: Push In
Output: 10s · 16:9 · 720p
Estimated price: $1.30
Audio: Enabled
Constraints: Text To Video, Audio Enabled, Push In
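
The output settings listed above can be kept together as one small record, which makes it easy to derive the frame size from the aspect ratio and resolution tier. The following is only a sketch: `RenderSettings` and its field names are hypothetical and do not correspond to any real Wan 2.6 API, and it assumes "720p" means a frame height of 720 pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderSettings:
    """Hypothetical container for the settings on this page (not a real API)."""
    workflow: str
    camera: str
    duration_s: int
    aspect_ratio: str  # written as "W:H", e.g. "16:9"
    height_px: int     # assumption: "720p" means a 720-pixel frame height
    audio: bool

    def frame_size(self) -> tuple[int, int]:
        """Derive (width, height) in pixels from the aspect ratio and height."""
        w, h = (int(part) for part in self.aspect_ratio.split(":"))
        return self.height_px * w // h, self.height_px


settings = RenderSettings("text-to-video", "push-in", 10, "16:9", 720, True)
print(settings.frame_size())  # (1280, 720)
```

Under these assumptions, a 16:9 render at 720p works out to 1280 by 720 pixels.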

Full prompt

Wide 16:9 full-body unboxing video in a clean studio/kitchen setting. A person is fully visible (head-to-toe or at least head-to-knees) standing behind a minimalist tabletop. They unbox a small generic gadget from a plain matte cardboard box: peel the seal, open the lid, remove the inner tray, take out the device and accessories, and lay everything neatly on the table. The person occasionally lifts the item toward the camera for a closer look, then places it back down. Realism requirements: natural body proportions, stable identity, realistic skin and clothing fabric, no face warping, no unnatural limb bending. Hands must be highly realistic: correct finger count, natural grip, believable pressure/contact with the box and device, consistent shadows, no extra fingers, no “floating” objects. Keep object geometry stable, no wobbling background, minimal temporal flicker. Camera: single continuous shot, tripod-stable, slight cinematic push-in (very slow), eye-level or slightly above table height. Natural soft daylight, clean shadows, realistic materials and textures. No logos, no brand names, no watermarks. No subtitles. Optional on-screen title at the top (perfectly readable and stable, no jitter): "UNBOXING — FIRST LOOK"

Prompt improvement notes

Note 1

Keep the subject, camera move, lighting, duration, aspect ratio and audio requirement grouped so the render has one clear production brief.
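
One way to keep the brief grouped is to assemble the prompt from named fields instead of editing one long string. This is a minimal sketch: `build_brief` and its parameters are hypothetical names, the field wording is taken from the prompt on this page, and the "Camera:" and "Lighting:" labels are an illustrative convention rather than syntax the model requires.

```python
def build_brief(subject, camera, lighting, duration_s, aspect_ratio, audio):
    """Join the grouped fields into a single prompt string (illustrative only)."""
    parts = [
        subject.strip(),
        f"Camera: {camera.strip()}",
        f"Lighting: {lighting.strip()}",
        f"Duration: {duration_s}s, aspect ratio {aspect_ratio}",
        "Audio: enabled" if audio else "Audio: none",
    ]
    # End every part with exactly one period so the brief reads as sentences.
    return " ".join(p if p.endswith(".") else p + "." for p in parts)


brief = build_brief(
    subject="Wide 16:9 full-body unboxing video in a clean studio/kitchen setting.",
    camera="single continuous shot, tripod-stable, slight cinematic push-in (very slow)",
    lighting="natural soft daylight, clean shadows",
    duration_s=10,
    aspect_ratio="16:9",
    audio=True,
)
print(brief)
```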

Note 2

Change one variable at a time when cloning this prompt: model, duration, camera motion or reference input. That makes quality and price differences easier to compare.
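
The one-variable-at-a-time rule can be sketched as a small helper that produces variants differing from the base render in exactly one setting. The helper name, the dictionary keys, and the alternative values (`5` seconds, a `"static"` camera) are assumptions chosen for illustration, not options confirmed for this model.

```python
def one_at_a_time(base, alternatives):
    """Return variants of `base` that each change exactly one setting."""
    variants = []
    for key, values in alternatives.items():
        for value in values:
            if value != base[key]:
                # Copy the base and override only the current key.
                variants.append({**base, key: value})
    return variants


base = {
    "model": "Wan 2.6 Text & Image to Video",
    "duration_s": 10,
    "camera": "push-in",
    "reference": None,
}
# Hypothetical alternatives to compare against the base render.
alternatives = {"duration_s": [5, 10], "camera": ["push-in", "static"]}

variants = one_at_a_time(base, alternatives)
for variant in variants:
    changed = [k for k in base if variant[k] != base[k]]
    print(changed, variant)
```

Because each variant differs from the base in a single key, any change in quality or price can be attributed to that key.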

Note 3

Add a short negative prompt if you need to block text overlays, logos, distorted hands, face warping or unwanted camera shake.
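
A short negative prompt can be built from a list of blocked terms. This sketch assumes the render tool accepts a comma-separated string, which is not confirmed on this page; `negative_prompt` is a hypothetical helper, and the terms come from the note above.

```python
def negative_prompt(blocked):
    """Build a comma-separated negative prompt, dropping blanks and duplicates."""
    seen = []
    for term in blocked:
        term = term.strip().lower()
        if term and term not in seen:
            seen.append(term)
    return ", ".join(seen)


NEGATIVE = negative_prompt([
    "text overlays",
    "logos",
    "distorted hands",
    "face warping",
    "camera shake",
    "Logos",  # duplicate in different case; removed by the helper
])
print(NEGATIVE)
```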

Compare this model

Review this example alongside comparable models before choosing a render path.

Why Wan 2.6 Text & Image to Video fits this shot

Wan 2.6 combines text-to-video, image-to-video, and reference-to-video in a single model card, with multi-shot prompting and 720p and 1080p output tiers.

Text prompts

Image input

Reference video

Key frames: opening frame, motion beat, final shot
