This is an AI video generation comparison for the following image-to-video prompt:
Starts with a close-up of a single purple-yellow slide resting upright on a semi-reflective floor, identical to the one shown. Another matching slide drops gently beside it, completing the pair. After a brief pause, two bare feminine feet enter the frame from above and step into the slides one by one, clearly showing the motion of putting them on. The camera follows this action smoothly, then slowly zooms out as the person walks from left to right, taking three relaxed steps before stopping and ...
Tested: November 4, 2025
Very good. This model has a solid idea of how the slides are put on, and it also knows to begin with bare feet and 2 slides and end with 2 feet inside 2 slides. It sounds simple, but for some models it is challenging.
Tested: November 4, 2025
This model isn't that great at putting on slides. In another test the woman simply remains barefoot, with the footwear left sitting on the floor.
Tested: November 4, 2025
This was made with 2 frames. Very good.
Tested: November 4, 2025
Start and end frames were used. But if the model doesn't know how to render the action, frames often won't help, because the issue is the in-between motion, not the end result.
Tested: November 4, 2025
Seedance made a small mistake here as the first slide was going on, but in general it seems capable of rendering this flawlessly given more tries.
Tested: November 4, 2025
Frames or not, PixVerse 5 just doesn't seem to know how people put slides on.
Tested: November 4, 2025
The Pro variant of 2.3 still struggles with the physics of putting on slides, just like the Fast version.
Tested: November 5, 2025
OK, this is quite funny.
Tested: November 5, 2025
Wan is struggling with this one. Even when it gets the beginning right, there are 2 extra slides. I tried a 'putting/placing' slides on variation and tried adding 'Final shot is the woman wearing the slides smiling, the floor around her is clean having no other objects'; this simply makes the extra footwear disappear at the end but doesn't prevent it from appearing.
Tested: November 6, 2025
This model is totally not ready for this task.
So which AI video model can realistically show slides being put on?
Does the second matching slide drop beside it smoothly and without clipping or jitter?
Are the slides identical in design, color, and proportions to the reference image?
Do the bare feet enter from above - are they bare? - and do they step into the slides naturally, one after the other?
Is the stepping motion anatomically correct, with toes and arches bending realistically? Are there five toes on each foot?
Do the slides deform slightly or react believably when the feet enter?
Is the floor semi-reflective, with consistent reflections and shadows under the slides?
Is the walking sequence smooth, with a couple of visible relaxed steps before stopping?
Does the woman turn naturally toward the camera at the end of the shot and smile?
Check out the results from GROK (Grok Imagine v0.9) vs Freepik (Kling 2.5 Turbo Standard) vs Wan (Online Platform) (Wan 2.2) vs Freepik (Kling 2.1) vs Freepik (Seedance 1.0 Pro Fast) vs PixVerse (PixVerse V5) vs Fal AI (Hailuo 2.3) vs Freepik (Hailuo 2.3) vs Wan (Online Platform) (Wan2.5 Preview) vs Fal AI (LTX-2) for similar or identical prompts side-by-side.