LatentSync
LatentSync by ByteDance is a tool for creating lifelike lip-synced videos directly from audio input.
Overview
LatentSync is an innovative tool for creating lifelike lip-synced videos directly from audio input. Unlike earlier approaches, it skips complex intermediate representations such as 3D models or facial landmarks. Instead, it uses latent diffusion models, which generate frames directly while maintaining temporal consistency from frame to frame.
The system was developed by ByteDance, the company behind TikTok, in partnership with researchers at Beijing Jiaotong University.
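To make the latent-diffusion idea above concrete, here is a toy sketch of an audio-conditioned denoising loop. Everything in it (function names, shapes, the fake noise predictions, the crude schedule) is hypothetical and only illustrates the data flow: a noisy latent is repeatedly refined under guidance from audio features, then would be decoded into video frames. It is not LatentSync's actual code or API.

```python
import numpy as np

# Toy sketch of audio-conditioned latent denoising (illustrative only;
# names, shapes, and the noise schedule are hypothetical, not LatentSync's).

def denoise_step(latent, audio_emb, t, total_steps, guidance_scale=1.5):
    """One simplified reverse-diffusion step.

    A real model would predict noise with a U-Net conditioned on the
    audio embedding; here the predictions are stand-ins that just show
    how the conditioning and guidance combine.
    """
    uncond_pred = 0.1 * latent                       # stand-in: unconditional estimate
    cond_pred = 0.1 * latent - 0.05 * audio_emb      # stand-in: audio-conditioned estimate
    # Classifier-free guidance: push the estimate toward the conditioned one.
    noise_pred = uncond_pred + guidance_scale * (cond_pred - uncond_pred)
    alpha = 1.0 - t / total_steps                    # crude, made-up schedule
    return latent - alpha * noise_pred

rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 8, 8))      # one latent video frame (C, H, W)
audio_emb = rng.standard_normal((4, 8, 8))   # audio features broadcast to the latent

steps = 20
for t in range(steps):
    latent = denoise_step(latent, audio_emb, t, steps)

print(latent.shape)  # the refined latent would then be decoded by a VAE
```

The point of working in latent space is that each step operates on a small tensor rather than full-resolution pixels, which is what lets the model keep per-frame quality and temporal consistency tractable.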
The tool is available for download on GitHub, for testing on fal.ai, and within ComfyUI (see the wrapper links below).
The first tests are looking quite promising.
LatentSync now also comes with a Gradio UI.
You can try adjusting the following inference parameters to achieve better results:
inference_steps [20-50]: A higher value improves visual quality but slows down generation.
guidance_scale [1.0-3.0]: A higher value improves lip-sync accuracy but may cause video distortion or jitter.
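The two parameters above trade off against each other, so it can help to keep named presets. The helper below is a hypothetical convenience wrapper (not part of LatentSync) that just validates the documented ranges and bundles the values:

```python
# Hypothetical helper around the two documented inference knobs.
# The argument names mirror the parameters above; the function itself
# is illustrative and not part of the LatentSync codebase.

def make_inference_config(inference_steps=20, guidance_scale=1.5):
    if not 20 <= inference_steps <= 50:
        raise ValueError("inference_steps should be in [20, 50]")
    if not 1.0 <= guidance_scale <= 3.0:
        raise ValueError("guidance_scale should be in [1.0, 3.0]")
    return {"inference_steps": inference_steps, "guidance_scale": guidance_scale}

# Quality-leaning preset: slower generation, sharper frames.
quality = make_inference_config(inference_steps=50, guidance_scale=1.2)
# Sync-leaning preset: tighter lip sync, at some risk of jitter.
sync = make_inference_config(inference_steps=30, guidance_scale=2.5)
print(quality, sync)
```

A reasonable workflow is to start at the low end of both ranges for fast iteration, then raise inference_steps for final renders and nudge guidance_scale up only if the lips drift out of sync.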
Tags
Freeware · Apache License 2.0 · PC-based · #Video & Animation
Generated on April 22, 2025:
A static, motionless video, lip-synced.
Generated on January 6, 2025:
Generated on January 5, 2025:
Take a bite of these muffins...
Generated on January 5, 2025:
Jazz blues song + digital illustration style video
Generated on January 5, 2025:
Kling's video + Udio's vocals.
Useful Links
ComfyUI LatentSync Wrapper
Other
This node provides lip-sync capabilities in ComfyUI using ByteDance's LatentSync model, letting you synchronize a video's lip movements with an audio input.
This page was last updated on April 19, 2025 at 8:01 PM