open-mmlab/FoleyCrafter
[IJCV] FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
Uses a modular pipeline combining semantic video understanding (via adapter networks), text-to-audio diffusion (Auffusion base model), and temporal synchronization through timestamp detection to align sound effects with visual events. Incorporates a dedicated vocoder for high-quality audio synthesis and supports both semantic-driven generation and frame-level temporal alignment modes for precise audio-visual synchronization.
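As a rough illustration of what frame-level temporal alignment can mean, the sketch below turns a list of detected sound-onset timestamps into a binary per-frame mask. It is not taken from the FoleyCrafter codebase: the function name `onsets_to_frame_mask`, its parameters, and the binary-mask representation are all assumptions made for this example.

```python
from typing import List, Sequence


def onsets_to_frame_mask(onsets_s: Sequence[float], duration_s: float, fps: float) -> List[int]:
    """Hypothetical helper: mark the video frames in which a sound event starts.

    onsets_s   -- detected event start times, in seconds
    duration_s -- clip length, in seconds
    fps        -- video frame rate
    """
    n_frames = int(round(duration_s * fps))
    mask = [0] * n_frames
    for t in onsets_s:
        # The frame containing time t is floor(t * fps).
        idx = int(t * fps)
        # Ignore timestamps that fall outside the clip.
        if 0 <= idx < n_frames:
            mask[idx] = 1
    return mask
```

A mask like this could serve as a conditioning signal that tells an audio generator when events should occur, while the semantic branch decides what they should sound like.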
644 stars. No commits in the last 6 months.
Stars: 644
Forks: 66
Language: Python
License: Apache-2.0
Category:
Last pushed: Jul 26, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/open-mmlab/FoleyCrafter"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
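The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the path segments after `quality/` are category, owner, and repository (inferred from the single curl URL above), and that the endpoint returns JSON. The function names `quality_url` and `fetch_quality` are made up for this example.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build a URL shaped like the curl example (segment meaning is assumed)."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON (response format is assumed)."""
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request):
# data = fetch_quality("diffusion", "open-mmlab", "FoleyCrafter")
```

Since the response schema is not documented here, inspect the returned object before relying on any particular field.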
Related models
kyegomez/Sora
Implementation of the premier Text to Video model from OpenAI
ShubhamZade1997/Pixelle-Video
🎥 Create stunning short videos effortlessly with Pixelle-Video, an AI-driven engine designed for...
CharlieDreemur/AI-Video-Converter
AI Video Converter Based on ControlNet
abrahamjroy/ProcessedElectricSheepDreams
A native application with agentic support (MCP) for ultra-fast AI image generation using a...
donahowe/AutoStudio
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation