pyvideotrans and SoniTranslate
Both are open-source tools that translate video content and generate dubbed audio and subtitles, making them direct competitors with largely overlapping core functionality.
About pyvideotrans
jianchang512/pyvideotrans
Translate the video from one language to another and embed dubbing & subtitles.
Combines speech recognition, LLM-based translation, and text-to-speech synthesis in a unified pipeline with support for speaker diarization and multi-role dubbing. Integrates pluggable ASR models (Faster-Whisper, Qwen, WhisperX), translation backends (DeepSeek, ChatGPT, Ollama), and TTS engines (Edge-TTS, F5-TTS, CosyVoice), enabling both cloud API and fully local offline workflows. Provides interactive editing checkpoints throughout the translation chain plus a CLI interface for headless batch processing and server deployment.
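The pluggable ASR, translation, and TTS chain described above can be sketched as plain callables composed into one pipeline. The function names and signatures below are hypothetical illustrations of the architecture, not the actual pyvideotrans API.

```python
# Hypothetical sketch of a pluggable ASR -> translate -> TTS pipeline,
# modeled on the architecture described above; names are illustrative,
# not the actual pyvideotrans API.

from typing import Callable, List

# Each stage is a plain callable, so cloud and local backends are
# interchangeable as long as they honor the same signature.
AsrFn = Callable[[str], List[str]]    # audio path -> transcript lines
TranslateFn = Callable[[str], str]    # source line -> target line
TtsFn = Callable[[str], bytes]        # target line -> synthesized audio

def dub_pipeline(audio_path: str, asr: AsrFn,
                 translate: TranslateFn, tts: TtsFn) -> List[bytes]:
    """Run recognition, translation, and synthesis in sequence."""
    lines = asr(audio_path)
    translated = [translate(line) for line in lines]
    return [tts(line) for line in translated]

# Stub backends standing in for e.g. Faster-Whisper, an LLM translator,
# and Edge-TTS; a real integration would call those libraries here.
fake_asr = lambda path: ["hello world"]
fake_translate = lambda s: s.upper()    # placeholder "translation"
fake_tts = lambda s: s.encode("utf-8")  # placeholder audio bytes

clips = dub_pipeline("input.wav", fake_asr, fake_translate, fake_tts)
```

Keeping each stage behind a uniform callable interface is what lets a tool like this swap cloud APIs for fully local models without touching the pipeline itself.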
About SoniTranslate
R3gm/SoniTranslate
Synchronized Translation for Videos. Video dubbing
Extracts audio from videos using speaker diarization (Pyannote), translates transcribed content across 100+ languages, and synthesizes new audio with voice cloning to match original speaker characteristics. Built on Gradio for the web interface, it integrates Hugging Face models for speech recognition and generation, with GPU acceleration via CUDA for real-time processing. Deployable locally, on Google Colab, or via Hugging Face Spaces for browser-based video dubbing without manual subtitle editing.
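The multi-speaker flow described above depends on mapping diarization labels to distinct synthesis voices. The sketch below shows one simple way such an assignment could work; the data shapes and names are illustrative assumptions, not SoniTranslate's actual implementation (which uses Pyannote for diarization).

```python
# Hypothetical sketch of how diarization output can drive per-speaker
# voice assignment for dubbing; shapes and names are illustrative,
# not SoniTranslate's API.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Segment:
    start: float   # seconds
    end: float
    speaker: str   # diarization label, e.g. "SPEAKER_00"
    text: str      # translated line for this segment

def assign_voices(segments: List[Segment],
                  voices: List[str]) -> Dict[str, str]:
    """Map each diarized speaker label to a TTS voice, round-robin."""
    mapping: Dict[str, str] = {}
    for seg in segments:
        if seg.speaker not in mapping:
            mapping[seg.speaker] = voices[len(mapping) % len(voices)]
    return mapping

segments = [
    Segment(0.0, 2.1, "SPEAKER_00", "Hello"),
    Segment(2.1, 4.0, "SPEAKER_01", "Hi there"),
    Segment(4.0, 6.5, "SPEAKER_00", "How are you?"),
]
mapping = assign_voices(segments, ["voice_a", "voice_b"])
# Each recurring speaker keeps the same voice across all their segments.
```

With voice cloning, the round-robin fallback would instead be replaced by conditioning each voice on audio sampled from that speaker's own segments.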