readbeyond/aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

/ 100

Established

Uses FFmpeg for audio processing and eSpeak for text-to-speech synthesis to compute frame-level alignments, writing results in 10+ formats including SMIL for EPUB 3, WebVTT for captions, and research formats such as ELAN and TextGrid. Works as both a Python library and a command-line tool, with batch job processing via ZIP containers and support for multiple languages and text input types (plain text, HTML with ID markers, structured formats).
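As a rough illustration of the library side, the sketch below builds a task configuration string and runs a single alignment. The aeneas calls follow the usage shown in the project's README. The helper names build_task_config and align are hypothetical and exist only for this example, and every file path passed in is a placeholder. Running align assumes aeneas, FFmpeg, and eSpeak are installed.

```python
def build_task_config(language="eng", text_type="plain", output_format="json"):
    """Build an aeneas task configuration string (key=value pairs joined by '|').

    Hypothetical helper for this example; the three keys are aeneas task parameters.
    """
    return "|".join([
        "task_language=%s" % language,
        "is_text_type=%s" % text_type,
        "os_task_file_format=%s" % output_format,
    ])


def align(audio_path, text_path, sync_map_path, config_string):
    """Run one alignment task and write the sync map to sync_map_path.

    Hypothetical wrapper; paths are expected to be absolute.
    """
    # Imported lazily so build_task_config stays usable without aeneas installed.
    from aeneas.executetask import ExecuteTask
    from aeneas.task import Task

    task = Task(config_string=config_string)
    task.audio_file_path_absolute = audio_path
    task.text_file_path_absolute = text_path
    task.sync_map_file_path_absolute = sync_map_path

    ExecuteTask(task).execute()   # compute the alignment
    task.output_sync_map_file()   # write it in the configured format
    return sync_map_path
```

For example, build_task_config(output_format="vtt") requests WebVTT output instead of JSON. Some formats, SMIL among them, need additional configuration keys that this sketch does not set.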

2,811 stars and 4,158 monthly downloads. No commits in the last 6 months. Available on PyPI.

Stale 6m, No Dependents

Maintenance 0 / 25

Adoption 18 / 25

Maturity 25 / 25

Community 20 / 25


Stars

2,811

Forks

270

Language

Python

License

AGPL-3.0

Featured in

Things AI Won't Tell You About Building a Voice App

Compare

aeneas and ForcedAlignment

Related tools

fgnt/meeteval

MeetEval - A meeting transcription evaluation toolkit

analyticsinmotion/werpy

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for...

kahne/fastwer

A PyPI package for fast word/character error rate (WER/CER) calculation

tabahi/bournemouth-forced-aligner

Extract phoneme-level timestamps from speech audio.

wq2012/SimpleDER

A lightweight library to compute Diarization Error Rate (DER).
