astorfi/3D-convolutional-speaker-recognition

Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

/ 100

Established

Implements text-independent speaker verification with 3D convolutional neural networks (3D-CNNs) that jointly model temporal and spectral information in speech utterances, capturing both speaker identity and within-speaker variation. The input is MFEC features (log filterbank energies, computed like MFCCs but without the final DCT step) extracted from overlapping 20 ms windows. Several utterances from the same speaker are fed through the network together, so the speaker model is created directly rather than by averaging d-vectors. The project is built on TensorFlow with the Slim API and follows a three-phase protocol: development (utterance-level speaker classification), enrollment (feature extraction to build each speaker model), and evaluation (comparison of test utterances against the stored models).
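The feature pipeline described above can be sketched in plain NumPy. This is an illustrative sketch, not the repository's code: the function names (`mfec`, `speaker_cube`), the 16 kHz sample rate, the 10 ms hop, the 40 mel filters, the 512-point FFT, and the 80-frame crop are all assumptions chosen for the example. It shows log mel-filterbank energies computed over overlapping 20 ms windows with no DCT applied, then several utterances stacked into one 3D input.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_filters, fft_size, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), num_filters + 2)
    bins = np.floor((fft_size + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((num_filters, fft_size // 2 + 1))
    for i in range(num_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):    # rising edge of the triangle
            bank[i, k] = (k - left) / (center - left)
        for k in range(center, right):   # falling edge of the triangle
            bank[i, k] = (right - k) / (right - center)
    return bank


def mfec(signal, sample_rate=16000, win_ms=20, hop_ms=10,
         num_filters=40, fft_size=512):
    """Log mel-filterbank energies per frame (no DCT). Shape: (frames, filters)."""
    win = int(sample_rate * win_ms / 1000)   # 20 ms window (from the description)
    hop = int(sample_rate * hop_ms / 1000)   # 10 ms hop (assumed overlap)
    num_frames = 1 + (len(signal) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(num_frames)[:, None]
    frames = signal[idx] * np.hamming(win)
    power = np.abs(np.fft.rfft(frames, n=fft_size, axis=1)) ** 2 / fft_size
    energies = power @ mel_filterbank(num_filters, fft_size, sample_rate).T
    return np.log(np.maximum(energies, 1e-10))  # floor avoids log(0)


def speaker_cube(utterances, num_frames=80, **kwargs):
    """Stack several utterances of one speaker into (utterances, frames, filters)."""
    feats = [mfec(u, **kwargs)[:num_frames] for u in utterances]
    return np.stack(feats, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three synthetic one-second "utterances" stand in for real speech.
    utterances = [rng.standard_normal(16000) for _ in range(3)]
    print(speaker_cube(utterances).shape)
```

With three one-second inputs at 16 kHz, the stacked array has shape (3, 80, 40): one slice per utterance, which is the kind of multi-utterance input a 3D-CNN can convolve over. Each utterance must yield at least `num_frames` frames for the stacking to succeed.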

792 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 25 / 25


Stars

792

Forks

267

Language

Python

License

Apache-2.0

Related frameworks

felixbur/nkululeko

Machine learning speaker characteristics

claritychallenge/clarity

Clarity Challenge toolkit - software for building Clarity Challenge systems

juanmc2005/diart

A python package to build AI-powered real-time audio applications

wq2012/awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

hitachi-speech/EEND

End-to-End Neural Diarization
