ManaTTS-Persian-Speech-Dataset and GPTInformal-Persian-Speech-Dataset
These are complementary datasets for Persian text-to-speech (TTS) development. ManaTTS provides the larger foundation dataset (114+ hours) for training robust models, while GPTInformal-Persian-Speech-Dataset offers a smaller, specialized dataset (6+ hours) with semantic labeling (subject metadata) for fine-tuning or domain-specific TTS applications.
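To illustrate how subject metadata could support domain-specific fine-tuning, here is a minimal sketch that picks a subject-filtered subset from a list of audio-text records. The `Utterance` fields, the file paths, the subject names, and the sample records are all hypothetical placeholders; neither dataset's actual schema is described here, so treat this as an assumption-laden example rather than the datasets' real layout.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """One audio-text pair with a subject label (hypothetical schema)."""
    audio_path: str
    text: str
    subject: str
    duration_s: float


def select_by_subject(records, subjects):
    """Return the records whose subject matches any requested subject (case-insensitive)."""
    wanted = {s.lower() for s in subjects}
    return [r for r in records if r.subject.lower() in wanted]


def total_hours(records):
    """Sum clip durations (seconds) and convert to hours."""
    return sum(r.duration_s for r in records) / 3600.0


# Placeholder records for illustration only, not real dataset entries.
records = [
    Utterance("clips/0001.wav", "placeholder transcript 1", "cooking", 5.0),
    Utterance("clips/0002.wav", "placeholder transcript 2", "travel", 7.0),
    Utterance("clips/0003.wav", "placeholder transcript 3", "Cooking", 6.0),
]

# Keep only the clips for the target domain before fine-tuning.
subset = select_by_subject(records, ["cooking"])
print(len(subset), "clips,", round(total_hours(subset) * 3600), "seconds")
```

In a real workflow, the filtered subset would be passed to whatever fine-tuning pipeline is in use, after a base model has been trained on the larger corpus.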
About ManaTTS-Persian-Speech-Dataset
MahtaFetrat/ManaTTS-Persian-Speech-Dataset
ManaTTS is the largest open Persian speech dataset, with 114+ hours of transcribed audio. The repository also includes the data collection pipeline and tools. The dataset is suitable for training Persian text-to-speech models.
About GPTInformal-Persian-Speech-Dataset
MahtaFetrat/GPTInformal-Persian-Speech-Dataset
A freely licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject