ManaTTS-Persian-Speech-Dataset and GPTInformal-Persian-Speech-Dataset

These are complementary datasets for Persian text-to-speech development. ManaTTS is the larger foundation dataset (114+ hours), intended for training robust models. GPTInformal-Persian-Speech-Dataset is a smaller, specialized dataset (6+ hours) with semantic labeling (subject metadata), intended for fine-tuning or domain-specific TTS applications.

Metric            ManaTTS-Persian-Speech-Dataset        GPTInformal-Persian-Speech-Dataset
Maintenance       2/25                                  2/25
Adoption          8/25                                  5/25
Maturity          16/25                                 9/25
Community         10/25                                 7/25
Stars             49                                    10
Forks             5                                     1
Downloads
Commits (30d)     0                                     0
Language          Jupyter Notebook
License           MIT                                   MIT
Status            Stale 6m, No Package, No Dependents   Stale 6m, No Package, No Dependents

About ManaTTS-Persian-Speech-Dataset

MahtaFetrat/ManaTTS-Persian-Speech-Dataset

ManaTTS is the largest open Persian speech dataset, with 114+ hours of transcribed audio. The repository includes the data collection pipeline and tools, and the dataset is suitable for training Persian text-to-speech models.

About GPTInformal-Persian-Speech-Dataset

MahtaFetrat/GPTInformal-Persian-Speech-Dataset

A freely licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject

Scores updated daily from GitHub, PyPI, and npm data.