danthelion/doc2audiobook
Convert text documents to high-fidelity audio(books).
Supports 30+ input document formats (PDF, DOCX, EPUB, images with OCR, etc.) via textract, then synthesizes audio using Google Cloud's WaveNet models for natural-sounding speech. Runs containerized with Docker, mapping local input/output directories and requiring GCP authentication via service account credentials. Offers flexible voice selection across multiple languages and speaker profiles through command-line configuration.
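The extract-then-synthesize flow described above can be sketched roughly as follows. This is a minimal illustration, not doc2audiobook's actual code: the function names `chunk_text` and `document_to_mp3` are hypothetical, the voice `en-US-Wavenet-D` is just one example voice, and the 5000-byte per-request limit is an assumption that should be checked against the current Text-to-Speech quotas. It assumes the `textract` and `google-cloud-texttospeech` packages are installed and that `GOOGLE_APPLICATION_CREDENTIALS` points at a service account key.

```python
def chunk_text(text, max_bytes=5000):
    """Split text on whitespace into chunks of at most max_bytes UTF-8 bytes.

    A single word longer than max_bytes is emitted as its own oversized
    chunk; this sketch does not split inside words.
    """
    chunks = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def document_to_mp3(input_path, output_path,
                    voice_name="en-US-Wavenet-D", language_code="en-US"):
    """Hypothetical helper: extract text with textract, synthesize with WaveNet."""
    # Imported lazily so chunk_text works without these packages installed.
    import textract
    from google.cloud import texttospeech

    # textract.process returns the extracted text as bytes.
    text = textract.process(input_path).decode("utf-8")

    # The client picks up credentials from GOOGLE_APPLICATION_CREDENTIALS.
    client = texttospeech.TextToSpeechClient()
    voice = texttospeech.VoiceSelectionParams(
        language_code=language_code, name=voice_name
    )
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )

    # Synthesize chunk by chunk and append the MP3 bytes to one file.
    with open(output_path, "wb") as out:
        for chunk in chunk_text(text):
            response = client.synthesize_speech(
                input=texttospeech.SynthesisInput(text=chunk),
                voice=voice,
                audio_config=audio_config,
            )
            out.write(response.audio_content)
```

Appending MP3 segments back to back usually plays fine, but a production tool might prefer to join the audio with a dedicated library. Splitting on whitespace keeps the sketch short; splitting on sentence boundaries would likely give more natural pauses between chunks.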
204 stars. No commits in the last 6 months.
Stars: 204
Forks: 34
Language: Python
License: MIT
Category:
Last pushed: Jan 17, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/danthelion/doc2audiobook"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
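The same request can be made from Python using only the standard library. This is a sketch under assumptions: the `category/owner/repo` path layout is inferred from the single curl URL above, the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON (the page does not document its format).

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example; the path layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = [urllib.parse.quote(s, safe="") for s in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record and parse it, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request against the public endpoint.
    print(fetch_quality("voice-ai", "danthelion", "doc2audiobook"))
```

With the anonymous limit of 100 requests/day, a script that polls many repositories may want to cache responses locally.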
Higher-rated alternatives
NVIDIA-AI-Blueprints/pdf-to-podcast
Transform PDFs into AI podcasts for engaging on-the-go audio content.
tjunttila/pdf2video
A tool for making videos from PDF presentations.
chaonan99/ppt_presenter
Convert ppt to video with audio track, using text to speech synthesis
eminemahjoub/pdf-voice-reader
"PDF Reader: A Python application for seamless PDF viewing with enhanced text-to-speech capabilities."
hutchresearch/latex2speech
TeX2Speech is an application that turns LaTeX documents into spoken audio.