Showing 41-60 of 130 projects
An open-source Chinese voice assistant project that supports ChatGPT-like conversational abilities and brain-computer interface integration.
A high-performance text-to-speech, speech-to-text, and speech-to-speech library for Apple Silicon devices.
An open-source, tokenizer-free text-to-speech (TTS) model for context-aware speech generation and voice cloning.
Orpheus-TTS is a high-quality, real-time text-to-speech library for creating human-sounding AI voices.
A Python library that generates audiobooks from eBooks, enabling developers to create audio content experiences.
Pre-trained text-to-speech models for various languages, made simple to use.
Inference and training library for high-quality text-to-speech (TTS) models.
A lightweight on-device TTS model for AI-powered voice applications and tools.
DiffSinger is a singing voice synthesis system using a shallow diffusion mechanism, enabling efficient TTS and SVS.
A Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model with CPU and GPU support.
An Android TTS app with support for the Microsoft TTS engine, HTTP requests, local TTS engines, and more.
A real-time, state-of-the-art speech synthesis library for TensorFlow 2, supporting multiple languages.
A large language model-powered virtual salesperson that can generate product descriptions to drive user purchases.
A lightweight, fast, and efficient text-to-speech library for developers who need to add voice functionality to their projects.
A comprehensive collection of research papers on automatic speech recognition, speech synthesis, and related topics.
A high-quality voice conversion tool focused on ease of use and performance for AI-powered audio applications.
A versatile WebUI for various AI-powered text-to-speech engines, letting users explore and use a range of audio generation tools.
An unofficial PyTorch implementation of the audio LM VALL-E, a text-to-speech AI model.
A TensorFlow implementation of Google's Tacotron speech synthesis with a pre-trained model.
A Python/C library and toolkit for automatically synchronizing audio and text (forced alignment).