Showing 21-40 of 124 projects
End-to-end speech processing toolkit for tasks like speech recognition, synthesis, translation, and more.
Amphion is a toolkit for Audio, Music, and Speech Generation to support reproducible research.
Bert-VITS2 is a Python library that implements the VITS2 backbone with multilingual-BERT for text-to-speech synthesis.
A Jupyter Notebook project for zero-shot speech editing and text-to-speech using AI models.
EmotiVoice is a multi-voice and prompt-controlled TTS engine built with PyTorch for developers working with AI voice tools.
An open-source implementation of Microsoft's VALL-E X zero-shot text-to-speech model, enabling voice cloning and emotional speech synthesis.
A PyTorch-based text-to-speech model that generates high-quality speech with expressive prosody.
A simple native web interface for ChatTTS text-to-speech synthesis with API support.
High-quality multilingual text-to-speech library supporting English, Spanish, French, Chinese, Japanese, and Korean.
Zonos is an open-source, high-quality text-to-speech model for developers building AI-powered applications.
A high-performance text-to-speech, speech-to-text, and speech-to-speech library for Apple Silicon devices.
An open-source, tokenizer-free text-to-speech (TTS) model for context-aware speech generation and voice cloning.
Orpheus-TTS is a high-quality, real-time text-to-speech library for creating human-sounding AI voices.
Pre-trained text-to-speech models for various languages, made simple to use.
Inference and training library for high-quality text-to-speech (TTS) models.
This repository provides a curated collection of resources for Prompt Engineering with a focus on large language models like ChatGPT and GPT-3.
A PyTorch implementation of Tacotron 2, a state-of-the-art text-to-speech model, with faster-than-realtime inference.
DiffSinger is a singing voice synthesis (SVS) system that uses a shallow diffusion mechanism for efficient SVS and text-to-speech (TTS).
A Python library for building real-time communication applications using AI tools like speech-to-text and text-to-speech.
A Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model with CPU and GPU support.