Real-time voice cloning using deep learning
Few-shot voice cloning and TTS with 1 min training data
Fine-tuning & RL for LLMs with optimized performance and memory use
🐸TTS is a deep learning toolkit for advanced Text-to-Speech generation with 1100+ languages and tools for training and fine-tuning models.
Instant voice cloning model with tone color cloning and multi-lingual support
Multilingual voice generation model with full-stack capabilities for TTS, training, and deployment
An efficient zero-shot text-to-speech system with fine-grained control over the generated voice.
Converts e-books to audiobooks using AI voice cloning, with support for more than 1,158 languages.
Fully automated AI video subtitle team with one-click subtitle cutting, translation, alignment, and dubbing.
Open-source speech toolkit with state-of-the-art ASR, TTS, translation, and audio processing capabilities.
Open-source voice synthesis studio powered by Qwen3-TTS
An open-source implementation of Microsoft's VALL-E X zero-shot text-to-speech model, enabling voice cloning and emotional speech synthesis.
Open-source full-song music generation foundation model for developers building AI-powered audio applications.
An open-source, tokenizer-free text-to-speech (TTS) model for context-aware speech generation and voice cloning.
A high-quality voice conversion tool focused on ease of use and performance for AI-powered audio applications.
A Python/PyTorch app for easily synthesising human voices
A GPT-SoVITS ONNX Inference Engine & Model Converter to enable voice cloning and text-to-speech for developers.
A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Official server for the MiniMax Model Context Protocol (MCP) that enables powerful AI capabilities like text-to-speech, image generation, and video generation.