Model framework for state-of-the-art ML models in text, vision, audio, and multimodal tasks.
Robust speech recognition model for multilingual tasks
Comprehensive Chinese NLP resource collection for developers
Real-time voice cloning using deep learning
Few-shot voice cloning and TTS with 1 min training data
Fine-tuning & RL for LLMs with optimized performance and memory use
High-performance C/C++ port of OpenAI's Whisper for speech recognition
🐸TTS is a deep learning toolkit for advanced Text-to-Speech generation, with support for 1100+ languages and tools for training and fine-tuning models.
Text-to-audio model for generating realistic speech and sounds
Generates natural speech for dialogue scenarios
Voice cloning tool for real-time speech generation
Instant voice cloning model with tone color cloning and multi-lingual support
Singing voice conversion framework using AI
Offline speech-to-text engine for real-time on-device use
FishAudio-S1 is a high-quality open-source TTS model with voice cloning capabilities.
On-device multimodal LLM for vision, speech, and live streaming on phones
Open-source voice AI models for speech synthesis and recognition
Faster Whisper transcription with CTranslate2 for efficient speech-to-text
AI-powered dataset management and preprocessing library for ML projects
WhisperX for fast ASR with word-level timestamps and diarization