🐸TTS is a deep learning toolkit for advanced Text-to-Speech generation, with support for 1100+ languages and tools for training and fine-tuning models.
Stability AI's advanced 4D video generation model for high-fidelity novel-view synthesis
Open-source voice AI models for speech synthesis and recognition
An open-source personal assistant that provides AI-powered voice and text interactions.
A scalable generative AI framework for researchers and developers
A library for consistent and controllable image-to-video animation of characters.
A collection of state-of-the-art deep learning scripts for various AI/ML tasks, easily trainable and deployable.
Tone.js is a Web Audio framework for building interactive music and audio applications in the browser.
A high-performance latent diffusion model for generating high-resolution images.
Open-source speech toolkit with state-of-the-art ASR, TTS, translation, and audio processing capabilities.
An open-source AI avatar toolkit for offline video generation and digital human cloning.
Open-source voice synthesis studio powered by Qwen3-TTS
Sonic Pi is a live coding environment for creating music and sound using Ruby.
AudioKit is an audio synthesis, processing, and analysis platform for iOS, macOS, and tvOS
A fast, local neural text-to-speech system for developers building voice-enabled applications.
A Python library that allows developers to use Microsoft Edge's online text-to-speech service without requiring Edge or an API key.
End-to-end speech processing toolkit for tasks like speech recognition, synthesis, translation, and more.
Amphion is a toolkit for Audio, Music, and Speech Generation to support reproducible research.
A fork of the so-vits-svc project with realtime support, improved interface, and more features for AI-powered voice conversion.
Bert-VITS2 is a Python library that implements the VITS2 backbone with multilingual BERT for text-to-speech synthesis.