Converts e-books to audiobooks using AI voice cloning and supports over 1158 languages.
An end-to-end realtime stack for connecting humans and AI, built with Go and WebRTC.
An open-source personal assistant that provides AI-powered voice and text interactions.
Fully automated AI video subtitle team with one-click subtitle cutting, translation, alignment, and dubbing.
A comprehensive speech recognition toolkit with state-of-the-art pretrained models for various speech tasks.
A multi-voice text-to-speech (TTS) system with a focus on high-quality audio output.
Offline speech recognition API for Android, iOS, Raspberry Pi, and servers, with Python, Java, C#, and Node support.
A full-featured mobile and web client for Codex and Claude Code, with realtime voice and encryption.
Open-source speech toolkit with state-of-the-art ASR, TTS, translation, and audio processing capabilities.
An open-source project that connects the Xiaomi AI speaker to ChatGPT and Doubao, turning it into a custom voice assistant.
Open-source voice synthesis studio powered by Qwen3-TTS.
A PyTorch-based toolkit for speech processing, including ASR, speaker recognition, and speech enhancement.
A fast, local neural text-to-speech system for developers building voice-enabled applications.
An open-source framework for building voice and multimodal conversational AI applications.
Open-source framework for building conversational voice AI agents.
Production-ready toolkit for local AI inference.
A real-time microphone noise suppression tool for Linux developers, built with Go.
Moshi is an open-source speech-text foundation model and full-duplex spoken dialogue framework for building AI-powered voice apps.
End-to-end speech processing toolkit for tasks like speech recognition, synthesis, translation, and more.
Amphion is a toolkit for Audio, Music, and Speech Generation to support reproducible research.