A commenting system powered by GitHub Discussions, allowing developers to add comments to their projects.
A PyTorch-based toolkit for speech processing, including ASR, speaker recognition, and speech enhancement.
A lightweight, state-of-the-art text-to-speech (TTS) model for developers building AI-powered applications.
Spark-TTS is an open-source Python library for high-quality text-to-speech inference.
A fast, local neural text-to-speech system for developers building voice-enabled applications.
An offline-capable speech processing library for embedded systems, supporting a wide range of languages and platforms.
AudioGPT is a powerful tool for understanding and generating speech, music, sound, and talking heads using AI.
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model.
A privacy-focused, open-source AI meeting assistant built in Rust, with fast transcription, speaker diarization, and summarization.
A Python library that allows developers to use Microsoft Edge's online text-to-speech service without requiring Edge or an API key.
A deep learning library for text-to-speech applications, focusing on generating high-quality speech from text.
OpenVINO is an open-source toolkit for optimizing and deploying AI inference on a variety of hardware.
Moshi is an open-source speech-text foundation model and full-duplex spoken dialogue framework for building AI-powered voice apps.
A simultaneous speech-to-text system built on the Whisper model for real-time transcription.
End-to-end speech processing toolkit for tasks such as speech recognition, synthesis, and translation.
Amphion is a toolkit for Audio, Music, and Speech Generation to support reproducible research.
A robust, efficient, and low-latency speech-to-text library with advanced voice activity detection and wake word activation.
TextBlob is a simple, Pythonic library for natural language processing tasks such as sentiment analysis and part-of-speech tagging.
A fork of the so-vits-svc project with real-time support, an improved interface, and more features for AI-powered voice conversion.
A neural network library for speaker diarization, including speech activity detection, speaker change detection, and speaker embedding.