An all-in-one model for offline and simultaneous speech recognition, translation, and synthesis.
Praat in Python, a Python library for speech analysis and manipulation.
A speech recognition server based on Vosk and Kaldi libraries, supporting WebSocket, gRPC, and WebRTC protocols.
An all-in-one web UI for different audio-related neural networks, including text-to-speech, voice cloning, and generative music.
SincNet is a neural architecture for efficiently processing raw audio samples for speech and audio processing tasks.
A Python command-line client for the Whisper speech-to-text model by OpenAI, using the CTranslate2 library.
A TTS system based on BERT and VITS that borrows some features from Microsoft's NaturalSpeech and supports streaming ONNX output.
A collection of resources for speech enhancement, speech separation, and sound source localization.
Open-source PyTorch implementation of an end-to-end automatic speech recognition (ASR) system.
A collection of high-quality open-source speech, audio, and codec models for building AI-powered speech applications.
LPCNet is an efficient neural speech synthesis library for developers building voice-based applications.
A Chinese text-to-speech engine supporting Cantonese, Tibetan and other languages.
Soprano is a Python library that provides ultra-realistic text-to-speech capabilities.
Fine-tune and deploy the Whisper speech recognition model with accelerated inference and support for various platforms.
A voice chat app built with Python that leverages AI-powered speech recognition and text-to-speech capabilities.
An open-source speech dialogue generation model that enables expressive dialogue speech synthesis in Chinese and English.
Unofficial PyTorch implementation of Google AI's VoiceFilter system for audio separation.
Detoxify is a Python library with trained models to detect toxic comments, built using PyTorch Lightning and Transformers.
This repository contains pre-trained Chinese speech models for developers working with speech AI.
PantoMatrix is a Python library for generating facial and body animations from speech.