Standalone Windows executables for Whisper speech-to-text & diarization without Python setup.
An open-source library for multilingual automatic speech recognition with word-level timestamps and confidence.
An open-source deep learning toolkit for building Speech-to-Text models and deploying them easily.
This Python library provides common speech feature extraction functions for automatic speech recognition (ASR) tasks.
A PyTorch-based project for developing state-of-the-art speech recognition systems using the Kaldi toolkit.
A Python library that quickly extracts structured Markdown notes from audio and video content.
ASR models for AI coding assistance.
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, with outstanding singing lyrics recognition.
Real-time speech recognition and voice activity detection for offline use on multiple platforms.
Bailing is an open-source AI voice assistant built with ASR, LLM, and TTS, supporting low-latency response on low-end devices.
Frontier CoreML audio models for iOS and macOS apps with text-to-speech, speech-to-text, voice activity detection, and speaker diarization.
A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.
A Python-based project that provides a TTS API server and Gradio-based web UI for speech synthesis and voice generation.
A Linux app for speech-to-text, text-to-speech, and machine translation, with offline capabilities.
icefall is a Python library for building Automatic Speech Recognition (ASR) systems using Finite State Transducers (FSTs).
Next-gen AI+IoT framework for fast IoT and AI agent hardware integration.
Synchronized Translation for Videos: Automatic dubbing and subtitling for video content.
A Python library that uses LLMs, computer vision, and speech recognition to analyze video content.
An all-in-one model for offline and simultaneous speech recognition, translation, and synthesis.
A speech recognition server based on Vosk and Kaldi libraries, supporting WebSocket, gRPC, and WebRTC protocols.