Explore Projects

Discover 49 open source projects

Search: asr

Showing 21-40 of 49 projects

Purfview/whisper-standalone-win

Standalone Windows executables for Whisper speech-to-text & diarization without Python setup.

2.9K
Stable
Desktop Model Runners
AI Voice & Speech
Whisper
#speech-to-text #whisper #faster-whisper

linto-ai/whisper-timestamped

An open-source library for multilingual automatic speech recognition with word-level timestamps and confidence.

2.8K
Stable
Python
AI Voice & Speech
CLI Tools
PyTorch
#speech-recognition #multilingual #transformers

coqui-ai/STT

An open-source deep learning toolkit for building Speech-to-Text models and deploying them easily.

2.6K
Archived
C++
Speech Recognition
API Frameworks
TensorFlow
#speech-recognition #deep-learning #asr

jameslyons/python_speech_features

This Python library provides common speech feature extraction functions for automatic speech recognition (ASR) tasks.

2.4K
Archived
Python
AI Voice & Speech
#speech-recognition #feature-extraction #mfcc

mravanelli/pytorch-kaldi

A PyTorch-based project for developing state-of-the-art speech recognition systems using the Kaldi toolkit.

2.4K
Archived
Python
Speech Recognition
API Frameworks
PyTorch
#speech-recognition #deep-learning #kaldi

harry0703/AudioNotes

A Python library that quickly extracts structured Markdown notes from audio and video content.

2.0K
Archived
Python
AI Voice & Speech
CLI Tools
Python
#asr #transcription #note-taking

QwenLM/Qwen3-ASR

ASR models for AI coding assistance

1.8K
Active
Python
AI Editors/Agents/Copilot
Vibe Coders
#Qwen3-ASR #ASR Models #AI Coding Tools

FireRedTeam/FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, with outstanding singing lyrics recognition.

1.8K
Active
Python
Speech Recognition
API Frameworks
Python
#asr #speech-recognition #conformer

k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection for offline use on multiple platforms.

1.6K
Stable
C++
AI Voice & Speech
Cross-Platform
#speech-recognition #voice-activity-detection #offline

wwbin2017/bailing

Bailing is an open-source AI voice assistant built with ASR, LLM, and TTS, supporting low-latency response on low-end devices.

1.6K
Experimental
Python
LLM Frameworks
AI Voice & Speech
Python
#asr #chatgpt #tts

FluidInference/FluidAudio

Frontier CoreML audio models for iOS and macOS apps with text-to-speech, speech-to-text, voice activity detection, and speaker diarization.

1.6K
Active
Swift
AI Voice & Speech
iOS
Swift
#text-to-speech #speech-to-text #voice-activity-detection

coqui-ai/open-speech-corpora

A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.

1.4K
Archived
AI Voice & Speech
Databases
#speech-recognition #speech-synthesis #speech-processing

lenML/Speech-AI-Forge

A Python-based project that provides a TTS API server and Gradio-based web UI for speech synthesis and voice generation.

1.4K
Active
Python
AI Voice & Speech
API Frameworks
Gradio
#text-to-speech #speech-synthesis #api-server

mkiol/dsnote

A Linux app for speech-to-text, text-to-speech, and machine translation, with offline capabilities.

1.4K
Active
C++
AI Voice & Speech
API Frameworks
#speech-recognition #speech-synthesis #machine-translation

k2-fsa/icefall

icefall is a Python library for building Automatic Speech Recognition (ASR) systems using Finite State Transducers (FSTs).

1.4K
Stable
Python
LLM Frameworks
API Frameworks
Python
#speech-recognition #asr #fst

tuya/TuyaOpen

Next-gen AI+IoT framework for fast IoT and AI agent hardware integration

1.4K
Active
C
LLM Frameworks
Arduino & Embedded
React
#iot #aiot #embedded

R3gm/SoniTranslate

Synchronized Translation for Videos: Automatic dubbing and subtitling for video content.

1.3K
Stable
Python
AI Voice & Speech
CMS & Content
#video-dubbing #speech-to-text #text-to-speech

byjlw/video-analyzer

A Python library that uses LLMs, computer vision, and speech recognition to analyze video content.

1.3K
Experimental
Python
Computer Vision
LLM Frameworks
#video-processing #llms #asr

ictnlp/StreamSpeech

An all-in-one model for offline and simultaneous speech recognition, translation, and synthesis.

1.2K
Experimental
Python
AI Voice & Speech
API Frameworks
Python
#speech-recognition #speech-translation #speech-synthesis

alphacep/vosk-server

A speech recognition server based on Vosk and Kaldi libraries, supporting WebSocket, gRPC, and WebRTC protocols.

1.2K
Experimental
Python
AI Voice & Speech
BaaS Platforms
Python
#speech-recognition #asr #kaldi
