Explore Projects

Discover 49 open source projects

Search: asr

Showing 21-40 of 49 projects

Purfview/whisper-standalone-win

Standalone Windows executables for Whisper speech-to-text & diarization without Python setup.

2.9K

Stable

Desktop Model Runners

AI Voice & Speech

Whisper

#speech-to-text #whisper #faster-whisper

linto-ai/whisper-timestamped

An open-source library for multilingual automatic speech recognition with word-level timestamps and confidence.

2.8K

Stable

Python

AI Voice & Speech

CLI Tools

PyTorch

#speech-recognition #multilingual #transformers

coqui-ai/STT

An open-source deep learning toolkit for building Speech-to-Text models and deploying them easily.

2.6K

Archived

C++

Speech Recognition

API Frameworks

TensorFlow

#speech-recognition #deep-learning #asr

jameslyons/python_speech_features

This Python library provides common speech feature extraction functions for automatic speech recognition (ASR) tasks.

2.4K

Archived

Python

AI Voice & Speech

#speech-recognition #feature-extraction #mfcc

mravanelli/pytorch-kaldi

A PyTorch-based project for developing state-of-the-art speech recognition systems using the Kaldi toolkit.

2.4K

Archived

Python

Speech Recognition

API Frameworks

PyTorch

#speech-recognition #deep-learning #kaldi

harry0703/AudioNotes

A Python library that quickly extracts structured Markdown notes from audio and video content.

2.0K

Archived

Python

AI Voice & Speech

CLI Tools

Python

#asr #transcription #note-taking

QwenLM/Qwen3-ASR

Open-source automatic speech recognition (ASR) models from the Qwen team.

1.8K

Active

Python

AI Editors/Agents/Copilot

Vibe Coders

#Qwen3-ASR#ASR Models#AI Coding Tools

FireRedTeam/FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, with outstanding singing lyrics recognition.

1.8K

Active

Python

Speech Recognition

API Frameworks

Python

#asr #speech-recognition #conformer

k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection for offline use on multiple platforms.

1.6K

Stable

C++

AI Voice & Speech

Cross-Platform

#speech-recognition #voice-activity-detection #offline

wwbin2017/bailing

Bailing is an open-source AI voice assistant built with ASR, LLM, and TTS, supporting low-latency response on low-end devices.

1.6K

Experimental

Python

LLM Frameworks

AI Voice & Speech

Python

#asr #chatgpt #tts

FluidInference/FluidAudio

Frontier CoreML audio models for iOS and macOS apps with text-to-speech, speech-to-text, voice activity detection, and speaker diarization.

1.6K

Active

Swift

AI Voice & Speech

iOS

Swift

#text-to-speech #speech-to-text #voice-activity-detection

coqui-ai/open-speech-corpora

A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.

1.4K

Archived

AI Voice & Speech

Databases

#speech-recognition #speech-synthesis #speech-processing

lenML/Speech-AI-Forge

A Python-based project that provides a TTS API server and Gradio-based web UI for speech synthesis and voice generation.

1.4K

Active

Python

AI Voice & Speech

API Frameworks

Gradio

#text-to-speech #speech-synthesis #api-server

mkiol/dsnote

A Linux app for speech-to-text, text-to-speech, and machine translation, with offline capabilities.

1.4K

Active

C++

AI Voice & Speech

API Frameworks

#speech-recognition #speech-synthesis #machine-translation

k2-fsa/icefall

icefall is a Python library for building Automatic Speech Recognition (ASR) systems using Finite State Transducers (FSTs).

1.4K

Stable

Python

LLM Frameworks

API Frameworks

Python

#speech-recognition #asr #fst

tuya/TuyaOpen

Next-gen AI+IoT framework for fast IoT and AI agent hardware integration

1.4K

Active

LLM Frameworks

Arduino & Embedded

React

#iot #aiot #embedded

R3gm/SoniTranslate

Synchronized Translation for Videos: Automatic dubbing and subtitling for video content.

1.3K

Stable

Python

AI Voice & Speech

CMS & Content

#video-dubbing #speech-to-text #text-to-speech

byjlw/video-analyzer

A Python library that uses LLMs, computer vision, and speech recognition to analyze video content.

1.3K

Experimental

Python

Computer Vision

LLM Frameworks

#video-processing #llms #asr

ictnlp/StreamSpeech

An all-in-one model for offline and simultaneous speech recognition, translation, and synthesis.

1.2K

Experimental

Python

AI Voice & Speech

API Frameworks

Python

#speech-recognition #speech-translation #speech-synthesis

alphacep/vosk-server

A speech recognition server based on Vosk and Kaldi libraries, supporting WebSocket, gRPC, and WebRTC protocols.

1.2K

Experimental

Python

AI Voice & Speech

BaaS Platforms

Python

#speech-recognition #asr #kaldi
