Explore Projects

Discover 54 open source projects

Active filters (1):
Search: speaker

Showing 21-40 of 54 projects

zzw922cn/awesome-speech-recognition-speech-synthesis-papers

A comprehensive collection of research papers on automatic speech recognition, speech synthesis, and related topics.

3.1K
Archived
AI Voice & Speech
#speech-recognition #speech-synthesis #language-modeling

tmoroney/auto-subs

AI subtitle generator for video with DaVinci Resolve integration and speaker diarization; runs locally.

2.9K
Active
TypeScript
AI Voice & Speech
Desktop Model Runners
OpenAI
#ai-subtitles #davinci-resolve #speaker-diarization

Purfview/whisper-standalone-win

Standalone Windows executables for Whisper speech-to-text & diarization without Python setup.

2.9K
Stable
Desktop Model Runners
AI Voice & Speech
Whisper
#speech-to-text #whisper #faster-whisper

modelscope/3D-Speaker

A library for single- and multi-modal speaker verification, recognition, and diarization.

2.8K
Stable
Python
Computer Vision
AI Voice & Speech
Python
#speaker-verification #speaker-recognition #speaker-diarization

linto-ai/whisper-timestamped

An open-source library for multilingual automatic speech recognition with word-level timestamps and confidence.

2.8K
Stable
Python
AI Voice & Speech
CLI Tools
PyTorch
#speech-recognition #multilingual #transformers

pschatzmann/ESP32-A2DP

A simple ESP32 Bluetooth A2DP library for building audio receiver or sender applications.

2.5K
Active
C++
Arduino & Embedded
API Frameworks
#bluetooth #audio #esp32

owntone/owntone-server

An open-source server that supports AirPlay, Apple Remote, Chromecast, and more for streaming audio on Linux/FreeBSD.

2.4K
Active
C
API Frameworks
Audio Server
#airplay #chromecast #apple-remote

muammar/mkchromecast

A Python library for casting audio and video from macOS and Linux to Google Cast and Sonos devices.

2.3K
Stable
Python
Backend & APIs
General Utilities
#chromecast #sonos #audio-streaming

idootop/open-xiaoai

An open-source project that enables voice control for Xiaomi AI speakers.

2.2K
Active
Rust
AI Voice & Speech
Home Automation
#voice-control #smart-home #open-source

r9y9/deepvoice3_pytorch

A PyTorch implementation of convolutional-network-based text-to-speech synthesis models.

2.0K
Archived
Python
Speech-to-Text
Speech-Synthesis
PyTorch
#speech-processing #text-to-speech #multi-speaker

scraly/developers-conferences-agenda

A community-driven platform for listing developer/tech conferences and CFPs worldwide.

1.9K
Active
JavaScript
CLI Tools
Full-Stack Frameworks
Node.js
#conferences #cfp #events

juanmc2005/diart

A Python package for building real-time, AI-powered audio applications like speaker diarization and voice activity detection.

1.9K
Experimental
Python
AI Voice & Speech
API Frameworks
#real-time #speaker-diarization #speaker-embedding

mkckr0/audio-share

An open-source C++ and Kotlin project that enables sharing a computer's audio over a network to an Android phone.

1.9K
Experimental
C++
Audio Capture and Playback
Android
Kotlin
#audio-streaming #android #linux

matteofigus/awesome-speaking

A curated list of resources for public speakers, including conference talks, tips, and tools.

1.9K
Archived
Tutorials & Courses
Awesome Lists
#public-speaking #conference #resources

wzpan/dingdang-robot

A Chinese voice assistant project for Raspberry Pi that uses AI and supports various smart speakers.

1.9K
Archived
Python
React
#authentication #streaming #real-time

wq2012/awesome-diarization

A curated list of resources for speaker diarization, a speech processing task to identify who spoke when.

1.9K
Experimental
Speech Processing
Awesome Lists
#speaker-diarization #speech-recognition #machine-learning

kaixxx/noScribe

An AI-powered audio transcription tool with a graphical interface and support for speaker identification.

1.8K
Active
Python
LLM Frameworks
AI Voice & Speech
Python
#audio-transcription #speaker-identification #whisper

rustcc/RustPrimer

A Rust primer for beginners; the project is seeking help from native English speakers to improve the translation.

1.8K
Archived
Rust
Books & Guides
API Frameworks
#rust #primer #learning

FluidInference/FluidAudio

Frontier CoreML audio models for iOS and macOS apps with text-to-speech, speech-to-text, voice activity detection, and speaker diarization.

1.6K
Active
Swift
AI Voice & Speech
iOS
Swift
#text-to-speech #speech-to-text #voice-activity-detection

google/uis-rnn

A library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, for supervised speaker diarization.

1.6K
Archived
Python
LLM Frameworks
API Frameworks
Python
#clustering #machine-learning #speaker-diarization
