Explore Projects

Discover 54 open source projects

Showing 21-40 of 54 projects

zzw922cn/awesome-speech-recognition-speech-synthesis-papers

A comprehensive collection of research papers on automatic speech recognition, speech synthesis, and related topics.

3.1K

Archived

AI Voice & Speech

#speech-recognition #speech-synthesis #language-modeling

tmoroney/auto-subs

AI subtitle generator for video with DaVinci Resolve integration and speaker diarization; runs locally.

2.9K

Active

TypeScript

AI Voice & Speech

Desktop Model Runners

OpenAI

#ai-subtitles #davinci-resolve #speaker-diarization

Purfview/whisper-standalone-win

Standalone Windows executables for Whisper speech-to-text and diarization, with no Python setup required.

2.9K

Stable

Desktop Model Runners

AI Voice & Speech

Whisper

#speech-to-text #whisper #faster-whisper

modelscope/3D-Speaker

A library for single- and multi-modal speaker verification, recognition, and diarization.

2.8K

Stable

Python

Computer Vision

AI Voice & Speech

Python

#speaker-verification #speaker-recognition #speaker-diarization

linto-ai/whisper-timestamped

An open-source library for multilingual automatic speech recognition with word-level timestamps and confidence scores.

2.8K

Stable

Python

AI Voice & Speech

CLI Tools

PyTorch

#speech-recognition #multilingual #transformers

pschatzmann/ESP32-A2DP

A simple ESP32 Bluetooth A2DP library for building audio receiver or sender applications.

2.5K

Active

C++

Arduino & Embedded

API Frameworks

#bluetooth #audio #esp32

owntone/owntone-server

An open-source server that supports AirPlay, Apple Remote, Chromecast, and more for streaming audio on Linux/FreeBSD.

2.4K

Active

API Frameworks

Audio Server

#airplay #chromecast #apple-remote

muammar/mkchromecast

A Python library for casting audio and video from macOS and Linux to Google Cast and Sonos devices.

2.3K

Stable

Python

Backend & APIs

General Utilities

#chromecast #sonos #audio-streaming

idootop/open-xiaoai

An open-source project that enables voice control for Xiaomi AI speakers, unlocking new possibilities for developers.

2.2K

Active

Rust

AI Voice & Speech

Home Automation

#voice-control #smart-home #open-source

r9y9/deepvoice3_pytorch

A PyTorch implementation of convolutional neural network-based text-to-speech synthesis models.

2.0K

Archived

Python

Speech-to-Text

Speech-Synthesis

PyTorch

#speech-processing #text-to-speech #multi-speaker

scraly/developers-conferences-agenda

A community-driven platform for listing developer/tech conferences and CFPs worldwide.

1.9K

Active

JavaScript

CLI Tools

Full-Stack Frameworks

Node.js

#conferences #cfp #events

juanmc2005/diart

A Python package for building real-time, AI-powered audio applications like speaker diarization and voice activity detection.

1.9K

Experimental

Python

AI Voice & Speech

API Frameworks

#real-time #speaker-diarization #speaker-embedding

mkckr0/audio-share

An open-source C++ and Kotlin project that enables sharing a computer's audio over a network to an Android phone.

1.9K

Experimental

C++

Audio Capture and Playback

Android

Kotlin

#audio-streaming #android #linux

matteofigus/awesome-speaking

A curated list of resources for public speakers, including conference talks, tips, and tools.

1.9K

Archived

Tutorials & Courses

Awesome Lists

#public-speaking #conference #resources

wzpan/dingdang-robot

A Chinese voice assistant project for Raspberry Pi that uses AI and supports various smart speakers.

1.9K

Archived

Python

React

#authentication #streaming #real-time

wq2012/awesome-diarization

A curated list of resources for speaker diarization, the speech processing task of identifying who spoke when.

1.9K

Experimental

Speech Processing

Awesome Lists

#speaker-diarization #speech-recognition #machine-learning

kaixxx/noScribe

Cutting-edge AI-powered audio transcription tool with a user-friendly GUI and support for speaker identification.

1.8K

Active

Python

LLM Frameworks

AI Voice & Speech

Python

#audio-transcription #speaker-identification #whisper

rustcc/RustPrimer

A Rust primer for beginners; the project is seeking help from native English speakers to improve its translation.

1.8K

Archived

Rust

Books & Guides

API Frameworks

#rust #primer #learning

FluidInference/FluidAudio

Frontier CoreML audio models for iOS and macOS apps with text-to-speech, speech-to-text, voice activity detection, and speaker diarization.

1.6K

Active

Swift

AI Voice & Speech

iOS

Swift

#text-to-speech #speech-to-text #voice-activity-detection

google/uis-rnn

A library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, for supervised speaker diarization.

1.6K

Archived

Python

LLM Frameworks

API Frameworks

Python

#clustering #machine-learning #speaker-diarization
