A comprehensive collection of research papers on automatic speech recognition, speech synthesis, and related topics.
An AI subtitle generator for video that runs locally, with DaVinci Resolve integration and speaker diarization.
Standalone Windows executables for Whisper speech-to-text and diarization that require no Python setup.
A library for single- and multi-modal speaker verification, recognition, and diarization.
An open-source library for multilingual automatic speech recognition with word-level timestamps and confidence.
A simple ESP32 Bluetooth A2DP library for building audio receiver or sender applications.
An open-source server that supports AirPlay, Apple Remote, Chromecast, and more for streaming audio on Linux/FreeBSD.
A Python library for casting audio and video from macOS and Linux to Google Cast and Sonos devices.
An open-source project that enables voice control for Xiaomi AI speakers.
A PyTorch implementation of convolutional neural network-based text-to-speech synthesis models.
A community-driven platform for listing developer/tech conferences and CFPs worldwide.
A Python package for building real-time, AI-powered audio applications like speaker diarization and voice activity detection.
An open-source C++ and Kotlin project that enables sharing a computer's audio over a network to an Android phone.
A curated list of resources for public speakers, including conference talks, tips, and tools.
A Chinese voice assistant project for Raspberry Pi that uses AI and supports various smart speakers.
A curated list of resources for speaker diarization, the speech processing task of identifying who spoke when.
An AI-powered audio transcription tool with a user-friendly GUI and support for speaker identification.
A Rust primer for beginners; the project is seeking help from native English speakers to improve the translation.
Frontier CoreML audio models for iOS and macOS apps with text-to-speech, speech-to-text, voice activity detection, and speaker diarization.
A library implementing the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm for supervised speaker diarization.