Showing 61-80 of 622 projects
A neural network library for speaker diarization, including speech activity detection, speaker change detection, and speaker embedding.
A C++ library for acoustic keyboard eavesdropping using microphone audio capture.
Python speech recognition library supporting multiple engines and APIs, both online and offline.
A collection of various audio effects plugins for PipeWire, a sound server for Linux.
A sound cloning tool that lets you use your voice or any sound to record audio, with a web interface.
Generates portrait image animations through hierarchical audio-driven visual synthesis.
Desktop app for downloading videos and audio from hundreds of sites with support for various platforms.
An HTML5 media player library with support for various video and audio formats, as well as streaming protocols.
A fluent API for the FFmpeg media processing library, enabling developers to work with video and audio files.
A Python library for audio and music analysis, useful for developers working with audio-related applications.
An open-source C++ framework for building desktop and mobile applications, including audio plug-ins.
A generative model for creating music, implemented in Python with PyTorch.
Text-audio foundation model from Boson AI for vibe coders building AI-powered applications.
A full-featured audio/video downloader for Android using the yt-dlp library.
Mumble is open-source, low-latency, high-quality voice chat software for gaming and communication.
A multilingual voice understanding model for AI-powered audio analysis and transcription.
This repository contains a diffusion model for generating expressive portrait videos from audio.
Automagically synchronize subtitles with video using audio alignment and speech detection.
Synchronous multiroom audio player for building audio streaming applications.
An extensible, plugin-oriented, HTML5-first media player for the web.