A Python evaluation framework for text-to-speech models, aimed at vibe coders developing with AI tools.
A Python/PyTorch app for easily synthesising human voices.
Interface for OuteTTS models, a Python library for text-to-speech using transformer-based models.
A GPT-SoVITS ONNX Inference Engine & Model Converter to enable voice cloning and text-to-speech for developers.
An open-source virtual assistant for Ubuntu-based Linux distributions, focused on speech recognition and natural language processing.
An emotion-controllable text-to-speech model for vibe coders, built on the VITS framework.
A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.
A Python-based project that provides a TTS API server and Gradio-based web UI for speech synthesis and voice generation.
A Linux app for speech-to-text, text-to-speech, and machine translation, with offline capabilities.
ComfyUI integration for Microsoft's VibeVoice text-to-speech model.
A tool that records the Microsoft Edge browser's text-to-speech (TTS) audio and saves it as .wav files on Windows.
A Jupyter Notebook project that converts PDF documents to audio using AI-powered text-to-speech.
Synchronized Translation for Videos: Automatic dubbing and subtitling for video content.
speak.js is a text-to-speech library for JavaScript that uses the eSpeak speech synthesis engine.
An open-source Python toolkit for building speech synthesis and voice conversion systems using deep learning.
Offline AI-powered inference engine for art, chatbots, and automated workflows, focused on privacy and self-hosting.
Official server for the MiniMax Model Context Protocol (MCP) that enables powerful AI capabilities like text-to-speech, image generation, and video generation.
An Android assistant app that uses voice recognition, text-to-speech, and AI skills to provide a personal assistant experience.
A state-of-the-art discrete acoustic codec model for audio language modeling with 40/75 tokens per second.
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages and input formats.