Unofficial PyTorch implementation of the Conformer model for speech recognition tasks.
A PyTorch-based library for zero-shot voice style transfer using only autoencoder loss.
A real-time speech recognition server built with the Kaldi toolkit and GStreamer framework.
A text-to-speech model that streams conversational audio in real time.
GMTalker is a 3D digital human system that integrates speech recognition, speech synthesis, natural language understanding, and mouth animation for fast deployment on Windows, Linux, and Android.
AI-powered digital assistant that keeps your data private, with intelligent summaries and task tracking.
Adds support for vLLM, an LLM inference engine, to IndexTTS, enabling faster text-to-speech inference.
A high-quality, open-source text-to-speech library in Rust for developers to build AI-powered voice applications.
This repository contains a list of speech synthesis papers for developers interested in AI-powered voice and speech technology.
A Python GUI tool for offline transcription of speech recordings, including speaker diarization, using state-of-the-art machine learning models.
Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation.
A zero-shot multi-speaker text-to-speech (TTS) and voice conversion library for developers.
A Python tool that allows you to easily convert text to video using AI-powered text-to-speech and image generation.
A simple API for the VITS text-to-speech model, with additional features for vibe coders.
An open-source AI-powered virtual YouTuber (VTuber) platform built with Python for streaming on YouTube and Twitch.
A React component with AI-powered speech recognition and visual editing capabilities for video subtitles.
A GAN-based Mel-Spectrogram Inversion Network for high-quality text-to-speech synthesis.
A Python library that provides a wrapper around various speech quality metrics for audio processing and analysis.
A small, fast, portable speech synthesis system written in C.
Kuromoji is a Java-based Japanese morphological analyzer for natural language processing applications.