Explore Projects

Discover 368 open source projects

Active filters (1):

Search: speech

Showing 341-360 of 368 projects

sooftware/conformer

Unofficial PyTorch implementation of the Conformer model for speech recognition tasks.

1.1K

Active

Python

AI Voice & Speech

Backend Frameworks

PyTorch

#speech-recognition#convolution#transformer

auspicious3000/autovc

A PyTorch-based library for zero-shot voice style transfer using only autoencoder loss.

1.1K

Archived

Python

AI Voice & Speech

API Frameworks

PyTorch

#speech-synthesis#voice-conversion#generative-models

alumae/kaldi-gstreamer-server

A real-time speech recognition server built with the Kaldi toolkit and GStreamer framework.

1.1K

Archived

Python

AI Voice & Speech

#speech-recognition#real-time#open-source

nari-labs/dia2

A text-to-speech model that streams conversational audio in real time.

1.1K

Stable

Python

AI Voice & Speech

Python

#tts#streaming#real-time

feima09/GMTalker

GMTalker is a 3D digital human system that integrates speech recognition, speech synthesis, natural language understanding, and mouth animation for fast deployment on Windows, Linux, and Android.

1.1K

Active

Python

AI Voice & Speech

Computer Vision

Python

#3d-human#speech-recognition#speech-synthesis

matthiasn/lotti

AI-powered digital assistant that keeps your data private, with intelligent summaries and task tracking.

1.1K

Active

Dart

AI Voice & Speech

Component Libraries (Flutter)

Flutter

#ai-assistant#private-data#speech-recognition

Ksuriuri/index-tts-vllm

Adds vLLM (an LLM inference engine) support to IndexTTS, enabling faster text-to-speech inference.

1.1K

Stable

Python

LLM Frameworks

Inference

Python

#text-to-speech#llm#inference

VOICEVOX/voicevox_core

A high-quality, open-source text-to-speech library in Rust for developers to build AI-powered voice applications.

1.1K

Active

Rust

AI Voice & Speech

API Frameworks

Rust

#text-to-speech#open-source#ai-voice

wenet-e2e/speech-synthesis-paper

A list of speech synthesis papers for developers interested in voice and speech technology.

1.1K

Archived

AI Voice & Speech

#speech-synthesis#ai-voice#text-to-speech

JuergenFleiss/aTrain

A Python GUI tool for offline transcription of speech recordings, including speaker diarization, using state-of-the-art machine learning models.

1.1K

Active

Python

AI Voice & Speech

CLI Tools

#speech-recognition#transcription#diarization

ddlBoJack/emotion2vec

Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation.

1.1K

Archived

Python

Speech Emotion Recognition

API Frameworks

PyTorch

#speech-emotion-recognition#self-supervised-learning#feature-extraction

Edresson/YourTTS

A zero-shot multi-speaker text-to-speech (TTS) and voice conversion library for developers.

1.1K

Archived

Jupyter Notebook

AI Voice & Speech

Backend Frameworks

Jupyter Notebook

#speech-synthesis#tts#voice-conversion

bravekingzhang/text2video

A Python tool that allows you to easily convert text to video using AI-powered text-to-speech and image generation.

1.0K

Archived

Python

AI Image & Video

AI Voice & Speech

#text-to-video#stable-diffusion#edge-tts

Artrajz/vits-simple-api

A simple API for the VITS text-to-speech model, with additional features for vibe coders.

1.0K

Stable

Python

AI Voice & Speech

API Clients & Testing

#tts#vits#text-to-speech

ardha27/AI-Waifu-Vtuber

An open-source AI-powered virtual YouTuber (VTuber) platform built with Python for streaming on YouTube and Twitch.

1.0K

Archived

Python

AI Voice & Speech

Animation & Motion

Python

#ai-vtuber#virtual-youtuber#streaming

x007xyz/flycut-caption

A React component with AI-powered speech recognition and visual editing capabilities for video subtitles.

1.0K

Active

TypeScript

AI Voice & Speech

Component Libraries (React)

React

#video-subtitles#ai-speech-recognition#visual-editing

descriptinc/melgan-neurips

A GAN-based Mel-Spectrogram Inversion Network for high-quality text-to-speech synthesis.

1.0K

Archived

Python

Speech Synthesis

GANs

PyTorch

#text-to-speech#speech-synthesis#deep-learning

aliutkus/speechmetrics

A Python library that provides a wrapper around various speech quality metrics for audio processing and analysis.

1.0K

Archived

Python

AI Voice & Speech

#audio-processing#speech-quality-metrics#mos-net

festvox/flite

A small fast portable speech synthesis system written in C.

1.0K

Archived

AI Voice & Speech

#speech-synthesis#text-to-speech#portable

atilika/kuromoji

Kuromoji is a Java-based Japanese morphological analyzer for natural language processing applications.

1.0K

Archived

Java

NLP Library

API Frameworks

#japanese#morphological-analysis#part-of-speech-tagging
