Explore Projects

Discover 368 open source projects

Active filter: search for "speech"

Showing 21-40 of 368 projects

FunAudioLLM/CosyVoice

Multilingual voice generation model with full-stack capabilities for TTS, training, and deployment

19.8K

Active

Python

AI Voice & Speech

Fine-tuning

PyTorch

#tts #voice-generation #multilingual

nari-labs/dia

An ultra-realistic text-to-speech model for generating natural-sounding dialogue and audio.

19.2K

Stable

Python

AI Voice & Speech

Python

#text-to-speech #dialogue-generation #natural-language-processing

index-tts/index-tts

An efficient zero-shot text-to-speech system with fine-grained control over the generated voice.

19.1K

Stable

Python

AI Voice & Speech

Python

#text-to-speech #zero-shot #voice-cloning

IDEA-Research/Grounded-Segment-Anything

A set of Jupyter Notebooks that combine Grounding DINO, Segment Anything, and Stable Diffusion for automatic detection, segmentation, and generation of anything in images.

17.4K

Archived

Jupyter Notebook

Computer Vision

Jupyter Notebook

#computer-vision #image-segmentation #image-generation

leon-ai/leon

An open-source personal assistant that provides AI-powered voice and text interactions.

17.0K

Active

TypeScript

AI Voice & Speech

Node

#open-source #virtual-assistant #speech-to-text

cjpais/Handy

A free, open-source, and extensible speech-to-text application that works offline.

16.9K

Active

TypeScript

React

#speech-to-text #accessibility #offline

NVIDIA-NeMo/NeMo

A scalable generative AI framework for researchers and developers

16.9K

Active

Python

React

#generative-ai #machine-learning #neural-networks

jianchang512/pyvideotrans

A Python library that translates videos from one language to another, with support for dubbing and subtitles.

16.4K

Active

Python

AI Voice & Speech

#speech-to-text #text-to-speech #video-translation

kaldi-asr/kaldi

An open-source speech recognition toolkit used for building speech recognition systems.

15.3K

Stable

Shell

Speech Recognition

#speech-recognition #speaker-identification #speaker-verification

modelscope/FunASR

A comprehensive speech recognition toolkit with state-of-the-art pretrained models for various speech tasks.

15.1K

Active

Python

AI Voice & Speech

PyTorch

#speech-recognition #voice-activity-detection #audio-visual-speech-recognition

neonbjb/tortoise-tts

A multi-voice text-to-speech (TTS) system with a focus on high-quality audio output.

14.8K

Archived

Jupyter Notebook

AI Voice & Speech

#text-to-speech #multi-voice #audio-quality

NVIDIA/DeepLearningExamples

A collection of state-of-the-art deep learning scripts for various AI/ML tasks, easily trainable and deployable.

14.7K

Archived

Jupyter Notebook

ML Ops

PyTorch

#deep-learning #computer-vision #natural-language-processing

SesameAILabs/csm

A conversational speech generation model for developers building AI-powered applications.

14.5K

Experimental

Python

LLM Frameworks

Python

#speech-generation #language-model #conversational-ai

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

14.3K

Stable

Jupyter Notebook

AI Voice & Speech

Node

#speech-recognition #voice-recognition #offline

SWivid/F5-TTS

Official code for a text-to-speech model that generates fluent and faithful speech with flow matching.

14.2K

Active

Python

AI Voice & Speech

Python

#text-to-speech #speech-synthesis #language-model

zhongyi-tong/electronic-wechat

An Electron-based WeChat client for macOS and Linux, providing a better user experience.

13.9K

Archived

JavaScript

Component Libraries (React)

Electron

#electron #linux #macos

Rudrabha/Wav2Lip

A lip sync generation tool that leverages AI to synchronize speech with video in the wild.

12.8K

Experimental

Python

AI Video & Image

Python

#computer-vision #speech-to-video #lip-sync

kmario23/deep-learning-drizzle

A comprehensive collection of deep learning, reinforcement learning, and machine learning resources for vibe coders.

12.8K

Archived

HTML

Machine Learning

#deep-learning #machine-learning #computer-vision

PaddlePaddle/PaddleSpeech

Open-source speech toolkit with state-of-the-art ASR, TTS, translation, and audio processing capabilities.

12.5K

Active

Python

AI Voice & Speech

Python

#speech-recognition #speech-synthesis #speech-translation

facebookresearch/seamless_communication

Foundational models for state-of-the-art speech and text translation

11.8K

Archived

Jupyter Notebook

LLM Frameworks

Jupyter Notebook

#speech-translation #text-translation #foundational-models

