Explore Projects

Discover 368 open source projects

Active filters (1):
Search: speech

Showing 21-40 of 368 projects

FunAudioLLM/CosyVoice

Multilingual voice generation model with full-stack capabilities for TTS, training, and deployment

19.8K
Active
Python
AI Voice & Speech
Fine-tuning
PyTorch
#tts #voice-generation #multilingual

nari-labs/dia

An ultra-realistic text-to-speech model for generating natural-sounding dialogue and audio.

19.2K
Stable
Python
AI Voice & Speech
Python
#text-to-speech #dialogue-generation #natural-language-processing

index-tts/index-tts

An efficient zero-shot text-to-speech system with fine-grained control over the generated voice.

19.1K
Stable
Python
AI Voice & Speech
Python
#text-to-speech #zero-shot #voice-cloning

IDEA-Research/Grounded-Segment-Anything

A set of Jupyter Notebooks that combine Grounding DINO, Segment Anything, and Stable Diffusion for automatic detection, segmentation, and generation of anything in images.

17.4K
Archived
Jupyter Notebook
Computer Vision
Jupyter Notebook
#computer-vision #image-segmentation #image-generation

leon-ai/leon

An open-source personal assistant that provides AI-powered voice and text interactions.

17.0K
Active
TypeScript
AI Voice & Speech
Node
#open-source #virtual-assistant #speech-to-text

cjpais/Handy

A free, open-source, and extensible speech-to-text application that works offline.

16.9K
Active
TypeScript
React
#speech-to-text #accessibility #offline

NVIDIA-NeMo/NeMo

A scalable generative AI framework for researchers and developers

16.9K
Active
Python
React
#generative-ai #machine-learning #neural-networks

jianchang512/pyvideotrans

A Python library that translates videos from one language to another, with support for dubbing and subtitles.

16.4K
Active
Python
AI Voice & Speech
#speech-to-text #text-to-speech #video-translation

kaldi-asr/kaldi

An open-source speech recognition toolkit used for building speech recognition systems.

15.3K
Stable
Shell
Speech Recognition
#speech-recognition #speaker-identification #speaker-verification

modelscope/FunASR

A comprehensive speech recognition toolkit with state-of-the-art pretrained models for various speech tasks.

15.1K
Active
Python
AI Voice & Speech
PyTorch
#speech-recognition #voice-activity-detection #audio-visual-speech-recognition

neonbjb/tortoise-tts

A multi-voice text-to-speech (TTS) system with a focus on high-quality audio output.

14.8K
Archived
Jupyter Notebook
AI Voice & Speech
#text-to-speech #multi-voice #audio-quality

NVIDIA/DeepLearningExamples

A collection of state-of-the-art deep learning scripts for various AI/ML tasks, easily trainable and deployable.

14.7K
Archived
Jupyter Notebook
ML Ops
PyTorch
#deep-learning #computer-vision #natural-language-processing

SesameAILabs/csm

A conversational speech generation model for developers building AI-powered applications.

14.5K
Experimental
Python
LLM Frameworks
Python
#speech-generation #language-model #conversational-ai

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

14.3K
Stable
Jupyter Notebook
AI Voice & Speech
Node
#speech-recognition #voice-recognition #offline

SWivid/F5-TTS

Official code for a text-to-speech model that generates fluent and faithful speech with flow matching.

14.2K
Active
Python
AI Voice & Speech
Python
#text-to-speech #speech-synthesis #language-model

zhongyi-tong/electronic-wechat

An Electron-based WeChat client for macOS and Linux, providing a better user experience.

13.9K
Archived
JavaScript
Component Libraries (React)
Electron
#electron #linux #macos

Rudrabha/Wav2Lip

A lip sync generation tool that leverages AI to synchronize speech with video in the wild.

12.8K
Experimental
Python
AI Video & Image
Python
#computer-vision #speech-to-video #lip-sync

kmario23/deep-learning-drizzle

A comprehensive collection of deep learning, reinforcement learning, and machine learning resources for vibe coders.

12.8K
Archived
HTML
Machine Learning
#deep-learning #machine-learning #computer-vision

PaddlePaddle/PaddleSpeech

Open-source speech toolkit with state-of-the-art ASR, TTS, translation, and audio processing capabilities.

12.5K
Active
Python
AI Voice & Speech
Python
#speech-recognition #speech-synthesis #speech-translation

facebookresearch/seamless_communication

Foundational models for state-of-the-art speech and text translation

11.8K
Archived
Jupyter Notebook
LLM Frameworks
Jupyter Notebook
#speech-translation #text-translation #foundational-models