Explore Projects

Discover 368 open source projects

Active filters (1):

Search: speech

Showing 181-200 of 368 projects

ming024/FastSpeech2

FastSpeech 2 implementation for high-quality end-to-end text-to-speech

2.2K

Archived

Python

Prompt Engineering

React

#text-to-speech #natural-language-processing #machine-learning

react-native-voice/voice

A React Native library for voice recognition on iOS and Android, with online and offline support.

2.2K

Active

TypeScript

React Native

AI Voice & Speech

React Native

#voice-recognition #speech-recognition #android

SeanNaren/deepspeech.pytorch

An open-source library for building speech recognition models using the DeepSpeech2 architecture.

2.1K

Archived

Python

AI Voice & Speech

PyTorch

#speech-recognition #deep-learning #audio-processing

yan5xu/ququ

An open-source, privacy-first desktop voice assistant that integrates local speech recognition and configurable language models.

2.0K

Stable

JavaScript

AI Voice & Speech

AI App Builders

Electron

#ai-voice-recognition #speech-to-text #local-processing

TEN-framework/ten-vad

A lightweight, high-performance voice activity detector (VAD) library for conversational AI and real-time speech processing.

2.0K

Stable

AI Voice & Speech

API Frameworks

#audio #speech-processing #real-time

r9y9/deepvoice3_pytorch

A PyTorch implementation of convolutional network-based text-to-speech synthesis models.

2.0K

Archived

Python

Speech-to-Text

Speech-Synthesis

PyTorch

#speech-processing #text-to-speech #multi-speaker

cosin2077/easyVoice

An open-source text-to-speech tool supporting long-form text and multi-voice narration.

2.0K

Active

TypeScript

AI Voice & Speech

API Frameworks

TypeScript

#tts #text-to-speech #edge-tts

nobody132/masr

A Mandarin automatic speech recognition (ASR) system written in Python.

2.0K

Archived

Python

React

#speech-recognition #mandarin-chinese #pytorch

crow-translate/crow-translate

A lightweight, open-source translator that supports multiple translation engines and features like OCR and text-to-speech.

1.9K

Archived

C++

General Utilities

CLI Tools

#translation #ocr #text-to-speech

julius-speech/julius

An open-source large-vocabulary continuous speech recognition (LVCSR) engine.

1.9K

Experimental

AI Voice & Speech

#speech-recognition #audio-processing #voice

miaomiaosoft/PandaOCR.Pro

A multi-functional OCR tool for text recognition, translation, text-to-speech, manga translation, and more.

1.9K

Stable

Computer Vision

File Storage

#ocr #text-recognition #translation

astorfi/lip-reading-deeplearning

Cross-modal lip reading using 3D convolutional neural networks for speech recognition.

1.9K

Archived

Python

Computer Vision

Speech Recognition

TensorFlow

#speech-recognition #computer-vision #deep-learning

jiesutd/NCRFpp

NCRF++: A Neural Sequence Labeling Toolkit for tasks like NER, POS tagging, and text chunking.

1.9K

Archived

Python

Agents & Orchestration

Named Entity Recognition

PyTorch

#neural-networks #sequence-labeling #chunking

alexpinel/Dot

A developer-focused platform for text-to-speech, RAG, and LLMs, with local-first architecture.

1.9K

Archived

JavaScript

LLM Frameworks

RAG & Vector

React

#text-to-speech #rag #llm

alan-ai/alan-sdk-ios

The Alan AI SDK for iOS enables developers to add voice AI and conversational interfaces to their mobile apps.

1.9K

Experimental

Objective-C

AI Voice & Speech

iOS

#alan-ai #voice-ai #speech-recognition

facebookresearch/denoiser

A real-time speech enhancement model that runs on a laptop CPU, useful for AI audio processing.

1.9K

Archived

Python

Audio & Speech

API Frameworks

PyTorch

#speech-enhancement #audio-processing #real-time

gpt-omni/mini-omni2

An open-source project working toward a GPT-4o-style AI assistant with vision, speech, and duplex capabilities.

1.9K

Archived

Python

LLM Frameworks

Agents & Orchestration

Python

#gpt-4 #multimodal-ai #open-source

syhw/wer_are_we

An open-source repository that tracks the state of the art and recent results in speech recognition research.

1.9K

Archived

Speech Recognition

#speech-recognition #wer #deep-learning

praat/praat.github.io

Open-source software for doing phonetics by computer, focused on speech analysis.

1.9K

Active

API Frameworks

Databases

#speech-analysis #acoustics #phonetics

ricky0123/vad

A TypeScript voice activity detector (VAD) library with a simple API for the browser.

1.9K

Active

TypeScript

AI Voice & Speech

Frontend Frameworks

TypeScript

#speech-to-text #voice-activity-detection #web-audio-api
