Explore Projects

Discover 368 open source projects

Active filters (1):
Search: speech

Showing 181-200 of 368 projects

ming024/FastSpeech2

FastSpeech 2 implementation for high-quality end-to-end text-to-speech

2.2K
Archived
Python
Prompt Engineering
None
React
#text-to-speech#natural-language-processing#machine-learning

react-native-voice/voice

A React Native library for voice recognition on iOS and Android, with online and offline support.

2.2K
Active
TypeScript
React Native
AI Voice & Speech
React Native
#voice-recognition#speech-recognition#android

SeanNaren/deepspeech.pytorch

An open-source library for building speech recognition models using the DeepSpeech2 architecture.

2.1K
Archived
Python
AI Voice & Speech
PyTorch
#speech-recognition#deep-learning#audio-processing

yan5xu/ququ

An open-source, privacy-first desktop voice assistant that integrates local speech recognition and configurable language models.

2.0K
Stable
JavaScript
AI Voice & Speech
AI App Builders
Electron
#ai-voice-recognition#speech-to-text#local-processing

TEN-framework/ten-vad

A lightweight, high-performance voice activity detector (VAD) library for conversational AI and real-time speech processing.

2.0K
Stable
C
AI Voice & Speech
API Frameworks
C
#audio#speech-processing#real-time

r9y9/deepvoice3_pytorch

A PyTorch implementation of convolutional neural network-based text-to-speech synthesis models.

2.0K
Archived
Python
Speech-to-Text
Speech-Synthesis
PyTorch
#speech-processing#text-to-speech#multi-speaker

cosin2077/easyVoice

An open-source text-to-speech tool supporting long-form text and multi-voice narration.

2.0K
Active
TypeScript
AI Voice & Speech
API Frameworks
TypeScript
#tts#text-to-speech#edge-tts

nobody132/masr

A Mandarin automatic speech recognition system written in Python.

2.0K
Archived
Python
React
#speech-recognition#mandarin-chinese#pytorch

crow-translate/crow-translate

A lightweight, open-source translator that supports multiple translation engines and features like OCR and text-to-speech.

1.9K
Archived
C++
General Utilities
CLI Tools
#translation#ocr#text-to-speech

julius-speech/julius

Open-source large vocabulary continuous speech recognition engine for various applications.

1.9K
Experimental
C
AI Voice & Speech
#speech-recognition#audio-processing#voice

miaomiaosoft/PandaOCR.Pro

A multi-functional OCR tool for text recognition, translation, text-to-speech, manga translation, and more.

1.9K
Stable
Computer Vision
File Storage
#ocr#text-recognition#translation

astorfi/lip-reading-deeplearning

Cross-modal lip reading using 3D convolutional neural networks for speech recognition.

1.9K
Archived
Python
Computer Vision
Speech Recognition
TensorFlow
#speech-recognition#computer-vision#deep-learning

jiesutd/NCRFpp

NCRF++: A Neural Sequence Labeling Toolkit for tasks like NER, POS tagging, and text chunking.

1.9K
Archived
Python
Agents & Orchestration
Named Entity Recognition
PyTorch
#neural-networks#sequence-labeling#chunking

alexpinel/Dot

A developer-focused platform for text-to-speech, RAG, and LLMs, with local-first architecture.

1.9K
Archived
JavaScript
LLM Frameworks
RAG & Vector
React
#text-to-speech#rag#llm

alan-ai/alan-sdk-ios

The Alan AI SDK for iOS enables developers to add voice AI and conversational interfaces to their mobile apps.

1.9K
Experimental
Objective-C
AI Voice & Speech
iOS
iOS
#alan-ai#voice-ai#speech-recognition

facebookresearch/denoiser

A real-time speech enhancement model that runs on a laptop CPU, useful for AI audio processing.

1.9K
Archived
Python
Audio & Speech
API Frameworks
PyTorch
#speech-enhancement#audio-processing#real-time

gpt-omni/mini-omni2

An open-source project working toward a GPT-4o-style AI assistant with vision, speech, and duplex capabilities.

1.9K
Archived
Python
LLM Frameworks
Agents & Orchestration
Python
#gpt-4#multimodal-ai#open-source

syhw/wer_are_we

An open-source repository that tracks the state of the art and recent results in speech recognition research.

1.9K
Archived
Speech Recognition
#speech-recognition#wer#deep-learning

praat/praat.github.io

Open-source software for doing phonetics by computer, focused on speech analysis.

1.9K
Active
C
API Frameworks
Databases
#speech-analysis#acoustics#phonetics

ricky0123/vad

A TypeScript library for a voice activity detector (VAD) with a simple API for the browser.

1.9K
Active
TypeScript
AI Voice & Speech
Frontend Frameworks
TypeScript
#speech-to-text#voice-activity-detection#web-audio-api
