Explore Projects

Discover 368 open-source projects

Active filters (1):
Search: speech

Showing 81-100 of 368 projects

instillai/machine-learning-course

An open-source machine learning course with Python, covering algorithms, AI, and more.

7.1K
Archived
Python
Tutorials & Courses
Python
#machine-learning#artificial-intelligence#algorithms

moonshine-ai/moonshine

Fast and accurate automatic speech recognition (ASR) for edge devices

7.0K
Stable
Python
AI Voice & Speech
#speech-recognition#edge-computing#audio-processing

PaddlePaddle/models

Officially maintained deep learning models by PaddlePaddle, covering computer vision, NLP, speech, and more.

6.9K
Archived
Python
Computer Vision
Natural Language Processing
PaddlePaddle
#deep-learning#models#neural-network

pliang279/awesome-multimodal-ml

A comprehensive reading list for research topics in multimodal machine learning.

6.8K
Archived
Computer Vision
Natural Language Processing
#multimodal-learning#reading-list#machine-learning

TalAter/annyang

Speech recognition library for your web application, enabling voice interactions.

6.7K
Archived
JavaScript
Component Libraries (React)
AI Voice & Speech
React
#speech#speech-recognition#voice

NLPchina/ansj_seg

A high-performance Chinese text segmentation library with support for named entity recognition and part-of-speech tagging.

6.5K
Archived
Java
API Frameworks
ORMs & Query Builders
Java
#nlp#chinese#text-processing

flashlight/wav2letter

Facebook AI Research's end-to-end speech recognition toolkit written in C++.

6.4K
Active
C++
Speech Recognition
#speech-recognition#deep-learning#end-to-end

Blaizzy/mlx-audio

A high-performance text-to-speech, speech-to-text, and speech-to-speech library for Apple Silicon devices.

6.1K
Active
Python
AI Voice & Speech
CLI Tools
Apple MLX
#apple-silicon#speech-recognition#speech-synthesis

PaddlePaddle/PaddleX

PaddleX is an all-in-one development tool based on PaddlePaddle, providing AI pipelines for computer vision, NLP, and more.

6.1K
Active
Python
Computer Vision
Natural Language Processing
Python
#computer-vision#natural-language-processing#ocr

OpenBMB/VoxCPM

An open-source, tokenizer-free text-to-speech (TTS) model for context-aware speech generation and voice cloning.

6.0K
Active
Python
AI Voice & Speech
API Frameworks
PyTorch
#text-to-speech#voice-cloning#speech-synthesis

canopyai/Orpheus-TTS

Orpheus-TTS is a high-quality, real-time text-to-speech library for creating human-sounding AI voices.

6.0K
Stable
Python
AI Voice & Speech
Python
#llm#realtime#tts

snakers4/silero-models

Pre-trained text-to-speech models for various languages, made simple to use.

5.8K
Active
Jupyter Notebook
AI Voice & Speech
API Frameworks
PyTorch
#text-to-speech#speech-synthesis#pre-trained-models

argmaxinc/WhisperKit

An open-source on-device speech recognition library for Apple Silicon devices, built with Swift and Transformers.

5.7K
Active
Swift
AI Voice & Speech
iOS
#speech-recognition#transformers#inference

huggingface/parler-tts

Inference and training library for high-quality text-to-speech (TTS) models.

5.5K
Archived
Python
AI Voice & Speech
API Frameworks
Python
#text-to-speech#tts#speech-synthesis

promptslab/Awesome-Prompt-Engineering

This repository provides a curated collection of resources for Prompt Engineering with a focus on large language models like ChatGPT and GPT-3.

5.5K
Stable
Python
LLM Frameworks
Prompt Engineering
Python
#chatgpt#gpt-3#prompt-engineering

ibab/tensorflow-wavenet

A TensorFlow implementation of DeepMind's WaveNet paper for generating high-quality speech audio.

5.4K
Archived
Python
ML Ops
Computer Vision
TensorFlow
#speech-generation#audio-processing#deep-learning

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization using OpenAI Whisper

5.4K
Stable
Jupyter Notebook
React
#asr#speaker-diarization#speech-recognition

modelscope/FunClip

Open-source video speech recognition & clipping tool with LLM-based AI clipping capabilities

5.4K
Experimental
Python
LLM Frameworks
AI Voice & Speech
gradio
#speech-recognition#video-subtitles#llm

NVIDIA/tacotron2

A PyTorch implementation of Tacotron 2, a state-of-the-art text-to-speech model, with faster-than-realtime inference.

5.3K
Archived
Jupyter Notebook
Speech & Audio
Inference
PyTorch
#text-to-speech#audio-generation#machine-learning

claritylab/lucida

Lucida is a speech and vision based intelligent personal assistant built with Java.

4.8K
Archived
Java
AI Voice & Speech
Computer Vision
#personal-assistant#speech-recognition#computer-vision