Explore Projects

Discover 368 open source projects

Search: speech

Showing 81-100 of 368 projects

instillai/machine-learning-course

An open-source machine learning course with Python, covering algorithms, AI, and more.

7.1K

Archived

Python

Tutorials & Courses

Python

#machine-learning #artificial-intelligence #algorithms

moonshine-ai/moonshine

Fast and accurate automatic speech recognition (ASR) for edge devices

7.0K

Stable

Python

AI Voice & Speech

#speech-recognition #edge-computing #audio-processing

PaddlePaddle/models

Officially maintained deep learning models by PaddlePaddle, covering computer vision, NLP, speech, and more.

6.9K

Archived

Python

Computer Vision

Natural Language Processing

PaddlePaddle

#deep-learning #models #neural-network

pliang279/awesome-multimodal-ml

A comprehensive reading list for research topics in multimodal machine learning.

6.8K

Archived

Computer Vision

Natural Language Processing

#multimodal-learning #reading-list #machine-learning

TalAter/annyang

Speech recognition library for your web application, enabling voice interactions.

6.7K

Archived

JavaScript

Component Libraries (React)

AI Voice & Speech

React

#speech #speech-recognition #voice

NLPchina/ansj_seg

A high-performance Chinese text segmentation library with support for named entity recognition and part-of-speech tagging.

6.5K

Archived

Java

API Frameworks

ORMs & Query Builders

Java

#nlp #chinese #text-processing

flashlight/wav2letter

Facebook AI Research's end-to-end speech recognition toolkit written in C++.

6.4K

Active

C++

Speech Recognition

#speech-recognition #deep-learning #end-to-end

Blaizzy/mlx-audio

A high-performance text-to-speech, speech-to-text, and speech-to-speech library for Apple Silicon devices.

6.1K

Active

Python

AI Voice & Speech

CLI Tools

Apple MLX

#apple-silicon #speech-recognition #speech-synthesis

PaddlePaddle/PaddleX

PaddleX is an all-in-one development tool based on PaddlePaddle, providing AI pipelines for computer vision, NLP, and more.

6.1K

Active

Python

Computer Vision

Natural Language Processing

Python

#computer-vision #natural-language-processing #ocr

OpenBMB/VoxCPM

An open-source, tokenizer-free text-to-speech (TTS) model for context-aware speech generation and voice cloning.

6.0K

Active

Python

AI Voice & Speech

API Frameworks

PyTorch

#text-to-speech #voice-cloning #speech-synthesis

canopyai/Orpheus-TTS

Orpheus-TTS is a high-quality, real-time text-to-speech library for creating human-sounding AI voices.

6.0K

Stable

Python

AI Voice & Speech

Python

#llm #realtime #tts

snakers4/silero-models

Pre-trained text-to-speech models for various languages, made simple to use.

5.8K

Active

Jupyter Notebook

AI Voice & Speech

API Frameworks

PyTorch

#text-to-speech #speech-synthesis #pre-trained-models

argmaxinc/WhisperKit

An open-source on-device speech recognition library for Apple Silicon devices, built with Swift and Transformers.

5.7K

Active

Swift

AI Voice & Speech

iOS

#speech-recognition #transformers #inference

huggingface/parler-tts

Inference and training library for high-quality text-to-speech (TTS) models.

5.5K

Archived

Python

AI Voice & Speech

API Frameworks

Python

#text-to-speech #tts #speech-synthesis

promptslab/Awesome-Prompt-Engineering

This repository provides a curated collection of resources for Prompt Engineering with a focus on large language models like ChatGPT and GPT-3.

5.5K

Stable

Python

LLM Frameworks

Prompt Engineering

Python

#chatgpt #gpt-3 #prompt-engineering

ibab/tensorflow-wavenet

A TensorFlow implementation of DeepMind's WaveNet paper for generating high-quality speech audio.

5.4K

Archived

Python

ML Ops

Computer Vision

TensorFlow

#speech-generation #audio-processing #deep-learning

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization using OpenAI Whisper

5.4K

Stable

Jupyter Notebook

React

#asr #speaker-diarization #speech-recognition

modelscope/FunClip

Open-source video speech recognition & clipping tool with LLM-based AI clipping capabilities

5.4K

Experimental

Python

LLM Frameworks

AI Voice & Speech

gradio

#speech-recognition #video-subtitles #llm

NVIDIA/tacotron2

A PyTorch implementation of Tacotron 2, a state-of-the-art text-to-speech model, with faster-than-realtime inference.

5.3K

Archived

Jupyter Notebook

Speech & Audio

Inference

PyTorch

#text-to-speech #audio-generation #machine-learning

claritylab/lucida

Lucida is a speech- and vision-based intelligent personal assistant built with Java.

4.8K

Archived

Java

AI Voice & Speech

Computer Vision

#personal-assistant #speech-recognition #computer-vision
