Explore Projects

Discover 368 open source projects

Active filters (1):
Search: speech

Showing 261-280 of 368 projects

bytedance/SALMONN

SALMONN is a suite of advanced multi-modal large language models (LLMs) for audio, speech, and video understanding.

1.4K
Stable
LLM Frameworks
Speech Recognition
#audio-processing #speech-recognition #video-understanding

coqui-ai/open-speech-corpora

A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.

1.4K
Archived
AI Voice & Speech
Databases
#speech-recognition #speech-synthesis #speech-processing

lenML/Speech-AI-Forge

A Python-based project that provides a TTS API server and Gradio-based web UI for speech synthesis and voice generation.

1.4K
Active
Python
AI Voice & Speech
API Frameworks
Gradio
#text-to-speech #speech-synthesis #api-server

roshan-research/hazm

A Persian natural language processing toolkit for tasks like tokenization, lemmatization, and part-of-speech tagging.

1.4K
Stable
Python
NLP
Backend Frameworks
Python
#natural-language-processing #persian #tokenizer

mkiol/dsnote

A Linux app for speech-to-text, text-to-speech, and machine translation, with offline capabilities.

1.4K
Active
C++
AI Voice & Speech
API Frameworks
#speech-recognition #speech-synthesis #machine-translation

Enemyx-net/VibeVoice-ComfyUI

A ComfyUI integration for Microsoft's VibeVoice text-to-speech model.

1.4K
Stable
Python
AI Voice & Speech
ComfyUI Custom Nodes
React
#text-to-speech #voice-cloning #ComfyUI

finnvoor/yap

A CLI for on-device speech transcription using Speech.framework on macOS.

1.4K
Stable
Swift
AI Code Editors
MCP Frameworks
React
#speech-transcription #on-device #macOS

telegramlist/telegramlist

This is a Telegram group index list focused on freedom of speech, not directly related to AI tools or vibe coders.

1.4K
Archived
Resource Collections
#telegram #group-index #freedom-of-speech

k2-fsa/icefall

icefall is a Python library for building Automatic Speech Recognition (ASR) systems using Finite State Transducers (FSTs).

1.4K
Stable
Python
LLM Frameworks
API Frameworks
Python
#speech-recognition #asr #fst

LuckyHookin/edge-TTS-record

A tool to record Microsoft Edge browser's text-to-speech (TTS) audio and output it as .wav files on Windows.

1.4K
Archived
HTML
Frontend Frameworks
CLI Tools
Vue.js
#edge #tts #audio-recording

lamm-mit/PDF2Audio

A Jupyter Notebook project that converts PDF documents to audio using AI-powered text-to-speech.

1.4K
Experimental
Jupyter Notebook
AI Voice & Speech
Jupyter Notebook
#text-to-speech #pdf #audio

swapagarwal/JARVIS-on-Messenger

A community-driven Python bot that aims to help people with everyday tasks in a simple way.

1.4K
Archived
Python
Agents & Orchestration
API Frameworks
Python
#assistant #bot #chat

stepfun-ai/Step-Audio2

Step-Audio 2 is an end-to-end multi-modal large language model for industry-strength audio understanding and speech conversation.

1.4K
Stable
Python
LLM Frameworks
AI Voice & Speech
Python
#audio-understanding #speech-conversation #multi-modal

zenorocha/voice-elements

A web component wrapper for the Web Speech API, enabling voice recognition and speech synthesis.

1.3K
Archived
HTML
AI Voice & Speech
Polymer
#voice-recognition #speech-synthesis #web-components

R3gm/SoniTranslate

Synchronized Translation for Videos: Automatic dubbing and subtitling for video content.

1.3K
Stable
Python
AI Voice & Speech
CMS & Content
#video-dubbing #speech-to-text #text-to-speech

DengBoCong/nlp-paper

A collection of natural language processing (NLP) research papers with code implementations in TensorFlow and PyTorch.

1.3K
Archived
Python
LLM Frameworks
API Frameworks
PyTorch
#nlp #machine-learning #deep-learning

kripken/speak.js

speak.js is a text-to-speech library for JavaScript that uses the eSpeak speech synthesis engine.

1.3K
Archived
C++
AI Voice & Speech
#text-to-speech #speech-synthesis #javascript

altic-dev/FluidVoice

A macOS offline speech-to-text app that uses local ML for fully private voice dictation, with no cloud dependency.

1.3K
Active
Swift
Desktop Model Runners
AI Voice & Speech
Swift
#offline-dictation #voice-to-text #local-inference

CSTR-Edinburgh/merlin

An open-source Python toolkit for building speech synthesis and voice conversion systems using deep learning.

1.3K
Archived
Python
Speech Synthesis
Voice Conversion
#speech-synthesis #voice-conversion #text-to-speech

Robitx/gp.nvim

A Neovim AI plugin that provides ChatGPT sessions, instructable text/code operations, and speech-to-text functionality.

1.3K
Stable
Lua
LLM Wrappers & SDKs
AI Code Editors
Neovim
#neovim #chatgpt #speech-to-text