Explore Projects

Discover 368 open source projects

Showing 261-280 of 368 projects

bytedance/SALMONN

SALMONN is a suite of advanced multi-modal large language models (LLMs) for audio, speech, and video understanding.

1.4K

Stable

LLM Frameworks

Speech Recognition

#audio-processing #speech-recognition #video-understanding

coqui-ai/open-speech-corpora

A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.

1.4K

Archived

AI Voice & Speech

Databases

#speech-recognition #speech-synthesis #speech-processing

lenML/Speech-AI-Forge

A Python-based project that provides a TTS API server and Gradio-based web UI for speech synthesis and voice generation.

1.4K

Active

Python

AI Voice & Speech

API Frameworks

Gradio

#text-to-speech #speech-synthesis #api-server

roshan-research/hazm

A Persian natural language processing toolkit for tasks like tokenization, lemmatization, and part-of-speech tagging.

1.4K

Stable

Python

NLP

Backend Frameworks

Python

#natural-language-processing #persian #tokenizer

mkiol/dsnote

A Linux app for speech-to-text, text-to-speech, and machine translation, with offline capabilities.

1.4K

Active

C++

AI Voice & Speech

API Frameworks

#speech-recognition #speech-synthesis #machine-translation

Enemyx-net/VibeVoice-ComfyUI

ComfyUI integration for Microsoft's VibeVoice text-to-speech model

1.4K

Stable

Python

AI Voice & Speech

ComfyUI Custom Nodes

React

#text-to-speech #voice-cloning #ComfyUI

finnvoor/yap

A CLI for on-device speech transcription using Speech.framework on macOS

1.4K

Stable

Swift

AI Code Editors

MCP Frameworks

React

#speech-transcription #on-device #macOS

telegramlist/telegramlist

This is a Telegram group index list focused on freedom of speech, not directly related to AI tools or vibe coders.

1.4K

Archived

Resource Collections

#telegram #group-index #freedom-of-speech

k2-fsa/icefall

icefall is a Python library for building Automatic Speech Recognition (ASR) systems using Finite State Transducers (FSTs).

1.4K

Stable

Python

LLM Frameworks

API Frameworks

Python

#speech-recognition #asr #fst

LuckyHookin/edge-TTS-record

A tool to record Microsoft Edge browser's text-to-speech (TTS) audio and output it as .wav files on Windows.

1.4K

Archived

HTML

Frontend Frameworks

CLI Tools

Vue.js

#edge #tts #audio-recording

lamm-mit/PDF2Audio

A Jupyter Notebook project that converts PDF documents to audio using AI-powered text-to-speech.

1.4K

Experimental

Jupyter Notebook

AI Voice & Speech

Jupyter Notebook

#text-to-speech #pdf #audio

swapagarwal/JARVIS-on-Messenger

A community-driven Python bot that aims to be a simple way to help people with everyday tasks.

1.4K

Archived

Python

Agents & Orchestration

API Frameworks

Python

#assistant #bot #chat

stepfun-ai/Step-Audio2

Step-Audio 2 is an end-to-end multi-modal large language model for industry-strength audio understanding and speech conversation.

1.4K

Stable

Python

LLM Frameworks

AI Voice & Speech

Python

#audio-understanding #speech-conversation #multi-modal

zenorocha/voice-elements

A web component wrapper for the Web Speech API, enabling voice recognition and speech synthesis.

1.3K

Archived

HTML

AI Voice & Speech

Polymer

#voice-recognition #speech-synthesis #web-components

R3gm/SoniTranslate

Synchronized Translation for Videos: Automatic dubbing and subtitling for video content.

1.3K

Stable

Python

AI Voice & Speech

CMS & Content

#video-dubbing #speech-to-text #text-to-speech

DengBoCong/nlp-paper

A collection of natural language processing (NLP) research papers with code implementations in TensorFlow and PyTorch.

1.3K

Archived

Python

LLM Frameworks

API Frameworks

PyTorch

#nlp #machine-learning #deep-learning

kripken/speak.js

speak.js is a text-to-speech library for JavaScript that uses the eSpeak speech synthesis engine.

1.3K

Archived

C++

AI Voice & Speech

#text-to-speech #speech-synthesis #javascript

altic-dev/FluidVoice

macOS offline speech-to-text app using local ML—no cloud, fully private voice dictation

1.3K

Active

Swift

Desktop Model Runners

AI Voice & Speech

Swift

#offline-dictation #voice-to-text #local-inference

CSTR-Edinburgh/merlin

This open-source Python library is a toolkit for building speech synthesis and voice conversion systems using deep learning.

1.3K

Archived

Python

Speech Synthesis

Voice Conversion

#speech-synthesis #voice-conversion #text-to-speech

Robitx/gp.nvim

A Neovim AI plugin that enables ChatGPT sessions, instructable text/code operations, and speech-to-text functionality.

1.3K

Stable

Lua

LLM Wrappers & SDKs

AI Code Editors

Neovim

#neovim #chatgpt #speech-to-text
