SALMONN is a suite of advanced multi-modal large language models (LLMs) for audio, speech, and video understanding.
A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.
A Python-based project that provides a TTS API server and Gradio-based web UI for speech synthesis and voice generation.
A Persian natural language processing toolkit for tasks like tokenization, lemmatization, and part-of-speech tagging.
A Linux app for speech-to-text, text-to-speech, and machine translation, with offline capabilities.
A ComfyUI integration for Microsoft's VibeVoice text-to-speech model.
A CLI for on-device speech transcription using Speech.framework on macOS.
This is a Telegram group index list focused on freedom of speech, not directly related to AI tools or vibe coders.
icefall is a Python library for building Automatic Speech Recognition (ASR) systems using Finite State Transducers (FSTs).
A tool to record Microsoft Edge browser's text-to-speech (TTS) audio and output it as .wav files on Windows.
A Jupyter Notebook project that converts PDF documents to audio using AI-powered text-to-speech.
A community-driven Python bot that aims to stay simple while helping people with everyday tasks.
Step-Audio 2 is an end-to-end multi-modal large language model for industry-strength audio understanding and speech conversation.
A web component wrapper for the Web Speech API, enabling voice recognition and speech synthesis.
Synchronized Translation for Videos: Automatic dubbing and subtitling for video content.
A collection of natural language processing (NLP) research papers with code implementations in TensorFlow and PyTorch.
speak.js is a text-to-speech library for JavaScript that uses the eSpeak speech synthesis engine.
An offline speech-to-text app for macOS that uses local ML for fully private voice dictation, with no cloud dependency.
This open-source Python library is a toolkit for building speech synthesis and voice conversion systems using deep learning.
A Neovim AI plugin that enables ChatGPT sessions, instructable text/code operations, and speech-to-text functionality.