A Python-based web service API that provides an easy-to-use interface for the OpenAI Whisper automatic speech recognition model.
A comprehensive collection of research papers on automatic speech recognition, speech synthesis, and related topics.
An open-source deep learning toolkit for building Speech-to-Text models and deploying them easily.
A lightweight, high-performance voice activity detector (VAD) library for conversational AI and real-time speech processing.
Open-source, industrial-grade ASR models supporting Mandarin, Chinese dialects, and English, with strong recognition of sung lyrics.
Frontier CoreML audio models for iOS and macOS apps with text-to-speech, speech-to-text, voice activity detection, and speaker diarization.
PORORO is a powerful Python library that provides a wide range of neural models for natural language processing tasks.
TensorFlowASR is a near state-of-the-art automatic speech recognition library for TensorFlow 2.