A pure Java speech recognition library that can be used in various applications.
A curated list of Machine Learning, AI, and NLP solutions for iOS development.
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
An open-source robotics operating system (ROS) with support for speech recognition, semantic understanding, visual control, and Gazebo simulation.
Automatically generate YouTube subtitles using OpenAI's Whisper speech recognition model
A C++ library for synchronizing subtitles with audio/video content using speech recognition.
Provides demos and examples for the Web Speech API, a powerful tool for adding speech recognition and synthesis to web apps.
A unified UI and API for processing and training images for facial recognition across various AI tools.
A Python library for processing and analyzing scientific audio data, particularly for bird song detection and recognition.
A Facebook Messenger Bot with voice recognition, NLP, and features like restaurant search and memo transcription.
A curated collection of the latest CVPR (Computer Vision and Pattern Recognition) papers, code, and demos for AI developers.
A TensorFlow-based face recognition neural network library for Python developers.
An open-source virtual assistant for Ubuntu-based Linux distributions, focused on speech recognition and natural language processing.
A neural network model for detecting emotions in speech audio using Python and deep learning.
An OBS plugin that enables real-time speech recognition and captioning using AI models like OpenAI Whisper.
This repository contains solutions to assignments for the CS231n course on Convolutional Neural Networks for Visual Recognition.
SALMONN is a suite of advanced multi-modal large language models (LLMs) for audio, speech, and video understanding.
A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.
A highly accurate deep learning-based license plate detection and recognition system for real-time use.
A Linux app for speech-to-text, text-to-speech, and machine translation, with offline capabilities.