This project provides advanced voiceprint recognition models and data preprocessing methods using PyTorch.
A speech recognition server based on the Vosk and Kaldi libraries, supporting the WebSocket, gRPC, and WebRTC protocols.
A PyTorch library for video activity recognition using C3D, R3D, and R2Plus1D models.
SincNet is a neural architecture for efficiently processing raw audio samples for speech and audio processing tasks.
A Python command-line client for the Whisper speech-to-text model by OpenAI, using the CTranslate2 library.
A research and production-oriented toolkit for speaker verification, recognition, and diarization using AI and ML techniques.
A real-time emotion recognition library using computer vision and deep learning.
Train, evaluate, optimize, and deploy computer vision models with OpenVINO, a toolkit for accelerating deep learning on edge devices.
A Python toolbox for skeleton-based action recognition using deep learning and PyTorch.
Open-source PyTorch implementation of an end-to-end automatic speech recognition (ASR) system.
A high-performance, end-to-end deep learning-based captcha recognition model for developers.
Fine-tune and deploy the Whisper speech recognition model with accelerated inference and support for various platforms.
A curated list of papers and resources on open set recognition, out-of-distribution detection, and open world recognition.
A web app that uses AI to index videos, enable semantic search, and export scenes for developers building with AI tools.
A voice chat app built with Python that leverages AI-powered speech recognition and text-to-speech capabilities.
A face recognition library for Go that uses the dlib computer vision library.
Mousai is a Rust-based application that can identify songs in seconds, similar to Shazam.
A Python framework for sequence labeling evaluation, useful for named-entity recognition and POS tagging.
A powerful named entity recognition tool for building AI-powered applications in Python.
A Python library for training speaker recognition models using the VoxCeleb dataset.