Explore Projects

Discover 426 open source projects

Search filter: recognition

Showing 341-360 of 426 projects

yeyupiaoling/VoiceprintRecognition-Pytorch

This project provides advanced voiceprint recognition models and data preprocessing methods using PyTorch.

1.2K

Stable

Python

AI Voice & Speech

API Frameworks

PyTorch

#speaker-recognition #voice-recognition #arcface

alphacep/vosk-server

A speech recognition server based on Vosk and Kaldi libraries, supporting WebSocket, gRPC, and WebRTC protocols.

1.2K

Experimental

Python

AI Voice & Speech

BaaS Platforms

Python

#speech-recognition #asr #kaldi

jfzhang95/pytorch-video-recognition

A PyTorch library for video activity recognition using C3D, R3D, and R2Plus1D models.

1.2K

Archived

Python

Computer Vision

API Frameworks

PyTorch

#video-activity-recognition #c3d #r3d

mravanelli/SincNet

SincNet is a neural architecture that efficiently processes raw audio samples for speech and audio tasks.

1.2K

Archived

Python

Audio & Speech

Signal Processing

PyTorch

#audio-processing #speech-recognition #signal-processing

Softcatala/whisper-ctranslate2

A Python command-line client for the Whisper speech-to-text model by OpenAI, using the CTranslate2 library.

1.2K

Stable

Python

LLM Wrappers & SDKs

API Frameworks

Python

#openai-whisper #speech-recognition #speech-to-text

wenet-e2e/wespeaker

A research and production-oriented toolkit for speaker verification, recognition, and diarization.

1.2K

Active

Python

Speech & Voice

API Frameworks

PyTorch

#speech-recognition #speaker-verification #speaker-diarization

otaha178/Emotion-recognition

A real-time emotion recognition library using computer vision and deep learning.

1.2K

Archived

Python

Computer Vision

API Frameworks

Python

#emotion-recognition #computer-vision #deep-learning

open-edge-platform/training_extensions

Train, evaluate, optimize, and deploy computer vision models with OpenVINO, a toolkit for accelerating deep learning on edge devices.

1.2K

Active

Python

Computer Vision

API Frameworks

PyTorch

#computer-vision #openvino #deep-learning

kennymckormick/pyskl

A Python toolbox for skeleton-based action recognition using deep learning and PyTorch.

1.2K

Experimental

Python

Computer Vision

API Frameworks

PyTorch

#action-recognition #computer-vision #deep-learning

Alexander-H-Liu/End-to-end-ASR-Pytorch

Open-source PyTorch implementation of an end-to-end automatic speech recognition (ASR) system.

1.2K

Archived

Python

Speech & Voice

API Frameworks

PyTorch

#speech-recognition #asr #pytorch

dee1024/pytorch-captcha-recognition

An end-to-end, deep learning-based captcha recognition model implemented in PyTorch.

1.2K

Archived

Python

Computer Vision

API Frameworks

PyTorch

#computer-vision #deep-learning #captcha-recognition

yeyupiaoling/Whisper-Finetune

Fine-tune and deploy the Whisper speech recognition model with accelerated inference and support for various platforms.

1.2K

Stable

Fine-tuning

Inference

PyTorch

#speech-recognition #whisper #fine-tune

iCGY96/awesome_OpenSetRecognition_list

A curated list of papers and resources on open set recognition, out-of-distribution detection, and open world recognition.

1.2K

Archived

Open Set Recognition

Out-of-Distribution Detection

#open-set-recognition #open-set-domain-adaptation #open-world-recognition

IliasHad/edit-mind

A web app that uses AI to index videos, enable semantic search, and export scenes.

1.2K

Active

TypeScript

Computer Vision

Search-as-a-Service

Electron

#ai #computer-vision #video-processing

modal-labs/quillman

A voice chat app built with Python that leverages AI-powered speech recognition and text-to-speech capabilities.

1.2K

Experimental

Python

AI Voice & Speech

Serverless

#speech-recognition #text-to-speech #serverless

Kagami/go-face

A face recognition library for Go that uses the dlib computer vision library.

1.2K

Archived

Computer Vision

API Frameworks

#face-recognition #computer-vision #dlib

SeaDve/Mousai

Mousai is a Rust-based GTK application that identifies songs in seconds, similar to Shazam.

1.2K

Active

Rust

Audio Recognition

Component Libraries (GTK)

#audio-recognition #music-identification #gtk

chakki-works/seqeval

A Python framework for sequence labeling evaluation, useful for named-entity recognition and POS tagging.

1.2K

Archived

Python

Computer Vision

API Clients & Testing

Python

#named-entity-recognition #sequence-labeling #natural-language-processing

glample/tagger

A named entity recognition tagger written in Python.

1.2K

Archived

Python

NLP & Text

API Frameworks

Python

#named-entity-recognition #text-processing #natural-language-processing

clovaai/voxceleb_trainer

A Python library for training speaker recognition models using the VoxCeleb dataset.

1.2K

Archived

Python

Computer Vision

API Frameworks

Python

#speaker-recognition #speaker-verification #metric-learning
