A curated list of promising OCR (Optical Character Recognition) resources for developers.
A Python library for forced audio alignment, useful for speech recognition and audio processing tasks.
A self-supervised video representation learning model for video understanding tasks.
PaddleVideo is a toolkit for video understanding tasks such as action recognition, localization, and detection.
A Python library for Chinese word segmentation, part-of-speech tagging, and named entity recognition.
Underthesea is a Vietnamese natural language processing (NLP) toolkit for developers.
A curated list of resources dedicated to scene text localization and recognition.
A PDF-to-Markdown conversion tool powered by the visual recognition capabilities of large language models.
A self-coding system for Ionic apps, built on an AI-powered chatbot and voice assistant SDK.
Self-hosted personal gallery app with online photo management, EXIF parsing, geolocation, and WebGL viewer.
A Python-based desktop AI assistant that integrates with various LLMs and AI tools for coding, task automation, and more.
A Python binding to the Apache Tika REST service, enabling text extraction and parsing in Python.
A Python-based agent that uses speech recognition and text-to-speech to enable conversational interactions via WhatsApp.
A highly accurate natural language detection library for Python, suitable for short and mixed-language text.
A Python-based tool for generating subtitles using OpenAI's Whisper speech recognition model.
Real-time speech recognition and voice activity detection for offline use on multiple platforms.
A collection of Python code and resources related to natural language processing tasks like topic modeling, text classification, and machine translation.
Frontier CoreML audio models for iOS and macOS apps with text-to-speech, speech-to-text, voice activity detection, and speaker diarization.
A Python project for building knowledge graphs from scratch, covering named entity recognition, relation extraction, and question answering.
An implementation of the YOLOv13 object detection model with hypergraph-enhanced visual perception.