A Python-based multimodal OCR tool for efficient offline processing of LaTeX, Chinese/English text, and tables on Windows.
A collection of pre-trained models for the dlib computer vision and machine learning library.
A Python library that lifts 2D objects to 3D using the Segment Anything model.
A computer vision and deep learning-based face mask detection system using OpenCV and TensorFlow/Keras.
An open-source collection of state-of-the-art visual object tracking algorithms built with PyTorch, for developers working on computer vision and deep learning projects.
A dense image captioning library in Torch for developers working on computer vision AI projects.
Official repository for a paper on a large vision-language model for medical applications.
A Python-based data infrastructure platform for declarative, incremental multimodal AI workloads.
A high-performance PyTorch implementation of SSD (Single Shot MultiBox Detector) for object detection.
A fast symbolic computation library for robotics, with capabilities like code generation and nonlinear optimization.
A collection of CVPR papers and code focused on low-level computer vision tasks.
Pretrained EfficientNet, EfficientNet-Lite, MixNet, MobileNetV3/V2, and other popular computer vision models for PyTorch.
An open-source Xray panel that supports multiple protocols, user management, and advanced features.
A Chinese typeface library derived from Ming Dynasty woodblock-printed books, useful for developers working with Chinese typography and computer vision.
A toolkit for 3D reconstruction and structure-from-motion computer vision tasks.
A PyTorch implementation of the Non-local Block, a key component in computer vision models.
FCIS is a library for instance-aware semantic segmentation, a computer vision technique useful for AI and robotics applications.
A collection of edge/contour/boundary detection papers and toolboxes for computer vision tasks.
A comprehensive list of implicit representation and NeRF papers related to the robotics/RL domain.
A powerful vision-language foundation model designed to advance multimodal AI understanding and reasoning.