An open-world object detection model built on the DINO framework.
A deep reinforcement learning library for training robotic agents to plan pushing and grasping actions for manipulation tasks.
AnomalyGPT is a tool for detecting industrial anomalies using large vision-language models.
Aria is an open-source multimodal AI framework for building vision and language models.
TableBank is a benchmark dataset for table detection and recognition, useful for building computer vision models.
An open-source OCR engine developed by SYSU DeepDriving Lab, focused on computer vision tasks.
A state-of-the-art (SOTA) computer vision model for image enhancement tasks such as denoising and deblurring.
An AI-powered anime super-resolution tool built for developers who work with computer vision models.
MSF is a modular framework for multi-sensor fusion based on an Extended Kalman Filter, useful for robotics and computer vision applications.
A collection of notes and projects on machine learning, deep learning, computer vision, NLP, and web scraping.
PixelLib is a Python library for image and video segmentation using deep learning models like Mask R-CNN, DeepLab, and PointRend.
A curated list of SLAM (Simultaneous Localization and Mapping) resources for developers working in computer vision and robotics.
A PyTorch library for training and deploying efficient computer vision models using ViT-based architectures.
A collection of research papers and resources on computer vision tasks like image classification, object detection, and face recognition.
A comprehensive benchmark for spatio-temporal predictive learning, with a focus on AI-powered weather forecasting and video prediction.
A library for explaining the decisions made by Vision Transformers, a type of AI model used for computer vision tasks.
A curated list of papers on trajectory and motion prediction, a key topic in computer vision and robotics.
An algorithm for texturing 3D reconstructions from multi-view stereo images, useful for computer vision and 3D graphics projects.
An iOS app that uses computer vision to find and extract text from real-world objects using the device's camera.
A general representation model for cross-modal learning across vision, audio, and language.