Object detection toolbox for PyTorch with support for multiple tasks and state-of-the-art models.
A deep learning model that converts images of mathematical equations into LaTeX code.
Advanced AI Explainability library for computer vision models built with PyTorch.
This repository contains demos for the Transformers library by Hugging Face, a popular NLP and computer vision library.
An ultra-simple, state-of-the-art codebase for autoregressive image generation using advanced AI models.
A Python library for ingesting, parsing, and optimizing any data format for enhanced compatibility with GenAI frameworks.
An open-source project that provides a state-of-the-art image restoration model using the Swin Transformer architecture.
Efficient AI model backbones developed by Huawei's Noah's Ark Lab, including GhostNet, TNT, and MLP.
A pre-training toolbox and benchmark for vision AI models, including self-supervised learning and state-of-the-art architectures.
JAX library for computer vision research with transformers, attention mechanisms, and vision models.
A fast and simple framework for building neural data processing pipelines using Python.
Efficient vision foundation models for high-resolution generation and perception.
A comprehensive multimodal system for long-term streaming video and audio interactions using large language models.
A video foundation model and dataset for multimodal and video understanding tasks.
Official PyTorch implementation of a novel method for visualizing classifications by Transformer-based networks.
A simple vision transformer baseline for human pose estimation, with pre-trained models and advanced capabilities.
An all-in-one computer vision toolkit for developers building AI-powered applications.
A collection of Microsoft's work on NAS and Vision Transformer for efficient AI models.
A self-supervised video representation learning model for video understanding tasks.
A Python library that helps developers extract structured data from tricky documents using vision-language models.