A transformer for video restoration: deblurring, denoising, and super-resolution.
A PyTorch library that provides Vision Transformer (ViT) adapters for dense prediction tasks like object detection and semantic segmentation.
All-in-one training for vision models with pretraining, fine-tuning, and distillation capabilities.
A collection of recent Transformer-based computer vision and related research papers.
A curated collection of resources on applying Transformers to medical imaging tasks like segmentation, classification, and synthesis.
A curated list of attention modules and other plug-and-play modules for computer vision in Python.
A Tokens-to-Token Vision Transformer (T2T-ViT) model for training Vision Transformers from scratch on ImageNet.
Official PyTorch implementation of VoxFormer, a 3D computer vision model for autonomous driving and scene understanding.
A curated list of foundation models for vision and language tasks.
A library for explaining the decisions made by Vision Transformers.
A general representation model for cross-modal learning across vision, audio, and language.
A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery.