LLaVA is a visual instruction tuning framework for large language-and-vision models, built towards GPT-4V-level capabilities.
SWE-agent is an AI-powered tool that automatically fixes GitHub issues using large language models.
A PyTorch-based library for robust and high-quality blind face restoration using a codebook lookup transformer.
An open-source real-time object detection library powered by the YOLOv10 neural network model.
An ultra-simple, state-of-the-art codebase for autoregressive image generation using advanced AI models.
A highly capable foundation model for monocular depth estimation, a key component in computer vision.
An AI-powered story generation tool for developers interested in vibe coding and creative AI applications.
A research paper that introduces a novel approach for using large language models to solve complex problems through a tree-based reasoning process.
Official implementation of a paper on segmentation models for computer vision tasks.
State-of-the-art neural search engine
HippoRAG is a novel RAG framework that enables LLMs to continuously integrate knowledge across external documents.
Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion
Pointcept is a codebase for point cloud perception research, featuring the latest works on 3D computer vision.
Multimodal conversational video generation powered by AI, enabling new vibe-coder collaboration experiences.
A real-time object detector based on YOLOv12, which uses an attention-centric architecture.
A benchmark for multimodal AI agents to tackle open-ended tasks in real computer environments.
A multimodal AI model for real-time vision and speech interaction, aimed at developers.
Autoformer: a deep learning model for long-term time series forecasting.
An end-to-end multimodal SVG generator that leverages pre-trained Vision-Language Models to create complex and detailed SVGs.
PartCrafter is a 3D mesh generation tool that uses compositional latent diffusion transformers to create structured 3D objects.