A modular deep learning framework from Facebook AI Research (FAIR) for multimodal AI research and applications.
Open-source toolkit for evaluating large multimodal AI models, supporting 220+ models and 80+ benchmarks.
InternGPT is an open-source demo platform that showcases AI models such as DragGAN, ChatGPT, and ImageBind, along with multimodal chat.
A comprehensive survey of question answering (QA) systems, including KBQA, TextQA, TableQA, VisualQA, and MRC.
Bottom-up attention model for image captioning and visual question answering, built on Faster R-CNN and Visual Genome.
Prismer: A vision-language model with multi-task experts, for image captioning and other vision-language applications.
An AI-powered image captioning and image-text search platform for developers building with AI tools.