PyTorch code for BLIP (Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation).
Official repository for the OFA (Unifying Architectures, Tasks, and Modalities) AI model, supporting various vision-language tasks.
Bottom-up attention model for image captioning and visual question answering, built on Faster R-CNN and Visual Genome.