Explore Projects

Discover 4 open source projects

Search: vision-language-transformer

LAVIS is a comprehensive library for multimodal deep learning, including image captioning, visual question answering, and more.

11.2K

Archived

Jupyter Notebook

Vision-Language Transformer

PyTorch

#deep-learning #multimodal-learning #vision-language

Official implementation of the paper 'Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection'.

9.8K

Archived

Python

Computer Vision

#object-detection #open-world #open-world-detection

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation.

5.7K

Archived

Jupyter Notebook

React

#vision-language #pre-training #unified-vision-language

A document understanding and OCR platform from Alibaba Research.

1.8K

Experimental

C++

Computer Vision

Document Intelligence

#ocr #document-recognition #document-understanding
