Explore Projects

Discover 4 open source projects

Active filters (1):
Search: vision-language-transformer

Showing 1-4 of 4 projects

salesforce/LAVIS

LAVIS is a comprehensive library for multimodal deep learning, including image captioning, visual question answering, and more.

11.2K
Archived
Jupyter Notebook
Vision-Language Transformer
PyTorch
#deep-learning #multimodal-learning #vision-language

IDEA-Research/GroundingDINO

Official implementation of the paper 'Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection'.

9.8K
Archived
Python
Computer Vision
Python
#object-detection #open-world #open-world-detection

salesforce/BLIP

PyTorch code for Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

5.7K
Archived
Jupyter Notebook
React
#vision-language #pre-training #unified-vision-language

AlibabaResearch/AdvancedLiterateMachinery

An innovative AI-powered document understanding and OCR platform from Alibaba Research.

1.8K
Experimental
C++
Computer Vision
Document Intelligence
#ocr #document-recognition #document-understanding
