Explore Projects

Discover 10 open source projects

Active filters (1):
Search: multimodal-deep-learning

salesforce/LAVIS

LAVIS is a comprehensive library for multimodal deep learning, including image captioning, visual question answering, and more.

11.2K
Archived
Jupyter Notebook
Vision-Language Transformer
PyTorch
#deep-learning #multimodal-learning #vision-language

AI4Finance-Foundation/FinRobot

An open-source AI agent platform for financial analysis using large language models (LLMs).

6.3K
Active
Jupyter Notebook
LLM Frameworks
API Frameworks
#aiagent #chatgpt #finance

KimMeen/Time-LLM

An official implementation of a time series forecasting model using large language models.

2.5K
Stable
Python
LLM Frameworks
Time Series
#time-series-forecasting #large-language-models #deep-learning

Yutong-Zhou-cv/Awesome-Text-to-Image

A curated list of resources on text-to-image generation and synthesis, useful for AI-focused developers.

2.4K
Active
Computer Vision
Tutorials & Courses
React
#text-to-image #computer-vision #generative-adversarial-networks

kyegomez/BitNet

Implementation of a 1-bit Transformer model for large language models in PyTorch.

1.9K
Active
Python
LLM Frameworks
API Frameworks
PyTorch
#artificial-intelligence #deep-learning #transformers

AlibabaResearch/AdvancedLiterateMachinery

An innovative AI-powered document understanding and OCR platform from Alibaba Research.

1.8K
Experimental
C++
Computer Vision
Document Intelligence
#ocr #document-recognition #document-understanding

DWCTOD/CVPR2024-Papers-with-Code-Demo

A curated collection of the latest CVPR (Computer Vision and Pattern Recognition) papers, code, and demos.

1.4K
Archived
Computer Vision
Tutorials & Courses
#computer-vision #cvpr #tutorials

jrzaurin/pytorch-widedeep

A flexible package for multimodal deep learning to combine tabular, text, and image data using Wide and Deep models in PyTorch.

1.4K
Stable
Python
LLM Frameworks
API Frameworks
PyTorch
#deep-learning #multimodal #tabular-data

yuewang-cuhk/awesome-vision-language-pretraining-papers

A curated collection of recent advances in vision-language pretrained models (VL-PTMs) for AI and multimodal applications.

1.2K
Archived
Computer Vision
LLM Frameworks
#vision-language #multimodal #pretrained-models

TheShadow29/awesome-grounding

A curated list of research papers on visual grounding, a key technique for multimodal AI.

1.1K
Stable
Computer Vision
Language Grounding
#computer-vision #language-grounding #multimodal-ai
