dandelin/ViLT

A Vision-and-Language Transformer model for multimodal tasks without the need for convolution or region supervision.

Python
AI & Machine Learning
Computer Vision
Apache-2.0

Stars: 1.5K
Forks: 230
Created: Mar 25, 2021
Last Updated: Apr 3, 2024

Project Analytics

Stars Growth (1 Month): +7 (+0.5% change)
Avg Daily Growth (1 Month): +0.3 stars per day
Fork/Star Ratio (All Time): 15.0% (Good engagement)
Lifetime Growth: 0.8 stars/day over 1.8K days

[Charts: Stars Over Time, Forks Over Time, Open Issues Over Time, Pull Requests Over Time, Commits Over Time]

AI-Generated Tags

vision-language
multimodal
transformer
machine-learning
computer-vision
open-source
