Official PyTorch implementation of a novel method for visualizing classifications made by Transformer-based networks.
A collection of Microsoft's work on NAS and Vision Transformer for efficient AI models.
A PyTorch library that provides Vision Transformer (ViT) adapters for dense prediction tasks like object detection and semantic segmentation.
A curated list of recent Transformer-based computer vision papers and implementations.
An ONNX inference engine and model converter for GPT-SoVITS that gives developers voice cloning and text-to-speech.
An emotion-controllable text-to-speech model for vibe coders, built on the VITS framework.
A mobile app that enables developers to create 2D anime-style AI companions using ChatGPT and Live2D.
All-in-one training for vision models with pretraining, fine-tuning, and distillation capabilities.
PaddleViT is a state-of-the-art Visual Transformer and MLP model library for the PaddlePaddle 2.0+ deep learning framework.
A best-practice TTS implementation based on BERT and VITS, with some Natural Speech features from Microsoft; supports ONNX streaming output.
A Tokens-to-Token Vision Transformer (T2T-ViT) model for training Vision Transformers from scratch on ImageNet.
A C++ inference library for various SVC/TTS models, including DiffSinger, DiffSVC, HiFiGAN, and VITS.
A PyTorch implementation of the paper 'All are Worth Words: A ViT Backbone for Diffusion Models'.
A PyTorch library for training and deploying efficient computer vision models using ViT-based architectures.
A library for explaining the decisions made by Vision Transformers, a type of AI model used for computer vision tasks.
A simple API for the VITS text-to-speech model, with additional features for vibe coders.