Model framework for state-of-the-art ML models in text, vision, audio, and multimodal tasks.
Multimodal AI agent stack for GUI and browser automation
High-performance serving framework for large language and multimodal models
A collection of tutorials and notebooks on state-of-the-art computer vision models and techniques for developers.
A comprehensive collection of resources for anomaly detection, including books, papers, videos, and toolboxes.
Easily fine-tune, evaluate and deploy open-source large language models like GPT-OSS and Llama.
Dolphin is a document image parsing library that uses heterogeneous anchor prompting for OCR and layout analysis.
A comprehensive SDK for running frontier LLMs and VLMs on multiple hardware platforms with day-0 model support.
ERNIE is an industrial-grade development toolkit based on PaddlePaddle for building AI-powered applications.
A Python-based library for solving visual understanding tasks using reinforced vision-language models (VLMs).
A low-code MCP framework for building complex and innovative RAG pipelines with AI tools.
An advanced multimodal AI model series for vision-language reasoning, developed by Skywork AI.
An AI-powered file management tool that organizes local files while ensuring privacy.
A benchmark for multimodal AI agents to tackle open-ended tasks in real computer environments.
A streamlined framework for efficient evaluation and performance benchmarking of large models like LLMs and VLMs.
A framework for building AI agents with strong reasoning abilities, self-improvement, and skill curation in a general computing environment.
A scalable multimodal reasoning framework for AI-powered applications with a focus on video and image understanding.
A Go-based tool for processing images, with features such as color palette extraction, OCR, and upscaling.
A tutorial on using large language models (LLMs) and vision-language models (VLMs) for various AI-powered coding tasks.
A curated reading list for security, safety, and privacy of large language models (LLMs) and AI systems.