Explore Projects

Discover 321 open source projects

Active filters (1):

Search: inference

Showing 61-80 of 321 projects

dusty-nv/jetson-inference

A guide to deploying deep-learning inference networks and computer vision primitives with NVIDIA Jetson hardware and TensorRT.

8.7K

Stable

C++

Computer Vision

Inference

#computer-vision #deep-learning #nvidia

intel/ipex-llm

An accelerator for local LLM inference and fine-tuning on Intel XPUs, with seamless integration into popular LLM frameworks.

8.7K

Active

Python

LLM Frameworks

LLM Wrappers & SDKs

PyTorch

#llm #inference #fine-tuning

OpenBMB/MiniCPM

Ultra-efficient large language models (LLMs) for end devices, enabling fast on-device reasoning and inference.

8.7K

Stable

Jupyter Notebook

LLM Frameworks

CLI Tools

#large-language-models #on-device-inference #cli-tools

bentoml/BentoML

BentoML is an easy-to-use framework for building and deploying production-ready machine learning models as APIs.

8.5K

Active

Python

LLM Frameworks

API Clients & Testing

Python

#ai-inference #llm-inference #llm-serving

OptimalScale/LMFlow

An extensible toolkit for fine-tuning and inference of large language models, enabling 'large models for all'.

8.5K

Active

Python

LLM Frameworks

Fine-tuning

PyTorch

#chatgpt #language-model #transformer

facebookresearch/sam3

Code and tools for running inference with, and fine-tuning, Meta's Segment Anything Model 3 (SAM 3).

8.0K

Active

Python

Inference

Fine-tuning

Python

#computer-vision #image-segmentation #model-serving

py-why/dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions.

8.0K

Active

Python

Causal Inference

Databases

Python3

#causal-inference #causal-models #bayesian-networks

wang-xinyu/tensorrtx

TensorRT implementations of popular deep learning networks for efficient inference on GPUs.

7.7K

Active

C++

Computer Vision

API Frameworks

#computer-vision #tensorrt #deep-learning

google-deepmind/alphafold3

AlphaFold 3 is a Python-based inference pipeline for protein structure prediction using deep learning.

7.7K

Active

Python

Inference

Computer Vision

#protein-structure-prediction #deep-learning #computer-vision

InternLM/lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving large language models (LLMs).

7.7K

Active

Python

LLM Frameworks

Inference

Python

#llm #inference #deployment

LMCache/LMCache

A key-value (KV) cache layer for large language models (LLMs), designed to speed up inference.

7.5K

Active

Python

LLM Wrappers & SDKs

Caching

PyTorch

#llm #inference #cache

bigcode-project/starcoder

Fine-tuning and inference code for StarCoder, a large language model for code.

7.5K

Archived

Python

LLM Frameworks

Fine-tuning

Python

#llm #fine-tuning #inference

Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB

A lightweight face detection model optimized for inference on edge devices.

7.5K

Archived

Python

React

#face-detection #inference #lightweight

PaddlePaddle/Paddle-Lite

Paddle Lite is a high-performance deep learning inference engine from PaddlePaddle for mobile and edge devices.

7.2K

Experimental

C++

Inference

Mobile

#deep-learning #mobile-deep-learning #embedded

gcanti/io-ts

A runtime type system for IO decoding/encoding in TypeScript, providing a flexible and type-safe way to work with data.

6.8K

Archived

TypeScript

API Clients & Testing

Backend Frameworks

TypeScript

#types #inference #runtime

google/gemma.cpp

A lightweight, standalone C++ inference engine for Google's Gemma AI models.

6.7K

Active

C++

Inference

API Frameworks

#c++ #inference-engine #google-models

EricLBuehler/mistral.rs

A Rust library for blazingly fast LLM inference, useful for AI coding and ML applications.

6.7K

Active

Rust

LLM Frameworks

AI Code Generation

Rust

#llm #inference #rust

microsoft/nlp-recipes

A comprehensive collection of best practices and examples for natural language processing (NLP) using Python.

6.4K

Archived

Python

LLM Frameworks

API Frameworks

Python

#natural-language-processing #deep-learning #machine-learning

allenai/OLMo

Modeling, training, evaluation, and inference code for OLMo, a large language model.

6.4K

Stable

Python

LLM Frameworks

Python

#language-model #llm #ai

ai-dynamo/dynamo

A distributed inference serving framework for AI applications, built with Rust for high performance and scalability.

6.2K

Active

Rust

Inference

API Frameworks

#machine-learning #inference-serving #distributed-systems
