Explore Projects

Discover 6 open-source projects

Search filter: flash-attention

Dao-AILab/flash-attention

Optimized attention mechanism for deep learning

22.5K
Active
Python
Inference
Computer Vision
PyTorch
#flash-attention #deep-learning #pytorch

QwenLM/Qwen

Qwen is a large language model series by Alibaba Cloud with multiple variants and capabilities.

20.6K
Active
Python
LLM Frameworks
Inference
Hugging Face
#large-language-model #alibaba-cloud #qwen

xlite-dev/LeetCUDA

LeetCUDA is a comprehensive collection of modern CUDA learning resources, including 200+ CUDA kernels, Tensor Cores, HGEMM, and FA-2 MMA.

9.8K
Active
Cuda
ML Ops
PyTorch
#cuda #cuda-toolkit #cuda-demo

ymcui/Chinese-LLaMA-Alpaca-2

An open-source Chinese version of the LLaMA and Alpaca language models with 64K long context support for advanced NLP applications.

7.2K
Experimental
Python
LLM Frameworks
Fine-tuning
Python
#llm #alpaca #llama

InternLM/InternLM

Official release of the InternLM series of large language models focused on building AI tools and chatbots.

7.2K
Stable
Python
LLM Frameworks
Fine-tuning
Python
#chatbot #llm #fine-tuning

xlite-dev/Awesome-LLM-Inference

A curated list of awesome papers and code for optimizing LLM/VLM inference performance.

5.0K
Active
Python
LLM Frameworks
LLM Wrappers & SDKs
#llm #inference #optimization
