
llama.cpp

LLM inference library in C/C++ with CPU, Metal, CUDA, and OpenVINO support

About

llama.cpp provides an efficient C/C++ library for running large language models (LLMs) on local hardware. It supports a range of backends, including CPU, Metal, CUDA, and OpenVINO, for optimized inference across different devices.
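As a sketch of typical usage, the library is commonly built with CMake and exercised through the bundled llama-cli tool. The model path below is a placeholder, and the backend flags shown (such as -DGGML_CUDA=ON or -DGGML_METAL=ON) are optional, enabling GPU acceleration where the hardware supports it:

```shell
# Clone the repository and build with CMake.
# Add a backend flag, e.g. -DGGML_CUDA=ON (NVIDIA) or -DGGML_METAL=ON (Apple),
# to enable GPU-accelerated inference where available.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release

# Run inference against a local model in GGUF format
# (./models/model.gguf is a placeholder path).
./build/bin/llama-cli -m ./models/model.gguf -p "Hello, world"
```

The same build produces the library itself, which C/C++ applications can link against directly via the llama.h API.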