llama.cpp

Georgi Gerganov's CPU-and-Metal LLM inference engine in C/C++. Powers Ollama, LM Studio, and many other 'local LLM' apps.

From Wikipedia

llama.cpp is an open-source software library that performs inference on various large language models such as Llama. It is co-developed alongside the GGML project, a general-purpose tensor library.