Llama

Meta's open-weights family — the default foundation of the self-hosted AI world, from data centres down to phones.

Made by: Meta
First release: 2023
Interface: Self-hosted · every cloud
Known for: The open-weights ecosystem standard

In plain English

What it is, and why we use it.

Llama is the model family that made open weights mainstream. Meta publishes the weights under its own Llama community licence, so teams can download, run, fine-tune and ship them on their own hardware, subject to that licence's terms. That is why much of the on-device and private-cloud AI you encounter has a Llama somewhere in its ancestry. The tooling ecosystem around it (llama.cpp, Ollama, vLLM) is among the deepest in open AI.

Llama is our on-device and private-deployment workhorse: when a flow needs AI but data can't leave the device, a small Llama variant runs locally so nothing is sent off the phone. When a client needs AI behind their firewall with full control of the weights, Llama is usually the starting point.
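To make "nothing leaves the machine" concrete, here is a minimal sketch of a local call. It assumes an Ollama server is already running on its default port on the same machine and that a small Llama model has been pulled. The model tag `llama3.2:1b` and the helper names are illustrative choices, not a description of our production setup. On a phone, an embedded runtime would play the role the local server plays here.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port, so the request
# never leaves this machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.2:1b") -> urllib.request.Request:
    """Build a non-streaming generate request aimed at the local server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_locally(prompt: str, model: str = "llama3.2:1b") -> str:
    """Send the prompt to the local model and return its text reply."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate_locally("Summarise this note in one sentence: ...")` returns the model's reply as a string, provided the server is up and the model is available. The only network hop is to `localhost`.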

Key differences

Llama vs DeepSeek vs ChatGPT / GPT.

Llama against the other open-weights heavyweight and the closed frontier — the self-hosting decision in one table.

Dimension | Llama | DeepSeek | ChatGPT / GPT
Openness | Open weights under Meta's Llama community licence (commercial use allowed, with conditions) | Open weights | Closed API
Ecosystem & tooling | Deepest: llama.cpp, Ollama, vLLM, fine-tune recipes | Growing fast | Largest closed ecosystem
On-device | Best small-model lineup for phones and laptops | Mostly server-class | Cloud only
Peak reasoning | Strong, behind the frontier peak | Frontier-class R-series | Frontier
Cost | Free weights; you pay for hardware | Free weights / very cheap API | Premium API
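
The "you pay for hardware" and on-device rows come down to simple arithmetic: the weights need roughly parameter count times bytes per weight. The sketch below is a back-of-envelope estimate only. The sizes and bit widths are illustrative, and it ignores the KV cache, activations and runtime overhead, so real footprints are somewhat larger.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Illustrative sizes and precisions, not a statement about any specific release.
for params in (1, 8, 70):
    for bits in (16, 4):
        print(f"{params}B @ {bits}-bit: ~{weight_memory_gb(params, bits):.1f} GB")
```

By this estimate a 1B-parameter model at 4 bits needs about 0.5 GB for its weights, which is why small quantised variants are plausible on a phone, while a 70B model at 16 bits needs about 140 GB and belongs on server hardware.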

Llama wins when

  • AI must run on-device or fully offline
  • You want the richest fine-tuning and serving ecosystem
  • Licence clarity matters for a commercial product

DeepSeek wins when

  • You need maximum reasoning per dollar
  • Server-side volume work dominates
  • You're comfortable with newer tooling

ChatGPT / GPT wins when

  • Managed service beats infrastructure control
  • Multimodal breadth is required
  • Time-to-market is everything

Our take

Llama is how AI ships when the data can't leave the building — or the device. It's the quiet foundation of our privacy-first work, run and tuned by engineers who understand exactly what's in the weights.

Thinking about Llama?

Tell us what you're building — we'll tell you honestly whether Llama is the right tool for it.