Meta's open-weights family — the default foundation of the self-hosted AI world, from data centres down to phones.
Llama is the model family that made open weights mainstream. Meta releases the weights publicly, so anyone can run, fine-tune and ship them on their own hardware within the terms of Meta's licence, which is why much of the on-device and private-cloud AI you encounter has a Llama somewhere in its ancestry. The tooling ecosystem around it (llama.cpp, Ollama, vLLM) is the deepest of any open-weights model family.
Llama is our on-device and private-deployment workhorse: when a flow needs AI but data can't leave the device, a small Llama variant runs locally so nothing is sent off the phone. When a client needs AI behind their firewall with full control of the weights, Llama is usually the starting point.
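
To make "runs locally" concrete, here is a minimal sketch of calling a small Llama model through Ollama on the same machine. It assumes an Ollama server on its default local port and a model already pulled under the tag `llama3.2:1b`; both are illustrative choices, as is the loopback guard, which is our own addition rather than an Ollama feature. On a phone, an embedded runtime such as llama.cpp would play the same role.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: an Ollama server is listening on its default local port.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"
# Assumption: a small Llama model has already been pulled under this tag.
MODEL = "llama3.2:1b"

# Hosts that resolve to this machine only.
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def build_request(prompt: str, url: str = OLLAMA_URL, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request, refusing non-loopback hosts."""
    host = urlparse(url).hostname
    if host not in LOOPBACK_HOSTS:
        raise ValueError(f"refusing to send prompt to non-local host: {host}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=60) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Summarise this note in one sentence: ...")` returns the model's text. The guard in `build_request` fails fast if the URL is ever pointed at a remote host, so the prompt stays on the machine.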
Llama against the other open-weights heavyweight and the closed frontier — the self-hosting decision in one table.
| Dimension | Llama | DeepSeek | ChatGPT / GPT |
|---|---|---|---|
| Openness | Open weights under Meta's community licence (commercial use allowed, with conditions) | Open weights | Closed API |
| Ecosystem & tooling | Deepest — llama.cpp, Ollama, vLLM, fine-tune recipes | Growing fast | Largest closed ecosystem |
| On-device | Best small-model lineup for phones and laptops | Mostly server-class | Cloud only |
| Peak reasoning | Strong, behind the frontier peak | Frontier-class R-series | Frontier |
| Cost | Free weights — you pay for hardware | Free weights / very cheap API | Premium API |
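
"You pay for hardware" mostly comes down to memory. As a rough guide, the weights alone need about (parameter count × bits per weight ÷ 8) bytes. The sketch below applies that rule of thumb to three nominal model sizes chosen for illustration. It ignores the KV cache and runtime overhead, and real 4-bit formats store slightly more than 4 bits per weight, so treat the results as lower bounds.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Illustrative nominal sizes; actual parameter counts differ slightly.
for name, params in [("1B", 1.0), ("8B", 8.0), ("70B", 70.0)]:
    fp16 = weight_memory_gb(params, 16)
    q4 = weight_memory_gb(params, 4)
    print(f"{name}: {fp16:.1f} GB at 16-bit, {q4:.1f} GB at 4-bit")
```

By this estimate, a 1B model quantised to 4 bits needs about half a gigabyte for its weights, which is why the small variants suit phones, while a 70B model remains server-class hardware territory.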
Llama is how AI ships when the data can't leave the building — or the device. It's the quiet foundation of our privacy-first work, run and tuned by engineers who understand exactly what's in the weights.
Tell us what you're building — we'll tell you honestly whether Llama is the right tool for it.