Tag: ollama
All articles tagged "ollama".
-
LLM Backends: vLLM vs llama.cpp vs Ollama
vLLM, llama.cpp, and Ollama all run local LLMs — compare throughput, memory use, GPU support, and which fits your hardware.
-
Key Parameters of Large Language Models
Temperature, top-p, top-k, context length — LLM inference parameters explained so you stop guessing why the model gives weird output.
-
Large Language Model Formats and Quantization
GGUF, GGML, AWQ, GPTQ — LLM file formats and quantization levels explained: trade-offs between model quality, size, and inference speed.
-
Exploring the Diverse World of LLM Models
LLaMA, Mistral, Falcon, GPT — the LLM landscape is crowded. Compare model families, sizes, licensing, and what each is actually good for.
-
Ollama: Powerful Language Models on Your Own Machine
Ollama makes running local LLMs dead simple — pull a model, start the server, and get a private ChatGPT running on your own hardware.