Topic
AI & LLMs
Models you can run on your own hardware, prompt patterns that ship, agent frameworks that don't catch fire, and the awkward questions nobody answers in the breathless launch posts. Ollama, vLLM, llama.cpp, LocalAI, plus the quieter stuff — embeddings, RAG, evals, and figuring out when the cloud API is actually the right answer. If you'd rather understand the trade-offs than chase benchmarks, you'll feel at home here.
46 articles in this topic.
Featured posts
- Open WebUI Tools, Functions & Pipelines: Extend Your Local LLM
  Open WebUI Tools, Functions, and Pipelines do different things, and the names don't help. What each one actually does, when to use which, and working code for all three.
  11 min read
- Self-Supervised Learning Explained
  Self-supervised learning is the technique behind GPT, BERT, and modern LLMs. Learn how models teach themselves from unlabeled data.
  7 min read
- Ollama Model Management: Beyond ollama run
  You know how to pull and run a model. Now learn Modelfiles, GPU layer tuning, the REST API, running multiple models without OOM-killing your server, and actually useful system prompts.
  7 min read
- Continue.dev vs Cody vs Tabby: AI Code Help Without the Cloud
  GitHub Copilot is great until you read the ToS. Continue.dev, Cody, and Tabby bring AI code assistance to your editor with local or self-hosted models, so no code leaves your machine.
  6 min read
- LangGraph vs CrewAI vs AutoGen: AI Agents Without the Hype
  LangGraph gives you graph-level control. CrewAI gives your agents job titles. AutoGen makes them have a conversation. Here's which one to reach for when building real AI workflows.
  6 min read
- Qdrant vs Weaviate vs Chroma: Vector DB Showdown
  Every RAG tutorial says 'just use Chroma.' Then you hit production. Here's what Qdrant, Weaviate, and ChromaDB actually offer and when each one earns its place.
  7 min read
All AI & LLMs articles
- Open WebUI Tools, Functions & Pipelines: Extend Your Local LLM
- Self-Supervised Learning Explained
- Ollama Model Management: Beyond ollama run
- Continue.dev vs Cody vs Tabby: AI Code Help Without the Cloud
- LangGraph vs CrewAI vs AutoGen: AI Agents Without the Hype
- Qdrant vs Weaviate vs Chroma: Vector DB Showdown
- LangChain vs LlamaIndex: RAG Framework Showdown
- The Embedding Model Choice Nobody Explains
- GPU Memory Math: Will This Model Actually Fit?
- Beyond RAG: When a Virtual Filesystem Works Better
- Running Gemma 4 Locally with Ollama
- 1-Bit LLMs: The Quantization Endgame
- AMD Lemonade: Local LLM Serving for AMD GPUs
- When to Use Structured Output (JSON Mode) in LLMs
- Using AI to Find Security Bugs in Your Code
- LLM Temperature and top_p Explained Without the Math
- LLM Backends: vLLM vs llama.cpp vs Ollama
- RAG Chunking: Why Chunk Size Is Everything
- LiteLLM & vLLM: One API to Rule All Your Models
- System Prompts: The LLM Feature Most People Ignore
- LLM Quantization: Q4_K_M Isn't Always the Best Choice
- Running Multiple Ollama Models Without Running Out of RAM
- Piper vs Coqui: Text-to-Speech on Your Own Hardware (Because AWS Polly Charges Per Character Like It's 1999 SMS)
- Context Window vs Token Limit: Not the Same Thing
- Ollama Memory Management: Why Models Keep Loading
- RAG on a Budget: Building a Knowledge Base with Ollama & ChromaDB
- Stable Diffusion vs ComfyUI vs Fooocus: AI Image Generation at Home
- n8n + LLM: Building Automations That Actually Think
- Text Generation Web UI vs KoboldCpp: Power User LLM Interfaces
- LangGraph vs CrewAI vs AutoGen: AI Agent Frameworks for Mere Mortals
- Open WebUI vs LibreChat: Self-Hosted ChatGPT Alternatives Compared
- LLM Fine-Tuning for Mortals: LoRA, QLoRA, and Your Gaming GPU
- Ollama Beyond the Basics: Model Management, Custom Models, and Optimization
- Whisper & Faster-Whisper: Self-Hosted Speech-to-Text That Actually Works
- Continue.dev vs Cody vs Tabby: AI Code Assistants That Live on Your Machine
- CUDA vs ROCm vs CPU: Running AI on Whatever GPU You've Got
- Flowise vs Langflow: Build AI Pipelines Without Writing a Novel
- n8n vs Node-RED: Automate Everything Without Learning to Code (Much)
- Key Parameters of Large Language Models
- Prompts for Image Generation in Stable Diffusion
- Prompt Engineering for Generative AI 101
- Large Language Model Formats and Quantization
- Exploring the Diverse World of LLM Models
- Ollama: Powerful Language Models on Your Own Machine
- Unleash the Power of LLMs with LocalAI
- Machine Learning models (AI)