Skip to content
Go back

Open WebUI Tools, Functions & Pipelines: Extend Your Local LLM

By SumGuy 11 min read
Open WebUI Tools, Functions & Pipelines: Extend Your Local LLM

You Installed Open WebUI. Now What?

So you’ve got Ollama running, you’ve got Open WebUI in front of it, and you’ve been having conversations with Llama 3 like a normal person. That’s great. But you’ve also probably noticed there are menu items labeled “Tools,” “Functions,” and something called “Pipelines” — and if you clicked them, you found a Python editor, a confusing description, and maybe a vague sense that there’s a whole other level to this thing.

There is. And it’s actually useful, not just impressive-looking.

Open WebUI’s extension system is one of the most underrated parts of the project. The problem is that “Tools,” “Functions,” and “Pipelines” sound interchangeable until you need one of them, at which point the distinctions matter a lot. This post is the map you needed before you started clicking around.

Let’s break them all down, build a working example of each, and end with a decision rule you can actually use.


The Mental Model

Before code, here’s the 30-second version:

They look similar in the UI because they’re all Python. They solve wildly different problems.


Tools: Giving the Model a Screwdriver

Tools are what people mean when they say “function calling” or “tool use.” You write a Python class with methods decorated to describe what they do, the model sees those descriptions in its system context, and when the model decides it needs to use one, Open WebUI runs the code and feeds the result back into the conversation.

The model has to support tool use — Llama 3.1+, Mistral, Qwen 2.5, most modern models do. Older 7B models often don’t.

A Weather Tool

Here’s a minimal tool that fetches current weather from wttr.in (no API key required, because we’re not masochists):

weather_tool.py
"""
title: Weather Lookup
author: sumguy
description: Fetches current weather for a given city using wttr.in
version: 0.1.0
required_open_webui_version: 0.3.0
"""
import requests
from pydantic import BaseModel, Field
class Tools:
class Valves(BaseModel):
"""Optional config exposed in the UI."""
units: str = Field(
default="metric",
description="Temperature units: metric or imperial"
)
def __init__(self):
self.valves = self.Valves()
def get_weather(self, city: str) -> str:
"""
Get the current weather for a city.
:param city: Name of the city to look up weather for
:return: Weather summary as plain text
"""
unit_flag = "m" if self.valves.units == "metric" else "u"
url = f"https://wttr.in/{city}?format=3&{unit_flag}"
try:
resp = requests.get(url, timeout=5)
resp.raise_for_status()
return resp.text.strip()
except requests.RequestException as e:
return f"Weather lookup failed: {e}"

Paste this into Open WebUI → Workspace → Tools → Create Tool. Enable it on a model. Then ask “what’s the weather in Berlin?” and watch the model call get_weather("Berlin") and use the result.

A few things to notice:

Tools are sandboxed-ish — they run in the Open WebUI process with whatever network access the server has. We’ll come back to why that matters.


Functions: Hooking the Chat Lifecycle

Functions are different. The model never “decides” to call a Function — Functions run automatically as messages flow through the system. Think of them as middleware.

There are three types:

TypeWhen it runsWhat it’s for
FilterBefore and after every messageTransform/inspect/block content
PipeInstead of a model callCustom model sources, routing
ActionOn user click (button in UI)Post-processing, export, triggers

Filters are the most common and most useful for self-hosters. Let’s build one.

A PII Redaction Filter

Your prompts go to a local model, but maybe you’re logging them, or you’ve got multiple users on your instance and you don’t want someone accidentally pasting their credentials and having them end up in conversation history. This filter strips email addresses from outgoing prompts:

pii_filter.py
"""
title: Email Redaction Filter
author: sumguy
description: Strips email addresses from user messages before they reach the model
version: 0.1.0
required_open_webui_version: 0.3.0
"""
import re
from pydantic import BaseModel
class Filter:
class Valves(BaseModel):
enabled: bool = True
replacement: str = "[REDACTED_EMAIL]"
def __init__(self):
self.valves = self.Valves()
def inlet(self, body: dict, user: dict | None = None) -> dict:
"""
Runs before the message reaches the model.
Strips email addresses from user message content.
"""
if not self.valves.enabled:
return body
email_pattern = r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}"
messages = body.get("messages", [])
for msg in messages:
if msg.get("role") == "user" and isinstance(msg.get("content"), str):
msg["content"] = re.sub(
email_pattern,
self.valves.replacement,
msg["content"]
)
body["messages"] = messages
return body
def outlet(self, body: dict, user: dict | None = None) -> dict:
"""
Runs after the model responds.
Pass-through here — we only care about inlet.
"""
return body

The inlet method sees the request before the model does. The outlet method sees the response before it hits the UI. You can use both, either, or neither — just implement what you need.

Go to Workspace → Functions → Create Function, paste it in, then enable it globally or per-model in the admin settings.

Pipes (the other Function type) let you present arbitrary backends as model options in the dropdown. If you want to route certain conversations to a remote API, a different local model via a custom URL, or a completely different inference backend — a Pipe is how you do it. They’re more complex, but the pattern is the same: a pipe() method that receives the messages and returns a string or a generator for streaming.


Pipelines: The Separate Service

Pipelines is a whole different animal. It’s a standalone Python FastAPI service — you run it separately, point Open WebUI at it as an OpenAI-compatible endpoint, and it shows up as a model in your model dropdown.

Why bother? A few reasons:

Here’s the Docker Compose setup to get it running alongside your existing stack:

docker-compose.yml
services:
open-webui:
image: ghcr.io/open-webui/open-webui:main
ports:
- "3000:8080"
environment:
- OLLAMA_BASE_URL=http://ollama:11434
- OPENAI_API_BASE_URLS=http://pipelines:9099
- OPENAI_API_KEYS=your-pipelines-key
volumes:
- open-webui:/app/backend/data
depends_on:
- ollama
- pipelines
ollama:
image: ollama/ollama:latest
volumes:
- ollama:/root/.ollama
pipelines:
image: ghcr.io/open-webui/pipelines:main
ports:
- "9099:9099"
environment:
- PIPELINES_API_KEY=your-pipelines-key
volumes:
- pipelines:/app/pipelines
volumes:
open-webui:
ollama:
pipelines:

A Multi-Doc RAG Pipeline Skeleton

This isn’t a full RAG implementation (that deserves its own post), but here’s the skeleton that shows how a Pipeline is structured:

rag_pipeline.py
"""
title: Simple RAG Pipeline
author: sumguy
description: Retrieves relevant context from a document store before answering
version: 0.1.0
"""
from typing import Generator, Iterator, Union
from pydantic import BaseModel
class Pipeline:
class Valves(BaseModel):
# Config exposed in the WebUI admin panel
collection_name: str = "my_docs"
top_k: int = 3
ollama_base_url: str = "http://ollama:11434"
ollama_model: str = "llama3.1:8b"
def __init__(self):
self.valves = self.Valves()
# Initialize your vector store client here
# self.chroma = chromadb.HttpClient(host="chroma", port=8000)
async def on_startup(self):
"""Called when the pipeline service starts."""
print(f"RAG Pipeline started, collection: {self.valves.collection_name}")
async def on_shutdown(self):
"""Called on shutdown."""
pass
def pipe(
self,
user_message: str,
model_id: str,
messages: list[dict],
body: dict
) -> Union[str, Generator, Iterator]:
"""
Main handler. Receives the user's message, retrieves context,
then calls the LLM with augmented prompt.
"""
# Step 1: retrieve relevant chunks from vector store
context_chunks = self._retrieve(user_message)
# Step 2: build augmented prompt
context_str = "\n\n".join(context_chunks)
augmented_prompt = (
f"Answer based on this context:\n\n{context_str}\n\n"
f"Question: {user_message}"
)
# Step 3: call your LLM (Ollama, OpenAI, whatever)
# Here you'd use requests or the ollama client lib
# and return a string or yield chunks for streaming
return f"[RAG would answer here using context from {len(context_chunks)} chunks]"
def _retrieve(self, query: str) -> list[str]:
"""
Pull relevant document chunks from the vector store.
Replace this with your actual retrieval logic.
"""
# Example: return self.chroma.query(...)
return [f"Placeholder chunk for query: {query}"]

Drop this in the pipelines/ volume directory, restart the pipeline service, and it shows up as a model option in Open WebUI. Full RAG setup means wiring in ChromaDB or Qdrant, an embedding model, and document ingestion — but the Pipeline wrapper here stays exactly this shape.


The Security Warning You Skipped

Here it is, and I’m going to say it clearly: Tools and Functions run arbitrary Python on your server with the network access and file permissions of the Open WebUI process.

If you install a Tool from the community hub without reading it, you’re running random code from the internet on your home server. That hub is great — there are hundreds of useful Tools for web search, calendar integration, home automation — but treat it like you’d treat a random GitHub repo. Read the code. It’s Python, it’s short, you can do this.

A few things to be especially paranoid about:

Run Open WebUI as a non-root user with minimal filesystem access. Consider network policies if you’re running this on a machine with other sensitive services. The tool call happens server-side, not in a browser sandbox.

This isn’t a reason to avoid the extension system — it’s a reason to not mindlessly paste code from the community hub into production.


The Decision Rule

You’re staring at the UI wondering which extension point to use. Here’s the flowchart in plain English:

Use a Tool when: you want the model to optionally call something — APIs, calculations, lookups — and the model should decide when that’s appropriate. Weather, web search, calendar queries, code execution.

Use a Filter Function when: you need to transform every message automatically, without the model choosing. PII scrubbing, prompt injection, content moderation, logging, response post-processing. The user and model don’t need to know it’s happening.

Use a Pipe Function when: you want to present a custom backend as a model in the dropdown. Routing logic, A/B testing between models, wrapping a custom API as a “model.”

Use a Pipeline (the separate service) when: your use case is heavy, stateful, or has dependencies you don’t want inside the WebUI container. RAG with a real document store, agent loops with tool orchestration, multi-model chaining, anything that needs its own scaling story.

When in doubt: start with a Tool. They’re the simplest, they’re scoped to the model’s decision-making, and they’re easy to test by just asking the model to use them.


Where to Find More

The Open WebUI community hub has hundreds of Tools and Functions. Filter by stars, read the code, and remember the security note above before you click Install.

The official docs at docs.openwebui.com have the full Valves reference, streaming patterns for Pipes, and the Pipeline API spec. They’re actually pretty good once you know which section to look in.

Your local LLM setup is already more capable than most people running cloud-hosted chat. The extension system is what turns “a ChatGPT clone pointed at Ollama” into something genuinely tailored to your workflow — whether that’s a home automation assistant that controls your lights, a research tool that searches your local document library, or just a filter that stops you from accidentally asking your AI about your AWS credentials.

Start with a weather Tool. You’ll be writing RAG pipelines by next weekend. Your 2 AM self will appreciate having read this first.


Share this post on:

Send a Webmention

Written about this post on your own site? Send a webmention and it'll show up above once verified.


Previous Post
Immich Hardware Acceleration: Stop Cooking Your CPU
Next Post
Coolify vs Dokploy: Self-Hosted Vercel for People Who Don't Trust Vercel

Discussion

Powered by Garrul . Sign in with GitHub or Google, or post anonymously.

Related Posts