2024-08-04 10:43:23 +00:00
|
|
|
# OpenWebUI Filters and Tools
|
2024-07-18 20:13:00 +00:00
|
|
|
|
2024-07-18 20:18:27 +00:00
|
|
|
_Mirrored at Github: https://github.com/ProjectMoon/open-webui-filters_
|
|
|
|
|
2024-08-07 12:14:02 +00:00
|
|
|
Documentation (HTML):
|
|
|
|
[https://agnos.is/projects/open-webui-filters/][docs-html]
|
|
|
|
|
|
|
|
Documentation (Gemini):
|
|
|
|
[gemini://agnos.is/projects/open-webui-filters/][docs-gemini]
|
|
|
|
|
2024-08-04 10:43:23 +00:00
|
|
|
My collection of OpenWebUI Filters and Tools.
|
2024-07-18 20:13:00 +00:00
|
|
|
|
|
|
|
So far:
|
|
|
|
|
2024-07-24 22:55:58 +00:00
|
|
|
- **Checkpoint Summarization Filter:** A work-in-progress replacement
|
|
|
|
for the narrative memory filter for more generalized use cases.
|
2024-07-18 20:13:00 +00:00
|
|
|
- **GPU Scaling Filter:** Reduce number of GPU layers in use if Ollama
|
2024-10-07 20:12:20 +00:00
|
|
|
crashes due to running out of VRAM. Deprecated.
|
2024-07-22 18:55:29 +00:00
|
|
|
- **Output Sanitization Filter:** Remove words, phrases, or
|
|
|
|
characters from the start of model replies.
|
2024-08-04 10:43:23 +00:00
|
|
|
- **OpenStreetMap Tool:** Tool for querying OpenStreetMap to look up
|
|
|
|
address details and nearby points of interest.
|
2024-10-07 20:12:20 +00:00
|
|
|
- **Collapsible Thought Filter:** Hide LLM reasoning/thinking in a
|
|
|
|
collapisble block.
|
2024-07-18 20:13:00 +00:00
|
|
|
|
2024-09-27 21:10:42 +00:00
|
|
|
## Checkpoint Summarization Filter
|
2024-07-24 22:55:58 +00:00
|
|
|
|
|
|
|
A new filter for managing context use by summarizing previous parts of
|
|
|
|
the chat as the conversation continues. Designed for both general
|
|
|
|
chats and narrative/roleplay use. Work in progress.
|
|
|
|
|
|
|
|
### Configuration
|
|
|
|
|
|
|
|
There are currently 4 settings:
|
|
|
|
|
|
|
|
- **Summarizer Model:** The model used to summarize the conversation
|
|
|
|
as the chat continues. This must be a base model.
|
|
|
|
- **Large Context Summarizer Model:** If large context summarization
|
|
|
|
is turned on, use this model for summarizing huge contexts.
|
|
|
|
- **Summarize Large Contexts:** If enabled, the filter will attempt
|
|
|
|
to load the entire context into the large summarizer model for
|
|
|
|
creating an initial checkpoint of an existing conversation.
|
|
|
|
- **Wiggle Room:** This is the amount of 'wiggle room' for estimating
|
|
|
|
a context shift. This number is subtracted from `num_ctx` for the
|
|
|
|
purposes of determining whether or not a context shift has
|
|
|
|
occurred.
|
|
|
|
|
|
|
|
### Usage
|
|
|
|
|
|
|
|
In general, you should only need to specify the summarizer model and
|
|
|
|
enable the filter on the OpenWebUI models that you want it to work on.
|
|
|
|
Or even enable it globally. The filter works best when used from a new
|
|
|
|
conversation, but it does have the (currently limited) ability to deal
|
|
|
|
with existing conversations.
|
|
|
|
|
|
|
|
- When the filter detects a context shift in the conversation, it
|
|
|
|
will summarize the pre-existing context. After that, the summary
|
|
|
|
is appended to the system prompt, and old messages before
|
|
|
|
summarization are dropped.
|
|
|
|
- When the filter detects the next context shift, this process is
|
|
|
|
repeated, and a new summarization checkpoint is created. And so
|
|
|
|
on.
|
|
|
|
|
|
|
|
If the filter is used in an existing conversation, it will summarize
|
|
|
|
on the first time that it detects a context shift in the conversation:
|
|
|
|
|
|
|
|
- If there are enough messages that the conversation is considered
|
|
|
|
"big," and large context summarization is **disabled**, all but the
|
|
|
|
last 4 messages will be dropped to form the summary.
|
|
|
|
- If the conversation is considered "big," and large context
|
|
|
|
summarization is **enabled**, then the large context model will be
|
|
|
|
loaded to do the summarization, and the **entire conversation**
|
|
|
|
will be given to it.
|
|
|
|
|
|
|
|
#### User Commands
|
|
|
|
|
|
|
|
There are some basic commands the user can use to interact with the
|
|
|
|
filter in a conversation:
|
|
|
|
|
|
|
|
- `!nuke`: Deletes all summary checkpoints in the chat, and the
|
|
|
|
filter will attempt to summarize from scratch the next time it
|
|
|
|
detects a context shift.
|
|
|
|
|
|
|
|
### Limitations
|
|
|
|
|
|
|
|
There are some limitations to be aware of:
|
|
|
|
- If you enable large context summarization, you need to make sure
|
|
|
|
your system is capable of loading and summarizing an entire
|
|
|
|
conversation.
|
|
|
|
- Handling of branching conversations and regenerated responses is
|
|
|
|
currently rather messy. It will kind of work. There are some plans
|
|
|
|
to improve this.
|
|
|
|
- If large context summarization is disabled, pre-existing large
|
|
|
|
conversations will only summarize the previous 4 messages when the
|
|
|
|
first summarization is detected.
|
|
|
|
- The filter only loads the most recent summary, and thus the AI
|
|
|
|
might "forget" much older information.
|
|
|
|
|
2024-07-18 20:13:00 +00:00
|
|
|
## GPU Scaling Filter
|
|
|
|
|
2024-10-07 20:12:20 +00:00
|
|
|
_Deprecated. Use the setting in OpenWebUI chat controls._
|
|
|
|
|
2024-07-18 20:13:00 +00:00
|
|
|
This is a simple filter that reduces the number of GPU layers in use
|
|
|
|
by Ollama when it detects that Ollama has crashed (via empty response
|
|
|
|
coming in to OpenWebUI). Right now, the logic is very basic, just
|
|
|
|
using static numbers to reduce GPU layer counts. It doesn't take into
|
|
|
|
account the number of layers in models or dynamically monitor VRAM
|
|
|
|
use.
|
|
|
|
|
|
|
|
There are three settings:
|
2024-07-22 18:55:29 +00:00
|
|
|
|
2024-07-18 20:13:00 +00:00
|
|
|
- **Initial Reduction:** Number of layers to immediately set when an
|
|
|
|
Ollama crash is detected. Defaults to 20.
|
|
|
|
- **Scaling Step:** Number of layers to reduce by on subsequent crashes
|
|
|
|
(down to a minimum of 0, i.e. 100% CPU inference). Defaults to 5.
|
|
|
|
- **Show Status:** Whether or not to inform the user that the
|
|
|
|
conversation is running slower due to GPU layer downscaling.
|
|
|
|
|
2024-07-22 18:55:29 +00:00
|
|
|
## Output Sanitization Filter
|
|
|
|
|
|
|
|
This filter is intended for models that often output unwanted
|
|
|
|
characters or terms at the beginning of replies. I have noticed this
|
|
|
|
especially with Beyonder V3 and related models. They sometimes output
|
|
|
|
a `":"` or `"Name:"` in front of replies. For example, if system prompt is
|
|
|
|
`"You are Quinn, a helpful assistant."` the model will often reply with
|
|
|
|
`"Quinn:"` as its first word.
|
|
|
|
|
|
|
|
There is one setting:
|
|
|
|
|
|
|
|
- **Terms:** List of terms or characters to remove. This is a list,
|
|
|
|
and in the UI, each item should be separated by a comma.
|
|
|
|
|
|
|
|
For the above example, the setting textbox should have `:,Quinn:` in
|
|
|
|
it, to remove a single colon from the start of replies, and `Quinn:`
|
|
|
|
from the start of replies.
|
|
|
|
|
|
|
|
### Other Notes
|
|
|
|
|
|
|
|
Terms are removed in the order defined by the setting. The filter
|
|
|
|
loops through each term and attempts to remove it from the start of
|
|
|
|
the LLM's reply.
|
|
|
|
|
2024-08-04 10:43:23 +00:00
|
|
|
## OpenStreetMap Tool
|
|
|
|
|
2024-09-27 21:23:56 +00:00
|
|
|
_Recommended models: Llama 3.1+, Mistral Nemo, Mistral Small, Qwen 2.5._
|
2024-08-05 07:48:44 +00:00
|
|
|
|
2024-08-04 10:43:23 +00:00
|
|
|
A tool that can find certain points of interest (POIs) nearby a
|
|
|
|
requested address or place.
|
|
|
|
|
2024-09-27 21:23:56 +00:00
|
|
|
These are the current settings:
|
2024-08-04 10:43:23 +00:00
|
|
|
- **User Agent:** The custom user agent to set for OSM and Overpass
|
|
|
|
Turbo API requests.
|
|
|
|
- **From Header:** The email address for the From header for OSM and
|
|
|
|
Overpass API requests.
|
|
|
|
- **Nominatim API URL:** URL of the API endpoint for Nominatim, the
|
|
|
|
reverse geocoding (address lookup) service. Defaults to the public
|
2024-09-30 16:01:25 +00:00
|
|
|
instance. This must be the root URL, for example:
|
|
|
|
`https://nominatim.openstreetmap.org/`.
|
2024-08-04 10:43:23 +00:00
|
|
|
- **Overpass Turbo API URL:** URL of the API endpoint for Overpass
|
|
|
|
Turbo, for searching OpenStreetMap. Defaults to the public
|
|
|
|
endpoint.
|
2024-09-17 20:10:04 +00:00
|
|
|
- **Instruction Oriented Interpretation:** Controls the level of
|
|
|
|
detail in the instructions for interpreting results given to the
|
|
|
|
LLM. By default, it gives detailed instructions. Turn this setting
|
|
|
|
off if results are inconsistent, wrong, or missing.
|
2024-09-23 20:12:17 +00:00
|
|
|
- **Status Indicators:** If enabled, emit update events to the web
|
|
|
|
UI, showing what the tool is doing and what search results it has
|
|
|
|
found, or if it has encountered an error.
|
2024-09-27 21:23:56 +00:00
|
|
|
- **ORS API Key:** Provide an API key for Open Route Service to
|
|
|
|
calculate navigational routes to nearby places, to provide more
|
|
|
|
accurate search results.
|
|
|
|
- **ORS Instance:** By default, use the public Open Route Service
|
|
|
|
instance. Can be changed to point to another ORS instance.
|
2024-08-04 10:43:23 +00:00
|
|
|
|
|
|
|
The tool **will not run** without the User Agent and From headers set.
|
|
|
|
This is because the public instance of the Nominatim API will block
|
|
|
|
you if you do not set these. Use of the public Nominatim instance is
|
|
|
|
governed by their [terms of use][nom-tou].
|
|
|
|
|
2024-08-05 07:48:44 +00:00
|
|
|
The default API services are suitable for applications with a low
|
|
|
|
volume of traffic (absolute max 1 API call per second). If you are
|
|
|
|
running a production service, you should set up your own Nominatim and
|
|
|
|
Overpass services with caching.
|
|
|
|
|
2024-10-07 20:12:20 +00:00
|
|
|
### How to enable 'Where is the closest X to my location?'
|
2024-09-23 20:12:17 +00:00
|
|
|
|
|
|
|
In order to have the OSM tool be able to answer questions like "where
|
|
|
|
is the nearest grocery store to me?", it needs access to your realtime
|
|
|
|
location. This can be accomplished with the following steps:
|
|
|
|
- Enable user location in your user settings.
|
|
|
|
- Create a model with a _system prompt_ that references the variable
|
|
|
|
`{{USER_LOCATION}}`.
|
|
|
|
- OpenWebUI will automatically substitute the GPS coordinates
|
|
|
|
reported by the browser into the model's system prompt on every
|
|
|
|
message.
|
|
|
|
|
2024-10-07 20:12:20 +00:00
|
|
|
# Collapsible Thought Filter
|
|
|
|
|
|
|
|
Hides model reasoning/thinking processes in a collapisble block in the
|
|
|
|
UI response, similar to OpenAI o1 replies in ChatGPT. Designed to be
|
|
|
|
used with [Reflection 70b](https://ollama.com/library/reflection) and
|
|
|
|
similar models.
|
|
|
|
|
|
|
|
Current settings:
|
|
|
|
|
|
|
|
- Priority: what order to run this filter in.
|
|
|
|
- Thought Title: the title of the collapsed thought block.
|
|
|
|
- Thought Tag: The XML tag that contains the model's reasoning.
|
|
|
|
- Output Tag: The XML tag that contains the model's final output.
|
|
|
|
- Use Thoughts As Context: Whether or not to send LLM reasoning text
|
|
|
|
as context. Disabled by default because it drastically increases
|
|
|
|
token use.
|
|
|
|
|
|
|
|
**Note on XML tag settings:** These should be tags WITHOUT the `<>`.
|
|
|
|
If you customize the tag setting, make sure your setting is just
|
|
|
|
`mytag` and not `<mytag>` or anything else.
|
|
|
|
|
2024-07-18 20:13:00 +00:00
|
|
|
# License
|
2024-07-15 14:39:54 +00:00
|
|
|
|
2024-07-22 18:55:29 +00:00
|
|
|
<img src="./agplv3.png" alt="AGPLv3" />
|
|
|
|
|
|
|
|
All filters are licensed under [AGPL v3.0+][agpl]. The code is free
|
|
|
|
software, that you can run, redistribute, modify, study, and learn
|
|
|
|
from as you see fit, as long as you extend that same freedom to
|
2024-07-22 19:00:16 +00:00
|
|
|
others, in accordance with the terms of the AGPL. Make sure you are
|
|
|
|
aware how this might affect your OpenWebUI deployment, if you are
|
|
|
|
deploying OpenWebUI in a public environment!
|
2024-07-22 18:55:29 +00:00
|
|
|
|
2024-10-07 20:12:20 +00:00
|
|
|
Some filters may have code in them subject to other licenses. In those
|
|
|
|
cases, the licenses and what parts of the code they apply to are
|
|
|
|
detailed in that specific file.
|
|
|
|
|
2024-07-22 18:55:29 +00:00
|
|
|
[agpl]: https://www.gnu.org/licenses/agpl-3.0.en.html
|
2024-07-24 22:55:58 +00:00
|
|
|
[checkpoint-filter]: #checkpoint-summarization-filter
|
2024-08-04 10:43:23 +00:00
|
|
|
[nom-tou]: https://operations.osmfoundation.org/policies/nominatim/
|
2024-08-07 12:14:02 +00:00
|
|
|
[docs-html]: https://agnos.is/projects/open-webui-filters/
|
|
|
|
[docs-gemini]: gemini://agnos.is/projects/open-webui-filters/
|