SoulSearch v0.3: Ollama Support, Session Memory, and Web Search — Open Source Only
TL;DR: SoulSearch v0.3 brings Ollama support (run local LLMs), session-specific memory (separate from global Git memory), and Brave Search as a proper tool. These features are open-source only — the Chrome Web Store version stays at v0.2 for stability. Get the new features from GitHub.
Video Demo
Watch SoulSearch v0.3 in action — browsing pages, finding certifications, and using the new Side Panel agent mode:
Watch on Google Drive if the embed doesn’t load.
This is a follow-up to our original SoulSearch announcement. If you haven’t read that yet, start there for the full overview.
What’s New in v0.3?
Version 0.3 adds eight major features, all focused on privacy, flexibility, and local-first AI:
| Feature | What It Does |
|---|---|
| Ollama Support | Run any local LLM — no cloud API required |
| Session Memory | Keep notes per-session, separate from global |
| Memory Strategy | Choose Truncate (fast) or RLM (compress) |
| Brave Search Tool | Agent can search the web when needed |
| Separate Agent Model | Use different models for chat vs agent |
| Separate Agent Provider | Mix providers — e.g., Ollama for chat, Claude for agent |
| Side Panel Agent | Persistent agent UI that stays open during page interactions |
| Tabbed Memory Panel | View Session and Global memory separately |
Why Open Source Only?
The Chrome Web Store version (v0.2) uses cloud LLMs like Claude and GPT-4. It’s stable, tested, and works out of the box.
The open-source version on GitHub is where we experiment. Ollama support, session memory, and the new agent features require more technical setup — running Ollama, configuring CORS, choosing tool-capable models. We want users who grab these features to know what they’re signing up for.
If you want plug-and-play: Install from the Chrome Web Store.
If you want cutting-edge + local: Clone from GitHub and run the feat/ollama-support branch (soon to merge to main).
How Does Ollama Support Work?
SoulSearch now supports Ollama as a provider. Configure it in Settings:
- Provider: Ollama
- Ollama URL: `http://localhost:11434` (default)
- Model: `llama3.2`, `qwen2.5`, or any installed model
No API key needed. Your queries stay entirely on your machine.
Important: Start Ollama with CORS enabled for Chrome extensions:
```shell
OLLAMA_ORIGINS="chrome-extension://*" ollama serve
```
For agent mode (browser automation), you need a tool-capable model. Vision models like `llama3.2-vision` don’t support tool calling. Use `llama3.2` or `qwen2.5` for agent tasks.
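Under the hood, talking to Ollama is just a local HTTP call. Here is a minimal sketch of building a request for Ollama’s documented `/api/chat` endpoint; the helper name `build_chat_request` is illustrative, not SoulSearch’s actual code:

```python
OLLAMA_URL = "http://localhost:11434"  # default Ollama address

def build_chat_request(model: str, messages: list[dict], stream: bool = False) -> dict:
    """Build the URL and JSON body for Ollama's /api/chat endpoint.

    Field names (model, messages, stream) follow Ollama's documented API;
    the helper itself is an illustrative sketch.
    """
    return {
        "url": f"{OLLAMA_URL}/api/chat",
        "body": {
            "model": model,        # e.g. "llama3.2" or "qwen2.5"
            "messages": messages,  # [{"role": "user", "content": "..."}]
            "stream": stream,
        },
    }

req = build_chat_request("llama3.2", [{"role": "user", "content": "Summarize this page"}])
print(req["url"])
```

Because the endpoint is local, no API key header is needed; the CORS flag above is what lets the Chrome extension reach it.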
What Is Session-Specific Memory?
Previously, all saved memories went to MEMORY.md — your global, Git-backed memory shared across all sessions.
Now you have two options when saving an AI response:
- 💾 Session — Saves to this session only. Not synced to Git. Disappears when you delete the session.
- 🌐 Global — Saves to MEMORY.md. Synced to your Git repo. Available in all sessions.
The memory panel now has tabs:
- 📝 Session — Shows memory for the current session
- 🌐 Global — Shows your full MEMORY.md
The session dropdown also shows a 🧠 indicator with memory count (e.g., “Session 1 (5) 🧠3” means 5 messages, 3 saved memories).
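The dropdown label described above could be produced by a formatter like this sketch (the function name is illustrative, not the real UI code):

```python
def session_label(name: str, message_count: int, memory_count: int) -> str:
    """Format a session dropdown label, e.g. "Session 1 (5) 🧠3"."""
    label = f"{name} ({message_count})"
    if memory_count > 0:
        label += f" 🧠{memory_count}"  # brain indicator only when memories exist
    return label

print(session_label("Session 1", 5, 3))  # Session 1 (5) 🧠3
```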
How Does the Memory Strategy Setting Work?
When your memory gets too long, it can exceed the model’s context window. v0.3 adds a toggle:
- Truncate (fast) — Keeps newest memories, drops oldest. No extra API calls. Default.
- RLM (thorough) — Compresses memory using the LLM before sending. Slower but preserves more context.
The threshold is 6000 characters. Below that, full memory is included. Above that, the strategy kicks in.
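The Truncate strategy amounts to dropping the oldest entries until the rest fits. The 6000-character threshold is the one described above; the function itself is an illustrative approximation, not the extension’s exact code.

```python
THRESHOLD = 6000  # characters, per the setting described above

def truncate_memory(entries: list[str], limit: int = THRESHOLD) -> list[str]:
    """Keep the newest entries whose combined length stays under the limit."""
    kept: list[str] = []
    total = 0
    for entry in reversed(entries):  # newest entries are last in the input
        if total + len(entry) > limit:
            break
        kept.append(entry)
        total += len(entry)
    return list(reversed(kept))      # restore chronological order

old_note = "x" * 4000
new_note = "y" * 3000
print(len(truncate_memory([old_note, new_note])))  # only the newest note fits
```

RLM would instead make an extra LLM call to compress the whole history, which is why it is slower but keeps more context.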
How Does Brave Search Work?
If you add a Brave Search API key in Settings, the agent gains a web_search tool.
When you ask about current events, news, or anything not on the current page, the agent can search the web:
```
User: What's the latest news on the Iran situation?
Agent: [calls web_search("Iran news latest")]
Agent: Here's what I found: ...
```
No regex pattern matching. The model decides when to search based on your question.
Get a free API key at api.search.brave.com.
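A tool declaration for this in the OpenAI-style function-calling format (which Ollama’s tool calling also accepts) might look like the sketch below; the exact schema SoulSearch registers is an assumption.

```python
# Hypothetical web_search tool declaration in the OpenAI-style
# function-calling format; the model calls it with a "query" argument
# and the extension forwards that query to the Brave Search API.
WEB_SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current events or facts not on the page.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"},
            },
            "required": ["query"],
        },
    },
}
```

Declaring the tool this way is what lets the model, rather than a regex, decide when a search is warranted.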
What About Vision Models?
Ollama offers vision models like `llama3.2-vision` that can “see” images. However, these models don’t support tool calling — which agent mode requires.
Solution: v0.3 adds a separate Agent Model setting.
- Chat Model: `llama3.2-vision` — for visual understanding
- Agent Model: `llama3.2` — for browser automation with tools
If you leave Agent Model empty, it uses your chat model. If that model doesn’t support tools, you’ll get a clear error message telling you to set a tool-capable agent model.
How Do I Get v0.3?
Clone the repo and checkout the feature branch:
```shell
git clone https://github.com/menonpg/soulsearch.git
cd soulsearch
git checkout feat/ollama-support
```
Load unpacked in Chrome:
- Go to `chrome://extensions/`
- Enable Developer Mode
- Click “Load unpacked” → select the `soulsearch` folder
Start Ollama with CORS:
```shell
OLLAMA_ORIGINS="chrome-extension://*" ollama serve
```
Configure in Settings:
- Provider: Ollama
- Model: `llama3.2` (or your preferred model)
- Agent Model: `llama3.2` (if using a vision model for chat)
- Brave API Key: (optional, for web search)
Frequently Asked Questions
Does SoulSearch v0.3 work offline?
Yes, with Ollama. Your LLM runs locally, memory is stored locally (and optionally in your Git repo). No internet required except for initial model download.
Which Ollama models support agent mode?
Most text models support tool calling: `llama3.2`, `qwen2.5`, `mistral`. Vision models like `llama3.2-vision` do NOT support tools. Use a text model for agent tasks.
Is my data sent to any servers?
Only if you choose a cloud provider (Anthropic, OpenAI). With Ollama, everything stays on your machine. Git sync only happens when you explicitly push.
Will v0.3 come to the Chrome Web Store?
Eventually, yes — once it’s battle-tested. For now, we’re keeping the store version stable while the open-source version evolves faster.
Can I use both session and global memory together?
Yes. Session memory is always included in that session’s context. Global memory is included based on your memory strategy setting.
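Combining the two stores could look like this sketch, with global memory routed through the active strategy. The 6000-character limit is the threshold stated earlier; everything else (names, the character-level truncation) is an illustrative assumption.

```python
def build_context(session_mem: list[str], global_mem: str,
                  strategy: str = "truncate", limit: int = 6000) -> str:
    """Assemble the memory context for one session (illustrative sketch)."""
    # Session memory is always included verbatim.
    parts = list(session_mem)
    # Global memory goes through the configured strategy when oversized.
    if len(global_mem) > limit:
        if strategy == "truncate":
            global_mem = global_mem[-limit:]  # keep the newest characters
        # "rlm" would instead compress via an extra LLM call (not shown)
    parts.append(global_mem)
    return "\n".join(parts)

ctx = build_context(["likes Rust"], "g" * 7000)
print(len(ctx))  # session memory + newline + truncated global memory
```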
What happened to the regex-based search detection?
Removed. The old version tried to detect search intent with pattern matching (“search for X”, “look up Y”). Now the model decides via proper tool calling.
Links: