mgrep: Semantic grep for Code, PDFs, and Images — grep Finally Meets the AI Era

By Prahlad Menon

grep was born in 1973. It’s fast, universal, and brutally literal — give it a pattern and it returns every line that matches. But here’s the problem every developer eventually hits: you know what you’re looking for, but not the exact words used to name it. So you try grep -r "auth", then grep -r "token", then grep -r "session", and watch your coding agent burn through tokens trying hundreds of patterns on a large codebase. You needed meaning. grep gave you syntax.

mgrep from mixedbread-ai is the answer. It’s a CLI tool that performs semantic search across your entire project — code, PDFs, images, text — using natural language queries. Not keyword matching. Meaning.

# grep — you need the exact word
grep -r "exponential_backoff" ./src

# mgrep — describe the intent
mgrep "retry logic with backoff" ./src

Let’s break down what it does and how to actually use it.


What mgrep Does Differently

Where the vector database lives

This is the first question most developers ask, so let’s answer it upfront: mgrep uses Mixedbread’s own proprietary cloud vector store — it is not pluggable. There is no option to bring your own Pinecone, Weaviate, or Redis Vector. When you run mgrep watch, your files are chunked, embedded, and uploaded to a Mixedbread Store on their platform. Searches hit their cloud, reranking runs on their infrastructure, and results come back to your terminal.

What this means practically:

  • Your files leave your machine and live on Mixedbread’s servers. For most dev projects this is fine; for sensitive codebases, know this before indexing.
  • You get a managed, always-available store with no infra to run — no self-hosted Qdrant, no Chroma, no Docker Compose.
  • Because stores are cloud-backed, your whole team queries the same corpus without re-uploading. A teammate on a different machine can run mgrep "auth flow" --store my-project and get the same results.
  • Usage is visible at platform.mixedbread.com — track what’s indexed and how much capacity you’ve used.

Under the hood, mgrep uses Mixedbread Search — their embedding and reranking pipeline that converts queries and file contents into vectors, finds the top-k matches, then reranks for tighter relevance. It’s the same architecture behind RAG systems, but fully managed and packaged as a Unix-style CLI tool you can pipe, redirect, and script.
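Because it behaves like a normal Unix filter, mgrep output composes with the rest of your toolbox. A minimal sketch of the idea — the one-file-path-per-line output format is an assumption here, so `printf` stands in for real mgrep output:

```shell
# mgrep results compose with standard Unix filters. printf simulates
# its output here (one matching file path per line is an assumption
# about the format); dedupe the hits with sort -u:
printf 'src/auth/session.ts\nsrc/auth/refresh.ts\nsrc/auth/session.ts\n' |
  sort -u
# In a real project you would pipe mgrep itself, e.g.:
#   mgrep -m 50 "token refresh" src/ | sort -u
```

The same shape works for redirecting results to a file or feeding them into xargs.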

What makes it more than just “grep with embeddings”:

  • Truly multi-modal — the same query searches across code, text, PDFs, and images. One vector space.
  • Background indexing — mgrep watch runs a file watcher that keeps your store synced as you edit. Index once, search forever.
  • Web search built-in — --web queries the internet alongside your local files. No browser tab needed.
  • Agentic mode — --agentic breaks complex multi-part questions into sub-queries automatically.
  • Coding agent integrations — first-class support for Claude Code, Codex, OpenCode, Factory Droid.
  • 2x token reduction — in their 50-task benchmark with Claude Code, mgrep-based workflows used roughly half the tokens of grep-based workflows at similar quality. The model reasons instead of scanning.

It respects .gitignore out of the box and supports .mgrepignore for additional exclusions.


Getting Started in 5 Minutes

Install:

npm install -g @mixedbread/mgrep
# or: pnpm add -g @mixedbread/mgrep
# or: bun add -g @mixedbread/mgrep

Sign in:

mgrep login
# Opens browser auth. For CI/CD, use API key instead:
export MXBAI_API_KEY=your_api_key_here

Index your project:

cd path/to/your/project
mgrep watch

This does an initial sync and then keeps your store updated as files change. Leave it running in a terminal tab.

Search:

mgrep "where do we set up authentication?"
mgrep "database connection pooling" src/
mgrep -m 25 "error handling for network timeouts"  # return 25 results max
mgrep -a "how does the billing module work?"        # get a summarized answer

That’s it. First-time setup takes about 3 minutes.


Practical Command Reference

# Basic semantic search
mgrep "query" [path]

# With options
mgrep -m 20 "retry logic"                 # max 20 results
mgrep -c "store schema"                    # show content snippets
mgrep -a "how does auth work?"             # generate a summarized answer
mgrep -s "recent changes to the API"       # sync files first, then search
# Search web alongside local files
mgrep --web "best practices for rate limiting"

# Get a direct answer from web sources
mgrep --web --answer "how to implement JWT refresh tokens in Python"

Agentic mode (complex questions)

# mgrep breaks this into sub-queries automatically
mgrep --agentic "what are all the places we handle user permissions?"

# Combine with --answer for a synthesized response
mgrep --agentic -a "how does the payment flow work end to end?"

Watch / indexing

mgrep watch                                         # index + watch current dir
mgrep watch --max-file-size 5242880                 # raise file limit to 5MB
mgrep watch --max-file-count 5000                   # raise file count limit

Agent integrations

mgrep install-claude-code     # adds mgrep to Claude Code
mgrep install-codex           # adds mgrep to Codex
mgrep install-opencode        # adds mgrep to OpenCode
mgrep install-droid           # adds mgrep to Factory Droid

One command — it signs you in if needed and wires up the integration.


When to Use mgrep vs. grep

They’re complementary, not competing. Here’s the practical breakdown:

  Situation                                                   Use
  Find every file that imports requests                       grep
  Refactor a function name across the codebase                grep / ripgrep
  "Where do we handle expired tokens?"                        mgrep
  Onboarding to a new codebase                                mgrep
  "Find the diagram showing the auth flow" (images + code)    mgrep
  Search a PDF spec alongside your code                       mgrep
  "What does this module do?"                                 mgrep -a
  CI/CD exact pattern matching                                grep

The rule of thumb: grep for exact, mgrep for intent.
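The two also chain together: locate the right file by intent, then pin down the exact line by pattern. A sketch of the shape — `printf` stands in for mgrep output, since one file path per line is an assumption about its format:

```shell
# compose the tools: semantic search picks the file, grep picks the line.
# printf simulates mgrep output; in practice you would run:
#   file=$(mgrep -m 1 "retry logic with backoff" src/ | head -n 1)
file=$(printf 'src/retry.ts\nsrc/http.ts\n' | head -n 1)
echo "$file"   # the single best semantic match, ready for an exact grep -n
```

From there, `grep -n "backoff" "$file"` gives you the literal line numbers.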


Real-World Use Cases

1. Codebase onboarding

New to a large repo? Instead of reading everything:

mgrep watch
mgrep -a "how does the job queue work?"
mgrep -a "what's the data model for users?"

Get oriented in minutes instead of hours.

2. Finding undocumented business logic

mgrep "pricing calculation for enterprise tiers"
mgrep "special case handling for EU customers"

No idea what the function is named. Doesn’t matter.

3. Mixed codebase + docs search

If your docs/ folder has PDFs, architecture diagrams, or design specs:

mgrep "database schema diagram"
# Returns both schema.sql AND the whiteboard PNG in docs/

One query, all file types.

4. Coding agent token reduction

Install once and let your agent use it:

mgrep install-claude-code

Instead of your agent running grep in a loop — filling its context window with irrelevant matches — it asks mgrep for the 5 most semantically relevant snippets and reasons from there. Their benchmark: ~2x token reduction, same or better answer quality.

5. Quick web answers without leaving the terminal

mgrep --web --answer "how do I configure uvicorn for production?"

No browser, no context switch.


Configuration

Create .mgreprc.yaml in your project root (or ~/.config/mgrep/config.yaml for global settings):

# .mgreprc.yaml
maxFileSize: 5242880   # 5MB (default: 1MB)
maxFileCount: 5000     # (default: 1000)

Key environment variables for CI/CD:

export MXBAI_API_KEY=your_key       # headless auth (no browser)
export MGREP_MAX_COUNT=25           # default result count
export MGREP_CONTENT=1              # always show content snippets
export MGREP_ANSWER=1               # always generate answers
export MGREP_RERANK=0               # disable reranking for speed

Precedence: CLI flags > env vars > local config > global config > defaults.
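The precedence chain can be pictured as a simple fall-through. This is an illustrative sketch only — resolve_max_count and the default of 10 are hypothetical, not mgrep's actual code:

```shell
# hypothetical sketch of the documented precedence for one setting:
# CLI flag > env var > config files / built-in default
resolve_max_count() {
  flag="$1"
  if [ -n "$flag" ]; then
    echo "$flag"              # 1. CLI flag wins
  elif [ -n "$MGREP_MAX_COUNT" ]; then
    echo "$MGREP_MAX_COUNT"   # 2. then the environment variable
  else
    echo 10                   # 3. then config files / defaults (assumed value)
  fi
}
MGREP_MAX_COUNT=25
resolve_max_count      # prints 25: env var applies when no flag is given
resolve_max_count 5    # prints 5: an explicit flag overrides the env var
```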


Important Limits and Privacy Considerations

  • Default max file size: 1MB per file (override with --max-file-size or MGREP_MAX_FILE_SIZE)
  • Default max file count: 1,000 files per sync (override with --max-file-count)
  • Files go to Mixedbread’s cloud — this is not optional. The vector store is their managed infrastructure, not local. Review what you index for sensitive projects.
  • Background sync when used with coding agents — mgrep watch starts automatically when you start a session and stops when it ends. Files are synced to Mixedbread during this time.
  • Store isolation — use --store my-project-name to namespace indexes. Useful for keeping client work, internal tools, and experiments separate.
  • Ignoring sensitive files — create .mgrepignore in your project root (same syntax as .gitignore) to exclude secrets, credentials, and private config before they’re uploaded.
# .mgrepignore
.env
.env.*
secrets/
*.pem
*.key
config/credentials.yml

Check usage and manage stores at platform.mixedbread.com.


Troubleshooting

Login keeps reopening:

mgrep logout
mgrep login

Watcher is noisy or stale after a big refactor:

# Use a named store to isolate experiments
mgrep watch --store my-feature-branch
mgrep "query" --store my-feature-branch

Need a completely fresh index: Delete the store from the Mixedbread dashboard, then run mgrep watch — it auto-creates a new one.


The Bottom Line

mgrep isn’t trying to replace grep. It’s trying to make the other kind of search — the one where you know what you mean but not what words were used — actually work. The Unix philosophy (composable, quiet, pipeable) is intact. The 50-year limitation of literal pattern matching is gone.

If you work on large codebases, onboard onto unfamiliar code regularly, or run coding agents that burn tokens on grep loops, mgrep is worth 5 minutes to install and try.


Have you tried mgrep in your workflow? What search problems do you still hit that even semantic search doesn’t solve? Drop a comment below.