Skills vs MCP: The Token Efficiency War (And Why It's Not Either/Or)

By Prahlad Menon

When Claude Skills launched in October 2025, developer Simon Willison called them “maybe a bigger deal than MCP.” His reasoning: MCP’s token consumption was killing context windows, and Skills seemed to solve that problem elegantly.

A few months later, we have real benchmarks. And the numbers are stark.

The Benchmark That Changes Everything

Scalekit completed 75 benchmark runs comparing CLI, CLI+Skills, and MCP on identical GitHub tasks. The model (Claude Sonnet 4) and the prompts stayed the same; only the tool interface changed.

Token usage for “What language is this repo?”:

  • CLI: 1,365 tokens
  • CLI + Skills: 4,724 tokens
  • MCP: 44,026 tokens

MCP used roughly 32× as many tokens as plain CLI to answer a simple question.

The difference? Schema injection. GitHub’s Copilot MCP server exposes 43 tools. Every conversation loads all 43 tool definitions — names, descriptions, input schemas, output schemas — even if the agent only uses one.
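A rough sketch of why schema injection dominates. The three measured totals and the 43-tool count come from the text above; the average tokens per tool definition is a hypothetical figure chosen for illustration, not a number from the benchmark.

```python
# Measured totals quoted above for "What language is this repo?"
MEASURED_TOKENS = {"CLI": 1_365, "CLI + Skills": 4_724, "MCP": 44_026}


def ratio_to_cli(approach: str) -> float:
    """How many times as many tokens an approach used compared with plain CLI."""
    return MEASURED_TOKENS[approach] / MEASURED_TOKENS["CLI"]


def schema_overhead(tool_count: int, avg_tokens_per_tool: int) -> int:
    """Tokens spent on tool definitions before the agent does any work."""
    return tool_count * avg_tokens_per_tool


if __name__ == "__main__":
    for name in MEASURED_TOKENS:
        print(f"{name}: {ratio_to_cli(name):.1f}x CLI")
    # ASSUMPTION: 1,000 tokens per tool definition is a made-up average.
    print(schema_overhead(tool_count=43, avg_tokens_per_tool=1_000))
```

The point of `schema_overhead` is that the cost scales with the number of tools exposed, not with the number of tools used.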

The Cost Math

At Claude Sonnet 4 pricing ($3/M input, $15/M output), running 10,000 operations per month:

Approach        Monthly Cost
CLI             ~$3.20
CLI + Skills    ~$4.50
MCP (Direct)    ~$55.20

That’s a 17× cost multiplier for MCP relative to plain CLI. And it gets worse: MCP had a 28% failure rate in testing (timeout errors), while CLI succeeded on every run.
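The arithmetic behind figures like these can be sketched as follows. The prices and the monthly totals come from the text above; the per-call token counts in the usage note are hypothetical, since the per-operation token mix behind the table is not given here.

```python
# Prices quoted above for Claude Sonnet 4, in dollars per million tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00

# Monthly totals from the table above (10,000 operations per month).
TABLE_COSTS = {"CLI": 3.20, "CLI + Skills": 4.50, "MCP (Direct)": 55.20}


def monthly_cost(input_tokens: int, output_tokens: int, operations: int) -> float:
    """Dollar cost of `operations` calls with the given per-call token counts."""
    per_call = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    return per_call * operations


def multiplier(approach: str, baseline: str = "CLI") -> float:
    """Cost of one approach relative to a baseline, using the table totals."""
    return TABLE_COSTS[approach] / TABLE_COSTS[baseline]
```

For example, `monthly_cost(1_000, 100, 10_000)` prices a hypothetical call of 1,000 input and 100 output tokens at 10,000 operations per month, and `multiplier("MCP (Direct)")` gives the roughly 17× figure from the table totals.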

How Skills Achieve Efficiency

Skills work fundamentally differently from MCP. Instead of injecting tool schemas, Skills inject knowledge about how to use existing tools.

A Skill is just a markdown file with tips:

  • Which gh flags to use
  • Output formatting patterns
  • Common workflows

The agent already knows how to use bash. The Skill just makes it smarter about which bash commands to run. No schema overhead. No tool definitions. Just 800 tokens of guidance that reduces tool calls by a third.

Armin Ronacher, creator of Flask, explains why he moved entirely from MCP to Skills:

“Skills are really just short summaries of which skills exist and in which file the agent can learn more about them. Crucially, skills do not actually load a tool definition into the context. The tools remain the same: bash and the other tools the agent already has.”
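A minimal sketch of the pattern described in that quote, assuming a skill is a one-line summary plus a path to a markdown file. The `Skill` class and both function names are hypothetical and do not correspond to any particular agent framework.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Skill:
    name: str
    summary: str  # one line, always present in the context
    path: Path    # full markdown guidance, read only on demand


def skill_index(skills: list[Skill]) -> str:
    """Text placed in the prompt: summaries and file locations, no tool schemas."""
    lines = [f"- {s.name}: {s.summary} (details: {s.path})" for s in skills]
    return "Available skills:\n" + "\n".join(lines)


def load_skill(skill: Skill) -> str:
    """Read the full guidance; called only when the agent decides it needs it."""
    return skill.path.read_text(encoding="utf-8")
```

Only the output of `skill_index` costs tokens on every conversation; the body of each skill file is paid for only when it is read.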

The killer feature: Skills can be self-maintaining. When a Skill breaks, you ask the agent to fix it. The agent maintains its own tools. MCP servers, by contrast, change their APIs without warning — and your integrations break silently.

MCP’s January 2026 Comeback

MCP didn’t stand still. In January 2026, Anthropic shipped progressive discovery — the same trick that made Skills efficient.

Now when you load an MCP, you get:

  • Tool name + short description: 20-50 tokens each
  • Full schema loads only when the agent decides to use that tool
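The two-stage loading above can be sketched as a small registry that holds short descriptions up front and fetches a full schema the first time a tool is used. This illustrates the idea only; it is not Anthropic's implementation or an MCP SDK API, and `LazyToolRegistry` and `fetch_schema` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LazyToolRegistry:
    """Keeps short descriptions in context; loads a full schema on first use."""

    descriptions: dict[str, str]
    fetch_schema: Callable[[str], dict]  # hypothetical loader, e.g. a server call
    _loaded: dict[str, dict] = field(default_factory=dict)

    def initial_context(self) -> str:
        """The only tool text the model sees before it picks a tool."""
        return "\n".join(f"{name}: {desc}" for name, desc in self.descriptions.items())

    def schema_for(self, name: str) -> dict:
        """Fetch and cache the full schema for one tool."""
        if name not in self.descriptions:
            raise KeyError(f"unknown tool: {name}")
        if name not in self._loaded:
            self._loaded[name] = self.fetch_schema(name)
        return self._loaded[name]
```

Caching in `_loaded` means a tool used several times in one conversation pays for its schema once.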

Results:

  • Token overhead dropped 85% (77,000 → 8,700 tokens for 50+ tools)
  • Tool calling accuracy improved: Claude Opus 4 went from 49% to 74%

This closes the gap significantly. But Skills still win on pure efficiency because they avoid schema injection entirely.

When MCP Still Wins

Here’s where the “just use Skills” advice breaks down: multi-user applications.

If you’re building a personal developer tool, CLI+Skills is the obvious choice. The agent inherits your credentials, acts with your permissions, and the only person at risk is you.

But if you’re building B2B SaaS (a project management tool, support platform, or code review assistant), your agent acts on behalf of your customers’ employees, inside your customers’ organizations, touching your customers’ data.

That requires:

  1. Per-user OAuth — Each user grants scoped access. They can revoke it. Your app never touches their credentials.

  2. Tenant isolation — Acme’s repos must never appear in Globex’s Jira. This is data isolation, not just access control.

  3. Audit trails — When the security team asks “which user triggered that action?”, you need a protocol-level answer.
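A sketch of what those three requirements look like as checks in code. Every name here is hypothetical, and a real deployment would delegate token issuance and revocation to an OAuth provider; the sketch only shows a scope check, a tenant check, and an audit record for each attempted action.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UserGrant:
    """What one user has granted the agent (hypothetical shape)."""
    user_id: str
    tenant_id: str
    scopes: frozenset[str]


class AuthorizationError(Exception):
    pass


audit_log: list[dict] = []


def authorize_and_record(
    grant: UserGrant, action: str, resource_tenant: str, required_scope: str
) -> None:
    """Check scope and tenant, and record who attempted the action."""
    allowed = required_scope in grant.scopes and grant.tenant_id == resource_tenant
    audit_log.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "user": grant.user_id,
        "tenant": grant.tenant_id,
        "action": action,
        "allowed": allowed,
    })
    if not allowed:
        raise AuthorizationError(
            f"{grant.user_id} may not {action} in tenant {resource_tenant}"
        )
```

Denied attempts are logged as well, so the audit trail can answer both "who did this?" and "who tried to?".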

CLI agents can’t provide these. The properties that make CLI efficient — ambient auth, arbitrary execution, zero protocol overhead — are exactly what creates security incidents when agents cross from developer tool to customer-facing product.

The OpenClaw security incidents illustrated this perfectly: 10,000+ exposed instances leaking credentials, 12% of community skills found malicious, 770,000 agents vulnerable to remote hijacking. These aren’t code bugs — they’re architectural consequences of running shell access without authorization boundaries.

The Decision Framework

Scenario                   Best Approach
Personal automation        CLI + Skills
Developer tools            CLI + Skills
Internal team tools        Skills (maybe MCP)
B2B SaaS (multi-tenant)    MCP with OAuth
Customer-facing agents     MCP with Gateway

The Hybrid Future

The smart play isn’t Skills or MCP — it’s Skills wrapping MCP.

Use Skills for:

  • Teaching the agent domain knowledge
  • Workflow orchestration
  • Context-efficient instruction delivery

Use MCP for:

  • Authenticated external integrations
  • Multi-user scenarios
  • Actions requiring audit trails

Phil Whittaker puts it well:

“Skills and MCPs aren’t competing solutions to the same problem. They’re fundamentally different architectures serving different purposes. Skills excel at information delivery and adaptive context management. MCPs provide structured tool integration with authorization boundaries.”

Practical Recommendations

If you’re building on OpenClaw:

  1. Default to Skills for everything that doesn’t require external auth
  2. Use mcporter to expose MCPs as CLI tools when you need both
  3. Add an 800-token skill file for any complex tool — it’s the highest-ROI optimization available
  4. Monitor token usage with observability tools like opik-openclaw

If you’re evaluating architecture:

  1. Count your tools. If < 10, MCP overhead is manageable
  2. Count your users. If > 1, you need MCP’s auth infrastructure
  3. Count your tenants. If > 1, you need MCP’s isolation guarantees
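The three counts above can be written down as a small checklist function. The thresholds come from the list; collapsing them into a single recommendation is one possible reading, and the function names are hypothetical.

```python
def evaluate(tool_count: int, user_count: int, tenant_count: int) -> dict[str, bool]:
    """Apply the three counting rules above to one project."""
    return {
        "mcp_overhead_manageable": tool_count < 10,
        "needs_mcp_auth": user_count > 1,
        "needs_mcp_isolation": tenant_count > 1,
    }


def recommend(tool_count: int, user_count: int, tenant_count: int) -> str:
    """ASSUMPTION: any auth or isolation need tips the choice to MCP."""
    result = evaluate(tool_count, user_count, tenant_count)
    if result["needs_mcp_auth"] or result["needs_mcp_isolation"]:
        return "MCP"
    return "CLI + Skills"
```

For a single developer with a handful of tools, `recommend(5, 1, 1)` returns `"CLI + Skills"`; with many users or more than one tenant it returns `"MCP"`.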

The token efficiency war isn’t over. But the winner isn’t Skills or MCP — it’s knowing when to use each.

