What is MCP?
Model Context Protocol (MCP) is an open standard created by Anthropic for connecting AI models to external tools and data sources. It works by running local "MCP servers" that expose tools to the AI agent via a JSON-RPC interface. Each server declares its available tools using JSON schemas that describe the function name, parameters, types, and descriptions.
When an AI agent starts a session, every connected MCP server injects its full set of tool schemas into the model's context window. This gives the model knowledge of what tools are available and how to call them. MCP has gained significant adoption since its release, with servers available for databases, file systems, web browsers, APIs, and more.
The protocol itself is well-designed for what it does: it provides a standard way for AI models to discover and invoke tools with type safety and structured communication. But this design choice — injecting schemas into every prompt — comes with a cost that compounds as you add more servers.
The Token Cost Problem
Every MCP server injects its complete tool schema into the model's context window on every single prompt. This isn't a one-time cost — it's paid on every message in every conversation. Here's what that looks like in practice:
- 3 MCP servers (files + database + API) = 24,000–45,000 tokens per prompt
- 5 MCP servers = 40,000–75,000 tokens per prompt
- Claude's context window = 200,000 tokens
- 5 MCP servers consume 20–37% of the entire context before the agent does anything

At Claude's pricing ($3 / 1M input tokens):

- 5 servers × 1,000 prompts = ~$120–$225 wasted on tool schemas alone (roughly $0.12–$0.22 per prompt)
- Scale that to 100,000 prompts/month = ~$12,000–$22,500/month just to tell the model what tools exist
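The arithmetic is worth making explicit. Using the upper bound above (75,000 schema tokens per prompt for 5 servers) and the $3 / 1M input-token price:

```python
# Schema overhead for 5 MCP servers, using the figures quoted above.
TOKENS_PER_PROMPT = 75_000            # upper bound for 5 servers
DOLLARS_PER_TOKEN = 3.00 / 1_000_000  # $3 per 1M input tokens

per_prompt = TOKENS_PER_PROMPT * DOLLARS_PER_TOKEN  # cost per prompt, dollars
per_month = per_prompt * 100_000                    # at 100,000 prompts/month

print(f"${per_prompt:.3f} per prompt")  # $0.225 per prompt
print(f"${per_month:,.0f} per month")   # $22,500 per month
```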
But cost is only half the problem. The real damage is to reasoning quality. When 30–40% of the context window is filled with JSON schemas the model may never use, there's less room for the actual conversation, code, documents, and chain-of-thought reasoning that the agent needs to do its job. Every token spent on tool descriptions is a token that can't be spent on solving the user's problem.
This problem gets worse with longer conversations. As the dialogue grows, the model is juggling the user's request, its own reasoning, and tens of thousands of tokens of tool definitions — most of which are irrelevant to the current task. The agent that needs to process a CSV file doesn't need the database schema, the web browser tools, or the email API definitions cluttering its context.
How CLI Tools Solve This
CliHub takes a fundamentally different approach: CLI tools are binaries on your PATH. They exist on the system but consume zero tokens in the model's context until the agent actually decides to use one. There's no ambient schema cost.
Here's how an AI agent works with CliHub:
- Discovery is on-demand. The agent runs `clihub list --json` only when it needs to find a tool. The full catalog of 104 tools is about 87KB of JSON — but the agent only pays this cost once, when it needs it, not on every prompt.
- Installation is one command. Once the agent picks a tool, it runs `clihub install jq`. CliHub auto-detects the right package manager (brew, pip, npm, cargo) and installs it. The tool is immediately available on PATH.
- Usage is native. The agent calls `jq --help` to learn the tool's interface, then uses it directly. No JSON-RPC, no schema translation, no server process to maintain.
- LLMs already know how. Large language models were trained on billions of shell scripts, man pages, Stack Overflow answers, and README files. They already know how to use `git`, `curl`, `jq`, `grep`, and hundreds of other CLI tools without any schema injection. The training data is the protocol.
The key insight is that `--help` is a universal, self-documenting protocol that every CLI tool already speaks. And `--json` provides structured output that agents can parse reliably. You don't need to reinvent tool communication when the command line already solved it decades ago.
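Structured output means the agent can filter the catalog programmatically instead of holding every schema in context. A minimal sketch, assuming a hypothetical shape for the catalog JSON (this document doesn't specify the exact fields `clihub list --json` emits):

```python
import json

# Hypothetical sample of `clihub list --json` output; real field names may differ.
raw = """[
  {"name": "jq", "description": "Command-line JSON processor"},
  {"name": "ripgrep", "description": "Fast regex search over files"}
]"""

catalog = json.loads(raw)
# Parsed once, on demand -- no schemas sitting in the context window.
json_tools = [tool["name"] for tool in catalog
              if "json" in tool["description"].lower()]
print(json_tools)  # ['jq']
```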
Side-by-Side Comparison
| | MCP | CLI (CliHub) |
|---|---|---|
| Context cost | 8k–15k tokens / server, every prompt | 0 tokens until agent runs `--help` |
| Discovery | Schema injected into every prompt | On-demand: `clihub search` |
| LLM familiarity | Needs schema learning per server | Trained on billions of CLI examples |
| Composability | Single-tool calls via JSON-RPC | Unix pipes: `curl \| jq \| csvlook` |
| Error handling | Custom per-server implementation | Exit codes + stderr (universal) |
| Setup complexity | Run server process, configure client | `pip install clihub-ai`, done |
| Ecosystem size | Growing, hundreds of servers | 104 curated + every CLI ever made |
| Offline support | Servers can run locally | Fully offline, bundled registry |
| Structured output | JSON-RPC responses | `--json` flag on every command |
| Learning curve | MCP spec + server config + client setup | If you know the shell, you know CliHub |
The Agent Workflow
Here's what it looks like when an AI agent uses CliHub to accomplish a task. No schema injection, no server configuration — just standard command-line tools:
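A minimal sketch of that discover → install → use loop in Python, as an agent harness might drive it. The `clihub` commands are the ones documented above; the catalog fields, helper names, and keyword-ranking heuristic are illustrative assumptions:

```python
import json
import shutil
import subprocess

def pick_tool(catalog, keywords):
    """Pure helper: rank catalog entries by keyword overlap (simple heuristic)."""
    def score(entry):
        text = (entry.get("name", "") + " " + entry.get("description", "")).lower()
        return sum(word.lower() in text for word in keywords)
    best = max(catalog, key=score, default=None)
    return best if best and score(best) > 0 else None

def run_tool_for_task(keywords, args):
    # Step 1 -- discovery, on demand: fetch the catalog only when a tool is needed.
    raw = subprocess.run(["clihub", "list", "--json"],
                         capture_output=True, text=True, check=True).stdout
    tool = pick_tool(json.loads(raw), keywords)

    # Step 2 -- install only if the binary is not already on PATH.
    if shutil.which(tool["name"]) is None:
        subprocess.run(["clihub", "install", tool["name"]], check=True)

    # Step 3 -- usage is native: read --help, then invoke the tool directly.
    subprocess.run([tool["name"], "--help"], capture_output=True, text=True)
    return subprocess.run([tool["name"], *args],
                          capture_output=True, text=True).stdout
```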
Notice the key difference: the agent only pays the token cost for tool discovery and --help output when it actually needs a tool. In the rest of the conversation, there are zero tokens wasted on tool schemas. The agent has its full context window available for reasoning about the user's actual task.
When MCP Still Makes Sense
MCP isn't wrong — it's designed for a different set of problems. There are scenarios where MCP is genuinely the better choice:
- Stateful services. Database connections that need persistent sessions, authenticated API clients with refresh tokens, or services that maintain complex state between calls are better served by a long-running server process.
- Real-time bidirectional communication. If the tool needs to push updates to the agent (e.g., file watchers, live data streams), MCP's bidirectional protocol handles this natively.
- Fine-grained access control. MCP servers can implement authorization logic, rate limiting, and audit logging in ways that are harder to enforce with standalone CLI tools.
- Complex multi-step transactions. Operations that need rollback semantics or coordination across multiple resources benefit from MCP's server-managed state.
The best agent architectures will likely use both: MCP for stateful, authenticated services and CLI tools for everything else. CliHub handles the “everything else” — the 90% of tool use that's about processing files, transforming data, searching code, and running utilities.
Get Started
Ready to give your AI agent access to 104 CLI tools at zero context cost? Getting started takes one command: `pip install clihub-ai`.