Every single AI integration feels under-engineered (or not even engineered in case of tokenslop), as the creators put exactly the same amount of thought that $LLMOFTHEWEEK did into vomiting "You're absolutely right, $TOOL is a great solution for solving your issue!"
We're yet to genuinely standardise bloody help texts for basic commands (Does -h set the hostname, or does it print the help text? Or is it -H? Does --help exist?). Writing man-pages seems like a lost art at this point, everyone points to $WEBSITE/docs (which contains, as you guessed, LLM slopdocs).
We're gonna end up seeing the same loops of "Modern standard for AI" -> "Standard for AI" -> "Not even a standard" -> "Thing of the past", because all of it is fundamentally wrong to an extent. LLMs are purely textual in context, while network protocols are intricate by nature. An LLM will always end up overspeccing a /api/v1/ping endpoint, while ICMP ping can do the same within bits. Text-based engineering, while legible (in the sense that a tech-illiterate person will find it easy to interpret), will always end up forming abstractions over the core - you'll end up with a shaky pyramid that collapses the moment your $LLM model changes encodings.
A lot of the best tooling around AI we're seeing adds deterministic gates that the probabilistic AI agents work within. This is why I'm using MCP over HTTP. I'm happy for the agent to use its intelligence and creativity to help me solve problems, but for a range of operations, I want a gate past which actions run with the certainty of normal software functions. NanoClaw sells itself on deterministically filtering your WhatsApp messages before the agent gets to see them, and on proxying API keys so the agent never gets them - a similar type of deterministic gate that allows for more confidence when working with AI.
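To make the "deterministic gate" idea concrete, here's a minimal sketch. Everything in it is illustrative (the allowlist, the `gate` function, the secret-matching regex are not from NanoClaw or any real product); the point is just that the check runs as plain, predictable code before anything reaches the probabilistic agent:

```python
import re

# Hypothetical allowlist of actions the agent may request.
ALLOWED_ACTIONS = {"read_file", "list_dir", "search_mail"}

# Crude illustrative pattern for things that look like credentials.
SECRET_PATTERN = re.compile(r"(api[_-]?key|token)\s*[:=]\s*\S+", re.IGNORECASE)

def gate(action, payload):
    """Deterministic gate: reject unlisted actions, redact secrets
    so the agent never sees them."""
    if action not in ALLOWED_ACTIONS:
        raise PermissionError("action %r is not allowlisted" % action)
    return SECRET_PATTERN.sub("[REDACTED]", payload)

print(gate("read_file", "host=db.local api_key=abc123"))
# -> host=db.local [REDACTED]
```

The gate itself never guesses: the same input always produces the same decision, which is what gives you confidence on the other side of it.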
I follow a similar pattern. My autonomous agent Smith has a service mesh that I plug MCPs into, which gives me a single place to define policy (OPA for life) and monitoring. The service gateway owns the credentials. This pattern is secure, easy to manage, and lets you programmatically generate a CLI from the tool catalog. See https://github.com/sibyllinesoft/smith-gateway if you want to understand the model and how to implement it yourself.
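The gateway-owns-credentials idea can be sketched in a few lines. This is not from smith-gateway (a real setup would evaluate Rego in OPA); the `policy` function, agent/tool names, and the token are all made up, but the shape is the same: the agent names a tool, and only the gateway ever touches the secret:

```python
# Secrets live only in the gateway process, never in the agent's context.
SECRETS = {"github": "ghp_placeholder_token"}

# Stand-in for an OPA policy query: which (agent, tool) pairs are allowed.
ALLOWED = {("smith", "github.list_issues")}

def policy(agent, tool):
    return (agent, tool) in ALLOWED

def call_tool(agent, tool):
    """Gateway entry point: check policy, then attach the credential
    the agent itself never sees."""
    if not policy(agent, tool):
        raise PermissionError("%s may not call %s" % (agent, tool))
    service = tool.split(".")[0]
    return {"tool": tool,
            "headers": {"Authorization": "Bearer " + SECRETS[service]}}

print(call_tool("smith", "github.list_issues")["tool"])
```

Because every tool goes through one choke point, policy, monitoring, and credential injection all live in a single place.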
The boundary also needs to hold if the agent is compromised. Proxying keys is the right instinct. We took the same approach at the action layer: cryptographic warrants scoped to the task, delegation-aware, verified at the MCP tool boundary before execution. Open source core. https://github.com/tenuo-ai/tenuo
This resonates. The pattern I keep seeing is that the best AI tooling right now is about constraining the agent, not giving it more freedom. MCP gives you a clean boundary between what the AI decides and what the system executes deterministically. I use MCP servers with Claude Code and the biggest win is exactly what you described, the AI handles the creative problem solving but the actual actions go through predictable, auditable paths.
It seems like we're going back to expert systems in a kind of inverted sense with all of this chaining of deterministic steps. But now the "experts" are specialized and well-defined actions available to something smart enough to compose them to create new, more powerful actions. We've moved the determinism to the right spot, maybe? Just a half-thought.
I'm just trying to learn this stuff now, so I don't know the literature. The "trajectory view" through action space is what makes the most sense to me.
Along these lines, another half-baked pattern I see is kind of a time-lagged translation of ideas from modern stat mech to deep learning/"AI". First it was energy-based systems and the complex energy landscape view, à la spin glasses and Boltzmann machines: the "equilibrium" state-space view, concerned with memory and pattern storage/retrieval. Hinton, Amit, Hopfield, MacKay and co.
Now, the trajectory view that started in the 90s with Jarzynski and Crooks and really bloomed in 2010+ with "stochastic thermodynamics" seems to be a useful lens. The agent stuff is very "nonequilibrium"/"active"-system coded, in the thermo sense... With the ability to create, modify, and exploit resources (tools/memory) on the fly, there's deep history and path dependence. I see ideas from recent Wolpert and co. (Susanne Still, Crooks again, etc.) w.r.t. the thermodynamics of computation providing a kind of through line, all trajectory based. That's all very vague, I know, but I recently read the CoALA paper and was very enchanted, and I've been trying to combine what I actually know with this new, foreign agent stuff.
It's also very interesting to me how the Italian stat mech school, the Parisi family, has continuously put out bangers trying to actually explain machine learning and deep learning success.
I'd love to hear if anyone is thinking along similar lines, or thinks I'm way off track, has paper recs please let me know! Especially papers on the trajectory view of agents.
I think we need to just think of agents as people. The same principles around how we authenticate, authorize and revoke permissions to people should apply to agents. We don't leave the server room door open for users to type commands into physical machines for good reason, and so we shouldn't be doing the same with agents, unless fully sandboxed or the blast radius of malign or erroneous action is fully accepted.
MCP is a fixed specification/protocol for AI app communication (built on top of an HTTP CRUD app). This is absolutely the right way to go for anything that wants to interoperate with an AI app.
For a long time now, SWEs seem to have been bamboozled into thinking the only way you can connect different applications together is "integrations" (tightly coupling your app to the bespoke API of another app). I'm very happy somebody finally remembered what protocols are for: reusable communications abstractions that are application-agnostic.
The point of MCP is to be a common communications language, in the same way HTTP, FTP, SMTP, IMAP, etc. are. This is absolutely necessary since you can (and will) use AI for a million different things, but AI has specific kinds of things it might want to communicate, with specific considerations. If you haven't yet, read the spec: https://modelcontextprotocol.io/specification/2025-11-25
Why is this the right way to go? It's not solving the problem it looks like it's solving. If your challenge is that you need to communicate with a foreign API, the obvious solution to that is a progressively discoverable CLI or API specification --- the normal tool developers use.
The reason we have MCP is because early agent designs couldn't run arbitrary CLIs. Once you can run commands, MCP becomes silly.
There is a clear problem that you'd like an "automatic" solution for, but it's not "we don't have a standard protocol that captures every possible API shape", it's "we need a good way to simulate what a CLI does for agents that can't run bash".
A lot of the reasons to use MCP are contained in the architecture document (https://modelcontextprotocol.io/specification/2025-11-25/arc...) and others. Among them, chief is security, but then there's standardization of AI-specific features, and all the features you need in a distributed system with asynchronous tasks and parallel operation. There is a lot of stuff that has nothing to do with calling tools.
For any sufficiently complex set of AI tasks, you will eventually need to invent MCP. The article posted here talks about those cases and reasons. However, there are cases when you should not use MCP, and the article points those out too.
Security is the chief reason in that it's the most important, since AI security is like nuclear waste. But the reason you should use it is it's a standard, and it's better to use one standard and be compatible with 10,000 apps, than have to write 10,000 custom integrations.
When I first used ChatGPT, I thought, "surely someone has written some kind of POP3 or IMAP plugin for ChatGPT so it can just connect to my mail server and download my mail." Nope; you needed to write a ChatGPT-specific integration for mail, which needed to be approved by ChatGPT, etc. Whereas if they supported any remote MCP server, I could just write an MCP server for mail, and have ChatGPT connect to it, ask it to "/search_mail_for_string" or whatever, and poof, You Have Mail(tm).
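For a sense of how little the client needs to know, here's roughly what the wire message for that mail scenario would look like. MCP is JSON-RPC 2.0, and tool invocations use the `tools/call` method; the `search_mail_for_string` tool name and its arguments are the commenter's hypothetical, not anything from the spec:

```python
import json

# The JSON-RPC 2.0 message a client sends to call a tool on an MCP server.
# Only the tool name and arguments are server-specific; the envelope is fixed.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_mail_for_string",   # hypothetical mail-server tool
        "arguments": {"query": "invoice", "folder": "INBOX"},
    },
}
print(json.dumps(request, indent=2))
```

Any MCP client can emit this envelope for any server, which is exactly why the per-app approval dance becomes unnecessary.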
They did the right thing in hindsight: leave security open until clear patterns emerge, then solidify those patterns into a spec. The spec is still in draft, and they are currently trying to find a simpler solution for client registration than DCR, which ephemeral clients apparently solve for now.
If they had made the security spec without waiting for user information they would most certainly have chosen a suboptimal solution.
I am the creator of HasMCP (so my response may be a little biased). Not everyone has a home/work computer, and mostly not by preference. I know a lot of people who just use an iPad or Android tablet in addition to their phone, and they still use applications to get work done. This is not a small number of people. They need to access open-world data or service-specific data, and this is where MCP is still one of the best options.
It tries to standardize the auth, messaging, and feedback loop in ways an API can't do alone. A CLI app can do it, for sure, but we are talking about a standard. Maybe the answer is something like an mcpcli you can install on your phone, but would you really prefer installing a bunch of applications on your personal device?
Some points where MCP is still not good as of today:
- It does not have a standard way to manage context well. You have to find your own hack; the most accepted one is a search plus add/remove-tool pattern. Another is cataloging the tools.
- Lack of client tooling to support elicitation in many clients (it really hurts productivity, but this is not solved with a CLI either).
- Lack of mcp-ui adoption (mcp-ui vs OpenAI's MCP apps).
I would suggest you keep building whatever helps you and your users. I am not a sponsor of MCP, just sharing my personal opinion. I am also the creator of HasCLI, but I'm admittedly biased toward MCP over CLI in terms of coverage and standardization.
The biggest disappointment I have with MCP today is that many clients are still half-assed about supporting the features outside of MCP tools.
Namely, two very useful features, resources and prompts, have varying levels of support across clients (Codex being one of the worst).
These two are possibly the most powerful features, since they allow consistent, org-level remote delivery of context. I would like to see all major clients support these two and eventually catch up on the other features like elicitation, progress, tasks, etc.
> It tries to standardize the auth, messaging, feedback loop where API can't do alone.
If it tried to do that, you wouldn't have the pain point list.
It's a vibe-coded protocol that keeps using one-directional protocols for bi-directional communication, invents its own terms for existing concepts (elicitation, lol), didn't even have any auth at the beginning, etc.
For the agent to use a CLI, don't we have to install the CLI in the runtime environment first? With MCP over streamable HTTP, we don't have to install anything; we just specify the tool call in the context, isn't that right?
This rolls up to my original point. I get that if you stipulate the agent can't run code, you need some kind of systems solution to the problem of "let the agent talk to an API". I just don't get why that's a network protocol coupling the agent to the API and attempting to capture the shape of every possible API. That seems... dumb.
The argument that mcp is poorly designed is different than “just use cli” which is further different than mcp is a dead end.
I agree MCP is bad as a protocol and likely not what solves the problem long term. But clearly the CLI focus is an artifact of coding agents being the visible tip of the iceberg of LLM agent use cases.
MCP also doesn't work for coworkers who are technical. It works for their agents only.
CLI works for both agents and technical people.
REST API works for both agents and technical people.
MCP works only for agents (unless I can curl it; there are some HTTP-based ones).
>CLI doesn’t work for your coworkers that aren’t technical.
This actually isn't true. I've written bespoke CLI tools for my small business and non-technical people run them without issue. They get intimidated at first but within a day or so they're completely used to it - it's basically just magic incantations on a black box.
CLIs and shell commands can be wrapped by and packaged into scripts, and those scripts can have meaningful names. On Windows, at least, you can assign special icons to shortcuts to those scripts.
I’ve used that approach to get non-technical near-retirees as early adopters of command line tooling (version control and internal apps). A semantic layer to the effect of ‘make-docs, share-docs, get-newest-app, announce-new-app-version’.
The users saw a desktop folder with big buttons to double-click. Errors opened an email to devs/support with full details (minimizing miscommunication about errors and time to fix). With a few minutes of training, expanded and refined to meet individual needs, our accountants and SMEs loved SVN/Git. And the discussion was all about process and needs, not about tooling or the associated mental models.
This should be trivial if you have proper API documentation in something like swagger. You can generate a cli tool with no "figuring out" anything either.
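A toy version of that spec-to-CLI generation fits in a few lines. The miniature spec below is made up (a real generator would load `swagger.json` and handle bodies, auth, and the actual HTTP call); this only shows the mechanical part, mapping `operationId`s and parameters onto argparse subcommands:

```python
import argparse

# Illustrative fragment of an OpenAPI document.
spec = {
    "paths": {
        "/users": {"get": {"operationId": "list_users", "parameters": []}},
        "/users/{id}": {"get": {"operationId": "get_user",
                                "parameters": [{"name": "id", "required": True}]}},
    }
}

# Build one argparse subcommand per operation.
parser = argparse.ArgumentParser(prog="apictl")
sub = parser.add_subparsers(dest="command", required=True)
for path, ops in spec["paths"].items():
    for method, op in ops.items():
        p = sub.add_parser(op["operationId"], help="%s %s" % (method.upper(), path))
        for param in op["parameters"]:
            p.add_argument("--" + param["name"], required=param["required"])

args = parser.parse_args(["get_user", "--id", "42"])
print(args.command, args.id)
```

The point is that nothing here required "figuring out": the CLI surface falls straight out of the machine-readable spec.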
Nothing is “trivial” when you combine humans and computers. I worked at the MIT Computing Help Desk during my undergraduate years. We joked that we received calls from Nobel laureates who could find subatomic particles but couldn’t find the Windows Start button.
My company is currently trying to roll out shared MCPs and skills throughout the company. The engineers who have been using AI tools for the past 1-2 years have few, if any, issues. The designers, product managers, and others have numerous issues.
Having a single MCP gateway with very clear instructions for connecting to Claude Desktop and authenticating with Google eliminates numerous problems that would arise from installing and authenticating a CLI.
The MCP is also available on mobile devices. I can jot down ideas and interact with real data with Claude iOS and the remote MCP. Can’t do that with a CLI.
It's significantly more difficult to secure random CLIs than those APIs. All LLM tools today bypass their ignore files by running commands their harness can't control.
I'm fuzzy when we're talking about what makes an LLM work best because I'm not really an expert. But, on this question of securing/constraining CLIs and APIs? No. It is not easier to secure an MCP than it is a CLI. Constraining a CLI is a very old problem, one security teams have been solving for at least 2 decades. Securing MCPs is an open problem. I'll take the CLI every time.
You should read the article; it explains very well why that is completely wrong. CLIs don’t have a good story about security, are you serious?? Either they use a secret, in which case the LLM has the exact same permissions as you, the user, which is bonkers (not to mention the LLM can now leak your secret to anyone by making a simple curl request) and prevents AI auditing, since it’s not the AI that appears to use the secret, it’s just you! Or the alternative is to run OAuth flows by making you authorize in the browser :). That at least allows some sort of auditing, since the agent can use a specific OAuth client to authorize you. But now you have no ability to run the agent unattended; you need to log in to every possible CLI service before you let the agent work, which means your agent is just sitting there with all your access. Ignorance about best security practices really makes this industry a joke. We need zero standing trust. Auditability. Minimum access required for a task. By letting your agent use your CLIs as if it were you, you throw all of that away.
OP never mentioned letting the agent run as him or use his secrets. All of the issues you mention can be solved by giving the agent its own set of secrets or using basic file permissions, which are table stakes.
Back to the MCP debate, in a world where most web apis have a schema endpoint, their own authentication and authorization mechanisms, and in many instances easy to install clients in the form of CLIs … why do we need a new protocol, a new server, a new whatever. KISS
> OP never mentioned letting the agent run as him or use his secrets
That is implicit with a CLI, because it is being invoked in the user's session unless the session itself has been sandboxed first. Then, for the CLI to access a protected resource, it would of course need API keys or access tokens. Sure, a user could set up a sandbox and could provision agent-specific keys, but then everyone could also always enable 2FA, pick strong passwords, use authenticators, etc., and every org would have perfect security.
Yes, this has been the gradual evolution of AI context and tooling. The same thing is occurring with some of the use cases for a vector DB and RAG: once the agent can interact with the already-existing conventional data store using existing queries, there is no point in introducing that workflow for inference.
no, it's all about auth. MCP lets less-technical people plug their existing tools into agents. They can click through the auth flow in about 10 seconds and everything just works. They cannot run CLIs because they're not running anything locally, they're just using some web app. The creator of the app just needed to support MCP and they got connectivity with just about everything else that supports MCP.
Write better CLIs for the agents of the less-technical people. The MCPs you're talking about don't exist yet either. This doesn't seem complicated; MCP seems like a real dead end.
How are those CLIs being installed and run on hosted services? You'll need to sandbox them and have a way to install them automatically which seems difficult. How does the auth flow work? You'd need to invent some convention or write glue for each service. These are far more complicated than just using MCP, regardless of the benefits of the protocol itself.
I think a big part of why this discussion is coming up again and again is that people assume the way they are using AI is universal, but there's a bunch of different ways to leverage it. If you have an agent which runs within a product it usually cannot touch the outside world at all by design, you do not need an explicit sandbox (i.e. a VM or container) at all because it lives in an isolated environment. As soon as you say "we use CLIs not MCP" well now you need a sandbox and everything else that goes along with it.
If you can tell ahead of time what external connectors you need and you're already sandboxing then by all means go with CLIs, if you can't then MCP is literally the only economical and ergonomic solution as it stands today.
> ...people assume the way they are using AI is universal
This is what led me back to MCP. Our team is using Claude CLI, Claude VSCX, Codex, OpenCode, GCHP, and we need to support GH Agents in GH Actions.
We wanted telemetry and observability to see how agents are using tool and docs.
There's no sane way to do this as an org without MCP unless we standardize and enforce a specific toolset/harness that we wrap with telemetry. And no one wants that.
> Why is this the right way to go? It's not solving the problem it looks like it's solving. If your challenge is that you need to communicate with a foreign API, the obvious solution to that is a progressively discoverable CLI or API specification --- the normal tool developers use.
That sounds like a hack to get around the lack of MCP. If your goal is to expose your tools through an interface that a coding agent can easily parse and use, what compels you to believe throwing amorphous structured text is a better fit than exposing it through a protocol specially designed to provide context to a model?
> The reason we have MCP is because early agent designs couldn't run arbitrary CLIs. Once you can run commands, MCP becomes silly.
I think you got it backwards. Early agents couldn't handle raw tool interfaces, and the problem was solved with the introduction of an interface that models can easily handle. It became a solved problem. Now you're arguing that if today's models work hard enough, they can be willed into doing things with tools without requiring MCP. That's neat, but a silly way to reinvent the wheel - poorly.
If AI is AI, why does it need a protocol to figure out how to interact with HTTP, FTP, etc.? MCP is a way to quickly get those integrations up and running, but purely because the underlying technology has not lived up to its hyped abilities so far. That's why people think of MCP as a band-aid fix.
Why the desire to reinvent the wheel every time? Agents can do it accurately, but you have to wait for them to figure it out every time, and waste tokens on non-differentiated work.
The agents are writing the MCPs, so they can figure out those HTTP and FTP calls. MCP makes it so they don't have to every time they want to do something.
I wouldn't hire a new person to read a manual and then hand-craft a bespoke JSON payload to call an HTTP server every single time I want to make a call, and that's not a knock on the person's intelligence. It's just a waste of time doing the same work over and over again. I want the results of calling the API, not to spend all my time figuring out how to call the API.
It’s simply about making standard, centralized plugins available. Right now Claude benefits from a “link GitHub Connector” button with a clear manifest of actions.
Obviously if the self-modifying, Clawd-native development thing catches on, any old API will work. (Preferably documented but that’s not a hard requirement.)
For now though, Anthropic doesn’t host a clawd for you, so there isn’t yet a good way for it to persist custom integrations.
Each AI needs context management per conversation, and this is something that would be very clunky to replicate on top of HTTP or FTP (as in, it would require side-channel information for session and conversation management).
Everyone looks at APIs, and sure, MCP seems redundant there. But look at an agent driving a browser: the get-DOM method depends on all the actions performed since the window opened, and it needs to be per agent, per conversation.
Can you do that with REST? Sure, sneak a session and conversation ID into a parameter or a cookie. But then the protocol is not really just HTTP, is it? It's all this clunky coupling that comes with a side of unknowns: when is a conversation finished? Did the client terminate, or are we just between messages? As you solve these for the hundredth time, you'd start itching for standardization.
It makes session handling part of the protocol so the LLM doesn't have to handle it, which would be brittle.
And look at the parent post I replied to and its choice of protocol: I'd love to see a session token over FTP, where you need to track the current folder per conversation.
But the agent harness is still handling the session token for you either way. MCP might be an easy way for agent harness creators to abstract the issue away, but I don’t want to lose all REST conventions just to make it a little easier for them to write an agent harness.
It makes it harder for the LLM to understand what’s going on, not easier.
No, but MCPs aren’t free to build either. So if you need to build an API on top, why would you build an MCP instead of using one of the existing standards that both LLMs and humans already know how to work with?
You're interacting with an LLM, so correctness is already out the window. So model-makers train LLMs to work better with MCP to increase correctness. So the only reason correctness is increased with MCP is because LLMs are specifically trained against it.
So why MCP? Are there other protocols that will provide more correctness when trained? Have we tried? Maybe a protocol that offers more compression of commands will overall take up more context, thus offering better correctness.
MCP seems arbitrary as a protocol, because it kinda is. It doesn't >>cause<< the increase in correctness in and of itself; the fact that it >>is<< a protocol is the reason it may increase correctness. Thus, any other protocol would do the same thing.
> You're interacting with an LLM, so correctness is already out the window.
With all due respect if you are prompting correctly and following approaches such as TDD / extensive testing then correctness is not out the window. That is a misunderstanding likely caused by older versions of these models.
Correctness can be as complete as any other new code, I've used the AI to port algorithms from Python to Rust which I've then tested against math oracles and published examples. Not only can I check my code mathematically but in several instances I've found and fixed subtle bugs upstream. Even in well reviewed code that has been around for many years and is well used. It is simply a tool.
> So why MCP? ... MCP seems arbitrary as a protocol
You're right, it is an arbitrary protocol, but it's one that is supported by the industry.
See the screencaps at the end of the post that show why this protocol. Maybe one day, we will get a better protocol. But that day is not today; today we have MCP.
You mean, why not ask the AI to "find a way to use FTP", including either using a tool, or writing its own code? Besides the security issues?
One simple reason is "determinism". If you ask the AI to "just figure it out", it will do that in different ways and you won't have a reliable experience. The protocol provides AI a way to do this without guessing or working in different ways, because the server does all the work, deterministically.
But the second reason is, all the other reasons. There is a lot in the specification, that the AI literally cannot figure out, because it would require custom integration with every application and system. MCP is also a client/server distributed system, which "calling a tool" is not, so it does stuff that is impossible to do on your existing system, without setting up a whole other system... a system like MCP. And all this applies to both the clients, and the servers.
Here's another way to think of it. The AI is a psychopath in prison. You want the psycho to pick up your laundry. Do you hand the psycho the keys to your car? Or do you hand him a phone, where he can call someone who is in charge of your car? Now the psycho doesn't need to know how to drive a car, and he can't drive it off a bridge. All he can do is talk to your driver and tell him where to go. And your driver will definitely not drive off a bridge or stab anyone. And this works for planes, trains, boats, etc, just by adding a phone in between.
Exactly this. I've made some MCP servers and attached tons of other people's MCP servers to my LLMs, and I still don't understand why we can't just use OpenAPI.
Why did we have to invent an entire new transport protocol for this, when the only stated purpose is documentation?
By and large, it is a very simple protocol, and if you build something with it, you will see that it is just a series of defined flows and message patterns. When running over streamable HTTP, it is more or less just a simple REST API over HTTP with a JSON-RPC payload format and a known schema.
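To make "known schema" concrete, here's roughly the shape of a `tools/list` result. The envelope and the `name`/`description`/`inputSchema` fields follow the MCP spec; the `get_weather` tool itself is made up for illustration:

```python
import json

# A tools/list response: each tool is just a name, a human-readable
# description, and a JSON Schema describing its arguments.
response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "tools": [{
            "name": "get_weather",   # illustrative tool, not from the spec
            "description": "Look up current weather for a city",
            "inputSchema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }]
    },
}
print(json.dumps(response, indent=2))
```

There's nothing exotic here, which is the commenter's point: strip away the branding and it's JSON-RPC messages with an agreed-upon vocabulary.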
World would be surely a saner place if instead of “MCP vs CLI” people would talk about “JSON-RPC vs execlp(3)”.
Not accurate, but it at least makes one think of the underlying semantics. Because, really, what matters is some DSL to discover and describe action invocations.
No, this misunderstands what MCP is for and how it works.
Let's say you use Claude's chat interface. How can you make Claude connect to, say, the lights in your house?
Without MCP, you would need Anthropic the company to add support to Claude the web interface to connect over a network to your home, use some custom routing software (that you don't have) to communicate over whatever lightbulb-specific IoT protocol your bulbs use, to be able to control them. Claude needs to support your specific lightbulb stack, and some kind of routing software would need to be added in your home to connect the external network to the internal devices.
But with MCP, Claude only has to support MCP. They don't have to know anything about your lightbulbs or have some custom routing thing for your home. You just need to run an MCP server that talks to the lightbulbs... which the lightbulb company should make and publish, so you don't have to do anything but download the lightbulb MCP server and run it. Now Claude can talk to your lightbulbs, and neither you nor Claude had to do any extra work.
In addition to the communication, there is also asynchronous task control features, AI-specific features, security features, etc that are all necessary for AI work. All this is baked into MCP.
This is the power of standardized communications abstractions. It's why everyone uses HTTP and doesn't have their own custom application-specific tcp-server-language. The world wide web would just be 10 websites.
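A rough sketch of what the lightbulb server's side of this looks like. This is a toy, not real MCP SDK code: `set_light` is a hypothetical tool, and a real server would use an SDK and loop over the stdio or HTTP transport rather than handling a single canned message. But the dispatch shape is the whole idea:

```python
import json

def set_light(room, on):
    # Stand-in for talking the bulb's actual IoT protocol.
    return "light in %s turned %s" % (room, "on" if on else "off")

TOOLS = {"set_light": set_light}

def handle(line):
    """Take one JSON-RPC tools/call message, dispatch it, return the reply."""
    req = json.loads(line)
    result = TOOLS[req["params"]["name"]](**req["params"]["arguments"])
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req["id"],
        "result": {"content": [{"type": "text", "text": result}]},
    })

# One canned request; a real server would read these in a loop from stdin.
demo = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/call",
                   "params": {"name": "set_light",
                              "arguments": {"room": "kitchen", "on": True}}})
print(handle(demo))
```

Claude only ever speaks the left-hand side of `handle`; everything bulb-specific stays inside the server, which is exactly the separation the comment describes.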
No, that's not MCP. That's a pleasant idea that MCP has been shoehorned into trying to solve. But MCP the spec is far more complicated than it needs to be to support that story. Streamable HTTP transport makes it much more workable, and I imagine was designed by real people rather than the version prior to that, but it's still much more than it needs.
Ultimately, 90% of use cases would be solved by a dramatically simpler spec which was simply an API discovery mechanism, maybe an OpenAPI spec at a .well-known location, and a simple public-client based OAuth approach for authentication and authorization. The full-on DCR approach and stateful connections specified in the spec is dramatically harder to implement.
Tell me how many different formats of printed help messages for commands you have seen, and then say "reusable" again. MCP exists exactly to solve this; the rest is just JSON-RPC with simple key-value pairs.
You can probably let the LLM guess the help flag and try to parse the help message, but the success rate totally depends on the model you are using.
As soon as MCP came out I thought it was over engineered crud and didn’t invest any time in it. I have yet to regret this decision. Same thing with LangChain.
This is one key difference between experienced and inexperienced devs; if something looks like crud, it probably is crud. Don’t follow or do something because it’s popular at the time.
All the code I work on now has an MCP interface so that the LLM can debug more easily. I'd argue it is as important as the UI these days. The amount of time it has saved me is unreal. It might be worth investing a very small amount of your time in it to see if it is a good fit. Even a poor protocol can provide useful functionality.
I've just been discovering this pattern too. It's made a huge difference. Trying to get Claude to remote control an app for testing via the various other means was miserable and unreliable.
I got it to build an MCP server into the app that supported sending commands to allow Claude to interact with it as if it was a user, including keypresses and grabbing screenshots, and the difference was immediate and really beneficial.
Visual issues were previously one of the things it would tend to struggle with.
I assume that this is dependent on app, and it's quite possible that your approach is best in some cases.
In my case I started with something somewhat like Playwright, and Claude had a habit of interacting with the app more directly than a user would be able to, and so not spotting problems because of it. Forcing it to interact by pressing keys rather than delving into the DOM or executing random JavaScript helped. In particular, I wanted to be able to chat with it as it tried things interactively. This is more to help with manual or exploratory testing than classic automated testing.
My current app is a desktop app, so playwright isn't as applicable.
I've managed to ignore MCP servers for a long time as well, but recently I found myself creating one to help the LLM agents with my local language (Papiamentu) in the dialect I want.
I made a Prolog program that knows the valid words and spelling along with sentence composition rules.
Via the MCP server a translated text can be verified. If it's not faultless, the agent enters a feedback loop until it is.
The nice thing is that it's implemented once and I can use it in opencode and claude without having to explain how to run the prolog program, etc.
- Do you work in a team context of 10+ engineers?
- Do you all use different agent harnesses?
- Do you need to support the same behavior in ephemeral runtimes (GH Agents in Actions)?
- Do you need to share common "canonical" docs across multiple repos?
- Is it your objective to ensure a higher baseline of quality and output across the eng org?
- Would your workload benefit from telemetry and visibility into tool activation?
If none of those apply, then it's not for you. Server hosted MCP over streamable HTTP benefits orgs and teams and has virtually no benefit for individuals.
What I want to know is: what's the difference between a remote MCP and an API with an openapi.json endpoint for self-discovery? It's just as centralized.
It's instructive to skim the top level of the MCP spec to get a sense. But you can also scroll to the end of the post and see the three .gifs there and see why MCP: because it also defines interaction models with the clients and exposes MCP prompts as `/` (slash) commands and MCP resources as `@` (at) references among other things.
You are right: MCP tools are in essence OpenAPI specs with some niceties like standardized progress reporting. But MCP is more than tools.
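To make the "OpenAPI with niceties" comparison concrete, here is roughly what a single MCP tool call looks like on the wire: a JSON-RPC 2.0 envelope around a named tool plus arguments, with the result coming back as typed content blocks. The `get_weather` tool and its payload are invented; only the envelope shape follows the spec.

```python
import json

# A minimal MCP tools/call exchange (JSON-RPC 2.0). The tool name and
# arguments are hypothetical; the envelope shape follows the MCP spec.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Utrecht"}},
}

# A typical result: a list of typed content blocks rather than a bare JSON
# body -- one of the "niceties" layered on top of a plain OpenAPI-style call.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "12°C, overcast"}]},
}

print(request["method"])  # → tools/call
```

Prompts and resources ride the same envelope with different methods, which is how clients can surface them uniformly as `/` commands and `@` references.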
I can't go into specifics about exactly what I'm doing but I can speak generically:
I have been working on a system using a Fjall datastore in Rust. I haven't found any tools that directly integrate with Fjall, so even getting insight into what data is there, being able to remove it, etc. is hard. So I have used https://github.com/modelcontextprotocol/rust-sdk to create a thin CRUD MCP. The AI can use this to create fixtures, check that things are working as they should, or debug things: e.g. if a query is returning incorrect results and I tell the AI, it can quickly check whether it is a datastore issue or a query-layer issue.
Another example is I have a simulator that lets me create test entities and exercise my system. The AI with an MCP server is very good at exercising the platform this way. It also lets me interact with it using plain english even when the API surface isn't directly designed for human use: "Create a scenario that lets us exercise the bug we think we have just fixed and prove it is fixed, create other scenarios you think might trigger other bugs or prove our fix is only partial"
One more example is I have an Overmind style task runner that reads a file, starts up every service in a microservice architecture, can restart them, can see their log output, can check if they can communicate with the other services etc. Not dissimilar to how the AI can use Docker but without Docker to get max performance both during compilation and usage.
Last example is using off the shelf MCP for VCS servers like Github or Gitlab. It can look at issues, update descriptions, comment, code review. This is very useful for your own projects but even more useful for other peoples: "Use the MCP tool to see if anyone else is encountering similar bugs to what we just encountered"
It's very similar to the switch from a text editor + command line to an IDE with a debugger.
the AI gets to do two things:
- expose hidden state
- do interactions with the app, and see before/after/errors
it gives more time where the LLM can verify its own work without you needing to step in. It's also a bit more integration-test-y than unit.
if you were to add one MCP, make it Playwright or some similar browser-automation MCP. Very little has more value-add than just being able to control a browser.
That's also one of the things that worries me the most. What kind of data is being sent to these random endpoints? What if they go rogue or change their behavior?
An MCP is generally a static set of tools, where auth is handled by deterministic code and not exposed to the agent.
The agent sees tools as allowed or not by the harness/your MCP config.
For the most part, the same company that you're connecting to is providing the MCP, so your data isn't going to random places, but you can also just write your own. They're fairly thin wrappers: a bit of code to call the remote service, and a bit of documentation of when/what/why to do so.
Sniff tests are useful, but they're not wisdom. Most of these stacks are churn wrapped in a repo, so bailing early is usually the right call, yet every so often some ugly little tool sticks because it cuts through one miserable integration problem better than the cleaner options and people keep it around long after the pitch deck evaporates.
The failure mode is turning taste into a religion. If you never touch anything that looks crude on day one, you also miss the occasional weird thing that later becomes boring infra.
Much like how "literally" doesn't literally mean "literally" anymore, "over-engineered" in most cases doesn't mean "too much engineering happened" but "wrong design/abstractions", which of course translates to "designs/abstractions I don't like".
This is quite literally the opposite opinion I and many others had when first exploring MCP. It's so _obviously_ simple, which is why it gained traction in the first place.
So let's say you have a rag llm chat api connected to an enterprises document corpus.
Do you not expose an MCP endpoint? If you do auth right, literally every vscode or opencode node gets it for free (a small JSON snippet in their mcp.json config).
Not only editors, but also different runtime contexts like GitHub Agents running in Actions.
We can plug in MCP almost anywhere with just a small snippet of JSON, and because we're serving it from a server, we get very clear telemetry regardless of tooling and environment.
What are you using for hosting and deploying the MCP servers? I’d like something low friction for enterprise teams to be able to push their MCP definitions as easily as pushing a Git repo (or ideally, as part of a Git repo, kinda like GitHub pages). It’s obviously not sustainable for every team to host their own MCP servers in their own way.
So what’s the best centralized gateway available today, with telemetry and auth and all the goodness espoused in this blog post?
MCP is effectively "just another HTTP REST API"; OAuth and everything. The key parts of the protocol are the communication shape and sequence with the client, which most SDKs abstract for you.
The SDKs for MCPs make it very straightforward to do so now and I would recommend experimenting with them. It is as easy to deploy as any REST API.
MCP is fine, particular remote MCP which is the lowest friction way to get access to some hosted service with auth handled for you.
However, MCP is context bloat and not very good compared to CLIs + skills mechanically. With a CLI you get the ability to filter/pipe (regular Unix bash) without having to expand the entire tool call every single time in context.
CLIs also let you use heredoc for complex inputs that are otherwise hard to escape.
CLIs can easily generate skills from the --help output, and add agent-specific instructions on top. That means you can give the agent all the instructions it needs to know how to use the tools and what tools exist, lazy-loaded, without bloating the context window with all the tools upfront (yes, I know tool search in Claude partially solves this).
CLIs also don’t have to run persistent processes like MCP but can if needed
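The skills-from-`--help` idea above can be sketched as a small generator. The file layout and section headings are assumptions, not any harness's required format, and `capture_help` assumes the tool actually supports a `--help` flag.

```python
import subprocess
from pathlib import Path

def capture_help(cli: str) -> str:
    """Run `<cli> --help` and return its output (assumes the flag exists)."""
    proc = subprocess.run([cli, "--help"], capture_output=True, text=True)
    return proc.stdout or proc.stderr  # some tools print help to stderr

def render_skill(cli: str, help_text: str, agent_notes: str) -> str:
    """Wrap help output plus agent-specific guidance into a skill file body."""
    return (
        f"# Skill: {cli}\n\n"
        f"## When to use\n\n{agent_notes}\n\n"
        f"## Usage (verbatim from --help)\n\n{help_text}\n"
    )

def write_skill(cli: str, agent_notes: str, root: str = "skills") -> Path:
    """Generate the lazy-loaded skill file; the path layout is an assumption."""
    path = Path(root) / cli / "SKILL.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_skill(cli, capture_help(cli), agent_notes))
    return path
```

Because the agent only opens the skill file when it decides to use the tool, the help text never sits in the context window upfront.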
I’ve always felt like MCP is way better suited towards consumer usage rather than development environments. Like, yeah, MCP uses a lot of a context window, is more complex than it should be in structure, and it isn’t nearly as easy for models to call upon as a command line tool would be. But I believe that it’s also the most consumer friendly option available right now.
It’s much easier for users to find out exactly what a model can do with your app over MCP, compared to building a skill for it, since clients can display every tool available to the user. There’s also no need for the model to set up any environment, since it’s essentially just writing out a function call, which saves time because there’s no need for as many virtual machine instructions.
It obviously isn’t as useful in development environments where a higher level of risk can be accepted since changes can always be rolled back in the repository.
If I recall correctly, there’s even a whole system for MCP being built, so it can actually show responses in a GUI much like Siri and the Google Assistant can.
> If I recall correctly, there’s even a whole system for MCP being built, so it can actually show responses in a GUI much like Siri and the Google Assistant can
I still think MCP is completely unnecessary (and have from the start). The article correctly points out where CLI > MCP but stops short on 2 points:
1. Documenting the interface without MCP. This problem is best solved by the use of Skills which can contain instructions for both CLIs and APIs (or any other integration). Agents only load the relevant details when needed. This also makes it easy to customize the docs for the specific cases you are working with and build skills that use a subset of the tools.
2. Regarding all of the centralization benefits attributed to remote MCPs - you can get the same benefits with a traditional centralized proxy as well. MCP doesn't inherently grant you any of those benefits. If I use AWS sso via CLI, boom all of my permissions are tied to my account, benefit from central management, and have all the observability benefits.
In my mind, use Skills to document what to do and benefit from targeted progressive disclosure, and use CLIs and REST APIs for the actual interaction with services.
> This problem is best solved by the use of Skills which can contain instructions for both CLIs and APIs
You've just reversed the context benefits because the content of the skill...goes into context.
> ...you can get the same benefits with a traditional centralized proxy as well. MCP doesn't inherently grant you any of those benefits.
You've just rebuilt MCP...but bespoke, unstructured, and does not plug into industry tooling. MCP prompts are activated as `/` (slash) commands. MCP resources are activated as `@` (at) references. You can't do this with a proxy.
See the three .gifs at the end of the post to see how clients use MCP prompts and resources and definitely check the specification for these two.
As someone charged with enabling users across an enterprise with AI tooling, the majority of whom are not in the software dev category, this article is perfectly mirroring my approach. Which is reassuring!
Challenges we are solving with centralised MCP are around brand guardianship, tone of voice, internal jargon and domain context, access to common data sources, and via the resources methods in MCP access to “skills” that prescribe patterns and shims for expected paths and ways of connecting/extracting data.
The maintenance burden is the real MCP killer nobody talks about. Your agent needs GitHub? Now you depend on some npm package wrapping an API that already had good docs. I just shell out to the gh CLI and curl - when the API changes, the agent reads the updated docs and adapts. With MCP you wait on a middleman to update a wrapper.
tptacek nailed it - once agents run bash, MCP is overhead. The security argument is weird too: it shipped without auth and now claims security as its chief benefit. chroot jails and scoped tokens solved this decades ago.
The only place MCP wins is OAuth flows for non-technical users who will never open a terminal. For dev tooling? Just write better CLIs.
In v0, people can add e.g. Supabase, Neon, or Stripe to their projects with one click. We then auto-connect and auth to the integration’s remote MCP server on behalf of the user.
v0 can then use the tools the integration provider wants users to have, on behalf of the user, with no additional configuration. Query tables, run migrations, whatever. Zero maintenance burden on the team to manage the tools. And if users want to bring their own remote MCPs, that works via the same code path.
We also use various optimizations like a search_tools tool to avoid overfilling context
But then the LLM needs to write its own tools/code for interacting with said service. Which is fine, but slower and it can make mistakes vs officially provided tools
This seems misguided when you have to work in enterprise settings. MCP is a very natural fit for all the API auditing and domain borders that exist in enterprise environments, because it provides deterministic tooling and auditable interfaces for agents. Nobody wants an AI agent doing random API calls or shell commands.
There is no standard for MCP authentication; because of that, it is blocked in my enterprise, for example. Basically they want to avoid non-technical people installing random MCPs and exposing internals to the internet.
The credential proxy pattern (agent never sees the key, gateway owns it) works well when the human is the principal and the agent is acting on their behalf. But it hits a wall when the agent needs to be the principal.
Email sent from a human's account on behalf of an agent is a different legal and reputational thing than email sent from the agent's own address. If the agent makes a mistake, takes an action, or enters into a relationship — whose name is on it? Right now the answer is almost always "the human's", which means agents can't really be held accountable as entities.
The deeper issue MCP hasn't addressed is that auth was built for users, not agents. OAuth gives agents delegated access. But delegation isn't identity. An agent with delegated Gmail access is acting as a deputy. An agent with its own email address and phone number is acting as a first-class participant.
Some things you want the deputy model (browsing the web, reading your calendar). Some things need a distinct identity — outreach, commitments, anything where attribution matters downstream. Those two cases need different infrastructure.
The problem with MCP isn't MCP. It's the way it's invoked by your agent.
IMO, by default MCP tools should run in forked context. Only a compacted version of the tool response should be returned to the main context. This costs tokens yes, but doesn't blow out your entire context.
If other information is required post-hoc, the full response can be explored on disk.
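A minimal sketch of that forked-context pattern: the full tool response is spilled to disk, and only a compact summary returns to the main context. Real compaction would be an LLM pass in the forked context; naive truncation stands in for it here.

```python
import os
import tempfile
from pathlib import Path

def run_tool_forked(tool_name: str, raw_result: str, max_chars: int = 200) -> dict:
    """Persist the full tool response to disk; hand only a compact record
    back to the main context. Truncation is a stand-in for real compaction."""
    fd, spill = tempfile.mkstemp(prefix=f"{tool_name}-", suffix=".txt")
    os.close(fd)
    Path(spill).write_text(raw_result)
    return {
        "tool": tool_name,
        "summary": raw_result[:max_chars],  # the only part entering context
        "truncated": len(raw_result) > max_chars,
        "full_result_path": spill,          # explorable post-hoc from disk
    }
```

If the agent later needs more, it greps or pages through `full_result_path` instead of having the whole blob burn context on every turn.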
I think part of the problem is how these mcp service are designed. A lot of them just returns Mbs of text blob without filtering at all, and thus explodes the context.
And it's also affected by how the model is trained. Gemini specifically likes to read large amounts of text data directly and explodes the context, while Claude tries to use a tool for partial search or writes a script to sample from a very large file. Gemini always fills the context way faster than Claude when doing the same job.
But I guess in the case of a badly designed MCP, there is not much the model can do, because the results are injected into the context directly (unless the runtime decides to redirect them somewhere else).
Finally! I have been saying this for months, generally to big backlash. The only two aspects missing are the role of central MCP gateways and code mode. We don't know 100% how these will be used optimally, but that's what the future will look like for 90% of use cases. I would go so far as to say that someone will have to make a bash-to-JS compiler for simple cases like piping common commands (cat, ls, rg, grep), because that would allow using all the RL and training data while saving all the overhead of steering away from them. Once there are virtually no local tools left, we can just scale up agent servers like opencode serve to serve agents like a web server.
The author likes to look at every concept from all sides, yet is seemingly unaware of Token-Oriented Object Notation (TOON), almost wishing something like it existed…
Ask yourself: what kind of tool would I love to have to accomplish the work I'm asking the LLM agent to do? Often, what is practical for humans to use is practical for LLMs too. And the answer is almost never the kind of thing MCP exports.
You interact with REST APIs (analogue of MCP tools) and web pages (analogue of MCP resources) every day.
I'd recommend that you take a peek at MCP prompts and resources spec and understand the purpose that these two serve and how they plug into agent harnesses.
So you love interacting with web sites by sending requests with curl?
And if you need the price of an AWS service, do you love to guess the service name (querying some other endpoint), then ask some tool the price for it, get JSON back, and so forth? Or are you better served by a small .md file you pre-compiled with the services you use the most, reading a couple of lines from it?
> I'd recommend that you take a peek at MCP prompts and resources spec
Don't assume that if somebody does not like something they don't know what it is. MCP makes happy developers that need the illusion of "hooking" things into the agent, but it does not make LLMs happy.
I find that skills work very well. The main SKILL file has an overview of all the capabilities of my platform at a high level and each section links to a more specific file which contains the full information with all possible parameters for that particular capability.
Then I have a troubleshooting file (also linked from the main SKILL file) which basically lists out all the 'gotchas' that are unique to my platform and thus the LLM may struggle with in complex scenarios.
After a lot of testing, I identified just 5 gotchas and wrote a short section for each one. The title of each section describes the issue and lists out possible causes with a brief explanation of the underlying mechanism and an example solution.
Adding the troubleshooting file was a game changer.
If it runs into a tricky issue, it checks that troubleshooting file. It's highly effective. It made the whole experience seamless and foolproof.
My platform was designed to reduce applications down to HTML tags which stream data to each other so the goal is low token count and no-debugging.
I basically replaced debugging with troubleshooting; the 5 cases I mentioned are literally all that was left. It seems to be able to quickly assemble any app without bugs now.
The 'gotchas' are not exactly bugs but more like "Why doesn't this value update in realtime?" kind of issues. They involve performance/scalability optimizations that the LLM needs to be aware of.
If it's a remote API, I suppose the argument is that you might as well fetch the documentation from the remote server, rather than using a skill that might go out of date. You're trusting the API provider anyway.
But it's putting a lot of trust in the remote server not to prompt-inject you, perhaps accidentally. Also, what if the remote docs don't suit local conditions? You could make local edits to a skill if needed.
Better to avoid depending on a remote API when a local tool will do.
Or just build your own remote MCP server for docs? It's easy enough now that the protocol and supporting SDKs have stabilized.
Most folks are familiar with MCP tools but not so much with MCP resources[0] and MCP prompts[1]. I'd make the case that these latter two are way more powerful and significant, because most clients support them (to varying degrees at the moment, to be fair).
For teams/orgs, these are really powerful because they simplify delivery of skills and docs and move them out of the repo (yes, there are benefits to this, especially when the content is applicable across multiple repos), on top of surfacing telemetry that informs usage and efficacy.
Why would you do it? One reason is that now you can index your docs with more powerful tools. Postgres FTS, graph databases to build a knowledge base, extract code snippets and build a best practices snippet repo, automatically link related documents by using search, etc.
I have moved towards super-specific scripts (so I guess "CLI"?) for a few reasons:
1. You can make the script very specific for the skill and permission appropriately.
2. You can have the output of the script make clear to the LLM what to do. Lint fails? "Lint rules have failed. This is important for reasons blah blah, and you should do X before proceeding." Otherwise the agent is too focused on smashing out the overall task and might opt to route around the error. Note you can use this for successful cases too.
3. The output and token usage can be restricted to exactly what the agent needs. Saves context. My GitHub comments script really just gives the comments + the necessary metadata, not much else.
The downsides of MCP all focus on (3), but 1 and 2 can be really important too.
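Point 2 above, steering the agent with the script's own output, might look like the wrapper below. The lint command and the wording of the guidance are hypothetical; the pattern is simply: run the real check, then print explicit instructions alongside the raw output.

```python
import subprocess

# Wrapper around a hypothetical lint command. On failure it doesn't just
# dump the error -- it tells the agent explicitly not to route around it.
GUIDANCE_FAIL = (
    "LINT FAILED. Do NOT proceed or work around this: these rules gate CI.\n"
    "Fix every reported issue, then re-run this script before continuing."
)
GUIDANCE_OK = "Lint passed. You may proceed to the next step of the task."

def run_lint(cmd: list[str]) -> int:
    """Run the check, surface its output, and append agent-facing guidance."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    print(proc.stdout)
    print(GUIDANCE_OK if proc.returncode == 0 else GUIDANCE_FAIL)
    return proc.returncode

# Usage from a skill or wrapper script (lint command is an assumption):
# run_lint(["ruff", "check", "."])
```

Because the guidance rides on the exit path the agent actually sees, it lands at exactly the moment the agent is deciding what to do next.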
One part that makes me wary of these tools is security.
If I use a remote MCP or a CLI that relies on network calls, and I put it in the hands of my coding assistant, wouldn't it be too easy to inject prompts and exfiltrate data from my machine?
At least MCPs don't have direct access to my machine, but CLIs do.
We've been working on a warrant model that ensures task-scoped authorization: constrain your agents to specific tools and specific arguments, cryptographically enforced at the MCP tool boundary. Even a fully compromised agent can't reach outside its warrant. Open source. github.com/tenuo-ai/tenuo
> (I preface that this is primarily relevant for orgs and enterprises; it really has no relevance for individual vibe-coders)
The thing about tools that "democratize" software development, whether it is Visual Studio/Delphi/QT or LLMs, is that you wind up with people in organizations building internal tools on which business processes will depend who do not understand that centralization is key. They will build these tools in ignorance of the necessity of centralization-centric approaches (APIs, MCP, etc.) and create Byzantine architectures revolving around file transfers, with increasing epicycles to try to overcome the pitfalls of such an approach.
There's a distinction between individual devs and organizations like Amazons or even a medium sized startup.
Once you have 10-20 people using agents in wildly different ways getting wildly different results, the question of "how do I baseline the capabilities across my team?" becomes very real.
In our team, we want to let every dev use the agent harness that they are comfortable with and that means we need a standard mechanism of delivering standard capabilities, config, and content across the org.
I don't see it as democratization versus corporate fascism so much as "can we get consistent output from developers of varying degrees of skill using these agents in different ways?"
The only value—and it’s significant—that a fixed-tools protocol like MCP can provide is to serve as the capability base for an embedded agent security model.
The agent can only perform the operations it has been expressly given tools to perform, and its invocation of those tools can be audited and otherwise governed.
Whether MCP evolves to fulfill this role effectively, time will tell.
I don't know. Skill + HTTP endpoint feels way safer, more powerful, and more robust. The problem is usually that the entity offering the endpoint, if the endpoint is AI-powered, incurs the LLM costs, while via MCP the coding agent is eating that cost, unless you are also the one running the API and so can use the coding-plan endpoint to do the AI thing.
If I haven't misunderstood you, it doesn't really matter whether it's an endpoint or a (remote) MCP: either someone else wants to run LLMs to provide a service for you, or they don't.
A local MCP doesn't come into play because it just couldn't offer the same features in this case.
The MCP server usually provides some functions you can run, possibly with some database interaction.
So when you run it, your coding agent is using AI to run that code (what to call, what parameters to pass, and so on). Via MCP, they don't pay any LLM cost; they just offer the code and the endpoint.
But this is usually messy for the coding agent since it fills up the context. While if you use skill + API, it's easier for the agent since there's no code in the context, just how to call the API and what to pass.
With something like this, you can then have very complex things happening in the endpoint without the agent worrying about context rot or being able to deal with that functionality.
But to have that difficult functionality, you also need to call an LLM inside the endpoint, which is problematic if the person offering the MCP service does not want to cover LLM costs.
So it does matter if it's an endpoint or an MCP because the agent is able to do more complex and robust stuff if it uses skill and HTTP.
So if I release a new CLI, how do I get the LLM to know about it? Do I tell it every time to run the command? Do I build a skill? Should I release a skill with the CLI? Do I just create docs on GitHub and hope the next crawl gets it into the training set?
Package a skill with your CLI itself and give users instructions on how to install the skill properly. That allows the agent to read the instructions in a context efficient way when it wants to use the CLI
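One way to package it, sketched under assumptions: the skill body ships inside the CLI, and an `install-skill` subcommand copies it into the agent's skills directory. The `~/.claude/skills` default, the `mytool` name, and the skill wording are all hypothetical; harnesses differ on where skills live.

```python
import argparse
from pathlib import Path

# Bundled skill content, shipped inside the package alongside the CLI.
# The tool name and instructions are invented for illustration.
SKILL_MD = """\
# mytool
Use `mytool` when the user asks to export or inspect project data.
Run `mytool --help` for flags; prefer `mytool export --json` for machine output.
"""

def install_skill(dest_root: Path) -> Path:
    """Copy the bundled SKILL.md into the agent's skills directory.
    The directory varies per harness, so it is a parameter."""
    dest = dest_root / "mytool" / "SKILL.md"
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(SKILL_MD)
    return dest

def main(argv=None):
    parser = argparse.ArgumentParser(prog="mytool")
    sub = parser.add_subparsers(dest="command", required=True)
    inst = sub.add_parser("install-skill", help="install the agent skill file")
    inst.add_argument("--skills-dir", default=str(Path.home() / ".claude" / "skills"))
    args = parser.parse_args(argv)
    if args.command == "install-skill":
        print(install_skill(Path(args.skills_dir)))
```

The user runs `mytool install-skill` once, and from then on the agent reads the instructions lazily whenever it reaches for the tool.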
This article is sort of right. Though MCP itself is still a very meh standard, for secure enterprise use cases SOME agent-specific standard is really valuable. It gives you a single point of management. What matters is that it's _for agents_ and it has traction.
Using MCP daily through Claude Code for browser automation and external APIs. The protocol works — the tooling around it is what needs to mature.
Biggest pain point is reliability: connections drop, tools fail silently, no good way to know if a call actually reached the server.
But the article's "just HTTP with extra steps" framing misses the point. The value is the standardized tool interface. Before MCP, every AI integration was a bespoke wrapper. A shared vocabulary for "here's a tool, here's its schema, call it" is genuinely useful, rough edges and all.
One aspect I think is often overlooked in the CLI vs. MCP debate: MCP's support for structured output and output schema (introduced in the 2025-06-18 spec). This is a genuinely underrated feature that has practical implications far beyond just "schema bloat."
Why? Because when you pair output schema with CodeAct agents (agents that reason and act by writing executable code rather than natural language, like smolagents by Hugging Face), you solve some of the most painful problems in agentic tool use:
1. Context window waste: Without output schema, agents have to call a tool, dump the raw output (often massive JSON blobs) into the context window, inspect it, and only then write code to handle it. That "print-and-inspect" pattern burns tokens and attention on data the agent shouldn't need to explore in the first place.
2. Roundtrip overhead: Writing large payloads back into tools has the same problem in reverse. Structured schemas on both input and output let the agent plan a precise, single-step program instead of fumbling through multiple exploratory turns.
And the industry is clearly converging on this pattern. Cloudflare built their "Code Mode" around the same idea (https://blog.cloudflare.com/code-mode/), converting MCP tools into a TypeScript API and having the LLM write code against it rather than calling tools directly. Their core finding: LLMs are better at writing code to call MCP than at calling MCP directly. Anthropic followed with "Programmatic tool calling" (https://www.anthropic.com/engineering/code-execution-with-mc..., https://platform.claude.com/docs/en/agents-and-tools/tool-us...), where Claude writes Python code that calls tools inside a code execution container. Tool results from programmatic calls are not added to Claude's context window, only the final code output is. They report up to 98.7% token savings in some workflows.
So the point here is: MCP isn't just valuable for the centralization, auth, and telemetry story the author laid out (which I fully agree with). The protocol itself, specifically its structured schema capabilities, directly enables more efficient and reliable agentic workflows. That's a concrete technical advantage that CLIs simply don't offer, and it's one more reason MCP will stick around.
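A toy illustration of that programmatic pattern: the agent-written program chains tool calls inside a sandbox, and only the final one-line summary would re-enter the model's context. Both "tools" here are hypothetical stand-ins.

```python
# Toy version of programmatic tool calling: intermediate tool results stay
# inside the sandboxed program; only the final value returns to the model.
# Both tools below are hypothetical stand-ins for schema-typed MCP tools.

def query_orders(region: str) -> list[dict]:
    """Stand-in tool returning a large payload (imagine thousands of rows)."""
    return [{"id": i, "region": region, "total": 100 + i} for i in range(1000)]

def sum_totals(rows: list[dict]) -> int:
    """Stand-in tool aggregating a payload passed tool-to-tool."""
    return sum(r["total"] for r in rows)

def run_agent_program() -> str:
    # The 1000-row intermediate result never leaves this function, i.e. it
    # never hits the context window -- only this one-line summary does.
    rows = query_orders("eu-west")
    return f"eu-west order volume: {sum_totals(rows)}"

print(run_agent_program())  # → eu-west order volume: 599500
```

With output schemas, the agent can write this program in one step because it already knows the shape of `query_orders`' result, rather than needing a print-and-inspect round trip first.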
There is another differentiator between CLIs and MCP.
CLIs are executed by the coding assistant in the project directory, which means that they can get implicit information from there (e.g. the git branch and commit).
With an MCP you would need a prepare step to gather that, making things slower.
In MCP setups you do give the agent the full description of what the tool can do, but I don't see why you couldn't do the same for executables. Something like injecting `tool_exe --agent-usage` into the prompt at startup.
Great article otherwise. I've been wondering why people are so zealous about MCP vs executable tools, and it looks like it's just tradeoffs between implementation differences to me.
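The `--agent-usage` idea could be wired up like this at harness startup. The flag is the commenter's hypothetical convention, not a real standard, and missing or unresponsive tools are simply skipped.

```python
import subprocess

def collect_agent_usage(tools: list[str]) -> str:
    """Build a system-prompt fragment by asking each executable to describe
    itself. `--agent-usage` is a hypothetical convention, not a real flag."""
    sections = []
    for tool in tools:
        try:
            out = subprocess.run(
                [tool, "--agent-usage"], capture_output=True, text=True, timeout=5
            ).stdout.strip()
        except (FileNotFoundError, subprocess.TimeoutExpired):
            continue  # tool missing or hung: skip rather than fail startup
        if out:
            sections.append(f"## {tool}\n{out}")
    return "\n\n".join(sections)

# At harness startup (SYSTEM_PROMPT and the tool list are assumptions):
# prompt = SYSTEM_PROMPT + "\n\n" + collect_agent_usage(["mytool", "othertool"])
```

This is essentially the MCP tool-listing handshake reimplemented as a shell convention, which is why the tradeoff framing in the comment above rings true.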
This came up in recent discussions about the Google apps CLI that was recently released. Google initially included an MCP server but then removed it silently - and some people believe this is because of how many different things the Google Workspace CLI exposes, which would flood the context. And it seemed like in social media, suddenly a lot of people were talking about how MCP is dead.
But fundamentally that doesn’t make sense. If an AI needs to be fed instructions or schemas (context) to understand how to use something via MCP, wouldn’t it need the same things via CLI? How could it not? This article points that out, to be clear. But what I’m calling out is how simple it is to determine for yourself that this isn’t an MCP versus CLI battle. However, most people seem to be falling for this narrative just because it’s the new hot thing to claim (“MCP is dead, Long Live CLI”).
As for Google - they previously said they are going to support MCP. And they’ve rolled out that support even recently (example from a quick search: https://cloud.google.com/blog/products/ai-machine-learning/a...). But now with the Google Workspace CLI and the existence of “Gemini CLI Extensions” (https://geminicli.com/extensions/about/), it seems like they may be trying to diminish MCP and push their own CLI-centric extension strategy. The fact that Gemini CLI Extensions can also reference MCP feels a lot like Microsoft’s Embrace, Extend, Extinguish play.
MCP loads all tools immediately. A CLI does not, because it's not auto-exposed to the agent; you have more control over the context of which tools exist and how to deliver that context.
It does not have to load all tools. Just as you can hide the details in a CLI, you can implement the same in an MCP server and client.
Just follow the widely accepted pattern (all you need is 3 tools in front):
- listTools - List/search tools
- getToolDetails - Get input arguments for the given tool name
- execTool - Execute given tool name with input arguments
HasMCP, a remote MCP framework, follows/allows this pattern.
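A minimal sketch of those three front tools over a server-side catalog (the catalog entries are invented): the agent only ever sees the three meta-tools, and full schemas are fetched on demand.

```python
# Three-meta-tool pattern: the agent sees only listTools / getToolDetails /
# execTool; the full catalog stays server-side. Catalog entries are invented.

CATALOG = {
    "create_invoice": {
        "description": "Create an invoice for a customer",
        "schema": {"customer_id": "string", "amount_cents": "integer"},
        "fn": lambda args: {"invoice_id": "inv_1", **args},
    },
    "list_customers": {
        "description": "List customers, optionally filtered by name",
        "schema": {"name_filter": "string (optional)"},
        "fn": lambda args: ["acme", "globex"],
    },
}

def list_tools(query: str = "") -> list[str]:
    """Search the catalog; only matching names enter the context."""
    return [n for n, t in CATALOG.items()
            if query.lower() in (n + t["description"]).lower()]

def get_tool_details(name: str) -> dict:
    """Fetch a tool's input schema on demand, not upfront."""
    t = CATALOG[name]
    return {"name": name, "description": t["description"], "schema": t["schema"]}

def exec_tool(name: str, args: dict):
    """Dispatch execution by name with the supplied arguments."""
    return CATALOG[name]["fn"](args)

print(list_tools("invoice"))  # → ['create_invoice']
```

The upfront context cost is constant (three schemas) no matter how large the catalog grows, which is the whole point of the pattern.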
Accurate for naive MCP client implementations, but a proxy layer with inference-time routing solves exactly this control problem. BM25 semantic matching on each incoming query exposes only 3-5 relevant tool schemas to the agent rather than loading everything upfront - the 44K token cold-start cost that the article cites mostly disappears because the routing layer is doing selection work. MCPProxy (https://github.com/smart-mcp-proxy/mcpproxy-go) implements this pattern: structured schemas stay for validation and security quarantine, but the agent only sees what's relevant per query rather than the full catalog. The tradeoff isn't MCP vs CLI - it's routing-aware MCP vs naive MCP, and the former competes with CLI on token efficiency while retaining the organizational benefits the article argues for.
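The routing idea reduces to: score the catalog against each query and expose only the top-k schemas. A crude term-overlap score stands in for BM25 below, and the tool names and descriptions are invented.

```python
from collections import Counter

# Toy routing layer: expose only the top-k tools whose descriptions best
# match the incoming query. Real proxies use BM25; a plain term-overlap
# score stands in here. Tool names/descriptions are hypothetical.

TOOLS = {
    "jira_create_issue": "create a new jira issue ticket in a project",
    "jira_search": "search jira issues by text query",
    "gh_open_pr": "open a github pull request from a branch",
    "gh_review": "leave review comments for an existing github pull request",
}

def score(query: str, doc: str) -> int:
    """Count overlapping terms between query and tool description."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum(min(q[t], d[t]) for t in q)

def route(query: str, k: int = 2) -> list[str]:
    """Return the k best-matching tool names; only their schemas get exposed."""
    ranked = sorted(TOOLS, key=lambda n: score(query, TOOLS[n]), reverse=True)
    return ranked[:k]

print(route("open a pull request on github"))  # → ['gh_open_pr', 'gh_review']
```

The agent's cold-start cost then scales with k, not with the catalog size, while the full schemas remain server-side for validation.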
You've missed the point and hyperfocused on the story around context rather than why an org would want centralized servers exposing MCP endpoints instead of CLIs:
1. The part where you are providing 100 tools instead of a few really flexible tools
2. The part where you think your agent is going to know how to use 100 CLI tools that are not already in its training dataset without using extra turns walking the help content to dump out command names and schemas
3. The part where, without a schema defining the inputs, the LLM wastes iterations trying to correct the input format.
4. The part where, without the full picture of the tools, whether it picks the same tools, or the right tools, is a complete gamble on the model outputting the right keywords to trigger the tool.
5. The part where you forgot to mention that for your agent to know that your 100 CLI tools exist, you had to either provide it in context directly, provide it in context in a README.md, or have it output the directory listing and send that off to the LLM to evaluate before picking the tool and then possibly expanding the man pages for several tools and sub commands using several turns.
Don't get me wrong, CLIs are great if it's already in the LLM's training set (`git`, for example). Not so great if it's not, because it will need to walk the man pages anyway.
> The part where you are providing 100 tools instead of a few really flexible tools
I'm not sure how that solves the issue. The shape of each individual tool will be different enough that you will need a different schema - something you will be passing on each turn in MCP and something you can avoid with a CLI. Also, CLIs can be flexible too.
> The part where you think your agent is going to know how to use 100 CLI tools that are not already in its training dataset without using extra turns walking the help content to dump out command names and schemas
By CLIs we mean SKILLS.md, so it won't require this hop.
> The part where, without a schema defining the inputs, the LLM wastes iterations trying to correct the input format.
What do we lose by one iteration? We lose a lot by passing all the tool shapes on each turn.
> The part where, not having the full picture of the tools, your odds of it picking the same tools or the right tools is completely gambling that it outputs the right keywords to trigger the tool to be used.
we will use skills
> The part where you forgot to mention that for your agent to know that your 100 CLI tools exist, you had to either provide it in context directly, provide it in context in a README.md, or have it output the directory listing and send that off to the LLM to evaluate before picking the tool and then possibly expanding the man pages for several tools and sub commands using several turns.
I use Claude Cowork to talk to my (remote) CMS over MCP to continually improve all content in my website. If I find a new nugget of interesting information, I tell it to improve my content with it. I created lots of tools to help it do things that would require multiple calls in a pure, basic REST api. Plus you can describe lots of guidelines right in the MCP instructions.
I hear everyone talking about skills, but is this something I should use skills for?
>The LLM has no way of knowing which CLI to use and how it should use it…unless each tool is listed with a description somewhere either in AGENTS|CLAUDE.md or a README.md
This is what the skill file is for.
>Centralizing this behind MCP allows each developer to authenticate via OAuth to the MCP server and sensitive API keys and secrets can be controlled behind the server
This doesn't require MCP. Nothing is stopping you from creating a service to proxy requests from a CLI.
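A minimal sketch of what such a credential proxy looks like, with an illustrative upstream URL and env var (neither is a real service): the CLI authenticates to the gateway however you like (OS user, mTLS, OAuth), and only the gateway's environment ever holds the real API key.

```python
import os
import urllib.request

# Hypothetical upstream API the gateway fronts.
UPSTREAM = "https://api.example.com"

def build_upstream_request(path: str, body: bytes) -> urllib.request.Request:
    """Inject the real credential server-side before forwarding.

    The secret lives only in the gateway process; CLI callers never see it,
    which is the same guarantee an MCP server gives, without MCP.
    """
    req = urllib.request.Request(UPSTREAM + path, data=body, method="POST")
    req.add_header("Authorization", f"Bearer {os.environ['UPSTREAM_API_KEY']}")
    req.add_header("Content-Type", "application/json")
    return req
```

Whether the client on the other side of the gateway speaks MCP, a CLI, or plain HTTP is orthogonal to where the secret lives.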
The problem with this article is that it doesn't recognize that skills are a more general superset of MCP. Anything done with MCP could have an equivalent done with a skill.
Great article, and what I would expect from someone inspecting the hype instead of jumping in head first just because influencers (paid or unpaid) are screaming for engagement after a large X account posted their opinions.
This is one of the first posts I've seen that cuts through the hype around both MCPs and CLIs with nuanced findings.
There were times when using MCPs didn't make sense (such as connecting to a database), and suddenly generating CLIs for everything doesn't make sense at all. Much of it seems like a solution in search of a problem, built on top of a bad standard.
But no-one could answer "who" was the customer of each of these, which is why the hype was unjustified.
Yet another problem with MCP: every LLM harness that does support it at all supports it poorly and with bugs.
The MCP spec allows MCP servers to send back images to clients (base64-encoded, in a defined JSON shape). However:
1) codex truncates MCP responses, so it will never receive images at all. This bug has been in existence forever.
2) Claude Code CLI will not pass those resulting images through its multi-modal visual understanding. Indeed, it will create an entirely false hallucination if asked to describe said images.
3) No LLM harness can deal with you bouncing your local MCP server. All require you to restart the harness. None allow reconnection to the MCP server.
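For context on points 1 and 2: the image payload in question is just base64 wrapped in JSON. A sketch of the content shape, field names per my reading of the MCP spec's image content type (the JSON-RPC envelope is elided):

```python
import base64

def image_content(png_bytes: bytes) -> dict:
    """Build an MCP-style image content item for a tool result."""
    return {
        "type": "image",
        "data": base64.b64encode(png_bytes).decode("ascii"),
        "mimeType": "image/png",
    }
```

Base64 inflates the payload by ~33%, so even a modest screenshot becomes a large response body, which is exactly the kind of message a harness that truncates MCP responses will mangle.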
I assure you there are many other similar bugs, whose presence makes me think that the LLM companies really don't like MCP and are buggily deprecating it.
The fundamental proposal here is that, despite being bad, MCP is the correct choice for the enterprise because:
> Organizations need architectures and processes that start to move beyond cowboy, vibe-coding culture to organizationally aligned agentic engineering practices. And for that, MCP is the right tool for orgs and enterprises.
…but, you can distill this to: the “cowboys” are off MCP because they've moved to yolo openclaw, where anything goes and there are no rules, no restrictions and no auditing.
…but that's a strawman from the twatter hype train.
Enterprises are not adopting openclaw.
It’s not “MCP or Openclaw”.
That's a false dichotomy.
The correct question is: has MCP delivered the actual enterprise value and actual benefits it promised?
Or, were those empty promises?
Does the truly stupid MCP UI proposal actually work in practice?
Or, like the security and auditing, is it a disaster in practice, which was never really thought through carefully by the original authors?
It seems to me that vendors are increasingly determining that controlled AI integrations with RBAC are the correct way forward, but MCP has failed to deliver that.
That's why MCP is dying off.
…because an open plugin ecosystem gives you broken crap like the Atlassian MCP server, and a bunch of dubious 3rd-party hacks.
That's not what enterprises want, for all the reasons in the article.
I’m struggling to understand the recent wave of backlash against MCP. As a standard, it elegantly solves a very real set of integration problems without forcing you to buy into a massive framework.
It provides a unified way to connect tools (whether local via stdio or remote via HTTP), handles bidirectional JSON-RPC communication natively, and forces tools to be explicit about their capabilities, which is exactly what you want for managing LLM context and agentic workflows.
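For a sense of how small the standard's surface actually is: the wire format is plain JSON-RPC 2.0. A sketch of a `tools/call` request, with an invented tool name and arguments:

```python
import json

# A JSON-RPC 2.0 request invoking a tool via MCP's tools/call method.
# "get_weather" and its arguments are made up for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Berlin"}},
}
wire = json.dumps(request)
```

The same message works over stdio or HTTP, which is most of what the "unified way to connect tools" amounts to in practice.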
This current anti-MCP hype train feels highly reminiscent of the recent phase where people started badmouthing JSON in favor of the latest niche markup language. It's just hype-driven contrarianism trying to reinvent the wheel.
I don't even fully understand what people are suggesting instead. That we use CLI tools for everything? There are lots of things I do and tools I use that a CLI would be very inefficient for interacting with.