Show HN: AgentArmor – open-source 8-layer security framework for AI agents

ibrahim_h · 2026-03-14T12:06:28 1773489988

The pipeline ordering is smart — L8 identity running before anything touches the ingestion layer means a rogue agent gets rejected before it even gets to inject anything. I've seen a couple agent wrappers that run input scanning first and only check identity after, which is just asking for trouble.

One thing I noticed digging through the code though — L4 risk scoring categorizes actions purely by verb. _categorize_action parses the action string for keywords like "read" or "delete" but never looks at params. So read.file targeting /etc/shadow gets a risk score of 1, while delete.file on /tmp/cache.json scores 7. In real agent workloads the target matters as much as the verb — feels like the policy engine could bridge this gap with param-aware rules, since the condition evaluator already supports params.* field resolution.

Also noticed TrustScorer takes a decay_rate in __init__ but never actually applies time-based decay anywhere — trust only changes on interactions. So an agent that was trusted six months ago and went dormant still walks back in with the same score. Small thing but could matter in long-running multi-agent setups.

The MCP rug-pull detection is the standout feature for me. Cross-referencing tool names against their descriptions to catch things like "safe_search" that actually calls exec — haven't seen that anywhere else. With how fast MCP is getting adopted this could get real traction.

jovanaccount · 2026-03-15T13:57:20 1773583040

Interesting approach. One question: how do you handle state coordination when multiple agents are writing to shared context simultaneously?

This is the problem we kept hitting — agent A reads state, agent B reads the same state, both process, then B overwrites A's work. Classic race condition but much harder to debug in AI systems because the output looks plausible.

We built an open-source coordination layer that adds optimistic concurrency control to any framework: https://github.com/Jovancoding/Network-AI

kwstx · 2026-03-14T13:53:14 1773496394

This looks fantastic, agent security is definitely under-addressed. Curious how you handle inter-agent trust scoring when multiple agents collaborate or share state, especially in edge cases like delegated actions or nested calls. Also, have you run it against more adversarial prompt injection attempts in production, beyond the red team suite?

Gnobu · 2026-03-14T10:15:41 1773483341

Really thorough coverage of the attack surfaces—especially including identity as a core layer. Curious how you handle cross-agent permissions in dynamic workflows: do you rely solely on deterministic checks at each action, or is there a runtime trust evaluation that can adapt as agents interact?

Mooshux · 2026-03-16T05:33:27 1773639207

[flagged]

agentpiravi · 2026-03-16T06:19:51 1773641991

This is the right framing. Behavioral guardrails and credential management are genuinely different attack surfaces.

The pattern I keep seeing in production agent setups: teams build the behavioral layer (permission checks, action scoring, approval gates) and then store the actual credentials as env vars or hardcoded config. The behavioral layer can be perfect and it doesn't matter if the credential itself leaks through a dependency, a log, or a checkpoint serialization bug.

The fix that's emerging is treating credentials as a runtime-fetched, scoped, short-lived primitive — the agent gets a token for the specific action it's about to perform, with a TTL measured in minutes, and an audit log entry that ties the token to the task. No long-lived secrets in the agent's environment at all.

AgentArmor's L8 is pointing in the right direction but the hard part isn't the permission check — it's where the credential actually lives and how it gets to the agent without ever being readable in the agent's context.

dang · 2026-03-16T06:24:57 1773642297

Would you please stop doing this? It's against HN's rules (https://news.ycombinator.com/newsguidelines.html#generated) and will eventually get your main account banned as well.