The Agentic Moment
What's actually happening in the AI agent landscape — and why a security researcher decided to run the most dangerous option
The question I kept getting over the past few weeks, as I stood up a production multi-agent AI system for my own business, was some version of: “Why would you do that?”
The short answer: I’ve been in IT and cybersecurity for 34 years, I’m a CISSP, and I’m currently a doctoral candidate at George Washington University researching AI security — specifically how AI systems fail under adversarial conditions. I run Jacobian Engineering, an employee-owned MSSP serving healthcare, SaaS, and government contractor clients.
It’s a fair question. OpenClaw — the open-source AI agent framework that went viral earlier this year — has a documented history of security vulnerabilities. CVE-2026-25253 was a critical remote code execution flaw. Researchers found over 40,000 publicly exposed instances within weeks of its viral surge. The community skills registry was hit by a supply chain attack: over 800 compromised packages deployed before detection. Palo Alto Networks called OpenClaw a “lethal trifecta” of risks. One of its own maintainers said publicly: if you don’t know how to use a command line, this project is too dangerous for you to use safely.
He’s not wrong. And I built it anyway.
The reason is the same reason I’ve spent my career doing security work rather than work adjacent to it: you cannot meaningfully advise organizations on a technology you haven’t operated. You can read the CVEs, review the architecture, and produce a risk assessment. But until you’ve stood up the system, debugged its failure modes, and built the controls around it, you’re advising from the outside. My doctoral research examines how AI systems fail under adversarial conditions. My professional work is helping organizations adopt technology safely within compliance frameworks. For both, I needed to know what running this actually looks like.
What follows is that account — five posts covering the landscape, the choices, the setup, the moment the agents started coordinating, and what it means for teams thinking seriously about this.
The Landscape as It Actually Stands
The public coverage of AI agents has been split between breathless enthusiasm and legitimate alarm. Both are real. Neither is sufficient for making a good adoption decision.
The analysis I’ve found most useful frames it as a three-axis evaluation rather than a single “how much control” spectrum. The three axes are:
Where does the agent run? Local (your machine), cloud (their servers), or hybrid. This determines your data privacy posture, your security surface area, and who owns the consequences when something goes wrong. Local means your data never leaves your infrastructure — and you own the security entirely. Cloud means someone else handles the infrastructure — and you trust them with everything the agent sees. For organizations in regulated industries, this is not a preference question. Healthcare, financial services, and government contractors have compliance postures that may rule out certain deployment models before the feature comparison begins.
Who orchestrates the intelligence? Single-model, multi-model, or model-agnostic. A single-model system gives you consistency and simplicity. Multi-model systems give you optimized task routing. Model-agnostic systems give you maximum flexibility at the cost of configuration burden. The difference matters enormously for what you can actually build. More on this in post two.
What does the interface assume about the user? OpenClaw works through whatever messaging app you already use — WhatsApp, Telegram, Discord, Slack. That sounds like a feature. It’s also a design assumption that you can configure and manage the underlying system. Anthropic’s Dispatch uses a phone-to-desktop model, assuming a professional who wants to delegate but not configure. Perplexity Computer uses a web dashboard, assuming you want to describe an outcome and walk away. The interface isn’t cosmetic — it determines who can actually use the system, which is often more important than what the system can theoretically do.
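To make the three-axis framing concrete, here is a minimal sketch of how the evaluation might be encoded. The class and the helper function are my own illustration, not part of any product; the axis values for the two platforms come from the descriptions above.

```python
from dataclasses import dataclass

# Purely illustrative: one record per platform, one field per axis.
@dataclass(frozen=True)
class AgentPlatform:
    name: str
    runtime: str        # "local", "cloud", or "hybrid"
    orchestration: str  # "single-model", "multi-model", or "model-agnostic"
    interface: str      # what the interface assumes about the user

openclaw = AgentPlatform(
    "OpenClaw", "local", "model-agnostic",
    "messaging app; assumes you can configure the underlying system",
)
perplexity = AgentPlatform(
    "Perplexity Computer", "cloud", "multi-model",
    "web dashboard; assumes you describe an outcome and walk away",
)

def data_stays_local(p: AgentPlatform) -> bool:
    """The first-axis question regulated industries ask before any feature comparison."""
    return p.runtime == "local"

print(data_stays_local(openclaw))    # True
print(data_stays_local(perplexity))  # False
```

The point of writing it down this way is that the first check runs before any capability comparison: for a healthcare or government contractor client, a `False` on the runtime axis can end the evaluation.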
The Strategic Plays
Five companies have made distinct bets on these axes.
OpenClaw owns the sovereignty position: local execution, model-agnostic, messaging-native. Maximum flexibility, maximum control, maximum risk. The political statement is explicit — the agent should belong to the user, full stop.
Perplexity Computer owns the opposite: cloud-managed, multi-model, outcome-oriented. At $200/month consumer or $325/seat enterprise, it bets that knowledge workers will pay for delegation — not “help me do this” but “do this for me.” Their sprint from consumer launch to enterprise product to iOS/Android to a local Mac mini variant in under three weeks tells you how seriously they’re taking the local-execution threat.
Meta Manus, a $2 billion acquisition, owns the consumer-distributed position: enough capability to be useful, enough guardrails to avoid headlines, deployed to three billion people through the largest social platform on earth.
Anthropic Dispatch owns the professional middle: local execution, managed safety, single-model, phone-based delegation. The “we’ll do the same thing, but properly” play. The implicit pitch: OpenClaw showed what people want. Dispatch gives it to them without the security nightmares.
And NVIDIA built NemoClaw — a security wrapper around OpenClaw itself. Jensen Huang compared it to how Red Hat made Linux enterprise-ready. The explicit acknowledgment that OpenClaw has become infrastructure, not just a product.
Why I’m Running the Dangerous One
I run OpenClaw in production, configured as a multi-agent team, connected to real business workflows, on infrastructure I manage. The reason is that model-agnostic, multi-agent orchestration is where the practical value for my work lives — and no other deployment model delivers it cleanly.
I’m running six agents, each on the model selected for its specific strengths. My product manager agent (Atlas) runs on GPT-5.4 because it scores 83% on GDPval — the benchmark for professional knowledge work — and its 1M context window lets it hold entire project states simultaneously. My content agent (Compass) runs on Claude Sonnet 4.6, because Sonnet 4.6 produces the best writing quality for brand-constrained content at the cost profile that makes sense. My UI/UX designer (Prism) runs on Gemini 3.1 Pro because it leads on multimodal reasoning and costs 7.5x less than Claude Opus for comparable reasoning depth. My doctoral research assistant (Scholar) also runs on Gemini 3.1 Pro because it leads on GPQA Diamond — expert-level science questions at 94.3% — and its 1M context window can process entire literature corpora in a single session.
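The routing described above reduces to a table mapping each agent to its model. The sketch below is my own illustration of that table — the agent names and model assignments mirror my setup, but the structure is hypothetical, not OpenClaw’s actual configuration schema.

```python
# Hypothetical agent-to-model routing table. Names and assignments mirror
# the setup described in the text; the format is illustrative only, not
# OpenClaw's real configuration schema.
AGENT_MODELS = {
    "Atlas":   {"role": "product manager",             "model": "GPT-5.4"},
    "Compass": {"role": "content",                     "model": "Claude Sonnet 4.6"},
    "Prism":   {"role": "UI/UX designer",              "model": "Gemini 3.1 Pro"},
    "Scholar": {"role": "doctoral research assistant", "model": "Gemini 3.1 Pro"},
}

def model_for(agent: str) -> str:
    """Resolve which model a named agent's requests should be routed to."""
    return AGENT_MODELS[agent]["model"]

print(model_for("Scholar"))  # Gemini 3.1 Pro
```

This is exactly the table a single-proxy design cannot express: when every request goes through one route to one model, there is nowhere for a per-agent lookup like this to live.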
NemoClaw routes all inference through a single proxy. One model, one route. For a basic assistant, fine. For what I’m building, it’s a hard architectural constraint.
I want to be precise about what “doing it right” means in this context. It means treating the agent deployment like a production web application — secrets management, service supervision, skill supply chain vetting, least-privilege access, network exposure controls, update hygiene. It means knowing that the skills registry was compromised and vetting every skill before installation. It means knowing that session-level model overrides persist across restarts through undocumented behavior. It means knowing that Slack bots ignore messages from other bots by default and that getting agents to communicate with each other requires a config flag that isn’t in the primary documentation.
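One concrete shape skill supply chain vetting can take is hash pinning: after reviewing a skill package, record its digest, and refuse to install anything that doesn’t match. The sketch below is an assumption-laden illustration of that practice — the allowlist, filenames, and placeholder digest are mine, and this is not a built-in OpenClaw mechanism.

```python
import hashlib
from pathlib import Path

# Illustrative vetting step: refuse any skill archive whose SHA-256 digest
# is not on a reviewed allowlist. The filename and digest are placeholders
# (this particular value is the SHA-256 of empty input); replace them with
# the digests of releases you have actually reviewed.
PINNED_HASHES = {
    "web-search.tar.gz":
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def is_vetted(archive: Path) -> bool:
    """Return True only if the archive's digest matches its pinned value."""
    digest = hashlib.sha256(archive.read_bytes()).hexdigest()
    return PINNED_HASHES.get(archive.name) == digest
```

A pinned digest would have caught the registry compromise described above for any already-reviewed skill: a swapped package fails the comparison even when its name and version look identical.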
There are organizations that should not be running OpenClaw. Non-technical teams without infrastructure staff. Organizations whose risk tolerance doesn’t cover the surface area of a locally-hosted agent with access to credentials for every connected system. Organizations in regulated industries that haven’t done the compliance analysis first.
There are also organizations that can run it correctly, and for those organizations the capability is genuinely substantial. I’ll describe it in the next four posts.
The team structure framework I’m drawing on here comes from Nate Jones’s analysis of the AI-era organization — worth reading in full.