My 21 AI Agents Aren’t Allowed to Talk to Each Other. That’s Why It Works.

There are twenty-one of them, and not one is allowed to speak to another.

That should sound like a bug. For most of this year the future of AI was sold to me as agents that collaborate, that hand work off to each other like a relay team, that negotiate and chain together into something resembling a conversation between robots. I built the opposite. Twenty-one small, single-purpose agents that share a building and never once have a conversation. It is the reason the whole thing runs for the price of a few lunches, and the reason it has never, not once, fallen over.

I built it to run sales for a company I operate alone, across 14 markets and 5 languages. What it closed and why I eventually switched it off is a different story, and I told it over here. This piece is about the machine itself. How it is wired, why each strange decision was the right one, and what building it taught me about the thing almost everyone is getting wrong about AI agents in 2026.

Key takeaways

The hard part of building an AI workforce isn’t the model. It’s the systems engineering around it.

Twenty-one single-purpose agents, each good at one job and ignorant of the rest. Specialists you can test and reason about beat one clever generalist that becomes a black box the first time it surprises you.

The agents never call each other. They share state, not conversation, so any one of them can crash and restart without taking the others down.

Cost is a design output, not luck: cheap models for high-volume scoring, expensive models only for writing, and a free pre-filter that drops about 85% of the work before a single paid call. Total spend across its life: under $20, across 3,228 calls to the model.

A human approves every action. There is no autonomous mode anywhere in the code.

Twenty-One Specialists, Each Gloriously Ignorant

Picture an org chart, not a program.

There is an agent that reads a contact and assigns it a lane and a temperature. One that takes the hot and warm leads and builds the day into a real calendar. One that writes the email, in whichever of five languages the other person actually uses. One that reads every reply as it lands and sorts it by intent. A pair that go hunting for businesses that never filled out a form. And underneath all of that, a quiet maintenance crew whose only job is keeping the data honest: merging duplicate contacts, filling in missing details, reconnecting orphaned records, flagging the deals that have gone cold.

Twenty-one in all, in three crews. The ones that sell, the ones that hunt, and the ones that clean. None of them tries to be the whole salesperson. The agent that scores a lead cannot write an email. The agent that writes the email cannot score a lead. That ignorance is the design, not a flaw in it.

A specialist is something you can test in isolation, swap out, and reason about when it misbehaves. A generalist that does everything is a black box the first time it surprises you, and it will surprise you. I would rather have twenty-one things I understand than one thing I have to trust.

Twenty-one single-purpose agents in three crews, coordinating only through a shared store.

The Rule: No Agent Talks to Another

The fashionable architecture right now is agents that talk. One calls the next, hands off, waits for the reply, and they chain together into a little pipeline of conversation. It demos beautifully. It also fails like a string of cheap Christmas lights, where one dead bulb takes the whole line down with it.

Mine share a destination instead of a conversation. The scorer writes its verdict into the CRM. Later, completely independently, the planner reads that verdict back. The two never meet. There is no handoff, no waiting, no fragile chain where step four dies the moment step three hiccups. Every agent does its one job against the shared record and walks away.

The payoff shows up in how the system fails. Any agent can crash, throw an error, or get killed mid-run, and the rest never notice, because nothing was waiting on it. Start it again tomorrow and it picks up from the last thing it finished. Independent systems degrade. Monolithic ones collapse. I wanted the kind that loses a limb and keeps walking, not the kind that flatlines because one function somewhere timed out.

The Death Star was the most powerful object in the galaxy, and a single torpedo down the right shaft ended it. Any system built on one critical dependency is that station. Twenty-one agents that don’t depend on each other have no shaft to hit.

They share a destination, not a conversation, so the system degrades instead of collapsing.

The Cheapest Call Is the One You Never Make

Then there is the bill, the part nobody renting you a seat at $890 a month wants to talk about. The entire system has cost about $17 across its whole life. Three thousand two hundred and twenty-eight calls to the model, a little over half a cent each. That number is not luck. It is three decisions.

The first is routing by job. Scoring a thousand contacts is high-volume, low-judgment work, so it runs on a small, fast, cheap model. Writing an email a stranger will actually read is the opposite kind of work, so that runs on a stronger, pricier one. Most agent tools pick a single model and pay the premium on everything. I pay it only where the writing has to be good.

The second is a filter that runs for free. Before a single paid call, a plain set of rules and a quick web check throw out roughly 85% of the work: the dead domains, the junk, the contacts already in the system, the throwaway addresses with no company behind them. Only the genuinely uncertain leads ever reach a model at all.

The third is memory. The system refuses to redo finished work, and it looks at a contact only when something has actually changed since the last time. That sounds trivial. It is load-bearing, and choosing the wrong signal for "has anything changed" once turned my money-saver into an outage that never threw an error. I learned that one the hard way, which is the only way these lessons ever arrive.

The cheapest model call is the one you never make.

Skynet’s Real Problem Was Never Intelligence

There is no autonomous mode. Not because I ran out of time to build one. Because there is no code path anywhere that writes to the CRM or reaches a real person without my explicit yes. Every score, every new contact, every drafted email lands in an approval queue first and waits for me.

We were all told the same horror story, and we keep building toward it anyway. Skynet’s whole problem was never that it was smart. It was that nobody had to press the button. An AI acting on the world at machine speed, on imperfect information, with no human standing between the decision and the consequence, is not a productivity tool. It is a liability with good marketing. The human in the loop is not the feature I haven’t gotten around to removing. It is the feature.

The human in the loop isn’t a missing feature. It’s the architecture.

What It Actually Proved

I want to be honest about what this is and isn’t, because the alternative is the kind of overclaim that makes you distrust everything else I just said.

It is not a magic cold-outreach machine that prints money out of strangers. It is closer to a tireless operations team for a book of business that already existed, roughly 2,800 contacts sitting in a CRM nobody had the hours to read. It surfaced the ones worth a call and kept the rest from quietly rotting. That is a smaller claim than the brochure version, and a truer one.

And it was built like real software, not a weekend hack. Twelve rounds of write-the-spec, then write-the-plan, then build, across about seven weeks, each round with a deadline and a short list of things it deliberately would not do. A log of every decision it has ever made. State written to disk so it survives a crash. The interesting part was never a clever prompt. It was the same boring discipline that makes any system you can actually trust.

We were sold a dream about an AI smart enough to replace the person. The one I built does something quieter, and I think better. It is not a genius in a box. It is twenty-one specialists who don’t talk, can’t act alone, and cost less than lunch to run, arranged by someone who decided what each one was for. The leverage was never the intelligence of the model. It was the architecture around it, and the architecture was mine.

Frequently asked questions

What is a multi-agent AI system?
It is a system that splits a large job across many small, specialized AI agents instead of asking one model to do everything. In this case, 21 agents each own a single step (scoring, planning, drafting, cleanup) and coordinate through shared data rather than by talking to each other.

Why don’t the agents talk to each other?
So the system fails gracefully instead of all at once. When agents share state through a common record rather than calling each other directly, any agent can crash or be restarted without affecting the rest. There is no chain to break.

How does it run for so little money?
Three things: a cheap, fast model for high-volume scoring and a stronger model only where writing quality matters; a free rule-based filter that removes about 85% of the work before any paid call; and a memory that refuses to reprocess a contact that hasn’t changed.

Does it act on its own?
No. There is no autonomous mode in the code. Every score, contact, and email passes through an approval queue and waits for a human yes before anything reaches the CRM or a real person.

What did it actually accomplish?
The business results, including revenue and why I eventually turned it off, are covered in Meet Albert. This piece is about the architecture behind those results.

→ I Built an App in Four Days and Sold It in One

Post Views: 0