// building with ai · 3.2

Sub-agents and skills
without the mess.

[Illustration: synthwave multi-agent operations map with specialist nodes, skill cartridges, handoff lanes, and an orchestrator panel.]

A multi-agent fleet is useful when one agent has too many jobs, too many tools, or too many rules to stay reliable. Sub-agents are specialist workers. Skills are reusable procedures they can follow. The trick is not adding more agents. The trick is giving each one a clear job, a clear owner, and a clean handoff back to the human or orchestrator.

What are sub-agents and skills?

Sub-agents are smaller specialist agents called by a main agent or workflow. Skills are reusable instructions or procedures an agent can apply across jobs. Use sub-agents for different responsibilities. Use skills for repeatable methods inside one responsibility.

When one agent stops being enough

One agent is usually better than a fleet at the start. It is easier to debug, cheaper to run, and simpler for the team to understand.

A fleet starts making sense when the agent is juggling separate jobs that need different context. For example, customer support, finance review, social drafting, and inventory checks should not all share one giant prompt.

| Use one agent when | Use sub-agents when |
| --- | --- |
| The workflow has one reviewer and one output. | The workflow needs different specialists with different sources. |
| The prompt still fits in one clean operating procedure. | The prompt has become a pile of exceptions and tool rules. |
| The same person can judge quality across the whole job. | Different humans own different parts of the workflow. |
| Tool permissions are simple and low risk. | Some tools are safe for one role and dangerous for another. |

The orchestrator pattern

The cleanest multi-agent design has an orchestrator. The orchestrator receives the task, decides which specialist should handle it, passes the right context, and collects the result.

The orchestrator should not do every job itself. That is how it turns into the same bloated general agent you were trying to avoid. Its job is routing, context packing, state tracking, and final assembly.

For business workflows, the orchestrator often lives inside the app or queue system instead of inside a chat window. A ticket arrives, the system classifies it, calls the right agent, saves the draft, and posts it for review.
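The routing-only orchestrator can be sketched in a few lines. This is a minimal illustration, not a production design: the specialist names and the keyword classifier are assumptions standing in for real agents and a real classification step.

```python
# Minimal orchestrator sketch: classify the task, route it to one specialist,
# and collect the result for review. The orchestrator never drafts anything itself.
# Specialist names and the toy keyword classifier are illustrative assumptions.

SPECIALISTS = {
    "support": lambda task: f"support draft for: {task}",
    "finance": lambda task: f"finance review of: {task}",
}

def classify(task: str) -> str:
    """Toy classifier; a real system would use a model or a rules engine."""
    return "finance" if "refund" in task.lower() else "support"

def orchestrate(task: str) -> dict:
    role = classify(task)
    result = SPECIALISTS[role](task)
    # Routing, context packing, and assembly only; the draft waits for human review.
    return {"role": role, "result": result, "status": "pending_review"}

outcome = orchestrate("Refund request for order 1042")  # routed to the finance specialist
```

The point of the sketch is the shape: the orchestrator's code path contains no domain logic, only routing and assembly.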

What belongs in a sub-agent

A sub-agent needs a stable responsibility. "Review refunds against policy" is a good responsibility. "Think about the customer" is not.

Give each sub-agent a tight source pack, a small tool set, a required output format, and a stop rule. The more specialist the agent, the easier it is to know if it failed.
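Those four ingredients can be written down as a small spec. The field names and the refund example below are assumptions for illustration; the point is that every sub-agent declares its responsibility, sources, tools, output format, and stop rule up front.

```python
from dataclasses import dataclass

# Sketch of a sub-agent contract: one stable responsibility plus a tight source
# pack, a small tool set, a required output format, and a stop rule.
# Field names and the example values are illustrative assumptions.

@dataclass(frozen=True)
class SubAgentSpec:
    responsibility: str        # one sentence, stable over time
    sources: tuple            # the only documents it may trust
    tools: tuple              # the only tools it may call
    output_format: str         # required shape of every result
    stop_rule: str             # when to hand back instead of guessing

refund_reviewer = SubAgentSpec(
    responsibility="Review refund requests against the current refund policy.",
    sources=("refund_policy_v3.md", "order_history"),
    tools=("order_lookup",),
    output_format="decision + cited policy clause + confidence",
    stop_rule="Escalate to a human if the policy does not cover the case.",
)
```

A spec this narrow makes failure obvious: if the output is missing a cited clause, the agent did not do its job.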

What belongs in a skill

A skill is a reusable method. It is not a separate worker. It is more like a playbook the agent can open when the situation calls for it.

Good skills include diagnosing a support ticket, writing an executive brief, checking a source against a policy, turning notes into a proposal section, or preparing a launch checklist. The same support agent might use different skills depending on the ticket.

If the responsibility changes, use a sub-agent. If the method repeats inside the same responsibility, use a skill.
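The split shows up naturally in code: one support role holding several named skills it opens per ticket. The skill names and toy bodies below are illustrative assumptions.

```python
# Sketch: skills are reusable methods attached to one responsibility, not new
# workers. The same support agent picks a different skill per ticket.
# Skill names and their toy implementations are illustrative assumptions.

SKILLS = {
    "diagnose_ticket": lambda ticket: f"diagnosis of: {ticket}",
    "write_exec_brief": lambda notes: f"brief from: {notes}",
}

def handle_ticket(ticket: str, skill_name: str) -> str:
    if skill_name not in SKILLS:
        raise ValueError(f"unknown skill: {skill_name}")
    return SKILLS[skill_name](ticket)
```

Adding a new method means adding a skill to the dictionary; only a new responsibility would justify a new agent.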

The split rule

Split by ownership, not by hype. If a human would assign the work to a different role, consider a sub-agent. If the same role just needs a procedure, use a skill.

Tool permissions should follow the role

Do not give every sub-agent every tool. That is the fastest way to turn a fleet into a risk multiplier.

A support drafter may need ticket history and order lookup. It probably does not need refund authority. A finance analyst may need read-only sales and cost data. It does not need the payment processor write API.

The permission model should be boring: read broadly where needed, write narrowly where proven, and require approval wherever money, customers, public content, or legal exposure is involved.
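That boring model fits in a small lookup: each role gets read grants, write grants, and a set of tools that always require approval. The role and tool names below are illustrative assumptions.

```python
# Sketch of role-scoped tool permissions: read broadly, write narrowly,
# require approval for anything risky. Role and tool names are assumptions.

PERMISSIONS = {
    "support_drafter": {"read": {"ticket_history", "order_lookup"},
                        "write": set(), "approval": set()},
    "finance_analyst": {"read": {"sales_data", "cost_data"},
                        "write": set(), "approval": set()},
    "refund_agent":    {"read": {"order_lookup"},
                        "write": set(), "approval": {"issue_refund"}},
}

def can_call(role: str, tool: str) -> str:
    grants = PERMISSIONS.get(role, {})
    if tool in grants.get("approval", set()):
        return "needs_human_approval"
    if tool in grants.get("read", set()) | grants.get("write", set()):
        return "allowed"
    # Default deny: a tool not explicitly granted to the role is off limits.
    return "denied"
```

The default-deny at the bottom is the whole safety story: a new tool does nothing until someone deliberately grants it to a role.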

Handoffs need a contract

Sub-agent handoffs fail when one agent sends a messy paragraph and the next agent guesses what matters. Use a structured handoff.

A good handoff includes task, context used, decision made, confidence, unresolved questions, citations or source references, and the next recommended action. That makes the next agent less likely to invent missing context.

| Handoff field | Why it matters |
| --- | --- |
| Task | The next agent knows what problem was being solved. |
| Sources used | A reviewer can check whether the answer came from trusted data. |
| Confidence | Low-confidence outputs route to human review faster. |
| Open questions | The next step is explicit instead of guessed. |
| Recommended action | The workflow keeps moving without hiding uncertainty. |
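The handoff contract can be made literal as a typed record plus a routing rule. The field names track the list above; the 0.7 threshold and example values are assumptions.

```python
from dataclasses import dataclass

# Structured handoff sketch: the next agent gets a typed record, never a messy
# paragraph. Field names follow the contract above; the threshold is an assumption.

@dataclass
class Handoff:
    task: str
    sources_used: list
    decision: str
    confidence: float          # 0.0 to 1.0
    open_questions: list
    recommended_action: str

def route(h: Handoff, threshold: float = 0.7) -> str:
    # Low confidence or unresolved questions go to a human, not the next agent.
    if h.confidence < threshold or h.open_questions:
        return "human_review"
    return "next_agent"

h = Handoff(
    task="Review refund request for order 1042",
    sources_used=["refund_policy_v3.md"],
    decision="Approve partial refund",
    confidence=0.55,
    open_questions=["Was the item returned?"],
    recommended_action="Confirm return status before refunding",
)
```

Because the record is typed, a missing field fails loudly at construction time instead of silently downstream.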

The failure mode: agents stepping on each other

The classic fleet failure is shared ownership. Two agents edit the same file. Two agents update the same customer note. One agent closes a task while another is still researching it.

Fix that with ownership boundaries. One writer per artifact. One final approver. One source of truth. If two agents need to contribute, they write separate sections or comments, then a coordinator assembles the final version.

Agents should never fight over state. State belongs in a database, queue, ticket, document, or workflow engine the system controls.
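One-writer-per-artifact is easy to enforce in the store itself: contributors write named sections, and only the first agent to write the final version owns it. This is a toy in-memory sketch; a real system would put the same rule in a database or workflow engine.

```python
# Sketch: state lives in a store the system controls, with exactly one writing
# owner per final artifact. Other agents contribute named sections only.
# The store, method names, and agent names are illustrative assumptions.

class ArtifactStore:
    def __init__(self):
        self._owners = {}     # artifact -> owning agent
        self._sections = {}   # artifact -> {agent: section text}
        self._final = {}      # artifact -> final text

    def contribute(self, artifact: str, agent: str, section: str) -> None:
        # Any agent may add its own named section; it never edits the final draft.
        self._sections.setdefault(artifact, {})[agent] = section

    def write_final(self, artifact: str, agent: str, text: str) -> None:
        # First writer becomes the owner; everyone else is rejected loudly.
        owner = self._owners.setdefault(artifact, agent)
        if owner != agent:
            raise PermissionError(f"{artifact} is owned by {owner}, not {agent}")
        self._final[artifact] = text

store = ArtifactStore()
store.contribute("support_draft_17", "policy_checker", "Clause 4 applies.")
store.write_final("support_draft_17", "support_drafter", "Hi - per clause 4, ...")
```

A second agent calling `write_final` on the same artifact raises instead of silently clobbering, which is exactly the failure the ownership boundary exists to prevent.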

How to roll out a fleet safely

Do not go from one agent to twelve. Add the second agent when the first agent has clean logs and stable correction patterns. Add the third when a real responsibility split appears.

  1. Start with one agent and one workflow.
  2. Extract repeat procedures into skills.
  3. Split a sub-agent only when the role boundary is obvious.
  4. Add structured handoffs and logging before adding more agents.
  5. Review the fleet weekly for duplicate ownership and stale prompts.

A practical fleet map

Picture a sales intake workflow. The orchestrator receives a new form submission. It sends the company website and form data to a research sub-agent. It sends the pain points to a solution-mapping sub-agent. It sends the budget and timeline to a qualification sub-agent.

Each specialist returns a structured result. The orchestrator assembles the brief and posts it for a human to review before any prospect email goes out.

That is a useful fleet because the roles are distinct. Research is not qualification. Qualification is not proposal writing. Proposal writing is not final approval.
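The intake fleet above maps directly onto parallel calls: the three specialists run concurrently, and assembly waits for all of them. The function names and toy return values are assumptions standing in for real agent calls.

```python
import asyncio

# Sketch of the sales-intake fleet: research, solution mapping, and qualification
# run in parallel; final assembly waits for all three and a human reviews the brief.
# Function names, fields, and the budget rule are illustrative assumptions.

async def research(website: str) -> dict:
    return {"role": "research", "summary": f"notes on {website}"}

async def map_solution(pains: list) -> dict:
    return {"role": "solution", "fit_signals": len(pains)}

async def qualify(budget: int, timeline: str) -> dict:
    return {"role": "qualification", "qualified": budget >= 10_000}

async def intake(form: dict) -> dict:
    results = await asyncio.gather(
        research(form["website"]),
        map_solution(form["pains"]),
        qualify(form["budget"], form["timeline"]),
    )
    # Assembly happens only after every specialist returns.
    return {"brief": results, "status": "pending_review"}

brief = asyncio.run(intake({
    "website": "example.com", "pains": ["churn"],
    "budget": 20_000, "timeline": "Q3",
}))
```

`asyncio.gather` preserves call order in its results, so the orchestrator knows which slot holds which specialist's output when it assembles the brief.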

Shared memory can become shared confusion

Teams often want one shared memory that every agent can read and write. That sounds efficient until bad information spreads across the fleet.

Use shared memory for stable facts: brand rules, current pricing, approved policies, product catalog, and current workflow state. Be more careful with opinions, rough notes, old research, and half-finished drafts.

If an agent writes to shared memory, log it. If another agent depends on that memory, show the source. Memory without provenance becomes a rumor system.
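Provenance is cheap to attach at write time. In this sketch, every shared-memory entry carries who wrote it, where it came from, and when; the structure and names are assumptions.

```python
import time

# Sketch: every shared-memory write is logged with its author and source, so a
# downstream agent can show provenance instead of spreading rumors.
# The class shape and field names are illustrative assumptions.

class SharedMemory:
    def __init__(self):
        self._facts = {}

    def write(self, key: str, value: str, agent: str, source: str) -> None:
        # The fact and its provenance are stored together, never separately.
        self._facts[key] = {
            "value": value, "agent": agent, "source": source, "ts": time.time(),
        }

    def read(self, key: str):
        entry = self._facts[key]
        # Readers always get the provenance alongside the fact.
        return entry["value"], f"{entry['agent']} via {entry['source']}"

mem = SharedMemory()
mem.write("pricing", "$49/mo", "pricing_agent", "pricing_page_2025")
value, provenance = mem.read("pricing")
```

Because `read` returns provenance with every fact, an agent that repeats a stale price can be traced back to the write that introduced it.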

Cost and latency matter

Multi-agent systems can get expensive fast because every specialist call costs tokens and time. A fleet that calls six agents for a task one agent could handle is not architecture. It is waste.

Use cheaper models for simple classification, extraction, cleanup, and formatting. Save stronger models for judgment-heavy drafting, conflict resolution, and final synthesis.

Also decide which agents can run in parallel. Research, qualification, and data cleanup can often happen at the same time. Final assembly should wait until the pieces are done.
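Model tiering can be a plain lookup before any call goes out. The task categories follow the split above; the tier names are placeholders, not real model identifiers.

```python
# Sketch: route simple mechanical work to a cheap model and judgment-heavy work
# to a strong one. Categories follow the article; tier names are placeholders.

CHEAP_TASKS = {"classification", "extraction", "cleanup", "formatting"}
STRONG_TASKS = {"drafting", "conflict_resolution", "final_synthesis"}

def pick_model(task_type: str) -> str:
    if task_type in CHEAP_TASKS:
        return "small-fast-model"
    if task_type in STRONG_TASKS:
        return "large-reasoning-model"
    # Refusing unknown task types keeps spend decisions explicit.
    raise ValueError(f"unrouted task type: {task_type}")
```

Failing on unknown task types is deliberate: every new task has to be consciously assigned a tier, which keeps cost from drifting upward by default.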

Naming and ownership

Multi-agent systems need boring names. Cute names are fine for internal culture, but the architecture should still expose the role: support drafter, policy checker, finance analyst, social editor, inventory watcher.

Each agent should have an owner. The owner decides the source material, reviews failures, approves prompt changes, and knows when to pause the agent. Without ownership, the fleet rots quietly.

Ownership also applies to files, records, and outputs. One agent owns the support draft. Another may contribute policy notes, but it does not edit the final support draft directly. That keeps the work auditable.

The weekly fleet review

A fleet needs a weekly review while it is young. Look at cost, latency, failure categories, duplicated work, stale sources, rejected drafts, and tool errors.

The review should ask one blunt question: which agent made the business meaningfully better this week? If an agent is not saving time, improving decisions, or reducing misses, it needs a fix or a pause.

Also watch for prompt drift. Agents accumulate exceptions. At some point, the prompt is not an operating procedure anymore. It is a junk drawer. When that happens, split the role, extract a skill, or rewrite the prompt from the logs.

How skills should be versioned

A skill should have a version, owner, and change note. If a skill changes the way a proposal is written or a ticket is diagnosed, you need to know which outputs used which version.

Versioning does not have to be heavy. A simple filename, frontmatter block, or database row can work. The point is traceability.

When a skill performs badly, roll back the skill instead of rewriting the whole agent. That is one reason skills are worth separating from the main role prompt.
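Lightweight versioning can be as simple as a frontmatter block at the top of the skill file. The format, field names, and parser below are assumptions; a database row works just as well.

```python
# Sketch: a skill file with a frontmatter block carrying version, owner, and a
# change note. The format and parser are illustrative assumptions.

SKILL_FILE = """\
---
name: write_exec_brief
version: 3
owner: ops-lead
change_note: Shortened the summary section to five bullets.
---
1. Open with the decision being asked for.
2. List the three strongest supporting facts with sources.
"""

def parse_skill(text: str):
    # Split on the frontmatter fences: leading empty chunk, metadata, body.
    _, frontmatter, body = text.split("---\n", 2)
    meta = dict(line.split(": ", 1) for line in frontmatter.strip().splitlines())
    return meta, body.strip()

meta, body = parse_skill(SKILL_FILE)
```

With the version in the file itself, rolling back a bad skill is a file revert, and every logged output can record which skill version produced it.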

How to turn this into a project brief

If this topic is moving from article to build, write the project brief before picking tools. The brief should fit on one page. If it cannot, the scope is probably still too wide.

Use five fields: workflow, owner, sources, allowed actions, and proof. The workflow names the repeat job. The owner names the human reviewer. The sources name the systems and documents the agent may trust. The allowed actions name what the agent can read, draft, update, or never touch. The proof names the metric that decides whether the build worked.

This keeps the build tied to business work. Agents fail when they become an abstract technology project. They work when the job, reviewer, sources, permissions, and proof are clear before code starts.
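The five-field brief can even be checked mechanically before a build starts. The field names follow the article; the example values and the validator are illustrative assumptions.

```python
# Sketch: a one-page project brief as five required fields, validated before any
# tooling decision. Field names follow the article; values are example assumptions.

REQUIRED_FIELDS = ("workflow", "owner", "sources", "allowed_actions", "proof")

def validate_brief(brief: dict) -> list:
    """Return the names of missing or empty fields; empty list means build-ready."""
    return [f for f in REQUIRED_FIELDS if not brief.get(f)]

brief = {
    "workflow": "Draft replies to inbound support tickets.",
    "owner": "Support team lead",
    "sources": ["help_center", "ticket_history"],
    "allowed_actions": {"read": ["tickets"], "draft": ["replies"], "never": ["send"]},
    "proof": "Median first-response time drops without quality regressions.",
}
missing = validate_brief(brief)
```

If `missing` is non-empty, the scope conversation is not finished, and the build should not start.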

Frequently asked questions

What is a sub-agent?

A sub-agent is a specialist AI agent called by a main workflow or orchestrator. It handles one defined responsibility with its own context, tools, output format, and stop rules.

What is an AI agent skill?

A skill is a reusable procedure an agent can follow. It is useful when the same role needs a repeat method, such as writing a brief, checking a policy, or triaging a ticket.

When should I use multiple agents?

Use multiple agents when one agent has too many responsibilities, tool permissions, or source packs to stay reliable. Do not add agents just because the architecture sounds impressive.

Do multi-agent systems need an orchestrator?

Most production multi-agent systems need some orchestration. The orchestrator routes tasks, packages context, tracks state, and collects results for review.

What is the biggest multi-agent risk?

The biggest risk is unclear ownership. If multiple agents can edit the same artifact, update the same record, or make the same decision, the fleet becomes hard to debug.

Need the fleet mapped before it becomes spaghetti?

The intake gives us your roles, tools, workflows, and approval needs. From there, we can decide whether you need one agent, a few skills, or a real fleet.

Start the intake →