Subagents — Parallel AI Workstreams
Key Points
- ✓ Isolated context
- ✓ Parallel execution
- ✓ Shared skill library
- ✓ Resource management
How It Works
1. Spawn: '/spawn research-bot'
2. Each subagent is independent
3. Share skills via a common library
4. Close subagents when done
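The lifecycle above can be sketched in plain Python. This is a hypothetical stand-in, not the Hermes API: the `Subagent` class and its methods are illustrative placeholders for spawning an isolated session, running it, and closing it.

```python
from concurrent.futures import ThreadPoolExecutor

class Subagent:
    """Hypothetical stand-in for an isolated Hermes session."""
    def __init__(self, name):
        self.name = name
        self.context = {}          # isolated: not shared with sibling agents

    def run(self, task):
        # A real subagent would do inference and tool calls here.
        self.context["task"] = task
        return f"{self.name}: done ({task})"

    def close(self):
        self.context.clear()       # step 4: close when done

# Steps 1-2: spawn independent subagents; step 3 would share a skill library.
agents = [Subagent("research-bot"), Subagent("draft-bot")]
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda a: a.run("write report"), agents))
for a in agents:
    a.close()
print(results)
```

Each agent carries its own `context` dict, mirroring the isolation described below: nothing one subagent writes is visible to another.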
Real-World Use Cases
Parallel Research and Writing
Spawn three subagents: one researches the topic, one drafts the outline, and one reviews existing content for gaps. The orchestrator collects all three outputs and synthesizes the final document, roughly three times faster than working sequentially.
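A minimal sketch of this fan-out/fan-in pattern, with three hypothetical helper functions standing in for the research, outline, and gap-review subagents:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for three spawned subagents.
def research(topic):
    return f"sources on {topic}"

def outline(topic):
    return f"outline for {topic}"

def gap_review(topic):
    return f"gaps in existing {topic} content"

def orchestrate(topic):
    # Fan out: each role runs in parallel with isolated state.
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(f, topic) for f in (research, outline, gap_review)]
        parts = [f.result() for f in futures]
    # Fan in: the orchestrator synthesizes the collected outputs.
    return " | ".join(parts)

print(orchestrate("subagents"))
```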
Concurrent API Integration Testing
Test multiple API endpoints simultaneously. Each subagent handles a different endpoint with isolated state, preventing test interference. Results aggregate back to a single summary report.
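One way to picture the isolated-state property: each check gets a fresh state dict, so no test can interfere with another. The endpoints and `check_endpoint` helper below are hypothetical, simulated rather than hitting a real API.

```python
import asyncio

# Hypothetical stand-in: each "subagent" checks one endpoint with its own
# isolated state dict, so concurrent tests cannot interfere.
async def check_endpoint(name, state):
    state["calls"] = state.get("calls", 0) + 1   # private per subagent
    await asyncio.sleep(0)                        # yield, as real I/O would
    return {"endpoint": name, "ok": True}

async def run_suite(endpoints):
    tasks = [check_endpoint(e, {}) for e in endpoints]  # fresh state each
    results = await asyncio.gather(*tasks)              # run concurrently
    # Aggregate back into a single summary report.
    return {r["endpoint"]: r["ok"] for r in results}

report = asyncio.run(run_suite(["/users", "/orders", "/health"]))
print(report)
```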
Multi-Region Deployment Verification
Deploy to multiple regions in parallel. Each subagent SSHes into a different server, runs the deployment, verifies it, and reports status. The orchestrator waits for all confirmations before marking the deployment complete.
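The wait-for-all-confirmations step can be sketched as below. The `deploy` function and region names are illustrative placeholders; a real subagent would SSH in and run the actual deployment.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical deploy step; a real subagent would SSH in and verify here.
def deploy(region):
    verified = True  # stand-in for post-deploy verification
    return {"region": region, "status": "ok" if verified else "failed"}

regions = ["us-east-1", "eu-west-1", "ap-south-1"]
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(deploy, r) for r in regions]
    statuses = [f.result() for f in as_completed(futures)]

# Mark the deployment complete only when every region has confirmed.
complete = all(s["status"] == "ok" for s in statuses)
print(complete)
```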
Data Pipeline Parallelization
Split large datasets across subagents for parallel processing. Zero-context-cost pipelines via execute_code mean subagents can crunch data without the context window overhead of a single large session.
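A sketch of the split-process-merge pattern, assuming the work is a simple per-chunk computation. A real Hermes pipeline would hand each chunk to a subagent via execute_code; here threads stand in for the workers.

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # Stand-in for a subagent crunching its slice of the dataset.
    return sum(x * x for x in chunk)

def parallel_pipeline(data, workers=4):
    # Split the dataset into one chunk per worker...
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # ...fan out the chunks, then merge the partial results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunks))

total = parallel_pipeline(list(range(1_000)))
print(total)
```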
Under the Hood
Subagents in Hermes are not threads or coroutines — they are fully isolated Hermes sessions with their own context windows, tool access, terminal backends, and Python RPC namespaces. This isolation prevents context leakage: a subagent working on database migrations can't accidentally see or modify the context of a subagent doing security audits. The orchestrator spawns them with a task specification and optional shared context, then receives structured results when they complete.
The spawn model supports both fire-and-forget (spawn and receive results asynchronously) and blocking (wait for all subagents before continuing). Programmatic Tool Calling collapses multi-step tool chains into single inference calls within a subagent, dramatically reducing latency for compute-intensive pipelines. Subagents share the parent's skill library by default, so skills created in the main session are immediately available in spawned workers.
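The two spawn modes map onto a familiar futures pattern. This is a generic sketch, not Hermes code: `worker` is a hypothetical placeholder for a spawned subagent.

```python
from concurrent.futures import ThreadPoolExecutor, wait

def worker(task):
    return f"done: {task}"        # stand-in for a subagent's result

pool = ThreadPoolExecutor(max_workers=2)

# Fire-and-forget: submit, keep working, pick up the result later.
fut = pool.submit(worker, "audit logs")
# ... the orchestrator continues with other work here ...
async_result = fut.result()       # collected asynchronously

# Blocking: wait for every subagent before continuing.
futures = [pool.submit(worker, t) for t in ("migrate db", "run tests")]
done, pending = wait(futures)
blocking_results = sorted(f.result() for f in done)
pool.shutdown()
print(async_result, blocking_results)
```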
Resource management is automatic: Hermes tracks active subagent count, enforces configurable concurrency limits, and handles graceful shutdown on timeout or error. Each subagent can use a different terminal backend — for example, the orchestrator runs locally while CPU-intensive subagents run on Modal (serverless GPU) or Daytona (serverless persistence). This heterogeneous compute model lets you right-size each workstream without manual infrastructure management.
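A configurable concurrency limit like the one described above can be enforced with a bounded semaphore. This is a generic sketch of the mechanism, the `MAX_SUBAGENTS` constant and `spawn` helper are assumptions, not Hermes internals:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_SUBAGENTS = 2                      # configurable concurrency limit
limit = threading.BoundedSemaphore(MAX_SUBAGENTS)
peak = {"active": 0, "max": 0}
lock = threading.Lock()

def spawn(task):
    with limit:                        # block until a slot frees up
        with lock:
            peak["active"] += 1
            peak["max"] = max(peak["max"], peak["active"])
        try:
            return f"done: {task}"     # stand-in for subagent work
        finally:
            with lock:
                peak["active"] -= 1

# Eight threads try to spawn six tasks, but at most two run at once.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(spawn, [f"task-{i}" for i in range(6)]))
print(peak["max"], results)
```

The semaphore caps how many subagents are active at once regardless of how many spawn requests arrive, which is the essence of the automatic resource tracking described above.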