Hermes Agent

Multi-Agent — Orchestrate Specialized AI Workers

Key Points

  • Spawn specialized agents per task
  • Agent-to-agent communication
  • Shared context when needed
  • Parallel task execution
  • Automatic result aggregation
  • Resource-aware scheduling

How It Works

  1. Define agent roles in config or on-the-fly
  2. Main agent delegates tasks to specialists
  3. Agents work in parallel with isolated contexts
  4. Results flow back to the orchestrator automatically
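
The four steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the Hermes API: the `ROLES` table, `run_worker`, and `orchestrate` are hypothetical names, and the worker is a stand-in for a real model call.

```python
import asyncio

# Step 1: hypothetical role definitions; in Hermes these would live in
# config or be created on the fly.
ROLES = {
    "docs": "Read the technical documentation",
    "forums": "Scan forums for real-world reports",
    "papers": "Check recent papers",
}


async def run_worker(role: str, instructions: str, task: str) -> dict:
    """Stand-in for a specialist agent; a real worker would call a model here."""
    await asyncio.sleep(0)  # placeholder for model or tool latency
    return {"role": role, "summary": f"{instructions}: {task}"}


async def orchestrate(task: str) -> dict:
    # Steps 2 and 3: delegate to every specialist and run them concurrently.
    # Each worker only receives its own instructions and the task.
    jobs = [run_worker(role, instr, task) for role, instr in ROLES.items()]
    results = await asyncio.gather(*jobs)
    # Step 4: results return to the orchestrator, which aggregates them.
    return {r["role"]: r["summary"] for r in results}


if __name__ == "__main__":
    print(asyncio.run(orchestrate("vector databases")))
```

`asyncio.gather` waits for all workers before returning, which matches the "wait for all results" style of aggregation.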

Real-World Use Cases

Orchestrator + Research Workers

The main agent breaks a research task into subtopics and spawns a specialist for each. One agent digs into technical docs, another scans forums for real-world experiences, a third checks recent papers. All three report back; the orchestrator synthesizes the final analysis.

kimi + minimax Parallel Reasoning

Run kimi on long-context document analysis and minimax on structured reasoning in parallel. The orchestrator routes each piece of content to the appropriate model and merges the outputs, a configuration that Hermes users have shared from production deployments in the Hermes Discord community.

Code Review Pipeline

Spawn a security agent, a style agent, and a logic agent on the same PR simultaneously. Each reviews from its specialty perspective, and the orchestrator produces a unified review with non-overlapping concerns flagged separately.

Automated Customer Support Routing

Route incoming support requests to specialist agents by topic — billing to one agent, technical issues to another, feature requests to a third. Each has domain-specific skills loaded; the orchestrator handles escalation logic.

Watch it in action

Ultimate Multi-Agent Workflow w/ Hermes Agent Kanban Board, by Tonbi

Tonbi demonstrates a fully autonomous multi-agent research-to-delivery pipeline built on Hermes Agent's Kanban board. Scout agents run on a cron schedule, scraping X and the web for user pain points. An orchestrator agent scores each issue against a rubric, deduplicates, and routes work to researcher, analyst, builder, and video-producer sub-agents — all running in parallel. The only human touchpoint is a single Telegram approval gate before anything is built or shipped.

  • 0:00 Introduction: multi-agent coordination challenge
  • 2:00 What the Kanban board solves: shared state, no duplication
  • 5:00 Kanban board structure: cards, assignees, states
  • 8:00 Dispatcher loop: how a card becomes a running agent
  • 10:00 Task dependency graph: parallel researchers, auto-routing
  • 13:00 Full pipeline: scout → orchestrator → researcher → build/video
  • 16:00 Live demo: running the pipeline and watching cards populate
  • 22:00 18 researcher agents working in parallel on the Kanban board
  • 27:00 Human approval gate in Telegram: approve, shelf, or modify
  • 33:00 Deliverables: CLI tool built and video outline produced
  • 38:00 Open-source workflow template at Tonbi Studio

The Kanban Board as a Single Source of Truth

Tonbi's pipeline stores all task state in a single SQLite file. Every agent reads from and writes to this one board rather than talking to each other directly. When an agent crashes or is restarted, it simply picks up the last card that was in-progress. There are no message queues and no polling loops — state is on the desk, not in an agent's head.
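
To make the idea of a single SQLite-backed board concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the status values, and the function names are assumptions made for illustration; the actual schema used by the Hermes Kanban board is not described in the source.

```python
import sqlite3


def open_board(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the board. The schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cards ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " assignee TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'ready')"
    )
    return conn


def add_card(conn: sqlite3.Connection, title: str, assignee: str) -> int:
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO cards (title, assignee) VALUES (?, ?)", (title, assignee)
        )
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection, assignee: str):
    """Claim the oldest ready card for this assignee, or return None."""
    with conn:
        row = conn.execute(
            "SELECT id, title FROM cards"
            " WHERE assignee = ? AND status = 'ready' ORDER BY id LIMIT 1",
            (assignee,),
        ).fetchone()
        if row is None:
            return None
        # The status guard keeps two agents from claiming the same card.
        updated = conn.execute(
            "UPDATE cards SET status = 'in_progress'"
            " WHERE id = ? AND status = 'ready'",
            (row[0],),
        )
        return row if updated.rowcount == 1 else None


def resume(conn: sqlite3.Connection, assignee: str):
    """After a restart, find the most recent card this agent left in progress."""
    return conn.execute(
        "SELECT id, title FROM cards"
        " WHERE assignee = ? AND status = 'in_progress' ORDER BY id DESC LIMIT 1",
        (assignee,),
    ).fetchone()
```

Because every state change is a row update in one file, a restarted agent needs nothing but `resume` to find out where it left off.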

Five Reasons the Kanban Wins Over Naive Multi-Agent Setups

According to Tonbi: (1) Durable — survives restarts and crashes. (2) Parallel — many agents work simultaneously, coordinated by one board. (3) Event-driven — work flows itself through the dependency graph without polling. (4) Self-healing — a dead task gets reclaimed and respawned. (5) Auditable — every claim, comment, and completion is logged, so you can trace exactly what happened.
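
The "event-driven" point can be illustrated with a small dependency-graph sketch: completing one card is the event that unblocks the next ones, so nothing has to poll. The `Card` class and the `ready_cards` and `complete` helpers are hypothetical names, and the card names in the usage note are taken from the pipeline stages mentioned in the video chapters.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    name: str
    depends_on: set = field(default_factory=set)
    done: bool = False


def ready_cards(cards: dict) -> list:
    """Names of unfinished cards whose dependencies have all completed."""
    finished = {name for name, card in cards.items() if card.done}
    return sorted(
        name
        for name, card in cards.items()
        if not card.done and card.depends_on <= finished
    )


def complete(cards: dict, name: str) -> list:
    """Mark a card done and return the cards that this completion unblocked."""
    before = set(ready_cards(cards))
    cards[name].done = True
    return sorted(set(ready_cards(cards)) - before)
```

For example, with a graph of scout → orchestrator → two researchers → build, completing the orchestrator card returns both researcher cards at once, and the build card only appears after the second researcher finishes.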

The One Human Gate

The entire pipeline from scouting to research to judgment runs autonomously. The only point where a human is in the loop is a Telegram message listing proposals. You reply 'approve' or 'shelf', or send modifications to the plan. After that, building, testing, and delivering run to completion without further input. Tonbi recommends keeping this gate even when the pipeline is mature, to prevent agents from wasting tokens on misguided builds.
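
A sketch of how such a gate could interpret the human's reply is shown below. It is not the Hermes or Telegram API: `Decision`, `parse_reply`, and `may_build` are hypothetical, and treating any reply other than 'approve' or 'shelf' as a modification request is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str      # "approve", "shelf", or "modify"
    notes: str = ""  # free-text changes when action == "modify"


def parse_reply(text: str) -> Decision:
    """Turn the human's chat reply into a decision.

    Assumption: anything that is not exactly 'approve' or 'shelf'
    is treated as a request to modify the plan.
    """
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("empty reply; ask the human again")
    keyword = cleaned.lower()
    if keyword in ("approve", "shelf"):
        return Decision(keyword)
    return Decision("modify", cleaned)


def may_build(decision: Decision) -> bool:
    """Only an explicit approval lets the build stage start."""
    return decision.action == "approve"
```

Defaulting to "modify" rather than "approve" means an ambiguous reply can never trigger a build by accident.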

Profiles as Specialized Agent Identities

Each sub-agent in the workflow is a Hermes Agent profile. The X scout uses a Grok model for social media reasoning; the web scout and workers use GPT-5.5. Different profiles can run different models, carry different skills, and have different tool access — the Kanban routes work to the right profile based on the card's assignee field.
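
The routing idea can be sketched as a lookup from a card's assignee to a profile. Everything below is illustrative: the `Profile` fields, the profile names, the skill and tool names, and the model strings are placeholders, not real Hermes profile syntax or provider model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    model: str          # placeholder label, not a real API model id
    skills: tuple = ()  # hypothetical skill names
    tools: tuple = ()   # hypothetical tool names


PROFILES = {
    "x-scout": Profile("x-scout", "grok", skills=("social-listening",), tools=("x_search",)),
    "web-scout": Profile("web-scout", "gpt-5.5", tools=("web_search",)),
    "researcher": Profile("researcher", "gpt-5.5", skills=("deep-research",)),
}


def route(card: dict) -> Profile:
    """Pick the profile named in the card's assignee field."""
    assignee = card.get("assignee")
    if assignee not in PROFILES:
        raise LookupError(f"no profile for assignee {assignee!r}")
    return PROFILES[assignee]
```

Failing loudly on an unknown assignee keeps a mistyped card from silently running under the wrong model or tool set.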

Under the Hood

Hermes multi-agent orchestration implements a hierarchical task decomposition model. The orchestrator (main agent) analyzes a complex task, identifies the optimal work breakdown structure, and spawns specialist worker agents with tailored context: each worker receives only the task-relevant subset of the full conversation context plus any shared skills or knowledge the orchestrator decides to pass. This selective context sharing prevents workers from being confused by irrelevant background while keeping coordination overhead low.

Agent-to-agent communication flows through a structured message passing layer — workers don't share memory directly, they exchange typed result objects that the orchestrator can validate, transform, and route. This prevents the 'telephone game' degradation that plagues naive multi-agent setups where agents summarize each other's summaries. The orchestrator receives raw structured outputs and does the synthesis itself, maintaining full fidelity.
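
As an illustration of typed result objects, the sketch below validates raw worker output into dataclasses before the orchestrator synthesizes it. The `Finding` and `WorkerResult` shapes are assumptions; the source does not specify what fields Hermes result objects carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    claim: str
    source: str


@dataclass(frozen=True)
class WorkerResult:
    worker: str
    findings: tuple


def validate(raw: dict) -> WorkerResult:
    """Reject malformed worker output before it reaches synthesis."""
    worker = raw.get("worker")
    if not isinstance(worker, str) or not worker:
        raise ValueError("result is missing a worker name")
    findings = []
    for item in raw.get("findings", []):
        if not item.get("claim") or not item.get("source"):
            raise ValueError(f"{worker}: every finding needs a claim and a source")
        findings.append(Finding(item["claim"], item["source"]))
    return WorkerResult(worker, tuple(findings))


def synthesize(results: list) -> list:
    """The orchestrator sees every raw finding, not a summary of summaries."""
    return [(r.worker, f.claim, f.source) for r in results for f in r.findings]
```

Because workers hand back structured findings rather than prose summaries, the orchestrator is the only place where summarization happens.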

Parallel execution is first-class: the orchestrator can fire multiple worker agents simultaneously and wait for all results before proceeding, or use streaming results to start synthesis as early outputs arrive. Resource-aware scheduling prevents over-spawning — configurable concurrency limits ensure you don't accidentally blow your API rate limits or exhaust your VPS memory with a runaway agent fleet. The architecture is directly inspired by real production patterns shared in the Hermes Discord community, including kimi+minimax parallel configurations and orchestrator+specialist deployments.
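
Both execution styles, along with a concurrency cap, can be sketched with `asyncio`. This is a generic pattern rather than Hermes's scheduler: `run_limited`, `stream_results`, and the `max_concurrency` parameter are hypothetical names, and `worker` stands for any coroutine function that runs one agent task.

```python
import asyncio


async def run_limited(tasks, worker, max_concurrency: int = 4) -> list:
    """Run worker(task) for every task, at most max_concurrency at a time,
    and return all results in input order once every worker has finished."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(task):
        async with sem:  # blocks while the cap is reached
            return await worker(task)

    return await asyncio.gather(*(guarded(t) for t in tasks))


async def stream_results(tasks, worker, max_concurrency: int = 4):
    """Yield each result as soon as it is ready, so synthesis can start early."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(task):
        async with sem:
            return await worker(task)

    pending = [asyncio.create_task(guarded(t)) for t in tasks]
    for finished in asyncio.as_completed(pending):
        yield await finished
```

The semaphore is what prevents over-spawning: all tasks are scheduled up front, but only `max_concurrency` of them hold a slot and call the model at any moment.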

Frequently asked questions

What is the Hermes Agent Kanban board?

The Kanban board is a multi-agent coordination layer accessible from the Hermes web dashboard under Plugins > Kanban. Tasks are cards with a title, assignee (the agent profile that should work on it), and status. The dispatcher loop claims ready cards and spawns the assigned agent in its own clean workspace.

How do I access the Kanban board in Hermes?

First run /update to make sure you have the latest version, then run /kanban to initialise the board. After that, open the Hermes web dashboard, go to Plugins, and click Kanban; the dashboard will show a new Kanban tab.

How many agents can run in parallel on the Kanban board?

The Tonbi demo showed 18 researcher agents running simultaneously without any conflicts. The practical limit depends on your API rate limits and VPS memory, not the board itself.

Can I use different AI models for different Kanban agents?

Yes. Each agent is a Hermes profile and each profile can specify its own model. In Tonbi's pipeline, the X scout used Grok while all other agents used GPT-5.5.

Does the Kanban board work with a human approval step?

Yes. The recommended pattern is a single human gate delivered via Telegram. The orchestrator sends proposals to your chat, and you approve, shelf, or modify. After that, the build or video pipeline continues autonomously.

What happens if an agent crashes mid-task?

The Kanban board is self-healing. A dead task is reclaimed and respawned by the dispatcher. Because state lives on the board rather than in the agent's memory, no work is lost on a crash or restart.
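
One way a dispatcher could detect and reclaim a dead task is a heartbeat timeout, sketched below. The source does not say how Hermes detects dead tasks, so the `heartbeat` field, the `attempts` counter, the timeout value, and the `reclaim_stale` function are all assumptions.

```python
import time


def reclaim_stale(cards: list, now: float, timeout_s: float = 300.0) -> list:
    """Put in-progress cards with no recent heartbeat back to 'ready'.

    Returns the ids of the reclaimed cards so the dispatcher can respawn them.
    """
    reclaimed = []
    for card in cards:
        stale = now - card["heartbeat"] > timeout_s
        if card["status"] == "in_progress" and stale:
            card["status"] = "ready"  # eligible to be claimed again
            card["attempts"] = card.get("attempts", 0) + 1
            reclaimed.append(card["id"])
    return reclaimed


if __name__ == "__main__":
    board = [{"id": 1, "status": "in_progress", "heartbeat": time.time() - 600}]
    print(reclaim_stale(board, now=time.time()))
```

Tracking `attempts` gives the dispatcher a way to stop respawning a card that keeps failing, which a real deployment would likely want.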

How do I adapt the Tonbi workflow template to my own use case?

Tonbi open-sourced a generalised version at Tonbi Studio / hermes-multi-agent-workflow on GitHub. The skeleton includes scout, orchestrator, researcher, and worker profiles plus the dispatcher. You adapt the scout prompts and rubric to your domain.

Related Features