Persistent Memory — The Feature That Makes Hermes Different

Quick answer

Hermes persistent memory stores context across sessions in SQLite or PostgreSQL and makes past sessions searchable with full-text search (FTS5, the SQLite full-text extension), so the agent learns who you are over time instead of starting fresh each chat. It automatically surfaces relevant context, and its data lives under ~/.hermes, where you can inspect exactly what it saved.

Key Points

✓ Stores context in SQLite or PostgreSQL
✓ Learns your preferences
✓ FTS5 search across sessions
✓ Auto-nudges with relevant context
✓ Dashboard shows which profile and memory state an agent is actually using, so you can check both before treating “Hermes forgot” as a model or memory failure

How It Works

1. First interaction: Hermes builds your profile
2. Subsequent sessions: Context restored
3. Periodic nudges: Hermes shares relevant memories
4. Search: Find anything from history

Real-World Use Cases

Project Context Retention

Tell Hermes your stack, conventions, and team norms once. It stores them in MEMORY.md and restores that context every session — no more repeating 'we use TypeScript, tabs, and Google-style docstrings' every time.

Preference Learning

Hermes notes that you prefer concise bullet points, dislike emoji-heavy responses, or always want code examples included. These preferences persist and shape every future response automatically.

Cross-Session Task Continuity

Pick up mid-task days later. Hermes recalls what you were building, which approaches you tried, what failed, and what the next step was. These records are kept in episodic memory and retrieved through ChromaDB vector search.

Team Environment Documentation

Store server credentials, deployment procedures, and environment quirks in MEMORY.md, the file that holds environment facts. One developer sets it up; every Hermes session in that environment knows the lay of the land instantly.

Under the Hood

Hermes implements a three-layer memory architecture. Layer 1 is in-context working memory: the active conversation window, managed automatically for token efficiency. Layer 2 is MEMORY.md and USER.md, two structured files stored at ~/.hermes/memories/ and injected as a frozen snapshot into the system prompt at session start. MEMORY.md holds environment facts and learned conventions (up to 2,200 characters); USER.md holds the user profile and preferences (up to 1,375 characters). Because the snapshot does not change during a session, the system prompt stays identical from turn to turn, which preserves the LLM prefix cache and keeps latency low even with rich persistent context.
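The Layer 2 snapshot can be pictured with a short sketch. This is an illustration only, not Hermes source code: the function names, the prompt layout, and the choice to truncate over-long files are all assumptions. Only the file names, the directory, and the two character caps come from the description above.

```python
from pathlib import Path

# Character caps quoted in the description above.
MEMORY_LIMIT = 2200
USER_LIMIT = 1375


def read_capped(path: Path, limit: int) -> str:
    """Return the file's text cut to `limit` characters ('' if the file is missing).

    Truncation is an assumption; how Hermes enforces the cap is not documented here.
    """
    if not path.exists():
        return ""
    return path.read_text(encoding="utf-8")[:limit]


def build_system_prompt(base_prompt: str, memories_dir: Path) -> str:
    """Assemble the system prompt once, at session start (hypothetical layout).

    The result is never rebuilt mid-session, so the prompt prefix stays
    identical on every turn and a provider-side prefix cache can be reused.
    """
    memory = read_capped(memories_dir / "MEMORY.md", MEMORY_LIMIT)
    user = read_capped(memories_dir / "USER.md", USER_LIMIT)
    return (
        f"{base_prompt}\n\n"
        f"## MEMORY.md\n{memory}\n\n"
        f"## USER.md\n{user}\n"
    )


if __name__ == "__main__":
    prompt = build_system_prompt(
        "You are Hermes.", Path.home() / ".hermes" / "memories"
    )
    print(f"system prompt is {len(prompt)} characters")
```

Under this reading, anything written to the two files during a session would only show up in the prompt at the next session start, which is what keeps the prefix stable.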

Layer 3 is episodic memory — a ChromaDB vector store that indexes every past task execution with timestamped records of what was tried, what succeeded, and what failed. When you start a new task, Hermes performs a semantic similarity search across all past episodes to surface relevant context automatically. This is the self-improvement layer: the agent doesn't just remember facts, it remembers strategies. Over time, it builds an increasingly accurate model of how you like to work, what tools to reach for, and what pitfalls to avoid in your specific environment.
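To make the retrieval step concrete, here is a minimal, self-contained sketch of similarity search over past episodes. It does not use ChromaDB: a toy bag-of-words vector with cosine similarity stands in for the learned embeddings a vector store would use. The `Episode` fields, the function names, and the sample records are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Episode:
    timestamp: str  # when the task ran
    task: str       # what was attempted
    outcome: str    # what succeeded or failed


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(query: str, episodes: list[Episode], k: int = 2) -> list[Episode]:
    """Return the k episodes most similar to the new task description."""
    q = embed(query)
    ranked = sorted(
        episodes,
        key=lambda e: cosine(q, embed(f"{e.task} {e.outcome}")),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Made-up sample history for illustration.
    history = [
        Episode("2024-05-01T10:00", "deploy api to staging with docker compose",
                "failed: port 8080 already in use"),
        Episode("2024-05-03T09:30", "write unit tests for billing module",
                "succeeded with pytest fixtures"),
        Episode("2024-05-07T14:15", "deploy api to production",
                "succeeded after freeing port 8080"),
    ]
    for ep in recall("deploy the api", history):
        print(ep.timestamp, ep.task, "->", ep.outcome)
```

Because each record carries the outcome as well as the task, a hit returns what happened last time, not just the fact that a similar task existed.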

A fourth capability sits on top of these layers: FTS5 full-text search across all sessions stored in SQLite (~/.hermes/state.db), with results summarized on demand by a fast Gemini Flash call. This gives Hermes effectively unlimited episodic recall, not constrained by the vector store embedding window. Memory storage is local to your machine or VPS, and nothing is synced to Hermes servers. Deleting ~/.hermes/memories/ removes MEMORY.md and USER.md immediately; session history is stored separately in ~/.hermes/state.db.
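The full-text search piece can be sketched with Python's built-in sqlite3 module. The table name, columns, and helper functions below are hypothetical, since the actual state.db schema is not described here, and the summarization call is left out. The sketch also assumes the local SQLite build includes FTS5, which is common but not guaranteed.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with a hypothetical FTS5 table, one row per message."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(session_id UNINDEXED, role UNINDEXED, content)"
    )
    return conn


def add_message(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    """Store one message; only `content` is indexed for search."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (session_id, snippet) pairs, best match first."""
    rows = conn.execute(
        # Column 2 is `content`; matched terms are wrapped in [ ].
        "SELECT session_id, snippet(messages, 2, '[', ']', '...', 8) "
        "FROM messages WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = open_store()
    add_message(conn, "s1", "user", "We use TypeScript with tabs for indentation")
    add_message(conn, "s2", "user", "The postgres migration failed on the users table")
    add_message(conn, "s3", "assistant", "Retry the migration after adding the index")
    for session_id, excerpt in search(conn, "postgres migration"):
        print(session_id, excerpt)
```

A keyword index like this matches exact terms across every stored session, which complements similarity search when you remember a specific word, file name, or error message.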

Frequently asked questions

Why does Hermes Agent not remember something I told it?

Check whether memory is enabled, whether the fact is durable enough to be saved, whether you are in the same profile, and whether the current session is too long and needs /compress. Also inspect provider or auxiliary-model failures, because memory and context maintenance can fail even when normal chat still works.

