Most AI agents have goldfish memory. Every new conversation starts from zero — you re-explain your project, your preferences, your constraints. Over and over. This is costing you real productivity every single day.
Hermes Agent takes a fundamentally different approach to memory, and it's the single biggest reason developers are switching.
The Problem With Goldfish Memory
Here's what gets lost every time you start a new session with a stateless AI:
- Your stack preferences: "I use TypeScript, not JavaScript" — explained for the 40th time
- Correction history: "No, I told you last week — don't use sudo for Docker, I'm in the docker group" — the agent breaks the same rule again
- Project context: "The API is at ~/code/myapi, it uses Axum + SQLx" — copy-pasted into every session
- Calibrated workflows: The 3-session process you worked out for generating images — gone
- What failed: The approach that produced garbage output last Tuesday — the agent tries it again
The cumulative cost is enormous: every re-explanation is wasted time, every repeated mistake erodes trust, and every re-calibration session repeats engineering work you have already done.
One YouTube reviewer who built a content production workflow on Hermes summarized the OpenClaw problem directly: "OpenClaw is pretty dumb. It doesn't remember things even if you give it the memory skills. It doesn't really remember. You got to repeat yourself and it kind of defeats the whole purpose of using AI."
Hermes is built to solve this at the architecture level.
The 3-Layer Memory Architecture
Hermes does not have "one memory system." It has three, operating at different timescales and serving different purposes.
Layer 1: Persistent Memory Files (Always-On)
Two files live in ~/.hermes/memories/:
MEMORY.md (2,200 character limit / ~800 tokens): The agent's personal notes. Environment facts, conventions, completed work, corrections.
USER.md (1,375 character limit / ~500 tokens): Your profile. Communication preferences, working style, expectations.
These are injected as a frozen snapshot into every session's system prompt — before the conversation starts. The agent does not need to look things up; it already knows.
How it appears in the system prompt:
══════════════════════════════════════════════
MEMORY (your personal notes) [67% — 1,474/2,200 chars]
══════════════════════════════════════════════
User's project is a Rust web service at ~/code/myapi using Axum + SQLx
§
This machine runs Ubuntu 22.04, has Docker and Podman installed
Capacity: 8–15 memory entries, 5–10 user profile entries. When full, the agent consolidates and merges automatically.
What gets saved:
- User preferences ("I prefer TypeScript over JavaScript")
- Environment facts ("This server runs Debian 12 with PostgreSQL 16")
- Corrections ("Don't use sudo for Docker — user is in docker group")
- Conventions ("Project uses tabs, 120-char lines, Google docstrings")
- Completed milestones ("Migrated database from MySQL to PostgreSQL on 2026-01-15")
The frozen snapshot pattern: Memory is captured once at session start. This preserves LLM prefix cache — a meaningful cost saving with Claude/Anthropic providers.
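A minimal sketch of the frozen-snapshot pattern (the helper and constant names are hypothetical; Hermes's internals are not published here). The memory file is read exactly once at session start and rendered into a fixed header like the one shown above, so the system-prompt prefix never changes mid-session and the provider's prefix cache stays warm:

```python
# Sketch: read MEMORY.md once at session start and render the fixed header.
# Names (render_memory_header, MEMORY_LIMIT) are illustrative assumptions.
from pathlib import Path

MEMORY_LIMIT = 2_200  # character budget for MEMORY.md, per the limits above

def render_memory_header(memory_path: str, limit: int = MEMORY_LIMIT) -> str:
    """Read the memory file once and format it as a system-prompt section."""
    text = Path(memory_path).read_text(encoding="utf-8").strip()
    pct = round(100 * len(text) / limit)
    bar = "═" * 46
    return (
        f"{bar}\n"
        f"MEMORY (your personal notes) [{pct}% — {len(text):,}/{limit:,} chars]\n"
        f"{bar}\n"
        f"{text}"
    )
```

Because this string is computed once and prepended verbatim to every request in the session, the provider sees an identical prompt prefix each time, which is what makes the cache hit possible.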
Layer 2: Episodic Memory (Skills)
Episodic memory is Hermes's most distinctive feature. After completing a complex task (5+ tool calls), Hermes evaluates whether the experience is worth capturing as a skill document — a markdown file describing the procedure, pitfalls, verification steps, and what to avoid.
Think of this as procedural memory: not just what happened, but how to do it again, better.
The YouTube transcript describes this in practice: "When the agent completes a complex task, like five plus tool calls, it will write a skill document to kind of remember how it did that. So, if I get a good result, I get a result I like. It will then save that chain and be like, 'Oh, he liked that. That worked. That gave us a good result. How did we do it? Okay, save that so we can do it again.' That way you can just repeat things that are good. You're basically repeating winning behavior."
Skills are stored in ~/.hermes/skills/ and use the agentskills.io standard format. They are loaded on-demand via progressive disclosure, so they do not bloat the system prompt.
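To make the idea concrete, here is a hypothetical skill document in the agentskills.io style (YAML frontmatter plus markdown sections; the skill name and workflow details below are illustrative, loosely based on the image-generation workflow mentioned elsewhere in this article, not a real file):

```markdown
---
name: thumbnail-pipeline
description: Generate a 1280x720 thumbnail with fal.ai, then overlay title text with Python/PIL.
---

## Procedure
1. Generate the base image via fal.ai (3 candidates, pick the best).
2. Composite the title text with PIL: large font, bottom-left, brand color.

## Pitfalls
- Asking the image model to render text produces garbled glyphs; always composite text with PIL instead.

## Verification
- Output is exactly 1280x720 and the title is legible at 50% zoom.
```

The frontmatter `name` and `description` are what Layer 2's progressive disclosure exposes cheaply; the body is only loaded when the skill is actually invoked.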
Layer 3: Session Search
All conversations are stored in SQLite (~/.hermes/state.db) with FTS5 full-text search. This is unlimited storage of your complete interaction history.
When Hermes needs context from a past session, it searches this database and uses Gemini Flash to summarize the relevant conversation — then injects it into the current context.
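The search side of this can be sketched with a few lines of Python. The schema below is an assumption for illustration (Hermes's actual state.db layout is not documented here); what it shows is the mechanism the article describes: an SQLite FTS5 virtual table that makes full-text search over past conversations a single ranked query.

```python
# Sketch of Layer 3 session search over an FTS5 index.
# Schema and function names are illustrative assumptions.
import sqlite3

def open_session_index(db_path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    # UNINDEXED columns are stored but excluded from full-text matching.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(session_id UNINDEXED, role UNINDEXED, content)"
    )
    return conn

def search_sessions(conn: sqlite3.Connection, query: str, limit: int = 5):
    """Return the best-matching past messages, ranked by FTS5's BM25 score."""
    return conn.execute(
        "SELECT session_id, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

The hits returned by a query like this would then be summarized by a cheap model (the article names Gemini Flash) before being injected into the current context, rather than pasted in raw.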
| Feature | Layer 1 (Memory files) | Layer 3 (Session search) |
|---|---|---|
| Capacity | ~1,300 tokens | Unlimited |
| Speed | Instant (in system prompt) | Requires search + LLM |
| Use case | Key facts always available | Finding past conversations |
| Management | Agent-curated | Automatic |
| Token cost | Fixed per session | On-demand only |
Short-Term Memory: How Context is Managed
Within a session, Hermes maintains an active task context — current goal, steps taken, tool outputs. This is standard LLM context window management, but Hermes's architecture minimizes unnecessary token usage:
- Skills use progressive disclosure (Level 0: names only, Level 1: full content on demand)
- Tool definitions are loaded selectively per platform (browser tools not loaded on messaging-only sessions)
- Memory is a frozen snapshot — no mid-session re-reads
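Progressive disclosure is simple to picture in code. This sketch (class and method names are my own, not Hermes's) shows the two levels: the Level 0 index costs a few tokens per skill in the system prompt, while the full document is read from disk only when the agent actually needs it:

```python
# Sketch of two-level progressive disclosure for skill documents.
# SkillLibrary and its methods are illustrative assumptions.
from pathlib import Path

class SkillLibrary:
    def __init__(self, skills_dir: str):
        self.skills_dir = Path(skills_dir)

    def level0_index(self) -> str:
        """Level 0: skill names only, cheap enough for every system prompt."""
        names = sorted(p.stem for p in self.skills_dir.glob("*.md"))
        return "Available skills: " + ", ".join(names)

    def level1_load(self, name: str) -> str:
        """Level 1: the full skill document, fetched on demand."""
        return (self.skills_dir / f"{name}.md").read_text(encoding="utf-8")
```

The design choice is the same one behind lazy loading anywhere: pay a fixed, tiny index cost up front, and the variable cost only when a skill is used.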
Real-World Test: Memory Compounds Over Time
After 30 sessions: The agent knows your stack, your preferences, your naming conventions. You stop explaining basics. Hermes adapts its output style to match yours. Agent-created skills cover your most common workflows.
After 60+ sessions: Anticipatory behavior emerges. When you start an image generation task, Hermes references the brand guidelines saved in session 12, the hybrid Python/fal.ai approach learned in session 8, and the watermark convention from session 3 — without being prompted.
The YouTube reviewer's summary: "The agent working with you in month three is meaningfully more capable on your specific work than the agent you started with in month one. It gets better over time and it does not forget anything."
How Hermes Memory Compares
vs. ChatGPT memory: ChatGPT's memory feature stores user facts in a centralized OpenAI database. You do not control what's stored, you cannot inspect the raw files, and it does not capture procedural knowledge (skills). Hermes memory lives on your infrastructure in plain text files you own.
vs. Claude's memory: Claude (as of mid-2026) has session memory via Projects but does not persist procedural knowledge or build a deepening skill library. Claude Projects memory is cloud-hosted; Hermes memory is self-hosted.
vs. OpenClaw memory: OpenClaw has MEMORY.md but no episodic/skills layer and no session search. You have to manually maintain memories and skills. Hermes automates this.
Privacy: Who Can See Your Memories
Memory files live entirely on your infrastructure. ~/.hermes/memories/MEMORY.md and USER.md are plain markdown files on your server.
Hermes includes security scanning on memory writes: content matching injection or exfiltration patterns is blocked. Invisible Unicode characters (used in some prompt injection attacks) are stripped.
What never leaves your server: MEMORY.md, USER.md, session database, skill files.
What goes to the LLM API: Your conversation content + memory snapshot on each request. This is unavoidable — the LLM needs context to function. Use Ollama for 100% local if this is a hard requirement.
FAQ
Can I edit MEMORY.md manually? Yes. It is a plain text file. Hermes reads it fresh on next session start.
What happens when memory is full? Hermes automatically consolidates entries — merging related facts, removing outdated ones.
Does Hermes share my memory across devices?
No. Memory is per-installation. Use multiple profiles or sync ~/.hermes/memories/ manually.
Can I search my own memory?
Run hermes sessions list to browse sessions. Session search is automatic when Hermes needs past context.
What if I want to delete everything?
Delete ~/.hermes/memories/MEMORY.md and USER.md. For session data: ~/.hermes/state.db.
Honcho: AI-Powered User Modeling
Beyond the three core layers, Hermes supports Honcho integration — an AI-generated user understanding system that builds a model of who you are across sessions and platforms.
Setup:
hermes honcho setup
In hybrid mode, MEMORY.md and USER.md stay intact — Honcho adds a persistent modeling layer on top, tracking patterns that explicit memory entries miss: your communication rhythm, how you respond to different output formats, which types of corrections recur.
Community Memory Extensions
The Discord community has built several memory-extending tools:
PLUR (by plur9): brain-inspired "engram" memory with shared episodic memory across agents:
pip install plur-hermes
"Shared episodic memory across 6 agents is genuinely powerful and would be hard to replicate any other way." — geezeruk, Nous Research Discord
Memori-City (by 0x_404): local memory visualization that renders your MEMORY.md as an interactive city map. Open-sourced at github.com/0x404ethsol/Memori-City.
Hermes Memory Keep-Alive for Obsidian: Connects session memory to Obsidian notes to prevent the "you come back and everything is gone" problem after long session gaps.
HERMES TUI Companion (by synextco): Terminal UI dashboard that shows agent memory, skills, sessions, corrections, and projects in real time. "A lot of insightful information at a glance and cool retro look."
What Good Memory Management Looks Like in Practice
After 3 months of use, a power user's MEMORY.md might contain entries like:
Primary project: SaaS analytics dashboard at ~/code/dashboard, React + FastAPI + PostgreSQL
§
Code style: 2-space indent, no semicolons, JSDoc comments, ESLint strict
§
Never use npm, always pnpm. Never use pip directly, always uv.
§
Deployment target: Railway for staging, Hetzner VPS for production
§
Image generation: fal.ai for generation, then Python/PIL for text overlay and logo
§
Send completed task reports to Telegram, not Discord
§
User wakes up at 6am UTC+2 — schedule morning cron summaries for 6:30am
None of this was manually programmed. It accumulated through use, corrections, and Hermes deciding these facts were worth persisting. The USER.md might capture:
Prefers numbered steps over bullet points for procedures
Gets frustrated by overly hedged responses — be direct
Communication language: English, occasional French terms fine
Prefers file diffs over full rewrites when editing code
This level of personalization is what makes the month-3 Hermes experience fundamentally different from a chatbot.
Security: Memory Attack Surface
Memory files are a potential attack surface — specifically, prompt injection via memory contents. Hermes addresses this:
- Memory writes are scanned for injection patterns before persistence
- Invisible Unicode characters (used in prompt injection) are stripped from all memory operations
- Content matching credential exfiltration patterns is blocked at write time
- The frozen snapshot pattern prevents runtime memory injection mid-session
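The write-time checks above can be sketched as a small sanitizer. The specific patterns below are illustrative assumptions (Hermes's real rule set is not published in this article); the mechanism is what matters: strip invisible format characters first, then reject anything matching known injection shapes before it reaches MEMORY.md:

```python
# Sketch of a memory-write sanitizer. The pattern list is an illustrative
# assumption, not Hermes's actual rule set.
import re
import unicodedata

INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"curl\s+\S+\s*\|\s*(ba)?sh", re.IGNORECASE),  # pipe-to-shell
]

def sanitize_memory_entry(entry: str) -> str:
    """Strip invisible Unicode, then block entries matching injection patterns."""
    # Unicode category Cf covers zero-width and other format characters
    # that attacks use to hide instructions from human reviewers.
    cleaned = "".join(ch for ch in entry if unicodedata.category(ch) != "Cf")
    for pat in INJECTION_PATTERNS:
        if pat.search(cleaned):
            raise ValueError("memory write blocked: matched injection pattern")
    return cleaned
```

Stripping before matching is deliberate: zero-width characters inserted inside a trigger phrase would otherwise let it slip past the regex scan.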
For users running Hermes on shared infrastructure: memory files are plain text, readable by anyone with filesystem access. Encrypt at the filesystem level if this is a concern.