Use Hermes Agent with Ollama for Local AI

Run Hermes Agent with local LLMs via Ollama — fully private, no API keys, no cloud dependency.

Running Hermes with Ollama keeps everything on your machine — no API keys, no cloud costs, no data leaving your network. It's the ideal setup for privacy-conscious users or anyone who wants full control over their AI stack.

Before you start:

  • Hermes Agent installed
  • Ollama installed (ollama.com) on the same or a networked machine
  • Sufficient RAM: 8GB minimum for 7B models, 16GB+ recommended for 13B+ models
  • Optional: NVIDIA or AMD GPU for significantly faster inference

Steps

  1. Install Ollama

    curl -fsSL https://ollama.com/install.sh | sh — works on Linux and macOS

  2. Pull a model

    ollama pull hermes3 for the official Hermes 3 model, or any compatible model

  3. Configure Hermes

    In config.yaml, set the model provider to ollama and the model name to hermes3 (see the example config after these steps)

  4. Set the endpoint

    Ollama defaults to http://localhost:11434 — change this if Ollama runs on another machine

  5. Start Hermes

    hermes start — all inference runs locally with zero data leaving your machine
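
For reference, here is a minimal config.yaml sketch covering steps 3 and 4. The key names (provider, name, baseUrl) come from this guide, but the nested layout under model: is an assumption; check the Hermes Agent configuration reference for the exact schema.

    # config.yaml (assumed layout; verify against the Hermes Agent config reference)
    model:
      provider: ollama                    # route inference through a local Ollama server
      name: hermes3                       # must match a model shown by 'ollama list'
      baseUrl: http://localhost:11434     # Ollama's default endpoint (step 4)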

Pro Tips

  • 💡 The official Hermes 3 model ('ollama pull hermes3') is optimized for tool use and works best with Hermes Agent — start here before trying other models
  • 💡 For VPS deployments without a GPU, try smaller quantized models (Q4_K_M) — they run on CPU but are slower
  • 💡 Ollama can run on a separate, more powerful machine while Hermes runs on a smaller server — set 'model: baseUrl: http://your-gpu-machine:11434' in config (see the sketch below)
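
A sketch of that split-machine setup, using the same assumed config layout as above; 'your-gpu-machine' is a placeholder hostname.

    # config.yaml on the small server running Hermes (assumed layout)
    model:
      provider: ollama
      name: hermes3
      baseUrl: http://your-gpu-machine:11434   # Ollama on the GPU box; port 11434 must be reachable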

Troubleshooting

Ollama connection refused error

Check that Ollama is running: 'ollama serve'. By default it listens only on localhost:11434. If Hermes is connecting from a different machine, start Ollama with OLLAMA_HOST=0.0.0.0 so it accepts remote connections, and make sure port 11434 is open in your firewall.

Hermes responses are extremely slow with Ollama

You're likely running CPU-only inference with a model too large for your RAM. Try a smaller or more heavily quantized variant (browse the hermes3 tags on the Ollama library for the available quantizations), or add a GPU to your setup.

Model not found error despite pulling it

Check the model name spelling: 'ollama list' shows installed models. Use the exact name shown, including any variant suffix.
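
For example, if 'ollama list' shows a variant tag rather than the bare model name, the config must use that exact string. The tag below is illustrative only, and the layout is the same assumed schema as above:

    # config.yaml (assumed layout)
    model:
      provider: ollama
      name: hermes3:8b   # example variant tag; copy the exact NAME column from 'ollama list'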
