Run Hermes Agent 100% Offline — No Cloud Required
Run Hermes Agent completely offline with local models, local STT, and zero internet dependency.
Hermes can run 100% offline — no internet connection required, no cloud APIs, no data leaving your machine. This guide covers setting up a fully air-gapped AI assistant using local models and local speech tools.
Before you start:
- ☑ Hermes Agent installed
- ☑ Ollama installed and a model pulled while you still have internet (e.g. 'ollama pull hermes3')
- ☑ Sufficient hardware: 16 GB+ RAM recommended for quality offline inference
- ☑ Optional: local Whisper model downloaded in advance
Steps
- 1. Install Ollama: install Ollama and pull the hermes3 model while you still have internet ('ollama pull hermes3').
- 2. Configure the local model: in config.yaml, set the model provider to 'ollama'; no API keys are needed.
- 3. Install local STT: run 'pip install openai-whisper', then transcribe a short file once while online ('whisper test.mp3 --model medium') so the model weights are cached for offline use.
- 4. Install local TTS: use the system TTS ('say' on macOS, 'espeak' on Linux) or a local TTS model.
- 5. Disable cloud features: set 'telemetry: false' and remove any cloud API keys from the config.
- 6. Test offline: disconnect from the internet and verify that 'hermes chat' works end-to-end locally.
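Putting steps 2 and 5 together, a fully local config.yaml might look like the sketch below. The exact key names here are assumptions, so confirm them against 'hermes config show' for your installed version:

```yaml
# Hypothetical config.yaml for fully offline operation —
# verify key names against your installed version.
model:
  provider: ollama
  name: hermes3        # pulled earlier with 'ollama pull hermes3'
telemetry: false       # no analytics or crash reporting
```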
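Step 4's system-TTS fallback can be sketched in a few lines of Python. The tts_command() and speak() helpers here are hypothetical illustrations, not part of Hermes:

```python
import shutil
import subprocess

def tts_command(text: str, system: str, available: set[str]) -> list[str]:
    """Pick an offline TTS command line for the given OS.

    `system` is the value of platform.system(); `available` is the set
    of TTS executables found on PATH (e.g. via shutil.which).
    """
    if system == "Darwin" and "say" in available:
        return ["say", text]          # macOS built-in speech
    if "espeak" in available:
        return ["espeak", text]       # common Linux fallback
    raise RuntimeError("no local TTS engine found; install espeak")

def speak(text: str) -> None:
    """Run the chosen engine; blocks until speech finishes."""
    import platform
    available = {name for name in ("say", "espeak") if shutil.which(name)}
    subprocess.run(tts_command(text, platform.system(), available), check=True)
```

Both 'say' and 'espeak' work with no network access, which is why they suit an air-gapped setup better than cloud TTS voices.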
Pro Tips
- 💡 Download everything you need (models, dependencies) before going offline — once disconnected, model downloads aren't possible
- 💡 A Hermes 3 7B Q4 quantized model is a good balance of quality and speed on CPU-only hardware
- 💡 Set 'telemetry: false' in config.yaml to disable any analytics or crash reporting that might try to reach the internet
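The tips above assume interactive use, but the same local model can also be driven programmatically: Ollama serves a REST API on localhost:11434, so requests never leave the machine. A minimal sketch, assuming the default port and a previously pulled hermes3 model:

```python
import json
import urllib.request

def build_generate_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate_local(prompt: str, model: str = "hermes3") -> str:
    """Send a completion request to the local Ollama server."""
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the endpoint is plain HTTP on localhost, this works identically with the network cable unplugged.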
Troubleshooting
❌ Hermes tries to reach the internet even in offline mode
✅ Check config.yaml for any cloud-based settings: cloud memory sync, telemetry, update checks. Disable each with the appropriate 'false' flag. Run 'hermes config show' to review all active settings.
❌ Ollama model loads but inference is extremely slow
✅ CPU-only inference is slow for large models. Use a smaller quantized model (7B Q4), or check that Ollama is actually using your GPU: 'ollama ps' shows whether a loaded model is running on CPU or GPU, and 'ollama run hermes3 --verbose' prints token throughput.
❌ Whisper transcription fails offline
✅ Whisper needs its model files downloaded in advance. Run 'whisper test.mp3 --model medium' while online to pre-download the model weights to ~/.cache/whisper/.