Configure Reasoning Effort — Speed vs Depth
Control how much 'thinking' the model does before responding — balance speed, cost, and reasoning depth.
Higher effort means deeper analysis but slower responses and higher token costs. Lower effort is faster and cheaper but may miss nuances. Choose the level that fits your use case.
Before you start:
- ☑ Hermes Agent installed
- ☑ A model that supports reasoning effort (most OpenRouter and Nous Portal models)
Steps
1. Understand reasoning levels
From most to least effort, the levels are: xhigh (maximum), high, medium (default), low, minimal, and none (reasoning disabled).
2. Set default level
In config.yaml, set reasoning_effort: medium under the agent key.
3. Adjust per-task at runtime
Run hermes config set agent.reasoning_effort high before complex tasks.
4. Monitor reasoning output
Set show_reasoning: true under the display key to see the model's thought process.
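Steps 2 and 4 can be sketched together as a single config.yaml fragment. This is a minimal sketch, assuming the two settings are nested YAML keys (as the dotted path agent.reasoning_effort in step 3 suggests); any other keys in your config.yaml are omitted here.

```yaml
# config.yaml (fragment; nesting is an assumption inferred from the dotted key path)
agent:
  # One of: xhigh, high, medium, low, minimal, none
  reasoning_effort: medium

display:
  # Show the model's thought process alongside responses
  show_reasoning: true
```

To raise the level for a single demanding task without editing the file, use the hermes config set command from step 3 instead.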
Pro Tips
- 💡 Use 'high' or 'xhigh' for complex architecture decisions, debugging, and analysis
- 💡 Use 'low' or 'minimal' for simple file operations, git commands, and routine tasks
- 💡 The /reasoning show command displays the model's thinking in a dim box above responses
- 💡 Heavy users report 20-40% token savings by switching reasoning levels based on task complexity
Troubleshooting
❌ Reasoning effort setting not taking effect
✅ Not all models support reasoning effort. Check your provider's documentation. Most OpenRouter and Nous Portal models support it.
❌ Responses too slow with high reasoning
✅ Switch to medium or low for routine tasks. Use /reasoning show to see if the extra thinking is actually helping.
❌ Agent missing obvious solutions
✅ Increase reasoning_effort to high or xhigh. The agent may be skipping analysis steps at lower levels.