Configure Hugging Face Inference for Hermes
Configure Hermes Agent to use 20+ open models via Hugging Face's unified inference endpoint.
Hugging Face Inference provides access to 20+ open models through a unified endpoint. Free tier included ($0.10/month), no markup on provider rates. Great for trying different open models without multiple accounts.
Managed cloud · API costs included · Skill library · Cancel anytime
Before you start:
- ☑Hermes Agent installed
- ☑Hugging Face account
- ☑HF token with Inference Providers permission
Steps
- 1
Get your token
Go to huggingface.co/settings/tokens and create a token with 'Make calls to Inference Providers' permission
- 2
Add to .env
Add HF_TOKEN=hf_xxx to ~/.hermes/.env
- 3
Set the provider
In config.yaml, set model: provider: huggingface
- 4
Choose a model
Pick from available open models like Llama, Mistral, Qwen
Pro Tips
- 💡Free tier includes basic usage — enough for experimentation
- 💡No markup on underlying provider rates
- 💡Access to the latest open models as they're released
- 💡Good alternative to running local models without dedicated hardware
Troubleshooting
❌ Permission denied
✅ Ensure your token has 'Make calls to Inference Providers' permission. Regenerate if needed.
❌ Model not available
✅ Not all HF models are available via inference. Check the model card for inference availability.