Nous ResearchHermes Agent

Configure Hugging Face Inference for Hermes

Configure Hermes Agent to use 20+ open models via Hugging Face's unified inference endpoint.

Hugging Face Inference provides access to 20+ open models through a unified endpoint. Free tier included ($0.10/month), no markup on provider rates. Great for trying different open models without multiple accounts.

Deploy Hermes faster with FlyHermes

Managed cloud · API costs included · Skill library · Cancel anytime

Before you start:

  • Hermes Agent installed
  • Hugging Face account
  • HF token with Inference Providers permission

Steps

  1. 1

    Get your token

    Go to huggingface.co/settings/tokens and create a token with 'Make calls to Inference Providers' permission

  2. 2

    Add to .env

    Add HF_TOKEN=hf_xxx to ~/.hermes/.env

  3. 3

    Set the provider

    In config.yaml, set model: provider: huggingface

  4. 4

    Choose a model

    Pick from available open models like Llama, Mistral, Qwen

Pro Tips

  • 💡Free tier includes basic usage — enough for experimentation
  • 💡No markup on underlying provider rates
  • 💡Access to the latest open models as they're released
  • 💡Good alternative to running local models without dedicated hardware

Troubleshooting

Permission denied

Ensure your token has 'Make calls to Inference Providers' permission. Regenerate if needed.

Model not available

Not all HF models are available via inference. Check the model card for inference availability.

Related Guides