Run Hermes on Modal — Serverless Cloud Compute
Configure Hermes Agent to execute commands on Modal's serverless cloud infrastructure — GPU access, auto-scaling, pay only when active.
The Modal backend runs Hermes commands in serverless cloud containers. You get GPU access, auto-scaling, and hibernation between sessions, so costs drop to near zero when idle. It suits AI workloads, batch processing, or any job that needs more compute than your laptop provides.
Before you start:
- ☑ Hermes Agent installed
- ☑ Modal account (free tier available)
- ☑ Modal CLI authenticated
Steps
1. Install Modal
   pip install modal
2. Authenticate with Modal
   Run modal setup. This opens a browser window to log in and stores an API token on your machine.
3. Configure the backend
   In config.yaml, set terminal.backend to modal.
4. Choose a container image
   Set terminal.modal_image, for example nikolaik/python-nodejs:python3.11-nodejs20.
5. Set resource limits
   Configure container_cpu, container_memory, and container_disk to cap the CPU, memory, and disk available to the container.
6. Start using Hermes
   Commands now execute on Modal's cloud infrastructure.
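Steps 3 to 5 can be combined into a single terminal section of config.yaml. The fragment below is a minimal sketch rather than a definitive configuration. The key names and the image come from the steps above. The placement of the container_* keys under terminal, the numeric values, and their units are assumptions to check against your Hermes version.

```yaml
# config.yaml (sketch; numeric values are placeholders, not recommendations)
terminal:
  backend: modal   # step 3: run commands on Modal instead of locally
  modal_image: nikolaik/python-nodejs:python3.11-nodejs20   # step 4

  # Step 5: resource limits.
  # Assumption: these keys sit under `terminal`, and memory/disk are in MB.
  container_cpu: 1
  container_memory: 4096
  container_disk: 20480
```

Starting with small limits and raising them only when a workload needs it keeps active compute time, and therefore cost, low.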
Pro Tips
- 💡 Modal hibernates containers between sessions, so you only pay for active compute time
- 💡 Request GPU access on Modal's dashboard for CUDA-accelerated workloads
- 💡 Container images are cached: the first run is slow, subsequent runs are fast
- 💡 Modal's free tier includes $30/month of compute credits
Troubleshooting
❌ Modal not authenticated
✅ Run 'modal setup' to authenticate via browser. Credentials are stored in ~/.modal.toml
❌ Container fails to start
✅ Check Modal's dashboard for error logs. Common issue: the Docker image isn't compatible with Modal's runtime.
❌ High latency on first command
✅ First command provisions a new container (cold start). Use container_persistent: true to keep the container warm between sessions.
❌ GPU not available
✅ GPU access requires Modal approval. Request it from your Modal dashboard settings.
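For the cold-start case above, the container_persistent setting might look like the following in config.yaml. This is a sketch that assumes the key belongs under terminal alongside backend. Confirm the placement against your Hermes version.

```yaml
# config.yaml (sketch)
terminal:
  backend: modal
  # Assumption: this key lives under `terminal`.
  # Keeps the container between sessions to avoid a cold start on the first command.
  container_persistent: true
```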