Hermes Agent: Free Open-Source AI Agent by NousResearch
The MIT-licensed agent that lives on your server, learns from every session, and reaches you on Telegram or Discord โ without sending a single byte of your data to anyone.
By Free AI News Editorial ยท ยท ยท 9 min read
Most AI tools have a fundamental ceiling: the moment you close the chat window, everything resets. You re-explain your project. You re-establish your preferences. You repeat yourself to an agent that has no memory of who you are. Hermes Agent, launched by Nous Research in February 2026, is built around a fundamentally different premise: what if an AI agent actually grew smarter every time you used it? Released under the MIT license and fully self-hosted, Hermes Agent has quickly attracted attention in the open-source AI community for its persistent memory, self-improving skill loop, and genuine model agnosticism. Here is everything you need to know about it.
What Is Hermes Agent and Who Built It?
Hermes Agent is an autonomous AI agent framework built by Nous Research, the open-weights AI lab best known for its Hermes series of fine-tuned language models. Where most of NousResearch's prior work focused on releasing better model weights, Hermes Agent is a full agent runtime โ the infrastructure that sits around a model and makes it genuinely useful over time.
The project launched in February 2026 and is available for free on GitHub under the MIT license. That means you can read every line of code, fork it, modify it, and run it commercially without restriction. There are no per-seat fees, no usage-based billing, and no vendor lock-in. It is the kind of software that only exists because a research lab decided openness was the right default.
The agent is described on its official site as "the only agent with a built-in learning loop." That claim is worth unpacking: the loop is not marketing copy for RAG retrieval or a vector memory store. It is a system where the agent actively writes new skill documents after completing complex tasks, refines those skills as it uses them, and searches its entire conversation history to surface relevant past context โ all without you asking it to.
How Does Hermes Agent's Self-Improving Learning Loop Work?
The self-improvement mechanism is the most technically interesting part of Hermes Agent, and the clearest departure from standard AI agent frameworks. It works in three layers:
- Automated Skill Creation โ When Hermes solves a problem that required significant multi-step reasoning or tool use, it writes a SKILL.md document capturing the approach. The next time a similar problem appears, the skill is loaded as context and the agent starts ahead rather than from scratch.
- Skill Self-Refinement โ Skills are not static. Each time a skill is used, the agent has the opportunity to update it based on what worked and what did not. Over weeks of use, a skill for, say, web scraping or API integration becomes progressively more reliable.
- FTS5 Session Search with LLM Summarization โ All past conversations are indexed using SQLite's full-text search engine (FTS5). When you reference a past project, the agent queries its own history, retrieves the relevant sessions, and uses an LLM to produce a compressed summary before injecting it into the current context. You do not re-explain things you have already covered.
On top of these layers, Hermes Agent uses Honcho dialectic user modeling โ a technique developed by Plastic Labs that builds a probabilistic model of who you are: your preferred working style, vocabulary, areas of expertise, and project history. The agent maintains and refines this user model over time, making responses progressively more personalised the longer you use it.
All skill documents conform to the agentskills.io open standard โ a portable SKILL.md format that allows skills created in Hermes to be shared publicly or installed from community hubs with a single command. This creates a growing ecosystem of reusable agent skills that any Hermes user can benefit from.
What Platforms and Models Does Hermes Agent Support?
Hermes Agent is designed around two kinds of flexibility: where it runs and what model it uses. Neither is locked to a single choice.
Supported execution backends:
Supported model providers:
You switch models with hermes model โ no code changes, no configuration file surgery. Supported providers include OpenRouter (200+ models), Nous Portal (native OAuth), NovitaAI, NVIDIA NIM (Nemotron), Hugging Face endpoints, Xiaomi MiMo, Moonshot/Kimi, MiniMax, standard OpenAI API, and any local vLLM or Ollama instance. If it speaks OpenAI-compatible API, Hermes can use it.
Supported chat platforms (gateway process):
- Telegram โ Most popular deployment; supports voice memo transcription and inline commands.
- Discord โ Full bot integration with slash commands and thread support.
- Slack โ Workspace integration for team environments.
- WhatsApp and Signal โ Mobile-first access for on-the-go control.
- CLI / TUI โ Full terminal UI with multiline editing, slash-command autocomplete, and streaming tool output for power users.
All platforms share a single gateway process, meaning you can start a conversation on Telegram, continue it in your terminal, and the agent remembers the full thread regardless of which surface you used. Cross-platform conversation continuity is a standard feature, not an add-on.
How Does Hermes Agent Compare to Other Open-Source Agents?
The open-source agent landscape in 2026 is crowded. Projects like Odysseus and OpenJarvis have gained traction, and frameworks like AutoGPT and CrewAI remain widely used. So what distinguishes Hermes?
The key differentiators are the learning loop and the deployment model. Most open-source agents are stateless by default โ they receive a prompt, execute a task, and terminate. Memory is typically bolted on via a vector database, which gives agents access to past data but does not change how the agent approaches future tasks. Hermes's skill creation loop is different: it changes agent behaviour, not just agent recall. An agent that has written a skill for GitHub PR review will approach the next PR differently โ it has internalised a method, not just stored a log.
On deployment, Hermes is explicitly designed to run 24/7 on cheap infrastructure. The serverless options (Modal, Daytona) mean you can run a persistent agent that costs effectively nothing when idle โ it hibernates and wakes on demand. Most agent frameworks assume a local laptop or a dedicated server with a running process. Hermes makes sleeping-until-needed a first-class feature.
The 40+ built-in skills also give it a significant head start compared to minimal frameworks. Out of the box, Hermes ships with skills covering MLOps workflows, GitHub operations, web research, image generation, text-to-speech, browser automation, diagramming, and note-taking. You are not starting from zero. Explore the full open-source AI section for more comparisons.
How Do You Install and Run Hermes Agent?
NousResearch designed the install experience to be as friction-free as possible. On Linux or macOS, a single shell command handles everything: installs the uv dependency manager, creates a virtual environment, installs all packages, and symlinks the hermes command to your path.
Windows users get native support without WSL โ the CLI, gateway, TUI, and all tools run natively in PowerShell. After install, you run hermes from any terminal to start the interactive TUI, or hermes gateway start to launch the multi-platform chat gateway.
After first launch, you configure your model provider with hermes model and connect your chat platforms via the gateway settings. The agent begins building its memory and skill library immediately โ there is no manual setup of memory systems or vector stores. Everything is handled automatically in the background, stored in ~/.hermes/ on your own machine.
For production deployments, Hermes supports Docker with security hardening enabled by default (read-only root filesystem, capability drops, PID namespace limits). This makes it suitable for running on shared servers or cloud VMs without exposing the host system. The Modal and Daytona integrations handle cold-start latency โ typically under two seconds โ so the serverless option is practical for real workloads, not just demos.
One feature worth noting for researchers: Hermes Agent includes built-in batch trajectory generation tools. If you are working on fine-tuning a model on tool-calling behaviour, the agent can generate and compress training trajectories from real task runs. This positions it as infrastructure for the next generation of models, not just a consumer-facing product.
You can find a full list of free AI tools on this site, and compare the best open-source AI models and agents in our dedicated section. Also see our breakdown of DeepSeek V4's open weights and Mistral Small 4's Apache license for the latest free model options to pair with Hermes.
🔑 Key Takeaways
- Hermes Agent is fully free under the MIT license, with all memory stored locally in
~/.hermes/โ zero telemetry, zero data sent to NousResearch or anyone else. - The self-improving learning loop is the core differentiator: after every complex task, the agent writes a SKILL.md document so it never has to solve the same problem from scratch again โ skills also self-refine during use.
- Model-agnostic design means you can switch between 200+ models on OpenRouter, local vLLM instances, or any OpenAI-compatible endpoint with a single command and no code changes.
- The serverless deployment options (Modal, Daytona) let you run a persistent agent that hibernates when idle and wakes in under two seconds โ making 24/7 operation affordable on nearly any budget.
- Hermes Agent's SKILL.md format is compatible with the open agentskills.io standard, enabling a community marketplace of reusable skills you can install with one command.
Frequently Asked Questions
Is Hermes Agent completely free to use?
Yes. Hermes Agent is released under the MIT license, which means you can use, modify, and distribute it without paying anything. The agent itself is free; you only pay for the AI model API calls you choose to make, and you can point it at free or locally hosted models to avoid any costs entirely.
Does Hermes Agent require a GPU to run?
No GPU is required to run Hermes Agent itself. The agent is a Python application that manages conversations, memory, and tool calls. It connects to external model providers like OpenRouter or Nous Portal for inference. If you want to run a local model alongside it, you will need appropriate hardware, but the agent framework itself is CPU-only and runs fine on a $5 VPS.
What AI models can Hermes Agent use?
Hermes Agent is model-agnostic and supports 200+ models via OpenRouter, plus native integrations with Nous Portal, NovitaAI, NVIDIA NIM, Hugging Face, OpenAI, and any OpenAI-compatible local endpoint such as vLLM or Ollama. You switch models with the hermes model command and no code changes are required.
How does Hermes Agent store and protect my data?
All memory, skills, and conversation history are stored locally in the ~/.hermes/ directory on your own machine or server. Hermes Agent has zero telemetry and zero data collection. Nothing is sent to NousResearch. Your data never leaves the machine you choose to run it on.
How is Hermes Agent different from a regular AI chatbot?
Unlike chatbots that reset after each conversation, Hermes Agent runs persistently, remembers your projects and preferences across sessions, and automatically writes reusable skill documents when it solves complex problems. It can be reached via Telegram, Discord, or WhatsApp while running unattended on a server, making it a genuine autonomous agent rather than a conversational assistant.