Back to blog
Memory July 2026 · 6 min read

Aleph learns to remember (without the cloud)

An agent that closes a session and forgets everything isn't that different from a chatbot with amnesia. We gave Aleph a persistent, small, curated memory that survives across sessions.

Every Aleph session used to end at zero. You close the app, open it again, and the agent has no idea what conventions your project uses, or that you've already told it three times you want explanations in Spanish and kept short. Whatever it "learned" during a long task lived and died with that conversation.

We looked at how other agents solve this — Claude Code, Codex/OpenCode, and in particular Hermes Agent by NousResearch, which has a fairly explicit memory design — and ended up with our own version, adapted to something those agents don't always have to deal with: small local models, which generalize worse and need an extra safety layer before we trust what they decide to save.

Two files, two scopes

Project
MEMORY.md
Lives in .agent-aleph/, inside the project folder. It can be committed and shared with the team, just like AGENTS.md.
User
USER.md
Lives outside the project, in the app's data folder. It applies to every session, no matter which folder you're working in.

One tool: memory

Instead of a separate tool for every combination of action and scope, there's just one: memory, with two explicit arguments — scope ("project" or "user") and action ("add", "replace", or "remove"). With small models, fewer tool names to keep track of beats many narrowly-named ones.

{"action": "add", "scope": "project", "content": "This project uses 2-space indentation"}

A hard budget, not infinite memory

MEMORY.md is capped at 2,200 characters, USER.md at 1,375. When a write would exceed that, the tool doesn't silently truncate — it returns an error asking the model to consolidate old entries with replace or remove before adding more. Memory is curated knowledge, not a log that grows forever.

Permissions: why it isn't automatic

Write
memory
In Build mode it asks for confirmation before saving (same as write_file or edit); in Plan mode it's flatly denied.

We made that call on purpose: a large cloud model rarely picks the wrong thing to remember; a small local one does, more often. If the agent could write memory without anyone reviewing it, one bad call today silently pollutes every future session.

Memory nobody reviews isn't memory — it's accumulated noise.

Priority rules (and no secrets, ever)

Since MEMORY.md can be shared via git, the agent's prompt includes explicit instructions: safety and permission rules always win, project instructions (AGENTS.md/CLAUDE.md) win over memory on technical conflicts, project memory wins over user memory for this repo's conventions, and credentials, tokens, or API keys are never — under any circumstance — stored there.

What we learned testing it

The first real test ran on a very small model (0.8B parameters). We asked it to save a personal preference ("explain things in Spanish and keep it short"), and the agent saved it… to project memory instead of user memory, with the content paraphrased and a stray word mixed in from Portuguese that nobody asked for. With a larger, coding-focused model, the same request came out right: correct scope, faithful content.

That wasn't a bug in the system — it was exactly the reason write access isn't automatic. The tool did what it was told; what failed was the model's judgment. With imperfect local models, any feature that depends on the model "deciding well" needs that safety brake, full stop.

The result is a small, explicit, reviewable memory — not a giant vector store guessing at what's relevant. For an agent running on models that sometimes get it wrong, we chose predictability over magic.

Previous: Aleph learns to google