The agent that actually does things

The difference between a chatbot and a coding agent is simple: the agent can do things. Read files, write code, run commands, search the project. If it can only generate text that you then have to copy-paste by hand, that's a chatbot with better prompting.

Aleph has a tool system. The model invokes them, the backend executes them, the result goes back to the model. And so on in a loop until the task is done.

Available tools

Read

read_file

Reads a project file's content. With line limits to avoid flooding the context.

Read

list_dir

Lists files and directories. Useful for the model to understand project structure before touching anything.

Read

grep

Searches text or regex in project files. Returns matches with line context.

Read

glob

Finds files by pattern (e.g. src/**/*.ts). To navigate large projects.

Write

write_file

Writes or replaces a full file. Restricted to the active project directory.

Write

edit

Replaces an exact block of text within a file. More precise than rewriting the whole thing.

Exec

bash

Runs a shell command in the project directory. Configurable timeout, output captured.

The permission system

Not all tools are equal. Reading a file is very different from running a shell command. That's why each tool has a risk level, and the agent mode determines what's allowed automatically and what requires confirmation:

Read: always allowed automatically. If it's only reading, let it read.
Write: in Build mode it's allowed automatically; in Plan mode it requires explicit confirmation.
Exec: always requires user confirmation, regardless of mode. Running arbitrary code is the most dangerous thing the agent can do.

Security isn't a checkbox. If the model wants to run rm -rf, the user should know before it happens.

Beyond risk levels, all file read/write tools are restricted to the active project directory. The model can't read ~/.ssh/id_rsa even if it tries, because the backend validates that the resolved path stays within the project tree.

The agent loop

The model receives the user message plus history.
If it wants to use a tool, it responds with a tool call instead of text.
The backend checks the risk level and asks for confirmation if needed.
The tool is executed and the result is added to history.
The model receives that result and decides whether it needs another tool or can respond now.

This repeats until the model generates a final text response (no tool call). The iteration count has a configurable limit to prevent infinite loops.

What we learned implementing this

The biggest challenge wasn't technical: it was prompt engineering. Getting the model to use tools efficiently (without redundant reads, without overwriting files that were fine) required a lot of iteration on the system prompt and on how we describe each tool.

We also found that GBNF grammar (which constrains model output to the correct JSON tool call format) helps reliability significantly. With free-form output, the model sometimes "forgot" the correct format. With GBNF, it can't.

The result is an agent that can read your project, understand what's going on, and propose concrete changes — without you having to copy-paste anything.