sagent: a Python API for coding agents

Sagent is a strongly typed Python API and CLI for building coding and developer agents.

We wanted an open-source coding-agent API that could hot-swap providers and models without losing context.

That means the same agent loop should work on top of Anthropic, OpenAI, Google, Kimi, Qwen, MiniMax, and self-hosted models. It should be usable as a terminal coding assistant, but also as normal Python code: import an agent, give it tools, run it inside a script, spawn reviewers, switch models, persist sessions, and inspect the typed results.

That became sagent.

The public package is about 31k lines of typed Python code with another 26k lines of tests. The main design rule is still simple: everything that crosses the runtime boundary is a message, and every surface uses the same agent loop.

pip install sagent

import asyncio

from sagent import tools
from sagent.agent import Agent
from sagent.lib.json import json_freeze
from sagent.providers import Google


async def main() -> None:
    agent = Agent(
        model=Google.from_env().model("gemini-2.5-flash"),
        system="You are a scientist.",
        tools=[tools.Read(), tools.Glob(), tools.Grep()],
    )
    # `await` needs a running event loop, so a script wraps the call in a coroutine.
    result = await agent.run(json_freeze({"prompt": "analyze the CSV in ./data/"}))
    print(result.content)


asyncio.run(main())

Everything is a message

The core idea in sagent is simple: everything that crosses the runtime boundary is a Message.

Text, bytes, JSON, tool calls, tool results, model responses, user prompts, compaction summaries, and multi-part assistant turns all move through the same typed message graph. Conceptually, a message is content plus a MIME descriptor, such as text/plain, application/json, or multipart/x-tool-call.
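
To make "content plus a MIME descriptor" concrete, here is a minimal sketch of what such a type could look like. The Message class, the text and tool_call helpers, and the way a tool call is split into parts are all assumptions made for illustration; they are not sagent's actual definitions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message type: a MIME descriptor plus content."""

    mime: str
    # Text, raw bytes, or nested parts for multipart messages.
    content: str | bytes | tuple[Message, ...]

    def is_multipart(self) -> bool:
        return self.mime.startswith("multipart/")


def text(body: str) -> Message:
    return Message("text/plain", body)


def tool_call(name: str, arguments_json: str) -> Message:
    # Assumed layout: the tool name as a text part, the arguments as a JSON part.
    return Message(
        "multipart/x-tool-call",
        (text(name), Message("application/json", arguments_json)),
    )


call = tool_call("Grep", '{"pattern": "TODO"}')
for part in call.content:
    print(part.mime, part.content)
```

Because every part is itself a Message, code that stores, renders, or forwards messages only needs to handle one shape.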

That decision removes a lot of special plumbing. Providers, tools, sessions, compaction, the CLI, Slack, parent agents, and child agents all speak the same shape. The runtime does not need one representation for model output, another for tool calls, another for persisted sessions, and another for UI events.

The second core idea is that an agent owns an inbox.

while True:
    drain inbox into user messages
    call model
    if tool calls exist: dispatch tools and loop
    if inbox is empty and model is done: go idle

The inbox is a deque. User messages go to the front, so a person can interrupt a running session. Background completions, peer-agent messages, delayed wakeups, and tool results go to the back. The agent keeps draining, running, and checking until there is nothing left to do.
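
The ordering rule and the loop can be sketched with the standard library's collections.deque. Everything here is illustrative: Inbox, scripted_model, and run_until_idle are hypothetical names, the model is a scripted stand-in rather than a provider call, and messages are plain tuples instead of typed Message objects.

```python
from collections import deque


class Inbox:
    """Illustrative mailbox; not sagent's actual class."""

    def __init__(self):
        self._queue = deque()

    def push_user(self, text):
        # Front of the deque: a person can cut in ahead of queued work.
        self._queue.appendleft(("user", text))

    def push_background(self, source, text):
        # Back of the deque: tool results, peer messages, wakeups.
        self._queue.append((source, text))

    def drain(self):
        items = []
        while self._queue:
            items.append(self._queue.popleft())
        return items

    def __len__(self):
        return len(self._queue)


def scripted_model(history):
    # Stand-in for a provider call: request one tool, then finish.
    if not any(role == "tool" for role, _ in history):
        return {"text": "searching", "tool_calls": [("grep", "TODO")]}
    return {"text": "done", "tool_calls": []}


def run_until_idle(inbox, model, tools):
    history = []
    while True:
        history.extend(inbox.drain())
        reply = model(history)
        history.append(("assistant", reply["text"]))
        for name, argument in reply["tool_calls"]:
            inbox.push_background("tool", tools[name](argument))
        if not reply["tool_calls"] and len(inbox) == 0:
            return history  # nothing left to do: go idle


# Ordering: the user message arrives last but is drained first.
demo = Inbox()
demo.push_background("tool", "grep finished")
demo.push_background("peer", "review ready")
demo.push_user("stop and summarize")
drained = demo.drain()

# One full pass through the loop with a fake tool.
inbox = Inbox()
inbox.push_user("find TODOs")
history = run_until_idle(inbox, scripted_model, {"grep": lambda p: f"2 matches for {p}"})
print(drained[0], history[-1])
```

Routing tool results back through the inbox means the loop has a single entry point for new input, whatever produced it.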

This is closer to an Erlang-style process than to a request-response wrapper. An agent has state, a mailbox, and a loop. It can wake, receive messages, spawn other agents, and keep working without each surface needing its own control plane.

Tools are the core abstraction

A Tool is anything with a schema and a run method. It takes a Message, which may be multipart, and returns a Message. Streaming tools yield intermediate Message events and finish with a final result message.

That makes Agent fit the same shape. An agent can be used directly, but it can also be treated as a streaming tool: send it a message, stream its events, and receive its final message.

This is why AgentSpawn is small conceptually. Spawning an agent is just calling another agent-shaped tool with an isolated model, tool set, session, and depth limit. Recursion falls out of the type rather than out of a separate orchestration layer.
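
A rough sketch of that shape, under loud assumptions: Tool, Upper, MiniAgent, and final are hypothetical names, messages are plain strings, and the "agent" pipes its input through its tools in order instead of asking a model what to call. The point is only that an agent exposing the same schema and run surface can be nested as a tool, with a depth limit enforced at spawn time.

```python
from typing import Iterator, List, Protocol


class Tool(Protocol):
    schema: dict

    def run(self, message: str) -> Iterator[str]:
        """Yield zero or more events, then the final result as the last item."""
        ...


def final(tool: Tool, message: str) -> str:
    # Drain the stream; by convention here the last item is the final result.
    result = ""
    for result in tool.run(message):
        pass
    return result


class Upper:
    schema = {"name": "upper"}

    def run(self, message: str) -> Iterator[str]:
        yield "event: uppercasing"
        yield message.upper()


class MiniAgent:
    """Agent-shaped tool: same schema/run surface, so agents nest as tools."""

    def __init__(self, name: str, tools: List[Tool], depth: int = 0, max_depth: int = 1):
        self.schema = {"name": name}
        self.tools = tools
        self.depth = depth
        self.max_depth = max_depth

    def spawn(self, name: str, tools: List[Tool]) -> "MiniAgent":
        if self.depth + 1 > self.max_depth:
            raise ValueError("spawn depth limit reached")
        return MiniAgent(name, tools, self.depth + 1, self.max_depth)

    def run(self, message: str) -> Iterator[str]:
        # A real agent would let a model choose tools; this one pipes the
        # message through each tool in order to keep the sketch runnable.
        for tool in self.tools:
            yield f"event: {self.schema['name']} -> {tool.schema['name']}"
            message = final(tool, message)
        yield message


parent = MiniAgent("parent", [])
child = parent.spawn("child", [Upper()])
parent.tools.append(child)  # the child is just another tool to the parent
events = list(parent.run("hello"))
print(events)
```

The parent never checks whether a tool is an agent; it only relies on the shared run contract.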

Agents can change themselves

Sagent has three built-in coordination tools: AgentSelf, AgentSend, and AgentSpawn.

AgentSelf lets an agent inspect and mutate its own state. It can update its status, compact context, clear history, inspect diagnostics, adjust token limits, and change models.

The model swap is a useful consequence of this design. You can hot-swap the backend conversationally while keeping the session context. Start with Claude, switch to Gemini, move to an OpenAI-compatible endpoint, then switch back. Provider normalization happens at the edge, so the agent loop keeps seeing the same typed model response shape.
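
One way to picture normalization at the edge is the sketch below. It is not sagent's provider layer: ModelResponse, Session, provider_a, and provider_b are hypothetical, and the two raw payload shapes are invented for the example. Each fake provider converts its own payload into the shared response type, so swapping the model leaves the history untouched.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


@dataclass(frozen=True)
class ModelResponse:
    """Shared shape that the loop sees, whatever backend produced it."""

    provider: str
    text: str


def provider_a(history: History) -> ModelResponse:
    # Invented raw payload shape, normalized before it leaves this function.
    raw = {"content": [{"type": "text", "text": f"a saw {len(history)} messages"}]}
    return ModelResponse("a", raw["content"][0]["text"])


def provider_b(history: History) -> ModelResponse:
    # A different invented payload shape, normalized to the same type.
    raw = {"choices": [{"message": {"content": f"b saw {len(history)} messages"}}]}
    return ModelResponse("b", raw["choices"][0]["message"]["content"])


@dataclass
class Session:
    model: Callable[[History], ModelResponse]
    history: History = field(default_factory=list)

    def ask(self, prompt: str) -> ModelResponse:
        self.history.append(("user", prompt))
        response = self.model(self.history)
        self.history.append(("assistant", response.text))
        return response


session = Session(model=provider_a)
first = session.ask("read the repo")
session.model = provider_b  # hot-swap: the history is kept as-is
second = session.ask("now summarize")
print(first.text, "|", second.text)
```

The second provider receives the full conversation, including the turn answered by the first one.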

That makes provider choice a runtime decision instead of an architectural one. Researchers can compare model behavior inside the same agent session. Framework builders can route different work to different backends. Coding-agent users can switch models mid-task without restarting from scratch.

Agents can talk to agents

AgentSend lets one live agent send a message to another live agent's inbox. This gives agents peer-to-peer communication instead of only parent-to-child calls.

AgentSpawn creates child agents. A parent can spawn an isolated reviewer, a specialized implementation agent, or a map-reduce worker. Spawned agents can also spawn more agents, subject to explicit depth and tool limits.

This matters because agent composition should not require a separate orchestration framework. In sagent, an agent follows the same protocol shape as a tool. A parent calls AgentSpawn, the child runs in isolation, and the child's final output returns as a normal tool result.

The same primitives cover common agent workflows:

  • ask one child agent to review code;
  • split a large search across many child agents;
  • keep a persistent background agent alive;
  • let two agents coordinate through inbox messages;
  • use different models or tool sets for different subtasks.

The important constraint is that these are still typed Python objects. You can construct them, test them, limit their tools, inspect their sessions, and embed them in another application.
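
As an illustration of the map-reduce workflow in the list above, here is a sketch using plain asyncio. child_search stands in for a spawned child agent and parent_search for the parent; both are hypothetical helpers and do not call AgentSpawn or any other sagent API.

```python
import asyncio
from typing import List


async def child_search(shard: List[str], needle: str) -> List[str]:
    # Stand-in for a child agent that searches one shard of the input.
    await asyncio.sleep(0)  # yield control, as a real model or tool call would
    return [line for line in shard if needle in line]


async def parent_search(lines: List[str], needle: str, workers: int = 3) -> List[str]:
    # Map: split the input into one shard per child and run them concurrently.
    shards = [lines[i::workers] for i in range(workers)]
    results = await asyncio.gather(*(child_search(shard, needle) for shard in shards))
    # Reduce: merge the children's results into one sorted list.
    return sorted(hit for hits in results for hit in hits)


lines = [
    "a.py: TODO fix",
    "b.py: ok",
    "c.py: TODO test",
    "d.py: ok",
    "e.py: TODO docs",
]
hits = asyncio.run(parent_search(lines, "TODO"))
print(hits)
```

In a real setup each child could run with its own model and a narrower tool set, and the parent would receive each child's final message as a tool result.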

A lightweight interface over a real runtime

For day-to-day use, sagent also works as a terminal coding assistant:

GOOGLE_API_KEY=... sagent --provider Google --model gemini-2.5-flash

The REPL is intentionally lightweight: closer to an IPython-like working session than to a full IDE. It has local tools for files, shell commands, web fetching, search, scholarly papers, and agent coordination. It persists sessions per working directory by default, tracks cost, and compacts old context when the session gets long.
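
Compaction can be pictured as replacing the oldest part of the history with a single summary entry once the session passes a size threshold. The sketch below is an assumption about the general technique, not sagent's implementation: compact is a hypothetical helper, the thresholds are arbitrary, and the summary is a placeholder string where a real system would likely ask a model to summarize.

```python
from typing import List, Tuple

History = List[Tuple[str, str]]


def compact(history: History, max_messages: int = 6, keep_recent: int = 3) -> History:
    # Short sessions are left alone.
    if len(history) <= max_messages:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    # Placeholder summary; a model-written summary would go here instead.
    summary = ("summary", f"{len(old)} earlier messages compacted")
    return [summary] + recent


long_history = [("user", f"message {i}") for i in range(8)]
compacted = compact(long_history)
print(compacted)
```

Keeping the most recent turns verbatim preserves the context the model is most likely to need next.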

The same Agent powers the CLI, Slack service, parent agents, child agents, and Python applications. Surfaces differ in how they put messages into the inbox and render events. They do not own separate agent logic.

What sagent is for

Use sagent when you want:

  • a strongly typed Python interface for coding agents;
  • provider and model hot-swapping without losing context or changing the agent loop;
  • custom tools as normal Python objects;
  • session persistence and compaction;
  • child agents and peer messaging for review, delegation, and map-reduce work.

It is not a sandbox. Enabled tools run with the current process permissions. Sessions are plaintext local state. If a task needs hard isolation, run sagent inside your own OS or container sandbox and give the agent a narrow tool set.

Sagent is also not trying to be every agent framework. There is no hosted service, desktop UI, browser automation, MCP integration, or LSP integration today. The focus is smaller: a typed Python runtime with concrete coordination primitives.

Try it

Sagent is open source under Apache 2.0.

If you build agent systems in Python, try it and tell us where the abstractions hold up and where they break.
