Building AI Agents¶

Build a production-ready AI agent in 9 incremental steps. Each step adds one capability -- from a bare agent that calls tools, to a fully observable agent with memory, sandboxed code execution, and cross-agent delegation.

This is the recommended starting point for building agents

This guide walks you through building a complete agent step by step. For deep reference on individual features, see the Building Agents, Memory, Sandbox, and Observability pages.

What You'll Build¶

An AI agent that connects to MCP servers, discovers tools automatically, remembers context across conversations, executes code safely in a sandbox, observes every action for debugging and compliance, and delegates work to peer agents. One function call creates it. Every capability is opt-in.

Concepts¶

Promptise agents are built around three ideas:

MCP-first tool discovery -- You point the agent at one or more MCP servers. On startup it connects to every server, lists all available tools, and converts them into typed tools automatically. No manual wiring.
Opt-in capabilities -- Observability, memory, sandbox execution, cross-agent delegation, and prompt flows are all disabled by default. Enable each one with a single parameter.
Model independence -- Any LLM model string ("openai:gpt-5-mini", "anthropic:claude-sonnet-4.5", "ollama:llama3"), any LangChain BaseChatModel, or any Runnable. Change one string, nothing else moves.

Step 1: Minimal Agent¶

Start with the simplest possible agent -- a model connected to an MCP server:

import asyncio
from promptise import build_agent
from promptise.config import HTTPServerSpec

async def main():
    agent = await build_agent(
        servers={
            "weather": HTTPServerSpec(url="http://localhost:8000/mcp"),
        },
        model="openai:gpt-5-mini",
    )

    result = await agent.ainvoke({
        "messages": [{"role": "user", "content": "What is the weather in Zurich?"}]
    })
    print(result["messages"][-1].content)
    await agent.shutdown()

asyncio.run(main())

build_agent() connects to the MCP server, discovers all available tools, converts their schemas, and returns a ready-to-use PromptiseAgent. The agent decides which tools to call based on the user's message.

Step 2: Multiple Servers and Instructions¶

Connect to multiple MCP servers and provide a system prompt:

from promptise.config import HTTPServerSpec, StdioServerSpec

agent = await build_agent(
    servers={
        "weather": HTTPServerSpec(url="http://localhost:8000/mcp"),
        "files": StdioServerSpec(command="python", args=["-m", "file_server"]),
        "database": HTTPServerSpec(
            url="https://db-api.internal/mcp",
            bearer_token="your-jwt-token",
        ),
    },
    model="openai:gpt-5-mini",
    instructions=(
        "You are a data analyst. Use the weather API for forecasts, "
        "the file server for reading reports, and the database for queries. "
        "Always cite your data sources."
    ),
)

The agent discovers tools from all three servers and has them available simultaneously. Tool names are automatically namespaced to avoid conflicts.

Switch models with one line:

# OpenAI
agent = await build_agent(servers=servers, model="openai:gpt-5-mini")

# Anthropic
agent = await build_agent(servers=servers, model="anthropic:claude-sonnet-4.5")

# Google
agent = await build_agent(servers=servers, model="google_genai:gemini-2.0-flash")

# Local Ollama
agent = await build_agent(servers=servers, model="ollama:llama3")

# Any LangChain model instance
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-5-mini", temperature=0)
agent = await build_agent(servers=servers, model=model)

Step 3: Memory¶

Give your agent persistent memory. Before every invocation, the agent automatically searches for relevant context and injects it into the system prompt.

from promptise.memory import ChromaProvider

# ChromaDB for local persistent vector search
memory = ChromaProvider(
    collection_name="analyst_memory",
    persist_directory=".promptise/chroma",
)

agent = await build_agent(
    servers=servers,
    model="openai:gpt-5-mini",
    memory=memory,
    memory_auto_store=True,  # Automatically store each exchange
)

# First conversation
result = await agent.ainvoke({
    "messages": [{"role": "user", "content": "The Q3 report shows 15% revenue growth."}]
})

# Later conversation -- agent remembers Q3 data
result = await agent.ainvoke({
    "messages": [{"role": "user", "content": "How did Q3 compare to projections?"}]
})
# Agent automatically retrieves the Q3 context from memory

Three providers ship with the framework:

Provider	Use case	Persistence
`InMemoryProvider`	Testing and prototyping	In-process only
`ChromaProvider`	Local vector search	Disk (configurable)
`Mem0Provider`	Enterprise graph search	External service

# In-memory (testing)
from promptise.memory import InMemoryProvider
memory = InMemoryProvider()

# ChromaDB (local, persistent)
from promptise.memory import ChromaProvider
memory = ChromaProvider(collection_name="my_agent", persist_directory="./data")

# Mem0 (enterprise)
from promptise.memory import Mem0Provider
memory = Mem0Provider(api_key="...", user_id="analyst-1")

Step 4: Observability¶

Enable full observability with a single flag. Every LLM turn, tool call, token count, latency, retry, and error is captured automatically.

agent = await build_agent(
    servers=servers,
    model="openai:gpt-5-mini",
    observe=True,  # That's it
)

result = await agent.ainvoke({
    "messages": [{"role": "user", "content": "Analyze the sales data"}]
})

# Get aggregate statistics
stats = agent.get_stats()
print(f"Total tokens: {stats['total_tokens']}")
print(f"Tool calls: {stats['tool_calls']}")
print(f"Duration: {stats['total_duration_ms']}ms")

# Generate an interactive HTML report
agent.generate_report("report.html", title="Sales Analysis")

For full control, pass an ObservabilityConfig:

from promptise.observability_config import ObservabilityConfig, ObserveLevel, TransporterType

agent = await build_agent(
    servers=servers,
    model="openai:gpt-5-mini",
    observe=ObservabilityConfig(
        level=ObserveLevel.FULL,           # OFF, BASIC, STANDARD, FULL
        session_name="production-audit",
        record_prompts=True,
        transporters=[
            TransporterType.HTML,           # Interactive HTML report
            TransporterType.STRUCTURED_LOG, # JSONL file
            TransporterType.CONSOLE,        # Live terminal output
            TransporterType.PROMETHEUS,     # Prometheus /metrics
            TransporterType.OTEL,           # OpenTelemetry spans
            TransporterType.WEBHOOK,        # HTTP POST on events
        ],
        output_dir="./reports",
        log_file="./logs/agent.jsonl",
    ),
)

Eight transporters available:

Transporter	Output	Use case
`HTML`	Interactive HTML report	Post-analysis, sharing
`JSON`	JSON file	Programmatic analysis
`STRUCTURED_LOG`	JSONL file	Log aggregation (ELK, Datadog)
`CONSOLE`	Live terminal output	Development debugging
`PROMETHEUS`	Prometheus metrics	Infrastructure monitoring
`OTEL`	OpenTelemetry spans	Distributed tracing
`WEBHOOK`	HTTP POST	Real-time notifications
`CALLBACK`	Python callable	Custom processing

Step 5: Sandboxed Code Execution¶

When agents write and execute code, run it inside a multi-layer security sandbox:

# Enable with defaults
agent = await build_agent(
    servers=servers,
    model="openai:gpt-5-mini",
    sandbox=True,
)

# Or configure resource limits and network mode
agent = await build_agent(
    servers=servers,
    model="openai:gpt-5-mini",
    sandbox={
        "network_mode": "restricted",  # "none", "restricted", "full"
        "memory_limit": "512M",
        "cpu_limit": 2,
        "timeout": 120,
    },
)

When sandbox is enabled, 5 tools are automatically injected into the agent:

Tool	Description
`execute_code`	Run Python code in the sandbox
`read_file`	Read a file from the sandbox workspace
`write_file`	Write a file to the sandbox workspace
`list_files`	List files in the sandbox workspace
`install_package`	Install a pip package in the sandbox

Security layers applied automatically:

Docker isolation -- code runs in a container, not on your host
Seccomp filtering -- blocks dangerous syscalls
Capability dropping -- removes ~40 Linux capabilities
Read-only rootfs -- only the workspace directory is writable
Resource limits -- CPU, memory, and time constraints
Network isolation -- configurable per agent (none/restricted/full)
Optional gVisor -- userspace kernel for additional isolation

Step 6: Cross-Agent Delegation¶

Let agents ask questions to peer agents via HTTP with JWT authentication:

from promptise.cross_agent import CrossAgent

agent = await build_agent(
    servers=servers,
    model="openai:gpt-5-mini",
    cross_agents={
        "researcher": CrossAgent(
            url="http://research-agent:8001",
            jwt_secret="shared-secret",
            description="Expert at finding and summarizing research papers.",
        ),
        "coder": CrossAgent(
            url="http://code-agent:8002",
            jwt_secret="shared-secret",
            description="Expert at writing and reviewing code.",
        ),
    },
)

This injects ask_agent_researcher and ask_agent_coder tools. The agent decides when to delegate:

User: "Find recent papers on transformer architectures and write a summary script"

Agent thinks: "I need the researcher for papers and the coder for the script"
→ calls ask_agent_researcher("Find recent papers on transformer architectures")
→ gets research results
→ calls ask_agent_coder("Write a Python script that summarizes these papers: ...")
→ combines results and responds

Broadcast to multiple peers:

# From code, send a question to all peers simultaneously
results = await agent.broadcast(
    "What is the current status of your subsystem?",
    timeout=30.0,
)
# Returns dict of agent_name → response, with graceful degradation on timeout

Step 7: SuperAgent Files¶

Define an entire agent declaratively in a .superagent YAML file:

# analyst.superagent
name: data-analyst
model: openai:gpt-5-mini
instructions: |
  You are a senior data analyst. Use available tools to query databases,
  generate visualizations, and produce reports. Always cite data sources.

servers:
  database:
    type: http
    url: http://localhost:8080/mcp
    bearer_token: "${DB_TOKEN}"
  files:
    type: http
    url: http://localhost:8081/mcp

memory:
  provider: chroma
  collection: analyst_memory
  persist_directory: .promptise/chroma
  auto_store: true

observability:
  level: standard
  transporters: [html, structured_log]
  output_dir: ./reports

sandbox:
  enabled: true
  network_mode: restricted
  memory_limit: 1G

cross_agents:
  researcher:
    url: http://research-agent:8001
    jwt_secret: "${CROSS_AGENT_SECRET}"
    description: Expert at finding research papers.

Load and run:

from promptise.superagent import load_superagent

agent = await load_superagent("analyst.superagent")
result = await agent.ainvoke({
    "messages": [{"role": "user", "content": "Analyze Q3 revenue trends"}]
})

Environment variables resolve automatically with ${VAR} and ${VAR:-default} syntax.

Step 8: Conversation Flows¶

Evolve the system prompt across conversation phases for sophisticated multi-turn agents:

from promptise.prompts import ConversationFlow, Phase

flow = ConversationFlow(
    phases={
        "greeting": Phase(
            blocks=["identity", "greeting_instructions"],
            transitions={"investigation": lambda ctx: ctx.turn > 1},
        ),
        "investigation": Phase(
            blocks=["identity", "investigation_rules", "tool_context"],
            transitions={"resolution": lambda ctx: ctx.state.get("has_diagnosis")},
        ),
        "resolution": Phase(
            blocks=["identity", "resolution_rules", "output_format"],
        ),
    },
    initial_phase="greeting",
)

agent = await build_agent(
    servers=servers,
    model="openai:gpt-5-mini",
    flow=flow,
)

The system prompt the agent sees on turn 1 is different from turn 5 and turn 10. Blocks appear and disappear as the conversation progresses. See the Prompt Engineering guide for the full prompt system.

Step 9: Conversation Persistence¶

Every chat application needs to persist conversations. The conversation store handles loading history, saving new exchanges, and managing sessions -- one parameter:

from promptise import build_agent
from promptise.config import HTTPServerSpec
from promptise.conversations import SQLiteConversationStore, generate_session_id

# Pick a backend: SQLite for dev, Postgres for production, Redis for ephemeral
store = SQLiteConversationStore("conversations.db")

agent = await build_agent(
    model="openai:gpt-5-mini",
    servers={
        "tasks": HTTPServerSpec(url="http://localhost:8080/mcp"),
    },
    conversation_store=store,
    conversation_max_messages=200,  # Rolling window (0 = unlimited)
)

# Use secure session IDs — never user-controlled or predictable
sid = generate_session_id()  # "sess_a1b2c3d4e5f6..."

# chat() handles everything: ownership check → load history → invoke → persist
response = await agent.chat(
    "Create a task to review the PR",
    session_id=sid,
    user_id="user-42",  # Locks this session to user-42
)
response = await agent.chat(
    "What task did I just create?",
    session_id=sid,
    user_id="user-42",  # Same user — allowed
)

# Session management (all operations verify ownership)
sessions = await agent.list_sessions(user_id="user-42")
await agent.update_session(sid, calling_user_id="user-42", title="PR Review")
await agent.delete_session(sid, user_id="user-42")

await agent.shutdown()  # Closes store connections

Four built-in stores, or implement the ConversationStore protocol for any backend:

Store	Backend	Use case
`InMemoryConversationStore`	Dict	Testing
`SQLiteConversationStore`	aiosqlite	Local dev
`PostgresConversationStore`	asyncpg	Production
`RedisConversationStore`	redis.asyncio	Ephemeral sessions

Conversation persistence works alongside memory (Step 3). Memory provides semantic search across all sessions ("what do I know about this user?"). The conversation store provides exact message replay within a session ("what did they say 3 messages ago?"). See Conversation Persistence for the full reference.

Complete Example¶

A fully-featured agent with MCP tools, memory, observability, sandbox, and cross-agent delegation:

import asyncio
from promptise import build_agent
from promptise.config import HTTPServerSpec
from promptise.memory import ChromaProvider
from promptise.cross_agent import CrossAgent
from promptise.observability_config import ObservabilityConfig, ObserveLevel, TransporterType

async def main():
    # Memory provider
    memory = ChromaProvider(
        collection_name="analyst",
        persist_directory=".promptise/chroma",
    )

    # Build the agent
    agent = await build_agent(
        servers={
            "database": HTTPServerSpec(
                url="http://localhost:8080/mcp",
                bearer_token="your-jwt-token",
            ),
            "files": HTTPServerSpec(url="http://localhost:8081/mcp"),
        },
        model="openai:gpt-5-mini",
        instructions=(
            "You are a senior data analyst. Query databases, analyze data, "
            "write scripts when needed, and produce clear reports. "
            "Always cite your data sources."
        ),
        memory=memory,
        memory_auto_store=True,
        observe=ObservabilityConfig(
            level=ObserveLevel.STANDARD,
            transporters=[TransporterType.HTML, TransporterType.STRUCTURED_LOG],
            output_dir="./reports",
        ),
        sandbox={"network_mode": "restricted", "memory_limit": "1G"},
        cross_agents={
            "researcher": CrossAgent(
                url="http://research-agent:8001",
                jwt_secret="shared-secret",
                description="Expert at finding research papers.",
            ),
        },
    )

    # Run a conversation
    result = await agent.ainvoke({
        "messages": [{"role": "user", "content": "Analyze Q3 revenue by region"}]
    })
    print(result["messages"][-1].content)

    # Check stats
    stats = agent.get_stats()
    print(f"\nTokens used: {stats['total_tokens']}")
    print(f"Tool calls: {stats['tool_calls']}")

    # Generate report
    agent.generate_report("analysis_report.html")

    await agent.shutdown()

asyncio.run(main())

CLI¶

The Promptise CLI provides quick access to common agent operations:

# Run a SuperAgent file
promptise agent analyst.superagent

# Validate a SuperAgent file
promptise validate analyst.superagent

# List tools discovered from MCP servers
promptise list-tools --server http://localhost:8080/mcp

# Run an agent interactively
promptise run --model openai:gpt-5-mini --server http://localhost:8080/mcp

# Serve an agent over HTTP
promptise serve analyst.superagent --port 8001

Step 10 — Custom Reasoning Patterns¶

Replace the default ReAct loop with a specialized reasoning pattern:

from promptise.engine import PromptGraph, PromptNode, NodeFlag
from promptise.engine.reasoning_nodes import PlanNode, ThinkNode, SynthesizeNode

agent = await build_agent(
    model="openai:gpt-5-mini",
    servers=my_servers,
    # Use a built-in pattern
    agent_pattern="deliberate",  # Think → Plan → Act → Observe → Reflect
)

# Or build a custom graph with reasoning nodes
graph = PromptGraph("my-agent", nodes=[
    PlanNode("plan", is_entry=True),
    PromptNode("act", inject_tools=True, flags={NodeFlag.RETRYABLE}),
    ThinkNode("think"),
    SynthesizeNode("answer", is_terminal=True),
])

agent = await build_agent(
    model="openai:gpt-5-mini",
    servers=my_servers,
    agent_pattern=graph,
)

10 built-in patterns: react (default), verify, managed, code-action, peoatr, research, autonomous, deliberate, debate, pipeline. See Reasoning Patterns.

Error Handling & Troubleshooting¶

Agent Invocation Errors¶

try:
    result = await agent.ainvoke(
        {"messages": [{"role": "user", "content": "Do something complex"}]}
    )
except Exception as exc:
    print(f"Agent failed: {exc}")
    # Common causes: LLM API down, all MCP servers unreachable, timeout

MCP Server Connection Failures¶

# Check which tools were discovered
stats = agent.get_stats()
print(f"Tools available: {stats.get('tools_count', 0)}")

# If 0 tools — server connection failed. Check:
# 1. Is the server running? (stdio: is the command correct? http: is the URL reachable?)
# 2. Are credentials valid? (bearer_token, api_key)
# 3. Is the server healthy? (check server logs)

Timeout Handling¶

agent = await build_agent(
    model="openai:gpt-5-mini",
    servers=my_servers,
    max_agent_iterations=25,   # Limit total reasoning steps
    timeout=120.0,             # Hard timeout in seconds
)

Guardrail Rejections¶

result = await agent.ainvoke(input)
last_msg = result["messages"][-1].content

# If guardrails blocked the input, the response will contain the rejection reason
# Check observability for details:
report = agent.generate_report()
print(report)

What's Next¶

Go deeper on each feature:

Feature used in this guide	Deep reference
`build_agent()`, `PromptiseAgent`	Building Agents
Server specs and connections	Server Configuration
Reasoning patterns and custom graphs	Reasoning Patterns
Reasoning Graph engine	Reasoning Graph
Memory providers and auto-injection	Memory
Observability levels and transporters	Observability
Sandbox configuration and security	Sandbox
Cross-agent delegation	Cross-Agent Delegation
SuperAgent YAML files	SuperAgent Files
CLI commands	CLI Reference

Other guides:

Building Production MCP Servers -- Build the tool servers your agents connect to
Building Agentic Runtime Systems -- Make agents autonomous with triggers and governance
Prompt Engineering -- Build reliable, testable system prompts