Memory¶
Give agents persistent memory with vector search, enabling context-aware responses across conversations.
from promptise import build_agent
from promptise.config import HTTPServerSpec
from promptise.memory import ChromaProvider
provider = ChromaProvider(
collection_name="agent_memory",
persist_directory=".promptise/chroma",
)
agent = await build_agent(
servers={"tools": HTTPServerSpec(url="http://localhost:8000/mcp")},
model="openai:gpt-5-mini",
memory=provider,
)
# Every ainvoke() now automatically searches memory and injects relevant context.
Concepts¶
Promptise memory has two layers:
- Auto-injection via
MemoryAgent-- before every agent invocation, relevant memories are searched and injected as aSystemMessage. The agent sees contextual history without needing explicit memory tools. - Provider protocol -- a simple async interface (
search,add,delete,close) that any backend can implement.
Three providers ship with the framework, covering development through production use cases.
MemoryProvider Protocol¶
All memory providers implement this async protocol:
class MemoryProvider(Protocol):
scope: MemoryScope # SHARED (default) or PER_USER
async def search(
self, query: str, *, limit: int = 5, user_id: str | None = None
) -> list[MemoryResult]: ...
async def add(
self, content: str, *, metadata: dict | None = None, user_id: str | None = None
) -> str: ...
async def delete(self, memory_id: str, *, user_id: str | None = None) -> bool: ...
async def purge_user(self, user_id: str) -> int: ...
async def close(self) -> None: ...
| Method | Returns | Description |
|---|---|---|
search(query, limit=5, user_id=None) |
list[MemoryResult] |
Search for memories relevant to the query. In PER_USER scope, results are filtered to that user. |
add(content, metadata=None, user_id=None) |
str |
Store a new memory, returns its ID. In PER_USER scope, stamps ownership. |
delete(memory_id, user_id=None) |
bool |
Delete a memory by ID. In PER_USER scope, only the owner can delete. |
purge_user(user_id) |
int |
Delete every entry owned by user_id. Returns the count removed. GDPR "right to erasure". |
close() |
None |
Release resources (connections, file handles) |
Memory Scopes — Shared vs Per-User¶
Every built-in provider supports two isolation modes controlled by the scope parameter:
| Scope | Behavior | Use case |
|---|---|---|
MemoryScope.SHARED (default) |
Legacy global pool. user_id is ignored. All callers read/write the same entries. |
Knowledge-base bots, shared org memory, public FAQ agents |
MemoryScope.PER_USER |
Every operation is scoped to a user_id. Users cannot read or delete each other's entries. |
Personal assistants, multi-tenant SaaS, compliance-sensitive workloads |
from promptise.memory import InMemoryProvider, MemoryScope, MemoryIsolationError
# Per-tenant isolation
p = InMemoryProvider(scope=MemoryScope.PER_USER)
await p.add("alice's note", user_id="alice")
await p.add("bob's note", user_id="bob")
assert (await p.search("note", user_id="alice"))[0].content == "alice's note"
assert await p.delete("some-id", user_id="bob") is False # not her entry
If a PER_USER provider is called without a user_id, it raises MemoryIsolationError — a fail-closed guarantee that nothing is ever stored or read without explicit ownership.
Auto-propagation from CallerContext¶
When a memory provider is attached to an agent via build_agent(memory=...), the wrapping MemoryAgent reads the current CallerContext.user_id from the async contextvar and passes it into every search/add/delete call automatically. Your handler code never has to thread user_id manually:
from promptise.agent import CallerContext
# In your request handler
caller = CallerContext(user_id="alice", metadata={"session_id": "sess-1"})
await agent.ainvoke(input, caller=caller)
# → memory provider sees user_id="alice" on every call
If no caller is set (e.g., background tasks), the MemoryAgent catches MemoryIsolationError during auto-search and simply skips memory injection — the agent still runs.
GDPR purge¶
Works on every provider. InMemoryProvider drops in-process entries. ChromaProvider deletes every entry whose metadata contains _promptise_user_id: alice. Mem0Provider delegates to Mem0's delete_all(user_id=…).
MemoryResult¶
Search results are returned as MemoryResult dataclasses:
| Field | Type | Description |
|---|---|---|
content |
str |
The stored text |
score |
float |
Relevance score (0.0 = no match, 1.0 = perfect) |
memory_id |
str |
Unique identifier for this entry |
metadata |
dict |
Provider-specific metadata |
Scores are clamped to [0.0, 1.0] on construction.
Providers¶
InMemoryProvider¶
Substring-search provider for testing and development. No persistence, no embeddings.
from promptise.memory import InMemoryProvider, MemoryScope
# Shared pool (default, legacy behavior)
provider = InMemoryProvider(max_entries=1_000)
# Or per-user isolated
provider = InMemoryProvider(scope=MemoryScope.PER_USER)
await provider.add("Pipeline had 5% error rate at 07:30")
await provider.add("User prefers dark mode")
results = await provider.search("error rate")
# Matches entries containing the substring "error rate"
| Feature | Value |
|---|---|
| Search method | Case-insensitive substring matching |
| Persistence | None (in-memory only) |
| Isolation | SHARED or PER_USER via scope= |
| Dependencies | None |
| Best for | Testing, development, ephemeral agents |
Not for production
InMemoryProvider has no semantic understanding. The query "deployment issues" will not match content containing "deploy". Use ChromaProvider or Mem0Provider for production workloads.
ChromaProvider¶
Local vector similarity search with automatic embedding generation. Wraps ChromaDB.
from promptise.memory import ChromaProvider
# Ephemeral (no persistence)
provider = ChromaProvider(collection_name="agent_memory")
# Persistent (survives restarts)
provider = ChromaProvider(
collection_name="agent_memory",
persist_directory=".promptise/chroma",
)
await provider.add(
"Pipeline had 5% error rate at 07:30",
metadata={"source": "health-check", "severity": "warning"},
)
# Semantic search -- finds related content even without exact matches
results = await provider.search("deployment issues")
| Feature | Value |
|---|---|
| Search method | Vector similarity (cosine distance) |
| Default embedding model | all-MiniLM-L6-v2 (runs locally, no API key) |
| Persistence | Optional (persist_directory parameter) |
| Isolation | SHARED or PER_USER via scope= |
| Dependencies | pip install "promptise[all]" |
| Best for | Production agents needing semantic recall |
Constructor parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
collection_name |
str |
"agent_memory" |
ChromaDB collection name |
persist_directory |
str \| None |
None |
Path for persistent storage (None = ephemeral) |
embedding_function |
Any |
None |
Custom ChromaDB embedding function |
scope |
MemoryScope |
SHARED |
SHARED (global pool) or PER_USER (metadata-filtered per tenant) |
In PER_USER mode the provider stamps every stored document with a _promptise_user_id metadata field and every search/delete uses ChromaDB's where= filter to restrict results to that owner.
Mem0Provider¶
Wraps Mem0 for hybrid vector + graph search. Can run fully local (with Ollama) or via the Mem0 cloud platform.
from promptise.memory import Mem0Provider, MemoryScope
# Per-user (recommended for multi-tenant deployments)
provider = Mem0Provider(scope=MemoryScope.PER_USER)
# user_id flows through from CallerContext on every call
await provider.add("User prefers dark mode", user_id="alice")
results = await provider.search("theme preferences", user_id="alice")
| Feature | Value |
|---|---|
| Search method | Hybrid vector + optional graph search |
| Persistence | Managed by Mem0 |
| Isolation | SHARED (default user) or PER_USER (per-call user_id override) |
| Dependencies | pip install "promptise[all]" |
| Best for | Multi-user agents, knowledge graphs, cloud deployments |
Constructor parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
user_id |
str |
"default" |
Fallback owner when scope=SHARED or no per-call override given |
agent_id |
str \| None |
None |
Optional agent identifier for multi-agent scoping |
config |
dict \| None |
None |
Mem0 configuration dict (for self-hosted setups) |
scope |
MemoryScope |
SHARED |
PER_USER makes per-call user_id= override the tenant for each operation |
MemoryAgent¶
MemoryAgent wraps any LangGraph agent with automatic memory context injection. Before every ainvoke(), it:
- Extracts the user query from the input
- Searches the memory provider for relevant content
- Injects matching results as a
SystemMessage - Invokes the inner agent
- Optionally stores the exchange in memory (if
auto_store=True)
from promptise.memory import MemoryAgent, ChromaProvider
provider = ChromaProvider(persist_directory=".promptise/chroma")
memory_agent = MemoryAgent(
inner=agent_graph,
provider=provider,
max_memories=5,
min_score=0.3,
timeout=5.0,
auto_store=True,
)
result = await memory_agent.ainvoke({"messages": [{"role": "user", "content": "Hello"}]})
| Parameter | Type | Default | Description |
|---|---|---|---|
inner |
Any |
required | The wrapped LangGraph agent |
provider |
MemoryProvider |
required | Memory provider instance |
max_memories |
int |
5 |
Max results to inject per invocation |
min_score |
float |
0.0 |
Min relevance score threshold |
timeout |
float |
5.0 |
Max seconds to wait for memory search |
auto_store |
bool |
False |
Auto-store each exchange after invocation |
Graceful degradation
If the memory provider fails (timeout, connection error), the agent continues normally without memory context. Memory never blocks execution.
Integration with build_agent¶
The simplest way to add memory is through build_agent():
from promptise import build_agent
from promptise.memory import ChromaProvider
provider = ChromaProvider(persist_directory=".promptise/chroma")
agent = await build_agent(
servers={"tools": HTTPServerSpec(url="http://localhost:8000/mcp")},
model="openai:gpt-5-mini",
memory=provider,
)
This automatically wraps the agent graph in a MemoryAgent.
Integration with Agent Runtime¶
In the Agent Runtime, memory is configured through ContextConfig:
from promptise.runtime import ProcessConfig, ContextConfig
config = ProcessConfig(
model="openai:gpt-5-mini",
context=ContextConfig(
memory_provider="chroma", # "in_memory", "chroma", or "mem0"
memory_auto_store=True, # Auto-store exchanges
memory_max=5, # Max memories per invocation
memory_min_score=0.3, # Min relevance score
memory_collection="agent_memory", # ChromaDB collection name
memory_persist_directory=".promptise/chroma",
conversation_max_messages=50, # Short-term buffer size
),
)
Or through a .agent manifest:
memory:
provider: chroma
auto_store: true
max: 5
min_score: 0.3
collection: agent_memory
persist_directory: .promptise/chroma
Security: Memory Sanitization¶
Injected memory content is sanitized before reaching the agent to mitigate prompt injection attacks. The sanitize_memory_content() function:
- Truncates content to a safe injection length (2,000 characters)
- Strips known prompt-injection patterns (
SYSTEM:,[INST],<<SYS>>, etc.) - Removes memory fence markers to prevent content from escaping the context block
The injected memory block is wrapped in <memory_context> fences with explicit instructions that the agent should treat the content as factual context only and not follow any instructions within it.
What's Next?¶
- Sandbox -- execute untrusted code safely in isolated containers
- Observability -- track token usage
- Context & State -- AgentContext and state management
- Conversation Management -- ConversationBuffer