Utilities API Reference¶
Root-level modules: approval, cache, events, fallback, guardrails, streaming, tool optimization, adaptive strategy, context engine, env resolution, types, and exceptions.
Environment Variables¶
Parse and resolve ${ENV_VAR} syntax with default values and recursive resolution.
resolve_env_var¶
promptise.env_resolver.resolve_env_var(value, *, context=None, allow_missing=False)
¶
Resolve environment variables in a string.
Supported syntax:

- `${VAR_NAME}`: required variable (raises if not found)
- `${VAR_NAME:-default}`: optional variable with a default value
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| value | str | String potentially containing `${VAR}` references. | required |
| context | str \| None | Optional context for error messages (e.g., "servers.math.url"). | None |
| allow_missing | bool | If True, leave unresolved vars as-is instead of raising. | False |
Returns:

| Type | Description |
|---|---|
| str | String with all environment variables resolved. |

Raises:

| Type | Description |
|---|---|
| EnvVarNotFoundError | If a required variable is not found and allow_missing is False. |
Examples:
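The generated page omits the example. As an illustration of the `${VAR}` / `${VAR:-default}` semantics described above, here is a minimal self-contained sketch using only the standard library (not the promptise implementation itself):

```python
import os
import re

# Matches ${VAR_NAME} and ${VAR_NAME:-default}.
_ENV_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")

def resolve(value: str, *, allow_missing: bool = False) -> str:
    def _sub(m: re.Match) -> str:
        name, default = m.group(1), m.group(2)
        if name in os.environ:
            return os.environ[name]
        if default is not None:
            return default          # ${VAR:-default} fallback
        if allow_missing:
            return m.group(0)       # leave the reference unresolved
        raise KeyError(f"Environment variable {name!r} not found")
    return _ENV_RE.sub(_sub, value)

os.environ["API_HOST"] = "api.example.com"
print(resolve("https://${API_HOST}/v1"))   # https://api.example.com/v1
print(resolve("${MISSING:-fallback}"))     # fallback
```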
resolve_env_in_dict¶
promptise.env_resolver.resolve_env_in_dict(data, *, context_prefix='')
¶
Recursively resolve environment variables in a dictionary.
Resolves ${VAR} references in all string values throughout a nested dictionary structure, including in lists and nested dicts.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, Any] | Dictionary potentially containing env var references. | required |
| context_prefix | str | Prefix for error context (e.g., "servers.math"). | '' |
Returns:

| Type | Description |
|---|---|
| dict[str, Any] | New dictionary with all env vars resolved. |

Raises:

| Type | Description |
|---|---|
| EnvVarNotFoundError | If any required variable is not found. |
Examples:
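Again the example is missing from the generated page. The recursive walk over nested dicts and lists can be sketched as follows (illustrative stand-in, not the promptise code):

```python
import os
import re
from typing import Any

_ENV_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")

def _resolve_str(value: str) -> str:
    # Substitute ${VAR} / ${VAR:-default} from the environment.
    return _ENV_RE.sub(
        lambda m: os.environ.get(
            m.group(1), m.group(2) if m.group(2) is not None else m.group(0)
        ),
        value,
    )

def resolve_in_dict(data: Any) -> Any:
    # Recurse through dicts and lists; resolve every string value.
    if isinstance(data, dict):
        return {k: resolve_in_dict(v) for k, v in data.items()}
    if isinstance(data, list):
        return [resolve_in_dict(v) for v in data]
    if isinstance(data, str):
        return _resolve_str(data)
    return data

os.environ["DB_HOST"] = "db.internal"
cfg = {"servers": [{"url": "postgres://${DB_HOST}:${DB_PORT:-5432}/app"}]}
print(resolve_in_dict(cfg))
```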
validate_all_env_vars_available¶
promptise.env_resolver.validate_all_env_vars_available(data)
¶
Check which environment variables are referenced but not available.
This is useful for pre-validation before attempting to load a config. Scans the entire data structure for ${VAR} references and checks if each variable is set in the environment.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, Any] | Parsed configuration data (dict from YAML/JSON). | required |
Returns:

| Type | Description |
|---|---|
| list[str] | List of missing environment variable names (empty if all available). Variables with defaults (`${VAR:-default}`) are not included. |
Examples:
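To illustrate the pre-validation scan described above, a self-contained sketch (the function name here is a stand-in for `validate_all_env_vars_available`):

```python
import os
import re
from typing import Any

_ENV_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")

def missing_env_vars(data: Any) -> list[str]:
    """Collect ${VAR} references (without defaults) that are not set."""
    missing: list[str] = []

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            for v in node.values():
                walk(v)
        elif isinstance(node, list):
            for v in node:
                walk(v)
        elif isinstance(node, str):
            for m in _ENV_RE.finditer(node):
                name, default = m.group(1), m.group(2)
                # A reference with a default never counts as missing.
                if default is None and name not in os.environ and name not in missing:
                    missing.append(name)

    walk(data)
    return missing

data = {"a": "${PATH}", "b": ["${NOT_SET_XYZ}", "${HAS_DEFAULT:-x}"]}
print(missing_env_vars(data))
```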
MCP Tool Discovery¶
ToolInfo¶
promptise.tools.ToolInfo
dataclass
¶
Human-friendly metadata for a discovered MCP tool.
MCPClientError¶
promptise.tools.MCPClientError
¶
Bases: RuntimeError
Raised when communicating with the MCP client fails.
Approval (Human-in-the-loop)¶
Pause the agent before destructive actions and wait for explicit human approval.
ApprovalRequest¶
promptise.approval.ApprovalRequest
dataclass
¶
A request for human approval of a tool call.
Attributes:

| Name | Type | Description |
|---|---|---|
| request_id | str | Unique cryptographic ID for this request. |
| tool_name | str | Name of the tool requiring approval. |
| arguments | dict[str, Any] | Tool arguments (redacted if configured). |
| agent_id | str \| None | Agent or process identifier. |
| caller_user_id | str \| None | User who triggered the agent (from CallerContext). |
| context_summary | str | Last few messages for reviewer context. |
| timestamp | float | When the request was created. |
| timeout | float | Seconds until auto-deny/allow. |
| metadata | dict[str, Any] | Developer-provided custom data. |
ApprovalDecision¶
promptise.approval.ApprovalDecision
dataclass
¶
A human's decision on an approval request.
Attributes:

| Name | Type | Description |
|---|---|---|
| approved | bool | Whether the tool call is approved. |
| modified_arguments | dict[str, Any] \| None | If the reviewer edited the arguments. |
| reviewer_id | str \| None | Who made the decision. |
| reason | str \| None | Optional explanation. |
| timestamp | float | When the decision was made. |
ApprovalHandler¶
promptise.approval.ApprovalHandler
¶
Bases: Protocol
Protocol for approval handlers.
Implementations receive an `ApprovalRequest` and must return an `ApprovalDecision`. The handler is async: it can await webhooks, poll APIs, or wait on queues.
CallbackApprovalHandler¶
promptise.approval.CallbackApprovalHandler
¶
Approval handler that delegates to an async Python callable.
The simplest handler: pass any `async def handler(request) -> decision` function and it will be called for each approval request.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| callback | Callable[..., Any] | Async callable that receives an `ApprovalRequest` and returns an `ApprovalDecision`. | required |
request_approval(request)
async
¶
Delegate to the user-provided callback.
WebhookApprovalHandler¶
promptise.approval.WebhookApprovalHandler
¶
Approval handler that POSTs to a webhook URL and polls for a decision.
Sends the `ApprovalRequest` as a JSON POST to `url`, then polls `poll_url` (or `url + "/" + request_id`) for the decision.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| url | str | Webhook URL to POST the approval request to. | required |
| secret | str \| None | HMAC secret for signing requests. If not provided, a random secret is generated (ephemeral; not useful for cross-process verification). | None |
| poll_url | str \| None | URL to poll for the decision. Defaults to `url + "/" + request_id`. | None |
| poll_interval | float | Seconds between poll attempts. | 2.0 |
| headers | dict[str, str] \| None | Custom HTTP headers (e.g., auth tokens). | None |
request_approval(request)
async
¶
POST request to webhook, poll for decision.
QueueApprovalHandler¶
promptise.approval.QueueApprovalHandler
¶
Approval handler using async queues for in-process UIs.
For Gradio, Streamlit, or other in-process UIs where the human reviewer is in the same Python process. The UI reads from `request_queue` and writes decisions back via `submit_decision`.
Example:

```python
handler = QueueApprovalHandler()

# UI thread reads approval requests
request = await handler.request_queue.get()

# Show to user, collect decision
handler.submit_decision(request.request_id, ApprovalDecision(approved=True))
```
Attributes:

| Name | Type | Description |
|---|---|---|
| request_queue | Queue[ApprovalRequest] | Queue of pending `ApprovalRequest` objects. |
request_approval(request)
async
¶
Enqueue request and wait for decision from the UI.
submit_decision(request_id, decision)
¶
Submit a decision for a pending request.
Called by the UI after the human reviewer makes a choice.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| request_id | str | The `request_id` of the pending request. | required |
| decision | ApprovalDecision | The reviewer's decision. | required |

Raises:

| Type | Description |
|---|---|
| KeyError | If no pending request with this ID exists. |
ApprovalPolicy¶
promptise.approval.ApprovalPolicy
¶
Configuration for human-in-the-loop approval.
Defines which tools require approval, how to request it, and what happens on timeout or repeated denial.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| tools | list[str] | Glob patterns for tool names that require approval. | required |
| handler | ApprovalHandler \| Callable[..., Any] | An `ApprovalHandler` instance or async callable. | required |
| timeout | float | Seconds to wait for a decision before applying `on_timeout`. | 300.0 |
| on_timeout | Literal['deny', 'allow'] | What to do when the timeout expires. | 'deny' |
| include_arguments | bool | Include tool arguments in the approval request. | True |
| redact_sensitive | bool | Run arguments through PII/credential detection before sending to the reviewer. Requires guardrails. | True |
| max_pending | int | Maximum concurrent pending approvals per agent. Additional tool calls are auto-denied. | 10 |
| max_retries_after_deny | int | If the agent retries a denied tool this many times, return a permanent denial message. | 3 |
redact_arguments(arguments)
async
¶
Redact sensitive data from arguments before sending to reviewer.
Uses the guardrails scanner's check_output if available.
Falls back to returning arguments as-is if guardrails are not
installed.
requires_approval(tool_name)
¶
Check if a tool name matches any approval pattern.
Uses `fnmatch` glob matching; supports `*` and `?` wildcards.
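The matching behavior can be demonstrated with the standard-library `fnmatch` directly (the pattern list below is hypothetical, not a promptise default):

```python
from fnmatch import fnmatch

# Hypothetical approval patterns, in the style ApprovalPolicy.tools accepts.
patterns = ["delete_*", "db_?_admin", "payments.*"]

def requires_approval(tool_name: str) -> bool:
    # A tool needs approval if it matches any glob pattern.
    return any(fnmatch(tool_name, p) for p in patterns)

print(requires_approval("delete_user"))   # True  (matches delete_*)
print(requires_approval("db_1_admin"))    # True  (? matches one char)
print(requires_approval("read_file"))     # False
```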
Semantic Cache¶
Serves cached responses for semantically similar queries. In-memory or Redis backend with per-user/per-session/shared scope isolation.
SemanticCache¶
promptise.cache.SemanticCache
¶
Semantic cache for agent responses.
Caches LLM responses by query similarity using local or cloud embeddings. Reduces API costs by 30-50% for workloads with repetitive queries.
Security: the default scope is per_user; each user gets an isolated cache partition. Without a CallerContext, nothing is cached. Cached responses always pass through output guardrails.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| backend | str | Cache backend: `"memory"` or `"redis"`. | 'memory' |
| redis_url | str \| None | Redis connection URL (when backend is `"redis"`). | None |
| embedding | EmbeddingProvider \| str \| None | An `EmbeddingProvider` instance or embedding model name. | None |
| similarity_threshold | float | Minimum cosine similarity for a cache hit. | 0.92 |
| default_ttl | int | Default time-to-live in seconds. | 3600 |
| scope | str | Cache isolation: `"per_user"`, `"per_session"`, or `"shared"`. | 'per_user' |
| max_entries_per_user | int | Max entries per scope partition. | 1000 |
| max_total_entries | int | Max entries across all scopes. | 100000 |
| encrypt_values | bool | Encrypt cached values at rest (Redis only). | False |
| ttl_patterns | dict[str, int] \| None | Regex → TTL overrides for time-sensitive queries. | None |
| invalidate_on_write | bool | Evict cache entries when write tools fire. | True |
| cache_multi_turn | bool | Cache multi-turn conversations (default: off). | False |
| shared_data_acknowledged | bool | Required when scope is `"shared"`. | False |
Example:

```python
# One-liner
cache = SemanticCache()

# Full config
cache = SemanticCache(
    backend="redis",
    redis_url="redis://localhost:6379",
    similarity_threshold=0.92,
    scope="per_user",
    ttl_patterns={r"current|now|today": 60},
)
agent = await build_agent(..., cache=cache)
```
check(query_text, *, context_fingerprint='', caller=None, model_id=None, instruction_hash='')
async
¶
Check for a cached response.
Returns the cached entry if a semantically similar query was
previously cached with the same context, model, and instructions.
Returns None on cache miss.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| query_text | str | The user's query. | required |
| context_fingerprint | str | Hash of memory + history context. | '' |
| caller | Any \| None | CallerContext used for scope isolation. | None |
| model_id | str \| None | LLM model identifier. | None |
| instruction_hash | str | Hash of the system instructions. | '' |
close()
async
¶
Close the cache backend (Redis connections, etc.).
Called automatically by agent.shutdown().
invalidate_for_write(tool_name, caller=None)
async
¶
Invalidate cache entries after a write operation.
Called automatically when a tool with read_only_hint=False
fires during an agent invocation.
purge_user(user_id)
async
¶
Remove all cached entries for a user.
Use for GDPR right-to-erasure compliance.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| user_id | str | The user whose cache to purge. | required |

Returns:

| Type | Description |
|---|---|
| int | Number of entries removed. |
stats()
async
¶
Get cache performance statistics.
store(query_text, response_text, output, *, context_fingerprint='', caller=None, model_id=None, instruction_hash='', tools_used=None)
async
¶
Store a response in the cache.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| query_text | str | The user's query. | required |
| response_text | str | The extracted response text. | required |
| output | Any | The full LangGraph output dict. | required |
| context_fingerprint | str | Hash of memory + history context. | '' |
| caller | Any \| None | CallerContext used for scope isolation. | None |
| model_id | str \| None | LLM model identifier. | None |
| instruction_hash | str | Hash of the system instructions. | '' |
| tools_used | list[str] \| None | List of tool names called during this invocation. | None |
warmup()
¶
Pre-load the embedding model.
Call at startup to avoid download/load latency on first cache check.
CacheEntry¶
promptise.cache.CacheEntry
dataclass
¶
A single cached response.
Attributes:

| Name | Type | Description |
|---|---|---|
| query_text | str | The original user query. |
| response_text | str | The extracted response text. |
| output | Any | The full LangGraph output dict (for returning to caller). |
| embedding | list[float] | The query embedding vector. |
| scope_key | str | Isolation scope key. |
| context_fingerprint | str | Hash of memory + history + prompt context. |
| model_id | str | LLM model that generated this response. |
| instruction_hash | str | Hash of the system instructions. |
| checksum | str | SHA-256 of response_text for corruption detection. |
| created_at | float | Monotonic timestamp of creation. |
| ttl | int | Time-to-live in seconds. |
| metadata | dict[str, Any] | Extra info (tools_used, token count, etc.). |
CacheStats¶
promptise.cache.CacheStats
dataclass
¶
Cache performance statistics.
Attributes:

| Name | Type | Description |
|---|---|---|
| hits | int | Number of cache hits. |
| misses | int | Number of cache misses. |
| stores | int | Number of entries stored. |
| evictions | int | Number of entries evicted. |
| hit_rate | float | Proportion of requests served from cache. |
EmbeddingProvider¶
promptise.cache.EmbeddingProvider
¶
Bases: Protocol
Protocol for embedding providers.
Implement this to plug in any embedding model or API:

```python
class MyProvider:
    async def embed(self, texts: list[str]) -> list[list[float]]:
        return my_model.encode(texts)
```
LocalEmbeddingProvider¶
promptise.cache.LocalEmbeddingProvider
¶
Local embedding via sentence-transformers.
Uses the same model loading pattern as tool optimization. If the same model is already loaded (e.g. for semantic tool selection), the instance is shared — no duplicate memory.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | str | Model name or local directory path. | DEFAULT_MODEL |
Example:

```python
provider = LocalEmbeddingProvider()  # all-MiniLM-L6-v2
provider = LocalEmbeddingProvider(model="BAAI/bge-small-en-v1.5")
provider = LocalEmbeddingProvider(model="/models/local/embeddings")
```
OpenAIEmbeddingProvider¶
promptise.cache.OpenAIEmbeddingProvider
¶
Embedding via OpenAI or Azure OpenAI API.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | str | Model name (e.g. `text-embedding-3-small`). | 'text-embedding-3-small' |
| api_key | str \| None | OpenAI API key. | None |
| base_url | str \| None | Custom base URL (for Azure or proxies). | None |
| azure_endpoint | str \| None | Azure OpenAI endpoint URL. | None |
| azure_deployment | str \| None | Azure deployment name. | None |
Example:

```python
# OpenAI
provider = OpenAIEmbeddingProvider(
    model="text-embedding-3-small",
    api_key="${OPENAI_API_KEY}",
)

# Azure OpenAI
provider = OpenAIEmbeddingProvider(
    model="text-embedding-3-small",
    azure_endpoint="https://xxx.openai.azure.com",
    azure_deployment="my-embedding",
    api_key="${AZURE_OPENAI_KEY}",
)
```
embed(texts)
async
¶
Embed texts via OpenAI API.
InMemoryCacheBackend¶
promptise.cache.InMemoryCacheBackend
¶
In-memory cache with numpy-based similarity search.
Stores entries per scope with LRU eviction. Thread-safe via asyncio (single event loop). No persistence — lost on restart.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| max_entries_per_scope | int | Max entries per scope partition. | 1000 |
| max_total_entries | int | Max entries across all scopes. | 100000 |
invalidate(scope_key, pattern=None)
async
¶
Evict entries matching a pattern (or all for scope).
purge_user(user_id)
async
¶
Remove all entries for a user (GDPR compliance).
search(scope_key, embedding, threshold)
async
¶
Find the best matching entry above threshold.
store(scope_key, entry)
async
¶
Store an entry, evicting LRU if at capacity.
RedisCacheBackend¶
promptise.cache.RedisCacheBackend
¶
Redis-backed cache with vector similarity search.
Stores cache entries as JSON in Redis hashes. Embeddings are stored per scope and similarity is computed by fetching all embeddings for a scope and running numpy dot product locally. This avoids requiring the RediSearch module while keeping similarity search functional.
Optional AES encryption at rest via encrypt_values=True.
Encryption key is read from PROMPTISE_CACHE_KEY env var or
auto-generated per process.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| redis_url | str | Redis connection URL (e.g. `redis://localhost:6379`). | 'redis://localhost:6379' |
| max_entries_per_scope | int | Max entries per scope partition. | 1000 |
| max_total_entries | int | Max entries across all scopes. | 100000 |
| encrypt_values | bool | Encrypt cached response values at rest. | False |
close()
async
¶
Close the Redis connection.
invalidate(scope_key, pattern=None)
async
¶
Evict entries for a scope.
purge_user(user_id)
async
¶
Remove all entries for a user (GDPR compliance).
search(scope_key, embedding, threshold)
async
¶
Find the best matching entry above threshold.
store(scope_key, entry)
async
¶
Store an entry in Redis.
Event Notifications¶
Webhook + callback sinks for structured agent events (invocation errors, tool failures, budget violations, etc).
AgentEvent¶
promptise.events.AgentEvent
dataclass
¶
A structured notification event from the Promptise framework.
Attributes:

| Name | Type | Description |
|---|---|---|
| event_type | str | Dotted event name (e.g. `invocation.error`). |
| severity | str | Severity level (e.g. `info`, `error`). |
| timestamp | float | When the event occurred. |
| agent_id | str \| None | Agent or process identifier. |
| user_id | str \| None | User who triggered the action (from CallerContext). |
| session_id | str \| None | Conversation session ID if applicable. |
| data | dict[str, Any] | Event-specific payload (tool name, error message, etc.). |
| metadata | dict[str, Any] | Agent configuration, model ID, etc. |
EventSink¶
promptise.events.EventSink
¶
Bases: Protocol
Protocol for event notification sinks.
Sinks receive events from the `EventNotifier` and deliver them to external systems (webhooks, logs, callbacks, etc.).
emit(event)
async
¶
Deliver a single event.
EventNotifier¶
promptise.events.EventNotifier
¶
Central event coordinator that routes events to configured sinks.
Events are placed on an async queue and delivered by a background task. The agent never blocks waiting for event delivery.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| sinks | list[EventSink] | List of `EventSink` instances to deliver events to. | required |
| max_queue_size | int | Maximum events in the delivery queue. When full, new events are dropped with a warning. | 1000 |
Example:

```python
notifier = EventNotifier(sinks=[
    WebhookSink("https://hooks.slack.com/...", events=["invocation.error"]),
    CallbackSink(my_handler),
])
await notifier.start()
notifier.emit_sync(AgentEvent(event_type="invocation.start", severity="info"))
await notifier.stop()
```
emit(event)
async
¶
Queue an event for delivery (non-blocking).
If the queue is full, the event is dropped with a warning log.
emit_sync(event)
¶
Queue an event from a synchronous context.
Used by the LangChain callback handler (which is synchronous). If the queue is full, the event is silently dropped.
start()
async
¶
Start the background event delivery task.
stop()
async
¶
Drain remaining events and stop the background task.
WebhookSink¶
promptise.events.WebhookSink
¶
Deliver events via HTTP POST to a webhook URL.
Features: HMAC-SHA256 signing, retry with exponential backoff, SSRF protection, per-event filtering, payload redaction.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| url | str | Webhook URL to POST events to. | required |
| events | list[str] \| None | Event types to subscribe to. | None |
| headers | dict[str, str] \| None | Custom HTTP headers (e.g. auth tokens). | None |
| secret | str \| None | HMAC secret for signing payloads. If not provided, a random secret is generated. | None |
| max_retries | int | Maximum retry attempts on failure. | 3 |
| retry_delay | float | Initial retry delay in seconds (doubles each retry). | 1.0 |
| redact_sensitive | bool | Scan payloads for PII/credentials before sending. | True |
| min_severity | str \| None | Minimum severity level to emit. | None |
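The HMAC-SHA256 signing mentioned above implies a matching verification step on the receiving end. A sketch using only the standard library (the exact signature encoding and header name used by promptise are not documented here, so treat the details as assumptions):

```python
import hashlib
import hmac

def sign(payload: bytes, secret: str) -> str:
    # HMAC-SHA256 over the raw request body, hex-encoded.
    return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()

def verify(payload: bytes, secret: str, signature: str) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(sign(payload, secret), signature)

body = b'{"event_type": "invocation.error"}'
sig = sign(body, "shared-secret")
print(verify(body, "shared-secret", sig))  # True
print(verify(body, "wrong-secret", sig))   # False
```

Note the docstring's caveat: an auto-generated secret only lives in the sending process, so cross-process verification requires passing `secret` explicitly.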
CallbackSink¶
promptise.events.CallbackSink
¶
Deliver events to a Python callable.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| callback | Callable[..., Any] | Async or sync callable that receives an `AgentEvent`. | required |
| events | list[str] \| None | Event types to subscribe to. | None |
| min_severity | str \| None | Minimum severity level to emit. | None |
emit(event)
async
¶
Call the callback with the event.
LogSink¶
promptise.events.LogSink
¶
Deliver events to Python's logging system.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| events | list[str] \| None | Event types to subscribe to. | None |
| logger_name | str | Logger name. | 'promptise.events' |
| min_severity | str \| None | Minimum severity level to emit. | None |
emit(event)
async
¶
Log the event as a structured JSON line.
EventBusSink¶
promptise.events.EventBusSink
¶
Bridge events to the runtime's EventBus for inter-process notifications.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| event_bus | Any | The runtime EventBus, or any object exposing a compatible publish interface. | required |
| events | list[str] \| None | Event types to subscribe to. | None |
emit(event)
async
¶
Publish the event to the EventBus.
default_pii_sanitizer¶
promptise.events.default_pii_sanitizer(data)
¶
Redact PII and credentials from a data dictionary.
Serialises the dict to JSON, applies regex patterns, and deserialises back. Safe to call on any dict; returns the original on serialization failure.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, Any] | Arbitrary dict (event payload, observability metadata, etc.). | required |

Returns:

| Type | Description |
|---|---|
| dict[str, Any] | A new dict with sensitive values replaced by placeholders. |
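The serialize-substitute-deserialize round trip described above can be sketched as follows. The two patterns shown are illustrative; the real `default_pii_sanitizer` ships a much larger pattern set:

```python
import json
import re
from typing import Any

# Illustrative redaction patterns (assumed, not promptise's actual list).
_PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"), "[API_KEY]"),
]

def sanitize(data: dict[str, Any]) -> dict[str, Any]:
    try:
        text = json.dumps(data)
    except (TypeError, ValueError):
        return data  # unserialisable: return the original untouched
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return json.loads(text)

event = {"user": "alice@example.com", "note": "key sk-abcdefghijklmnopqrstuv"}
print(sanitize(event))
```

Working on the JSON string (rather than walking the dict) means nested values and list items are covered by the same pass.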
Model Fallback¶
Automatic provider failover. Uses a circuit breaker to route around unhealthy providers without adding latency.
FallbackChain¶
promptise.fallback.FallbackChain
¶
Bases: BaseChatModel
Chain of LLM models with automatic failover.
Tries models in order. If one fails (exception, timeout), the next is tried. Each model has an independent circuit breaker: after failure_threshold consecutive failures, the model is skipped for recovery_timeout seconds before being tested again.
Passes through to build_agent(model=...) seamlessly: it's a BaseChatModel subclass, so LangChain treats it like any other model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| models | Sequence[str \| BaseChatModel] \| None | Ordered list of model identifier strings or `BaseChatModel` instances. | None |
| timeout_per_model | float | Maximum seconds per model attempt. | 0 |
| global_timeout | float | Maximum seconds across ALL attempts combined. | 0 |
| failure_threshold | int | Consecutive failures before a model's circuit breaker opens. | 3 |
| recovery_timeout | float | Seconds before a tripped circuit breaker allows a test request. | 60.0 |
| on_fallback | Any | Optional callback invoked on failover. | None |
Raises:

| Type | Description |
|---|---|
| ValueError | If `models` is empty. |
| RuntimeError | If all models in the chain fail. |
active_model
property
¶
Return the first non-skipped model's name.
model_name
property
¶
Return the model that last served a request.
Before any request is made, returns the primary model's name. After a request, returns the model that actually served it; this is what observability and cache use.
get_chain_status()
¶
Get the health status of each model in the chain.
Returns:

| Type | Description |
|---|---|
| list[dict[str, Any]] | List of dicts with per-model health status. |
Security Guardrails¶
Multi-head security scanner: prompt injection (DeBERTa ML), PII detection (69 regex patterns), credential detection (96 patterns), NER (GLiNER), content safety, custom rules.
PromptiseSecurityScanner¶
promptise.guardrails.PromptiseSecurityScanner
¶
Unified security scanner for agent input and output.
Compose detection heads to build exactly the scanner you need. Each head is a standalone config object — plug in what matters, leave out what doesn't.
Composable API (recommended):

```python
from promptise.guardrails import (
    PromptiseSecurityScanner,
    InjectionDetector,
    PIIDetector,
    CredentialDetector,
    ContentSafetyDetector,
    NERDetector,
    CustomRule,
)

scanner = PromptiseSecurityScanner(
    detectors=[
        InjectionDetector(),
        PIIDetector(categories={PIICategory.CREDIT_CARDS, PIICategory.SSN}),
        CredentialDetector(categories={CredentialCategory.AWS}),
    ],
    custom_rules=[
        CustomRule(name="internal_id", pattern=r"INT-\d{8}"),
    ],
)
```

One-liner defaults (all heads enabled):

```python
scanner = PromptiseSecurityScanner.default()
```

Flat API (backward compatible):

```python
scanner = PromptiseSecurityScanner(
    detect_injection=True,
    detect_pii={PIICategory.CREDIT_CARDS, PIICategory.SSN},
    detect_credentials={CredentialCategory.AWS},
)
```
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| detectors | list[Any] \| None | List of detector instances to enable. | None |
| custom_rules | list[CustomRule \| dict[str, Any]] \| None | List of `CustomRule` objects (or equivalent dicts). | None |
| detect_injection | bool | (flat API) Enable injection detection. | True |
| detect_pii | bool \| set[PIICategory] | (flat API) Enable PII detection. | True |
| detect_toxicity | bool | (flat API) Enable toxicity detection. | True |
| detect_credentials | bool \| set[CredentialCategory] | (flat API) Enable credential detection. | True |
check_input(text)
async
¶
Scan input text. Raises `GuardrailViolation` on block.
Called by the agent before any processing (memory, tools, LLM).
check_output(output)
async
¶
Scan output text. Redacts PII/credentials. Blocks on injection.
Called by the agent after the LLM response, before returning.
default()
classmethod
¶
Create a scanner with all detection heads enabled (defaults).
Equivalent to:

```python
PromptiseSecurityScanner(detectors=[
    InjectionDetector(),
    PIIDetector(),
    CredentialDetector(),
])
```
list_credential_patterns()
staticmethod
¶
Return names of all built-in credential patterns.
list_pii_patterns()
staticmethod
¶
Return names of all built-in PII patterns.
scan_text(text, *, direction='input')
async
¶
Run all enabled detection heads on the given text.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text to scan. | required |
| direction | str | `"input"` or `"output"`. | 'input' |
Returns:

| Type | Description |
|---|---|
| ScanReport | A `ScanReport` containing all findings. |
warmup()
¶
Pre-load ML models so the first scan is fast.
Call this at startup to avoid download/load latency on the first message. Safe to call multiple times (models are cached).
Example:

```python
scanner = PromptiseSecurityScanner()
scanner.warmup()  # downloads + loads models NOW
agent = await build_agent(..., guardrails=scanner)
```
SecurityFinding¶
promptise.guardrails.SecurityFinding
dataclass
¶
A single detection result from a scanner.
Attributes:

| Name | Type | Description |
|---|---|---|
| detector | str | Which detection head found this (injection/pii/toxicity/credential). |
| category | str | Specific sub-category. |
| severity | Severity | How severe this finding is. |
| confidence | float | Model confidence, or 1.0 for regex matches. |
| matched_text | str | The text span that matched. |
| start | int | Character offset in the original text. |
| end | int | Character offset in the original text. |
| action | Action | What should happen (block/redact/warn). |
| description | str | Human-readable explanation. |
| metadata | dict[str, Any] | Extra information (model scores, pattern name, etc.). |
ScanReport¶
promptise.guardrails.ScanReport
dataclass
¶
Complete scan result with all findings and metadata.
Attributes:

| Name | Type | Description |
|---|---|---|
| passed | bool | True if no findings have action=BLOCK. |
| findings | list[SecurityFinding] | All detections from all scanners. |
| duration_ms | float | Total scan time in milliseconds. |
| scanners_run | list[str] | Which detection heads ran. |
| text_length | int | Length of scanned text. |
| redacted_text | str \| None | Text with PII/credentials replaced (output scans). |
PIICategory¶
promptise.guardrails.PIICategory
¶
Bases: str, Enum
PII detection categories. Pass a set of these to `PromptiseSecurityScanner(detect_pii={...})` to enable only specific PII types.

Example:

```python
scanner = PromptiseSecurityScanner(
    detect_pii={PIICategory.CREDIT_CARDS, PIICategory.SSN, PIICategory.EMAIL},
)
```
CredentialCategory¶
promptise.guardrails.CredentialCategory
¶
Bases: str, Enum
Credential detection categories. Pass a set of these to `PromptiseSecurityScanner(detect_credentials={...})`.

Example:

```python
scanner = PromptiseSecurityScanner(
    detect_credentials={
        CredentialCategory.AWS,
        CredentialCategory.OPENAI,
        CredentialCategory.GITHUB,
    },
)
```
Severity¶
promptise.guardrails.Severity
¶
Bases: str, Enum
Severity level of a security finding.
Action¶
promptise.guardrails.Action
¶
Bases: str, Enum
Action to take when a finding is detected.
Streaming¶
StreamEvent¶
promptise.streaming.StreamEvent
dataclass
¶
Base event yielded by astream_with_tools().
All events have a type string and a timestamp.
Use to_dict() for JSON serialization or to_json() for SSE.
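The `to_json()`-for-SSE idea can be sketched with a stand-in dataclass (the real `StreamEvent` carries more fields, and the SSE framing below is an assumption about usage, not a promptise API):

```python
import json
import time
from dataclasses import asdict, dataclass

@dataclass
class Event:  # stand-in for promptise.streaming.StreamEvent
    type: str
    timestamp: float

    def to_dict(self) -> dict:
        return asdict(self)

    def to_json(self) -> str:
        return json.dumps(self.to_dict())

def to_sse(event: Event) -> str:
    # One SSE frame: event name + JSON data, terminated by a blank line.
    return f"event: {event.type}\ndata: {event.to_json()}\n\n"

frame = to_sse(Event(type="token", timestamp=time.monotonic()))
print(frame)
```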
Attributes:

| Name | Type | Description |
|---|---|---|
| type | str | Event type identifier. |
| timestamp | float | Monotonic time when the event was created. |
ToolStartEvent¶
promptise.streaming.ToolStartEvent
dataclass
¶
Bases: StreamEvent
A tool has started executing.
Attributes:

| Name | Type | Description |
|---|---|---|
| tool_name | str | Raw MCP tool name. |
| tool_display_name | str | Human-readable name. |
| arguments | dict[str, Any] | Tool arguments (redacted if guardrails active). |
| tool_index | int | 0-based index of this tool call in the invocation. |
ToolEndEvent¶
promptise.streaming.ToolEndEvent
dataclass
¶
Bases: StreamEvent
A tool has finished executing.
Attributes:

| Name | Type | Description |
|---|---|---|
| tool_name | str | Raw MCP tool name. |
| tool_summary | str | One-line summary of the result. |
| duration_ms | float | Execution time in milliseconds. |
| success | bool | Whether the tool completed without error. |
| tool_index | int | 0-based index of this tool call. |
TokenEvent¶
promptise.streaming.TokenEvent
dataclass
¶
Bases: StreamEvent
An LLM token has been generated.
Attributes:

| Name | Type | Description |
|---|---|---|
| text | str | The token text. |
| cumulative_text | str | All text generated so far in this invocation. |
DoneEvent¶
promptise.streaming.DoneEvent
dataclass
¶
Bases: StreamEvent
The agent has finished processing.
Attributes:

| Name | Type | Description |
|---|---|---|
| full_response | str | Complete response text (post-guardrail redaction). |
| tool_calls | list[dict[str, Any]] | Summary of all tool calls made. |
| duration_ms | float | Total invocation time in milliseconds. |
| cache_hit | bool | Whether the response came from cache. |
ErrorEvent¶
promptise.streaming.ErrorEvent
dataclass
¶
Bases: StreamEvent
An error occurred during processing.
The message is always generic; internal details are never exposed.
Attributes:

| Name | Type | Description |
|---|---|---|
| message | str | Human-readable error description. |
| recoverable | bool | Whether the agent might retry. |
Tool Optimization¶
Static schema minification + semantic tool selection (local embeddings) to cut prompt tokens by 40-70% on agents with many tools.
OptimizationLevel¶
promptise.tool_optimization.OptimizationLevel
¶
Bases: str, Enum
Preset optimization levels.
- `MINIMAL`: schema minification + description truncation.
- `STANDARD`: deeper minification + nested description stripping.
- `SEMANTIC`: all static optimizations + per-invocation semantic tool selection.
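The static side of this (schema minification plus description truncation) can be sketched as a recursive pass over a JSON schema. The keys stripped here are assumptions for illustration, not promptise's exact list:

```python
from typing import Any

# Keys assumed to be non-essential for tool calling (illustrative).
_STRIP_KEYS = {"title", "examples", "$schema"}

def minify_schema(schema: Any, max_description: int = 80) -> Any:
    if isinstance(schema, dict):
        out = {}
        for key, value in schema.items():
            if key in _STRIP_KEYS:
                continue  # drop keys the LLM does not need
            if key == "description" and isinstance(value, str):
                out[key] = value[:max_description]  # truncate long docs
                continue
            out[key] = minify_schema(value, max_description)
        return out
    if isinstance(schema, list):
        return [minify_schema(v, max_description) for v in schema]
    return schema

schema = {
    "title": "AddArgs",
    "type": "object",
    "properties": {"a": {"type": "number", "description": "x" * 200}},
}
print(minify_schema(schema))
```

Every character removed from a tool schema is saved on every invocation that includes the tool, which is where the token reductions quoted above come from.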
ToolOptimizationConfig¶
promptise.tool_optimization.ToolOptimizationConfig
dataclass
¶
Configuration for MCP tool token optimization.
Pass a preset `level` for sensible defaults, or override individual settings for fine-grained control. Any field set explicitly takes precedence over the preset.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `level` | `OptimizationLevel \| None` | Preset optimization level. | `None` |
| `minify_schema` | `bool \| None` | Strip | `None` |
| `max_description_length` | `int \| None` | Truncate tool descriptions at N chars. | `None` |
| `strip_nested_descriptions` | `bool \| None` | Remove descriptions from nested model fields (keeps top-level field descriptions). | `None` |
| `max_schema_depth` | `int \| None` | Flatten nested objects beyond this depth to | `None` |
| `semantic_selection` | `bool \| None` | Enable per-invocation semantic tool selection. | `None` |
| `semantic_top_k` | `int \| None` | Number of most-relevant tools to select per invocation. | `None` |
| `always_include_fallback` | `bool \| None` | Include a | `None` |
| `embedding_model` | `str \| None` | Model name or local path for | `None` |
| `preserve_tools` | `set[str] \| None` | Tool names that are never optimized and always included in semantic selection. | `None` |
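The static minification pass can be pictured as a pure transform over a tool's JSON schema. The function below is a simplified sketch of the idea (truncating descriptions and stripping nested ones), not the library's actual implementation; its depth rule is an assumption for illustration.

```python
def minify_schema(schema: dict, *, max_description_length: int = 80,
                  strip_nested: bool = True, _depth: int = 0) -> dict:
    """Illustrative sketch: shrink a tool's JSON schema to save prompt tokens.

    Depth 0 is the tool root; its top-level fields sit under
    "properties" at depth 2. Anything deeper counts as "nested".
    """
    out = {}
    for key, value in schema.items():
        if key == "description" and isinstance(value, str):
            if strip_nested and _depth > 2:
                continue  # drop descriptions on nested model fields
            out[key] = value[:max_description_length]  # truncate, keep
        elif isinstance(value, dict):
            out[key] = minify_schema(
                value,
                max_description_length=max_description_length,
                strip_nested=strip_nested,
                _depth=_depth + 1,
            )
        else:
            out[key] = value
    return out

schema = {
    "description": "Adds two integers." + " padding" * 30,
    "properties": {
        "point": {
            "type": "object",
            "description": "top-level field, kept",
            "properties": {"x": {"type": "number", "description": "nested, dropped"}},
        }
    },
}
small = minify_schema(schema, max_description_length=40)
```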
Adaptive Strategy¶
Failure classification and strategy learning. Agents track past failures, categorize them, and adjust behavior.
FailureCategory¶
promptise.strategy.FailureCategory
¶
Bases: str, Enum
Classification of a tool failure.
FailureLog¶
promptise.strategy.FailureLog
dataclass
¶
A single tool failure record.
Attributes:

| Name | Type | Description |
|---|---|---|
| `tool_name` | `str` | Name of the tool that failed. |
| `error_type` | `str` | Exception class name (e.g. |
| `error_message` | `str` | Error message (truncated to 500 chars). |
| `category` | `FailureCategory` | Classified failure category. |
| `args_preview` | `str` | Truncated preview of the failed arguments. |
| `timestamp` | `float` | When the failure occurred. |
| `confidence` | `float` | Classification confidence (0.0–1.0). |
| `invocation_id` | `str \| None` | Optional invocation identifier. |
AdaptiveStrategyConfig¶
promptise.strategy.AdaptiveStrategyConfig
dataclass
¶
Configuration for the adaptive strategy system.
Attributes:

| Name | Type | Description |
|---|---|---|
| `enabled` | `bool` | Enable adaptive learning (default: disabled). |
| `synthesis_threshold` | `int` | Number of strategy failures before triggering LLM synthesis. Infrastructure failures don't count. |
| `synthesis_model` | `str \| None` | LLM model ID for synthesis/verification. Defaults to the agent's own model. |
| `max_strategies` | `int` | Maximum stored strategies (oldest dropped first). |
| `auto_cleanup` | `bool` | Delete raw failure logs after synthesis. |
| `strategy_ttl` | `int` | Strategy expiry in seconds (0 = never expire). |
| `failure_retention` | `int` | Maximum raw failure logs to keep. |
| `verify_human_feedback` | `bool` | Use LLM-as-judge to verify human corrections. |
| `feedback_rate_limit` | `int` | Max corrections per hour per user. |
| `scope` | `str` | Strategy isolation — |
AdaptiveStrategyManager¶
promptise.strategy.AdaptiveStrategyManager
¶
Manages adaptive strategy learning for an agent.
Sits on top of a `MemoryProvider` and handles failure recording, classification, strategy synthesis, human feedback verification, and strategy retrieval.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `AdaptiveStrategyConfig` | Adaptive strategy configuration. | required |
| `memory` | `Any` | The agent's memory provider for storing strategies. | required |
| `agent_model` | `str \| None` | Default LLM model ID for synthesis/verification. | `None` |
| `guardrails` | `Any \| None` | Optional guardrails scanner for feedback validation. | `None` |
config
property
¶
The adaptive strategy configuration.
format_strategy_block(strategies)
¶
Format strategies for injection into the agent's system prompt.
Wraps strategies in `<strategy_context>` fences with an anti-injection disclaimer.
get_relevant_strategies(query, *, limit=3)
async
¶
Search for strategies relevant to the current query.
Returns strategies sorted by confidence (highest first). Expired strategies (past TTL) are excluded.
record_failure(failure)
async
¶
Record a tool failure. Only strategy failures are stored.
Infrastructure failures (MCP server down, network errors) are skipped — the agent shouldn't learn from infra problems. Unknown failures are stored with low confidence.
record_human_correction(correction, *, evidence=None, sender_id=None)
async
¶
Process a human correction ("you did this wrong").
Validates the correction against guardrails and optionally verifies it via LLM-as-judge before storing.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `correction` | `str` | The human's feedback text. | required |
| `evidence` | `dict[str, Any] \| None` | Tool call history and output for verification. | `None` |
| `sender_id` | `str \| None` | Who sent the correction (for rate limiting + audit). | `None` |

Returns:

| Type | Description |
|---|---|
| `bool` | |
synthesize()
async
¶
Synthesize strategies from accumulated failure logs.
Asks the LLM to reflect on recent failures and produce actionable strategies. Returns the number of strategies created.
classify_failure¶
promptise.strategy.classify_failure(error_type, error_message)
¶
Classify an error as infrastructure, strategy, or unknown.
Deterministic — no LLM call. Based on error type name and message pattern matching.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `error_type` | `str` | The exception class name. | required |
| `error_message` | `str` | The error message text. | required |

Returns:

| Type | Description |
|---|---|
| `FailureCategory` | The classified `FailureCategory`. |
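The deterministic split between infrastructure and strategy failures can be approximated with simple pattern matching. The patterns and type sets below are illustrative guesses at this kind of classifier, not the library's actual rules:

```python
import re

# Assumed, illustrative patterns; the real classifier's rules are not shown here.
INFRA_PATTERNS = [
    r"connection refused", r"timed? ?out", r"server (is )?down",
    r"network", r"503", r"unavailable",
]
INFRA_TYPES = {"ConnectionError", "TimeoutError", "OSError"}

def classify_failure(error_type: str, error_message: str) -> str:
    """Illustrative: returns 'infrastructure', 'strategy', or 'unknown'."""
    msg = error_message.lower()
    if error_type in INFRA_TYPES or any(re.search(p, msg) for p in INFRA_PATTERNS):
        return "infrastructure"  # don't learn from infra problems
    if error_type in {"ValueError", "TypeError", "KeyError"}:
        return "strategy"  # likely a bad argument choice by the agent
    return "unknown"

print(classify_failure("ConnectionError", "connection refused"))  # infrastructure
print(classify_failure("ValueError", "invalid date format"))      # strategy
```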
Context Engine¶
Token budgeting across multiple context layers with model-aware window detection.
ContextLayer¶
promptise.context_engine.ContextLayer
dataclass
¶
A single context source in the assembly pipeline.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | `str` | Unique layer identifier (e.g. |
| `priority` | `int` | Assembly priority (0-10). Higher = kept longer when trimming. Required layers (priority >= 9) are never dropped. |
| `content` | `str` | The text content for this layer. |
| `required` | `bool` | If |
| `trim_strategy` | `str` | How to trim this layer — |
| `metadata` | `dict[str, Any]` | Developer-provided metadata for reporting. |
token_estimate
property
¶
Rough token estimate (chars / 3.5) for quick checks.
ContextReport¶
promptise.context_engine.ContextReport
dataclass
¶
Report of context assembly — what was included, trimmed, and total usage.
Attributes:

| Name | Type | Description |
|---|---|---|
| `total_tokens` | `int` | Total tokens in the assembled context. |
| `budget` | `int` | The model's context window minus response reserve. |
| `layers` | `list[dict[str, Any]]` | Per-layer token counts and trim status. |
| `trimmed_layers` | `list[str]` | Layers that were trimmed or dropped. |
| `utilization` | `float` | Fraction of budget used (0.0-1.0). |
utilization
property
¶
Fraction of budget used.
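A priority-based assembly loop over `ContextLayer`-like records might look as follows. The trimming logic here is a deliberately simplified sketch (drop the lowest-priority optional layer until the budget fits), using the documented chars/3.5 token estimate; the real engine also supports per-layer trim strategies.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    """Stand-in for ContextLayer with a subset of the documented fields."""
    name: str
    priority: int          # 0-10; >= 9 is treated as required
    content: str
    required: bool = False

def estimate_tokens(text: str) -> int:
    return int(len(text) / 3.5)  # rough chars/3.5 heuristic from the docs

def assemble(layers: list[Layer], budget: int) -> tuple[str, list[str]]:
    """Keep high-priority layers; drop optional ones until under budget."""
    kept = sorted(layers, key=lambda l: l.priority, reverse=True)
    dropped: list[str] = []
    while sum(estimate_tokens(l.content) for l in kept) > budget:
        optional = [l for l in kept if not l.required and l.priority < 9]
        if not optional:
            break  # only required layers left; never drop those
        victim = min(optional, key=lambda l: l.priority)
        kept.remove(victim)
        dropped.append(victim.name)
    return "\n\n".join(l.content for l in kept), dropped

layers = [
    Layer("system", 10, "You are a helpful agent." * 4, required=True),
    Layer("history", 5, "earlier chat turns " * 40),
    Layer("scratch", 2, "low-value notes " * 40),
]
context, dropped = assemble(layers, budget=120)
```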
Tokenizer¶
promptise.context_engine.Tokenizer
¶
Bases: Protocol
Protocol for token counters.
Implementations must provide a `count(text) -> int` method that returns the exact (or best-estimate) token count for a string.
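Any object with a matching `count(text) -> int` method satisfies the protocol; no inheritance is needed. A minimal heuristic implementation (assuming nothing beyond the protocol's documented shape):

```python
from typing import Protocol

class Tokenizer(Protocol):
    """Structural type: anything with count(text) -> int qualifies."""
    def count(self, text: str) -> int: ...

class HeuristicTokenizer:
    """Satisfies the Tokenizer protocol with a chars/3.5 estimate."""
    def count(self, text: str) -> int:
        return max(1, int(len(text) / 3.5)) if text else 0

tok: Tokenizer = HeuristicTokenizer()  # type-checks via structural subtyping
print(tok.count("hello world"))  # 11 chars -> 3
```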
Callback Handler¶
Bridges LangChain callbacks into the observability collector.
PromptiseCallbackHandler¶
promptise.callback_handler.PromptiseCallbackHandler
¶
Bases: BaseCallbackHandler
LangChain callback handler → ObservabilityCollector bridge.
Captures every LLM turn, tool call, token count, latency, retry, and error. Designed to be the only integration point needed between LangChain's event system and Promptise's observability.
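The bridge pattern itself is simple: each callback hook appends a record to a collector. This stand-in sketch skips the real `BaseCallbackHandler` base class so it runs standalone; the hook names follow LangChain's callback interface, but the simplified signatures and the collector shape are assumptions.

```python
import time

class CollectorBridge:
    """Illustrative stand-in for a callbacks -> collector bridge."""
    def __init__(self) -> None:
        self.events: list[dict] = []
        self._tool_started: dict[str, float] = {}

    def on_tool_start(self, name: str, **kwargs) -> None:
        # Record when the tool began so we can compute latency later.
        self._tool_started[name] = time.monotonic()

    def on_tool_end(self, name: str, **kwargs) -> None:
        started = self._tool_started.pop(name, time.monotonic())
        self.events.append({
            "kind": "tool_call",
            "tool": name,
            "duration_ms": (time.monotonic() - started) * 1000,
        })

bridge = CollectorBridge()
bridge.on_tool_start("search")
bridge.on_tool_end("search")
```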
Exceptions¶
SuperAgentError¶
promptise.exceptions.SuperAgentError
¶
Bases: RuntimeError
Base exception for all SuperAgent-related errors.
This is the base class for all exceptions raised by the .superagent file loading and validation system. Catching this exception will catch all SuperAgent-specific errors.
Examples:
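Because every loader and validation error derives from `SuperAgentError`, a single `except` clause covers them all. The classes below are stand-ins mirroring the documented hierarchy so the snippet runs standalone:

```python
class SuperAgentError(RuntimeError):
    """Stand-in for promptise.exceptions.SuperAgentError."""

class EnvVarNotFoundError(SuperAgentError):
    """Stand-in subclass, as documented."""

def load_config() -> dict:
    # A loader might raise any SuperAgentError subclass.
    raise EnvVarNotFoundError("environment variable not set: API_KEY")

try:
    load_config()
except SuperAgentError as exc:  # catches all SuperAgent-specific errors
    caught = str(exc)

print(caught)
```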
SuperAgentValidationError¶
promptise.exceptions.SuperAgentValidationError
¶
Bases: SuperAgentError
Raised when .superagent schema validation fails.
This exception is raised when a .superagent file fails Pydantic schema validation. It includes detailed error information to help users fix the configuration.
Attributes:

| Name | Type | Description |
|---|---|---|
| `errors` | | List of Pydantic validation errors with location and message. |
| `file_path` | | Path to the file that failed validation. |
Examples:
>>> raise SuperAgentValidationError(
... "Schema validation failed",
... errors=[{"loc": ["agent", "model"], "msg": "field required"}],
... file_path="/path/to/agent.superagent"
... )
__init__(message, *, errors=None, file_path=None)
¶
Initialize validation error with details.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `str` | Human-readable error message. | required |
| `errors` | `list[dict[str, Any]] \| None` | List of Pydantic validation errors (from `ValidationError.errors()`). | `None` |
| `file_path` | `str \| None` | Path to the configuration file that failed validation. | `None` |
__str__()
¶
Format validation errors for display.
Returns:

| Type | Description |
|---|---|
| `str` | Formatted error message including file path and detailed validation errors. |
EnvVarNotFoundError¶
promptise.exceptions.EnvVarNotFoundError
¶
Bases: SuperAgentError
Raised when a required environment variable is not found.
This exception is raised when a configuration references an environment variable using ${VAR_NAME} syntax, but the variable is not set in the environment.
Attributes:

| Name | Type | Description |
|---|---|---|
| `var_name` | | The name of the missing environment variable. |
| `context` | | Optional context about where the variable was referenced (e.g., "servers.math.url"). |
Examples:
__init__(var_name, context=None)
¶
Initialize environment variable error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
var_name
|
str
|
Name of the missing environment variable. |
required |
context
|
str | None
|
Optional context string indicating where the variable was referenced (e.g., field path in configuration). |
None
|