Skip to content

Utilities API Reference

Root-level modules: approval, cache, events, fallback, guardrails, streaming, tool optimization, adaptive strategy, context engine, env resolution, types, and exceptions.

Environment Variables

Parse and resolve ${ENV_VAR} syntax with default values and recursive resolution.

resolve_env_var

promptise.env_resolver.resolve_env_var(value, *, context=None, allow_missing=False)

Resolve environment variables in a string.

Supports syntax: - ${VAR_NAME} - Required variable (raises if not found) - ${VAR_NAME:-default} - Optional variable with default

Parameters:

Name Type Description Default
value str

String potentially containing ${VAR} references.

required
context str | None

Optional context for error messages (e.g., "servers.math.url").

None
allow_missing bool

If True, leave unresolved vars as-is instead of raising.

False

Returns:

Type Description
str

String with all environment variables resolved.

Raises:

Type Description
EnvVarNotFoundError

If a required variable is not found and allow_missing is False.

Examples:

>>> os.environ["API_KEY"] = "secret123"
>>> resolve_env_var("Bearer ${API_KEY}")
'Bearer secret123'
>>> resolve_env_var("${MISSING:-default}")
'default'
>>> resolve_env_var("${MISSING}")  # Raises EnvVarNotFoundError

resolve_env_in_dict

promptise.env_resolver.resolve_env_in_dict(data, *, context_prefix='')

Recursively resolve environment variables in a dictionary.

Resolves ${VAR} references in all string values throughout a nested dictionary structure, including in lists and nested dicts.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary potentially containing env var references.

required
context_prefix str

Prefix for error context (e.g., "servers.math").

''

Returns:

Type Description
dict[str, Any]

New dictionary with all env vars resolved.

Raises:

Type Description
EnvVarNotFoundError

If any required variable is not found.

Examples:

>>> os.environ["TOKEN"] = "abc123"
>>> resolve_env_in_dict({"auth": "${TOKEN}"})
{'auth': 'abc123'}
>>> resolve_env_in_dict({"server": {"url": "${API_URL}"}})
{'server': {'url': 'http://...'}}

validate_all_env_vars_available

promptise.env_resolver.validate_all_env_vars_available(data)

Check which environment variables are referenced but not available.

This is useful for pre-validation before attempting to load a config. Scans the entire data structure for ${VAR} references and checks if each variable is set in the environment.

Parameters:

Name Type Description Default
data dict[str, Any]

Parsed configuration data (dict from YAML/JSON).

required

Returns:

Type Description
list[str]

List of missing environment variable names (empty if all available).

list[str]

Variables with defaults (${VAR:-default}) are not included.

Examples:

>>> data = {"api_key": "${OPENAI_API_KEY}", "url": "${API_URL:-http://default}"}
>>> missing = validate_all_env_vars_available(data)
>>> if missing:
...     print(f"Missing env vars: {', '.join(missing)}")

MCP Tool Discovery

ToolInfo

promptise.tools.ToolInfo dataclass

Human-friendly metadata for a discovered MCP tool.

MCPClientError

promptise.tools.MCPClientError

Bases: RuntimeError

Raised when communicating with the MCP client fails.


Approval (Human-in-the-loop)

Pause the agent before destructive actions and wait for explicit human approval.

ApprovalRequest

promptise.approval.ApprovalRequest dataclass

A request for human approval of a tool call.

Attributes:

Name Type Description
request_id str

Unique cryptographic ID for this request.

tool_name str

Name of the tool requiring approval.

arguments dict[str, Any]

Tool arguments (redacted if configured).

agent_id str | None

Agent or process identifier.

caller_user_id str | None

User who triggered the agent (from CallerContext).

context_summary str

Last few messages for reviewer context.

timestamp float

When the request was created (time.time()).

timeout float

Seconds until auto-deny/allow.

metadata dict[str, Any]

Developer-provided custom data.

compute_hmac(secret)

Compute HMAC-SHA256 signature covering all request fields.

to_dict()

Serialize to a JSON-safe dict (for webhook payloads).

ApprovalDecision

promptise.approval.ApprovalDecision dataclass

A human's decision on an approval request.

Attributes:

Name Type Description
approved bool

Whether the tool call is approved.

modified_arguments dict[str, Any] | None

If the reviewer edited arguments.

reviewer_id str | None

Who made the decision.

reason str | None

Optional explanation.

timestamp float

When the decision was made.

ApprovalHandler

promptise.approval.ApprovalHandler

Bases: Protocol

Protocol for approval handlers.

Implementations receive an :class:ApprovalRequest and must return an :class:ApprovalDecision. The handler is async — it can await webhooks, poll APIs, or wait on queues.

CallbackApprovalHandler

promptise.approval.CallbackApprovalHandler

Approval handler that delegates to an async Python callable.

The simplest handler — pass any async def handler(request) -> decision function and it will be called for each approval request.

Parameters:

Name Type Description Default
callback Callable[..., Any]

Async callable that receives an :class:ApprovalRequest and returns an :class:ApprovalDecision.

required
request_approval(request) async

Delegate to the user-provided callback.

WebhookApprovalHandler

promptise.approval.WebhookApprovalHandler

Approval handler that POSTs to a webhook URL and polls for a decision.

Sends the :class:ApprovalRequest as a JSON POST to url. Then polls poll_url (or url + "/" + request_id) for the decision.

Parameters:

Name Type Description Default
url str

Webhook URL to POST the approval request to.

required
secret str | None

HMAC secret for signing requests. If not provided, a random secret is generated (ephemeral — not useful for cross-process verification).

None
poll_url str | None

URL to poll for the decision. Defaults to {url}/{request_id}.

None
poll_interval float

Seconds between poll attempts.

2.0
headers dict[str, str] | None

Custom HTTP headers (e.g., auth tokens).

None
request_approval(request) async

POST request to webhook, poll for decision.

QueueApprovalHandler

promptise.approval.QueueApprovalHandler

Approval handler using async queues for in-process UIs.

For Gradio, Streamlit, or other in-process UIs where the human reviewer is in the same Python process. The UI reads from :attr:request_queue and writes decisions to the handler via :meth:submit_decision.

Example::

handler = QueueApprovalHandler()

# UI thread reads approval requests
request = await handler.request_queue.get()
# Show to user, collect decision
handler.submit_decision(request.request_id, ApprovalDecision(approved=True))

Attributes:

Name Type Description
request_queue Queue[ApprovalRequest]

Queue of pending :class:ApprovalRequest objects.

request_approval(request) async

Enqueue request and wait for decision from the UI.

submit_decision(request_id, decision)

Submit a decision for a pending request.

Called by the UI after the human reviewer makes a choice.

Parameters:

Name Type Description Default
request_id str

The request_id from the :class:ApprovalRequest.

required
decision ApprovalDecision

The reviewer's decision.

required

Raises:

Type Description
KeyError

If no pending request with this ID exists.

ApprovalPolicy

promptise.approval.ApprovalPolicy

Configuration for human-in-the-loop approval.

Defines which tools require approval, how to request it, and what happens on timeout or repeated denial.

Parameters:

Name Type Description Default
tools list[str]

Glob patterns for tool names that require approval. Examples: ["send_email"], ["delete_*", "payment_*"].

required
handler ApprovalHandler | Callable[..., Any]

An :class:ApprovalHandler implementation or an async callable (ApprovalRequest) -> ApprovalDecision.

required
timeout float

Seconds to wait for a decision before applying on_timeout. Default: 300 (5 minutes).

300.0
on_timeout Literal['deny', 'allow']

What to do when timeout expires. "deny" (default) rejects the tool call. "allow" permits it.

'deny'
include_arguments bool

Include tool arguments in the approval request. Set to False to hide arguments from reviewers.

True
redact_sensitive bool

Run arguments through PII/credential detection before sending to the reviewer. Requires guardrails.

True
max_pending int

Maximum concurrent pending approvals per agent. Additional tool calls are auto-denied.

10
max_retries_after_deny int

If the agent retries a denied tool this many times, return a permanent denial message.

3
redact_arguments(arguments) async

Redact sensitive data from arguments before sending to reviewer.

Uses the guardrails scanner's check_output if available. Falls back to returning arguments as-is if guardrails are not installed.

requires_approval(tool_name)

Check if a tool name matches any approval pattern.

Uses fnmatch glob matching — supports * and ? wildcards.


Semantic Cache

Serves cached responses for semantically similar queries. In-memory or Redis backend with per-user/per-session/shared scope isolation.

SemanticCache

promptise.cache.SemanticCache

Semantic cache for agent responses.

Caches LLM responses by query similarity using local or cloud embeddings. Reduces API costs by 30-50% for workloads with repetitive queries.

Security: Default scope is per_user — each user gets an isolated cache partition. No CallerContext = no caching. Cached responses always pass through output guardrails.

Parameters:

Name Type Description Default
backend str

"memory" (default) or "redis".

'memory'
redis_url str | None

Redis connection URL (when backend is "redis").

None
embedding EmbeddingProvider | str | None

An :class:EmbeddingProvider, a model name string, or None for the default local model.

None
similarity_threshold float

Minimum cosine similarity for a cache hit.

0.92
default_ttl int

Default time-to-live in seconds.

3600
scope str

Cache isolation: "per_user" (default), "per_session", or "shared".

'per_user'
max_entries_per_user int

Max entries per scope partition.

1000
max_total_entries int

Max entries across all scopes.

100000
encrypt_values bool

Encrypt cached values at rest (Redis only).

False
ttl_patterns dict[str, int] | None

Regex → TTL overrides for time-sensitive queries.

None
invalidate_on_write bool

Evict cache when write tools fire.

True
cache_multi_turn bool

Cache multi-turn conversations (default: off).

False
shared_data_acknowledged bool

Required when scope is "shared".

False

Example::

# One-liner
cache = SemanticCache()

# Full config
cache = SemanticCache(
    backend="redis",
    redis_url="redis://localhost:6379",
    similarity_threshold=0.92,
    scope="per_user",
    ttl_patterns={r"current|now|today": 60},
)

agent = await build_agent(..., cache=cache)
check(query_text, *, context_fingerprint='', caller=None, model_id=None, instruction_hash='') async

Check for a cached response.

Returns the cached entry if a semantically similar query was previously cached with the same context, model, and instructions. Returns None on cache miss.

Parameters:

Name Type Description Default
query_text str

The user's query.

required
context_fingerprint str

Hash of memory + history context.

''
caller Any | None

:class:CallerContext for scope isolation.

None
model_id str | None

LLM model identifier.

None
instruction_hash str

Hash of the system instructions.

''
close() async

Close the cache backend (Redis connections, etc.).

Called automatically by agent.shutdown().

invalidate_for_write(tool_name, caller=None) async

Invalidate cache entries after a write operation.

Called automatically when a tool with read_only_hint=False fires during an agent invocation.

purge_user(user_id) async

Remove all cached entries for a user.

Use for GDPR right-to-erasure compliance.

Parameters:

Name Type Description Default
user_id str

The user whose cache to purge.

required

Returns:

Type Description
int

Number of entries removed.

stats() async

Get cache performance statistics.

store(query_text, response_text, output, *, context_fingerprint='', caller=None, model_id=None, instruction_hash='', tools_used=None) async

Store a response in the cache.

Parameters:

Name Type Description Default
query_text str

The user's query.

required
response_text str

The extracted response text.

required
output Any

The full LangGraph output dict.

required
context_fingerprint str

Hash of memory + history context.

''
caller Any | None

:class:CallerContext for scope isolation.

None
model_id str | None

LLM model identifier.

None
instruction_hash str

Hash of the system instructions.

''
tools_used list[str] | None

List of tool names called during this invocation.

None
warmup()

Pre-load the embedding model.

Call at startup to avoid download/load latency on first cache check.

CacheEntry

promptise.cache.CacheEntry dataclass

A single cached response.

Attributes:

Name Type Description
query_text str

The original user query.

response_text str

The extracted response text.

output Any

The full LangGraph output dict (for returning to caller).

embedding list[float]

The query embedding vector.

scope_key str

Isolation scope (e.g. "user:user-42").

context_fingerprint str

Hash of memory + history + prompt context.

model_id str

LLM model that generated this response.

instruction_hash str

Hash of the system instructions.

checksum str

SHA-256 of response_text for corruption detection.

created_at float

Monotonic timestamp of creation.

ttl int

Time-to-live in seconds.

metadata dict[str, Any]

Extra info (tools_used, token count, etc.).

expired property

Check if this entry has expired.

Uses time.time() (wall clock) consistently across all backends. Both InMemory and Redis backends set created_at with time.time().

verify_checksum()

Verify response integrity.

CacheStats

promptise.cache.CacheStats dataclass

Cache performance statistics.

Attributes:

Name Type Description
hits int

Number of cache hits.

misses int

Number of cache misses.

stores int

Number of entries stored.

evictions int

Number of entries evicted.

hit_rate float

Proportion of requests served from cache.

EmbeddingProvider

promptise.cache.EmbeddingProvider

Bases: Protocol

Protocol for embedding providers.

Implement this to plug in any embedding model or API::

class MyProvider:
    async def embed(self, texts: list[str]) -> list[list[float]]:
        return my_model.encode(texts)

LocalEmbeddingProvider

promptise.cache.LocalEmbeddingProvider

Local embedding via sentence-transformers.

Uses the same model loading pattern as tool optimization. If the same model is already loaded (e.g. for semantic tool selection), the instance is shared — no duplicate memory.

Parameters:

Name Type Description Default
model str

Model name or local directory path.

DEFAULT_MODEL

Example::

provider = LocalEmbeddingProvider()  # all-MiniLM-L6-v2
provider = LocalEmbeddingProvider(model="BAAI/bge-small-en-v1.5")
provider = LocalEmbeddingProvider(model="/models/local/embeddings")
embed(texts) async

Embed texts using the local model.

warmup()

Pre-load the embedding model.

OpenAIEmbeddingProvider

promptise.cache.OpenAIEmbeddingProvider

Embedding via OpenAI or Azure OpenAI API.

Parameters:

Name Type Description Default
model str

Model name (e.g. "text-embedding-3-small").

'text-embedding-3-small'
api_key str | None

OpenAI API key.

None
base_url str | None

Custom base URL (for Azure or proxies).

None
azure_endpoint str | None

Azure OpenAI endpoint URL.

None
azure_deployment str | None

Azure deployment name.

None

Example::

# OpenAI
provider = OpenAIEmbeddingProvider(
    model="text-embedding-3-small",
    api_key="${OPENAI_API_KEY}",
)

# Azure OpenAI
provider = OpenAIEmbeddingProvider(
    model="text-embedding-3-small",
    azure_endpoint="https://xxx.openai.azure.com",
    azure_deployment="my-embedding",
    api_key="${AZURE_OPENAI_KEY}",
)
embed(texts) async

Embed texts via OpenAI API.

InMemoryCacheBackend

promptise.cache.InMemoryCacheBackend

In-memory cache with numpy-based similarity search.

Stores entries per scope with LRU eviction. Thread-safe via asyncio (single event loop). No persistence — lost on restart.

Parameters:

Name Type Description Default
max_entries_per_scope int

Max entries per scope partition.

1000
max_total_entries int

Max entries across all scopes.

100000
invalidate(scope_key, pattern=None) async

Evict entries matching a pattern (or all for scope).

purge_user(user_id) async

Remove all entries for a user (GDPR compliance).

search(scope_key, embedding, threshold) async

Find the best matching entry above threshold.

store(scope_key, entry) async

Store an entry, evicting LRU if at capacity.

RedisCacheBackend

promptise.cache.RedisCacheBackend

Redis-backed cache with vector similarity search.

Stores cache entries as JSON in Redis hashes. Embeddings are stored per scope and similarity is computed by fetching all embeddings for a scope and running numpy dot product locally. This avoids requiring the RediSearch module while keeping similarity search functional.

Optional AES encryption at rest via encrypt_values=True. Encryption key is read from PROMPTISE_CACHE_KEY env var or auto-generated per process.

Parameters:

Name Type Description Default
redis_url str

Redis connection URL (e.g. redis://localhost:6379).

'redis://localhost:6379'
max_entries_per_scope int

Max entries per scope partition.

1000
max_total_entries int

Max entries across all scopes.

100000
encrypt_values bool

Encrypt cached response values at rest.

False
close() async

Close the Redis connection.

invalidate(scope_key, pattern=None) async

Evict entries for a scope.

purge_user(user_id) async

Remove all entries for a user (GDPR compliance).

search(scope_key, embedding, threshold) async

Find the best matching entry above threshold.

store(scope_key, entry) async

Store an entry in Redis.


Event Notifications

Webhook + callback sinks for structured agent events (invocation errors, tool failures, budget violations, etc).

AgentEvent

promptise.events.AgentEvent dataclass

A structured notification event from the Promptise framework.

Attributes:

Name Type Description
event_type str

Dotted event name (e.g. "invocation.complete").

severity str

One of "info", "warning", "error", "critical".

timestamp float

When the event occurred (time.time()).

agent_id str | None

Agent or process identifier.

user_id str | None

User who triggered the action (from CallerContext).

session_id str | None

Conversation session ID if applicable.

data dict[str, Any]

Event-specific payload (tool name, error message, etc.).

metadata dict[str, Any]

Agent configuration, model ID, etc.

compute_hmac(secret)

Compute HMAC-SHA256 signature for webhook verification.

to_dict()

Serialize to a JSON-safe dict.

EventSink

promptise.events.EventSink

Bases: Protocol

Protocol for event notification sinks.

Sinks receive events from the :class:EventNotifier and deliver them to external systems (webhooks, logs, callbacks, etc.).

emit(event) async

Deliver a single event.

EventNotifier

promptise.events.EventNotifier

Central event coordinator that routes events to configured sinks.

Events are placed on an async queue and delivered by a background task. The agent never blocks waiting for event delivery.

Parameters:

Name Type Description Default
sinks list[EventSink]

List of :class:EventSink implementations.

required
max_queue_size int

Maximum events in the delivery queue. When full, new events are dropped with a warning.

1000

Example::

notifier = EventNotifier(sinks=[
    WebhookSink("https://hooks.slack.com/...", events=["invocation.error"]),
    CallbackSink(my_handler),
])
await notifier.start()
notifier.emit_sync(AgentEvent(event_type="invocation.start", severity="info"))
await notifier.stop()
emit(event) async

Queue an event for delivery (non-blocking).

If the queue is full, the event is dropped with a warning log.

emit_sync(event)

Queue an event from a synchronous context.

Used by the LangChain callback handler (which is synchronous). If the queue is full, the event is silently dropped.

start() async

Start the background event delivery task.

stop() async

Drain remaining events and stop the background task.

WebhookSink

promptise.events.WebhookSink

Deliver events via HTTP POST to a webhook URL.

Features: HMAC-SHA256 signing, retry with exponential backoff, SSRF protection, per-event filtering, payload redaction.

Parameters:

Name Type Description Default
url str

Webhook URL to POST events to.

required
events list[str] | None

Event types to subscribe to (None = all events).

None
headers dict[str, str] | None

Custom HTTP headers (e.g. auth tokens).

None
secret str | None

HMAC secret for signing payloads. If not provided, a random secret is generated.

None
max_retries int

Maximum retry attempts on failure.

3
retry_delay float

Initial retry delay in seconds (doubles each retry).

1.0
redact_sensitive bool

Scan payloads for PII/credentials before sending.

True
min_severity str | None

Minimum severity level to emit.

None
close() async

Release the persistent HTTP connection pool.

emit(event) async

POST the event to the webhook URL with retries.

CallbackSink

promptise.events.CallbackSink

Deliver events to a Python callable.

Parameters:

Name Type Description Default
callback Callable[..., Any]

Async or sync callable that receives an :class:AgentEvent.

required
events list[str] | None

Event types to subscribe to (None = all events).

None
min_severity str | None

Minimum severity level to emit.

None
emit(event) async

Call the callback with the event.

LogSink

promptise.events.LogSink

Deliver events to Python's logging system.

Parameters:

Name Type Description Default
events list[str] | None

Event types to subscribe to (None = all events).

None
logger_name str

Logger name (default: "promptise.events").

'promptise.events'
min_severity str | None

Minimum severity level to emit.

None
emit(event) async

Log the event as a structured JSON line.

EventBusSink

promptise.events.EventBusSink

Bridge events to the runtime's EventBus for inter-process notifications.

Parameters:

Name Type Description Default
event_bus Any

Any object with an emit(event_type, data) method.

required
events list[str] | None

Event types to subscribe to (None = all events).

None
emit(event) async

Publish the event to the EventBus.

default_pii_sanitizer

promptise.events.default_pii_sanitizer(data)

Redact PII and credentials from a data dictionary.

Serialises the dict to JSON, applies regex patterns, and deserialises back. Safe to call on any dict — returns the original on serialization failure.

Parameters:

Name Type Description Default
data dict[str, Any]

Arbitrary dict (event payload, observability metadata, etc.).

required

Returns:

Type Description
dict[str, Any]

A new dict with sensitive values replaced by placeholders.


Model Fallback

Automatic provider failover. Uses a circuit breaker to route around unhealthy providers without adding latency.

FallbackChain

promptise.fallback.FallbackChain

Bases: BaseChatModel

Chain of LLM models with automatic failover.

Tries models in order. If one fails (exception, timeout), the next is tried. Each model has an independent circuit breaker — after failure_threshold consecutive failures, the model is skipped for recovery_timeout seconds before being tested again.

Passes through to build_agent(model=...) seamlessly — it's a BaseChatModel subclass, so LangChain treats it like any other model.

Parameters:

Name Type Description Default
models Sequence[str | BaseChatModel] | None

Ordered list of model identifiers (strings like "openai:gpt-5-mini") or BaseChatModel instances. First model is primary, rest are fallbacks.

None
timeout_per_model float

Maximum seconds per model attempt. 0 = no per-model timeout (use provider default).

0
global_timeout float

Maximum seconds across ALL attempts combined. 0 = unlimited (each model gets its full timeout).

0
failure_threshold int

Consecutive failures before a model's circuit breaker opens. Default: 3.

3
recovery_timeout float

Seconds before a tripped circuit breaker allows a test request. Default: 60.

60.0
on_fallback Any

Optional callback (primary_model, fallback_model, error) called each time a fallback is activated.

None

Raises:

Type Description
ValueError

If models is empty.

RuntimeError

If all models in the chain fail.

active_model property

Return the first non-skipped model's name.

model_name property

Return the model that last served a request.

Before any request is made, returns the primary model's name. After a request, returns the model that actually served it — this is what observability and cache use.

get_chain_status()

Get the health status of each model in the chain.

Returns:

Type Description
list[dict[str, Any]]

List of dicts with model_id, state (closed/open/half_open),

list[dict[str, Any]]

failures, and is_primary.


Security Guardrails

Multi-head security scanner: prompt injection (DeBERTa ML), PII detection (69 regex patterns), credential detection (96 patterns), NER (GLiNER), content safety, custom rules.

PromptiseSecurityScanner

promptise.guardrails.PromptiseSecurityScanner

Unified security scanner for agent input and output.

Compose detection heads to build exactly the scanner you need. Each head is a standalone config object — plug in what matters, leave out what doesn't.

Composable API (recommended)::

from promptise.guardrails import (
    PromptiseSecurityScanner,
    InjectionDetector,
    PIIDetector,
    CredentialDetector,
    ContentSafetyDetector,
    NERDetector,
    CustomRule,
)

scanner = PromptiseSecurityScanner(
    detectors=[
        InjectionDetector(),
        PIIDetector(categories={PIICategory.CREDIT_CARDS, PIICategory.SSN}),
        CredentialDetector(categories={CredentialCategory.AWS}),
    ],
    custom_rules=[
        CustomRule(name="internal_id", pattern=r"INT-\d{8}"),
    ],
)

One-liner defaults (all heads enabled)::

scanner = PromptiseSecurityScanner.default()

Flat API (backward compatible)::

scanner = PromptiseSecurityScanner(
    detect_injection=True,
    detect_pii={PIICategory.CREDIT_CARDS, PIICategory.SSN},
    detect_credentials={CredentialCategory.AWS},
)

Parameters:

Name Type Description Default
detectors list[Any] | None

List of detector instances to enable.

None
custom_rules list[CustomRule | dict[str, Any]] | None

List of :class:CustomRule instances.

None
detect_injection bool

(flat API) Enable injection detection.

True
detect_pii bool | set[PIICategory]

(flat API) Enable PII detection.

True
detect_toxicity bool

(flat API) Enable toxicity detection.

True
detect_credentials bool | set[CredentialCategory]

(flat API) Enable credential detection.

True
check_input(text) async

Scan input text. Raises :class:GuardrailViolation on block.

Called by the agent before any processing (memory, tools, LLM).

check_output(output) async

Scan output text. Redacts PII/credentials. Blocks on injection.

Called by the agent after the LLM response, before returning.

default() classmethod

Create a scanner with all detection heads enabled (defaults).

Equivalent to::

PromptiseSecurityScanner(detectors=[
    InjectionDetector(),
    PIIDetector(),
    CredentialDetector(),
])
list_credential_patterns() staticmethod

Return names of all built-in credential patterns.

list_pii_patterns() staticmethod

Return names of all built-in PII patterns.

scan_text(text, *, direction='input') async

Run all enabled detection heads on the given text.

Parameters:

Name Type Description Default
text str

Text to scan.

required
direction str

"input" or "output" — affects default actions.

'input'

Returns:

Name Type Description
A ScanReport

class:ScanReport with all findings.

warmup()

Pre-load ML models so the first scan is fast.

Call this at startup to avoid download/load latency on the first message. Safe to call multiple times (models are cached).

Example::

scanner = PromptiseSecurityScanner()
scanner.warmup()  # downloads + loads models NOW
agent = await build_agent(..., guardrails=scanner)

SecurityFinding

promptise.guardrails.SecurityFinding dataclass

A single detection result from a scanner.

Attributes:

Name Type Description
detector str

Which detection head found this (injection/pii/toxicity/credential).

category str

Specific sub-category (e.g. "credit_card_visa", "aws_access_key").

severity Severity

How severe this finding is.

confidence float

Model confidence or 1.0 for regex matches.

matched_text str

The text span that matched.

start int

Character offset in original text.

end int

Character offset in original text.

action Action

What should happen (block/redact/warn).

description str

Human-readable explanation.

metadata dict[str, Any]

Extra information (model scores, pattern name, etc.).

ScanReport

promptise.guardrails.ScanReport dataclass

Complete scan result with all findings and metadata.

Attributes:

Name Type Description
passed bool

True if no findings have action=BLOCK.

findings list[SecurityFinding]

All detections from all scanners.

duration_ms float

Total scan time in milliseconds.

scanners_run list[str]

Which detection heads ran.

text_length int

Length of scanned text.

redacted_text str | None

Text with PII/credentials replaced (output scans).

blocked property

Findings that caused a block.

redacted property

Findings that were redacted.

warnings property

Findings that are warnings only.

PIICategory

promptise.guardrails.PIICategory

Bases: str, Enum

PII detection categories. Pass a set of these to PromptiseSecurityScanner(enable_pii={...}) to enable only specific PII types.

Example::

scanner = PromptiseSecurityScanner(
    enable_pii={PIICategory.CREDIT_CARDS, PIICategory.SSN, PIICategory.EMAIL},
)

CredentialCategory

promptise.guardrails.CredentialCategory

Bases: str, Enum

Credential detection categories. Pass a set of these to PromptiseSecurityScanner(enable_credentials={...}).

Example::

scanner = PromptiseSecurityScanner(
    enable_credentials={
        CredentialCategory.AWS,
        CredentialCategory.OPENAI,
        CredentialCategory.GITHUB,
    },
)

Severity

promptise.guardrails.Severity

Bases: str, Enum

Severity level of a security finding.

Action

promptise.guardrails.Action

Bases: str, Enum

Action to take when a finding is detected.


Streaming

StreamEvent

promptise.streaming.StreamEvent dataclass

Base event yielded by astream_with_tools().

All events have a type string and a timestamp. Use to_dict() for JSON serialization or to_json() for SSE.

Attributes:

Name Type Description
type str

Event type identifier.

timestamp float

Monotonic time when the event was created.

to_dict()

Serialize to a JSON-safe dict.

to_json()

Serialize to a JSON string (for SSE data: lines).

ToolStartEvent

promptise.streaming.ToolStartEvent dataclass

Bases: StreamEvent

A tool has started executing.

Attributes:

Name Type Description
tool_name str

Raw MCP tool name (e.g. "search_customers").

tool_display_name str

Human-readable name (e.g. "Searching customers").

arguments dict[str, Any]

Tool arguments (redacted if guardrails active).

tool_index int

0-based index of tool calls in this invocation.

ToolEndEvent

promptise.streaming.ToolEndEvent dataclass

Bases: StreamEvent

A tool has finished executing.

Attributes:

Name Type Description
tool_name str

Raw MCP tool name.

tool_summary str

One-line summary of the result.

duration_ms float

Execution time in milliseconds.

success bool

Whether the tool completed without error.

tool_index int

0-based index of this tool call.

TokenEvent

promptise.streaming.TokenEvent dataclass

Bases: StreamEvent

An LLM token has been generated.

Attributes:

Name Type Description
text str

The token text.

cumulative_text str

All text generated so far in this invocation.

DoneEvent

promptise.streaming.DoneEvent dataclass

Bases: StreamEvent

The agent has finished processing.

Attributes:

Name Type Description
full_response str

Complete response text (post-guardrail redaction).

tool_calls list[dict[str, Any]]

Summary of all tool calls made.

duration_ms float

Total invocation time in milliseconds.

cache_hit bool

Whether the response came from cache.

ErrorEvent

promptise.streaming.ErrorEvent dataclass

Bases: StreamEvent

An error occurred during processing.

The message is always generic — internal details are never exposed.

Attributes:

Name Type Description
message str

Human-readable error description.

recoverable bool

Whether the agent might retry.


Tool Optimization

Static schema minification + semantic tool selection (local embeddings) to cut prompt tokens by 40-70% on agents with many tools.

OptimizationLevel

promptise.tool_optimization.OptimizationLevel

Bases: str, Enum

Preset optimization levels.

  • MINIMAL — schema minification + description truncation.
  • STANDARD — deeper minification + nested description stripping.
  • SEMANTIC — all static optimizations + per-invocation semantic tool selection.

ToolOptimizationConfig

promptise.tool_optimization.ToolOptimizationConfig dataclass

Configuration for MCP tool token optimization.

Pass a preset :attr:level for sensible defaults, or override individual settings for fine-grained control. Any field set explicitly takes precedence over the preset.

Parameters:

Name Type Description Default
level OptimizationLevel | None

Preset optimization level.

None
minify_schema bool | None

Strip description from Pydantic Field metadata.

None
max_description_length int | None

Truncate tool descriptions at N chars.

None
strip_nested_descriptions bool | None

Remove descriptions from nested model fields (keeps top-level field descriptions).

None
max_schema_depth int | None

Flatten nested objects beyond this depth to dict. None means no limit.

None
semantic_selection bool | None

Enable per-invocation semantic tool selection.

None
semantic_top_k int | None

Number of most-relevant tools to select per invocation.

None
always_include_fallback bool | None

Include a request_more_tools fallback tool when semantic selection is active.

None
embedding_model str | None

Model name or local path for sentence-transformers. Defaults to "all-MiniLM-L6-v2" (downloaded once, then cached in ~/.cache/huggingface/). Point to a local directory for fully-offline / air-gapped deployments::

embedding_model="/models/all-MiniLM-L6-v2"
None
preserve_tools set[str] | None

Tool names that are never optimized and always included in semantic selection.

None

Adaptive Strategy

Failure classification and strategy learning. Agents track past failures, categorize them, and adjust behavior.

FailureCategory

promptise.strategy.FailureCategory

Bases: str, Enum

Classification of a tool failure.

FailureLog

promptise.strategy.FailureLog dataclass

A single tool failure record.

Attributes:

Name Type Description
tool_name str

Name of the tool that failed.

error_type str

Exception class name (e.g. "ValidationError").

error_message str

Error message (truncated to 500 chars).

category FailureCategory

Classified failure category.

args_preview str

Truncated preview of the failed arguments.

timestamp float

When the failure occurred.

confidence float

Classification confidence (0.0–1.0).

invocation_id str | None

Optional invocation identifier.

AdaptiveStrategyConfig

promptise.strategy.AdaptiveStrategyConfig dataclass

Configuration for the adaptive strategy system.

Attributes:

Name Type Description
enabled bool

Enable adaptive learning (default: disabled).

synthesis_threshold int

Number of strategy failures before triggering LLM synthesis. Infrastructure failures don't count.

synthesis_model str | None

LLM model ID for synthesis/verification. Defaults to the agent's own model.

max_strategies int

Maximum stored strategies (oldest dropped first).

auto_cleanup bool

Delete raw failure logs after synthesis.

strategy_ttl int

Strategy expiry in seconds (0 = never expire).

failure_retention int

Maximum raw failure logs to keep.

verify_human_feedback bool

Use LLM-as-judge to verify human corrections.

feedback_rate_limit int

Max corrections per hour per user.

scope str

Strategy isolation — "per_user" (default), "shared", or "per_session".

AdaptiveStrategyManager

promptise.strategy.AdaptiveStrategyManager

Manages adaptive strategy learning for an agent.

Sits on top of a :class:MemoryProvider and handles failure recording, classification, strategy synthesis, human feedback verification, and strategy retrieval.

Parameters:

Name Type Description Default
config AdaptiveStrategyConfig

Adaptive strategy configuration.

required
memory Any

The agent's memory provider for storing strategies.

required
agent_model str | None

Default LLM model ID for synthesis/verification.

None
guardrails Any | None

Optional guardrails scanner for feedback validation.

None
config property

The adaptive strategy configuration.

format_strategy_block(strategies)

Format strategies for injection into the agent's system prompt.

Wraps in <strategy_context> fences with anti-injection disclaimer.

get_relevant_strategies(query, *, limit=3) async

Search for strategies relevant to the current query.

Returns strategies sorted by confidence (highest first). Expired strategies (past TTL) are excluded.

record_failure(failure) async

Record a tool failure. Only strategy failures are stored.

Infrastructure failures (MCP server down, network errors) are skipped — the agent shouldn't learn from infra problems. Unknown failures are stored with low confidence.

record_human_correction(correction, *, evidence=None, sender_id=None) async

Process a human correction ("you did this wrong").

Validates the correction against guardrails and optionally verifies it via LLM-as-judge before storing.

Parameters:

Name Type Description Default
correction str

The human's feedback text.

required
evidence dict[str, Any] | None

Tool call history and output for verification.

None
sender_id str | None

Who sent the correction (for rate limiting + audit).

None

Returns:

Type Description
bool

True if the correction was accepted, False if rejected.

synthesize() async

Synthesize strategies from accumulated failure logs.

Asks the LLM to reflect on recent failures and produce actionable strategies. Returns the number of strategies created.

classify_failure

promptise.strategy.classify_failure(error_type, error_message)

Classify an error as infrastructure, strategy, or unknown.

Deterministic — no LLM call. Based on error type name and message pattern matching.

Parameters:

Name Type Description Default
error_type str

The exception class name.

required
error_message str

The error message text.

required

Returns:

Type Description
FailureCategory

The classified :class:FailureCategory.


Context Engine

Token budgeting across multiple context layers with model-aware window detection.

ContextLayer

promptise.context_engine.ContextLayer dataclass

A single context source in the assembly pipeline.

Attributes:

Name Type Description
name str

Unique layer identifier (e.g. "memory", "strategies").

priority int

Assembly priority (0-10). Higher = kept longer when trimming. Required layers (priority >= 9) are never dropped.

content str

The text content for this layer.

required bool

If True, this layer is never trimmed (overrides priority).

trim_strategy str

How to trim this layer — "truncate" (cut from end) or "oldest_first" (for conversation history, removes oldest messages first).

metadata dict[str, Any]

Developer-provided metadata for reporting.

token_estimate property

Rough token estimate (chars / 3.5) for quick checks.

ContextReport

promptise.context_engine.ContextReport dataclass

Report of context assembly — what was included, trimmed, and total usage.

Attributes:

Name Type Description
total_tokens int

Total tokens in the assembled context.

budget int

The model's context window minus response reserve.

layers list[dict[str, Any]]

Per-layer token counts and trim status.

trimmed_layers list[str]

Layers that were trimmed or dropped.

utilization float

Fraction of budget used (0.0-1.0).

utilization property

Fraction of budget used.

Tokenizer

promptise.context_engine.Tokenizer

Bases: Protocol

Protocol for token counters.

Implementations must provide a count(text) -> int method that returns the exact (or best-estimate) token count for a string.


Callback Handler

Bridges LangChain callbacks into the observability collector.

PromptiseCallbackHandler

promptise.callback_handler.PromptiseCallbackHandler

Bases: BaseCallbackHandler

LangChain callback handler → ObservabilityCollector bridge.

Captures every LLM turn, tool call, token count, latency, retry, and error. Designed to be the only integration point needed between LangChain's event system and Promptise's observability.

get_summary()

Return a summary of all tracked metrics.

on_llm_new_token(token, *, run_id, **kwargs)

Capture streaming tokens (FULL level only).


Exceptions

SuperAgentError

promptise.exceptions.SuperAgentError

Bases: RuntimeError

Base exception for all SuperAgent-related errors.

This is the base class for all exceptions raised by the .superagent file loading and validation system. Catching this exception will catch all SuperAgent-specific errors.

Examples:

>>> raise SuperAgentError("Failed to load configuration")

SuperAgentValidationError

promptise.exceptions.SuperAgentValidationError

Bases: SuperAgentError

Raised when .superagent schema validation fails.

This exception is raised when a .superagent file fails Pydantic schema validation. It includes detailed error information to help users fix the configuration.

Attributes:

Name Type Description
errors

List of Pydantic validation errors with location and message.

file_path

Path to the file that failed validation.

Examples:

>>> raise SuperAgentValidationError(
...     "Schema validation failed",
...     errors=[{"loc": ["agent", "model"], "msg": "field required"}],
...     file_path="/path/to/agent.superagent"
... )
__init__(message, *, errors=None, file_path=None)

Initialize validation error with details.

Parameters:

Name Type Description Default
message str

Human-readable error message.

required
errors list[dict[str, Any]] | None

List of Pydantic validation errors (from ValidationError.errors()).

None
file_path str | None

Path to the configuration file that failed validation.

None
__str__()

Format validation errors for display.

Returns:

Type Description
str

Formatted error message including file path and detailed validation errors.

EnvVarNotFoundError

promptise.exceptions.EnvVarNotFoundError

Bases: SuperAgentError

Raised when a required environment variable is not found.

This exception is raised when a configuration references an environment variable using ${VAR_NAME} syntax, but the variable is not set in the environment.

Attributes:

Name Type Description
var_name

The name of the missing environment variable.

context

Optional context about where the variable was referenced (e.g., "servers.math.url").

Examples:

>>> raise EnvVarNotFoundError("OPENAI_API_KEY", context="agent.model.api_key")
__init__(var_name, context=None)

Initialize environment variable error.

Parameters:

Name Type Description Default
var_name str

Name of the missing environment variable.

required
context str | None

Optional context string indicating where the variable was referenced (e.g., field path in configuration).

None

Type Definitions

ModelLike

promptise.types.ModelLike = str | BaseChatModel | Runnable[Any, Any] module-attribute


System Prompt

DEFAULT_SYSTEM_PROMPT

promptise.prompt.DEFAULT_SYSTEM_PROMPT = 'You are a capable deep agent. Use available tools from connected MCP servers to plan and execute tasks. Always inspect tool descriptions and input schemas before calling them. Be precise and avoid hallucinating tool arguments. Prefer calling tools rather than guessing, and cite results from tools clearly.' module-attribute