
Callback Handler

The PromptiseCallbackHandler is a LangChain callback handler that captures every LLM turn, tool call, token count, latency, retry, and error -- and forwards them to the Promptise ObservabilityCollector. It is the single integration point between LangChain's event system and Promptise's observability pipeline.

Source: src/promptise/callback_handler.py

Quick example

import asyncio
from promptise import build_agent
from promptise.callback_handler import PromptiseCallbackHandler
from promptise.observability import ObservabilityCollector

async def main():
    collector = ObservabilityCollector("demo-session")
    handler = PromptiseCallbackHandler(collector, agent_id="my-agent")

    agent = await build_agent(
        model="openai:gpt-5-mini",
        servers={},
    )
    result = await agent.ainvoke(
        {"messages": [{"role": "user", "content": "What is 2 + 2?"}]},
        config={"callbacks": [handler]},
    )

    print(result["messages"][-1].content)
    print(f"Tokens used: {handler.total_tokens}")
    print(handler.get_summary())

asyncio.run(main())

Concepts

How it works

The handler implements LangChain's BaseCallbackHandler interface. When attached to an agent's config["callbacks"], LangChain invokes the handler's on_* methods at each lifecycle event. The handler:

  1. Records timing -- starts a timer on each on_*_start event and computes latency on the corresponding on_*_end event.
  2. Extracts token usage -- reads prompt/completion/total token counts from LLMResult.llm_output and, on LangChain >= 0.3, from the usage_metadata attached to returned messages.
  3. Accumulates session totals -- maintains running counters for tokens, LLM calls, tool calls, errors, and retries.
  4. Forwards events -- sends structured timeline events to the ObservabilityCollector for later export or dashboard display.
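The four steps above follow a common callback pattern. A stdlib-only sketch of the timing and accumulation logic (hypothetical names, no LangChain imports, not the library's actual implementation) might look like:

```python
import time

class TimingSketch:
    """Illustrative only: mirrors the handler's timer/counter pattern."""

    def __init__(self):
        self._start_times = {}   # run_id -> start timestamp (step 1)
        self.llm_call_count = 0  # running counter (step 3)
        self.total_tokens = 0
        self.events = []         # stands in for the ObservabilityCollector (step 4)

    def on_llm_start(self, run_id):
        # Step 1: start a timer keyed by run_id.
        self._start_times[run_id] = time.monotonic()
        self.llm_call_count += 1

    def on_llm_end(self, run_id, token_usage):
        # Steps 1-3: compute latency, extract tokens, accumulate totals.
        latency = time.monotonic() - self._start_times.pop(run_id)
        tokens = token_usage.get("total_tokens", 0)
        self.total_tokens += tokens
        # Step 4: forward a structured timeline event.
        self.events.append({"type": "llm_end", "latency_s": latency, "tokens": tokens})

sketch = TimingSketch()
sketch.on_llm_start("run-1")
sketch.on_llm_end("run-1", {"total_tokens": 192})
```

Keying the timer by run ID is what lets the pattern cope with concurrent or nested LLM calls, since each start/end pair is matched independently.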

Constructor parameters

handler = PromptiseCallbackHandler(
    collector,                      # ObservabilityCollector instance
    agent_id="my-agent",            # Optional agent identifier
    record_prompts=False,           # Whether to log prompt/response text
    level=ObserveLevel.STANDARD,    # Observation detail level
)

Observation levels

The level parameter controls how much detail the handler records:

| Level | Behaviour |
| --- | --- |
| ObserveLevel.BASIC | Skips detailed LLM start events. Records end events, tools, errors. |
| ObserveLevel.STANDARD | Records LLM start/end, tool start/end, errors. Default. |
| ObserveLevel.FULL | Additionally captures individual streaming tokens. |
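One way to picture the gating described in the table (a stdlib sketch with a stand-in enum, not the library's code):

```python
from enum import Enum

class ObserveLevelSketch(Enum):
    """Hypothetical stand-in for promptise's ObserveLevel."""
    BASIC = 1
    STANDARD = 2
    FULL = 3

def should_record(event, level):
    """Gate events by detail level, mirroring the table above."""
    if event == "llm_start":
        # BASIC skips detailed LLM start events.
        return level.value >= ObserveLevelSketch.STANDARD.value
    if event == "llm_new_token":
        # Only FULL captures individual streaming tokens.
        return level is ObserveLevelSketch.FULL
    # End events, tool events, and errors are always recorded.
    return True
```

Errors and end events are never filtered out, so even BASIC still yields usable latency and failure data.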

Session counters

The handler maintains cumulative counters across all ainvoke() calls:

| Counter | Type | Description |
| --- | --- | --- |
| total_prompt_tokens | int | Total input tokens sent to the LLM |
| total_completion_tokens | int | Total output tokens generated by the LLM |
| total_tokens | int | Sum of prompt + completion tokens |
| llm_call_count | int | Number of LLM calls made |
| tool_call_count | int | Number of tool invocations |
| error_count | int | Number of errors (LLM, tool, or chain) |
| retry_count | int | Number of retry attempts |

Event methods

The handler implements these LangChain callback methods:

LLM events

| Method | When fired | What it records |
| --- | --- | --- |
| on_llm_start | LLM call begins | Starts timer, increments llm_call_count, logs model name |
| on_llm_new_token | Each streaming token arrives | Accumulates tokens (FULL level only) |
| on_llm_end | LLM call completes | Latency, token counts, tool calls in response |
| on_llm_error | LLM call fails | Error type, message, traceback |
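Token counts can arrive in two shapes depending on the LangChain version, as noted under "How it works". A hedged sketch of the extraction step, using plain dicts in place of the real LLMResult object:

```python
def extract_usage(llm_output, usage_metadata):
    """Sketch: prefer llm_output["token_usage"] (older provider convention),
    fall back to usage_metadata (LangChain >= 0.3 message attribute)."""
    usage = (llm_output or {}).get("token_usage")
    if usage:
        # Older shape: prompt_tokens / completion_tokens / total_tokens.
        return {"prompt": usage.get("prompt_tokens", 0),
                "completion": usage.get("completion_tokens", 0),
                "total": usage.get("total_tokens", 0)}
    if usage_metadata:
        # Newer shape: input_tokens / output_tokens / total_tokens.
        return {"prompt": usage_metadata.get("input_tokens", 0),
                "completion": usage_metadata.get("output_tokens", 0),
                "total": usage_metadata.get("total_tokens", 0)}
    # Some providers report nothing; counters simply stay unchanged.
    return {"prompt": 0, "completion": 0, "total": 0}
```

Checking both shapes is what keeps the counters accurate across providers and LangChain versions.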

Tool events

| Method | When fired | What it records |
| --- | --- | --- |
| on_tool_start | Tool invocation begins | Tool name, arguments, starts timer |
| on_tool_end | Tool invocation completes | Result preview, latency |
| on_tool_error | Tool invocation fails | Error type, message, traceback |

Chain (agent-level) events

| Method | When fired | What it records |
| --- | --- | --- |
| on_chain_start | Top-level agent invocation begins | Input preview or length |
| on_chain_end | Top-level agent invocation completes | Session totals (tokens, call counts) |
| on_chain_error | Top-level agent invocation fails | Error type, message, traceback |

Retry events

| Method | When fired | What it records |
| --- | --- | --- |
| on_retry | A retry is attempted | Attempt number, triggering error |

Getting a summary

Call get_summary() to retrieve a dictionary of all accumulated metrics:

summary = handler.get_summary()
# {
#     "total_prompt_tokens": 150,
#     "total_completion_tokens": 42,
#     "total_tokens": 192,
#     "llm_call_count": 1,
#     "tool_call_count": 0,
#     "error_count": 0,
#     "retry_count": 0,
# }

API summary

PromptiseCallbackHandler

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| collector | ObservabilityCollector | required | The collector that receives timeline events |
| agent_id | str \| None | None | Optional identifier for this agent |
| record_prompts | bool | False | If True, log prompt text and response previews |
| level | ObserveLevel | STANDARD | Detail level: BASIC, STANDARD, or FULL |

get_summary()

| Returns | Type | Description |
| --- | --- | --- |
| summary | dict[str, Any] | Dictionary of all accumulated metrics |

Tips and gotchas

Tip

Instantiate the handler once per agent and reuse it across multiple ainvoke() calls. Session counters accumulate automatically.

Warning

Setting record_prompts=True logs prompt text and response previews into the observability timeline. Avoid this in production if prompts contain sensitive data (PII, credentials, etc.).
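If you do need prompt logging in a sensitive environment, one option (not part of the library; a stdlib sketch with a hypothetical helper name) is to scrub and truncate text before it reaches any log sink:

```python
import re

def redact_preview(text, max_len=120):
    """Illustrative pre-logging scrub: mask email-like strings, then truncate.
    A real deployment would also handle phone numbers, keys, and other PII."""
    scrubbed = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    return scrubbed[:max_len] + ("…" if len(scrubbed) > max_len else "")
```

Regex-based scrubbing is best-effort only; if prompts routinely carry credentials or PII, keeping record_prompts=False is the safer default.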

Tip

The handler is synchronous (BaseCallbackHandler, not AsyncCallbackHandler). LangChain calls it from within the async chain on the event loop thread. This is safe because ObservabilityCollector is thread-safe.
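The safety claim rests on the collector guarding its internal state with a lock. A minimal illustration of that pattern (a sketch, not the ObservabilityCollector source):

```python
import threading

class ThreadSafeCollectorSketch:
    """Append-only event sink guarded by a lock, so synchronous callbacks
    invoked from any thread can record events safely."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = []

    def add_event(self, event):
        # Each append happens under the lock, so concurrent callers
        # cannot interleave partial updates.
        with self._lock:
            self._events.append(event)

    def snapshot(self):
        # Return a copy so callers can iterate without holding the lock.
        with self._lock:
            return list(self._events)
```

Because each method holds the lock only briefly, the overhead per callback is negligible relative to an LLM call's latency.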

What's next