Skip to content

Process Lifecycle

The ProcessLifecycle class implements a thread-safe, auditable state machine that governs every AgentProcess. It defines the valid states a process can be in, the legal transitions between them, and records every transition with timestamps and reasons.

from promptise.runtime.lifecycle import ProcessLifecycle, ProcessState

lc = ProcessLifecycle()
assert lc.state == ProcessState.CREATED

await lc.transition(ProcessState.STARTING, reason="user requested start")
await lc.transition(ProcessState.RUNNING)

assert len(lc.history) == 2

Concepts

Agent processes follow a strict state machine. Invalid transitions are rejected at the API level, preventing processes from entering inconsistent states. Every transition is recorded as a ProcessTransition with a timestamp, reason, and optional metadata -- creating a complete audit trail of what happened and why.


State Diagram

CREATED ----> STARTING ----> RUNNING ----> STOPPING ----> STOPPED
                 |              |  ^           ^
                 |              |  |           |
                 v              v  |           |
               FAILED      SUSPENDED       FAILED ----> STARTING (restart)
                              |
                              v
                          AWAITING ----> RUNNING (trigger fires)

Process States

State Description
CREATED Initial state after registration. Not yet started.
STARTING Start sequence in progress (building agent, connecting MCP servers).
RUNNING Active and processing triggers.
SUSPENDED Paused due to idle timeout. Can resume.
AWAITING Waiting for next trigger event. A sub-state of running.
STOPPING Graceful shutdown in progress.
STOPPED Fully stopped. Can be restarted.
FAILED Encountered a fatal error. Can be restarted.

Valid Transitions

From Allowed Targets
CREATED STARTING, STOPPED
STARTING RUNNING, FAILED, STOPPING
RUNNING SUSPENDED, STOPPING, FAILED, AWAITING
SUSPENDED RUNNING, STOPPING, FAILED
AWAITING RUNNING, STOPPING, FAILED
STOPPING STOPPED, FAILED
STOPPED STARTING
FAILED STARTING

Any transition not in this table raises a StateError.


Using ProcessLifecycle

Creating and transitioning

from promptise.runtime.lifecycle import ProcessLifecycle, ProcessState

lc = ProcessLifecycle()  # starts in CREATED

# Transition with a reason
await lc.transition(ProcessState.STARTING, reason="CLI start command")
await lc.transition(ProcessState.RUNNING, reason="agent built successfully")

# Check current state
assert lc.state == ProcessState.RUNNING

Checking validity before transitioning

if lc.can_transition(ProcessState.SUSPENDED):
    await lc.transition(ProcessState.SUSPENDED, reason="idle timeout")

Transition with metadata

await lc.transition(
    ProcessState.FAILED,
    reason="Max consecutive failures reached",
    metadata={"failures": 3},
)

Invalid transitions raise StateError

from promptise.runtime.lifecycle import StateError

lc = ProcessLifecycle()  # CREATED

try:
    await lc.transition(ProcessState.RUNNING)  # Invalid: must go through STARTING
except StateError as e:
    print(e)
    # Cannot transition from 'created' to 'running'.
    # Allowed targets: ['starting', 'stopped']

ProcessTransition

Every state transition is recorded as a ProcessTransition dataclass:

Field Type Description
from_state ProcessState Previous state
to_state ProcessState New state
timestamp datetime When the transition occurred (UTC)
reason str Human-readable explanation
metadata dict[str, Any] Additional context (error details, etc.)
transition = await lc.transition(ProcessState.STOPPING, reason="user request")

print(transition.from_state)   # ProcessState.RUNNING
print(transition.to_state)     # ProcessState.STOPPING
print(transition.reason)       # "user request"
print(transition.timestamp)    # datetime (UTC)

Serialization

data = transition.to_dict()
restored = ProcessTransition.from_dict(data)

Audit History

The full transition history is available as a list:

for t in lc.history:
    print(f"{t.from_state.value} -> {t.to_state.value}: {t.reason}")

Snapshotting and restoring

# Serialize current state + full history
snapshot = lc.snapshot()

# Restore from a snapshot (e.g., after crash recovery)
restored_lc = ProcessLifecycle.from_snapshot(snapshot)
assert restored_lc.state == lc.state
assert len(restored_lc.history) == len(lc.history)

Thread Safety

ProcessLifecycle uses an internal asyncio.Lock to ensure that concurrent transition attempts are serialized. This prevents race conditions when multiple triggers or external signals attempt to change the process state simultaneously.

# Safe for concurrent use from multiple coroutines
await asyncio.gather(
    lc.transition(ProcessState.SUSPENDED, reason="idle"),
    lc.transition(ProcessState.STOPPING, reason="shutdown"),
)
# Only one transition will succeed; the other raises StateError

API Summary

Method / Property Description
ProcessLifecycle(initial) Create with initial state (default: CREATED)
state Current ProcessState
history List of all ProcessTransition records
can_transition(target) Check if transitioning to target is valid
await transition(target, reason, metadata) Perform a state transition
snapshot() Serialize state and history to dict
ProcessLifecycle.from_snapshot(data) Restore from serialized dict

Tips and Gotchas

Always provide a reason

Transition reasons make debugging straightforward. Include context about who or what triggered the transition: "cron trigger fired", "max failures reached", "user stop command".

Use metadata for error details

When transitioning to FAILED, include the error details in metadata. This information flows into the journal for later analysis.

STOPPED and FAILED are restartable

Both states allow transitioning back to STARTING. This is intentional for the restart policy feature. If you need a process to stay dead, remove it from the runtime entirely.

AWAITING is not SUSPENDED

AWAITING means the process is healthy but waiting for its next trigger. SUSPENDED means the process was explicitly paused (idle timeout). They have different allowed transitions.


What's Next