Replaying Executions
Replaying executions is a core feature of ctx that allows you to re-run the sequence of actions recorded in a Context Pack. This process is used to verify the reproducibility of an agent's work, debug non-deterministic behavior, and ensure that changes to the environment or tool definitions haven't introduced "drift" into the agent's logic.
Replaying a Pack
To re-execute an agent run exactly as it was recorded, use the replay command with a pack hash or a ctx:// URI.
ctx replay <hash>
When you run a replay, the substrate:
- Loads the Manifest: Extracts the system prompt, user prompts, and input files.
- Sequential Execution: Iterates through the steps defined in the pack.
- Tool Invocation: Re-calls the tools with the original parameters recorded in the pack.
- Fidelity Comparison: Compares the new output of each step against the OutputRef stored in the pack.
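The loop described above can be sketched in a few lines of Python. This is an illustrative model only, not ctx's implementation: the Step fields, the tools registry, the FAILED label, and the choice to continue after a failed step are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One recorded action. Field names are hypothetical, not ctx's schema."""
    tool: str
    params: Dict[str, Any]
    recorded_output: str  # stands in for the content behind the pack's OutputRef


def replay(steps: List[Step], tools: Dict[str, Callable[..., str]]) -> List[str]:
    """Re-run each step in order and label it OK, DIVERGED, or FAILED."""
    results = []
    for step in steps:
        tool = tools.get(step.tool)
        if tool is None:
            results.append("FAILED")  # the tool is not available locally
            continue
        try:
            new_output = tool(**step.params)  # same parameters as recorded
        except Exception:
            results.append("FAILED")  # the step could not be executed
            continue
        results.append("OK" if new_output == step.recorded_output else "DIVERGED")
    return results


# Toy tools standing in for real ones (assumed for illustration).
tools = {
    "read_file": lambda path: f"contents of {path}",
    "write_summary": lambda text: text.upper(),
}
steps = [
    Step("read_file", {"path": "notes.txt"}, "contents of notes.txt"),
    Step("write_summary", {"text": "hello"}, "Hello"),
    Step("search_docs", {"query": "x"}, "result"),
]
print(replay(steps, tools))  # ['OK', 'DIVERGED', 'FAILED']
```

Here the second step diverges because the tool now returns different text than was recorded, and the third fails because no search_docs tool is registered.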
Example Output
Replaying Pack: 5f8e2a1b9c3d
Step 0: read_file [OK]
Step 1: search_docs [OK]
Step 2: write_summary [DIVERGED] - Output differs from recorded execution.
Fidelity: DEGRADED
Summary: 2/3 steps matched perfectly. 1 step produced different results.
Understanding Fidelity Levels
After a replay completes, ctx reports a fidelity status and exits with the corresponding exit code. The status indicates how closely the re-execution matched the original recording.
| Status | Exit Code | Description |
| :--- | :--- | :--- |
| Success | 0 | Every step produced identical output to the original run. |
| Degraded | 1 | All steps completed, but one or more steps produced different output (common in non-deterministic model calls). |
| Failed | 2 | One or more steps could not be executed (e.g., a missing tool, a network error, or a fatal environment mismatch). |
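One way to read the table is as a precedence rule: any step that could not run makes the whole replay Failed, otherwise any differing output makes it Degraded. The sketch below encodes that reading in Python. The per-step labels follow the example output above (OK, DIVERGED), while the FAILED label and the function name are assumptions for illustration.

```python
from enum import IntEnum
from typing import Iterable


class Fidelity(IntEnum):
    """Statuses and exit codes taken from the table above."""
    SUCCESS = 0
    DEGRADED = 1
    FAILED = 2


def classify(step_results: Iterable[str]) -> Fidelity:
    """Collapse per-step labels into a single fidelity status."""
    results = list(step_results)
    if "FAILED" in results:      # a step could not be executed
        return Fidelity.FAILED
    if "DIVERGED" in results:    # every step ran, but some output differed
        return Fidelity.DEGRADED
    return Fidelity.SUCCESS      # every step matched the recording


status = classify(["OK", "OK", "DIVERGED"])
print(status.name, int(status))  # DEGRADED 1
```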
Integration with CI/CD
Because ctx replay returns a distinct exit code for each fidelity level, it fits naturally into automated testing pipelines. You can use it to ensure that updates to your agent's codebase don't break the reproducibility of known "golden" execution packs.
# Example CI script
# Any non-zero exit code (Degraded or Failed) fails the build.
ctx replay "$GOLDEN_PACK_HASH" || {
  echo "Execution drift detected!"
  exit 1
}
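A pipeline may want to treat Degraded and Failed differently, for example tolerating divergence in packs that contain non-deterministic model calls while still failing on steps that cannot run. The Python sketch below shows one possible policy. The function names and the allow_degraded option are hypothetical; only the ctx replay command and its exit codes come from this page.

```python
import subprocess


def ci_status(replay_exit_code: int, allow_degraded: bool = False) -> int:
    """Map a ctx replay exit code to the status the CI job should exit with."""
    if replay_exit_code == 0:
        return 0  # Success: every step matched
    if replay_exit_code == 1 and allow_degraded:
        return 0  # Degraded, tolerated by this (assumed) policy
    return 1      # Degraded (strict mode), Failed, or an unexpected code


def run_gate(pack_hash: str, allow_degraded: bool = False) -> int:
    """Run the replay and apply the policy. Requires ctx on the PATH."""
    completed = subprocess.run(["ctx", "replay", pack_hash])
    return ci_status(completed.returncode, allow_degraded)
```

A CI entry point could then call `sys.exit(run_gate(os.environ["GOLDEN_PACK_HASH"]))`. Keeping the policy in a pure function makes it easy to unit-test without invoking ctx.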
How Replay Handles Steps
The ctx substrate uses the Steps array within the pack manifest to drive the replay. Each step includes:
- Type/Tool: The specific function or model call performed.
- Parameters: The exact arguments passed to the tool.
- Deterministic Flag: A hint to the replayer. If a step is marked as Deterministic: true, any divergence in output during replay will be flagged as a significant fidelity loss.
- Environment Context: Replays are executed using the local environment, but ctx checks the recorded Environment (OS, Runtime, Tool Versions) in the pack to warn you if your current setup differs significantly from the original.
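To make the last two items concrete, here is a small Python sketch of an environment comparison and of how the Deterministic hint might change the way a divergence is reported. The field names, the warning wording, and the "minor" label for non-deterministic steps are assumptions for illustration; the actual manifest schema and ctx's notion of a "significant" difference may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Environment:
    """Hypothetical shape for the recorded Environment block."""
    os: str
    runtime: str
    tool_versions: Dict[str, str]


def environment_warnings(recorded: Environment, current: Environment) -> List[str]:
    """List the differences between the recorded and the local environment."""
    warnings = []
    if recorded.os != current.os:
        warnings.append(f"OS differs: recorded {recorded.os}, current {current.os}")
    if recorded.runtime != current.runtime:
        warnings.append(
            f"Runtime differs: recorded {recorded.runtime}, current {current.runtime}"
        )
    for tool, version in sorted(recorded.tool_versions.items()):
        found = current.tool_versions.get(tool)
        if found != version:
            warnings.append(
                f"Tool {tool} differs: recorded {version}, current {found or 'missing'}"
            )
    return warnings


def divergence_severity(deterministic: bool) -> str:
    """A divergence in a step marked deterministic is treated as significant."""
    return "significant" if deterministic else "minor"


recorded = Environment("linux", "python3.11", {"search_docs": "1.2.0"})
current = Environment("linux", "python3.12", {"search_docs": "1.2.0"})
print(environment_warnings(recorded, current))
# ['Runtime differs: recorded python3.11, current python3.12']
```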
Debugging Divergence
If a replay results in Degraded fidelity, you can use the ctx diff command to compare the original pack with a new pack generated from the replayed run. This helps identify exactly where the reasoning or tool output diverged.
# Pack the replayed run
ctx pack replayed_log.json
# Compare the original and the replay
ctx diff <original-hash> <replayed-hash> --human
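Conceptually, finding where two runs diverged means locating the first step whose output differs and then comparing those two outputs line by line. The Python sketch below illustrates that idea with the standard difflib module. It is not a description of how ctx diff works internally, and the function names and sample outputs are made up for the example.

```python
import difflib
from typing import List, Optional


def first_divergence(original: List[str], replayed: List[str]) -> Optional[int]:
    """Return the index of the first step whose output differs, or None."""
    for index, (before, after) in enumerate(zip(original, replayed)):
        if before != after:
            return index
    if len(original) != len(replayed):
        # One run has extra steps; divergence starts where the shorter one ends.
        return min(len(original), len(replayed))
    return None


def show_diff(original_output: str, replayed_output: str) -> str:
    """Render a unified diff of two step outputs."""
    lines = difflib.unified_diff(
        original_output.splitlines(),
        replayed_output.splitlines(),
        fromfile="original",
        tofile="replayed",
        lineterm="",
    )
    return "\n".join(lines)


original = ["file text", "3 hits", "Summary: A"]
replayed = ["file text", "3 hits", "Summary: B"]
step = first_divergence(original, replayed)
print(step)  # 2
if step is not None:
    print(show_diff(original[step], replayed[step]))
```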