Replay the failed Python run, not just the stack trace.

Retrace records a failed Python test or CI run as a deterministic replay. Open it in VS Code, step backwards from the failure, and inspect the runtime state that actually happened.

pip install retracesoftware

Open source · CPython 3.11–3.14 · Pytest in one command · VS Code replay

A failed pytest run replayed in VS Code. Step backwards from the exception and inspect the value that caused it.

Failed runs become artifacts

Keep failed pytest and CI runs as .retrace files.

See benchmarks →

Replay in VS Code

Open the failed run locally and debug the execution that actually happened.

Step backwards

Move backwards from the exception to the runtime state that caused it.

Runtime facts for AI agents

Give your AI coding agent real stack frames and values — not just logs and tracebacks.

No test rewrite

Wrap your existing pytest command. No special test harness required.

Production path

Start in CI. Use the same recording model for production failures when ready.

QUICK START
Replay a failed pytest run.
Run your tests normally. If they fail, keep the failed execution as a replayable .retrace artifact.
Step 1 - install
Shell
python -m venv .venv
source .venv/bin/activate

python -m pip install retracesoftware

No app rewrite required.
Step 2 - record pytest
Run pytest as usual, with one extra environment variable.
Record
mkdir -p recordings
RETRACE_RECORDING=recordings/failed-run.retrace \
  python -m pytest

If the test passes, discard the recording.

If it fails, keep it.
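
The keep-on-failure rule can also be scripted. The sketch below is a minimal wrapper, not part of Retrace: the helper name `run_recorded` is hypothetical, and the only thing taken from the steps above is the `RETRACE_RECORDING` environment variable. It runs a command with that variable set and deletes the recording when the command exits cleanly.

```python
import os
import subprocess
from pathlib import Path


def run_recorded(command, recording):
    """Run `command` with RETRACE_RECORDING set; keep the file only on failure.

    Hypothetical helper for illustration. It assumes nothing about Retrace
    beyond the RETRACE_RECORDING environment variable shown in Step 2.
    """
    recording = Path(recording)
    recording.parent.mkdir(parents=True, exist_ok=True)

    # Pass the recording path to the child process without touching our own env.
    env = {**os.environ, "RETRACE_RECORDING": str(recording)}
    exit_code = subprocess.run(command, env=env).returncode

    if exit_code == 0:
        # Passing run: discard the recording if one was written.
        recording.unlink(missing_ok=True)
    return exit_code
```

Calling `run_recorded([sys.executable, "-m", "pytest"], "recordings/failed-run.retrace")` would mirror the shell command above and return pytest's exit code.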
Step 3 - replay
Open the recording in VS Code and debug the original execution.
Replay
code .

# Open recordings/failed-run.retrace
# Start replay from the Retrace sidebar
# Step backwards from the failure

No live test process required. You are debugging the recorded execution.
Want to see this end-to-end with a real example? Try the 10-Minute Demo.
CI artifact
Keep failed CI runs as replayable artifacts.

Run pytest under Retrace in CI. If the job passes, ignore the recording. If it fails, upload the `.retrace` file as a build artifact.

Now the failed run does not disappear when the CI process exits. A developer can replay it locally in VS Code, step backwards from the failure, and inspect the runtime state that caused it.

An AI coding agent can use the same artifact as runtime context instead of guessing from logs and a traceback.
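
To make "runtime context" concrete, here is a small standard-library sketch of the kind of information a traceback leaves out: the local values in each frame at the moment of failure. It does not use Retrace and says nothing about the `.retrace` format. The functions `runtime_evidence`, `handle`, and `unit_price` are made up for this example.

```python
import traceback


def runtime_evidence(exc):
    """Return function name, line number, and local values for each frame."""
    frames = []
    for frame, lineno in traceback.walk_tb(exc.__traceback__):
        frames.append({
            "function": frame.f_code.co_name,
            "line": lineno,
            # repr() keeps the snapshot printable and easy to hand to a tool.
            "locals": {name: repr(value) for name, value in frame.f_locals.items()},
        })
    return frames


def unit_price(total, qty):
    return total / qty  # fails when qty == 0


def handle(order):
    qty = int(order["qty"])
    return unit_price(order["total"], qty)


try:
    handle({"total": 100, "qty": "0"})
except ZeroDivisionError as exc:
    evidence = runtime_evidence(exc)

# The innermost frame shows the values that caused the failure.
print(evidence[-1]["function"], evidence[-1]["locals"])
```

The traceback alone names the failing line. The frame snapshot also shows that `qty` was `0`, and one frame up, that it arrived as the string `"0"`.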

1. pytest in CI
2. failure
3. .retrace artifact
4. VS Code replay
Works as a plain CI artifact. No platform-specific plugin required.
GitHub Actions snippet
- name: Run pytest with Retrace
  run: |
    mkdir -p recordings
    RETRACE_RECORDING=recordings/failed-run.retrace python -m pytest

- name: Upload Retrace recording
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: retrace-failed-run
    path: recordings/failed-run.retrace
What makes this different
A stack trace tells you where Python crashed.
Retrace lets you replay the run that crashed.
| | Today | With Retrace |
| --- | --- | --- |
| CI artifacts | CI artifacts are logs and tracebacks | CI artifact is replayable |
| AI agents | AI agents infer from partial context | AI agents get runtime evidence |
| Failure | Stack trace shows where it crashed | Replay shows what happened before |
| What gets preserved | Logs show what you predicted would matter | Retrace preserves the failed execution |
The failed execution becomes something you can inspect, replay, and share.
Production
Production is the destination.

Same recording model, bigger stakes.

The `.retrace` artifact from a failed test uses the same architecture as a production crash replay. Start with tests today. Run the same tool against production when the trust is there.

Retrace records the boundary between your Python code and the outside world — databases, APIs, files, time, randomness, and other non-deterministic calls — then replays those results locally.

How it works

1. Python code

2. Boundary calls

DB
API
Files
Time
Randomness

3. .retrace recording

CALL
RESULT
ERROR

4. Local replay

Same code, external calls stubbed
Thread ordering preserved
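
The record-then-stub idea in the four steps above can be illustrated with a toy. This is not Retrace's implementation or API: the `Boundary` class and its log format are invented for the example, it is single-threaded, and it only handles results that serialise to JSON. It shows the general technique: record what each boundary call returned, then feed the same results back on replay.

```python
import json
import random
import time


class Boundary:
    """Toy record/replay wrapper for non-deterministic calls (illustration only)."""

    def __init__(self, mode, log=None):
        self.mode = mode  # "record" or "replay"
        self.log = log if log is not None else []
        self._cursor = 0

    def call(self, name, fn, *args):
        if self.mode == "record":
            try:
                result = fn(*args)
            except Exception as exc:
                self.log.append({"call": name, "error": repr(exc)})
                raise
            self.log.append({"call": name, "result": result})
            return result

        # Replay: never touch the outside world, return what was recorded.
        entry = self.log[self._cursor]
        self._cursor += 1
        if entry["call"] != name:
            raise RuntimeError(f"replay diverged: expected {entry['call']}, got {name}")
        if "error" in entry:
            # The toy keeps only the message, not the original exception type.
            raise RuntimeError(entry["error"])
        return entry["result"]


def quote(boundary):
    """Code under test: depends on randomness and the clock."""
    jitter = boundary.call("random", random.random)
    now = boundary.call("time", time.time)
    return round(jitter * 100, 2), int(now)


recorder = Boundary("record")
original = quote(recorder)

# Round-trip through JSON to stand in for writing and reading a recording file.
saved = json.loads(json.dumps(recorder.log))
replayed = quote(Boundary("replay", log=saved))

print(original == replayed)
```

Because `quote` reaches the outside world only through `boundary.call`, the replay takes the same path as the original run without calling `random` or `time` again.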
PROVENANCE ENGINE · EARLY ACCESS
Replay shows you what happened.
Provenance shows you why.

A recording lets you step through the execution. But when you're staring at a wrong value, the real question is: where did it come from?

Retrace's provenance engine traces any value back through the execution — from the point you noticed it, through every transformation, to the original input that caused it.

  • Select any value. Jump to its origin.

    Click a variable in the debugger and instantly see the exact line and inputs that produced it.

  • Chain backwards through transformations.

    Each origin has its own provenance. Keep drilling back until you reach the root cause.

  • Works on every value, not just outputs.

    Intermediate variables, function returns, container mutations — provenance covers everything in the execution.

Now in early access with select design partners.

 

PROVENANCE DRILLBACK

 


Three clicks from ZeroDivisionError to root cause: the API caller sent qty: "0" in the request body. No manual searching. No log correlation.
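
The drillback above can be pictured as a chain of origins. The sketch below is a hand-built toy, not the provenance engine: the `Tracked` class is hypothetical and every link is recorded manually, whereas the engine is described as covering every value in the execution. It reuses the `qty: "0"` example to show what walking back from a value to its root input looks like.

```python
from dataclasses import dataclass


@dataclass
class Tracked:
    """A value plus a note on where it came from (illustration only)."""
    value: object
    origin: str
    parents: tuple = ()

    def lineage(self):
        """Return origins from this value back to its root inputs."""
        chain = [self.origin]
        for parent in self.parents:
            chain.extend(parent.lineage())
        return chain


# Each transformation records its input as a parent.
body = Tracked({"qty": "0"}, "request body sent by the API caller")
qty_raw = Tracked(body.value["qty"], 'body["qty"]', (body,))
qty = Tracked(int(qty_raw.value), "int(qty_raw)", (qty_raw,))

try:
    unit_price = 100 / qty.value
except ZeroDivisionError:
    # Walk back from the bad value to the input that caused it.
    steps = qty.lineage()
    for step in steps:
        print(step)
```

Each printed line corresponds to one step back through the transformations, ending at the request body.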

How it works.
1. RUN
   Run pytest, CI, or your Python app with Retrace enabled.
2. RECORD
   Retrace records the execution into a `.retrace` artifact, such as run-2025-05-05.retrace: a single source of truth.
3. REPLAY
   Open the artifact locally in VS Code and debug the same execution.
4. STEP BACKWARDS
   Move backwards from the failure and inspect the runtime state that caused it.
Rich runtime context for AI agents and root cause analysis.
Deterministic & reproducible
Works locally, offline
Shareable artifact
Perfect context for AI
Q&A Section
Getting started
Can I use this with pytest?
Yes. The first workflow is wrapping an existing pytest command and keeping the `.retrace` file when the run fails.
Do I need to change my tests?
No. The goal is to wrap the command you already run.
What Python versions work?
CPython 3.11–3.14.
CI, AI, production
Can I use this in CI?
Yes. Start by uploading the `.retrace` file as a normal CI artifact on failure.
How does this help AI coding agents?
Agents normally see source code, logs, and tracebacks. Retrace gives them runtime evidence from the failed execution.
Can I run Retrace in production?
That is the destination. Tests and CI are the easiest place to start; the same recording model applies to production failures.
How It’s Different
Why can’t I just re-run the request?
Because many production failures depend on timing, concurrency, external services, or non-deterministic behavior.

A re-run often takes a different path.

Retrace lets you debug the exact execution that happened, after the fact.
How is this different from logging or APM tools?
Logs and APM tools show symptoms, and only for what you chose to instrument. They can’t reconstruct past state.

Retrace records the real execution and lets you replay it deterministically, so you can inspect the actual code path and state.
Can it catch race conditions and flaky tests?
Yes. Retrace captures timing and thread interactions and replays them deterministically. This helps reproduce race conditions and flaky CI failures by replaying the run that failed.
Open Source & Community
Is it really open source?
Retrace is open source and built for Python developers.
Why is this a preview release?
We’re opening the agent early to gather feedback while we expand Python/library coverage and harden for GA.
How can I contribute?
Try the agent, file issues, and submit PRs on GitHub. Library compatibility reports and docs fixes are great first contributions.
How does it work?

Retrace records external interactions (DB, API calls, file I/O, time) during a real run, then replays them deterministically in your local debugger — no prod access needed.

1. App runs normally: your production app runs as usual while external calls are captured automatically.
2. Bug happens: Retrace captures it.
3. Debug locally: replay the exact execution in VS Code.
Use cases

Perfect for:

  • Debug failed CI runs.
    Replay the exact execution that already happened. No repro steps required.
  • Help AI agents debug broken tests.
    Give coding agents runtime evidence from the failed run, not just a traceback.
  • Stabilise flaky tests.
    Replay the exact failure to understand non-deterministic behaviour.
  • Reproduce external dependency failures.
    Replay failures involving APIs, databases, files, time, or other external calls.
  • Investigate after the fact.
    Inspect real code paths and runtime state after the process has exited.
  • Debug production-only failures.
    Use the same recording model for production crashes you cannot reproduce locally.

 

Built by Retrace Software.

Backed by Preston-Werner Ventures.

 

Get launch updates

One email per month. Demos + release notes. Unsubscribe anytime.

Ready to replay your next failed Python test?
Run pytest with Retrace, keep the failed execution, and debug what actually happened.

Open source · CPython 3.11–3.14 · Pytest in one command · VS Code replay