Replay the failed Python run, not just the stack trace.

Retrace records a failed Python test or CI run as a deterministic replay. Open it in VS Code, step backwards from the failure, and inspect the runtime state that actually happened.

pip install retracesoftware

Try the pytest quickstart

Watch the demo

Open source · Python 3.11 & 3.12 · Pytest in one command · VS Code replay

A failed pytest run replayed in VS Code. Step backwards from the exception and inspect the value that caused it.

Failed runs become artifacts

Keep failed pytest and CI runs as .retrace files.

See benchmarks →

Replay in VS Code

Open the failed run locally and debug the execution that actually happened.

Step backwards

Move backwards from the exception to the runtime state that caused it.

Runtime facts for AI agents

Give your AI coding agent real stack frames and values — not just logs and tracebacks.

No test rewrite

Wrap your existing pytest command. No special test harness required.

Production path

Start in CI. Use the same recording model for production failures when ready.

QUICK START

Replay a failed pytest run.

Run your tests normally. If they fail, keep the failed execution as a replayable .retrace artifact.

Step 1 - install

Shell


    python -m venv .venv
    source .venv/bin/activate
    python -m pip install retracesoftware

No app rewrite required.

Step 2 - record pytest

Run pytest as usual, with one environment variable set.

Record


    mkdir -p recordings
    RETRACE_RECORDING=recordings/failed-run.retrace \
      python -m pytest

If the test passes, discard the recording. If it fails, keep it.
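The keep-on-failure rule can also be scripted. The following is a minimal sketch, assuming only what the quickstart shows: recording is enabled by the RETRACE_RECORDING environment variable and the recording is a single file. The helper name run_pytest_recorded and its command parameter are hypothetical and are not part of Retrace.

```python
import os
import subprocess
import sys
from pathlib import Path


def run_pytest_recorded(recording, pytest_args=(), command=None):
    """Run pytest with RETRACE_RECORDING set; keep the recording only on failure.

    `command` is a hypothetical hook for substituting another command line;
    by default the current interpreter runs `python -m pytest`.
    """
    recording = Path(recording)
    recording.parent.mkdir(parents=True, exist_ok=True)

    # Same environment as the caller, plus the one variable from the quickstart.
    env = dict(os.environ, RETRACE_RECORDING=str(recording))
    cmd = list(command) if command else [sys.executable, "-m", "pytest", *pytest_args]
    result = subprocess.run(cmd, env=env)

    # Exit code 0 means the run passed, so the recording is not needed.
    # Assumption: the recording is a single file, not a directory.
    if result.returncode == 0 and recording.exists():
        recording.unlink()
    return result.returncode
```

Calling run_pytest_recorded("recordings/failed-run.retrace") mirrors the shell commands above and returns pytest's exit code, so a CI step can pass it straight to sys.exit.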

Step 3 - replay

Open the recording in VS Code and debug the original execution.

Replay


    code .
    # Open recordings/failed-run.retrace
    # Start replay from the Retrace sidebar
    # Step backwards from the failure

No live test process required. You are debugging the recorded execution.

Want to see this end-to-end with a real example? Try the 10-minute demo.

Try the pytest quickstart

CI artifact

Keep failed CI runs as replayable artifacts.

Run pytest under Retrace in CI. If the job passes, ignore the recording. If it fails, upload the `.retrace` file as a build artifact.

Now the failed run does not disappear when the CI process exits. A developer can replay it locally in VS Code, step backwards from the failure, and inspect the runtime state that caused it.

An AI coding agent can use the same artifact as runtime context instead of guessing from logs and a traceback.

1. pytest in CI

2. failure

3. .retrace artifact

4. VS Code replay

Works as a plain CI artifact. No platform-specific plugin required.

GitHub Actions snippet


    - name: Run pytest with Retrace
      run: |
        mkdir -p recordings
        RETRACE_RECORDING=recordings/failed-run.retrace python -m pytest
    - name: Upload Retrace recording
      if: failure()
      uses: actions/upload-artifact@v4
      with:
        name: retrace-failed-run
        path: recordings/failed-run.retrace

What makes this different

A stack trace tells you where Python crashed.
Retrace lets you replay the run that crashed.

	Today	With Retrace
CI artifacts	CI artifacts are logs and tracebacks	CI artifacts are replayable
AI agents	AI agents infer from partial context	AI agents get runtime evidence
Failure	Stack trace shows where it crashed	Replay shows what happened before
What gets preserved	Logs show what you predicted would matter	Retrace preserves the failed execution

The failed execution becomes something you can inspect, replay, and share.

Production

Production is the destination.

Same recording model, bigger stakes.

The `.retrace` artifact from a failed test uses the same architecture as a production crash replay. Start with tests today. Run the same tool against production when the trust is there.

Retrace records the boundary between your Python code and the outside world — databases, APIs, files, time, randomness, and other non-deterministic calls — then replays those results locally.

Read the architecture →

How it works

1. Python code

2. Boundary calls

API

Files

Time

Randomness

3. .retrace recording

CALL

RESULT

ERROR

4. Local replay

Same code, external calls stubbed

Thread ordering preserved
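The CALL / RESULT / ERROR idea can be illustrated with a toy proxy. This is a conceptual sketch only, not Retrace's implementation or recording format: the Boundary class, its log layout, and flaky_price are hypothetical names invented for this example.

```python
import random
import time


class Boundary:
    """Toy record/replay proxy for non-deterministic calls."""

    def __init__(self, log=None):
        # With a log we replay; without one we record.
        self.replaying = log is not None
        self.log = list(log) if log is not None else []
        self._cursor = 0

    def call(self, name, func, *args):
        if self.replaying:
            # Replay: never touch the outside world, return what was recorded.
            entry = self.log[self._cursor]
            self._cursor += 1
            if entry["call"] != name:
                raise RuntimeError(f"replay diverged at {name}")
            if "error" in entry:
                raise entry["error"]
            return entry["result"]
        # Record: perform the real call and log its RESULT or ERROR.
        try:
            result = func(*args)
        except Exception as exc:
            self.log.append({"call": name, "error": exc})
            raise
        self.log.append({"call": name, "result": result})
        return result


def flaky_price(boundary):
    """Application code whose output depends on time and randomness."""
    now = boundary.call("time.time", time.time)
    roll = boundary.call("random.random", random.random)
    return round(now * roll, 3)


recorder = Boundary()
recorded = flaky_price(recorder)                    # real calls, logged
replayed = flaky_price(Boundary(log=recorder.log))  # same code, stubbed calls
assert replayed == recorded
```

Because the application code only sees recorded results during replay, it takes the same path it took in the original run, even though the clock and the random generator have moved on.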

PROVENANCE ENGINE · EARLY ACCESS

Replay shows you what happened.
Provenance shows you why.

A recording lets you step through the execution. But when you're staring at a wrong value, the real question is: where did it come from?

Retrace's provenance engine traces any value back through the execution — from the point you noticed it, through every transformation, to the original input that caused it.

Select any value. Jump to its origin.

Click a variable in the debugger and instantly see the exact line and inputs that produced it.
Chain backwards through transformations.

Each origin has its own provenance. Keep drilling back until you reach the root cause.
Works on every value, not just outputs.

Intermediate variables, function returns, container mutations — provenance covers everything in the execution.

Now in early access with select design partners.

See how provenance works →

Request early access →

PROVENANCE DRILLBACK

Three clicks from ZeroDivisionError to root cause: the API caller sent qty: "0" in the request body. No manual searching. No log correlation.
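To make the drillback idea concrete, here is a toy sketch of value provenance. It is an illustration under stated assumptions, not the provenance engine: Traced, derive, and lineage are hypothetical names, and the total field is invented alongside the qty: "0" value from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traced:
    """A value plus a note on how it was produced and from which inputs."""
    value: object
    note: str
    parents: tuple = ()


def derive(note, func, *inputs):
    """Apply func to the raw input values and remember the inputs as parents."""
    return Traced(func(*(i.value for i in inputs)), note, inputs)


def lineage(node):
    """Walk back from a value to its origins, nearest step first."""
    chain, stack = [], [node]
    while stack:
        current = stack.pop()
        chain.append(f"{current.note} = {current.value!r}")
        stack.extend(reversed(current.parents))
    return chain


# Hypothetical request body; only the qty value comes from the text above.
body = Traced({"qty": "0", "total": 120}, "request body")
qty_raw = derive("body['qty']", lambda b: b["qty"], body)
qty = derive("int(qty_raw)", int, qty_raw)
total = derive("body['total']", lambda b: b["total"], body)

try:
    unit_price = derive("total / qty", lambda t, q: t / q, total, qty)
except ZeroDivisionError:
    # Drill back from the zero divisor to the input that produced it.
    origin = lineage(qty)
    print("\n".join(origin))
```

Each entry in the printed chain is one step back: the integer divisor, the string it was parsed from, and the request body that carried it.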

How it works.

RUN

Run pytest, CI, or your Python app with Retrace enabled.

RECORD

Retrace records the execution into a `.retrace` artifact.

run-2025-05-05.retrace

~ Single source of truth

REPLAY

Open the artifact locally in VS Code and debug the same execution.

STEP BACKWARDS

Move backwards from the failure and inspect the runtime state that caused it.

Rich runtime context for AI agents and root cause analysis.

Deterministic & reproducible

Works locally, offline

Shareable artifact

Perfect context for AI

Q&A Section

Getting started

Can I use this with pytest?

Yes. The first workflow is wrapping an existing pytest command and keeping the `.retrace` file when the run fails.

Do I need to change my tests?

No. The goal is to wrap the command you already run.

What Python versions work?

Python 3.11 and 3.12.

CI, AI, production

Can I use this in CI?

Yes. Start by uploading the `.retrace` file as a normal CI artifact on failure.

How does this help AI coding agents?

Agents normally see source code, logs, and tracebacks. Retrace gives them runtime evidence from the failed execution.

Can I run Retrace in production?

That is the destination. Tests and CI are the easiest place to start; the same recording model applies to production failures.

How It’s Different

Why can’t I just re-run the request?

Because many production failures depend on timing, concurrency, external services, or non-deterministic behavior.

A re-run often takes a different path.

Retrace lets you debug the exact execution that happened, after the fact.

How is this different from logging or APM tools?

Logs and APM tools show symptoms, and only for what you chose to instrument. They can’t reconstruct past runtime state.

Retrace records the real execution and lets you replay it deterministically, so you can inspect the actual code path and state.

Can it catch race conditions and flaky tests?

Yes. Retrace captures timing and thread interactions and replays them deterministically. This helps reproduce race conditions and flaky CI failures by replaying the run that failed.
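One way to picture deterministic replay of thread interactions is to log the order in which threads pass a synchronization point, then enforce that order on a later run. The sketch below is a simplified illustration, not how Retrace captures thread ordering: OrderRecorder, OrderReplayer, and run are hypothetical names.

```python
import threading


class OrderRecorder:
    """Record mode: log which thread ran each synchronized step."""

    def __init__(self):
        self._lock = threading.Lock()
        self.order = []

    def step(self, name, action):
        with self._lock:
            self.order.append(name)
            action()


class OrderReplayer:
    """Replay mode: make threads take their steps in the recorded order."""

    def __init__(self, order):
        self._order = list(order)
        self._pos = 0
        self._cond = threading.Condition()

    def _my_turn(self, name):
        return self._pos < len(self._order) and self._order[self._pos] == name

    def step(self, name, action):
        with self._cond:
            # Block until the recorded schedule says this thread goes next.
            self._cond.wait_for(lambda: self._my_turn(name))
            action()
            self._pos += 1
            self._cond.notify_all()


def run(scheduler, names=("A", "B", "C"), steps=3):
    """Start one worker per name; each performs `steps` synchronized steps."""
    events = []

    def worker(name):
        for i in range(steps):
            scheduler.step(name, lambda: events.append((name, i)))

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


recorder = OrderRecorder()
recorded = run(recorder)                        # whatever order the OS chose
replayed = run(OrderReplayer(recorder.order))   # the same order, enforced
assert replayed == recorded
```

The recorded interleaving may differ between runs, but replaying a given recording always reproduces that run's interleaving, which is what makes a flaky failure inspectable.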

Open Source & Community

Is it really open source?

Retrace is open source and built for Python developers.

Why is this a preview release?

We’re releasing Retrace early to gather feedback while we expand Python and library coverage and harden it for general availability.

How can I contribute?

Try Retrace, file issues, and submit PRs on GitHub. Library compatibility reports and docs fixes are great first contributions.

How does it work?

Retrace records external interactions (DB, API calls, file I/O, time) during a real run, then replays them deterministically in your local debugger — no prod access needed.

App runs normally

External calls captured automatically

Debug the exact execution locally

Use cases

Perfect for:

Debug failed CI runs. Replay the exact execution that already happened. No repro steps required.
Help AI agents debug broken tests. Give coding agents runtime evidence from the failed run, not just a traceback.
Stabilise flaky tests. Replay the exact failure to understand non-deterministic behaviour.
Reproduce external dependency failures. Replay failures involving APIs, databases, files, time, or other external calls.
Investigate after the fact. Inspect real code paths and runtime state after the process has exited.
Debug production-only failures. Use the same recording model for production crashes you cannot reproduce locally.

Open Source

Star the repo

github.com/retracesoftware/retracesoftware

Community Discussion

Discuss failed-run workflows and AI debugging

Documentation

Quickstart and CI examples

Report Bugs

Report bugs and tell us where replay breaks

Built by Retrace Software.

Backed by Preston-Werner Ventures.
