Inductor Debug Artifacts
Stack: torch-spyre (new, Inductor-based).
TORCH_COMPILE_DEBUG=1 causes torch.compile to dump the intermediate
representation of each compiled function to disk. These artifacts are
the primary tool for answering “what did the compiler actually do with
my model?”.
Tip
For a step-by-step debugging workflow that uses these artifacts (plus
the sendnn backend bisect), see Debugging. This page is a
reference for the artifact layout itself.
Enabling artifact dumps
TORCHINDUCTOR_FORCE_DISABLE_CACHES=1 \
TORCH_COMPILE_DEBUG=1 \
python my_script.py
TORCHINDUCTOR_FORCE_DISABLE_CACHES=1 is important — without it,
Inductor will reuse cached compilation results and no new artifacts
are written.
Artifacts land under /tmp/torchinductor_<user>/ or
./torch_compile_debug/ depending on the PyTorch version.
Directory layout
torch_compile_debug/
└── run_<timestamp>-pid_<pid>/
├── torchdynamo/
│ └── debug.log
└── torchinductor/
├── aot_model___0_debug.log
└── model__0_inference_0.0/
├── fx_graph_readable.py ← ATen graph (human-readable)
├── fx_graph_runnable.py ← self-contained runnable graph
├── fx_graph_transformed.py ← FX graph after Inductor passes
├── inductor_provenance_tracking_node_mappings.json ← IR ↔ source mapping
├── ir_pre_fusion.txt ← LoopLevelIR before fusion
├── ir_post_fusion.txt ← LoopLevelIR after fusion
├── output_code.py ← generated host code
└── sdsc_<index>.json ← per-kernel specs fed to DeepTools backend
What each layer tells you
fx_graph_readable.py
The ATen graph after Dynamo capture. Reading it answers:
Is the operation you expect actually present, or was it decomposed?
Are input shapes and dtypes what you expect?
Did any unwanted decomposition change semantics?
fx_graph_transformed.py
The FX graph after Inductor’s pre-grad and post-grad passes (padding
insertion, fusion hints, etc.). Diff this against
fx_graph_readable.py to see what the frontend passes changed.
ir_pre_fusion.txt / ir_post_fusion.txt
LoopLevelIR — nested loops with buffer shapes and strides. Reading it answers:
Do loop ranges match the tensor sizes including padding?
Did fusion happen where you expected?
Mismatches here typically indicate a bug in Inductor lowering or in stickification.
sdsc_<index>.json
The final specifications handed to the DeepTools back-end — one
sdsc_<index>.json per compiled kernel in the graph (e.g.,
sdsc_0.json, sdsc_1.json, …), indexed in lowering order. Each file
encodes:
Op name (e.g.,
clone,bmm,layernorm)Input/output tensor layouts (
device_size,stride_map,device_dtype)Work division (how cores split the op)
Scratchpad allocations
Bugs that only show up in the final output frequently trace back to one
of these files — when a kernel produces the wrong numeric result, find
the corresponding sdsc_<index>.json (cross-reference output_code.py
to map kernel index → op) and inspect it first.
inductor_provenance_tracking_node_mappings.json
When INDUCTOR_PROVENANCE=1 is also set, this JSON records the
mapping between IR nodes and the original source ops. Combined with
tlparse this gives you an
HTML visualisation of how each source op flowed through the pipeline.
output_code.py
The generated host code — what actually runs on CPU to launch the compiled kernels. Look here when you suspect launch overhead or host- side glue is on the critical path.
Inductor provenance tracking
INDUCTOR_PROVENANCE=1 + TORCH_TRACE=<dir> produces a trace log that
tlparse renders into a
three-stage HTML viewer showing how each source op is transformed
through Inductor.
pip install tlparse
TORCH_TRACE=~/my_trace_log_dir \
INDUCTOR_PROVENANCE=1 \
python my_script.py
tlparse log_file_name.log --inductor-provenance
Known limitation: the post-grad panel renders empty when the program contains only a single operator, and link-highlighting may break in that case. See the PyTorch provenance docs for the upstream reference.
Quick reference
# Dump everything for a minimal reproducer
TORCHINDUCTOR_FORCE_DISABLE_CACHES=1 \
TORCH_SPYRE_DEBUG=1 \
TORCH_COMPILE_DEBUG=1 \
INDUCTOR_PROVENANCE=1 \
python my_reproducer.py
# Locate the artifacts
find . -name "sdsc_*.json" 2>/dev/null
find /tmp -name "fx_graph_readable.py" 2>/dev/null
See also
Debugging — step-by-step use of these artifacts during a bug investigation