benchmarks.SaturnNPU.scripts.analyze_npu_graph
Source: benchmarks/SaturnNPU/scripts/analyze_npu_graph.py
Multi-level SmolVLA compute graph decomposition for NPU coverage analysis.
Traces operations through every compilation level:
Python module tree → Torch-MLIR → Linalg/Input → Global-Opt/NPU ISA
Produces a structured JSON manifest and CSV breakdown.
Usage
python tools/analyze_npu_graph.py \
    --understanding-pi0 third_party/Understanding-PI0 \
    --torch-mlir build/compiled_models/smolVLA/.../smolVLA.q.fp8.mlir \
    --linalg-input build/compiled_models/smolVLA/.../phases/module.1.input.mlir \
    --global-opt build/compiled_models/smolVLA/.../phases/module.4.global-optimization.mlir \
    --output-dir benchmarks/SaturnNPU/
OpRecord
dataclass
A single MLIR operation extracted from a file.
SemanticBlock
dataclass
A model-level semantic block traced through compilation levels.
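As a rough illustration of how the two record types above could relate, here is a minimal sketch. The class names `OpRecordSketch` and `SemanticBlockSketch` and every field in them are hypothetical; the actual fields of `OpRecord` and `SemanticBlock` are not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class OpRecordSketch:
    """Hypothetical stand-in for OpRecord; all field names are assumptions."""
    name: str                            # op name, e.g. "linalg.matmul"
    line_number: int                     # 1-based line in the MLIR file
    result_shape: tuple = ()             # static result dims, if known
    attributes: dict = field(default_factory=dict)


@dataclass
class SemanticBlockSketch:
    """Hypothetical stand-in for SemanticBlock; all field names are assumptions."""
    name: str                            # e.g. "gemma.attention"
    # Ops belonging to this block, keyed by compilation level.
    ops_by_level: dict = field(default_factory=dict)


rec = OpRecordSketch(name="linalg.matmul", line_number=42, result_shape=(64, 128))
block = SemanticBlockSketch(name="gemma.attention")
block.ops_by_level.setdefault("linalg", []).append(rec)
```

Using `field(default_factory=...)` keeps mutable defaults from being shared between instances.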
assert_counts(torch_data, linalg_data, global_data)
Check op counts against known values. Returns a list of failures.
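A check of this kind might be sketched as follows. The helper name `check_counts_sketch`, its flat dictionary inputs, and the message format are assumptions for illustration; the real function takes the three parsed data structures and its expected values are not shown here.

```python
def check_counts_sketch(observed, expected):
    """Compare observed op counts with expected ones.

    Returns a list of human-readable failure strings; an empty list
    means every expected count matched. (Hypothetical helper.)
    """
    failures = []
    for op_name, want in expected.items():
        got = observed.get(op_name, 0)  # a missing op counts as zero
        if got != want:
            failures.append(f"{op_name}: expected {want}, got {got}")
    return failures


# Example op names and counts are made up for the illustration.
failures = check_counts_sketch(
    observed={"matmul": 10, "softmax": 3},
    expected={"matmul": 10, "softmax": 4},
)
```

Returning the failures instead of raising on the first mismatch lets a caller report all discrepancies in one run.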
build_cross_level_summary(torch_data, linalg_data, global_data)
Build the cross-level op mapping summary.
compute_coverage(linalg_data, global_data)
Compute kernel-level coverage: the percentage of total model compute that each kernel type covers.
The framing is: "if we implement kernel type X on the NPU, what fraction of total model compute does it cover?" This is not about what is currently lowered; it is about which kernels we need to implement.
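The coverage framing can be illustrated with a small sketch. It assumes per-kernel FLOP totals are already available as a dictionary; the function name `coverage_by_kernel` and the example numbers are hypothetical, not taken from the tool.

```python
def coverage_by_kernel(flops_by_kernel):
    """Return the percentage of total FLOPs attributed to each kernel type.

    Hypothetical helper: the real compute_coverage works on the parsed
    linalg and global-opt data rather than a flat dictionary.
    """
    total = sum(flops_by_kernel.values())
    if total == 0:
        # Avoid division by zero when no compute was recorded.
        return {kernel: 0.0 for kernel in flops_by_kernel}
    return {
        kernel: 100.0 * flops / total
        for kernel, flops in flops_by_kernel.items()
    }


# Made-up FLOP totals, for illustration only.
coverage = coverage_by_kernel({"matmul": 900.0, "softmax": 100.0})
```

Read this way, a kernel's percentage answers "how much of the model would run on the NPU if only this kernel were implemented".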
compute_per_layer_decomposition(lines, linalg_data)
Compute what each PyTorch layer type decomposes into at the linalg level.
Finds representative ranges for SigLIP attention, SigLIP MLP, Gemma attention, and Gemma MLP by locating key marker ops.
detect_composite_patterns(lines)
Detect composite op patterns in the linalg IR and map them to PyTorch ops.
Scans the op sequence for known multi-op patterns that correspond to single PyTorch operations.
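A scan for multi-op patterns could look roughly like the sketch below. The pattern table and op names are invented for the example, and the sketch works on a plain list of op names, whereas the real function receives the file's lines.

```python
# Hypothetical pattern table: label -> exact sequence of op names.
PATTERNS = {
    "softmax": ("reduce_max", "sub", "exp", "reduce_sum", "div"),
    "rms_norm": ("mul", "reduce_sum", "rsqrt", "mul"),
}


def find_composites(op_names, patterns=PATTERNS):
    """Return (label, start_index) for each non-overlapping pattern match."""
    matches = []
    i = 0
    while i < len(op_names):
        for label, pattern in patterns.items():
            if tuple(op_names[i:i + len(pattern)]) == pattern:
                matches.append((label, i))
                i += len(pattern)  # skip past the matched ops
                break
        else:
            i += 1  # no pattern starts here; advance one op
    return matches


ops = ["matmul", "reduce_max", "sub", "exp", "reduce_sum", "div", "matmul"]
found = find_composites(ops)
```

This sketch only matches exact contiguous sequences; tolerating interleaved ops would need a looser matcher.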
estimate_flops_for_op(op)
Estimate FLOPs for a single operation.
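For orientation, a common FLOP-counting convention is sketched below: a matrix multiply of an M×K matrix by a K×N matrix costs 2·M·N·K FLOPs (one multiply and one add per accumulated term), and an elementwise op costs one FLOP per output element. The function name, the `kind` strings, and the shape layout are assumptions; the rules used by `estimate_flops_for_op` are not documented here.

```python
from math import prod


def estimate_flops_sketch(kind, shape):
    """Estimate FLOPs under a simple, commonly used convention.

    kind  -- hypothetical op category: "matmul", "batch_matmul", "elementwise"
    shape -- (M, N, K), (B, M, N, K), or the output shape respectively
    """
    if kind == "matmul":
        m, n, k = shape
        return 2 * m * n * k          # multiply + add per accumulated term
    if kind == "batch_matmul":
        b, m, n, k = shape
        return 2 * b * m * n * k
    if kind == "elementwise":
        return prod(shape)            # one FLOP per output element
    return 0                          # unknown ops contribute nothing


matmul_flops = estimate_flops_sketch("matmul", (64, 128, 256))
```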
group_torch_ops_into_blocks(torch_data)
Group torch-MLIR ops into semantic blocks (attention, MLP, etc.).
Strategy: walk ops in order and recognize repeating patterns.
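One simple way to walk ops in order and split them into repeating units is to start a new block at each occurrence of a marker op. The sketch below assumes such a marker exists; the helper name and the op names are hypothetical, and the real function may recognize patterns differently.

```python
def group_by_marker(op_names, marker):
    """Split an ordered op list into blocks that each begin at `marker`.

    Ops before the first marker form their own leading block.
    (Hypothetical helper.)
    """
    blocks = []
    current = []
    for name in op_names:
        if name == marker and current:
            blocks.append(current)   # close the previous block
            current = []
        current.append(name)
    if current:
        blocks.append(current)
    return blocks


ops = ["norm", "matmul", "softmax", "matmul",
       "norm", "matmul", "gelu", "matmul"]
blocks = group_by_marker(ops, "norm")
```

Each resulting block can then be labeled by its contents, for example treating a block that contains a softmax as attention.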
parse_global_opt(path)
Parse the global-optimization MLIR file for NPU ISA ops and classify generics.
Separates function body from initializers to show what's been hoisted.
parse_linalg_input(path)
Parse the linalg/input MLIR file.
parse_pytorch_module_tree(pi0_dir)
Extract the canonical module tree from the Understanding-PI0 README.
Returns an ordered list of semantic block descriptors.
parse_torch_mlir(path)
Parse the Torch-MLIR file and extract key ops with shapes.
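Extracting op names and tensor shapes from Torch-MLIR text can be approximated with regular expressions, as in this sketch. It assumes ops appear one per line in the textual form `%r = torch.<op> ... : !torch.vtensor<[dims],dtype> ...`; the function name and the return format are hypothetical, and the real parser may be more thorough.

```python
import re

# Op name following an SSA assignment, e.g. "= torch.aten.mm".
OP_RE = re.compile(r"=\s*(torch\.[\w.]+)")
# Dimension list inside a value-tensor type, e.g. "[64,128]".
SHAPE_RE = re.compile(r"!torch\.vtensor<\[([\d,?]*)\],\s*\w+>")


def extract_ops(text):
    """Return a list of (op_name, [shape, ...]) for each op line.

    Dynamic dimensions ("?") are recorded as -1.
    """
    records = []
    for line in text.splitlines():
        match = OP_RE.search(line)
        if match is None:
            continue
        shapes = [
            tuple(-1 if d == "?" else int(d) for d in dims.split(",") if d)
            for dims in SHAPE_RE.findall(line)
        ]
        records.append((match.group(1), shapes))
    return records


sample = (
    "%2 = torch.aten.mm %0, %1 : !torch.vtensor<[64,128],f32>, "
    "!torch.vtensor<[128,256],f32> -> !torch.vtensor<[64,256],f32>"
)
records = extract_ops(sample)
```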
write_csv(torch_blocks, linalg_data, output_path)
Write per-op CSV breakdown.
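A per-op CSV can be produced with the standard `csv` module, as in this sketch. The column names and the row data are assumptions for illustration; the actual columns written by `write_csv` are not documented here.

```python
import csv
import io


def write_rows_sketch(rows, stream):
    """Write one CSV row per op to an open text stream.

    Hypothetical helper; the column names are assumptions.
    """
    fieldnames = ["block", "op", "count"]
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer; a real caller would open output_path
# with newline="" as the csv module documentation recommends.
buffer = io.StringIO()
write_rows_sketch(
    [{"block": "attention", "op": "matmul", "count": 4}],
    buffer,
)
csv_lines = buffer.getvalue().splitlines()
```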