`benchmarks.SaturnNPU.scripts.analyze_npu_graph`

Source: benchmarks/SaturnNPU/scripts/analyze_npu_graph.py

Multi-level SmolVLA compute graph decomposition for NPU coverage analysis.

Traces operations through every compilation level:

Python module tree → Torch-MLIR → Linalg/Input → Global-Opt/NPU ISA

Produces a structured JSON manifest and CSV breakdown.

Usage

    python tools/analyze_npu_graph.py \
        --understanding-pi0 third_party/Understanding-PI0 \
        --torch-mlir build/compiled_models/smolVLA/.../smolVLA.q.fp8.mlir \
        --linalg-input build/compiled_models/smolVLA/.../phases/module.1.input.mlir \
        --global-opt build/compiled_models/smolVLA/.../phases/module.4.global-optimization.mlir \
        --output-dir benchmarks/SaturnNPU/

`OpRecord` `dataclass`

A single MLIR operation extracted from a file.
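The fields of `OpRecord` are not listed on this page. As a rough illustration only, a record of this kind might look like the sketch below; the class name `OpRecordSketch` and every field in it are hypothetical and are not taken from the module.

```python
from dataclasses import dataclass, field


@dataclass
class OpRecordSketch:
    """Hypothetical stand-in for OpRecord; the real fields may differ."""

    name: str                       # op name, e.g. "linalg.matmul"
    line_no: int                    # 1-based line number in the MLIR file
    result_shape: tuple = ()        # static result dimensions, if parsed
    attrs: dict = field(default_factory=dict)  # any extra parsed attributes


rec = OpRecordSketch(name="linalg.matmul", line_no=42, result_shape=(64, 128))
```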

`SemanticBlock` `dataclass`

A model-level semantic block traced through compilation levels.

`assert_counts(torch_data, linalg_data, global_data)`

Check op counts against known values. Returns list of failures.
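The "return a list of failures" style can be sketched as follows. This is not the module's implementation: the helper name `check_counts`, the dictionary layout, and the counts are assumptions made for illustration.

```python
def check_counts(observed: dict, expected: dict) -> list:
    """Compare observed op counts with expected ones (hypothetical helper).

    Returns one human-readable message per mismatch; an empty list means
    every expected count was matched.
    """
    failures = []
    for op_name, want in sorted(expected.items()):
        got = observed.get(op_name, 0)  # a missing op counts as zero
        if got != want:
            failures.append(f"{op_name}: expected {want}, got {got}")
    return failures


# Illustrative numbers only, not real SmolVLA counts.
observed = {"linalg.matmul": 10, "linalg.generic": 7}
expected = {"linalg.matmul": 10, "linalg.generic": 8}
failures = check_counts(observed, expected)
```

Collecting every mismatch instead of raising on the first one lets a caller report all discrepancies in a single run.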

`build_cross_level_summary(torch_data, linalg_data, global_data)`

Build the cross-level op mapping summary.

`compute_coverage(linalg_data, global_data)`

Compute kernel-level coverage: what % of compute does each kernel cover?

The framing is: "if we implement kernel type X on the NPU, what fraction of total model compute does it cover?" This is not about what is currently lowered; it is about which kernels we need to implement.
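The coverage question reduces to each kernel type's share of the total compute estimate. A minimal sketch of that arithmetic, assuming per-kernel FLOP totals are already available in a dictionary (the helper name `coverage_by_kernel` and the numbers are hypothetical):

```python
def coverage_by_kernel(flops_by_kernel: dict) -> dict:
    """Return each kernel type's share of total FLOPs as a percentage."""
    total = sum(flops_by_kernel.values())
    if total == 0:
        # Avoid dividing by zero when nothing was counted.
        return {kernel: 0.0 for kernel in flops_by_kernel}
    return {kernel: 100.0 * flops / total
            for kernel, flops in flops_by_kernel.items()}


# Illustrative numbers only, not measurements from the model.
cov = coverage_by_kernel({"matmul": 750, "softmax": 200, "elementwise": 50})
```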

`compute_per_layer_decomposition(lines, linalg_data)`

Compute what each PyTorch layer type decomposes into at linalg level.

Finds representative ranges for SigLIP attention, SigLIP MLP, Gemma attention, and Gemma MLP by locating key marker ops.

`detect_composite_patterns(lines)`

Detect composite op patterns in the linalg IR and map to PyTorch ops.

Scans the op sequence for known multi-op patterns that correspond to single PyTorch operations.
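One way to scan an op sequence for multi-op patterns is a greedy left-to-right match against a table of known sequences. The sketch below works under that assumption; the pattern table, the short op names, and the helper `find_composites` are hypothetical and do not reflect the patterns the module actually recognizes.

```python
# Hypothetical pattern table: composite name -> op sequence it expands to.
PATTERNS = {
    "softmax": ["reduce_max", "sub", "exp", "reduce_sum", "div"],
    "rms_norm": ["mul", "reduce_sum", "rsqrt", "mul"],
}


def find_composites(ops, patterns=PATTERNS):
    """Return (pattern_name, start_index) for each non-overlapping match."""
    matches = []
    i = 0
    while i < len(ops):
        for name, seq in patterns.items():
            if ops[i:i + len(seq)] == seq:
                matches.append((name, i))
                i += len(seq)  # skip past the matched ops
                break
        else:
            i += 1  # no pattern starts here; advance by one op
    return matches


ops = ["matmul", "reduce_max", "sub", "exp", "reduce_sum", "div", "matmul"]
found = find_composites(ops)
```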

`estimate_flops_for_op(op)`

Estimate FLOPs for a single operation.
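The cost model used by `estimate_flops_for_op` is not described here. A common convention, shown below as an assumption rather than the module's method, counts a multiply-accumulate as two FLOPs for matrix multiplication and one FLOP per output element for elementwise ops. Both helper names are hypothetical.

```python
from math import prod


def matmul_flops(m: int, k: int, n: int) -> int:
    """FLOPs for an (m x k) @ (k x n) matmul: one multiply and one add per MAC."""
    return 2 * m * k * n


def elementwise_flops(shape: tuple) -> int:
    """FLOPs for a unary or binary elementwise op: one per output element."""
    return prod(shape)


mm = matmul_flops(64, 128, 256)
ew = elementwise_flops((64, 256))
```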

`group_torch_ops_into_blocks(torch_data)`

Group torch-MLIR ops into semantic blocks (attention, MLP, etc).

Strategy: walk ops in order and recognize repeating patterns.
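Walking ops in order and cutting the sequence at a recurring marker is one simple way to recover repeating blocks. The sketch below assumes each block begins with a normalization op and labels a block as attention when it contains a softmax; the marker, the labels, and the helper names are hypothetical.

```python
def split_into_blocks(ops, start_marker):
    """Start a new block each time the marker op appears."""
    blocks = []
    for op in ops:
        if op == start_marker or not blocks:
            blocks.append([])
        blocks[-1].append(op)
    return blocks


def label_block(block):
    """Crude label: a block containing softmax is treated as attention."""
    return "attention" if "softmax" in block else "mlp"


ops = ["norm", "matmul", "softmax", "matmul",
       "norm", "matmul", "gelu", "matmul"]
blocks = split_into_blocks(ops, "norm")
labels = [label_block(b) for b in blocks]
```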

`parse_global_opt(path)`

Parse the global-optimization MLIR file for NPU ISA ops and classify generics.

Separates function body from initializers to show what's been hoisted.

`parse_linalg_input(path)`

Parse the linalg/input MLIR file.
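A line-oriented scan with a regular expression is often enough to tally named ops in textual MLIR. The sketch below is an assumption about the general approach, not the module's parser; `count_linalg_ops` is a hypothetical name, and it counts at most one `linalg.*` op per line.

```python
import re
from collections import Counter

# Matches dialect-qualified op names such as "linalg.matmul".
OP_RE = re.compile(r"\b(linalg\.[A-Za-z_0-9]+)")


def count_linalg_ops(mlir_text: str) -> Counter:
    """Count the first linalg op name found on each line."""
    counts = Counter()
    for line in mlir_text.splitlines():
        match = OP_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """\
%0 = linalg.fill ins(%cst : f32) outs(%e : tensor<4x8xf32>) -> tensor<4x8xf32>
%1 = linalg.matmul ins(%a, %b : tensor<4x16xf32>, tensor<16x8xf32>) outs(%0 : tensor<4x8xf32>) -> tensor<4x8xf32>
%2 = linalg.matmul ins(%c, %d : tensor<4x16xf32>, tensor<16x8xf32>) outs(%1 : tensor<4x8xf32>) -> tensor<4x8xf32>
"""
counts = count_linalg_ops(sample)
```

A regex scan like this would also count terminators such as `linalg.yield` inside `linalg.generic` regions, so a real parser would likely need to filter those out.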

`parse_pytorch_module_tree(pi0_dir)`

Extract the canonical module tree from the Understanding-PI0 README.

Returns an ordered list of semantic block descriptors.

`parse_torch_mlir(path)`

Parse the Torch-MLIR file and extract key ops with shapes.

`write_csv(torch_blocks, linalg_data, output_path)`

Write per-op CSV breakdown.
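The CSV columns are not documented on this page. The sketch below shows one way to emit a per-op breakdown with the standard `csv` module; the column names and the helper `rows_to_csv` are hypothetical, and it writes to a string buffer rather than to `output_path`.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative row only; the real columns may differ.
text = rows_to_csv(
    [{"block": "attention", "op": "linalg.matmul", "count": 4}],
    ["block", "op", "count"],
)
```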
