benchmarks.SaturnNPU.scripts.analyze_npu_graph

Source: benchmarks/SaturnNPU/scripts/analyze_npu_graph.py

Multi-level SmolVLA compute graph decomposition for NPU coverage analysis.

Traces operations through every compilation level:

Python module tree → Torch-MLIR → Linalg/Input → Global-Opt/NPU ISA

Produces a structured JSON manifest and CSV breakdown.

Usage

python tools/analyze_npu_graph.py \
    --understanding-pi0 third_party/Understanding-PI0 \
    --torch-mlir build/compiled_models/smolVLA/.../smolVLA.q.fp8.mlir \
    --linalg-input build/compiled_models/smolVLA/.../phases/module.1.input.mlir \
    --global-opt build/compiled_models/smolVLA/.../phases/module.4.global-optimization.mlir \
    --output-dir benchmarks/SaturnNPU/

OpRecord dataclass

A single MLIR operation extracted from a file.

SemanticBlock dataclass

A model-level semantic block traced through compilation levels.

assert_counts(torch_data, linalg_data, global_data)

Check op counts against known values. Returns a list of failures.
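The check described above can be sketched roughly as follows. This is an illustration only, not the module's actual code: the helper name `check_counts`, the dictionary layout, and the example counts are all assumptions made for the example.

```python
from typing import Dict, List


def check_counts(actual: Dict[str, int], expected: Dict[str, int], level: str) -> List[str]:
    """Compare observed op counts with expected ones; return failure messages."""
    failures = []
    for op_name, want in sorted(expected.items()):
        got = actual.get(op_name, 0)  # an op that never appears counts as zero
        if got != want:
            failures.append(f"{level}: {op_name} expected {want}, got {got}")
    return failures


# Hypothetical counts, for illustration only.
observed = {"linalg.matmul": 10, "linalg.generic": 42}
expected = {"linalg.matmul": 10, "linalg.generic": 40, "linalg.batch_matmul": 2}
failures = check_counts(observed, expected, level="linalg")
for message in failures:
    print(message)
```

Returning a list rather than raising on the first mismatch lets a caller report every discrepancy in one run.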

build_cross_level_summary(torch_data, linalg_data, global_data)

Build the cross-level op mapping summary.

compute_coverage(linalg_data, global_data)

Compute kernel-level coverage: what % of compute does each kernel cover?

The framing is: "if we implement kernel type X on the NPU, what fraction of total model compute does it cover?" The question is which kernels need to be implemented, not which ops are currently lowered.
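One way to picture this framing is as a per-kernel share of total estimated FLOPs. The sketch below assumes each op has already been reduced to a `(kernel_type, flops)` pair; the function name `coverage_by_kernel`, the kernel labels, and the numbers are hypothetical and not taken from the module.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def coverage_by_kernel(ops: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Return each kernel type's share of total FLOPs, as a percentage."""
    flops_per_kernel = defaultdict(int)
    for kernel_type, flops in ops:
        flops_per_kernel[kernel_type] += flops
    total = sum(flops_per_kernel.values())
    if total == 0:
        return {}  # nothing to cover
    return {k: 100.0 * v / total for k, v in flops_per_kernel.items()}


# Hypothetical per-op FLOP estimates, for illustration only.
ops = [("matmul", 600), ("matmul", 200), ("softmax", 150), ("elementwise", 50)]
coverage = coverage_by_kernel(ops)
for kernel, pct in sorted(coverage.items(), key=lambda kv: -kv[1]):
    print(f"{kernel}: {pct:.1f}%")
```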

compute_per_layer_decomposition(lines, linalg_data)

Compute what each PyTorch layer type decomposes into at the linalg level.

Finds representative ranges for SigLIP attention, SigLIP MLP, Gemma attention, and Gemma MLP by locating key marker ops.

detect_composite_patterns(lines)

Detect composite op patterns in the linalg IR and map them to PyTorch ops.

Scans the op sequence for known multi-op patterns that correspond to single PyTorch operations.
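A sequence scan of this kind might look like the sketch below. It assumes the IR lines have already been reduced to a list of short op-kind labels; the pattern table, the labels, and the name `find_composites` are illustrative assumptions rather than the patterns the module actually recognizes.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical pattern table: sequence of lowered op kinds -> PyTorch op.
PATTERNS: Dict[Tuple[str, ...], str] = {
    ("reduce_max", "sub", "exp", "reduce_sum", "div"): "softmax",
    ("mul", "reduce_sum", "rsqrt", "mul"): "rms_norm",
}


def find_composites(op_kinds: Sequence[str]) -> List[Tuple[int, str]]:
    """Return (start_index, pytorch_op) for each non-overlapping match."""
    matches = []
    # Try longer patterns first so a short pattern cannot shadow a longer one.
    ordered = sorted(PATTERNS.items(), key=lambda kv: -len(kv[0]))
    i = 0
    while i < len(op_kinds):
        for pattern, name in ordered:
            if tuple(op_kinds[i:i + len(pattern)]) == pattern:
                matches.append((i, name))
                i += len(pattern)  # skip past the matched ops
                break
        else:
            i += 1  # no pattern starts here; advance by one op
    return matches


ops = ["matmul", "reduce_max", "sub", "exp", "reduce_sum", "div", "matmul"]
found = find_composites(ops)
print(found)
```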

estimate_flops_for_op(op)

Estimate FLOPs for a single operation.
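As a rough illustration of per-op FLOP estimation, the sketch below uses the common convention of 2·M·K·N for a matrix multiply (one multiply and one add per accumulated term) and one FLOP per element for everything else. The function signature, the op-kind strings, and the fallback rule are assumptions for the example and may differ from what the module does.

```python
from math import prod
from typing import Sequence


def estimate_flops(kind: str, lhs_shape: Sequence[int], rhs_shape: Sequence[int] = ()) -> int:
    """Estimate FLOPs from operand shapes under simple counting conventions."""
    if kind == "matmul":
        m, k = lhs_shape
        k2, n = rhs_shape
        if k != k2:
            raise ValueError(f"contraction dims differ: {k} vs {k2}")
        return 2 * m * k * n
    if kind == "batch_matmul":
        b, m, k = lhs_shape
        b2, k2, n = rhs_shape
        if (b, k) != (b2, k2):
            raise ValueError("batch or contraction dims differ")
        return 2 * b * m * k * n
    # Assumed fallback: one FLOP per element of the first operand.
    return prod(lhs_shape)


print(estimate_flops("matmul", (4, 8), (8, 16)))
print(estimate_flops("batch_matmul", (2, 4, 8), (2, 8, 16)))
print(estimate_flops("add", (3, 5)))
```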

group_torch_ops_into_blocks(torch_data)

Group torch-MLIR ops into semantic blocks (attention, MLP, etc.).

Strategy: walk ops in order and recognize repeating patterns.

parse_global_opt(path)

Parse the global-optimization MLIR file for NPU ISA ops and classify generics.

Separates function body from initializers to show what's been hoisted.
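The split between initializers and the function body could be approximated as in the sketch below. It assumes the file marks hoisted code with `util.initializer` regions and the main computation with `util.func` or `func.func`, and that these regions are not nested inside one another. The helper name `split_regions` and the abbreviated sample lines are hypothetical.

```python
import re
from typing import Dict, List

# Assumed top-level region headers in the global-optimization dump.
HEADER = re.compile(r"^\s*(util\.initializer|util\.func|func\.func)\b")


def split_regions(lines: List[str]) -> Dict[str, List[str]]:
    """Assign each line to the most recently opened top-level region."""
    regions: Dict[str, List[str]] = {"initializer": [], "function": []}
    current = None
    for line in lines:
        match = HEADER.match(line)
        if match:
            current = "initializer" if match.group(1) == "util.initializer" else "function"
        if current is not None:
            regions[current].append(line)
    return regions


# Abbreviated, illustrative input; real IR lines carry full operands and types.
sample = [
    "module {",
    "  util.initializer {",
    "    %0 = linalg.generic ...",
    "    util.return",
    "  }",
    "  util.func public @main() {",
    "    %1 = linalg.matmul ...",
    "    util.return",
    "  }",
    "}",
]
regions = split_regions(sample)
print(len(regions["initializer"]), len(regions["function"]))
```

A line-based split like this is deliberately crude: it attributes trailing closing braces to the last region seen, which does not matter when the goal is only to count ops per region.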

parse_linalg_input(path)

Parse the linalg/input MLIR file.

parse_pytorch_module_tree(pi0_dir)

Extract the canonical module tree from the Understanding-PI0 README.

Returns an ordered list of semantic block descriptors.

parse_torch_mlir(path)

Parse the Torch-MLIR file and extract key ops with shapes.
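Extracting an op name and its tensor shapes from one line of Torch-MLIR text can be sketched with two regular expressions. The sketch assumes value-tensor types of the form `!torch.vtensor<[dims],dtype>` and maps dynamic dimensions (`?`) to `-1`. The helper name `extract_op` and the sample line are illustrative, not taken from the module.

```python
import re
from typing import List, Optional, Tuple

OP_RE = re.compile(r"=\s*(torch\.[\w.]+)")
SHAPE_RE = re.compile(r"!torch\.vtensor<\[([^\]]*)\],\s*(\w+)>")

Shape = Tuple[List[int], str]


def extract_op(line: str) -> Optional[Tuple[str, List[Shape]]]:
    """Return (op_name, [(dims, dtype), ...]) or None if the line has no torch op."""
    op = OP_RE.search(line)
    if op is None:
        return None
    shapes: List[Shape] = []
    for dims, dtype in SHAPE_RE.findall(line):
        # Dynamic dimensions are written as '?'; represent them as -1 here.
        shape = [-1 if d.strip() == "?" else int(d) for d in dims.split(",") if d.strip()]
        shapes.append((shape, dtype))
    return op.group(1), shapes


line = (
    "%3 = torch.aten.mm %1, %2 : !torch.vtensor<[1,768],f32>, "
    "!torch.vtensor<[768,512],f32> -> !torch.vtensor<[1,512],f32>"
)
parsed = extract_op(line)
print(parsed)
```

In this layout the operand types come first and the result type last, so the final entry of the shape list corresponds to the op's output.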

write_csv(torch_blocks, linalg_data, output_path)

Write per-op CSV breakdown.
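A per-op CSV of this kind can be produced with the standard library's `csv.DictWriter`. The column names, the row contents, and the helper name `write_op_rows` in the sketch below are assumptions for illustration; the actual columns written by `write_csv` are not documented here.

```python
import csv
import io
from typing import Dict, Iterable, TextIO, Union

# Hypothetical column layout for the per-op breakdown.
FIELDNAMES = ["block", "op", "count", "flops"]


def write_op_rows(rows: Iterable[Dict[str, Union[str, int]]], out: TextIO) -> None:
    """Write one CSV row per op record, preceded by a header row."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Write to an in-memory buffer; a real caller would pass an open file instead.
buf = io.StringIO()
write_op_rows([{"block": "attention", "op": "matmul", "count": 4, "flops": 1024}], buf)
csv_text = buf.getvalue()
print(csv_text)
```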