benchmarks.SaturnNPU.scripts.generate_npu_golden_tests

Source: benchmarks/SaturnNPU/scripts/generate_npu_golden_tests.py

Generate hierarchical golden test data for SmolVLA NPU kernels.

Golden data is generated at two levels:
  1. Layer level (PyTorch): full layer forward pass (e.g., GemmaAttention)
  2. Operator level: each atomic op within the layer (matmul, softmax, etc.)

The operator-level golden data composes to reproduce the layer-level output, so RTL teams can test individual operators and also verify their composition.

Usage

python benchmarks/SaturnNPU/scripts/generate_npu_golden_tests.py benchmarks/SaturnNPU/smolvla_graph_manifest.json --output-dir benchmarks/SaturnNPU/golden_data/ --generate-programs third_party/npu_model/npu_model/configs/programs/

LayerTrace dataclass

Full layer golden data with operator decomposition.

OpTrace dataclass

One operator's golden data within a layer decomposition.
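
A sketch of what the two containers plausibly hold; the field names and types here are assumptions for illustration, not the script's actual definitions:

```python
from dataclasses import dataclass, field

import torch

@dataclass
class OpTrace:
    # Hypothetical fields: one atomic op inside a layer decomposition.
    name: str                                  # e.g. "matmul_qk"
    inputs: list[torch.Tensor] = field(default_factory=list)
    output: torch.Tensor | None = None

@dataclass
class LayerTrace:
    # Hypothetical fields: a full layer plus its operator decomposition.
    layer_name: str                            # e.g. "gemma_attention"
    inputs: list[torch.Tensor] = field(default_factory=list)
    output: torch.Tensor | None = None
    ops: list[OpTrace] = field(default_factory=list)  # in execution order
```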

generate_action_time_mlp(hidden=1024, time_dim=2048, dtype=torch.bfloat16)

Generate golden data for the action time MLP.

Linear(time_dim→hidden) + SiLU + Linear(hidden→hidden).
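
A minimal reference sketch under the documented defaults; the weight/bias names are hypothetical, and whether the linear layers carry biases is an assumption:

```python
import torch

def action_time_mlp_ref(t: torch.Tensor,
                        w1: torch.Tensor, b1: torch.Tensor,
                        w2: torch.Tensor, b2: torch.Tensor) -> torch.Tensor:
    # Linear(time_dim -> hidden), SiLU, Linear(hidden -> hidden).
    h = t @ w1 + b1              # [..., 2048] @ [2048, 1024] -> [..., 1024]
    h = h * torch.sigmoid(h)     # SiLU
    return h @ w2 + b2           # [..., 1024] @ [1024, 1024] -> [..., 1024]
```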

generate_gemma_attention(seq_len=50, hidden=720, num_heads=15, num_kv_heads=5, dtype=torch.bfloat16)

Generate golden data for one Gemma decoder self-attention layer.

generate_gemma_cross_attention(q_seq_len=50, kv_seq_len=241, hidden=720, num_heads=15, num_kv_heads=5, dtype=torch.bfloat16)

Generate golden data for Gemma expert cross-attention.

Unlike self-attention, the query and key/value streams have different sequence lengths: the query comes from the 50 action tokens, the key/value from the 241-token vision+language context.
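
A shape-level sketch under the documented defaults. The grouped-query sharing (each of the 5 KV heads serving 3 of the 15 query heads via repeat-interleave) is an assumption inferred from num_heads and num_kv_heads:

```python
import torch

def cross_attention_shapes():
    q_seq, kv_seq, hidden = 50, 241, 720
    num_heads, num_kv_heads = 15, 5
    head_dim = hidden // num_heads                    # 48
    q = torch.randn(num_heads, q_seq, head_dim)       # from action tokens
    k = torch.randn(num_kv_heads, kv_seq, head_dim)   # from VL context
    v = torch.randn(num_kv_heads, kv_seq, head_dim)
    # GQA: each KV head serves num_heads // num_kv_heads = 3 query heads.
    k = k.repeat_interleave(num_heads // num_kv_heads, dim=0)
    v = v.repeat_interleave(num_heads // num_kv_heads, dim=0)
    scores = q @ k.transpose(-2, -1) / head_dim**0.5  # [15, 50, 241]
    probs = torch.softmax(scores, dim=-1)
    return probs @ v                                  # [15, 50, 48]
```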

generate_gemma_mlp(seq_len=50, hidden=720, intermediate=2048, dtype=torch.bfloat16)

Generate golden data for one Gemma MLP layer (gate + up + GELU + down).
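
A sketch of the gated computation, assuming the Gemma convention of applying GELU to the gate projection and multiplying by the up projection (weight names are hypothetical):

```python
import torch

def gemma_mlp_ref(x, w_gate, w_up, w_down):
    # x: [seq, 720]; w_gate/w_up: [720, 2048]; w_down: [2048, 720]
    gate = torch.nn.functional.gelu(x @ w_gate, approximate="tanh")
    return (gate * (x @ w_up)) @ w_down
```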

generate_siglip_attention(seq_len=1024, hidden=768, num_heads=12, dtype=torch.bfloat16)

Generate golden data for one SigLIP self-attention layer.

generate_siglip_mlp(seq_len=1024, hidden=768, intermediate=3072, dtype=torch.bfloat16)

Generate golden data for one SigLIP MLP layer (fc1 + GELU + fc2).

generate_siglip_patch_embed(image_size=512, patch_size=16, in_channels=3, hidden=768, dtype=torch.bfloat16)

Generate golden data for SigLIP patch embedding.

Conv2d(3→768, kernel=16, stride=16) + position embedding add.
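
A minimal sketch under the documented defaults; the zero-initialized position embedding is a placeholder for what the real model learns:

```python
import torch

def siglip_patch_embed_ref(images: torch.Tensor) -> torch.Tensor:
    # images: [B, 3, 512, 512] -> patch tokens [B, 1024, 768]
    conv = torch.nn.Conv2d(3, 768, kernel_size=16, stride=16)
    num_patches = (512 // 16) ** 2                 # 32 * 32 = 1024
    pos_emb = torch.zeros(1, num_patches, 768)     # learned in the real model
    x = conv(images)                               # [B, 768, 32, 32]
    x = x.flatten(2).transpose(1, 2)               # [B, 1024, 768]
    return x + pos_emb
```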

ref_gelu_tanh(x)

GELU with tanh approximation (matches PyTorch nn.GELU(approximate='tanh')).
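
A minimal equivalent using the standard tanh-approximation constants:

```python
import math

import torch

def ref_gelu_tanh(x: torch.Tensor) -> torch.Tensor:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + torch.tanh(c * (x + 0.044715 * x.pow(3))))
```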

ref_rms_norm(x, weight, eps=1e-06)

RMS normalization: x * rsqrt(mean(x^2) + eps) * weight.
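
A direct transcription of that formula; reducing over the last (hidden) dimension is an assumption, and any float32 upcasting of intermediates is omitted:

```python
import torch

def ref_rms_norm(x: torch.Tensor, weight: torch.Tensor,
                 eps: float = 1e-6) -> torch.Tensor:
    # Normalize over the last (hidden) dimension, then scale per channel.
    return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps) * weight
```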

ref_rope(q, k, cos, sin)

Apply rotary position embedding.
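
A sketch assuming the rotate-half convention common in Hugging Face-style implementations; cos and sin are taken to be broadcastable against the head dimension:

```python
import torch

def _rotate_half(x: torch.Tensor) -> torch.Tensor:
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def ref_rope(q, k, cos, sin):
    # Rotate q and k by position-dependent angles encoded in cos/sin.
    return q * cos + _rotate_half(q) * sin, k * cos + _rotate_half(k) * sin
```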

ref_silu(x)

SiLU / Swish activation.
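
Equivalent one-liner:

```python
import torch

def ref_silu(x: torch.Tensor) -> torch.Tensor:
    # SiLU / Swish: x * sigmoid(x)
    return x * torch.sigmoid(x)
```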

ref_softmax(x, dim=-1)

Softmax decomposed into max-subtract-exp-sum-div.
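
A minimal version in which each step maps onto an atomic NPU op:

```python
import torch

def ref_softmax(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    m = x.max(dim=dim, keepdim=True).values  # max (numerical stability)
    e = torch.exp(x - m)                     # subtract + exp
    return e / e.sum(dim=dim, keepdim=True)  # sum + div
```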

save_layer_trace(trace, output_dir)

Save a LayerTrace to disk in a hierarchical directory structure.

verify_composition(trace)

Verify that operator outputs chain correctly to reproduce layer output.
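
A sketch of the check, assuming the decomposition is recorded in execution order so the last operator's output should match the layer output within a bfloat16-friendly tolerance (the interface shown is hypothetical):

```python
import torch

def verify_composition_sketch(layer_output: torch.Tensor,
                              op_outputs: list[torch.Tensor],
                              atol: float = 1e-2) -> tuple[bool, float]:
    # Hypothetical interface: op_outputs holds each operator's output in
    # execution order, so the last entry should reproduce layer_output.
    final = op_outputs[-1].float()
    ref = layer_output.float()
    max_err = (final - ref).abs().max().item()
    return torch.allclose(final, ref, atol=atol), max_err
```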