Per-layer attention diagnostics
Two measures of attention similarity bias, tracked across the 12 transformer layers.
[Figure: two line plots over layers 0–11 for the B/32, B/16, and B/14 models. Left panel, "Self-value projection in output": cos(Y, V_self) vs. layer, spanning roughly 0.22–0.63. Right panel, "Inter-token value similarity": cos(V_i, V_j) vs. layer, spanning roughly −0.01–0.46.]
Averaged over 8 ImageNet validation images per model. All models are baselines (no XSA) evaluated at their training checkpoints.
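For reference, here is a minimal sketch of how the two diagnostics could be computed from one layer's value vectors V and attention outputs Y, each of shape (tokens, dim). This is an illustrative PyTorch implementation under assumed names and tensor layout, not the paper's code:

```python
import torch
import torch.nn.functional as F

def self_value_projection(Y: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """cos(Y, V_self): mean cosine similarity between each token's
    attention output Y_i and its own value vector V_i."""
    return F.cosine_similarity(Y, V, dim=-1).mean()

def inter_token_value_similarity(V: torch.Tensor) -> torch.Tensor:
    """cos(V_i, V_j): mean pairwise cosine similarity between the
    value vectors of distinct tokens (i != j)."""
    Vn = F.normalize(V, dim=-1)           # unit-norm rows
    sims = Vn @ Vn.T                      # (tokens, tokens) cosine matrix
    n = sims.shape[0]
    off_diag = sims.masked_select(~torch.eye(n, dtype=torch.bool))
    return off_diag.mean()
```

Each metric would be evaluated per layer and averaged over the validation images to produce one curve per model, matching the per-layer curves in the figure.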