Patch value similarity across ViT scales
Cosine similarity between each patch's value vector and the value vector of the target patch (white border). With smaller patches, neighboring patches are more similar to the target.
Color scale: cosine similarity from -1 to +1.
Value similarity computed from pretrained ViT-B checkpoints at 224px
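The similarity map described above can be sketched as follows. This is a minimal illustration, not the code used to produce the figure: the function name `value_similarity_map`, the random stand-in `values` array, the 64-dimensional vectors, and the 16px patch size are all assumptions. In practice `values` would hold the per-patch value vectors extracted from a ViT attention layer, with class and register tokens removed.

```python
import numpy as np

def value_similarity_map(values, target_index, grid_size):
    """Cosine similarity between every patch's value vector and the target's.

    values: array of shape (grid_size * grid_size, dim), one row per patch
            (hypothetical input; extraction from a checkpoint is not shown).
    Returns an array of shape (grid_size, grid_size) with entries in [-1, 1].
    """
    values = np.asarray(values, dtype=np.float64)
    # Normalize each row to unit length; clip to avoid division by zero.
    norms = np.linalg.norm(values, axis=1, keepdims=True)
    unit = values / np.clip(norms, 1e-12, None)
    # Dot products of unit vectors are cosine similarities.
    sims = unit @ unit[target_index]
    return sims.reshape(grid_size, grid_size)

# Stand-in data: a 224px image with an assumed 16px patch size gives a 14x14 grid.
rng = np.random.default_rng(0)
grid_size = 224 // 16
values = rng.normal(size=(grid_size * grid_size, 64))
sim_map = value_similarity_map(values, target_index=0, grid_size=grid_size)
```

The target patch always has similarity +1 with itself, which gives a quick sanity check when plotting the map against the -1 to +1 color scale.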