Updates
- 2025-09-01: [Rank-sum Leaderboard] Segmentation results were updated (small changes due to improved performance of all models on the ocelot dataset only -> Related commit). As a result, the segmentation and global rankings no longer exactly match Table 4 of the current version of our arXiv paper (only a few small differences). The paper will be updated soon.
- 2025-09-30: [SPIDER Leaderboard] Four SPIDER datasets have been integrated into thunder. Their results are not included in the rank-sum leaderboard (only datasets presented in our arXiv paper are aggregated there); instead, we have created a dedicated SPIDER leaderboard below.
- 2025-10-06: [Zero-shot VLM Classification Leaderboard] A new zero-shot classification task was integrated into THUNDER. Results are presented in a dedicated leaderboard below.
🏆 Rank-sum Leaderboard
Model | Domain | Type | KNN ↑ | Lin. prob. ↑ | Few-shot ↑ | Seg. ↑ | Calib. ↓ | Adv. attack ↓ | Rank sum ↓ |
HIBOU-B | Histopathology | VM | 75.8 (10) | 78.0 (14) | 74.2 (6) | 67.8 (10) | 3.7 (2) | 52.8 (14) | 56 (7) |
HIBOU-L | Histopathology | VM | 75.2 (12) | 81.2 (7) | 70.4 (12) | 68.6 (6) | 5.5 (18) | 40.0 (5) | 60 (8) |
H-OPTIMUS-0 | Histopathology | VM | 79.2 (5) | 81.4 (5) | 73.4 (7) | 65.2 (13) | 4.7 (13) | 44.2 (9) | 52 (6) |
H-OPTIMUS-1 | Histopathology | VM | 80.5 (3) | 83.3 (2) | 74.8 (4) | 64.5 (15) | 4.1 (4) | 58.0 (17) | 45 (5) |
MIDNIGHT | Histopathology | VM | 78.2 (8) | 82.9 (3) | 70.6 (11) | 68.8 (4) | 3.2 (1) | 36.3 (4) | 31 (3) |
PHIKON | Histopathology | VM | 72.8 (14) | 78.4 (13) | 72.2 (10) | 68.0 (9) | 6.4 (22) | 34.4 (3) | 71 (11) |
PHIKON2 | Histopathology | VM | 70.1 (15) | 76.5 (15) | 70.1 (13) | 67.4 (12) | 4.6 (11) | 45.6 (11) | 77 (12) |
UNI | Histopathology | VM | 78.8 (6) | 81.3 (6) | 76.4 (2) | 67.8 (11) | 4.3 (7) | 42.8 (7) | 39 (4) |
UNI2-H | Histopathology | VM | 81.7 (1) | 83.9 (1) | 78.4 (1) | 69.0 (3) | 4.5 (8) | 34.3 (2) | 16 (1) |
VIRCHOW | Histopathology | VM | 74.2 (13) | 80.2 (10) | 68.5 (15) | 69.2 (2) | 5.5 (20) | 41.0 (6) | 66 (10) |
VIRCHOW2 | Histopathology | VM | 81.2 (2) | 82.7 (4) | 72.6 (9) | 69.3 (1) | 4.6 (10) | 33.6 (1) | 27 (2) |
CONCH | Histopathology | VLM | 77.3 (9) | 80.2 (11) | 73.1 (8) | 68.3 (7) | 4.3 (6) | 55.0 (15) | 56 (7) |
CONCH 1.5 | Histopathology | VLM | 78.6 (7) | 80.8 (9) | 74.6 (5) | 68.8 (5) | 4.9 (14) | 75.3 (23) | 63 (9) |
KEEP | Histopathology | VLM | 79.7 (4) | 81.1 (8) | 75.8 (3) | 68.0 (8) | 4.7 (12) | 44.7 (10) | 45 (5) |
MUSK | Histopathology | VLM | 75.6 (11) | 79.0 (12) | 70.0 (14) | 65.1 (14) | 4.5 (9) | 69.3 (22) | 82 (13) |
PLIP | Histopathology | VLM | 67.8 (19) | 71.0 (22) | 63.4 (17) | 58.5 (22) | 4.9 (15) | 56.9 (16) | 111 (18) |
QUILTNET | Histopathology | VLM | 68.3 (17) | 71.0 (21) | 65.7 (16) | 58.9 (21) | 7.0 (23) | 52.7 (13) | 111 (18) |
DINOv2-B | Natural-image | VM | 67.9 (18) | 74.8 (17) | 61.0 (18) | 59.8 (19) | 5.5 (21) | 65.8 (20) | 113 (19) |
DINOv2-L | Natural-image | VM | 69.6 (16) | 75.3 (16) | 59.2 (19) | 59.6 (20) | 5.3 (17) | 64.5 (19) | 107 (17) |
ViT-B/16 | Natural-image | VM | 64.4 (21) | 71.9 (19) | 57.8 (21) | 61.0 (17) | 3.9 (3) | 46.8 (12) | 93 (14) |
ViT-L/16 | Natural-image | VM | 67.5 (20) | 72.8 (18) | 56.5 (22) | 63.1 (16) | 5.0 (16) | 44.1 (8) | 100 (15) |
CLIP-B/32 | Natural-image | VLM | 61.9 (23) | 65.8 (23) | 53.3 (23) | 56.0 (23) | 5.5 (19) | 60.4 (18) | 129 (23) |
CLIP-L/14 | Natural-image | VLM | 64.2 (22) | 71.3 (20) | 58.2 (20) | 60.8 (18) | 4.2 (5) | 67.8 (21) | 106 (16) |
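The "Rank sum" column is consistent with simply adding the six per-task ranks shown in parentheses in each row. The following is a minimal sketch of that aggregation, using ranks copied from a few rows of the table; the names `task_ranks` and `rank_sum` are illustrative and are not taken from the THUNDER codebase.

```python
# Per-task ranks copied from the table above, in column order:
# KNN, Lin. prob., Few-shot, Seg., Calib., Adv. attack.
task_ranks = {
    "UNI2-H": [1, 1, 1, 3, 8, 2],
    "VIRCHOW2": [2, 4, 9, 1, 10, 1],
    "MIDNIGHT": [8, 3, 11, 4, 1, 4],
    "CLIP-B/32": [23, 23, 23, 23, 19, 18],
}


def rank_sum(ranks):
    """Sum the per-task ranks of one model (lower is better)."""
    return sum(ranks)


rank_sums = {model: rank_sum(ranks) for model, ranks in task_ranks.items()}

# Order models from best (lowest rank sum) to worst.
ordered = sorted(rank_sums, key=rank_sums.get)

for model in ordered:
    print(f"{model}: {rank_sums[model]}")
```

Models with equal rank sums share a final rank in the table (for example, HIBOU-B and CONCH both have a rank sum of 56).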
🏆 SPIDER Leaderboard
F1-score on the test sets of the SPIDER datasets, and the average across datasets, for the KNN and linear probing tasks. The considered datasets are spider_breast (Br), spider_colorectal (Co), spider_skin (Sk), and spider_thorax (Th).
Model | Domain | Type | KNN Br ↑ | KNN Co ↑ | KNN Sk ↑ | KNN Th ↑ | KNN Avg ↑ | Linear probing Br ↑ | Linear probing Co ↑ | Linear probing Sk ↑ | Linear probing Th ↑ | Linear probing Avg ↑ |
HIBOU-B | Histopathology | VM | 83.3 (2) | 88.1 (1) | 87.7 (7) | 93.4 (3) | 88.1 (5) | 86.6 (5) | 90.7 (2) | 91.1 (8) | 94.5 (4) | 90.7 (5) |
HIBOU-L | Histopathology | VM | 83.6 (1) | 88.1 (2) | 90.7 (2) | 93.5 (2) | 89.0 (1) | 88.0 (1) | 89.8 (8) | 93.3 (1) | 94.1 (7) | 91.3 (1) |
H-OPTIMUS-0 | Histopathology | VM | 81.7 (6) | 87.8 (7) | 89.3 (4) | 93.8 (1) | 88.2 (4) | 87.2 (3) | 89.9 (7) | 91.9 (5) | 94.4 (6) | 90.8 (4) |
H-OPTIMUS-1 | Histopathology | VM | 83.0 (3) | 87.8 (5) | 91.1 (1) | 91.5 (9) | 88.4 (2) | 86.1 (8) | 90.3 (5) | 92.3 (3) | 93.6 (11) | 90.6 (6) |
MIDNIGHT | Histopathology | VM | 77.1 (13) | 84.9 (13) | 85.7 (10) | 92.7 (5) | 85.1 (11) | 86.1 (7) | 89.6 (11) | 91.0 (9) | 94.4 (5) | 90.3 (9) |
PHIKON | Histopathology | VM | 78.9 (11) | 85.1 (12) | 83.2 (13) | 89.7 (15) | 84.3 (12) | 84.9 (12) | 88.5 (12) | 87.9 (11) | 92.4 (13) | 88.4 (12) |
PHIKON2 | Histopathology | VM | 80.2 (8) | 86.5 (10) | 83.3 (11) | 91.4 (11) | 85.3 (10) | 86.0 (9) | 89.7 (10) | 87.2 (14) | 94.7 (3) | 89.4 (11) |
UNI | Histopathology | VM | 81.3 (7) | 88.0 (4) | 87.2 (8) | 91.2 (12) | 86.9 (8) | 85.7 (10) | 90.4 (4) | 91.2 (7) | 93.9 (8) | 90.3 (8) |
UNI2-H | Histopathology | VM | 82.6 (4) | 87.1 (9) | 90.5 (3) | 92.5 (7) | 88.2 (3) | 86.7 (4) | 90.5 (3) | 92.5 (2) | 95.1 (1) | 91.2 (2) |
VIRCHOW | Histopathology | VM | 79.3 (10) | 87.8 (6) | 88.8 (6) | 92.3 (8) | 87.0 (7) | 86.2 (6) | 90.2 (6) | 91.3 (6) | 94.7 (2) | 90.6 (7) |
VIRCHOW2 | Histopathology | VM | 82.3 (5) | 88.0 (3) | 89.1 (5) | 92.6 (6) | 88.0 (6) | 87.2 (2) | 90.8 (1) | 92.0 (4) | 93.9 (9) | 91.0 (3) |
CONCH | Histopathology | VLM | 75.1 (15) | 84.5 (14) | 81.7 (15) | 91.1 (14) | 83.1 (15) | 82.1 (13) | 87.9 (13) | 87.3 (13) | 91.0 (15) | 87.1 (14) |
CONCH 1.5 | Histopathology | VLM | 75.9 (14) | 84.2 (15) | 83.3 (12) | 91.4 (10) | 83.7 (14) | 81.6 (14) | 87.4 (15) | 87.0 (15) | 92.1 (14) | 87.0 (15) |
KEEP | Histopathology | VLM | 79.8 (9) | 87.2 (8) | 87.2 (9) | 93.1 (4) | 86.9 (9) | 85.6 (11) | 89.7 (9) | 89.3 (10) | 93.8 (10) | 89.6 (10) |
MUSK | Histopathology | VLM | 77.2 (12) | 85.7 (11) | 82.5 (14) | 91.1 (13) | 84.1 (13) | 80.6 (15) | 87.9 (14) | 87.6 (12) | 93.3 (12) | 87.4 (13) |
PLIP | Histopathology | VLM | 69.4 (17) | 79.9 (16) | 74.4 (16) | 86.4 (16) | 77.5 (16) | 77.1 (18) | 84.7 (19) | 82.1 (17) | 88.6 (16) | 83.1 (17) |
QUILTNET | Histopathology | VLM | 69.9 (16) | 77.7 (19) | 73.4 (17) | 85.3 (17) | 76.6 (17) | 77.0 (19) | 82.9 (21) | 81.2 (20) | 88.5 (18) | 82.4 (19) |
DINOv2-B | Natural-image | VM | 64.0 (20) | 77.5 (20) | 70.4 (20) | 78.1 (20) | 72.5 (20) | 76.0 (20) | 83.9 (20) | 80.1 (21) | 87.6 (21) | 81.9 (21) |
DINOv2-L | Natural-image | VM | 66.1 (19) | 79.0 (17) | 71.4 (19) | 78.5 (19) | 73.7 (19) | 74.0 (21) | 85.3 (16) | 82.1 (16) | 87.7 (20) | 82.3 (20) |
ViT-B/16 | Natural-image | VM | 63.6 (21) | 76.7 (21) | 68.7 (21) | 77.0 (22) | 71.5 (21) | 78.2 (17) | 84.7 (18) | 81.2 (19) | 87.9 (19) | 83.0 (18) |
ViT-L/16 | Natural-image | VM | 66.5 (18) | 78.7 (18) | 71.8 (18) | 81.8 (18) | 74.7 (18) | 79.3 (16) | 85.1 (17) | 81.3 (18) | 88.5 (17) | 83.6 (16) |
CLIP-B/32 | Natural-image | VLM | 57.0 (23) | 71.0 (23) | 63.7 (23) | 73.7 (23) | 66.4 (23) | 69.0 (23) | 81.3 (23) | 75.8 (23) | 84.7 (23) | 77.7 (23) |
CLIP-L/14 | Natural-image | VLM | 62.5 (22) | 74.6 (22) | 66.5 (22) | 77.4 (21) | 70.2 (22) | 73.6 (22) | 82.8 (22) | 78.5 (22) | 86.7 (22) | 80.4 (22) |
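For readers who want to post-process these rows, the sketch below parses one pipe-delimited row into `(score, rank)` pairs and recomputes the KNN average from the four per-dataset scores. The helper `parse_row` and the regular expression are illustrative assumptions, not part of THUNDER. Because the table shows rounded scores, the recomputed average is only expected to match the reported one up to rounding.

```python
import re

# Matches one "score (rank)" cell, e.g. "83.3 (2)".
CELL = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*\((\d+)\)\s*$")


def parse_row(line):
    """Split a leaderboard row into (model, domain, type, [(score, rank), ...])."""
    cells = [c.strip() for c in line.strip().rstrip("|").split("|")]
    model, domain, model_type, *metric_cells = cells
    parsed = []
    for cell in metric_cells:
        match = CELL.match(cell)
        if match is None:
            raise ValueError(f"unexpected cell: {cell!r}")
        parsed.append((float(match.group(1)), int(match.group(2))))
    return model, domain, model_type, parsed


row = (
    "HIBOU-B | Histopathology | VM | 83.3 (2) | 88.1 (1) | 87.7 (7) | "
    "93.4 (3) | 88.1 (5) | 86.6 (5) | 90.7 (2) | 91.1 (8) | 94.5 (4) | 90.7 (5) |"
)
model, domain, model_type, metrics = parse_row(row)

# The first five metric cells are KNN (Br, Co, Sk, Th, Avg).
knn_scores = [score for score, _ in metrics[:4]]
knn_avg = sum(knn_scores) / len(knn_scores)
reported_knn_avg = metrics[4][0]

print(model, round(knn_avg, 1), reported_knn_avg)
```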
🏆 Zero-shot VLM Classification Leaderboard
F1-score on test sets of all supported datasets and average across datasets for the zero-shot classification task. Only VLM models with publicly released patch-level text encoders are included.
Model | Domain | Type | bach | bracs | break_his | ccrcc | crc | esca | mhist | patch_camelyon | tcga_crc_msi | tcga_tils | tcga_uniform | wilds | spider_breast | spider_colorectal | spider_skin | spider_thorax | Avg |
CONCH | Histopathology | VLM | 56.1 (3) | 37.9 (2) | 53.6 (1) | 56.9 (2) | 51.8 (4) | 40.1 (1) | 60.8 (2) | 57.8 (3) | 21.6 (4) | 47.4 (5) | 37.9 (2) | 83.2 (2) | 30.7 (3) | 31.4 (3) | 35.1 (3) | 43.0 (3) | 46.6 (3) |
KEEP | Histopathology | VLM | 63.4 (1) | 34.2 (3) | 45.0 (3) | 69.1 (1) | 80.6 (1) | 33.3 (2) | 41.3 (7) | 71.4 (1) | 15.5 (6) | 55.5 (2) | 44.9 (1) | 89.4 (1) | 37.7 (1) | 44.4 (1) | 60.7 (1) | 51.8 (2) | 52.4 (1) |
MUSK | Histopathology | VLM | 62.5 (2) | 38.6 (1) | 52.6 (2) | 50.7 (3) | 57.9 (3) | 25.9 (4) | 63.8 (1) | 53.5 (5) | 22.7 (3) | 50.1 (3) | 32.4 (3) | 71.2 (3) | 36.3 (2) | 36.2 (2) | 48.5 (2) | 55.2 (1) | 47.4 (2) |
PLIP | Histopathology | VLM | 42.7 (4) | 25.9 (5) | 25.6 (5) | 38.2 (5) | 61.4 (2) | 31.5 (3) | 53.9 (5) | 46.1 (7) | 16.3 (5) | 64.1 (1) | 10.3 (5) | 51.0 (6) | 14.6 (4) | 28.3 (4) | 23.5 (4) | 22.6 (4) | 34.7 (4) |
QUILTNET | Histopathology | VLM | 30.3 (5) | 29.1 (4) | 37.5 (4) | 24.2 (6) | 44.2 (5) | 14.2 (5) | 57.1 (4) | 65.8 (2) | 50.4 (1) | 47.6 (4) | 12.3 (4) | 44.3 (7) | 13.9 (5) | 25.0 (5) | 18.5 (5) | 19.7 (5) | 33.4 (5) |
CLIP-B/32 | Natural-image | VLM | 13.2 (7) | 7.5 (7) | 18.7 (6) | 21.8 (7) | 24.4 (7) | 9.8 (7) | 42.2 (6) | 48.1 (6) | 13.9 (7) | 21.6 (7) | 2.0 (7) | 56.8 (5) | 3.9 (7) | 6.1 (7) | 4.3 (7) | 5.5 (6) | 18.7 (7) |
CLIP-L/14 | Natural-image | VLM | 27.1 (6) | 19.2 (6) | 5.8 (7) | 40.6 (4) | 41.1 (6) | 10.4 (6) | 58.0 (3) | 55.6 (4) | 49.4 (2) | 25.4 (6) | 7.4 (6) | 70.2 (4) | 7.3 (6) | 16.0 (6) | 6.6 (6) | 5.4 (7) | 27.8 (6) |
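This page does not describe how the zero-shot predictions are produced. As general background only, a common CLIP-style approach compares an image embedding with one text embedding per class and picks the most similar class. The sketch below illustrates that idea with made-up 3-dimensional embeddings and hypothetical class names; it is an assumption about the general technique, not THUNDER's actual implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_predict(image_embedding, class_text_embeddings):
    """Return the class whose text embedding is most similar to the image."""
    img = l2_normalize(image_embedding)
    scores = {}
    for label, text_embedding in class_text_embeddings.items():
        txt = l2_normalize(text_embedding)
        scores[label] = sum(a * b for a, b in zip(img, txt))
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy embeddings invented for illustration; a real VLM would produce these
# with its image encoder and text encoder (e.g. from class-name prompts).
class_text_embeddings = {
    "tumor": [0.9, 0.1, 0.0],
    "normal": [0.1, 0.9, 0.1],
}
image_embedding = [0.8, 0.3, 0.1]

label, scores = zero_shot_predict(image_embedding, class_text_embeddings)
print(label)
```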