Leaderboards

Updates

  • 2025-11-14: [SPIDER Leaderboard] DINOv3-S, DINOv3-B, DINOv3-L, GIGAPATH, KAIKO-S/16, KAIKO-B/16 added to the SPIDER leaderboard.
  • 2025-11-14: [Up-to-date Rank-sum Leaderboard] GIGAPATH, KAIKO-S/16, KAIKO-B/16 added to the updated rank-sum leaderboard.
  • 2025-10-30: [Up-to-date Rank-sum Leaderboard] A new up-to-date rank-sum leaderboard was added. This is an update of the rank-sum leaderboard (Table 4) in our paper with new datasets and models, e.g. here we added results for SPIDER datasets and DINOv3 model variants.
  • 2025-10-27: [Original (paper) Rank-sum Leaderboard] Our arXiv paper was updated and now the rank-sum table in it (Table 4) matches exactly our Original (paper) Rank-sum Leaderboard.
  • 2025-10-06: [Zero-shot VLM Classification Leaderboard] A new zero-shot classification task was integrated into THUNDER. Results are presented in a dedicated leaderboard below.
  • 2025-09-30: [SPIDER Leaderboard] Four SPIDER datasets have been integrated into THUNDER. We have created a leaderboard dedicated to SPIDER datasets below.
  • 2025-09-01: [Original (paper) Rank-sum Leaderboard] Segmentation results were updated (small changes due to improved performance for all models on the ocelot dataset only -> Related commit). Segmentation and global rankings thus do not exactly match Table 4 from the current version of our paper (a few small differences only). The paper will be updated soon. [This was now updated on 2025-10-30]

🏆 Up-to-date Rank-sum Leaderboard

This leaderboard is an updated version of the rank-sum table (Table 4) in our original paper.

The following was added (compared with paper results):

  • DINOv3 model variants (DINOv3-S, DINOv3-B, DINOv3-L).
  • Updated average performance for all models (including DINOv3 models) and all tasks except segmentation with additional results on SPIDER datasets (results from other datasets used to compute the average performance stay the same).
  • GIGAPATH.
  • KAIKO-S/16 and KAIKO-B/16.

| Model | Domain | Type | KNN ↑ | Lin. prob. ↑ | Few-shot ↑ | Seg. ↑ | Calib. ↓ | Adv. attack ↓ | Rank sum ↓ |
|---|---|---|---|---|---|---|---|---|---|
| HIBOU-B | Histopathology | VM | 78.9 (10) | 81.2 (15) | 76.3 (6) | 67.8 (11) | 3.2 (4) | 52.7 (16) | 62 (9) |
| HIBOU-L | Histopathology | VM | 78.6 (13) | 83.7 (6) | 73.8 (12) | 68.6 (6) | 4.7 (23) | 39.5 (7) | 67 (10) |
| H-OPTIMUS-0 | Histopathology | VM | 81.4 (5) | 83.8 (5) | 76.2 (7) | 65.2 (15) | 4.0 (16) | 43.9 (12) | 60 (8) |
| H-OPTIMUS-1 | Histopathology | VM | 82.5 (3) | 85.1 (2) | 77.3 (3) | 64.5 (17) | 3.5 (6) | 57.4 (19) | 50 (5) |
| MIDNIGHT | Histopathology | VM | 79.9 (7) | 84.7 (4) | 71.5 (18) | 68.8 (5) | 2.9 (2) | 37.0 (4) | 40 (3) |
| PHIKON | Histopathology | VM | 75.7 (17) | 80.9 (17) | 73.6 (13) | 68.0 (9) | 5.8 (28) | 33.5 (3) | 87 (17) |
| PHIKON2 | Histopathology | VM | 73.9 (18) | 79.7 (18) | 71.8 (16) | 67.4 (12) | 3.9 (10) | 43.8 (11) | 85 (16) |
| UNI | Histopathology | VM | 80.8 (6) | 83.5 (7) | 78.1 (2) | 67.8 (10) | 3.8 (8) | 40.3 (8) | 41 (4) |
| UNI2-H | Histopathology | VM | 83.3 (1) | 85.7 (1) | 79.8 (1) | 69.0 (3) | 3.9 (12) | 31.7 (2) | 20 (1) |
| VIRCHOW | Histopathology | VM | 77.4 (16) | 82.8 (10) | 71.8 (17) | 69.2 (2) | 4.5 (19) | 38.3 (5) | 69 (11) |
| VIRCHOW2 | Histopathology | VM | 82.9 (2) | 84.8 (3) | 73.9 (11) | 69.3 (1) | 3.9 (11) | 31.1 (1) | 29 (2) |
| CONCH | Histopathology | VLM | 78.8 (11) | 81.9 (12) | 73.4 (14) | 68.3 (7) | 4.1 (17) | 57.3 (18) | 79 (13) |
| CONCH 1.5 | Histopathology | VLM | 79.9 (8) | 82.4 (11) | 75.0 (10) | 68.8 (4) | 4.6 (21) | 75.8 (29) | 83 (15) |
| KEEP | Histopathology | VLM | 81.5 (4) | 83.2 (8) | 77.1 (4) | 68.0 (8) | 4.0 (14) | 44.9 (14) | 52 (6) |
| MUSK | Histopathology | VLM | 77.7 (15) | 81.1 (16) | 71.9 (15) | 65.1 (16) | 4.0 (13) | 71.9 (28) | 103 (18) |
| PLIP | Histopathology | VLM | 70.2 (24) | 74.0 (26) | 64.1 (22) | 58.5 (28) | 4.5 (20) | 60.8 (20) | 140 (24) |
| QUILTNET | Histopathology | VLM | 70.4 (23) | 73.9 (27) | 65.6 (20) | 58.9 (27) | 5.8 (29) | 55.6 (17) | 143 (26) |
| DINOv2-B | Natural-image | VM | 69.0 (26) | 76.6 (23) | 60.2 (24) | 59.8 (25) | 5.0 (27) | 66.1 (23) | 148 (27) |
| DINOv2-L | Natural-image | VM | 70.6 (22) | 77.0 (22) | 58.7 (25) | 59.6 (26) | 4.9 (25) | 64.4 (22) | 142 (25) |
| ViT-B/16 | Natural-image | VM | 66.2 (27) | 74.7 (25) | 57.3 (27) | 61.0 (23) | 4.0 (15) | 47.0 (15) | 132 (23) |
| ViT-L/16 | Natural-image | VM | 69.3 (25) | 75.5 (24) | 56.9 (28) | 63.1 (20) | 4.3 (18) | 43.9 (13) | 128 (22) |
| CLIP-B/32 | Natural-image | VLM | 63.0 (29) | 68.8 (29) | 52.3 (29) | 56.0 (29) | 4.9 (24) | 63.6 (21) | 161 (28) |
| CLIP-L/14 | Natural-image | VLM | 65.7 (28) | 73.6 (28) | 58.1 (26) | 60.8 (24) | 3.8 (9) | 70.5 (27) | 142 (25) |
| DINOv3-B | Natural-image | VM | 70.6 (21) | 77.4 (20) | 64.8 (21) | 63.4 (19) | 3.7 (7) | 70.1 (26) | 114 (21) |
| DINOv3-S | Natural-image | VM | 72.0 (19) | 77.1 (21) | 65.7 (19) | 62.0 (22) | 2.8 (1) | 66.9 (24) | 106 (19) |
| DINOv3-L | Natural-image | VM | 71.5 (20) | 77.9 (19) | 61.8 (23) | 62.6 (21) | 3.0 (3) | 68.9 (25) | 111 (20) |
| GIGAPATH | Histopathology | VM | 79.5 (9) | 82.9 (9) | 75.5 (8) | 63.5 (18) | 3.4 (5) | 42.1 (9) | 58 (7) |
| KAIKO-S/16 | Histopathology | VM | 78.2 (14) | 81.7 (13) | 75.1 (9) | 66.8 (14) | 4.6 (22) | 42.5 (10) | 82 (14) |
| KAIKO-B/16 | Histopathology | VM | 78.7 (12) | 81.4 (14) | 76.4 (5) | 66.8 (13) | 5.0 (26) | 38.8 (6) | 76 (12) |
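
The "Rank sum" column can be checked by hand: for each model, it equals the sum of the six per-task ranks shown in parentheses. The sketch below recomputes it for a few rows copied from the table above. The helper names (`rank_sums`, `dense_rank`) are illustrative and are not part of THUNDER. The tie handling (tied rank sums share a rank and the next distinct value gets the next rank) is an assumption inferred from the table, where two models with a rank sum of 142 share rank 25 and 143 follows with rank 26.

```python
# Per-task ranks copied from the table above, in column order:
# KNN, Lin. prob., Few-shot, Seg., Calib., Adv. attack.
task_ranks = {
    "UNI2-H": [1, 1, 1, 3, 12, 2],
    "VIRCHOW2": [2, 3, 11, 1, 11, 1],
    "HIBOU-B": [10, 15, 6, 11, 4, 16],
    "DINOv2-L": [22, 22, 25, 26, 25, 22],
    "CLIP-L/14": [28, 28, 26, 24, 9, 27],
}


def rank_sums(ranks_by_model):
    """Sum the per-task ranks of each model (lower is better)."""
    return {model: sum(ranks) for model, ranks in ranks_by_model.items()}


def dense_rank(values_by_model):
    """Rank values in ascending order; ties share a rank (assumed tie rule)."""
    distinct = sorted(set(values_by_model.values()))
    position = {value: i + 1 for i, value in enumerate(distinct)}
    return {model: position[value] for model, value in values_by_model.items()}


sums = rank_sums(task_ranks)
overall = dense_rank(sums)  # rank within this 5-model subset only

for model in sorted(sums, key=sums.get):
    print(f"{model:10s} rank sum = {sums[model]:3d}, subset rank = {overall[model]}")
```

The rank sums match the table (20, 29, 62, 142, 142). The subset ranks differ from the parenthesised ranks in the table because only five of the 29 models are included here.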

🏆 Original (paper) Rank-sum Leaderboard

This leaderboard exactly reproduces the rank-sum table (Table 4) presented in our original paper.

| Model | Domain | Type | KNN ↑ | Lin. prob. ↑ | Few-shot ↑ | Seg. ↑ | Calib. ↓ | Adv. attack ↓ | Rank sum ↓ |
|---|---|---|---|---|---|---|---|---|---|
| HIBOU-B | Histopathology | VM | 75.8 (10) | 78.0 (14) | 74.2 (6) | 67.8 (10) | 3.7 (2) | 52.8 (14) | 56 (7) |
| HIBOU-L | Histopathology | VM | 75.2 (12) | 81.2 (7) | 70.4 (12) | 68.6 (6) | 5.5 (18) | 40.0 (5) | 60 (8) |
| H-OPTIMUS-0 | Histopathology | VM | 79.2 (5) | 81.4 (5) | 73.4 (7) | 65.2 (13) | 4.7 (13) | 44.2 (9) | 52 (6) |
| H-OPTIMUS-1 | Histopathology | VM | 80.5 (3) | 83.3 (2) | 74.8 (4) | 64.5 (15) | 4.1 (4) | 58.0 (17) | 45 (5) |
| MIDNIGHT | Histopathology | VM | 78.2 (8) | 82.9 (3) | 70.6 (11) | 68.8 (4) | 3.2 (1) | 36.3 (4) | 31 (3) |
| PHIKON | Histopathology | VM | 72.8 (14) | 78.4 (13) | 72.2 (10) | 68.0 (9) | 6.4 (22) | 34.4 (3) | 71 (11) |
| PHIKON2 | Histopathology | VM | 70.1 (15) | 76.5 (15) | 70.1 (13) | 67.4 (12) | 4.6 (11) | 45.6 (11) | 77 (12) |
| UNI | Histopathology | VM | 78.8 (6) | 81.3 (6) | 76.4 (2) | 67.8 (11) | 4.3 (7) | 42.8 (7) | 39 (4) |
| UNI2-H | Histopathology | VM | 81.7 (1) | 83.9 (1) | 78.4 (1) | 69.0 (3) | 4.5 (8) | 34.3 (2) | 16 (1) |
| VIRCHOW | Histopathology | VM | 74.2 (13) | 80.2 (10) | 68.5 (15) | 69.2 (2) | 5.5 (20) | 41.0 (6) | 66 (10) |
| VIRCHOW2 | Histopathology | VM | 81.2 (2) | 82.7 (4) | 72.6 (9) | 69.3 (1) | 4.6 (10) | 33.6 (1) | 27 (2) |
| CONCH | Histopathology | VLM | 77.3 (9) | 80.2 (11) | 73.1 (8) | 68.3 (7) | 4.3 (6) | 55.0 (15) | 56 (7) |
| CONCH 1.5 | Histopathology | VLM | 78.6 (7) | 80.8 (9) | 74.6 (5) | 68.8 (5) | 4.9 (14) | 75.3 (23) | 63 (9) |
| KEEP | Histopathology | VLM | 79.7 (4) | 81.1 (8) | 75.8 (3) | 68.0 (8) | 4.7 (12) | 44.7 (10) | 45 (5) |
| MUSK | Histopathology | VLM | 75.6 (11) | 79.0 (12) | 70.0 (14) | 65.1 (14) | 4.5 (9) | 69.3 (22) | 82 (13) |
| PLIP | Histopathology | VLM | 67.8 (19) | 71.0 (22) | 63.4 (17) | 58.5 (22) | 4.9 (15) | 56.9 (16) | 111 (18) |
| QUILTNET | Histopathology | VLM | 68.3 (17) | 71.0 (21) | 65.7 (16) | 58.9 (21) | 7.0 (23) | 52.7 (13) | 111 (18) |
| DINOv2-B | Natural-image | VM | 67.9 (18) | 74.8 (17) | 61.0 (18) | 59.8 (19) | 5.5 (21) | 65.8 (20) | 113 (19) |
| DINOv2-L | Natural-image | VM | 69.6 (16) | 75.3 (16) | 59.2 (19) | 59.6 (20) | 5.3 (17) | 64.5 (19) | 107 (17) |
| ViT-B/16 | Natural-image | VM | 64.4 (21) | 71.9 (19) | 57.8 (21) | 61.0 (17) | 3.9 (3) | 46.8 (12) | 93 (14) |
| ViT-L/16 | Natural-image | VM | 67.5 (20) | 72.8 (18) | 56.5 (22) | 63.1 (16) | 5.0 (16) | 44.1 (8) | 100 (15) |
| CLIP-B/32 | Natural-image | VLM | 61.9 (23) | 65.8 (23) | 53.3 (23) | 56.0 (23) | 5.5 (19) | 60.4 (18) | 129 (23) |
| CLIP-L/14 | Natural-image | VLM | 64.2 (22) | 71.3 (20) | 58.2 (20) | 60.8 (18) | 4.2 (5) | 67.8 (21) | 106 (16) |

🏆 SPIDER Leaderboard

F1-score on test sets of SPIDER datasets and average across datasets for the KNN and linear probing tasks. The considered datasets are spider_breast (Br), spider_colorectal (Co), spider_skin (Sk) and spider_thorax (Th).

| Model | Domain | Type | KNN Br ↑ | KNN Co ↑ | KNN Sk ↑ | KNN Th ↑ | KNN Avg ↑ | Linear probing Br ↑ | Linear probing Co ↑ | Linear probing Sk ↑ | Linear probing Th ↑ | Linear probing Avg ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HIBOU-B | Histopathology | VM | 83.3 (2) | 88.1 (1) | 87.7 (7) | 93.4 (3) | 88.1 (5) | 86.6 (5) | 90.7 (2) | 91.1 (9) | 94.5 (4) | 90.7 (5) |
| HIBOU-L | Histopathology | VM | 83.6 (1) | 88.1 (2) | 90.7 (2) | 93.5 (2) | 89.0 (1) | 88.0 (1) | 89.8 (9) | 93.3 (1) | 94.1 (7) | 91.3 (1) |
| H-OPTIMUS-0 | Histopathology | VM | 81.7 (6) | 87.8 (5) | 89.3 (4) | 93.8 (1) | 88.2 (4) | 87.2 (3) | 89.9 (8) | 91.9 (5) | 94.4 (6) | 90.8 (4) |
| H-OPTIMUS-1 | Histopathology | VM | 83.0 (3) | 87.8 (6) | 91.1 (1) | 91.5 (12) | 88.4 (2) | 86.1 (8) | 90.3 (5) | 92.3 (3) | 93.6 (12) | 90.6 (6) |
| KAIKO-S/16 | Histopathology | VM | 76.8 (16) | 85.4 (13) | 82.4 (17) | 92.7 (6) | 84.3 (14) | 83.1 (14) | 88.4 (15) | 88.6 (12) | 93.1 (15) | 88.3 (14) |
| KAIKO-B/16 | Histopathology | VM | 77.8 (13) | 85.3 (14) | 82.5 (15) | 92.0 (11) | 84.4 (13) | 81.9 (16) | 88.5 (14) | 87.4 (15) | 93.3 (14) | 87.8 (15) |
| MIDNIGHT | Histopathology | VM | 77.1 (15) | 84.9 (16) | 85.7 (11) | 92.7 (5) | 85.1 (12) | 86.1 (7) | 89.6 (12) | 91.0 (10) | 94.4 (5) | 90.3 (9) |
| PHIKON | Histopathology | VM | 78.9 (12) | 85.1 (15) | 83.2 (14) | 89.7 (18) | 84.3 (15) | 84.9 (13) | 88.5 (13) | 87.9 (13) | 92.4 (16) | 88.4 (13) |
| PHIKON2 | Histopathology | VM | 80.2 (9) | 86.5 (11) | 83.3 (12) | 91.4 (13) | 85.3 (11) | 86.0 (9) | 89.7 (10) | 87.2 (17) | 94.7 (3) | 89.4 (12) |
| GIGAPATH | Histopathology | VM | 81.5 (7) | 87.4 (8) | 86.9 (10) | 92.4 (9) | 87.0 (7) | 85.5 (12) | 90.3 (6) | 91.3 (6) | 93.8 (10) | 90.2 (10) |
| UNI | Histopathology | VM | 81.3 (8) | 88.0 (4) | 87.2 (8) | 91.2 (15) | 86.9 (10) | 85.7 (10) | 90.4 (4) | 91.2 (8) | 93.9 (9) | 90.3 (8) |
| UNI2-H | Histopathology | VM | 82.6 (4) | 87.1 (10) | 90.5 (3) | 92.5 (8) | 88.2 (3) | 86.7 (4) | 90.5 (3) | 92.5 (2) | 95.1 (1) | 91.2 (2) |
| VIRCHOW | Histopathology | VM | 79.3 (11) | 87.8 (7) | 88.8 (6) | 92.3 (10) | 87.0 (8) | 86.2 (6) | 90.2 (7) | 91.3 (7) | 94.7 (2) | 90.6 (7) |
| VIRCHOW2 | Histopathology | VM | 82.3 (5) | 88.0 (3) | 89.1 (5) | 92.6 (7) | 88.0 (6) | 87.2 (2) | 90.8 (1) | 92.0 (4) | 93.9 (8) | 91.0 (3) |
| CONCH | Histopathology | VLM | 75.1 (18) | 84.5 (17) | 81.7 (18) | 91.1 (16) | 83.1 (18) | 82.1 (15) | 87.9 (16) | 87.3 (16) | 91.0 (18) | 87.1 (17) |
| CONCH 1.5 | Histopathology | VLM | 75.9 (17) | 84.2 (18) | 83.3 (13) | 91.4 (14) | 83.7 (17) | 81.6 (17) | 87.4 (18) | 87.0 (18) | 92.1 (17) | 87.0 (18) |
| KEEP | Histopathology | VLM | 79.8 (10) | 87.2 (9) | 87.2 (9) | 93.1 (4) | 86.9 (9) | 85.6 (11) | 89.7 (11) | 89.3 (11) | 93.8 (11) | 89.6 (11) |
| MUSK | Histopathology | VLM | 77.2 (14) | 85.7 (12) | 82.5 (16) | 91.1 (17) | 84.1 (16) | 80.6 (18) | 87.9 (17) | 87.6 (14) | 93.3 (13) | 87.4 (16) |
| PLIP | Histopathology | VLM | 69.4 (20) | 79.9 (19) | 74.4 (19) | 86.4 (19) | 77.5 (19) | 77.1 (22) | 84.7 (25) | 82.1 (21) | 88.6 (21) | 83.1 (22) |
| QUILTNET | Histopathology | VLM | 69.9 (19) | 77.7 (25) | 73.4 (21) | 85.3 (20) | 76.6 (20) | 77.0 (23) | 82.9 (27) | 81.2 (26) | 88.5 (23) | 82.4 (25) |
| DINOv2-B | Natural-image | VM | 64.0 (26) | 77.5 (26) | 70.4 (26) | 78.1 (26) | 72.5 (26) | 76.0 (25) | 83.9 (26) | 80.1 (27) | 87.6 (27) | 81.9 (27) |
| DINOv2-L | Natural-image | VM | 66.1 (24) | 79.0 (22) | 71.4 (25) | 78.5 (25) | 73.7 (25) | 74.0 (27) | 85.3 (20) | 82.1 (20) | 87.7 (26) | 82.3 (26) |
| ViT-B/16 | Natural-image | VM | 63.6 (27) | 76.7 (27) | 68.7 (27) | 77.0 (28) | 71.5 (27) | 78.2 (21) | 84.7 (24) | 81.2 (25) | 87.9 (25) | 83.0 (23) |
| ViT-L/16 | Natural-image | VM | 66.5 (23) | 78.7 (24) | 71.8 (24) | 81.8 (21) | 74.7 (23) | 79.3 (19) | 85.1 (21) | 81.3 (24) | 88.5 (22) | 83.6 (20) |
| DINOv3-B | Natural-image | VM | 65.8 (25) | 78.9 (23) | 73.2 (22) | 80.5 (24) | 74.6 (24) | 76.5 (24) | 84.8 (23) | 81.6 (23) | 90.0 (20) | 83.2 (21) |
| DINOv3-L | Natural-image | VM | 66.9 (21) | 79.8 (20) | 73.1 (23) | 81.6 (22) | 75.3 (22) | 78.3 (20) | 86.3 (19) | 82.5 (19) | 90.1 (19) | 84.3 (19) |
| DINOv3-S | Natural-image | VM | 66.8 (22) | 79.5 (21) | 73.9 (20) | 81.6 (23) | 75.5 (21) | 75.0 (26) | 85.0 (22) | 81.8 (22) | 88.2 (24) | 82.5 (24) |
| CLIP-B/32 | Natural-image | VLM | 57.0 (29) | 71.0 (29) | 63.7 (29) | 73.7 (29) | 66.4 (29) | 69.0 (29) | 81.3 (29) | 75.8 (29) | 84.7 (29) | 77.7 (29) |
| CLIP-L/14 | Natural-image | VLM | 62.5 (28) | 74.6 (28) | 66.5 (28) | 77.4 (27) | 70.2 (28) | 73.6 (28) | 82.8 (28) | 78.5 (28) | 86.7 (28) | 80.4 (28) |
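
As a quick sanity check on the "Avg" columns, the sketch below recomputes the KNN average for three rows of the table above as the unweighted mean of the four per-dataset F1-scores. The variable names are illustrative, and treating "Avg" as a plain arithmetic mean is an assumption that happens to agree with these rows. Because the published per-dataset values are already rounded to one decimal, a recomputed average can differ from the table in the last digit for other rows.

```python
from statistics import mean

# KNN F1-scores copied from the table above (Br, Co, Sk, Th).
knn_f1 = {
    "HIBOU-B": {"Br": 83.3, "Co": 88.1, "Sk": 87.7, "Th": 93.4},
    "HIBOU-L": {"Br": 83.6, "Co": 88.1, "Sk": 90.7, "Th": 93.5},
    "UNI2-H": {"Br": 82.6, "Co": 87.1, "Sk": 90.5, "Th": 92.5},
}

# Unweighted mean across the four SPIDER datasets (assumed definition of "Avg").
raw_avg = {model: mean(scores.values()) for model, scores in knn_f1.items()}
avg_f1 = {model: round(value, 1) for model, value in raw_avg.items()}

# Higher F1 is better, so sort in descending order of the unrounded average.
ranking = sorted(raw_avg, key=raw_avg.get, reverse=True)

for model in ranking:
    print(f"{model:8s} KNN Avg = {avg_f1[model]:.1f}")
```

The recomputed averages (89.0, 88.2, 88.1) agree with the "KNN Avg" column for these three models, and their relative order matches the table.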

🏆 Zero-shot VLM Classification Leaderboard

F1-score on test sets of all supported datasets and average across datasets for the zero-shot classification task. Only VLMs with publicly released patch-level text encoders are included.

| Model | Domain | Type | bach | bracs | break_his | ccrcc | crc | esca | mhist | patch_camelyon | tcga_crc_msi | tcga_tils | tcga_uniform | wilds | spider_breast | spider_colorectal | spider_skin | spider_thorax | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CONCH | Histopathology | VLM | 56.1 (3) | 37.9 (2) | 53.6 (1) | 56.9 (2) | 51.8 (4) | 40.1 (1) | 60.8 (2) | 57.8 (3) | 21.6 (4) | 47.4 (5) | 37.9 (2) | 83.2 (2) | 30.7 (3) | 31.4 (3) | 35.1 (3) | 43.0 (3) | 46.6 (3) |
| KEEP | Histopathology | VLM | 63.4 (1) | 34.2 (3) | 45.0 (3) | 69.1 (1) | 80.6 (1) | 33.3 (2) | 41.3 (7) | 71.4 (1) | 15.5 (6) | 55.5 (2) | 44.9 (1) | 89.4 (1) | 37.7 (1) | 44.4 (1) | 60.7 (1) | 51.8 (2) | 52.4 (1) |
| MUSK | Histopathology | VLM | 62.5 (2) | 38.6 (1) | 52.6 (2) | 50.7 (3) | 57.9 (3) | 25.9 (4) | 63.8 (1) | 53.5 (5) | 22.7 (3) | 50.1 (3) | 32.4 (3) | 71.2 (3) | 36.3 (2) | 36.2 (2) | 48.5 (2) | 55.2 (1) | 47.4 (2) |
| PLIP | Histopathology | VLM | 42.7 (4) | 25.9 (5) | 25.6 (5) | 38.2 (5) | 61.4 (2) | 31.5 (3) | 53.9 (5) | 46.1 (7) | 16.3 (5) | 64.1 (1) | 10.3 (5) | 51.0 (6) | 14.6 (4) | 28.3 (4) | 23.5 (4) | 22.6 (4) | 34.7 (4) |
| QUILTNET | Histopathology | VLM | 30.3 (5) | 29.1 (4) | 37.5 (4) | 24.2 (6) | 44.2 (5) | 14.2 (5) | 57.1 (4) | 65.8 (2) | 50.4 (1) | 47.6 (4) | 12.3 (4) | 44.3 (7) | 13.9 (5) | 25.0 (5) | 18.5 (5) | 19.7 (5) | 33.4 (5) |
| CLIP-B/32 | Natural-image | VLM | 13.2 (7) | 7.5 (7) | 18.7 (6) | 21.8 (7) | 24.4 (7) | 9.8 (7) | 42.2 (6) | 48.1 (6) | 13.9 (7) | 21.6 (7) | 2.0 (7) | 56.8 (5) | 3.9 (7) | 6.1 (7) | 4.3 (7) | 5.5 (6) | 18.7 (7) |
| CLIP-L/14 | Natural-image | VLM | 27.1 (6) | 19.2 (6) | 5.8 (7) | 40.6 (4) | 41.1 (6) | 10.4 (6) | 58.0 (3) | 55.6 (4) | 49.4 (2) | 25.4 (6) | 7.4 (6) | 70.2 (4) | 7.3 (6) | 16.0 (6) | 6.6 (6) | 5.4 (7) | 27.8 (6) |