Updates

  • 2025-09-01: [Rank-sum Leaderboard] Segmentation results were updated: performance improved slightly for all models, on the ocelot dataset only (see the related commit). As a result, the segmentation and global rankings no longer match Table 4 of the current version of our arXiv paper exactly, although the differences are small. The paper will be updated soon.
  • 2025-09-30: [SPIDER Leaderboard] Four SPIDER datasets have been integrated into thunder. Their results are not included in the rank-sum leaderboard, which aggregates only the datasets presented in our arXiv paper. They are reported instead in a dedicated SPIDER leaderboard below.
  • 2025-10-06: [Zero-shot VLM Classification Leaderboard] A new zero-shot classification task was integrated into THUNDER. Results are presented in a dedicated leaderboard below.

🏆 Rank-sum Leaderboard

| Model | Domain | Type | KNN ↑ | Lin. prob. ↑ | Few-shot ↑ | Seg. ↑ | Calib. ↓ | Adv. attack ↓ | Rank sum ↓ |
|---|---|---|---|---|---|---|---|---|---|
| HIBOU-B | Histopathology | VM | 75.8 (10) | 78.0 (14) | 74.2 (6) | 67.8 (10) | 3.7 (2) | 52.8 (14) | 56 (7) |
| HIBOU-L | Histopathology | VM | 75.2 (12) | 81.2 (7) | 70.4 (12) | 68.6 (6) | 5.5 (18) | 40.0 (5) | 60 (8) |
| H-OPTIMUS-0 | Histopathology | VM | 79.2 (5) | 81.4 (5) | 73.4 (7) | 65.2 (13) | 4.7 (13) | 44.2 (9) | 52 (6) |
| H-OPTIMUS-1 | Histopathology | VM | 80.5 (3) | 83.3 (2) | 74.8 (4) | 64.5 (15) | 4.1 (4) | 58.0 (17) | 45 (5) |
| MIDNIGHT | Histopathology | VM | 78.2 (8) | 82.9 (3) | 70.6 (11) | 68.8 (4) | 3.2 (1) | 36.3 (4) | 31 (3) |
| PHIKON | Histopathology | VM | 72.8 (14) | 78.4 (13) | 72.2 (10) | 68.0 (9) | 6.4 (22) | 34.4 (3) | 71 (11) |
| PHIKON2 | Histopathology | VM | 70.1 (15) | 76.5 (15) | 70.1 (13) | 67.4 (12) | 4.6 (11) | 45.6 (11) | 77 (12) |
| UNI | Histopathology | VM | 78.8 (6) | 81.3 (6) | 76.4 (2) | 67.8 (11) | 4.3 (7) | 42.8 (7) | 39 (4) |
| UNI2-H | Histopathology | VM | 81.7 (1) | 83.9 (1) | 78.4 (1) | 69.0 (3) | 4.5 (8) | 34.3 (2) | 16 (1) |
| VIRCHOW | Histopathology | VM | 74.2 (13) | 80.2 (10) | 68.5 (15) | 69.2 (2) | 5.5 (20) | 41.0 (6) | 66 (10) |
| VIRCHOW2 | Histopathology | VM | 81.2 (2) | 82.7 (4) | 72.6 (9) | 69.3 (1) | 4.6 (10) | 33.6 (1) | 27 (2) |
| CONCH | Histopathology | VLM | 77.3 (9) | 80.2 (11) | 73.1 (8) | 68.3 (7) | 4.3 (6) | 55.0 (15) | 56 (7) |
| CONCH 1.5 | Histopathology | VLM | 78.6 (7) | 80.8 (9) | 74.6 (5) | 68.8 (5) | 4.9 (14) | 75.3 (23) | 63 (9) |
| KEEP | Histopathology | VLM | 79.7 (4) | 81.1 (8) | 75.8 (3) | 68.0 (8) | 4.7 (12) | 44.7 (10) | 45 (5) |
| MUSK | Histopathology | VLM | 75.6 (11) | 79.0 (12) | 70.0 (14) | 65.1 (14) | 4.5 (9) | 69.3 (22) | 82 (13) |
| PLIP | Histopathology | VLM | 67.8 (19) | 71.0 (22) | 63.4 (17) | 58.5 (22) | 4.9 (15) | 56.9 (16) | 111 (18) |
| QUILTNET | Histopathology | VLM | 68.3 (17) | 71.0 (21) | 65.7 (16) | 58.9 (21) | 7.0 (23) | 52.7 (13) | 111 (18) |
| DINOv2-B | Natural-image | VM | 67.9 (18) | 74.8 (17) | 61.0 (18) | 59.8 (19) | 5.5 (21) | 65.8 (20) | 113 (19) |
| DINOv2-L | Natural-image | VM | 69.6 (16) | 75.3 (16) | 59.2 (19) | 59.6 (20) | 5.3 (17) | 64.5 (19) | 107 (17) |
| ViT-B/16 | Natural-image | VM | 64.4 (21) | 71.9 (19) | 57.8 (21) | 61.0 (17) | 3.9 (3) | 46.8 (12) | 93 (14) |
| ViT-L/16 | Natural-image | VM | 67.5 (20) | 72.8 (18) | 56.5 (22) | 63.1 (16) | 5.0 (16) | 44.1 (8) | 100 (15) |
| CLIP-B/32 | Natural-image | VLM | 61.9 (23) | 65.8 (23) | 53.3 (23) | 56.0 (23) | 5.5 (19) | 60.4 (18) | 129 (23) |
| CLIP-L/14 | Natural-image | VLM | 64.2 (22) | 71.3 (20) | 58.2 (20) | 60.8 (18) | 4.2 (5) | 67.8 (21) | 106 (16) |
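
The "Rank sum" column can be reproduced from the table itself: for every model, it equals the sum of the six per-task ranks shown in parentheses. The sketch below illustrates this, with the ranks copied by hand from the rows above. The names `TASK_RANKS` and `rank_sums` are illustrative and are not part of the thunder package. How ties are handled in the final parenthesised rank is not stated on this page, so the sketch only computes the sums and sorts by them.

```python
# Per-task ranks copied from the parenthesised values in the rank-sum table, in column
# order: KNN, Lin. prob., Few-shot, Seg., Calib., Adv. attack.
TASK_RANKS = {
    "HIBOU-B": (10, 14, 6, 10, 2, 14),
    "HIBOU-L": (12, 7, 12, 6, 18, 5),
    "H-OPTIMUS-0": (5, 5, 7, 13, 13, 9),
    "H-OPTIMUS-1": (3, 2, 4, 15, 4, 17),
    "MIDNIGHT": (8, 3, 11, 4, 1, 4),
    "PHIKON": (14, 13, 10, 9, 22, 3),
    "PHIKON2": (15, 15, 13, 12, 11, 11),
    "UNI": (6, 6, 2, 11, 7, 7),
    "UNI2-H": (1, 1, 1, 3, 8, 2),
    "VIRCHOW": (13, 10, 15, 2, 20, 6),
    "VIRCHOW2": (2, 4, 9, 1, 10, 1),
    "CONCH": (9, 11, 8, 7, 6, 15),
    "CONCH 1.5": (7, 9, 5, 5, 14, 23),
    "KEEP": (4, 8, 3, 8, 12, 10),
    "MUSK": (11, 12, 14, 14, 9, 22),
    "PLIP": (19, 22, 17, 22, 15, 16),
    "QUILTNET": (17, 21, 16, 21, 23, 13),
    "DINOv2-B": (18, 17, 18, 19, 21, 20),
    "DINOv2-L": (16, 16, 19, 20, 17, 19),
    "ViT-B/16": (21, 19, 21, 17, 3, 12),
    "ViT-L/16": (20, 18, 22, 16, 16, 8),
    "CLIP-B/32": (23, 23, 23, 23, 19, 18),
    "CLIP-L/14": (22, 20, 20, 18, 5, 21),
}


def rank_sums(task_ranks):
    """Sum the per-task ranks of each model (lower is better)."""
    return {model: sum(ranks) for model, ranks in task_ranks.items()}


sums = rank_sums(TASK_RANKS)

# Print the models from best (lowest rank sum) to worst.
for model in sorted(sums, key=sums.get):
    print(f"{model:12s} {sums[model]:4d}")
```

Because only ranks enter the aggregate, a model that is consistently good across tasks (for example UNI2-H, with a rank sum of 16) can place ahead of a model that leads a single column by a wide margin.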

🏆 SPIDER Leaderboard

F1-score on the test sets of the SPIDER datasets, and the average across datasets, for the KNN and linear probing tasks. The considered datasets are spider_breast (Br), spider_colorectal (Co), spider_skin (Sk), and spider_thorax (Th).

| Model | Domain | Type | KNN Br ↑ | KNN Co ↑ | KNN Sk ↑ | KNN Th ↑ | KNN Avg ↑ | Linear probing Br ↑ | Linear probing Co ↑ | Linear probing Sk ↑ | Linear probing Th ↑ | Linear probing Avg ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HIBOU-B | Histopathology | VM | 83.3 (2) | 88.1 (1) | 87.7 (7) | 93.4 (3) | 88.1 (5) | 86.6 (5) | 90.7 (2) | 91.1 (8) | 94.5 (4) | 90.7 (5) |
| HIBOU-L | Histopathology | VM | 83.6 (1) | 88.1 (2) | 90.7 (2) | 93.5 (2) | 89.0 (1) | 88.0 (1) | 89.8 (8) | 93.3 (1) | 94.1 (7) | 91.3 (1) |
| H-OPTIMUS-0 | Histopathology | VM | 81.7 (6) | 87.8 (7) | 89.3 (4) | 93.8 (1) | 88.2 (4) | 87.2 (3) | 89.9 (7) | 91.9 (5) | 94.4 (6) | 90.8 (4) |
| H-OPTIMUS-1 | Histopathology | VM | 83.0 (3) | 87.8 (5) | 91.1 (1) | 91.5 (9) | 88.4 (2) | 86.1 (8) | 90.3 (5) | 92.3 (3) | 93.6 (11) | 90.6 (6) |
| MIDNIGHT | Histopathology | VM | 77.1 (13) | 84.9 (13) | 85.7 (10) | 92.7 (5) | 85.1 (11) | 86.1 (7) | 89.6 (11) | 91.0 (9) | 94.4 (5) | 90.3 (9) |
| PHIKON | Histopathology | VM | 78.9 (11) | 85.1 (12) | 83.2 (13) | 89.7 (15) | 84.3 (12) | 84.9 (12) | 88.5 (12) | 87.9 (11) | 92.4 (13) | 88.4 (12) |
| PHIKON2 | Histopathology | VM | 80.2 (8) | 86.5 (10) | 83.3 (11) | 91.4 (11) | 85.3 (10) | 86.0 (9) | 89.7 (10) | 87.2 (14) | 94.7 (3) | 89.4 (11) |
| UNI | Histopathology | VM | 81.3 (7) | 88.0 (4) | 87.2 (8) | 91.2 (12) | 86.9 (8) | 85.7 (10) | 90.4 (4) | 91.2 (7) | 93.9 (8) | 90.3 (8) |
| UNI2-H | Histopathology | VM | 82.6 (4) | 87.1 (9) | 90.5 (3) | 92.5 (7) | 88.2 (3) | 86.7 (4) | 90.5 (3) | 92.5 (2) | 95.1 (1) | 91.2 (2) |
| VIRCHOW | Histopathology | VM | 79.3 (10) | 87.8 (6) | 88.8 (6) | 92.3 (8) | 87.0 (7) | 86.2 (6) | 90.2 (6) | 91.3 (6) | 94.7 (2) | 90.6 (7) |
| VIRCHOW2 | Histopathology | VM | 82.3 (5) | 88.0 (3) | 89.1 (5) | 92.6 (6) | 88.0 (6) | 87.2 (2) | 90.8 (1) | 92.0 (4) | 93.9 (9) | 91.0 (3) |
| CONCH | Histopathology | VLM | 75.1 (15) | 84.5 (14) | 81.7 (15) | 91.1 (14) | 83.1 (15) | 82.1 (13) | 87.9 (13) | 87.3 (13) | 91.0 (15) | 87.1 (14) |
| CONCH 1.5 | Histopathology | VLM | 75.9 (14) | 84.2 (15) | 83.3 (12) | 91.4 (10) | 83.7 (14) | 81.6 (14) | 87.4 (15) | 87.0 (15) | 92.1 (14) | 87.0 (15) |
| KEEP | Histopathology | VLM | 79.8 (9) | 87.2 (8) | 87.2 (9) | 93.1 (4) | 86.9 (9) | 85.6 (11) | 89.7 (9) | 89.3 (10) | 93.8 (10) | 89.6 (10) |
| MUSK | Histopathology | VLM | 77.2 (12) | 85.7 (11) | 82.5 (14) | 91.1 (13) | 84.1 (13) | 80.6 (15) | 87.9 (14) | 87.6 (12) | 93.3 (12) | 87.4 (13) |
| PLIP | Histopathology | VLM | 69.4 (17) | 79.9 (16) | 74.4 (16) | 86.4 (16) | 77.5 (16) | 77.1 (18) | 84.7 (19) | 82.1 (17) | 88.6 (16) | 83.1 (17) |
| QUILTNET | Histopathology | VLM | 69.9 (16) | 77.7 (19) | 73.4 (17) | 85.3 (17) | 76.6 (17) | 77.0 (19) | 82.9 (21) | 81.2 (20) | 88.5 (18) | 82.4 (19) |
| DINOv2-B | Natural-image | VM | 64.0 (20) | 77.5 (20) | 70.4 (20) | 78.1 (20) | 72.5 (20) | 76.0 (20) | 83.9 (20) | 80.1 (21) | 87.6 (21) | 81.9 (21) |
| DINOv2-L | Natural-image | VM | 66.1 (19) | 79.0 (17) | 71.4 (19) | 78.5 (19) | 73.7 (19) | 74.0 (21) | 85.3 (16) | 82.1 (16) | 87.7 (20) | 82.3 (20) |
| ViT-B/16 | Natural-image | VM | 63.6 (21) | 76.7 (21) | 68.7 (21) | 77.0 (22) | 71.5 (21) | 78.2 (17) | 84.7 (18) | 81.2 (19) | 87.9 (19) | 83.0 (18) |
| ViT-L/16 | Natural-image | VM | 66.5 (18) | 78.7 (18) | 71.8 (18) | 81.8 (18) | 74.7 (18) | 79.3 (16) | 85.1 (17) | 81.3 (18) | 88.5 (17) | 83.6 (16) |
| CLIP-B/32 | Natural-image | VLM | 57.0 (23) | 71.0 (23) | 63.7 (23) | 73.7 (23) | 66.4 (23) | 69.0 (23) | 81.3 (23) | 75.8 (23) | 84.7 (23) | 77.7 (23) |
| CLIP-L/14 | Natural-image | VLM | 62.5 (22) | 74.6 (22) | 66.5 (22) | 77.4 (21) | 70.2 (22) | 73.6 (22) | 82.8 (22) | 78.5 (22) | 86.7 (22) | 80.4 (22) |
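
This page does not state how the "Avg" columns are computed. The displayed values are consistent, up to rounding, with an unweighted mean of the four per-dataset F1-scores. The sketch below checks this assumption for the KNN scores of six models and ranks them by their mean. The inputs are the rounded values shown in the table, so the last digit of a recomputed average may differ from the displayed one. `KNN_F1`, `averages`, and `ranking` are illustrative names.

```python
from statistics import mean

# KNN F1-scores (Br, Co, Sk, Th) copied from the SPIDER table for six models.
KNN_F1 = {
    "HIBOU-B": (83.3, 88.1, 87.7, 93.4),
    "HIBOU-L": (83.6, 88.1, 90.7, 93.5),
    "H-OPTIMUS-0": (81.7, 87.8, 89.3, 93.8),
    "H-OPTIMUS-1": (83.0, 87.8, 91.1, 91.5),
    "UNI2-H": (82.6, 87.1, 90.5, 92.5),
    "VIRCHOW2": (82.3, 88.0, 89.1, 92.6),
}

# Assumption: "Avg" is the unweighted mean over the four SPIDER datasets.
averages = {model: mean(scores) for model, scores in KNN_F1.items()}

# Higher F1 is better, so sort in descending order of the mean.
ranking = sorted(averages, key=averages.get, reverse=True)

for position, model in enumerate(ranking, start=1):
    print(f"{position}. {model:12s} {averages[model]:.3f}")
```

For these six models the resulting order matches the parenthesised ranks 1 to 6 in the KNN Avg column.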

🏆 Zero-shot VLM Classification Leaderboard

F1-score on test sets of all supported datasets and average across datasets for the zero-shot classification task. Only VLM models with publicly released patch-level text encoders are included.

| Model | Domain | Type | bach | bracs | break_his | ccrcc | crc | esca | mhist | patch_camelyon | tcga_crc_msi | tcga_tils | tcga_uniform | wilds | spider_breast | spider_colorectal | spider_skin | spider_thorax | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CONCH | Histopathology | VLM | 56.1 (3) | 37.9 (2) | 53.6 (1) | 56.9 (2) | 51.8 (4) | 40.1 (1) | 60.8 (2) | 57.8 (3) | 21.6 (4) | 47.4 (5) | 37.9 (2) | 83.2 (2) | 30.7 (3) | 31.4 (3) | 35.1 (3) | 43.0 (3) | 46.6 (3) |
| KEEP | Histopathology | VLM | 63.4 (1) | 34.2 (3) | 45.0 (3) | 69.1 (1) | 80.6 (1) | 33.3 (2) | 41.3 (7) | 71.4 (1) | 15.5 (6) | 55.5 (2) | 44.9 (1) | 89.4 (1) | 37.7 (1) | 44.4 (1) | 60.7 (1) | 51.8 (2) | 52.4 (1) |
| MUSK | Histopathology | VLM | 62.5 (2) | 38.6 (1) | 52.6 (2) | 50.7 (3) | 57.9 (3) | 25.9 (4) | 63.8 (1) | 53.5 (5) | 22.7 (3) | 50.1 (3) | 32.4 (3) | 71.2 (3) | 36.3 (2) | 36.2 (2) | 48.5 (2) | 55.2 (1) | 47.4 (2) |
| PLIP | Histopathology | VLM | 42.7 (4) | 25.9 (5) | 25.6 (5) | 38.2 (5) | 61.4 (2) | 31.5 (3) | 53.9 (5) | 46.1 (7) | 16.3 (5) | 64.1 (1) | 10.3 (5) | 51.0 (6) | 14.6 (4) | 28.3 (4) | 23.5 (4) | 22.6 (4) | 34.7 (4) |
| QUILTNET | Histopathology | VLM | 30.3 (5) | 29.1 (4) | 37.5 (4) | 24.2 (6) | 44.2 (5) | 14.2 (5) | 57.1 (4) | 65.8 (2) | 50.4 (1) | 47.6 (4) | 12.3 (4) | 44.3 (7) | 13.9 (5) | 25.0 (5) | 18.5 (5) | 19.7 (5) | 33.4 (5) |
| CLIP-B/32 | Natural-image | VLM | 13.2 (7) | 7.5 (7) | 18.7 (6) | 21.8 (7) | 24.4 (7) | 9.8 (7) | 42.2 (6) | 48.1 (6) | 13.9 (7) | 21.6 (7) | 2.0 (7) | 56.8 (5) | 3.9 (7) | 6.1 (7) | 4.3 (7) | 5.5 (6) | 18.7 (7) |
| CLIP-L/14 | Natural-image | VLM | 27.1 (6) | 19.2 (6) | 5.8 (7) | 40.6 (4) | 41.1 (6) | 10.4 (6) | 58.0 (3) | 55.6 (4) | 49.4 (2) | 25.4 (6) | 7.4 (6) | 70.2 (4) | 7.3 (6) | 16.0 (6) | 6.6 (6) | 5.4 (7) | 27.8 (6) |