QueryGym
QueryGym Leaderboard
SIGIR 2026 Reproducibility — Query Reformulation × LLMs × Datasets

Reproducible LLM-based query reformulation results across the BEIR, MS MARCO, and TREC DL benchmarks. Every row is backed by a committed JSON file, a TREC run file, and the reformulated queries, so each result can be verified from a fresh clone.
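As an illustration of what verifying a TREC run file could involve, here is a minimal Python sketch. It assumes the standard six-column TREC run format (`qid Q0 docid rank score tag`). The function names `parse_trec_run` and `check_run` are hypothetical and not part of QueryGym, and the checks shown are only basic sanity checks, not the leaderboard's own validation.

```python
from collections import defaultdict


def parse_trec_run(lines):
    """Parse lines in the standard TREC run format: qid Q0 docid rank score tag.

    Hypothetical helper for illustration; not a QueryGym API.
    Returns {qid: [(docid, rank, score), ...]} in input order.
    """
    run = defaultdict(list)
    for n, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 6:
            raise ValueError(f"line {n}: expected 6 fields, got {len(parts)}")
        qid, _q0, docid, rank, score, _tag = parts
        run[qid].append((docid, int(rank), float(score)))
    return dict(run)


def check_run(run):
    """Return a list of problems found in a parsed run.

    Flags queries with duplicate document ids and queries whose
    scores increase as rank increases.
    """
    problems = []
    for qid, rows in run.items():
        docids = [docid for docid, _, _ in rows]
        if len(docids) != len(set(docids)):
            problems.append(f"{qid}: duplicate document ids")
        ordered = sorted(rows, key=lambda r: r[1])  # sort by rank
        scores = [score for _, _, score in ordered]
        if any(a < b for a, b in zip(scores, scores[1:])):
            problems.append(f"{qid}: scores increase with rank")
    return problems
```

In use, the lines of a run file would be passed to `parse_trec_run` (for example, from an open file handle), and an empty list from `check_run` would mean that neither problem was detected.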

Runs: 0
Datasets: 32
Methods: 0
LLMs: 0

Datasets with results

No SIGIR runs landed yet

The schema is locked and the pipeline is live. Once the SIGIR backfill PR lands, results appear here automatically.

How to submit a result →