QueryGym Leaderboard
Reproducible benchmarks for LLM query reformulation.
Retrievers
Each retriever has its own leaderboard, with one entry per (method × model) pair.
BGE-base-en-v1.5 (bge-base-en-v1.5): dense, 360 runs
BM25 (bm25): lexical, 360 runs
SPLADE++ (splade-pp): learned_sparse, 360 runs