About this leaderboard
The QueryGym Leaderboard tracks reproducible query-reformulation results
across IR benchmarks (BEIR, MS MARCO, TREC DL). Every row is backed by a
JSON file conforming to reproducibility/schema.json v1.
Submissions may also include the reformulated-queries TSV and a
TREC-format .run.txt for full re-evaluation; both are optional.
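For readers unfamiliar with the optional run file, the sketch below parses one line of the standard six-column TREC run format (query ID, the literal Q0, document ID, rank, score, run tag). The `RunEntry` type, the `parse_run_line` helper, and the sample line are illustrative assumptions, not part of QueryGym.

```python
from typing import NamedTuple


class RunEntry(NamedTuple):
    """One ranked document for one query (hypothetical helper type)."""
    query_id: str
    doc_id: str
    rank: int
    score: float
    tag: str


def parse_run_line(line: str) -> RunEntry:
    """Parse one line of the six-column TREC run format."""
    parts = line.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 whitespace-separated fields, got {len(parts)}")
    # The second column is conventionally the literal "Q0" and carries no information.
    query_id, _q0, doc_id, rank, score, tag = parts
    return RunEntry(query_id, doc_id, int(rank), float(score), tag)


# Made-up sample line, not taken from any real run file.
entry = parse_run_line("q1 Q0 doc42 1 12.5 example-tag")
print(entry)
```

A file in this format can be re-scored with any standard TREC evaluation tool, which is what makes full re-evaluation possible.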
All artifacts live in the repository
under reproducibility/data/runs/{dataset}/{method}/{model}/{retriever}/.
To cite a number, link to the commit and the run JSON.
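The directory layout described above can be sketched as a small path builder. This is a minimal illustration, assuming each of the four components is used verbatim as a directory name; the `run_dir` helper and the `bm25` retriever value are hypothetical and not taken from the repository.

```python
from pathlib import PurePosixPath

# Root of the run artifacts, as given in the layout above.
RUNS_ROOT = PurePosixPath("reproducibility/data/runs")


def run_dir(dataset: str, method: str, model: str, retriever: str) -> PurePosixPath:
    """Build the {dataset}/{method}/{model}/{retriever} directory (hypothetical helper)."""
    fields = {"dataset": dataset, "method": method, "model": model, "retriever": retriever}
    for name, value in fields.items():
        # Reject empty components and embedded separators so the depth stays fixed.
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    return RUNS_ROOT / dataset / method / model / retriever


# "bm25" is an assumed retriever name used only for illustration.
example = run_dir("msmarco-v1-passage.trecdl2019", "query2doc", "gpt-4.1", "bm25")
print(example)
# reproducibility/data/runs/msmarco-v1-passage.trecdl2019/query2doc/gpt-4.1/bm25
```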
Submitting a result
Run the example pipeline, then use submit_run.py and open a PR.
submit.sh
python examples/querygym_pyserini/pipeline.py \
    --dataset msmarco-v1-passage.trecdl2019 \
    --method query2doc --model gpt-4.1 \
    --output-dir outputs/dl19_query2doc
python -m reproducibility.scripts.submit_run --from-dir outputs/dl19_query2doc
make repro-aggregate
git add reproducibility/data/ && git commit -m "..." && git push
gh pr create
Full guide: Reproducibility User Guide
Papers
Two papers back QueryGym. See the Cite page for BibTeX entries you can paste into your bibliography.