QueryGym
QueryGym Leaderboard
Reproducible benchmarks for LLM query reformulation.

ArguAna

beir-v1.0.0-arguana
All results produced by QueryGym · fully reproducible!

120 (method × LLM × retriever) configurations evaluated on this dataset.
Each run consists of three steps: reformulate → retrieve → evaluate.
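The count of 120 follows from the grid of methods, LLMs, and retrievers that appear in the rows of this leaderboard. A minimal sketch that enumerates the grid is shown here; the variable names are illustrative and are not part of QueryGym's API.

```python
from itertools import product

# Names copied from the leaderboard rows; the list names themselves are hypothetical.
METHODS = [
    "csqe", "genqr", "genqr_ensemble", "lamer", "mugi", "qa_expand",
    "Q2D (ZS)", "Q2D (FS)", "Q2D (COT)", "query2e",
]
LLMS = ["Qwen2.5-72B-Instruct", "Qwen2.5-7B-Instruct", "gpt-4.1", "gpt-4.1-nano"]
RETRIEVERS = ["BGE-base-en-v1.5", "BM25", "SPLADE++"]

# One configuration per (method, LLM, retriever) triple.
configs = list(product(METHODS, LLMS, RETRIEVERS))
print(len(configs))  # 10 methods x 4 LLMs x 3 retrievers
```

Each triple would then go through the reformulate, retrieve, and evaluate steps once.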

Method          LLM                   Retriever         nDCG@10  R@100
csqe            Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6229   0.9886
csqe            Qwen2.5-72B-Instruct  BM25              0.3864   -
csqe            Qwen2.5-72B-Instruct  SPLADE++          0.5118   0.9787
csqe            Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6231   0.9893
csqe            Qwen2.5-7B-Instruct   BM25              0.4008   0.9403
csqe            Qwen2.5-7B-Instruct   SPLADE++          0.5100   0.9801
csqe            gpt-4.1               BGE-base-en-v1.5  0.6218   0.9915
csqe            gpt-4.1               BM25              0.3977   0.9445
csqe            gpt-4.1               SPLADE++          0.3801   0.9829
csqe            gpt-4.1-nano          BGE-base-en-v1.5  0.6210   0.9886
csqe            gpt-4.1-nano          BM25              0.3964   0.9381
csqe            gpt-4.1-nano          SPLADE++          0.3792   0.9801
genqr           Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6248   0.9900
genqr           Qwen2.5-72B-Instruct  BM25              0.4188   -
genqr           Qwen2.5-72B-Instruct  SPLADE++          0.5201   0.9815
genqr           Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6262   0.9893
genqr           Qwen2.5-7B-Instruct   BM25              0.4339   0.9523
genqr           Qwen2.5-7B-Instruct   SPLADE++          0.5211   0.9851
genqr           gpt-4.1               BGE-base-en-v1.5  0.6256   0.9893
genqr           gpt-4.1               BM25              0.4060   0.9495
genqr           gpt-4.1               SPLADE++          0.3755   0.9836
genqr           gpt-4.1-nano          BGE-base-en-v1.5  0.6234   0.9900
genqr           gpt-4.1-nano          BM25              0.4013   0.9488
genqr           gpt-4.1-nano          SPLADE++          0.3773   0.9829
genqr_ensemble  Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6254   0.9893
genqr_ensemble  Qwen2.5-72B-Instruct  BM25              0.4080   -
genqr_ensemble  Qwen2.5-72B-Instruct  SPLADE++          0.5193   0.9822
genqr_ensemble  Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6196   0.9900
genqr_ensemble  Qwen2.5-7B-Instruct   BM25              0.4187   0.9566
genqr_ensemble  Qwen2.5-7B-Instruct   SPLADE++          0.5180   0.9815
genqr_ensemble  gpt-4.1               BGE-base-en-v1.5  0.6187   0.9900
genqr_ensemble  gpt-4.1               BM25              0.4073   0.9566
genqr_ensemble  gpt-4.1               SPLADE++          0.3806   0.9808
genqr_ensemble  gpt-4.1-nano          BGE-base-en-v1.5  0.6196   0.9900
genqr_ensemble  gpt-4.1-nano          BM25              0.3945   0.9474
genqr_ensemble  gpt-4.1-nano          SPLADE++          0.3818   0.9808
lamer           Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6210   0.9893
lamer           Qwen2.5-72B-Instruct  BM25              0.4111   -
lamer           Qwen2.5-72B-Instruct  SPLADE++          0.5161   0.9815
lamer           Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6195   0.9908
lamer           Qwen2.5-7B-Instruct   BM25              0.4063   0.9388
lamer           Qwen2.5-7B-Instruct   SPLADE++          0.5148   0.9794
lamer           gpt-4.1               BGE-base-en-v1.5  0.6204   0.9893
lamer           gpt-4.1               BM25              0.4119   0.9452
lamer           gpt-4.1               SPLADE++          0.3836   0.9829
lamer           gpt-4.1-nano          BGE-base-en-v1.5  0.6254   0.9900
lamer           gpt-4.1-nano          BM25              0.4037   0.9388
lamer           gpt-4.1-nano          SPLADE++          0.3800   0.9780
mugi            Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6194   0.9900
mugi            Qwen2.5-72B-Instruct  BM25              0.3868   -
mugi            Qwen2.5-72B-Instruct  SPLADE++          0.5031   0.9787
mugi            Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6213   0.9922
mugi            Qwen2.5-7B-Instruct   BM25              0.3926   0.9381
mugi            Qwen2.5-7B-Instruct   SPLADE++          0.5101   0.9787
mugi            gpt-4.1               BGE-base-en-v1.5  0.6161   0.9900
mugi            gpt-4.1               BM25              0.3758   0.9331
mugi            gpt-4.1               SPLADE++          0.3703   0.9780
mugi            gpt-4.1-nano          BGE-base-en-v1.5  0.6184   0.9900
mugi            gpt-4.1-nano          BM25              0.3831   0.9317
mugi            gpt-4.1-nano          SPLADE++          0.3718   0.9787
qa_expand       Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6213   0.9900
qa_expand       Qwen2.5-72B-Instruct  BM25              0.3995   -
qa_expand       Qwen2.5-72B-Instruct  SPLADE++          0.5174   0.9794
qa_expand       Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6208   0.9900
qa_expand       Qwen2.5-7B-Instruct   BM25              0.3940   0.9324
qa_expand       Qwen2.5-7B-Instruct   SPLADE++          0.5170   0.9829
qa_expand       gpt-4.1               BGE-base-en-v1.5  0.6231   0.9900
qa_expand       gpt-4.1               BM25              0.3970   0.9324
qa_expand       gpt-4.1               SPLADE++          0.3823   0.9801
qa_expand       gpt-4.1-nano          BGE-base-en-v1.5  0.6213   0.9893
qa_expand       gpt-4.1-nano          BM25              0.4021   0.9367
qa_expand       gpt-4.1-nano          SPLADE++          0.3811   0.9787
Q2D (FS)        Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6190   0.9900
Q2D (ZS)        Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6187   0.9900
Q2D (COT)       Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6188   0.9900
Q2D (ZS)        Qwen2.5-72B-Instruct  BM25              0.3995   -
Q2D (COT)       Qwen2.5-72B-Instruct  BM25              0.4060   -
Q2D (FS)        Qwen2.5-72B-Instruct  BM25              0.3991   -
Q2D (FS)        Qwen2.5-72B-Instruct  SPLADE++          0.5200   0.9801
Q2D (ZS)        Qwen2.5-72B-Instruct  SPLADE++          0.5194   0.9808
Q2D (COT)       Qwen2.5-72B-Instruct  SPLADE++          0.5199   0.9808
Q2D (COT)       Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6195   0.9893
Q2D (ZS)        Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6183   0.9893
Q2D (FS)        Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6207   0.9886
Q2D (COT)       Qwen2.5-7B-Instruct   BM25              0.4011   0.9360
Q2D (FS)        Qwen2.5-7B-Instruct   BM25              0.3984   0.9353
Q2D (ZS)        Qwen2.5-7B-Instruct   BM25              0.4007   0.9353
Q2D (COT)       Qwen2.5-7B-Instruct   SPLADE++          0.5200   0.9808
Q2D (ZS)        Qwen2.5-7B-Instruct   SPLADE++          0.5196   0.9815
Q2D (FS)        Qwen2.5-7B-Instruct   SPLADE++          0.5199   0.9808
Q2D (ZS)        gpt-4.1               BGE-base-en-v1.5  0.6187   0.9900
Q2D (COT)       gpt-4.1               BGE-base-en-v1.5  0.6186   0.9886
Q2D (FS)        gpt-4.1               BGE-base-en-v1.5  0.6179   0.9893
Q2D (ZS)        gpt-4.1               BM25              0.3970   0.9324
Q2D (COT)       gpt-4.1               BM25              0.4028   0.9374
Q2D (FS)        gpt-4.1               BM25              0.4012   0.9410
Q2D (ZS)        gpt-4.1               SPLADE++          0.3819   0.9808
Q2D (COT)       gpt-4.1               SPLADE++          0.3820   0.9801
Q2D (FS)        gpt-4.1               SPLADE++          0.3826   0.9808
Q2D (COT)       gpt-4.1-nano          BGE-base-en-v1.5  0.6194   0.9893
Q2D (FS)        gpt-4.1-nano          BGE-base-en-v1.5  0.6188   0.9900
Q2D (ZS)        gpt-4.1-nano          BGE-base-en-v1.5  0.6190   0.9900
Q2D (COT)       gpt-4.1-nano          BM25              0.4011   0.9360
Q2D (FS)        gpt-4.1-nano          BM25              0.3965   0.9324
Q2D (ZS)        gpt-4.1-nano          BM25              0.3980   0.9374
Q2D (COT)       gpt-4.1-nano          SPLADE++          0.3820   0.9801
Q2D (FS)        gpt-4.1-nano          SPLADE++          0.3823   0.9801
Q2D (ZS)        gpt-4.1-nano          SPLADE++          0.3819   0.9808
query2e         Qwen2.5-72B-Instruct  BGE-base-en-v1.5  0.6196   0.9900
query2e         Qwen2.5-72B-Instruct  BM25              0.4066   -
query2e         Qwen2.5-72B-Instruct  SPLADE++          0.5188   0.9808
query2e         Qwen2.5-7B-Instruct   BGE-base-en-v1.5  0.6205   0.9900
query2e         Qwen2.5-7B-Instruct   BM25              0.4052   0.9403
query2e         Qwen2.5-7B-Instruct   SPLADE++          0.5193   0.9815
query2e         gpt-4.1               BGE-base-en-v1.5  0.6192   0.9900
query2e         gpt-4.1               BM25              0.4062   0.9381
query2e         gpt-4.1               SPLADE++          0.3818   0.9808
query2e         gpt-4.1-nano          BGE-base-en-v1.5  0.6198   0.9900
query2e         gpt-4.1-nano          BM25              0.4060   0.9417
query2e         gpt-4.1-nano          SPLADE++          0.3819   0.9808
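
For readers unfamiliar with the two metric columns, the sketch below computes nDCG@10 and R@100 for a single query. It assumes the standard definitions (gain discounted by log2 of the rank and normalised by the ideal ranking for nDCG; fraction of relevant documents found in the top k for recall). The function names and toy data are hypothetical and are not QueryGym's evaluation code; a leaderboard score would presumably be the mean of the per-query values.

```python
import math

def ndcg_at_k(ranked_ids, qrels, k=10):
    """nDCG@k for one query; qrels maps doc id -> graded relevance (0 = not relevant)."""
    # Discounted gain of the top-k retrieved documents (rank 0 gets discount log2(2) = 1).
    dcg = sum(qrels.get(doc_id, 0) / math.log2(rank + 2)
              for rank, doc_id in enumerate(ranked_ids[:k]))
    # Ideal DCG: the same sum over the best possible ordering of the judged documents.
    ideal_gains = sorted(qrels.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(rank + 2) for rank, gain in enumerate(ideal_gains))
    return dcg / idcg if idcg > 0 else 0.0

def recall_at_k(ranked_ids, qrels, k=100):
    """Fraction of the relevant documents that appear in the top k."""
    relevant = {doc_id for doc_id, gain in qrels.items() if gain > 0}
    if not relevant:
        return 0.0
    return len(relevant & set(ranked_ids[:k])) / len(relevant)

# Toy example (made-up ids): the single relevant document is ranked second.
ranking = ["d3", "d1", "d2"]
judgments = {"d1": 1}
ndcg = ndcg_at_k(ranking, judgments)
recall = recall_at_k(ranking, judgments)
```

In the toy example the relevant document sits at rank 2, so nDCG@10 is 1 / log2(3) and R@100 is 1.0.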