30 method × retriever configurations using this LLM across BEIR, MS MARCO DL, and DL-HARD.
Each configuration row below is followed by its reproduction commands, one set per dataset.
Every set runs the same three steps (reformulate → retrieve → evaluate), with the dataset, index, and qrels arguments changed accordingly.
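The three steps can be assembled programmatically. The sketch below is illustrative only: the flags and paths are copied from the commands listed on this page, while the function name `build_commands`, the `RETRIEVER_ARGS` table, and its short keys are hypothetical helpers introduced here. It only builds argument lists and does not run anything, and the order of flags within a command may differ slightly from the page.

```python
import json
import shlex

# Retriever-specific pieces, copied from the commands on this page.
# The keys ("bge", "bm25", "splade") are illustrative names, not part of any tool.
RETRIEVER_ARGS = {
    "bge": ("pyserini.search.faiss", ["--encoder", "BAAI/bge-base-en-v1.5"]),
    "bm25": ("pyserini.search.lucene", ["--bm25", "--k1", "0.9", "--b", "0.4"]),
    "splade": ("pyserini.search.lucene",
               ["--encoder", "naver/splade-cocondenser-ensembledistil", "--impact"]),
}

TOPICS = "outputs/reproduce/queries/reformulated_queries.tsv"


def build_commands(dataset, index, qrels, method, retriever, recall_k):
    """Return the reformulate, retrieve, and evaluate commands as argument lists."""
    # Compact JSON, matching the --method-params string shown on the page.
    params = json.dumps({"num_examples": 4, "train_split": "train"},
                        separators=(",", ":"))
    reformulate = [
        "python", "examples/querygym_pyserini/pipeline.py",
        "--dataset", dataset, "--method", method,
        "--model", "openai/gpt-4.1", "--steps", "reformulate",
        "--temperature", "1", "--max-tokens", "128",
        "--method-params", params, "--output-dir", "outputs/reproduce",
    ]
    module, extra = RETRIEVER_ARGS[retriever]
    retrieve = [
        "python", "-m", module, "--threads", "16", "--batch-size", "128",
        "--index", index, "--topics", TOPICS, *extra,
        "--output", "run.txt", "--hits", "1000",
    ]
    evaluate = [
        "python", "-m", "pyserini.eval.trec_eval", "-c",
        "-m", "ndcg.cut.10", "-m", f"recall.{recall_k}", qrels, "run.txt",
    ]
    return [reformulate, retrieve, evaluate]


if __name__ == "__main__":
    for cmd in build_commands("beir-v1.0.0-arguana", "beir-v1.0.0-arguana.flat",
                              "beir-v1.0.0-arguana-test", "csqe", "bm25", 100):
        print(shlex.join(cmd))  # shell-quoted, one command per line
```

Passing the index and qrels explicitly, rather than deriving them from the dataset name, keeps the sketch honest about the one irregularity on this page: the BM25 index for the MS MARCO collections is plain `msmarco-v1-passage`, while the BEIR BM25 indexes carry a `.flat` suffix.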
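The table reports nDCG@10 together with recall at 100 (BEIR) or 1000 (DL-HARD, DL 2019, DL 2020). As a rough illustration of what these numbers measure for a single query, here is a minimal pure-Python sketch. It assumes linear gains and a log2 rank discount; the function names and the toy judgments are made up for the example, and it is not a substitute for trec_eval, which averages over all queries and handles details this sketch ignores.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (ranks start at 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """nDCG@k for one query; qrels maps doc id -> graded relevance."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    ideal = sorted(qrels.values(), reverse=True)  # best possible ordering
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


def recall_at_k(ranked_doc_ids, qrels, k=100):
    """Fraction of relevant documents (relevance > 0) found in the top k."""
    relevant = {doc_id for doc_id, rel in qrels.items() if rel > 0}
    if not relevant:
        return 0.0
    return len(relevant & set(ranked_doc_ids[:k])) / len(relevant)


if __name__ == "__main__":
    toy_qrels = {"d1": 2, "d2": 1, "d3": 0}   # hypothetical judgments
    toy_run = ["d2", "d4", "d1"]              # hypothetical ranking
    print(round(ndcg_at_k(toy_run, toy_qrels, k=10), 4))
    print(recall_at_k(toy_run, toy_qrels, k=2))
```

In the toy run, the highly relevant document `d1` sits at rank 3, so its gain is discounted and nDCG@10 falls below 1 even though both relevant documents are retrieved.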
| Method | Retriever | ArguAna | | DBPedia | | FiQA | | SciFact | | COVID | | News | | BRIGHT: AOPS | | BRIGHT: Biology | | BRIGHT: Earth Science | | BRIGHT: Economics | | BRIGHT: LeetCode | | BRIGHT: Pony | | BRIGHT: Psychology | | BRIGHT: Robotics | | BRIGHT: Stack Overflow | | BRIGHT: Sustainable Living | | BRIGHT: TheoremQA Questions | | BRIGHT: TheoremQA Theorems | | DL-HARD | | DL 2019 | | DL 2020 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | nDCG@10 | R@100 | nDCG@10 | R@100 | nDCG@10 | R@100 | nDCG@10 | R@100 | nDCG@10 | R@100 | nDCG@10 | R@100 | | | | | | | | | | | | | | | | | | | | | | | | | nDCG@10 | R@1k | nDCG@10 | R@1k | nDCG@10 | R@1k | |
| csqe | BGE-base-en-v1.5 | 0.6218 | 0.9915 | 0.4242 | 0.5229 | 0.4067 | 0.7384 | 0.7553 | 0.9633 | 0.7879 | 0.1431 | 0.4631 | 0.5075 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.4144 | 0.8640 | 0.7551 | 0.9009 | 0.7139 | 0.8968 | |
| ArguAna · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-arguana --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| ArguAna · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-arguana.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| ArguAna · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-arguana-test run.txt` |
| DBPedia · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-dbpedia-entity --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DBPedia · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| DBPedia · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-dbpedia-entity-test run.txt` |
| FiQA · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-fiqa --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| FiQA · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-fiqa.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| FiQA · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-fiqa-test run.txt` |
| SciFact · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-scifact --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| SciFact · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-scifact.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| SciFact · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-scifact-test run.txt` |
| COVID · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-trec-covid --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| COVID · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-trec-covid.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| COVID · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-trec-covid-test run.txt` |
| News · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-trec-news --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| News · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-trec-news.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| News · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-trec-news-test run.txt` |
| DL-HARD · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.dlhard --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL-HARD · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index msmarco-v1-passage.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| DL-HARD · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 /mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt` |
| DL 2019 · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.trecdl2019 --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL 2019 · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index msmarco-v1-passage.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| DL 2019 · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 dl19-passage run.txt` |
| DL 2020 · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.trecdl2020 --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL 2020 · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index msmarco-v1-passage.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| DL 2020 · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 dl20-passage run.txt` |
| csqe | BM25 | 0.3977 | 0.9445 | 0.3899 | 0.5136 | 0.2473 | 0.5835 | 0.7206 | 0.9487 | 0.6994 | 0.1638 | 0.4790 | 0.5909 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3658 | 0.7873 | 0.6899 | 0.9035 | 0.6548 | 0.8871 | |
| ArguAna · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-arguana --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| ArguAna · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-arguana.flat --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| ArguAna · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-arguana-test run.txt` |
| DBPedia · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-dbpedia-entity --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DBPedia · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-dbpedia-entity.flat --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| DBPedia · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-dbpedia-entity-test run.txt` |
| FiQA · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-fiqa --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| FiQA · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-fiqa.flat --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| FiQA · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-fiqa-test run.txt` |
| SciFact · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-scifact --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| SciFact · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-scifact.flat --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| SciFact · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-scifact-test run.txt` |
| COVID · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-trec-covid --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| COVID · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-trec-covid.flat --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| COVID · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-trec-covid-test run.txt` |
| News · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-trec-news --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| News · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-trec-news.flat --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| News · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-trec-news-test run.txt` |
| DL-HARD · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.dlhard --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL-HARD · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index msmarco-v1-passage --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| DL-HARD · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 /mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt` |
| DL 2019 · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.trecdl2019 --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL 2019 · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index msmarco-v1-passage --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| DL 2019 · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 dl19-passage run.txt` |
| DL 2020 · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.trecdl2020 --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL 2020 · 2 retrieve · pyserini · BM25 (lexical): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index msmarco-v1-passage --topics outputs/reproduce/queries/reformulated_queries.tsv --bm25 --k1 0.9 --b 0.4 --output run.txt --hits 1000` |
| DL 2020 · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 dl20-passage run.txt` |
| csqe | SPLADE++ | 0.3801 | 0.9829 | 0.3962 | 0.5232 | 0.3294 | 0.6748 | 0.7065 | 0.9593 | 0.6811 | 0.1116 | 0.4502 | 0.5018 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3690 | 0.8341 | 0.6936 | 0.9193 | 0.6796 | 0.9397 | |
| ArguAna · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-arguana --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| ArguAna · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-arguana.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| ArguAna · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-arguana-test run.txt` |
| DBPedia · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-dbpedia-entity --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DBPedia · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-dbpedia-entity.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| DBPedia · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-dbpedia-entity-test run.txt` |
| FiQA · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-fiqa --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| FiQA · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-fiqa.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| FiQA · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-fiqa-test run.txt` |
| SciFact · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-scifact --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| SciFact · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-scifact.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| SciFact · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-scifact-test run.txt` |
| COVID · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-trec-covid --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| COVID · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-trec-covid.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| COVID · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-trec-covid-test run.txt` |
| News · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-trec-news --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| News · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-trec-news.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| News · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-trec-news-test run.txt` |
| DL-HARD · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.dlhard --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL-HARD · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index msmarco-v1-passage.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| DL-HARD · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 /mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt` |
| DL 2019 · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.trecdl2019 --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL 2019 · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index msmarco-v1-passage.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| DL 2019 · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 dl19-passage run.txt` |
| DL 2020 · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.trecdl2020 --method csqe --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL 2020 · 2 retrieve · pyserini · SPLADE++ (learned_sparse): `python -m pyserini.search.lucene --threads 16 --batch-size 128 --index msmarco-v1-passage.splade-pp-ed --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder naver/splade-cocondenser-ensembledistil --output run.txt --hits 1000 --impact` |
| DL 2020 · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 dl20-passage run.txt` |
| genqr | BGE-base-en-v1.5 | 0.6256 | 0.9893 | 0.3555 | 0.4693 | 0.3924 | 0.7330 | 0.7480 | 0.9700 | 0.7784 | 0.1475 | 0.4641 | 0.5089 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3870 | 0.8402 | 0.7023 | 0.8650 | 0.6903 | 0.8516 | |
| ArguAna · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-arguana --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| ArguAna · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-arguana.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| ArguAna · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-arguana-test run.txt` |
| DBPedia · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-dbpedia-entity --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DBPedia · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| DBPedia · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-dbpedia-entity-test run.txt` |
| FiQA · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-fiqa --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| FiQA · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-fiqa.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| FiQA · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-fiqa-test run.txt` |
| SciFact · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-scifact --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| SciFact · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-scifact.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| SciFact · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-scifact-test run.txt` |
| COVID · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-trec-covid --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| COVID · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-trec-covid.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| COVID · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-trec-covid-test run.txt` |
| News · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset beir-v1.0.0-trec-news --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| News · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index beir-v1.0.0-trec-news.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| News · 3 evaluate · trec_eval · nDCG@10 + R@100: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 beir-v1.0.0-trec-news-test run.txt` |
| DL-HARD · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.dlhard --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL-HARD · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index msmarco-v1-passage.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| DL-HARD · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 /mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt` |
| DL 2019 · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.trecdl2019 --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL 2019 · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index msmarco-v1-passage.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| DL 2019 · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 dl19-passage run.txt` |
| DL 2020 · 1 reformulate · querygym → reformulated_queries.tsv: `python examples/querygym_pyserini/pipeline.py --dataset msmarco-v1-passage.trecdl2020 --method genqr --model openai/gpt-4.1 --steps reformulate --temperature 1 --max-tokens 128 --method-params '{"num_examples":4,"train_split":"train"}' --output-dir outputs/reproduce` |
| DL 2020 · 2 retrieve · pyserini · BGE-base-en-v1.5 (dense): `python -m pyserini.search.faiss --threads 16 --batch-size 128 --index msmarco-v1-passage.bge-base-en-v1.5 --topics outputs/reproduce/queries/reformulated_queries.tsv --encoder BAAI/bge-base-en-v1.5 --output run.txt --hits 1000` |
| DL 2020 · 3 evaluate · trec_eval · nDCG@10 + R@1k: `python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 dl20-passage run.txt` |
| genqr | BM25 | 0.4060 | 0.9495 | 0.3442 | 0.4635 | 0.2302 | 0.5818 | 0.7262 | 0.9632 | 0.6869 | 0.1627 | 0.4647 | 0.6096 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.2921 | 0.7434 | 0.5479 | 0.8282 | 0.5368 | 0.8402 | |
| 1 reformulate querygym → reformulated_queries.tsv python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce 2 retrieve pyserini · BM25 (lexical) python -m pyserini.search.lucene \ --threads 16 --batch-size 128 \ --index beir-v1.0.0-arguana.flat \ --topics outputs/reproduce/queries/reformulated_queries.tsv \ --bm25 --k1 0.9 --b 0.4 \ --output run.txt \ --hits 1000 3 evaluate trec_eval · nDCG@10 + R@100 python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \ beir-v1.0.0-arguana-test run.txt 1 reformulate querygym → reformulated_queries.tsv python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce 2 retrieve pyserini · BM25 (lexical) python -m pyserini.search.lucene \ --threads 16 --batch-size 128 \ --index beir-v1.0.0-dbpedia-entity.flat \ --topics outputs/reproduce/queries/reformulated_queries.tsv \ --bm25 --k1 0.9 --b 0.4 \ --output run.txt \ --hits 1000 3 evaluate trec_eval · nDCG@10 + R@100 python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \ beir-v1.0.0-dbpedia-entity-test run.txt 1 reformulate querygym → reformulated_queries.tsv python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce 2 retrieve pyserini · BM25 (lexical) python -m pyserini.search.lucene \ --threads 16 --batch-size 128 \ --index beir-v1.0.0-fiqa.flat \ --topics outputs/reproduce/queries/reformulated_queries.tsv \ --bm25 --k1 0.9 --b 0.4 \ --output run.txt \ --hits 1000 3 evaluate trec_eval · nDCG@10 + R@100 python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \ beir-v1.0.0-fiqa-test run.txt 1 reformulate querygym → reformulated_queries.tsv python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| genqr | SPLADE++ | 0.3755 | 0.9836 | 0.3827 | 0.5414 | 0.3243 | 0.6774 | 0.7277 | 0.9500 | 0.6820 | 0.1193 | 0.4256 | 0.4877 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3800 | 0.8488 | 0.7065 | 0.9333 | 0.6260 | 0.9143 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method genqr \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| genqr_ensemble | BGE-base-en-v1.5 | 0.6187 | 0.9900 | 0.3759 | 0.4961 | 0.4029 | 0.7456 | 0.7589 | 0.9700 | 0.7999 | 0.1443 | 0.4748 | 0.5249 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3572 | 0.8633 | 0.7034 | 0.8870 | 0.6826 | 0.8699 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| genqr_ensemble | BM25 | 0.4073 | 0.9566 | 0.3600 | 0.4765 | 0.2388 | 0.5804 | 0.7251 | 0.9666 | 0.7528 | 0.1839 | 0.4860 | 0.6293 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.2697 | 0.7775 | 0.5589 | 0.8685 | 0.5528 | 0.8613 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| genqr_ensemble | SPLADE++ | 0.3806 | 0.9808 | 0.3643 | 0.5365 | 0.3014 | 0.6536 | 0.7175 | 0.9433 | 0.6731 | 0.1198 | 0.4438 | 0.5053 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3047 | 0.8207 | 0.6859 | 0.9020 | 0.5857 | 0.9141 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method genqr_ensemble \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| lamer | BGE-base-en-v1.5 | 0.6204 | 0.9893 | 0.4018 | 0.4998 | 0.4080 | 0.7410 | 0.7572 | 0.9733 | 0.7796 | 0.1373 | 0.4367 | 0.4591 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.4120 | 0.8557 | 0.7032 | 0.8888 | 0.7148 | 0.9026 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| lamer | BM25 | 0.4119 | 0.9452 | 0.3989 | 0.5159 | 0.2616 | 0.5901 | 0.7253 | 0.9487 | 0.7020 | 0.1661 | 0.4799 | 0.5960 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3555 | 0.8065 | 0.6368 | 0.8566 | 0.6530 | 0.9002 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| lamer | SPLADE++ | 0.3836 | 0.9829 | 0.3559 | 0.4904 | 0.3292 | 0.6724 | 0.7182 | 0.9577 | 0.6312 | 0.1081 | 0.4520 | 0.4770 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3673 | 0.8246 | 0.6836 | 0.9065 | 0.6390 | 0.9378 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method lamer \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| mugi | BGE-base-en-v1.5 | 0.6161 | 0.9900 | 0.4400 | 0.5286 | 0.4294 | 0.7584 | 0.7569 | 0.9767 | 0.8024 | 0.1427 | 0.4898 | 0.5212 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.4038 | 0.8415 | 0.7351 | 0.8869 | 0.7203 | 0.8950 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| mugi | BM25 | 0.3758 | 0.9331 | 0.4099 | 0.5309 | 0.2641 | 0.6000 | 0.7345 | 0.9660 | 0.7137 | 0.1739 | 0.5156 | 0.6075 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3651 | 0.8216 | 0.6952 | 0.9005 | 0.6578 | 0.8996 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| mugi | SPLADE++ | 0.3703 | 0.9780 | 0.3843 | 0.5137 | 0.3352 | 0.6799 | 0.7059 | 0.9600 | 0.6458 | 0.1118 | 0.4422 | 0.5002 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3625 | 0.8111 | 0.6859 | 0.9088 | 0.6508 | 0.9199 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method mugi \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| qa_expand | BGE-base-en-v1.5 | 0.6231 | 0.9900 | 0.4005 | 0.5087 | 0.4162 | 0.7452 | 0.7367 | 0.9600 | 0.7954 | 0.1419 | 0.4697 | 0.4852 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3739 | 0.8543 | 0.7370 | 0.8936 | 0.7074 | 0.8754 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| qa_expand | BM25 | 0.3970 | 0.9324 | 0.3699 | 0.4890 | 0.2643 | 0.5814 | 0.7063 | 0.9403 | 0.7065 | 0.1620 | 0.4502 | 0.5608 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3018 | 0.7570 | 0.6832 | 0.8495 | 0.6418 | 0.8787 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| qa_expand | SPLADE++ | 0.3823 | 0.9801 | 0.3873 | 0.5289 | 0.3399 | 0.6821 | 0.6964 | 0.9493 | 0.6941 | 0.1152 | 0.4266 | 0.4566 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3552 | 0.8034 | 0.7335 | 0.9170 | 0.6739 | 0.9260 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method qa_expand \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (ZS) | BGE-base-en-v1.5 | 0.6187 | 0.9900 | 0.4311 | 0.5221 | 0.4151 | 0.7489 | 0.7609 | 0.9633 | 0.8061 | 0.1454 | 0.4761 | 0.5108 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3786 | 0.8591 | 0.7281 | 0.8995 | 0.7393 | 0.9056 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (COT) | BGE-base-en-v1.5 | 0.6186 | 0.9886 | 0.3678 | 0.4556 | 0.4009 | 0.7483 | 0.7580 | 0.9633 | 0.7984 | 0.1380 | 0.4331 | 0.4763 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3755 | 0.8505 | 0.7125 | 0.8877 | 0.6720 | 0.8756 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (FS) | BGE-base-en-v1.5 | 0.6179 | 0.9893 | 0.4302 | 0.5303 | 0.4205 | 0.7542 | 0.7519 | 0.9667 | 0.8039 | 0.1411 | 0.4715 | 0.5157 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.4074 | 0.8726 | 0.7272 | 0.8890 | 0.7141 | 0.8948 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (ZS) | BM25 | 0.3970 | 0.9324 | 0.4062 | 0.5051 | 0.2599 | 0.6002 | 0.7203 | 0.9477 | 0.7430 | 0.1704 | 0.4980 | 0.5858 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3502 | 0.7811 | 0.6873 | 0.8924 | 0.6625 | 0.8942 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (COT) | BM25 | 0.4028 | 0.9374 | 0.3934 | 0.4775 | 0.2578 | 0.5843 | 0.7135 | 0.9510 | 0.7277 | 0.1696 | 0.4656 | 0.5829 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3291 | 0.7737 | 0.6528 | 0.8777 | 0.6239 | 0.8781 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (FS) | BM25 | 0.4012 | 0.9410 | 0.4010 | 0.5083 | 0.2684 | 0.5993 | 0.7123 | 0.9493 | 0.7081 | 0.1639 | 0.4801 | 0.5842 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3562 | 0.8042 | 0.6904 | 0.8861 | 0.6746 | 0.8984 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (ZS) | SPLADE++ | 0.3819 | 0.9808 | 0.3947 | 0.5209 | 0.3301 | 0.6766 | 0.7035 | 0.9553 | 0.6340 | 0.1089 | 0.4517 | 0.4786 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3377 | 0.8389 | 0.7000 | 0.9142 | 0.6875 | 0.9372 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train","mode":"zs"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (COT) | SPLADE++ | 0.3820 | 0.9801 | 0.3926 | 0.5319 | 0.3154 | 0.6513 | 0.7120 | 0.9460 | 0.6858 | 0.1056 | 0.4160 | 0.4741 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3308 | 0.8456 | 0.6877 | 0.9153 | 0.6534 | 0.9089 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"cot","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| Q2D (FS) | SPLADE++ | 0.3826 | 0.9808 | 0.3910 | 0.5192 | 0.3446 | 0.6890 | 0.7093 | 0.9567 | 0.6591 | 0.1099 | 0.4302 | 0.5009 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3771 | 0.8396 | 0.6932 | 0.9068 | 0.6749 | 0.9389 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2doc \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"mode":"fs","num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| query2e | BGE-base-en-v1.5 | 0.6192 | 0.9900 | 0.3249 | 0.4268 | 0.3920 | 0.7411 | 0.7417 | 0.9633 | 0.7741 | 0.1404 | 0.4448 | 0.4848 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3779 | 0.8306 | 0.6970 | 0.8701 | 0.6422 | 0.8184 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BGE-base-en-v1.5 (dense)
python -m pyserini.search.faiss \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.bge-base-en-v1.5 \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder BAAI/bge-base-en-v1.5 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| query2e | BM25 | 0.4062 | 0.9381 | 0.3778 | 0.4772 | 0.2690 | 0.5930 | 0.7089 | 0.9403 | 0.7150 | 0.1772 | 0.4633 | 0.5807 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3446 | 0.7639 | 0.5935 | 0.8698 | 0.5759 | 0.8594 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.flat \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · BM25 (lexical)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--bm25 --k1 0.9 --b 0.4 \
--output run.txt \
--hits 1000
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||
| query2e | SPLADE++ | 0.3818 | 0.9808 | 0.3936 | 0.5477 | 0.3282 | 0.6670 | 0.7187 | 0.9393 | 0.6869 | 0.1222 | 0.4206 | 0.4992 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0.3518 | 0.8380 | 0.6812 | 0.9302 | 0.6522 | 0.9252 | |
| 1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-arguana \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-arguana.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-arguana-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-dbpedia-entity \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-dbpedia-entity.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-dbpedia-entity-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-fiqa \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-fiqa.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-fiqa-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-scifact \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-scifact.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-scifact-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-covid \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-covid.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-covid-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset beir-v1.0.0-trec-news \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index beir-v1.0.0-trec-news.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@100
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.100 \
beir-v1.0.0-trec-news-test run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.dlhard \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
/mnt/data/son/Thesis/t5/data/dlhard/neutral_queries.tsv run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2019 \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl19-passage run.txt
1 reformulate querygym → reformulated_queries.tsv
python examples/querygym_pyserini/pipeline.py \
--dataset msmarco-v1-passage.trecdl2020 \
--method query2e \
--model openai/gpt-4.1 \
--steps reformulate \
--temperature 1 \
--max-tokens 128 \
--method-params '{"num_examples":4,"train_split":"train"}' \
--output-dir outputs/reproduce
2 retrieve pyserini · SPLADE++ (learned_sparse)
python -m pyserini.search.lucene \
--threads 16 --batch-size 128 \
--index msmarco-v1-passage.splade-pp-ed \
--topics outputs/reproduce/queries/reformulated_queries.tsv \
--encoder naver/splade-cocondenser-ensembledistil \
--output run.txt \
--hits 1000 --impact
3 evaluate trec_eval · nDCG@10 + R@1k
python -m pyserini.eval.trec_eval -c -m ndcg.cut.10 -m recall.1000 \
dl20-passage run.txt | ||||||||||||||||||||||||||||||||||||||||||||