Dense Passage Retrieval for Open-Domain Question Answering
Karpukhin, Oguz, Min, Lewis, Wu, Edunov, Chen, Yih
What it says
DPR uses two BERT encoders — one for questions, one for passages — trained so that matching question/passage pairs have high dot product and non-matching pairs have low dot product. Training uses in-batch negatives (other passages in the same mini-batch act as negatives for free) plus hard negatives mined from BM25. At inference, all passages are embedded offline; a new question is embedded at query time and nearest neighbors are retrieved via FAISS.
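The training objective and retrieval step above can be sketched in a few lines of numpy. This is a minimal illustration, not DPR's actual implementation: real DPR encodes text with BERT and retrieves with FAISS, while here random vectors stand in for encoder outputs, retrieval is brute-force dot product, and all function names are made up for the example.

```python
import numpy as np

def in_batch_negatives_loss(q_emb, p_emb):
    """Cross-entropy over dot-product scores, where for question i the
    gold passage is p_emb[i] and every other passage in the batch is a
    free negative (the in-batch negatives trick)."""
    scores = q_emb @ p_emb.T                     # (B, B) similarity matrix
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # NLL of the diagonal (gold pairs)

def retrieve(q_vec, passage_matrix, k=5):
    """Inference: passages are embedded offline into passage_matrix;
    a new question vector is scored against all of them by dot product.
    (DPR uses FAISS for this nearest-neighbor search at scale.)"""
    scores = passage_matrix @ q_vec
    return np.argsort(-scores)[:k]

# Toy check: aligned question/passage pairs should score a much lower
# loss than random pairs.
rng = np.random.default_rng(0)
B, d = 4, 32
q = rng.normal(size=(B, d))
loss_random = in_batch_negatives_loss(q, rng.normal(size=(B, d)))
loss_matched = in_batch_negatives_loss(q, q * 10.0)  # scaled copies: diagonal dominates
```

Hard negatives from BM25 would simply be appended as extra rows of `p_emb` with no matching question, enlarging the softmax denominator.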
Why it matters
DPR is the canonical recipe for dense retrieval and the direct ancestor of every modern embedding model used in RAG. The dual-encoder, in-batch-negatives pattern is still the default way to train retrieval models in 2026. Before DPR, BM25 was the hard-to-beat baseline; afterwards, dense retrieval supplanted it as the standard approach.
Read next
- ColBERT (Khattab & Zaharia, 2020) — late-interaction retrieval, better quality at higher cost.
- Sentence-BERT (Reimers & Gurevych, 2019) — the earlier dual-encoder that set the template.
- E5 / BGE embedding models (2023–2024) — modern production-grade descendants.