Multi-Stage Ranking
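Multi-Stage Ranking is a retrieval architecture that passes a shrinking set of candidate documents through progressively more complex and expensive models. It balances the “Efficiency vs. Effectiveness” trade-off: cheap models scan the full collection, and costly models see only the candidates that survive the earlier stages.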
The Funnel Analogy
Searching millions of documents is like finding a needle in a haystack. You can’t use a microscope (expensive model) on every straw.
- First Stage (Retrieval): Use a leaf-blower (BM25 or DPR) to quickly grab the top 1000 candidates.
- Second Stage (Reranking): Use a magnifying glass (Neural Reranking / MonoBERT) to find the best 100 from those.
- Final Stage (Optional): Use a microscope (Heavy LLMs) to pick the perfect top 10.
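The funnel can be sketched as a loop over (scorer, cutoff) pairs, where each stage reorders the survivors of the previous one and keeps only the top few. This is a minimal illustration, not a reference implementation: `cheap_overlap`, `costly_jaccard`, and `cascade` are hypothetical names, and the two toy scorers merely stand in for a real first stage (BM25 or DPR) and a real neural reranker.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps (query, document) to a relevance score; higher is better.
Scorer = Callable[[str, str], float]


def cheap_overlap(query: str, doc: str) -> float:
    """Toy stand-in for a fast first stage: count document tokens found in the query."""
    query_terms = set(query.lower().split())
    return float(sum(1 for token in doc.lower().split() if token in query_terms))


def costly_jaccard(query: str, doc: str) -> float:
    """Toy stand-in for a slower reranker: Jaccard similarity of the token sets."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def cascade(query: str, corpus: Sequence[str],
            stages: Sequence[Tuple[Scorer, int]]) -> List[str]:
    """Run each (scorer, keep) stage on the survivors of the previous stage."""
    candidates = list(corpus)
    for scorer, keep in stages:
        ranked = sorted(candidates, key=lambda doc: scorer(query, doc), reverse=True)
        candidates = ranked[:keep]  # only the top `keep` reach the next stage
    return candidates


corpus = [
    "neural reranking with cross encoders",
    "bm25 is a lexical retrieval function",
    "cooking pasta at home",
    "dense retrieval and neural reranking for search pipelines in production systems",
]
# Stage 1 keeps 2 candidates, stage 2 keeps 1 (scaled down from 1000 and 100).
top = cascade("neural reranking", corpus, [(cheap_overlap, 2), (costly_jaccard, 1)])
print(top)
```

In a real system the cutoffs would be closer to the 1000 and 100 used above, and each scorer would wrap an index lookup or a batched model call rather than a per-document Python function.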
Standard Pipeline Structure
| Stage | Model Type | Documents Handled | Speed | Quality |
|---|---|---|---|---|
| Retrieval | BM25, DPR | Millions → 1000 | Milliseconds | Medium |
| Reranking | MonoBERT, Cross-Encoder | 1000 → 100 | Hundreds of milliseconds | High |
| Fine Reranking | monoT5, LLMs | 100 → 10 | Seconds | Maximum |
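A rough back-of-the-envelope estimate shows why the expensive stages only ever see a small candidate set. The batch sizes and per-batch timings below are invented for illustration and are not measurements of any particular model or hardware; `stage_latency_ms` is a hypothetical helper.

```python
import math


def stage_latency_ms(num_docs: int, batch_size: int, ms_per_batch: float) -> float:
    """Estimate stage latency when documents are scored in sequential batches."""
    return math.ceil(num_docs / batch_size) * ms_per_batch


# Assumed (hypothetical) costs: the reranker scores 64 documents per 10 ms batch,
# the fine reranker scores 8 documents per 100 ms batch.
rerank_ms = stage_latency_ms(1000, batch_size=64, ms_per_batch=10.0)
fine_ms = stage_latency_ms(100, batch_size=8, ms_per_batch=100.0)

# The same reranker applied to a one-million-document collection instead.
full_scan_ms = stage_latency_ms(1_000_000, batch_size=64, ms_per_batch=10.0)

print(f"rerank 1000 docs:  {rerank_ms:.0f} ms")
print(f"fine rerank 100:   {fine_ms:.0f} ms")
print(f"rerank everything: {full_scan_ms / 1000:.0f} s")
```

Under these assumed numbers, reranking 1000 candidates takes a fraction of a second, while running the same model over the whole collection would take minutes per query.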
Trade-offs: Efficiency vs. Effectiveness
- Latency: If the reranker is slow, we must pass it fewer first-stage candidates to stay within the latency budget.
- Recall: If the first stage misses the relevant document, the reranker can never find it.
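- Cost: A cross-encoder reranker needs one Transformer forward pass per query-document pair, so reranking 1000 documents means 1000 forward passes per query, which is computationally expensive.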
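The recall point can be made concrete with a small sketch: even a perfect reranker can only reorder what the first stage returned, so first-stage recall is an upper bound on end-to-end recall. The document IDs and the oracle reranker below are made up for illustration, and `recall_at_k` is a hypothetical helper.

```python
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant_ids) / len(relevant_ids)


relevant = {"d1", "d7"}

# Hypothetical first-stage output: d1 was retrieved, d7 was missed entirely.
first_stage = ["d3", "d1", "d4", "d5"]

# Oracle reranker: moves every relevant candidate to the front (stable sort).
reranked = sorted(first_stage, key=lambda doc_id: doc_id in relevant, reverse=True)

first_stage_recall = recall_at_k(first_stage, relevant, k=4)
final_recall = recall_at_k(reranked, relevant, k=2)
print(first_stage_recall, final_recall)
```

The oracle puts `d1` first, but `d7` never entered the candidate list, so the final recall cannot exceed the first stage's recall.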
Connections
- Components: Neural Reranking, BM25, Dense Retrieval.
- Models used: MonoBERT, DPR.
- Solves: The problem that complex models are too slow for full-collection search.