Cross-Encoder

A cross-encoder processes the query and document jointly as a single input to a transformer. It can model fine-grained interactions between query and document tokens via self-attention, making it highly effective but computationally expensive.

Input:  [CLS] query tokens [SEP] document tokens [SEP]
                    ↓
              [Transformer]
                    ↓
              [CLS] → relevance score
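The joint input above can be sketched in a few lines; the whitespace split here is a stand-in for a real subword tokenizer (e.g. BERT's WordPiece), so the function name and tokenization are illustrative only:

```python
def build_cross_encoder_input(query: str, doc: str) -> list[str]:
    # Joint sequence: [CLS] query tokens [SEP] document tokens [SEP].
    # Whitespace splitting stands in for a real subword tokenizer.
    return ["[CLS]", *query.split(), "[SEP]", *doc.split(), "[SEP]"]

tokens = build_cross_encoder_input("neural ranking", "cross-encoders model token interactions")
# Inside the transformer, every query token can attend to every document token.
```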

Scoring

The relevance score is read from the final [CLS] hidden state, typically via a linear layer: s(q, d) = w · h_[CLS] + b, trained with a pointwise or pairwise ranking loss.
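A minimal sketch of such a scoring head, assuming the encoder's final [CLS] hidden state `h_cls` is already available (names, shapes, and the linear head are illustrative assumptions, not a specific library's API):

```python
import numpy as np

def relevance_score(h_cls: np.ndarray, w: np.ndarray, b: float) -> float:
    # Linear head over the [CLS] hidden state: s(q, d) = w . h_cls + b
    return float(h_cls @ w + b)

rng = np.random.default_rng(0)
h_cls = rng.standard_normal(768)   # stand-in for the encoder's [CLS] output
w = rng.standard_normal(768)       # learned projection weights
score = relevance_score(h_cls, w, 0.0)
```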

Cross-Encoder vs Bi-Encoder

| Property | Cross-Encoder | Bi-Encoder |
| --- | --- | --- |
| Query–doc interaction | Full (self-attention) | None (independent encoding) |
| Effectiveness | Higher | Lower |
| Latency | High (one forward pass per query–doc pair) | Low (pre-compute docs) |
| Use case | Reranking top-k | First-stage retrieval |
| Can pre-compute docs? | No | Yes |

In Practice

Cross-encoders are too expensive for full-collection retrieval. They’re used as rerankers in Multi-Stage Ranking: first retrieve top-k with BM25 or Dense Retrieval, then rerank with cross-encoder.
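The two-stage setup can be sketched end to end; `first_stage_score` (a cheap term-overlap stand-in for BM25) and `cross_encoder_score` are hypothetical placeholders for real models:

```python
def first_stage_score(query: str, doc: str) -> float:
    # Cheap lexical overlap, standing in for BM25 or a bi-encoder.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def cross_encoder_score(query: str, doc: str) -> float:
    # Placeholder for a full transformer pass over "[CLS] q [SEP] d [SEP]".
    return first_stage_score(query, doc) + 0.1 * len(doc.split())

def retrieve_then_rerank(query: str, collection: list[str], k: int = 3) -> list[str]:
    # Stage 1: score the whole collection cheaply, keep top-k candidates.
    top_k = sorted(collection, key=lambda d: first_stage_score(query, d), reverse=True)[:k]
    # Stage 2: rerank only those k candidates with the expensive model.
    return sorted(top_k, key=lambda d: cross_encoder_score(query, d), reverse=True)
```

The point of the design: the expensive cross-encoder runs only k times per query, while the cheap first stage touches the full collection.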
