Contrastive Learning
Contrastive Learning is a training paradigm that learns representations by encouraging positive pairs to be close together in the embedding space and negative pairs to be far apart.
InfoNCE Loss
For a query $q$, a positive document $d^+$, and a set of negative documents $\{d^-_1, \dots, d^-_n\}$, the contrastive loss is often defined as:

$$\mathcal{L} = -\log \frac{\exp(\mathrm{sim}(q, d^+)/\tau)}{\exp(\mathrm{sim}(q, d^+)/\tau) + \sum_{i=1}^{n} \exp(\mathrm{sim}(q, d^-_i)/\tau)}$$
where:
- $\mathrm{sim}(\cdot, \cdot)$ — A similarity metric (e.g., dot product or cosine similarity)
- $\tau$ — Temperature parameter scaling the sharpness of the distribution
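The loss above can be sketched numerically. This is a minimal single-query version, assuming similarity scores have already been computed; the function name and the default temperature of 0.05 are illustrative choices, not from the source:

```python
import numpy as np

def info_nce_loss(sim_pos, sim_negs, tau=0.05):
    """InfoNCE loss for one query: the negative log-softmax of the
    positive's similarity against the positive plus all negatives.

    sim_pos: scalar similarity sim(q, d+)
    sim_negs: array of similarities sim(q, d-_i)
    tau: temperature scaling the logits
    """
    logits = np.concatenate(([sim_pos], np.asarray(sim_negs, dtype=float))) / tau
    # log-sum-exp with max subtraction for numerical stability
    m = logits.max()
    log_denom = m + np.log(np.exp(logits - m).sum())
    return -(logits[0] - log_denom)
```

As the positive's similarity rises relative to the negatives', the loss shrinks toward zero, which is exactly the pressure that pulls positive pairs together in the embedding space.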
In Information Retrieval
Contrastive learning is the foundation of Dense Retrieval models like DPR.
- Positive Pair: (Query, Relevant Document).
- Negative Pair: (Query, Irrelevant Document).
Negative Strategies
The choice of negatives is among the most important factors in contrastive training:
- Random Negatives: Documents sampled randomly from the collection (too easy).
- In-batch Negatives: Efficiently using other positive documents in the current training batch as negatives.
- Hard Negative Mining: Deliberately selecting documents that “look” relevant but aren’t (e.g., high BM25 score but labeled irrelevant).
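The in-batch strategy above can be sketched as a batched loss: score every query against every document in the batch, and treat the diagonal of the similarity matrix as the positives. This is a minimal numpy sketch under the assumption that embeddings are already L2-normalized; the function name and temperature are illustrative:

```python
import numpy as np

def in_batch_negative_loss(q_emb, d_emb, tau=0.05):
    """Mean InfoNCE over a batch where the i-th document is the
    positive for the i-th query and every other document in the
    batch serves as a negative.

    q_emb, d_emb: (batch, dim) L2-normalized embeddings.
    """
    sims = q_emb @ d_emb.T / tau  # (batch, batch) similarity logits
    # row-wise log-softmax; the correct "class" for row i is column i
    m = sims.max(axis=1, keepdims=True)
    log_probs = sims - m - np.log(np.exp(sims - m).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

The appeal is efficiency: a batch of $B$ positive pairs yields $B-1$ free negatives per query from embeddings that were computed anyway.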
Connections
- Main application: Training DPR bi-encoders.
- Optimization technique: Hard Negative Mining.
- Metric: Often evaluated using MRR or Recall@k.
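For reference, the MRR metric mentioned above can be computed in a few lines. This is a generic sketch, not tied to any particular retrieval library; the function name and argument layout are assumptions:

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: average over queries of 1 / rank of the first relevant
    document, contributing 0 when no relevant document is retrieved.

    ranked_lists: one ranked list of doc ids per query
    relevant: one set of relevant doc ids per query
    """
    total = 0.0
    for docs, rel in zip(ranked_lists, relevant):
        for rank, doc in enumerate(docs, start=1):
            if doc in rel:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

For example, if one query ranks its relevant document first and another ranks it second, MRR is (1 + 1/2) / 2 = 0.75.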