Contrastive Learning
Contrastive Learning is a training paradigm that learns representations by encouraging positive pairs to be close together in the embedding space and negative pairs to be far apart.
InfoNCE Loss
For a query $q$, a positive document $d^+$, and a set of negative documents $\{d^-_1, \dots, d^-_n\}$, the contrastive loss is often defined as:

$$\mathcal{L} = -\log \frac{\exp(\mathrm{sim}(q, d^+)/\tau)}{\exp(\mathrm{sim}(q, d^+)/\tau) + \sum_{i=1}^{n} \exp(\mathrm{sim}(q, d^-_i)/\tau)}$$
where:
- $\mathrm{sim}(\cdot, \cdot)$ — A similarity metric (e.g., dot product or cosine similarity)
- $\tau$ — Temperature parameter scaling the sharpness of the distribution
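The loss above can be sketched numerically. This is a minimal single-query version, assuming similarity scores have already been computed; the function name and the default temperature of 0.05 are illustrative choices, not from the source:

```python
import numpy as np

def info_nce_loss(sim_pos, sim_negs, tau=0.05):
    """InfoNCE loss for one query: the negative log-softmax of the
    positive's similarity against the positive plus all negatives.

    sim_pos: scalar similarity sim(q, d+)
    sim_negs: array of similarities sim(q, d-_i)
    tau: temperature scaling the logits
    """
    logits = np.concatenate(([sim_pos], np.asarray(sim_negs, dtype=float))) / tau
    # log-sum-exp with max subtraction for numerical stability
    m = logits.max()
    log_denom = m + np.log(np.exp(logits - m).sum())
    return -(logits[0] - log_denom)
```

As the positive's similarity rises relative to the negatives', the loss shrinks toward zero, which is exactly the pressure that pulls positive pairs together in the embedding space.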
In Information Retrieval
Contrastive learning is the foundation of Dense Retrieval models like DPR.
- Positive Pair: (Query, Relevant Document).
- Negative Pair: (Query, Irrelevant Document).
Negative Strategies
The choice of negatives is among the most important factors in contrastive training:
- Random Negatives: Documents sampled randomly from the collection (too easy).
- In-batch Negatives: Efficiently using other positive documents in the current training batch as negatives.
- Hard Negative Mining: Deliberately selecting documents that “look” relevant but aren’t (e.g., high BM25 score but labeled irrelevant).
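The in-batch strategy above can be sketched as a batched loss: score every query against every document in the batch, and treat the diagonal of the similarity matrix as the positives. This is a minimal numpy sketch under the assumption that embeddings are already L2-normalized; the function name and temperature are illustrative:

```python
import numpy as np

def in_batch_negative_loss(q_emb, d_emb, tau=0.05):
    """Mean InfoNCE over a batch where the i-th document is the
    positive for the i-th query and every other document in the
    batch serves as a negative.

    q_emb, d_emb: (batch, dim) L2-normalized embeddings.
    """
    sims = q_emb @ d_emb.T / tau  # (batch, batch) similarity logits
    # row-wise log-softmax; the correct "class" for row i is column i
    m = sims.max(axis=1, keepdims=True)
    log_probs = sims - m - np.log(np.exp(sims - m).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

The appeal is efficiency: a batch of $B$ positive pairs yields $B-1$ free negatives per query from embeddings that were computed anyway.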
Connections
- Main application: Training DPR bi-encoders.
- Optimization technique: Hard Negative Mining.
- Metric: Often evaluated using MRR or Recall@k.
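For reference, the MRR metric mentioned above can be computed in a few lines. This is a generic sketch, not tied to any particular retrieval library; the function name and argument layout are assumptions:

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: average over queries of 1 / rank of the first relevant
    document, contributing 0 when no relevant document is retrieved.

    ranked_lists: one ranked list of doc ids per query
    relevant: one set of relevant doc ids per query
    """
    total = 0.0
    for docs, rel in zip(ranked_lists, relevant):
        for rank, doc in enumerate(docs, start=1):
            if doc in rel:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

For example, if one query ranks its relevant document first and another ranks it second, MRR is (1 + 1/2) / 2 = 0.75.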