Approximate Nearest Neighbor

ANN refers to a class of algorithms designed to find vectors in a high-dimensional space that are “close” to a query vector, sacrificing perfect accuracy for significant gains in speed and memory efficiency.

Accuracy vs. Latency

In Dense Retrieval, searching for the exact nearest neighbor in a collection of 20 million vectors requires a linear scan (O(N)), which is too slow. ANN algorithms create data structures that allow searching in roughly O(log N) or O(√N) time by only checking a promising subset of the data.
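To make the O(N) baseline concrete, here is a minimal brute-force scan in numpy (toy sizes chosen for illustration; every stored vector is compared against the query):

```python
import numpy as np

# Hypothetical toy corpus: N=10,000 vectors of dimension 64.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 64)).astype(np.float32)
query = rng.normal(size=(64,)).astype(np.float32)

# Linear scan: O(N * d) work per query -- distance to ALL vectors,
# then take the minimum. This is what ANN indexes avoid.
dists = np.linalg.norm(corpus - query, axis=1)
nearest = int(np.argmin(dists))
```

At 20 million vectors and hundreds of dimensions, this per-query cost is what makes exact search impractical for online retrieval.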

Key Algorithms & Techniques

1. Inverted File Index (IVF)

  • Mechanism: Clusters the vector space (e.g., using k-means). At search time, only the vectors assigned to the few nearest cluster centroids are searched.
  • Trade-off: High speed, but can miss neighbors at cluster boundaries.
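The IVF idea above can be sketched in plain numpy: a few Lloyd iterations of k-means build the centroids, and search probes only `nprobe` clusters instead of the full corpus. All sizes and parameter names here are hypothetical; in practice this is what `faiss.IndexIVFFlat` implements far more efficiently.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_clusters, nprobe = 32, 16, 4          # hypothetical parameters
corpus = rng.normal(size=(5_000, d)).astype(np.float32)

# --- Train: a few Lloyd iterations of k-means to get centroids ---
centroids = corpus[rng.choice(len(corpus), n_clusters, replace=False)]
for _ in range(10):
    assign = np.argmin(
        ((corpus[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    for c in range(n_clusters):
        members = corpus[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)

# --- Build: inverted lists map each centroid to its member vector ids ---
inv_lists = {c: np.where(assign == c)[0] for c in range(n_clusters)}

def ivf_search(query):
    """Visit only the nprobe closest clusters, not the whole corpus."""
    order = np.argsort(((centroids - query) ** 2).sum(-1))[:nprobe]
    cand = np.concatenate([inv_lists[c] for c in order])
    dists = ((corpus[cand] - query) ** 2).sum(-1)
    return int(cand[np.argmin(dists)])

query = rng.normal(size=(d,)).astype(np.float32)
approx_id = ivf_search(query)
```

The boundary problem is visible in the code: if the true neighbor sits in a cluster outside the `nprobe` probed lists, it can never be returned.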

2. HNSW (Hierarchical Navigable Small Worlds)

  • Mechanism: A graph-based approach where vectors are nodes. It builds a multi-layered graph where top layers have “long” skip-links for fast navigation and bottom layers have short-range links for precision.
  • Status: Current gold-standard for speed/accuracy trade-off.
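The core of HNSW is a greedy graph walk: hop to whichever neighbor is closest to the query until no neighbor improves. The sketch below shows just that walk on a single flat k-NN graph (real HNSW adds the hierarchy of layers and a beam of candidates); all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 16, 8
corpus = rng.normal(size=(1_000, d)).astype(np.float32)

# Build a k-nearest-neighbor graph. (Real HNSW stacks several such
# layers, with sparser long-range links in the upper layers.)
sq = (corpus ** 2).sum(-1)
d2 = sq[:, None] + sq[None, :] - 2.0 * corpus @ corpus.T
neighbors = np.argsort(d2, axis=1)[:, 1:k + 1]   # column 0 is the node itself

def greedy_search(query, entry=0):
    """Walk the graph, hopping to the closest neighbor until none is closer."""
    best = entry
    best_d = float(((corpus[best] - query) ** 2).sum())
    while True:
        cand = neighbors[best]
        cand_d = ((corpus[cand] - query) ** 2).sum(-1)
        i = int(np.argmin(cand_d))
        if cand_d[i] >= best_d:
            return best                  # local minimum reached
        best, best_d = int(cand[i]), float(cand_d[i])

query = rng.normal(size=(d,)).astype(np.float32)
found = greedy_search(query)
```

The hierarchy exists precisely because this single-layer walk can stall in a local minimum far from the true neighbor; long links in upper layers get the walk into the right region quickly before the fine-grained bottom layer takes over.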

3. Product Quantization (PQ)

  • Mechanism: Compresses vectors by splitting them into sub-vectors and quantizing each part.
  • Benefit: Massive memory reduction (e.g., 10x-100x), allowing huge indexes to fit in RAM.
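A minimal PQ sketch, assuming illustrative parameters (m=4 sub-vectors, 256 codes per sub-space, so one byte per code): each 32-dim float32 vector (128 bytes) is replaced by 4 bytes of codes. Real PQ trains each codebook with k-means per sub-space; here random corpus rows stand in as centroids to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, ks = 32, 4, 256          # m sub-vectors, ks codes each (1 byte/code)
corpus = rng.normal(size=(5_000, d)).astype(np.float32)
subs = corpus.reshape(len(corpus), m, d // m)    # split into m sub-vectors

# "Train" one small codebook per sub-space (random rows as stand-in
# centroids -- real PQ runs k-means in each sub-space).
codebooks = subs[rng.choice(len(corpus), ks, replace=False)]  # (ks, m, d/m)
codebooks = codebooks.transpose(1, 0, 2)                      # (m, ks, d/m)

# Encode: each vector becomes m one-byte codes instead of d float32s.
codes = np.empty((len(corpus), m), dtype=np.uint8)
for j in range(m):
    dist = ((subs[:, j, None, :] - codebooks[j][None]) ** 2).sum(-1)
    codes[:, j] = np.argmin(dist, axis=1)

original_bytes = corpus.nbytes        # 5000 * 32 * 4 = 640,000
compressed_bytes = codes.nbytes       # 5000 * 4    =  20,000
ratio = original_bytes / compressed_bytes   # 32x smaller
```

Distances are then computed against the reconstructed (quantized) vectors, which is where PQ trades a little accuracy for its large memory win.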

Trade-offs

  • Recall: The fraction of queries for which the true nearest neighbor is actually found.
  • Latency: Time per query.
  • Index Size: Memory required on disk/RAM.
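Recall@1 is measured by comparing an ANN method's answers against exact search on a set of held-out queries. The sketch below uses a deliberately crude stand-in "ANN" (scanning a random half of the corpus, a hypothetical method chosen only so the snippet is self-contained) to show the evaluation loop itself:

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.normal(size=(2_000, 16)).astype(np.float32)
queries = rng.normal(size=(100, 16)).astype(np.float32)

def exact_nn(q):
    return int(np.argmin(((corpus - q) ** 2).sum(-1)))

def sampled_nn(q, frac=0.5):
    """Stand-in ANN: scans only a random half of the corpus."""
    idx = rng.choice(len(corpus), int(frac * len(corpus)), replace=False)
    return int(idx[np.argmin(((corpus[idx] - q) ** 2).sum(-1))])

# Recall@1: fraction of queries whose true nearest neighbor is returned.
hits = sum(exact_nn(q) == sampled_nn(q) for q in queries)
recall_at_1 = hits / len(queries)
```

Tuning an ANN index (e.g., nprobe in IVF, or search-width in HNSW) moves a system along this recall/latency curve.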

Connections

  • Tooling: FAISS (Facebook AI Similarity Search).
  • Usage: Enables first-stage retrieval for DPR.
  • Part of: Multi-Stage Ranking infrastructure.

Appears In