Boolean Retrieval

Boolean Retrieval is a retrieval model where documents are represented as sets of terms, and queries are expressed as Boolean expressions of terms using operators like AND, OR, and NOT.

Mechanism

  • Input: A query like (Brutus AND Caesar) AND NOT Cassius.
  • Representation: Often visualized using a Term-Document Incidence Matrix, where rows represent terms and columns represent documents (1 if present, 0 if absent).
  • Processing: The system performs bitwise operations on the incidence vectors (or more commonly, intersects posting lists in an Inverted Index).
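
The bitwise processing step can be sketched as follows. This is a minimal illustration, not a production design: the four-document collection and its contents are made up for the example, and each term's row of the incidence matrix is packed into a Python integer so that AND and NOT become `&` and `~`.

```python
# Toy collection: document names and contents are hypothetical.
docs = {
    "doc0": "brutus killed caesar",
    "doc1": "caesar praised brutus and cassius",
    "doc2": "cassius met caesar",
    "doc3": "brutus and caesar marched",
}
doc_ids = sorted(docs)


def incidence_vector(term):
    """Return an int whose bit i is 1 if doc_ids[i] contains the term."""
    vec = 0
    for i, doc_id in enumerate(doc_ids):
        if term in docs[doc_id].split():
            vec |= 1 << i
    return vec


# Mask with one bit per document, so NOT stays inside the collection.
all_docs = (1 << len(doc_ids)) - 1

brutus = incidence_vector("brutus")
caesar = incidence_vector("caesar")
cassius = incidence_vector("cassius")

# (Brutus AND Caesar) AND NOT Cassius
result = (brutus & caesar) & (all_docs & ~cassius)

# Decode the set bits back into document names.
matches = [doc_ids[i] for i in range(len(doc_ids)) if (result >> i) & 1]
print(matches)  # ['doc0', 'doc3']
```

The mask `all_docs` is needed because `~` on a Python integer flips infinitely many bits; masking restricts the complement to the documents that actually exist.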

Match vs. Rank

In Boolean Retrieval there is no ranking. A document either satisfies the Boolean condition and is included in the result set, or it does not and is excluded. No document in the result set counts as “more relevant” than another.
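
To make the match-or-no-match behaviour concrete, here is a small sketch of the same query evaluated on an inverted index. The postings lists (sorted document IDs) are invented for illustration, and the helper names `intersect` and `and_not` are hypothetical. Note that the output is a bare list of document IDs with no scores attached.

```python
# Hypothetical inverted index: term -> sorted list of document IDs.
index = {
    "brutus": [1, 2, 4, 11],
    "caesar": [1, 2, 4, 5, 6],
    "cassius": [2, 7],
}


def intersect(p1, p2):
    """Merge-intersect two sorted postings lists (AND)."""
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out


def and_not(p1, p2):
    """Keep the IDs in p1 that do not occur in p2 (both lists sorted)."""
    j = 0
    out = []
    for doc_id in p1:
        # Advance p2 until it reaches or passes the current ID.
        while j < len(p2) and p2[j] < doc_id:
            j += 1
        if j == len(p2) or p2[j] != doc_id:
            out.append(doc_id)
    return out


# (Brutus AND Caesar) AND NOT Cassius
hits = and_not(intersect(index["brutus"], index["caesar"]), index["cassius"])
print(hits)  # [1, 4]
```

The order of `hits` reflects document ID order only. Documents 1 and 4 both match, and nothing in the result says which one is the better answer.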

Pros and Cons

Pros | Cons
Precise control for expert users | No relevance ranking (all or nothing)
Predictable and transparent results | “Feast or famine”: too many or zero results
Efficient technical implementation | Difficult for non-expert users to write queries
