Implicit and Explicit Feedback

Definition

The two kinds of preference signal a Recommender System learns from.

  • Explicit feedback is a deliberate, stated judgement of preference: a 1–5 star rating, a thumbs up/down, a review score. The value is (roughly) on an interpretable preference scale, and a missing entry means “not rated”, i.e. genuinely unknown.
  • Implicit feedback is an indirect behavioural signal interpreted as preference: a click, play, watch-duration, purchase, skip, add-to-playlist. It is observed as a by-product of using the system. The signal is (usually) positive-only — we see what the user did, never an explicit “I dislike this”.

Both are recorded in the user–item Interaction Matrix $R \in \mathbb{R}^{m \times n}$ with $m$ users and $n$ items, but the semantics of an observed value and of a missing value differ, which changes how we model and evaluate.

Intuition

What does a missing cell mean?

The crux of the distinction is the meaning of an unobserved entry $(u, i)$.

  • With explicit ratings, an unrated cell is missing data: the user simply has not told us. We should not treat it as a negative. Models are fit on the observed entries only.
  • With implicit feedback, there is no “negative” channel. A user who never clicked an item might (a) dislike it, or (b) never have seen it. Treating all non-interactions as negatives is wrong (most are unseen), but ignoring them entirely leaves only positive examples and the model collapses to predicting “everything is good”.
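
The difference in missing-cell semantics can be made concrete with a small sketch. The toy matrices below are hypothetical (not from RS-L01); the only assumption is that explicit "not rated" cells are stored as NaN, while implicit non-interactions are stored as 0 and read as unlabeled rather than negative.

```python
import numpy as np

# Hypothetical toy data: 3 users x 4 items.
# Explicit ratings: np.nan marks "not rated" (unknown), never 0.
ratings = np.array([
    [5.0, np.nan, 3.0, np.nan],
    [np.nan, 4.0, np.nan, 1.0],
    [2.0, np.nan, np.nan, 5.0],
])
observed = ~np.isnan(ratings)  # boolean mask of rated cells

# Using observed entries only: unknown cells do not enter the statistic.
mean_observed = ratings[observed].mean()

# Imputing 0 for missing cells treats "unknown" as "worst possible rating"
# and drags the mean down.
mean_zero_imputed = np.nan_to_num(ratings, nan=0.0).mean()

# Implicit feedback: 1 = interaction logged, 0 = nothing logged.
# A 0 is unlabeled (unseen or disliked), not a confirmed negative.
clicks = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
])
positives = np.argwhere(clicks == 1)  # (user, item) pairs we can trust
unlabeled = np.argwhere(clicks == 0)  # candidates for negative sampling

print(mean_observed, mean_zero_imputed, len(positives), len(unlabeled))
```

The gap between the two means shows why explicit models are fit on observed entries only, and the `unlabeled` pool is what negative sampling later draws from.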

Practical reality (RS-L01 case studies): explicit ratings are scarce, expensive, and biased (few users rate; those who do are not a random sample). Implicit signals are abundant and cheap — every play, skip, and purchase is logged — which is why production systems (Spotify Daily Mix, bol.com deals, YouTube feed) run almost entirely on implicit feedback. The price is noise and ambiguity: a play could be a mis-click, a purchase could be a gift.

Mathematical Formulation

The same architecture can be trained on either signal; the loss function is what differs. RS-L01 frames Neural Collaborative Filtering as binary classification with label $y_{ui} \in \{0, 1\}$ and prediction $\hat{y}_{ui}$, and gives two losses:

Explicit feedback — (weighted) squared loss

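$$
\mathcal{L}_{\text{sq}} = \sum_{(u,i) \in \mathcal{O}} w_{ui}\,\bigl(r_{ui} - \hat{r}_{ui}\bigr)^2
$$

where:

  • $\mathcal{O}$: set of observed (rated) user–item pairs; the sum runs over these only
  • $r_{ui}$: the actual rating (e.g. on a 1–5 scale)
  • $\hat{r}_{ui}$: predicted rating (e.g. dot product in Matrix Factorization, or NCF output)
  • $w_{ui}$: optional per-entry weight (set $w_{ui} = 1$ for plain squared error)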
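
A minimal NumPy sketch of this loss, assuming unrated cells are stored as NaN and predictions are available for every cell. The function name `weighted_squared_loss` and the toy arrays are illustrative, not part of RS-L01.

```python
import numpy as np

def weighted_squared_loss(ratings, predictions, weights=None):
    """Sum of weight * (rating - prediction)^2 over observed (non-NaN) ratings."""
    observed = ~np.isnan(ratings)
    if weights is None:
        weights = np.ones_like(ratings)  # all weights 1 -> plain squared error
    errors = ratings[observed] - predictions[observed]
    return float(np.sum(weights[observed] * errors ** 2))

# Hypothetical 2x2 example: two rated cells, two unknown cells.
ratings = np.array([[5.0, np.nan], [np.nan, 2.0]])
predictions = np.array([[4.0, 3.0], [1.0, 2.5]])

loss = weighted_squared_loss(ratings, predictions)
print(loss)  # errors are 1.0 and -0.5, so 1.0 + 0.25 = 1.25
```

Predictions for the two unrated cells (3.0 and 1.0) never enter the sum, which is the code-level meaning of "fit on observed entries only".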

Implicit feedback — binary cross-entropy with negative sampling

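$$
\mathcal{L}_{\text{BCE}} = -\sum_{(u,i) \in \mathcal{Y}^{+}} \log \hat{y}_{ui} \;-\; \sum_{(u,j) \in \mathcal{Y}^{-}} \log\bigl(1 - \hat{y}_{uj}\bigr)
$$

where:

  • $\mathcal{Y}^{+}$: observed interactions, treated as positives ($y_{ui} = 1$)
  • $\mathcal{Y}^{-}$: sampled unobserved pairs treated as negatives ($y_{uj} = 0$); drawn by Negative Sampling to avoid summing over the huge set of all non-interactions
  • $\hat{y}_{ui}$: predicted probability that item $i$ is relevant to user $u$ (sigmoid output)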
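
The following sketch illustrates binary cross-entropy with uniformly sampled negatives. It assumes a dense 0/1 interaction matrix, a matrix of already-computed sigmoid outputs, and that every user has at least one unobserved item. Uniform sampling and the helper names (`sample_negatives`, `bce_loss`) are choices made for illustration; production samplers are often popularity-aware.

```python
import numpy as np

def sample_negatives(interactions, num_per_positive, rng):
    """For each observed (u, i), draw items that u has not interacted with."""
    negatives = []
    for u, _ in np.argwhere(interactions == 1):
        candidates = np.flatnonzero(interactions[u] == 0)  # assumed non-empty
        for j in rng.choice(candidates, size=num_per_positive, replace=True):
            negatives.append((u, j))
    return np.array(negatives)

def bce_loss(probs, positives, negatives, eps=1e-12):
    """-sum log(p) over positives - sum log(1 - p) over sampled negatives."""
    p_pos = probs[positives[:, 0], positives[:, 1]]
    p_neg = probs[negatives[:, 0], negatives[:, 1]]
    return float(-np.sum(np.log(p_pos + eps)) - np.sum(np.log(1.0 - p_neg + eps)))

rng = np.random.default_rng(0)
interactions = np.array([[1, 0, 0], [0, 1, 0]])
# Hypothetical sigmoid outputs of some model, one per (user, item) cell.
probs = np.array([[0.9, 0.2, 0.1], [0.3, 0.8, 0.4]])

positives = np.argwhere(interactions == 1)
negatives = sample_negatives(interactions, num_per_positive=2, rng=rng)
loss = bce_loss(probs, positives, negatives)
print(negatives.shape, loss)
```

Only the sampled pairs contribute to the negative term, so the cost per step scales with the number of positives rather than with the full user–item grid.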

An alternative for implicit data is to rank a positive above a sampled negative rather than classify each. This is Bayesian Personalized Ranking (BPR) (Rendle et al., 2012), the canonical implicit-feedback objective:

BPR — pairwise ranking from implicit feedback

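$$
\mathcal{L}_{\text{BPR}} = -\sum_{(u,i,j) \in \mathcal{D}_S} \ln \sigma\bigl(\hat{x}_{ui} - \hat{x}_{uj}\bigr)
$$

where:

  • $\mathcal{D}_S$: triples $(u, i, j)$ with $i$ an observed (positive) item for $u$ and $j$ a sampled unobserved item
  • $\hat{x}_{ui}, \hat{x}_{uj}$: model scores for the positive and negative item
  • $\sigma$: sigmoid; the loss pushes the positive’s score above the negative’s, encoding "$u$ prefers $i$ over $j$" ($i >_u j$) rather than fitting an absolute value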
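
A compact sketch of the pairwise loss, assuming raw (pre-sigmoid) scores are available as a dense matrix and the triples are given as integer rows of user, positive item, negative item. The function name `bpr_loss` and the toy scores are illustrative, and the regularisation term usually added in practice is omitted.

```python
import numpy as np

def bpr_loss(scores, triples):
    """-sum ln sigmoid(score_pos - score_neg) over (u, i, j) triples."""
    u, i, j = triples[:, 0], triples[:, 1], triples[:, 2]
    diff = scores[u, i] - scores[u, j]
    # -ln(sigmoid(d)) equals ln(1 + exp(-d)); logaddexp keeps this stable.
    return float(np.sum(np.logaddexp(0.0, -diff)))

# Hypothetical raw scores for 2 users x 3 items.
scores = np.array([[2.0, 0.0, -1.0], [0.5, 1.5, 0.5]])
# Each row: user, observed item, sampled unobserved item.
triples = np.array([[0, 0, 1], [0, 0, 2], [1, 1, 0]])

loss = bpr_loss(scores, triples)
# A triple ranked the wrong way round is penalised much more heavily.
wrong_way = bpr_loss(scores, np.array([[0, 1, 0]]))
print(loss, wrong_way)
```

Only score differences matter here: adding a constant to all of one user's scores leaves the loss unchanged, which is why BPR yields a ranking rather than calibrated values.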

Key Properties / Variants

  • Missing-data semantics (the central exam point):
    • Explicit → missing = unknown; fit on observed entries only; a regression/rating-prediction task evaluated with error metrics (the implicit assumption being “missing at random”, which is itself questionable).
    • Implicit → missing = unlabeled (mostly unseen, some disliked); a ranking / one-class task; you must synthesise negatives (Negative Sampling) because there is no negative channel.
  • Task framing. Explicit feedback naturally suits rating prediction (predict $\hat{r}_{ui}$, evaluated by RMSE/MAE). Implicit feedback naturally suits Top-K Recommendation / Next-Item Prediction, evaluated by ranking metrics: Recall/HR@K, MRR, NDCG (RS-L01/RS-L02). The modern course default is implicit + ranking.
  • Confidence weighting. Implicit signals carry strength: watching a film twice or purchasing is stronger evidence than a single click. A common trick is to weight the positive term by an interaction-derived confidence (e.g. $c_{ui} = 1 + \alpha\, n_{ui}$, with $n_{ui}$ the number of logged interactions), the implicit analogue of $w_{ui}$ above.
  • Bias. Implicit logs are not an unbiased sample of preference: they are filtered through what the system already exposed and through Position Bias (RS-L02 motivates simulation and debiasing precisely because logged implicit data is biased). Explicit ratings carry selection bias (users rate what they feel strongly about).
  • Where each model sits:
Choosing the objective from the feedback type
─────────────────────────────────────────────
if feedback is EXPLICIT (ratings / scores):
    task   = rating prediction
    train  = squared loss over OBSERVED entries only      (do NOT impute 0)
    eval   = RMSE / MAE   (and/or rank the predicted ratings)
else if feedback is IMPLICIT (clicks / plays / purchases):
    positives = observed interactions
    negatives = SAMPLE from unobserved pairs              (negative sampling)
    train  = BCE  (pointwise)   or   BPR  (pairwise ranking)
    weight positives by confidence c_ui if signal strength varies
    eval   = ranking metrics: Recall/HR@K, MRR, NDCG
    beware: logged implicit data is BIASED (exposure / position)
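
The ranking metrics named in the implicit branch can be sketched for the common leave-one-out setup, assuming exactly one held-out relevant item per user. The helper names and the toy scores are hypothetical; averaging each quantity over users gives HR@K, MRR, and NDCG@K.

```python
import numpy as np

def rank_of_target(scores, target):
    """1-based rank of the target item under descending score order."""
    return int(np.sum(scores > scores[target])) + 1

def hit_rate_at_k(rank, k):
    """1 if the held-out item appears in the top k, else 0."""
    return 1.0 if rank <= k else 0.0

def reciprocal_rank(rank):
    """Contribution of one user to MRR."""
    return 1.0 / rank

def ndcg_at_k(rank, k):
    """With a single relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1)."""
    return float(1.0 / np.log2(rank + 1)) if rank <= k else 0.0

# Hypothetical model scores for one user over five candidate items.
scores = np.array([0.1, 0.7, 0.4, 0.9, 0.2])
target = 2  # index of the held-out item

rank = rank_of_target(scores, target)  # items 3 and 1 score higher -> rank 3
print(rank, hit_rate_at_k(rank, 2), hit_rate_at_k(rank, 3),
      reciprocal_rank(rank), ndcg_at_k(rank, 3))
```

Unlike RMSE/MAE, none of these depend on the absolute score values, only on where the held-out item lands in the ordering.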

Connections

Appears In