Belief State

A belief state is a probability distribution over the hidden states of a POMDP, representing the agent's uncertainty about which state it is in given its history of observations and actions:

b_t(s) = P(s_t = s | a_0, o_1, ..., a_{t-1}, o_t)

Bayesian Update

Belief State Update

After taking action a and observing o, the belief is updated via Bayes' rule:

b'(s') = η · O(o | s', a) · Σ_s T(s' | s, a) · b(s)

where:

  • O(o | s', a) — observation likelihood (how likely observation o is if the true next state is s')
  • T(s' | s, a) — transition probability of moving from s to s' under action a
  • b(s) — prior belief about state s
  • η — normalizing constant, so the result sums to 1
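The update above can be sketched directly in code. This is a minimal illustration, assuming the models are stored as NumPy arrays indexed `T[a, s, s']` and `O[a, s', o]` (an assumed layout, not one fixed by the note):

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """One Bayes'-rule belief update for a discrete POMDP.

    b: prior belief over states, shape (S,)
    a: action index taken
    o: observation index received
    T: transition model, T[a, s, s'] = P(s' | s, a)
    O: observation model, O[a, s', o] = P(o | s', a)
    """
    # Predict step: push the prior belief through the transition model
    predicted = b @ T[a]                  # P(s' | b, a), shape (S,)
    # Correct step: weight each state by the observation likelihood
    unnormalized = O[a, :, o] * predicted
    # Normalize so the posterior sums to 1
    return unnormalized / unnormalized.sum()
```

For the tiger problem below, `T[listen]` is the identity (listening does not move the tiger) and the observation is 85% accurate.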

Intuition

Tracking Where You Might Be

Imagine you’re in a dark room. You can’t see where you are (hidden state), but you can feel around (observations). Your belief state is your mental map of where you think you are — a probability over all possible locations. Each time you move and get a new sensory input, you update this mental map using Bayes’ rule.

The Tiger Problem (Classic Example)

A tiger is behind one of two doors. The agent can:

  • Open Left (OL): reward +100 if treasure, -100 if tiger
  • Open Right (OR): reward +100 if treasure, -100 if tiger
  • Listen (L): reward -1, get noisy observation (85% correct)

Belief evolution (from the lecture, starting at b(tiger-left) = 0.5):

| Step  | Action | Observation | b(tiger-left) | Best action             |
|-------|--------|-------------|---------------|-------------------------|
| Start | —      | —           | 0.50          | Listen                  |
| 1     | Listen | Hear Left   | 0.85          | Listen                  |
| 2     | Listen | Hear Left   | 0.97          | Listen                  |
| 3     | Listen | Hear Left   | ~0.995        | Open Right becomes best |

After enough consistent observations, the agent becomes confident enough to open the door.
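The numbers in the table fall out of repeated Bayes updates. A small sketch, using the note's parameters (85%-accurate listening, tiger stays put):

```python
# States: tiger-left vs tiger-right; listening is 85% accurate and the
# transition is the identity, so only the correction step matters.
P_CORRECT = 0.85

def listen_update(b_left, heard_left):
    """Update P(tiger-left) after one noisy 'listen' observation."""
    like_left = P_CORRECT if heard_left else 1 - P_CORRECT
    like_right = 1 - like_left
    unnorm = like_left * b_left
    return unnorm / (unnorm + like_right * (1 - b_left))

b = 0.5
for step in range(3):
    b = listen_update(b, heard_left=True)
    print(f"after listen {step + 1}: P(tiger-left) = {b:.3f}")
# prints 0.850, 0.970, 0.995 — matching the table above
```

Note the diminishing returns: each consistent observation moves the belief closer to 1, but by ever-smaller amounts.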

Key Properties

  • The belief state is a sufficient statistic for the history — it captures all relevant information
  • The belief state MDP is fully observable (we know what belief we’re in)
  • Planning (e.g., Dynamic Programming) in belief space yields the optimal POMDP policy
  • Belief states live in a continuous space (a probability simplex) even if the underlying state space is discrete
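The last two properties can be made concrete: because the belief-state MDP is fully observable, standard dynamic programming applies once the continuous belief simplex is discretized. Below is a sketch of value iteration on a belief grid for the tiger problem, using the note's rewards (+100 treasure, -100 tiger, -1 listen, 85% accuracy); the discount factor `GAMMA = 0.95` and the episode-resets-to-0.5-after-opening convention are assumptions, not stated in the note:

```python
import numpy as np

GAMMA = 0.95                      # assumed discount factor
N = 201                           # grid resolution over b = P(tiger-left)
grid = np.linspace(0.0, 1.0, N)

def nearest(b):
    # Nearest grid index for a belief value (crude interpolation)
    return int(round(b * (N - 1)))

def q_values(b, V):
    """Q-value of each action at belief b, given value estimates V."""
    reset = V[nearest(0.5)]       # opening a door resets the episode
    q_open_left = 100 * (1 - b) - 100 * b + GAMMA * reset
    q_open_right = 100 * b - 100 * (1 - b) + GAMMA * reset
    p_hl = 0.85 * b + 0.15 * (1 - b)   # P(hear left | b)
    b_hl = 0.85 * b / p_hl             # Bayes update after hearing left
    b_hr = 0.15 * b / (1 - p_hl)       # Bayes update after hearing right
    q_listen = -1 + GAMMA * (p_hl * V[nearest(b_hl)]
                             + (1 - p_hl) * V[nearest(b_hr)])
    return np.array([q_open_left, q_open_right, q_listen])

# Value iteration over the discretized belief space
V = np.zeros(N)
for _ in range(1000):
    V_new = np.array([q_values(b, V).max() for b in grid])
    if np.abs(V_new - V).max() < 1e-6:
        break
    V = V_new

actions = ["open-left", "open-right", "listen"]
for b in (0.5, 0.995):
    print(b, actions[int(q_values(b, V).argmax())])
# b = 0.5 → listen; b = 0.995 → open-right, as in the table above
```

The resulting policy is a threshold rule over the belief: listen while uncertain, open a door once the belief is extreme enough, which is exactly the behavior traced in the tiger example.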

Advantages and Disadvantages

| Advantages | Disadvantages |
|---|---|
| Concrete meaning: probability over latent states | Requires knowledge of underlying models T, O |
| Relatively compact: dim(b) = \|S\| - 1 | |
| Can be updated recursively | Only practical for discrete state spaces |
| Converts POMDP to (continuous) MDP | Continuous belief space makes planning hard |
