Predictive State Representation (PSR)

A predictive state representation (PSR) is an alternative to belief states for handling partial observability in POMDPs. Instead of maintaining a probability distribution over hidden states, a PSR defines the agent's internal state as a vector of predictions about future observations (test probabilities).

Core Idea

Predict the Future, Not the Hidden Present

Belief states answer: “What hidden state am I likely in?” PSRs answer: “What would I observe if I did certain things?” Both are valid Markov representations, but PSRs don’t require knowledge of the hidden state space or transition/observation models.

Definition

Define a “test” $τ = a_{1} o_{1} a_{2} o_{2} \dots a_{k} o_{k}$ as a sequence of actions and observations. The test probability is:

Test Probability

$p(τ ∣ h) = \Pr\{O_{t+1} = o_{1}, O_{t+2} = o_{2}, \dots, O_{t+k} = o_{k} ∣ H_{t} = h, A_{t} = a_{1}, A_{t+1} = a_{2}, \dots, A_{t+k-1} = a_{k}\}$

For a set of core tests $τ_{1}, τ_{2}, \dots, τ_{d}$ , the PSR is:

$f (h) = [p (τ_{1} ∣ h), p (τ_{2} ∣ h), \dots, p (τ_{d} ∣ h)]$

For suitably chosen sets of core tests, this vector can be shown to be a Markov state: the predictions for the core tests determine the probability of every other test, so the vector fully characterizes the distribution of future observations given the history.

Tiger Problem Example

In the Tiger problem, all information can be captured by just two tests:

$p (H L ∣ h, L)$: probability of hearing the tiger on the left (HL) if we listen (L)
$p (H R ∣ h, L)$: probability of hearing the tiger on the right (HR) if we listen (L)

These probabilities can be learned from data (e.g., with an LSTM classifier).
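As a simpler stand-in for a learned classifier, the sketch below estimates the two Tiger test probabilities from simulated data using plain frequency counts. This is not the LSTM approach mentioned above. The simulator, the 0.85 listening accuracy, and the names `sample_listens` and `estimate_psr` are all assumptions made for illustration.

```python
import random

LISTEN_ACCURACY = 0.85  # assumed, not stated in the notes


def sample_listens(rng, n_steps):
    """Simulate one episode of repeated listening; return the observations."""
    tiger_left = rng.random() < 0.5
    observations = []
    for _ in range(n_steps):
        correct = rng.random() < LISTEN_ACCURACY
        hear_left = tiger_left if correct else not tiger_left
        observations.append("HL" if hear_left else "HR")
    return observations


def estimate_psr(history, n_episodes=100_000, seed=0):
    """Estimate [p(HL | h, L), p(HR | h, L)] from episodes whose prefix matches h."""
    rng = random.Random(seed)
    matches = 0
    counts = {"HL": 0, "HR": 0}
    for _ in range(n_episodes):
        obs = sample_listens(rng, len(history) + 1)
        if obs[:-1] == history:
            matches += 1
            counts[obs[-1]] += 1
    return [counts["HL"] / matches, counts["HR"] / matches]


estimate = estimate_psr(["HL"])  # history: listened once and heard left
print(estimate)
```

With this assumed model the exact values after hearing left once are 0.745 and 0.255, so the printed estimates should land close to those. Counting only works for short histories, because the number of matching episodes shrinks quickly as the history grows, which is one reason a function approximator is attractive.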

Advantages and Disadvantages

Advantages	Disadvantages
Test probabilities learnable from data	Limited to tabular setting (extensions exist)
As compact or more so than belief states	Finding core tests can be difficult
Can still be updated recursively	Less intuitive than belief states
No model of hidden states needed
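The recursive update mentioned in the table can be sketched for a linear PSR, where after taking action $a$ and observing $o$ each core-test prediction becomes $p(τ_{i} ∣ h a o) = p(a o τ_{i} ∣ h) / p(a o ∣ h)$ and both numerator and denominator are linear functions of $f(h)$. The example below applies this to a listen-only Tiger problem. The weight vectors are hypothetical values derived by hand from an assumed listening accuracy of 0.85, and the names `M_ONE_STEP`, `M_TWO_STEP`, and `psr_update` are illustrative rather than taken from any library.

```python
# State: f = [p(HL | h, L), p(HR | h, L)].
# Hypothetical weights for a listen-only Tiger problem, derived by hand
# from an assumed listening accuracy of 0.85.
M_ONE_STEP = {  # p(L o | h) = m . f
    "HL": [1.0, 0.0],
    "HR": [0.0, 1.0],
}
M_TWO_STEP = {  # p(L o, L o' | h) = m . f, keyed by (o, o')
    ("HL", "HL"): [0.8725, -0.1275],
    ("HL", "HR"): [0.1275, 0.1275],
    ("HR", "HL"): [0.1275, 0.1275],
    ("HR", "HR"): [-0.1275, 0.8725],
}


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def psr_update(f, obs):
    """Return f(h, L, obs) from f(h) via p(a o tau | h) / p(a o | h)."""
    denom = dot(M_ONE_STEP[obs], f)
    return [dot(M_TWO_STEP[(obs, nxt)], f) / denom for nxt in ("HL", "HR")]


f0 = [0.5, 0.5]            # empty history
f1 = psr_update(f0, "HL")  # approx [0.745, 0.255]
f2 = psr_update(f1, "HL")  # a second "hear left" sharpens the prediction further
print(f1, f2)
```

The update never touches a belief over the tiger's location: the prediction vector is revised using only its own entries and the fixed weights, which is what lets a PSR play the role of a state.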

Connections

Alternative to Belief State for POMDPs
Both satisfy the Markov criterion for Partial Observability
More practical when the hidden state model is unknown
Can be learned with methods like LSTM classifiers

Appears In

RL-L13 - Partial Observability
RL-Book Ch17 - Frontiers (§17.3)
Littman, Sutton & Singh, “Predictive Representations of State” (2001)
