Study Notes

Policy Evaluation

Jun 06, 2026 · 1 min read

tabular-methods

Policy Evaluation (Prediction)

Policy evaluation computes the state-value function $v_\pi(s)$ for a given, fixed policy $\pi$. It is also called the prediction problem.
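
For reference, the target of the computation can be written out explicitly. The following uses the standard definitions; the return $G_t$ and the expectation notation are assumed here and are not stated elsewhere in this note:

$$
v_\pi(s) = \mathbb{E}_\pi\left[ G_t \mid S_t = s \right] = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma\, v_\pi(s') \right]
$$

This is the Bellman expectation equation: $v_\pi$ is the function that stays unchanged when the right-hand side is applied to it.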

Iterative update, applied to every state $s$: $V_{k+1}(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma V_k(s') \right]$

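The sequence $V_k$ converges to $v_\pi$ as $k \to \infty$, provided $\gamma < 1$ or termination is guaranteed from every state under $\pi$. Policy evaluation is used as a subroutine in Policy Iteration.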
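
A minimal Python sketch of the iterative update, under some assumptions that are not part of this note: the dynamics are stored as a nested dict `dynamics[s][a]` of `(prob, next_state, reward)` triples, the policy as `policy[s][a]`, states without a policy entry are treated as terminal, and the loop stops once the largest change in a sweep falls below a hypothetical threshold `theta`. The function name and the toy MDP are illustrative only.

```python
def policy_evaluation(policy, dynamics, gamma=0.9, theta=1e-10, max_sweeps=10_000):
    """Iterative policy evaluation with synchronous (two-array) sweeps.

    policy[s][a]   -> pi(a | s)
    dynamics[s][a] -> list of (prob, next_state, reward), i.e. p(s', r | s, a)
    States with no entry in `policy` are treated as terminal (value 0).
    """
    # Collect every state that appears as a source or as a successor.
    states = set(dynamics) | {
        s2
        for actions in dynamics.values()
        for outcomes in actions.values()
        for _, s2, _ in outcomes
    }
    V = {s: 0.0 for s in states}  # V_0 is arbitrary; zeros are a common choice

    for _ in range(max_sweeps):
        # V_{k+1}(s) = sum_a pi(a|s) sum_{s',r} p(s',r|s,a) [r + gamma V_k(s')]
        V_next = {
            s: sum(
                pi_a * sum(p * (r + gamma * V[s2]) for p, s2, r in dynamics[s][a])
                for a, pi_a in policy.get(s, {}).items()
            )
            for s in V
        }
        delta = max(abs(V_next[s] - V[s]) for s in V)
        V = V_next
        if delta < theta:  # assumed stopping rule: largest change is tiny
            break
    return V


# Toy MDP (made up for illustration): from "A", "stay" loops back with
# reward 0 and "go" reaches the terminal state "T" with reward 1.
dynamics = {"A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "T", 1.0)]}}
policy = {"A": {"stay": 0.5, "go": 0.5}}

V = policy_evaluation(policy, dynamics, gamma=0.9)
# Analytically, v(A) solves v = 0.5 * 0.9 * v + 0.5, so v(A) = 10/11.
print(V["A"], V["T"])
```

The sketch keeps two separate tables, `V` and `V_next`, so that each sweep matches the $V_k \to V_{k+1}$ form of the update. Updating a single table in place is a common variant.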

Appears In

RL-L02 - Dynamic Programming, RL-Book Ch4 - Dynamic Programming

Backlinks

Dynamic Programming
RL-Book Ch4 - Dynamic Programming
RL-L14 - Recap
