Linear Function Approximation

The value function is approximated as a linear combination of features: $\hat{v}(s, w) = w^{\top} x(s) = \sum_{i=1}^{d} w_{i} x_{i}(s)$

where $x(s) = (x_{1}(s), \dots, x_{d}(s))^{\top}$ is a feature vector and $w$ is a weight vector.
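
As a minimal sketch of the definition above, the estimate is a plain dot product between weights and features. The function name `v_hat` and the numbers used are illustrative assumptions, not part of the original notes.

```python
def v_hat(x, w):
    """Linear value estimate: sum_i w_i * x_i(s), given the feature vector x = x(s)."""
    if len(x) != len(w):
        raise ValueError("feature and weight vectors must have the same length")
    return sum(w_i * x_i for w_i, x_i in zip(w, x))


# Illustrative values only: d = 3 features for some state s.
x_s = [1.0, 2.0, 0.0]
w = [0.5, -1.0, 2.0]
estimate = v_hat(x_s, w)  # 0.5*1.0 + (-1.0)*2.0 + 2.0*0.0 = -1.5
```

Because the estimate is linear in $w$, changing one weight $w_{i}$ shifts the value of every state in proportion to its feature $x_{i}(s)$.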

Gradient

The gradient with respect to the weights is simply the feature vector: $\nabla_{w} \hat{v}(s, w) = x(s)$

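This makes the semi-gradient TD(0) update simple: $w_{t+1} = w_{t} + \alpha \delta_{t} x(S_{t})$

where $\alpha$ is the step size and $\delta_{t} = R_{t+1} + \gamma \hat{v}(S_{t+1}, w_{t}) - \hat{v}(S_{t}, w_{t})$ is the TD error.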
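
The update can be sketched in a few lines. This assumes the usual TD error $\delta_{t} = R_{t+1} + \gamma \hat{v}(S_{t+1}, w_{t}) - \hat{v}(S_{t}, w_{t})$, with the next-state value taken as zero at a terminal state. The name `td0_update` and its arguments are hypothetical, chosen for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def td0_update(w, x_s, reward, x_next, alpha, gamma, terminal=False):
    """One semi-gradient TD(0) step for a linear value estimate.

    Returns the new weight vector and the TD error delta.
    """
    v_s = dot(w, x_s)
    # Assumption: the value of a terminal state is defined to be zero.
    v_next = 0.0 if terminal else dot(w, x_next)
    delta = reward + gamma * v_next - v_s
    # The gradient of the linear estimate is x(S_t), so the step is alpha * delta * x(S_t).
    new_w = [w_i + alpha * delta * x_i for w_i, x_i in zip(w, x_s)]
    return new_w, delta


# Illustrative transition: two features, reward 1, weights initialised at zero.
w, delta = td0_update(
    w=[0.0, 0.0], x_s=[1.0, 0.0], reward=1.0, x_next=[0.0, 1.0],
    alpha=0.5, gamma=0.9,
)
# delta = 1.0 and w = [0.5, 0.0]: only the feature active in S_t is adjusted.
```

Note that the target $R_{t+1} + \gamma \hat{v}(S_{t+1}, w_{t})$ is treated as a constant when differentiating, which is why this is called a semi-gradient method.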

Convergence Guarantee

Why Linear Is Special

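Semi-gradient TD(0) with linear FA, trained on-policy with a suitably decreasing step size, converges to the TD Fixed Point $w_{TD}$. The error at that point is within a bounded factor of the lowest possible error: $\overline{VE}(w_{TD}) \leq \frac{1}{1 - \gamma} \min_{w} \overline{VE}(w)$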

This guarantee does not hold for non-linear (e.g., neural network) approximators.
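
To make the bound concrete, the sketch below checks it on a hypothetical two-state Markov reward process with a single feature. It assumes the standard characterisation of the TD fixed point as the solution of $A w = b$, with $A = E[x(S_{t})(x(S_{t}) - \gamma x(S_{t+1}))^{\top}]$ and $b = E[R_{t+1} x(S_{t})]$ under the on-policy distribution. All numbers and variable names are illustrative.

```python
# Hypothetical two-state Markov reward process (all numbers are illustrative).
P = [[0.5, 0.5], [0.5, 0.5]]  # P[s][s2]: transition probabilities
r = [1.0, 0.0]                # expected reward on leaving each state
x = [1.0, 2.0]                # one scalar feature per state (d = 1)
mu = [0.5, 0.5]               # stationary (on-policy) distribution of P
gamma = 0.9
states = range(2)

# True values: iterate the Bellman equation v = r + gamma * P v to convergence.
v = [0.0, 0.0]
for _ in range(2000):
    v = [r[s] + gamma * sum(P[s][s2] * v[s2] for s2 in states) for s in states]


def value_error(w):
    """Mean squared value error VE(w), weighted by mu."""
    return sum(mu[s] * (v[s] - w * x[s]) ** 2 for s in states)


# TD fixed point for d = 1: w_td = b / A.
exp_next_x = [sum(P[s][s2] * x[s2] for s2 in states) for s in states]
A = sum(mu[s] * x[s] * (x[s] - gamma * exp_next_x[s]) for s in states)
b = sum(mu[s] * r[s] * x[s] for s in states)
w_td = b / A

# Minimiser of VE: weighted least squares against the true values.
w_best = (sum(mu[s] * x[s] * v[s] for s in states)
          / sum(mu[s] * x[s] ** 2 for s in states))

bound = value_error(w_best) / (1.0 - gamma)
print(f"VE(w_td) = {value_error(w_td):.3f}, bound = {bound:.3f}")
```

On this toy problem the TD fixed point (about 1.05) differs from the least-squares weight (2.9) and has a higher error, but it stays well inside the bound. With $\gamma$ close to 1 the factor $\frac{1}{1 - \gamma}$ becomes large, so the guarantee can be loose.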

Feature Construction

The power of linear FA depends entirely on the feature vector $x(s)$. See Feature Construction:

Tile Coding: Binary features from overlapping tilings
Polynomials: $x_{i}(s) = s^{i}$ for a scalar state $s$
Fourier basis: Cosine functions at different frequencies
Radial Basis Functions: Gaussian bumps centered at prototypes
One-hot (tabular): Each state gets its own feature → recovers tabular case
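
The feature types listed above can be sketched for a one-dimensional state as follows. The function names, the assumption that $s$ is scaled to $[0, 1)$, and the specific tiling offsets are illustrative choices, not a reference implementation.

```python
import math


def tile_features(s, n_tilings, tiles_per_tiling):
    """Binary features from n_tilings offset tilings; assumes s in [0, 1)."""
    width = 1.0 / tiles_per_tiling
    per_tiling = tiles_per_tiling + 1  # one extra tile absorbs the offset
    x = [0.0] * (n_tilings * per_tiling)
    for t in range(n_tilings):
        offset = t * width / n_tilings  # each tiling is shifted by a fraction of a tile
        idx = min(int((s + offset) / width), tiles_per_tiling)
        x[t * per_tiling + idx] = 1.0   # exactly one active tile per tiling
    return x


def polynomial_features(s, d):
    """x_i(s) = s**i for i = 1..d."""
    return [s ** i for i in range(1, d + 1)]


def fourier_features(s, order):
    """x_i(s) = cos(pi * i * s) for i = 0..order; assumes s in [0, 1]."""
    return [math.cos(math.pi * i * s) for i in range(order + 1)]


def rbf_features(s, centers, sigma):
    """Gaussian bumps of width sigma centred at the given prototypes."""
    return [math.exp(-((s - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]


def one_hot_features(state_index, n_states):
    """One feature per state, so each weight is that state's value (tabular case)."""
    x = [0.0] * n_states
    x[state_index] = 1.0
    return x


# With one-hot features the linear estimate reads a single weight,
# so w behaves exactly like a value table.
w = [0.2, 0.7, -0.1]
x_s = one_hot_features(1, 3)
value = sum(w_i * x_i for w_i, x_i in zip(w, x_s))  # equals w[1]
```

Tile coding and one-hot features are sparse and binary, so the update touches only the weights of active features. Polynomial, Fourier, and RBF features are dense, so every update changes all weights.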

Connections

Special case of: Function Approximation
Solved exactly by: LSTD
Feature design: Feature Construction, Tile Coding
Convergence: TD Fixed Point
