Function Approximation
Definition
Function Approximation in RL
Function approximation replaces the lookup table used in tabular RL with a parameterized function $\hat{v}(s, \mathbf{w})$ (or $\hat{q}(s, a, \mathbf{w})$). Instead of storing one value per state, we learn a weight vector $\mathbf{w} \in \mathbb{R}^d$ where $d \ll |\mathcal{S}|$.
Why We Need It
Tabular methods store $V(s)$ for every state. Real problems have millions or billions of states (or continuous state spaces). You can’t visit every state, let alone store a value for each. Function approximation lets you generalize — update from one state, and the values of similar states change too.
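To see generalization concretely, here is a toy sketch (the feature design and numbers are invented for illustration): a single gradient update at one state also moves the estimated value of a nearby state that shares features with it.

```python
import numpy as np

def features(s, n=5):
    # Overlapping triangular "bump" features: each state activates
    # its own index and, partially, its neighbors.
    return np.array([max(0.0, 1.0 - abs(s - i)) for i in range(n)])

w = np.zeros(5)
v_before = features(2.5) @ w            # value of state 2.5 before any update

# One gradient step at state 2.0 toward an invented target of 1.0
alpha, target, s = 0.5, 1.0, 2.0
x = features(s)
w += alpha * (target - x @ w) * x

v_after = features(2.5) @ w             # state 2.5 changed too, never updated
```

Because states 2.0 and 2.5 share active features, the update at 2.0 generalizes to 2.5 — the efficiency (or interference) discussed under Challenges below.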
The Prediction Objective
Mean Squared Value Error ($\overline{VE}$)

$$\overline{VE}(\mathbf{w}) = \sum_{s \in \mathcal{S}} \mu(s)\,\big[v_\pi(s) - \hat{v}(s, \mathbf{w})\big]^2$$
where:
- $\mu(s)$ — on-policy distribution (how often state $s$ is visited under $\pi$)
- $v_\pi(s)$ — true value (unknown)
- $\hat{v}(s, \mathbf{w})$ — our approximation
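For intuition, computing $\overline{VE}$ by hand for a made-up 3-state problem where $\mu$ and $v_\pi$ are assumed known (they never are in practice):

```python
import numpy as np

mu = np.array([0.5, 0.3, 0.2])        # on-policy state distribution (sums to 1)
v_pi = np.array([1.0, 2.0, 3.0])      # "true" values (invented for the example)
v_hat = np.array([1.1, 1.8, 3.5])     # current approximation

# Visitation-weighted mean squared value error
ve = np.sum(mu * (v_pi - v_hat) ** 2)
```

Note how the weighting by $\mu$ means errors in frequently visited states dominate the objective.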
Stochastic Gradient Descent
SGD Update for Value Prediction

$$\mathbf{w}_{t+1} = \mathbf{w}_t + \alpha\,\big[v_\pi(S_t) - \hat{v}(S_t, \mathbf{w}_t)\big]\,\nabla \hat{v}(S_t, \mathbf{w}_t)$$

Problem: we don’t know $v_\pi(S_t)$. Replace it with a target:
- MC target: $G_t$ → true gradient method (converges to a local minimum of $\overline{VE}$)
- TD target: $R_{t+1} + \gamma\,\hat{v}(S_{t+1}, \mathbf{w})$ → semi-gradient (not a true gradient because the target depends on $\mathbf{w}$)
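A minimal sketch of semi-gradient TD(0) with linear features, on an assumed 5-state random-walk chain (environment, step size, and episode count are illustrative choices, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, gamma, alpha = 5, 1.0, 0.05

def x(s):
    # One-hot features for simplicity (the tabular special case)
    v = np.zeros(n_states)
    v[s] = 1.0
    return v

w = np.zeros(n_states)
for episode in range(2000):
    s = 2                                   # start in the middle
    while True:
        s2 = s + (1 if rng.random() < 0.5 else -1)
        if s2 < 0:
            r, done = 0.0, True             # fall off the left end
        elif s2 >= n_states:
            r, done = 1.0, True             # reach the right end
        else:
            r, done = 0.0, False
        target = r + (0.0 if done else gamma * (w @ x(s2)))
        # Semi-gradient: differentiate only v_hat(s, w), not the target
        w += alpha * (target - w @ x(s)) * x(s)
        if done:
            break
        s = s2
```

The key line is the update: the TD target `r + gamma * w @ x(s2)` is treated as a constant even though it depends on `w` — that is exactly what makes this a semi-gradient rather than a true gradient method.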
Types of Function Approximators
Linear Function Approximation
- $\hat{v}(s, \mathbf{w}) = \mathbf{w}^\top \mathbf{x}(s)$, where $\mathbf{x}(s)$ is a feature vector
- Simple, well-understood convergence guarantees
- Feature design matters: Tile Coding, polynomials, Fourier basis, RBFs
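As one concrete feature choice, a sketch of a one-dimensional Fourier basis (the order and the normalization of the state to $[0, 1]$ are assumed for illustration):

```python
import numpy as np

def fourier_features(s, order=4):
    # Fourier cosine basis: x_i(s) = cos(pi * i * s), i = 0..order.
    # Assumes s has been normalized to [0, 1].
    return np.cos(np.pi * np.arange(order + 1) * s)

w = np.zeros(5)
v_hat = w @ fourier_features(0.3)   # linear value estimate under weights w
```

Each basis function varies at a different frequency, so the linear combination can represent smooth value functions without hand-designed state partitions.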
Neural Network Function Approximation
- Non-linear, can represent complex functions
- Trained with backpropagation
- Fewer convergence guarantees
- Foundation of Deep Q-Network (DQN) and modern deep RL
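A toy illustration of a nonlinear approximator: a one-hidden-layer network trained by backpropagation to fit an assumed target function $v(s) = s^2$ (the architecture and hyperparameters are arbitrary choices, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.5, (16, 1)); b1 = np.zeros(16)   # hidden layer
W2 = rng.normal(0, 0.5, (1, 16)); b2 = np.zeros(1)    # output layer

def forward(s):
    h = np.tanh(W1 @ s + b1)
    return (W2 @ h + b2)[0], h

lr = 0.05
for step in range(5000):
    s = rng.uniform(-1, 1, (1,))
    target = s[0] ** 2                  # stand-in for a value target
    v, h = forward(s)
    err = v - target
    # Backpropagate the squared-error gradient through both layers
    gW2 = err * h; gb2 = np.array([err])
    gz = (err * W2[0]) * (1 - h ** 2)   # tanh derivative
    gW1 = np.outer(gz, s); gb1 = gz
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1
```

The same idea scales to deep networks fitting TD targets, which is the core of DQN — at the cost of the weaker convergence guarantees noted above.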
Key Distinction: Tabular as Special Case
Tabular = Function Approximation with One-Hot Features
A lookup table is actually a special case of linear function approximation where the feature vector is a one-hot vector (1 in position $s$, 0 elsewhere). Then $\hat{v}(s, \mathbf{w}) = w_s$: each state has its own weight. All tabular convergence guarantees follow from the more general FA framework.
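A quick check of the equivalence (the state count and weights are made up): with one-hot features, the dot product $\mathbf{w}^\top \mathbf{x}(s)$ just reads out the table entry $w_s$.

```python
import numpy as np

n = 4
w = np.array([0.1, 0.2, 0.3, 0.4])   # one weight per state = the "table"

def one_hot(s, n=n):
    x = np.zeros(n)
    x[s] = 1.0
    return x

# v_hat(s, w) = w . x(s) = w_s for every state
for s in range(n):
    assert w @ one_hot(s) == w[s]
```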
Challenges
- Generalization: Updating one state affects nearby states — can be good (efficiency) or bad (interference)
- The Deadly Triad: Function approximation + bootstrapping + off-policy = potential divergence
- Non-stationarity: Target values change as $\mathbf{w}$ updates
Connections
- Extends: Value Function (from tables to functions)
- Methods: Semi-Gradient Methods, LSTD, Linear Function Approximation
- Features: Tile Coding, Feature Construction
- Deep version: Deep Q-Network (DQN), Neural Network Function Approximation
- Danger: Deadly Triad