TD Fixed Point
The TD fixed point is the weight vector $\mathbf{w}_{TD}$ at which the expected semi-gradient TD(0) update is zero. For Linear Function Approximation, this is the unique solution to $\mathbf{A}\mathbf{w}_{TD} = \mathbf{b}$.
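TD Fixed Point Equation
$$\mathbf{w}_{TD} = \mathbf{A}^{-1}\mathbf{b}$$
where:
- $\mathbf{A} = \mathbb{E}\left[\mathbf{x}_t(\mathbf{x}_t - \gamma\mathbf{x}_{t+1})^\top\right]$
- $\mathbf{b} = \mathbb{E}\left[R_{t+1}\mathbf{x}_t\right]$
- $\mathbf{x}_t$ is the feature vector of state $S_t$, $R_{t+1}$ is the next reward, and $\gamma$ is the discount factor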
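As a minimal sketch of how the fixed point can be computed when the model is known, the snippet below assumes the standard linear TD(0) definitions $\mathbf{A} = \mathbb{E}[\mathbf{x}_t(\mathbf{x}_t - \gamma\mathbf{x}_{t+1})^\top]$ and $\mathbf{b} = \mathbb{E}[R_{t+1}\mathbf{x}_t]$, with expectations taken under the stationary distribution. The 3-state chain, rewards, features, and the helper names `stationary_distribution` and `td_fixed_point` are hypothetical and chosen only for illustration.

```python
import numpy as np

def stationary_distribution(P):
    """Return mu with mu @ P = mu and mu.sum() == 1 (assumes an ergodic chain)."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(eigvals - 1.0))  # eigenvalue closest to 1
    mu = np.real(eigvecs[:, idx])
    return mu / mu.sum()                    # normalise (also fixes the sign)

def td_fixed_point(P, r, X, gamma):
    """Solve A w = b for linear TD(0) on a known Markov reward process."""
    mu = stationary_distribution(P)
    D = np.diag(mu)
    n = P.shape[0]
    A = X.T @ D @ (np.eye(n) - gamma * P) @ X  # E[x_t (x_t - gamma x_{t+1})^T]
    b = X.T @ D @ r                            # E[R_{t+1} x_t]
    return np.linalg.solve(A, b), A, b

# Hypothetical example: 3 states, 2 features per state.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])   # transition probabilities
r = np.array([0.0, 0.0, 1.0])     # expected reward on leaving each state
X = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])        # one feature row per state
gamma = 0.9

w_td, A, b = td_fixed_point(P, r, X, gamma)
print("w_TD =", w_td)
print("expected TD(0) update at w_TD =", b - A @ w_td)
```

The second printed vector should be numerically zero, which is exactly the defining property of the fixed point.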
Error Bound
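TD Fixed Point Error Bound
$$\overline{VE}(\mathbf{w}_{TD}) \le \frac{1}{1-\gamma}\,\min_{\mathbf{w}} \overline{VE}(\mathbf{w})$$
The mean squared value error $\overline{VE}$ at the TD fixed point is at most $\frac{1}{1-\gamma}$ times the best possible error with the given features. Because $\gamma$ is often close to 1, this factor can be large.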
Not the Minimum
The TD fixed point is generally NOT the $\mathbf{w}$ that minimizes $\overline{VE}(\mathbf{w})$. MC gradient descent finds the true minimum; TD finds a different (potentially worse) point. The bound above quantifies how much worse it can be.
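The gap can be made concrete with a small worked example. The sketch below uses a hypothetical 2-state Markov reward process with a single feature, computes the true values from the known model, and compares the error-minimizing weight (the point gradient MC aims for) with the TD fixed point. All numbers and the names `value_error` and `w_best` are assumptions made for illustration.

```python
import numpy as np

# Hypothetical 2-state Markov reward process with one feature per state.
P = np.array([[0.5, 0.5],
              [0.5, 0.5]])
mu = np.array([0.5, 0.5])   # stationary distribution of P (uniform here)
r = np.array([0.0, 1.0])    # expected reward on leaving each state
X = np.array([[1.0],
              [2.0]])       # feature matrix, one row per state
gamma = 0.9
D = np.diag(mu)

# True state values: v = (I - gamma P)^{-1} r
v_true = np.linalg.solve(np.eye(2) - gamma * P, r)

def value_error(w):
    """Mean squared value error, weighted by the stationary distribution."""
    return float(mu @ (v_true - X @ w) ** 2)

# Minimiser of the value error: weighted least-squares projection of v_true.
w_best = np.linalg.solve(X.T @ D @ X, X.T @ D @ v_true)

# TD fixed point: solve A w = b.
A = X.T @ D @ (np.eye(2) - gamma * P) @ X
b = X.T @ D @ r
w_td = np.linalg.solve(A, b)

print("w_best =", w_best, " VE =", value_error(w_best))
print("w_TD   =", w_td, " VE =", value_error(w_td))
print("bound  =", value_error(w_best) / (1.0 - gamma))
```

In this example the two weights differ, and the error at the TD fixed point lies between the minimum error and the minimum error scaled by $\frac{1}{1-\gamma}$.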
Connections
- Solved by: LSTD (directly), Semi-gradient TD (iteratively)
- Only guaranteed for: Linear Function Approximation (with on-policy training)
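The direct and iterative routes can be contrasted with a deterministic sketch. Note the simplification: LSTD and semi-gradient TD both work from sampled transitions, whereas the code below assumes the exact $\mathbf{A}$ and $\mathbf{b}$ are available from a known model. It solves the linear system once (the quantity LSTD estimates) and separately iterates the expected update $\mathbf{w} \leftarrow \mathbf{w} + \alpha(\mathbf{b} - \mathbf{A}\mathbf{w})$ (the average behaviour of semi-gradient TD(0)). The chain, features, step size, and iteration count are hypothetical.

```python
import numpy as np

# Hypothetical 3-state chain; P is doubly stochastic, so the
# stationary distribution is uniform.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
mu = np.full(3, 1.0 / 3.0)
r = np.array([0.0, 0.0, 1.0])
X = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
gamma = 0.9
D = np.diag(mu)

A = X.T @ D @ (np.eye(3) - gamma * P) @ X
b = X.T @ D @ r

# Direct route: solve the linear system once.
w_direct = np.linalg.solve(A, b)

# Iterative route: repeat the expected semi-gradient TD(0) update.
w = np.zeros(2)
alpha = 0.5   # step size, assumed small enough for this A
for _ in range(5000):
    w = w + alpha * (b - A @ w)

print("direct   :", w_direct)
print("iterative:", w)
```

Both routes should agree to numerical precision here; with sampled transitions, the iterative method would additionally need a suitably decreasing step size to settle at the same point.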