Target Network
A target network is a separate, periodically updated copy of the Q-network used to compute TD targets during training. Its weights are held fixed for a set number of training steps, then overwritten with a copy of the online network's weights.
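As a sketch of where the target network enters the computation, the standard DQN-style TD target and loss for a transition (s, a, r, s') can be written as below. The symbols are notation assumed here for illustration: θ for the online weights, θ⁻ for the target weights, and γ for the discount factor.

```latex
% TD target: computed with the frozen target weights \theta^-
y = r + \gamma \max_{a'} Q(s', a'; \theta^-)

% Regression loss: only the online weights \theta receive gradients
L(\theta) = \bigl( y - Q(s, a; \theta) \bigr)^2
```

For a terminal transition the bootstrap term is dropped, so y = r. Because θ⁻ does not change between syncs, y is a fixed regression label for the gradient steps in between.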
Why needed:
- Without it, the TD target is computed from the same weights being trained, so it shifts with every gradient step → moving target problem
- Freezing the target provides a stable regression objective
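- Alternative to periodic hard copies: soft update (Polyak averaging, used in DDPG/SAC), which blends a small fraction of the online weights into the target weights at every step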
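The two update rules above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: parameters are modelled as a dict of float lists, and the names `hard_update`, `soft_update`, `td_target`, `tau`, and `SYNC_EVERY` are hypothetical, chosen only for this example.

```python
import copy


def hard_update(target_params, online_params):
    """Periodic sync: overwrite the target weights with a copy of the online weights."""
    target_params.clear()
    target_params.update(copy.deepcopy(online_params))


def soft_update(target_params, online_params, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target, in place."""
    for name, online_values in online_params.items():
        target_values = target_params[name]
        for i, w in enumerate(online_values):
            target_values[i] = tau * w + (1.0 - tau) * target_values[i]


def td_target(reward, done, next_q_values_target, gamma=0.99):
    """TD target built from Q-values produced by the target network."""
    if done:
        return reward  # no bootstrap term on terminal transitions
    return reward + gamma * max(next_q_values_target)


# Toy parameters standing in for network weights (assumed shapes and values).
online = {"w": [1.0, 2.0]}
target = copy.deepcopy(online)

SYNC_EVERY = 3  # hypothetical sync interval, in gradient steps
for step in range(1, 7):
    # Stand-in for a gradient step: only the online weights move.
    online["w"] = [w + 0.1 for w in online["w"]]
    if step % SYNC_EVERY == 0:
        hard_update(target, online)  # target is frozen between syncs
```

With the hard update, the target changes only on every `SYNC_EVERY`-th step. Calling `soft_update(target, online, tau)` after each step instead moves the target a little every step, with a small `tau` keeping it close to frozen.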