Target Network

A target network is a separate, periodically updated copy of the Q-network used to compute TD targets during training. Its weights are held fixed for a set number of training steps, then overwritten with the current weights of the online network.
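
A minimal sketch of this hard-update scheme, in plain Python with no deep-learning library. The names `maybe_sync_target`, `td_target`, `sync_every`, and `gamma` are hypothetical and chosen for illustration, and parameters are assumed to be stored as a dict of lists:

```python
import copy


def maybe_sync_target(online_params, target_params, step, sync_every):
    """Hard update: every `sync_every` steps, replace the target weights
    with a copy of the online weights; otherwise leave them frozen."""
    if step % sync_every == 0:
        # Deep copy so later changes to the online weights do not leak
        # into the frozen target.
        return copy.deepcopy(online_params)
    return target_params


def td_target(reward, next_q_values_target, done, gamma=0.99):
    """One-step TD target, bootstrapping from the target network's
    Q-values for the next state (not the online network's)."""
    bootstrap = 0.0 if done else gamma * max(next_q_values_target)
    return reward + bootstrap
```

In a training loop, `maybe_sync_target` would be called once per step, and `td_target` would receive Q-values computed with the frozen target weights.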

Why needed:

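  • Without it, the TD target is computed from the same weights being trained, so it shifts with every gradient step → moving target problem
  • Freezing the target provides a stable regression objective between updates
  • Alternative: soft update (Polyak averaging, used in DDPG/SAC), which moves the target weights a small fraction toward the online weights at every step instead of copying them periodically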
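
The soft-update alternative can be sketched as follows. This is an illustration only: the function name `soft_update`, the dict-of-lists parameter layout, and the default `tau` value are assumptions, not taken from any particular library.

```python
def soft_update(online_params, target_params, tau=0.005):
    """Polyak averaging: blend each target weight toward the matching
    online weight. tau=1.0 reduces to a full (hard) copy; small tau
    makes the target change slowly."""
    return {
        name: [
            (1.0 - tau) * t + tau * o
            for t, o in zip(target_params[name], online_params[name])
        ]
        for name in target_params
    }
```

Unlike the periodic copy, this would be applied after every gradient step, so the target drifts slowly rather than jumping at fixed intervals.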

Appears In