Study Notes

Jun 06, 2026 (1 min read)

deep-rl
exam-topic

Target Network

A target network is a separate, periodically updated copy of the Q-network that is used only to compute TD targets during training. Its weights $θ^{-}$ are held fixed for $C$ steps, then overwritten with the weights $θ$ of the online network.
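The hard-update schedule described above can be sketched in a few lines. This is a minimal illustration, not a reference DQN implementation: it assumes a linear Q-function (one weight vector per action) as a stand-in for a network, and the names `q_values`, `td_target`, `HardUpdateTarget`, and `sync_every` (playing the role of $C$) are hypothetical.

```python
import copy

def q_values(theta, state):
    """Linear Q-function: theta holds one weight vector per action (illustrative stand-in)."""
    return [sum(w * x for w, x in zip(weights, state)) for weights in theta]

def td_target(reward, next_state, done, theta_target, gamma=0.99):
    """Compute r + gamma * max_a' Q(s', a'; theta^-); no bootstrapping on terminal transitions."""
    if done:
        return reward
    return reward + gamma * max(q_values(theta_target, next_state))

class HardUpdateTarget:
    """Keeps theta^- frozen and re-copies it from the online weights every `sync_every` steps."""

    def __init__(self, theta_online, sync_every):
        self.theta_target = copy.deepcopy(theta_online)  # independent copy, not an alias
        self.sync_every = sync_every                     # C in the notes
        self.steps = 0

    def step(self, theta_online):
        """Call once per training step; returns the current target weights."""
        self.steps += 1
        if self.steps % self.sync_every == 0:
            self.theta_target = copy.deepcopy(theta_online)
        return self.theta_target

# Usage: the online weights change, but the TD target only sees them after a sync.
theta = [[1.0, 0.0], [0.0, 2.0]]            # two actions, two state features
target = HardUpdateTarget(theta, sync_every=3)
theta[1][1] = 5.0                           # pretend a gradient step changed theta
y_before = td_target(1.0, [1.0, 1.0], False, target.theta_target, gamma=0.5)
for _ in range(3):
    target.step(theta)                      # the third call copies theta into theta^-
y_after = td_target(1.0, [1.0, 1.0], False, target.theta_target, gamma=0.5)
```

The deep copy matters: if the target merely aliased the online weights, every gradient step would change the target as well and the freeze would have no effect.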

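Why it is needed:

Without it, the TD target $r + γ \max_{a'} Q(s', a'; θ)$ depends on the same weights $θ$ that are being trained, so it shifts with every gradient step (the moving target problem).
Freezing the target, i.e. computing it as $r + γ \max_{a'} Q(s', a'; θ^{-})$, gives a stable regression objective between updates of $θ^{-}$.
Alternative: a soft update $θ^{-} \leftarrow τ θ + (1 - τ) θ^{-}$ with $τ \ll 1$, applied at every step (Polyak averaging, used in DDPG/SAC).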
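The soft update can be sketched the same way. This is a minimal example that assumes the weights are stored as a flat list of floats; the function name `soft_update` and the default `tau=0.005` are illustrative choices, not values taken from these notes.

```python
def soft_update(theta_target, theta_online, tau=0.005):
    """Return tau * theta + (1 - tau) * theta^- elementwise (Polyak averaging)."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(theta_online, theta_target)]

# Usage: with the online weights held fixed, theta^- drifts toward them
# and the remaining gap shrinks by a factor (1 - tau) on every step.
theta_online = [1.0, -2.0]
theta_target = [0.0, 0.0]
for _ in range(100):
    theta_target = soft_update(theta_target, theta_online, tau=0.05)
```

Setting `tau=1.0` reduces to a hard copy on every step, while a small `tau` makes the target a slowly moving average of past online weights.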

Appears In

RL-L08 - Deep RL Value-Based, Deep Q-Network (DQN)

Backlinks

Deadly Triad
Deep Deterministic Policy Gradient
Deep Q-Network (DQN)
Neural Network Function Approximation
Q-Learning
Soft Actor-Critic (SAC)
TD3
RL-Book Ch16 - Applications and Case Studies
RL-L06 - On-Policy TD with Approximation
RL-L08 - Deep RL Value-Based
RL-L11 - SAC, Decision Transformer & Diffuser
RL-L14 - Recap
RL - Overview
