Study Notes
Tag: policy-gradient
25 items with this tag.
Jun 06, 2026
A3C
policy-gradient
actor-critic
deep-rl
exam-topic
Jun 06, 2026
Actor-Critic
policy-gradient
actor-critic
exam-topic
Jun 06, 2026
Advantage Actor-Critic (A2C)
policy-gradient
actor-critic
deep-rl
exam-topic
Jun 06, 2026
Advantage Function
policy-gradient
actor-critic
value-function
temporal-difference
Jun 06, 2026
Baseline
variance-reduction
policy-gradient
reinforcement-learning
Jun 06, 2026
Compatible Function Approximation
policy-gradient
actor-critic
exam-topic
Jun 06, 2026
Deep Deterministic Policy Gradient
policy-gradient
deep-rl
actor-critic
exam-topic
Jun 06, 2026
Deterministic Policy Gradient
policy-gradient
off-policy
continuous-control
actor-critic
Jun 06, 2026
Entropy
policy-gradient
exploration
deep-rl
exam-topic
Jun 06, 2026
Fisher Information
policy-gradient
optimization
deep-rl
exam-topic
Jun 06, 2026
GRPO
policy-gradient
deep-rl
llm-training
Jun 06, 2026
Gaussian Policy
policy-gradient
continuous-actions
stochastic
Jun 06, 2026
Generalized Advantage Estimation
advantage-function
temporal-difference
policy-gradient
bias-variance
Jun 06, 2026
Maximum Entropy RL
deep-rl
policy-gradient
Jun 06, 2026
Natural Policy Gradient
policy-gradient
optimization
fisher-information
geometry
Jun 06, 2026
PPO
policy-gradient
deep-rl
exam-topic
Jun 06, 2026
Policy Gradient Methods
policy-gradient
exam-topic
Jun 06, 2026
Policy Gradient Theorem
policy-gradient
theoretical-foundation
gradient-ascent
Jun 06, 2026
REINFORCE
policy-gradient
algorithm
monte-carlo
on-policy
Jun 06, 2026
Reinforcement Learning from Human Feedback
policy-gradient
deep-rl
exam-topic
Jun 06, 2026
Reward-Weighted Regression
policy-gradient
algorithm
offline-rl
exam-topic
Jun 06, 2026
Softmax Policy
policy-gradient
discrete-actions
stochastic
exploration
Jun 06, 2026
TD3
policy-gradient
deep-rl
actor-critic
exam-topic
Jun 06, 2026
Trust Region Policy Optimization (TRPO)
policy-gradient
deep-rl
optimization
exam-topic
Jun 06, 2026
Upside-Down RL
deep-rl
offline-rl
policy-gradient
exam-topic