Statements (23)
Predicate | Object |
---|---|
gptkbp:instanceOf | reinforcement learning algorithm |
gptkbp:appliesTo | control systems, robotics, game playing |
gptkbp:canBe | off-policy, on-policy |
gptkbp:hasComponent | actor, critic |
gptkbp:hasVariant | gptkb:A2C, gptkb:A3C, gptkb:DDPG, gptkb:TD3, gptkb:SAC |
https://www.w3.org/2000/01/rdf-schema#label | Actor-Critic |
gptkbp:introducedIn | 1980s |
gptkbp:learnsPolicy | actor |
gptkbp:learnsValueFunction | critic |
gptkbp:type | temporal difference learning, policy gradient method |
gptkbp:usedIn | gptkb:artificial_intelligence, gptkb:machine_learning |
gptkbp:bfsParent | gptkb:reinforcement_learning |
gptkbp:bfsLayer | 5 |
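
The statements above say that the actor learns the policy, the critic learns the value function, and the method combines temporal difference learning with a policy gradient. A minimal sketch of that split follows, assuming a tabular one-step on-policy variant. The five-state corridor task, the names `step` and `train`, and all hyperparameters are hypothetical choices made for illustration; none of them come from the entry itself.

```python
import math
import random

N_STATES = 5   # hypothetical toy corridor: states 0..4, state 4 is terminal
GAMMA = 0.9    # discount factor (assumed for this sketch)


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action):
    """Action 0 moves left, 1 moves right; reward 1 on reaching the last state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, actor_lr=0.5, critic_lr=0.1, max_steps=200, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # actor: action preferences
    value = [0.0] * N_STATES                       # critic: state values
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            probs = softmax(theta[state])
            action = 0 if rng.random() < probs[0] else 1
            nxt, reward, done = step(state, action)

            # Critic: one-step temporal difference error and value update.
            target = reward + (0.0 if done else GAMMA * value[nxt])
            td_error = target - value[state]
            value[state] += critic_lr * td_error

            # Actor: policy gradient step, with the TD error standing in
            # for the advantage; grad is d log pi(action|state) / d theta.
            for a in range(2):
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += actor_lr * td_error * grad

            state = nxt
            if done:
                break
    return theta, value
```

Calling `theta, value = train()` and then `softmax(theta[0])` gives the learned action probabilities in the start state. In this sketch the critic's TD error is the only learning signal the actor receives, which is the coupling the `gptkbp:type` row describes.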