Deep Reinforcement Learning from Human Preferences

GPTKB entity

Statements (47)
Predicate Object
gptkbp:instanceOf Research Paper
gptkbp:addresses Sample Efficiency
gptkbp:aimsTo Align AI behavior with human values
gptkbp:appliesTo Autonomous_Systems
gptkbp:author gptkb:Paul_Christiano
gptkbp:contributedTo AI Safety
gptkbp:designs New Training Algorithms
gptkbp:develops Reward_Models
gptkbp:discusses Ethical Implications
gptkbp:exhibits Improved Performance
gptkbp:focusesOn Reinforcement Learning
Human Feedback
https://www.w3.org/2000/01/rdf-schema#label Deep Reinforcement Learning from Human Preferences
gptkbp:includes Experimental Results
gptkbp:influences Future AI Development
gptkbp:involves Human_Evaluators
gptkbp:isChallengedBy Scalability Issues
Human Bias
Interpretability Issues
Generalization Problems
Data_Quality_Concerns
gptkbp:isCitedBy Numerous Subsequent Studies
gptkbp:isEvaluatedBy Benchmark Tests
gptkbp:isExploredIn Education
Entertainment
Finance
Healthcare
Marketing
Robotics
Security
Transportation
Game Playing
Various Domains
Social Good
gptkbp:isPartOf AI_Research_Community
gptkbp:isRelatedTo Artificial Intelligence
Machine Learning
Inverse Reinforcement Learning
Behavioral Cloning
Human-Computer_Interaction
gptkbp:isSupportedBy Empirical Evidence
Real-World Applications
Simulation Environments
Theoretical Frameworks
gptkbp:publishedIn NeurIPS 2017
gptkbp:relatedTo Traditional Reinforcement Learning
gptkbp:uses Preference-based Learning
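
The `gptkbp:uses Preference-based Learning` and `gptkbp:develops Reward_Models` statements refer to the paper's core method: fitting a reward model to human pairwise preferences over trajectory segments via the Bradley-Terry model, then training a policy against that learned reward. A minimal sketch of the reward-model fitting step follows; the linear reward parameterization, function names, and synthetic "human" labels are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def segment_return(theta, segment):
    """Predicted reward of a trajectory segment: sum of per-step rewards.
    Here the per-step reward is a linear function of state features
    (an illustrative simplification; the paper uses a neural network)."""
    return sum(float(theta @ s) for s in segment)

def preference_prob(theta, seg_a, seg_b):
    """Bradley-Terry probability that the human prefers seg_a over seg_b:
    a softmax over the two predicted segment returns."""
    ra, rb = segment_return(theta, seg_a), segment_return(theta, seg_b)
    m = max(ra, rb)  # subtract max for numerical stability
    ea, eb = np.exp(ra - m), np.exp(rb - m)
    return ea / (ea + eb)

def loss(theta, prefs):
    """Cross-entropy between predicted preferences and observed labels
    (each pair is ordered so the first segment is the preferred one)."""
    return -sum(np.log(preference_prob(theta, a, b) + 1e-12)
                for a, b in prefs)

# Synthetic data: a hidden "true" reward stands in for the human evaluator.
true_theta = np.array([1.0, -0.5])
segments = [rng.normal(size=(5, 2)) for _ in range(200)]
pairs = [(segments[2 * i], segments[2 * i + 1]) for i in range(100)]
prefs = [(a, b)
         if segment_return(true_theta, a) > segment_return(true_theta, b)
         else (b, a)
         for a, b in pairs]

# Fit theta by gradient descent on a central-difference numerical gradient.
theta = np.zeros(2)
for _ in range(200):
    grad = np.zeros(2)
    for j in range(2):
        e = np.zeros(2)
        e[j] = 1e-5
        grad[j] = (loss(theta + e, prefs) - loss(theta - e, prefs)) / 2e-5
    theta -= 0.05 * grad
```

After fitting, `theta` points in roughly the same direction as the hidden reward that generated the preference labels; in the paper this learned reward then serves as the training signal for a standard deep RL algorithm.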