AI control problem

GPTKB entity

Statements (53)
Predicate Object
gptkbp:instanceOf gptkb:philosophy
gptkbp:addressedTo AI alignment research
    policy and governance
    technical solutions
    AI safety organizations
gptkbp:alsoKnownAs AI alignment problem
gptkbp:category gptkb:artificial_intelligence
    existential risk
    ethics of artificial intelligence
gptkbp:challenge corrigibility
    value alignment
    instrumental convergence
    reward hacking
    preventing unintended consequences
    robustness to distributional shift
    specifying correct goals for AI
gptkbp:concerns controlling advanced artificial intelligence
gptkbp:debatedBy policy makers
    philosophers
    AI researchers
    technology ethicists
gptkbp:describedBy gptkb:Superintelligence:_Paths,_Dangers,_Strategies
gptkbp:field AI safety
    machine ethics
gptkbp:firstDiscussed 20th century
gptkbp:goal ensure AI acts in accordance with human values
gptkbp:hasSubproblem scalable oversight
    value alignment problem
    avoiding negative side effects
    avoiding reward tampering
    capability control problem
    corrigibility problem
    motivation selection problem
    reward specification problem
    robustness to adversarial inputs
    safe interruptibility
https://www.w3.org/2000/01/rdf-schema#label AI control problem
gptkbp:importantFor future of humanity (high importance)
gptkbp:notablePublication gptkb:Superintelligence:_Paths,_Dangers,_Strategies
    Concrete Problems in AI Safety
    The Off-Switch Game
gptkbp:organization gptkb:DeepMind
    gptkb:OpenAI
    gptkb:Future_of_Humanity_Institute
    gptkb:Machine_Intelligence_Research_Institute
    gptkb:Center_for_Human-Compatible_AI
gptkbp:relatedTo artificial general intelligence
    existential risk from artificial intelligence
gptkbp:studiedBy gptkb:Nick_Bostrom
    gptkb:Stuart_Russell
    gptkb:Eliezer_Yudkowsky
gptkbp:bfsParent gptkb:Agent_Foundations
gptkbp:bfsLayer 6
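
The statements above are subject-predicate-object triples, so they map directly onto RDF. Below is a minimal sketch, using Python's rdflib, of loading a few of these statements and querying one predicate; the namespace IRIs and the choice to store plain-text objects as literals are assumptions made for illustration, not GPTKB's actual vocabulary or endpoint.

# Minimal sketch, assuming hypothetical namespace IRIs: load a few of the
# statements above as RDF triples with rdflib and query one predicate.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

GPTKB = Namespace("https://example.org/gptkb/entity/")     # assumed entity namespace
GPTKBP = Namespace("https://example.org/gptkb/property/")  # assumed property namespace

g = Graph()
g.bind("gptkb", GPTKB)
g.bind("gptkbp", GPTKBP)

entity = GPTKB["AI_control_problem"]

# A handful of the statements listed above, expressed as triples.
g.add((entity, RDFS.label, Literal("AI control problem")))
g.add((entity, GPTKBP.instanceOf, GPTKB["philosophy"]))
g.add((entity, GPTKBP.alsoKnownAs, Literal("AI alignment problem")))
g.add((entity, GPTKBP.hasSubproblem, Literal("scalable oversight")))
g.add((entity, GPTKBP.hasSubproblem, Literal("safe interruptibility")))
g.add((entity, GPTKBP.studiedBy, GPTKB["Stuart_Russell"]))

# List every subproblem recorded for the entity.
for _, _, obj in g.triples((entity, GPTKBP.hasSubproblem, None)):
    print(obj)

# Serialize the whole graph to Turtle for inspection.
print(g.serialize(format="turtle"))

Multi-valued predicates such as gptkbp:hasSubproblem simply become repeated triples with the same subject and predicate, which is why they appear as grouped rows in the statement listing above.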