Statements (53)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:philosophy |
gptkbp:addressedTo | AI alignment research, policy and governance, technical solutions, AI safety organizations |
gptkbp:alsoKnownAs | AI alignment problem |
gptkbp:category | gptkb:artificial_intelligence, existential risk, ethics of artificial intelligence |
gptkbp:challenge | corrigibility, value alignment, instrumental convergence, reward hacking, preventing unintended consequences, robustness to distributional shift, specifying correct goals for AI |
gptkbp:concerns | controlling advanced artificial intelligence |
gptkbp:debatedBy | policy makers, philosophers, AI researchers, technology ethicists |
gptkbp:describedBy | gptkb:Superintelligence:_Paths,_Dangers,_Strategies |
gptkbp:field | AI safety, machine ethics |
gptkbp:firstDiscussed | 20th century |
gptkbp:goal | ensure AI acts in accordance with human values |
gptkbp:hasSubproblem | scalable oversight, value alignment problem, avoiding negative side effects, avoiding reward tampering, capability control problem, corrigibility problem, motivation selection problem, reward specification problem, robustness to adversarial inputs, safe interruptibility |
https://www.w3.org/2000/01/rdf-schema#label | AI control problem |
gptkbp:importantFor | high for future of humanity |
gptkbp:notablePublication | gptkb:Superintelligence:_Paths,_Dangers,_Strategies, Concrete Problems in AI Safety, The Off-Switch Game |
gptkbp:organization | gptkb:DeepMind, gptkb:OpenAI, gptkb:Future_of_Humanity_Institute, gptkb:Machine_Intelligence_Research_Institute, gptkb:Center_for_Human-Compatible_AI |
gptkbp:relatedTo | artificial general intelligence, existential risk from artificial intelligence |
gptkbp:studiedBy | gptkb:Nick_Bostrom, gptkb:Stuart_Russell, gptkb:Eliezer_Yudkowsky |
gptkbp:bfsParent | gptkb:Agent_Foundations |
gptkbp:bfsLayer | 6 |
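The statements above are predicate–object pairs on a single subject, so they map directly onto RDF triples. The sketch below is a minimal, hypothetical example using Python's rdflib: the namespace IRIs behind the `gptkb:`/`gptkbp:` prefixes and the subject identifier are assumptions (the page only shows the prefixes), and only a handful of the 53 statements are reproduced.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

# Assumed namespace IRIs for the gptkb:/gptkbp: prefixes (not given on this page).
GPTKB = Namespace("https://gptkb.org/entity/")
GPTKBP = Namespace("https://gptkb.org/property/")

g = Graph()
g.bind("gptkb", GPTKB)
g.bind("gptkbp", GPTKBP)

# Hypothetical subject identifier for the entity labeled "AI control problem".
subject = GPTKB["AI_control_problem"]

# A few of the statements from the table above.
g.add((subject, RDFS.label, Literal("AI control problem")))
g.add((subject, GPTKBP.instanceOf, GPTKB.philosophy))
g.add((subject, GPTKBP.alsoKnownAs, Literal("AI alignment problem")))
for challenge in ["corrigibility", "value alignment", "reward hacking"]:
    g.add((subject, GPTKBP.challenge, Literal(challenge)))

# List every challenge recorded for the subject, then dump the graph as Turtle.
for obj in g.objects(subject, GPTKBP.challenge):
    print(obj)
print(g.serialize(format="turtle"))
```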