Transformer Circuits

URI: https://gptkb.org/entity/Transformer_Circuits

GPTKB entity

Statements (51)

Predicate	Object
gptkbp:instanceOf	gptkb:Research_Project
gptkbp:associatedWith	gptkb:OpenAI; gptkb:Anthropic
gptkbp:field	gptkb:Machine_Learning; gptkb:artificial_intelligence
gptkbp:focusesOn	gptkb:Neural_Networks; Transformer Models; Interpretability
gptkbp:goal	Reverse Engineering Neural Networks; Understanding Internal Mechanisms of Transformers
gptkbp:influenced	gptkb:Interpretability_Tools; gptkb:Mechanistic_Interpretability_Community; AI Alignment Research
gptkbp:influencedBy	gptkb:Feature_Visualization_in_CNNs; gptkb:DeepDream; Neuroscience
gptkbp:language	English
gptkbp:method	gptkb:Activation_Patching; Ablation Studies; Circuit Decomposition; Feature Attribution; Visualization of Weights
gptkbp:notableContributor	gptkb:Nick_Schiefer; gptkb:Gabriel_Goh; gptkb:Shan_Carter; gptkb:Catherine_Olsson; gptkb:Chris_Olah; gptkb:Nelson_Elhage; gptkb:Tom_Henighan; gptkb:Ludwig_Schubert
gptkbp:notableFinds	gptkb:Induction_Heads_in_Transformers; Algorithmic Circuits in Language Models; Modularity in Transformer Layers; Superposition of Features
gptkbp:notablePublication	gptkb:Transformer_Circuits_Thread; gptkb:A_Mathematical_Framework_for_Transformer_Circuits
gptkbp:relatedConcept	gptkb:Attention_Heads; gptkb:Induction_Heads; gptkb:Language_Models; gptkb:MLP_Neurons; gptkb:Mechanistic_Interpretability; gptkb:GPT-2; gptkb:Feature_Visualization; gptkb:BERT; Circuit Analysis; Superposition in Neural Networks
gptkbp:startDate	2021
gptkbp:website	https://transformer-circuits.pub
gptkbp:bfsParent	gptkb:Neel_Nanda
gptkbp:bfsLayer	7
https://www.w3.org/2000/01/rdf-schema#label	Transformer Circuits