Transformer Circuits

GPTKB entity

Statements (51)
Predicate Object
gptkbp:instanceOf Research Project
gptkbp:associatedWith gptkb:OpenAI
gptkbp:associatedWith gptkb:Anthropic
gptkbp:field gptkb:Machine_Learning
gptkbp:field gptkb:artificial_intelligence
gptkbp:focusesOn gptkb:Neural_Networks
gptkbp:focusesOn Transformer Models
gptkbp:focusesOn Interpretability
gptkbp:goal Reverse Engineering Neural Networks
gptkbp:goal Understanding Internal Mechanisms of Transformers
https://www.w3.org/2000/01/rdf-schema#label Transformer Circuits
gptkbp:influenced gptkb:Interpretability_Tools
gptkbp:influenced gptkb:Mechanistic_Interpretability_Community
gptkbp:influenced AI Alignment Research
gptkbp:influencedBy gptkb:Feature_Visualization_in_CNNs
gptkbp:influencedBy gptkb:DeepDream
gptkbp:influencedBy Neuroscience
gptkbp:language English
gptkbp:method gptkb:Activation_Patching
gptkbp:method Ablation Studies
gptkbp:method Circuit Decomposition
gptkbp:method Feature Attribution
gptkbp:method Visualization of Weights
gptkbp:notableContributor gptkb:Nick_Schiefer
gptkbp:notableContributor gptkb:Gabriel_Goh
gptkbp:notableContributor gptkb:Shan_Carter
gptkbp:notableContributor gptkb:Catherine_Olsson
gptkbp:notableContributor gptkb:Chris_Olah
gptkbp:notableContributor gptkb:Nelson_Elhage
gptkbp:notableContributor gptkb:Tom_Henighan
gptkbp:notableContributor gptkb:Ludwig_Schubert
gptkbp:notableFinds gptkb:Induction_Heads_in_Transformers
gptkbp:notableFinds Algorithmic Circuits in Language Models
gptkbp:notableFinds Modularity in Transformer Layers
gptkbp:notableFinds Superposition of Features
gptkbp:notablePublication gptkb:Transformer_Circuits_Thread
gptkbp:notablePublication gptkb:A_Mathematical_Framework_for_Transformer_Circuits
gptkbp:relatedConcept gptkb:Attention_Heads
gptkbp:relatedConcept gptkb:Induction_Heads
gptkbp:relatedConcept gptkb:Language_Models
gptkbp:relatedConcept gptkb:MLP_Neurons
gptkbp:relatedConcept gptkb:Mechanistic_Interpretability
gptkbp:relatedConcept gptkb:GPT-2
gptkbp:relatedConcept gptkb:Feature_Visualization
gptkbp:relatedConcept gptkb:BERT
gptkbp:relatedConcept Circuit Analysis
gptkbp:relatedConcept Superposition in Neural Networks
gptkbp:startDate 2021
gptkbp:website https://transformer-circuits.pub
gptkbp:bfsParent gptkb:Neel_Nanda
gptkbp:bfsLayer 7