| Property | Value(s) |
|---|---|
| gptkbp:instanceOf | Natural Language Processing Task |
| gptkbp:category | Language Model Pretraining |
| gptkbp:compatibleWith | Bidirectional Transformer Encoders (in contrast to gptkb:GPT, which is trained with Causal Language Modeling) |
| gptkbp:enables | Bidirectional Context Learning |
| rdfs:label | Masked Language Modeling |
| gptkbp:improves | Contextual Representations |
| gptkbp:input | Text Sequence |
| gptkbp:inspiredBy | Cloze Test |
| gptkbp:introducedIn | 2018 |
| gptkbp:language | English, Multilingual |
| gptkbp:lossFunction | Cross-Entropy Loss |
| gptkbp:maskingStrategy | Random Masking, Span Masking, Whole Word Masking |
| gptkbp:maskToken | [MASK] |
| gptkbp:objective | Predict Masked Words |
| gptkbp:output | Predicted Tokens |
| gptkbp:percentageMasked | 15% |
| gptkbp:proposedBy | Devlin et al. |
| gptkbp:relatedTo | Cloze Task, Self-Supervised Learning |
| gptkbp:trainer | gptkb:Wikipedia, gptkb:BookCorpus |
| gptkbp:usedFor | Pretraining Language Models |
| gptkbp:usedIn | gptkb:BERT, gptkb:ALBERT, gptkb:DistilBERT, gptkb:RoBERTa |
| gptkbp:bfsParent | gptkb:BERT:_Pre-training_of_Deep_Bidirectional_Transformers_for_Language_Understanding |
| gptkbp:bfsLayer | 6 |
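Taken together, the rows above specify a concrete procedure: select roughly 15% of the input tokens (gptkbp:percentageMasked), corrupt them around the [MASK] token, and train the model to predict the original tokens under a cross-entropy loss. Below is a minimal plain-Python sketch of this objective following BERT's published 80/10/10 replacement rule (Devlin et al., 2018); the names `mask_tokens`, `mlm_loss`, and `vocab` are illustrative, not part of any library API.

```python
import math
import random

MASK_TOKEN = "[MASK]"
MASK_PROB = 0.15  # gptkbp:percentageMasked

def mask_tokens(tokens, vocab, rng):
    """BERT-style Random Masking: each token is selected with
    probability 0.15; a selected token becomes [MASK] 80% of the
    time, a random vocabulary token 10% of the time, and stays
    unchanged 10% of the time. Labels mark positions to predict."""
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < MASK_PROB:
            labels.append(tok)                    # predicted -> contributes to loss
            r = rng.random()
            if r < 0.8:
                inputs.append(MASK_TOKEN)         # masked
            elif r < 0.9:
                inputs.append(rng.choice(vocab))  # random replacement
            else:
                inputs.append(tok)                # kept as-is, still predicted
        else:
            inputs.append(tok)
            labels.append(None)                   # ignored by the loss
    return inputs, labels

def mlm_loss(predicted_probs, labels):
    """Cross-Entropy Loss averaged over the selected positions only.
    predicted_probs stands in for a model's softmax output: one dict
    per input position mapping token -> probability."""
    terms = [-math.log(predicted_probs[i][tok])
             for i, tok in enumerate(labels) if tok is not None]
    return sum(terms) / len(terms) if terms else 0.0

rng = random.Random(0)
vocab = ["the", "cat", "sat", "on", "mat"]
inputs, labels = mask_tokens("the cat sat on the mat".split(), vocab, rng)
print(inputs)  # tokens with ~15% corrupted per the 80/10/10 rule
print(labels)  # original token at predicted positions, None elsewhere
```

Note that the loss is averaged only over the selected positions; all other tokens serve purely as visible bidirectional context, which is what gptkbp:enables (Bidirectional Context Learning) refers to.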
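Of the three listed masking strategies, Random Masking selects individual tokens as above, Span Masking selects contiguous runs of tokens, and Whole Word Masking selects all subword pieces of a word together. A minimal sketch of the whole-word variant, assuming the WordPiece convention that continuation pieces carry a `##` prefix (the 80/10/10 split is omitted for brevity):

```python
import random

def whole_word_mask(pieces, rng, mask_prob=0.15):
    """Whole Word Masking: group WordPieces into words ('##' marks a
    continuation piece), then select whole words for prediction."""
    words, inputs, labels = [], list(pieces), [None] * len(pieces)
    for i, p in enumerate(pieces):
        if p.startswith("##") and words:
            words[-1].append(i)   # continuation piece joins the current word
        else:
            words.append([i])     # a new word starts here
    for word in words:
        if rng.random() < mask_prob:
            for i in word:        # mask every piece of the selected word
                labels[i] = pieces[i]
                inputs[i] = "[MASK]"
    return inputs, labels

inputs, labels = whole_word_mask(
    ["embed", "##ding", "layer", "works"], random.Random(1))
print(inputs)  # both pieces of 'embedding' are masked together
print(labels)  # ['embed', '##ding', None, None]
```

Masking whole words (or, in Span Masking, whole spans) removes the shortcut of reconstructing a subword from its visible sibling pieces, forcing the model to rely on the surrounding context instead.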