gptkbp:instanceOf
|
gptkb:model
large language model
|
gptkbp:architecture
|
encoder-only transformer
|
gptkbp:author
|
gptkb:Pengcheng_He
gptkb:Weizhu_Chen
gptkb:Xiaodong_Liu
gptkb:Jianfeng_Gao
|
gptkbp:availableOn
|
gptkb:Hugging_Face_Model_Hub
|
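The checkpoints listed above can be loaded directly from the Hugging Face Model Hub. A minimal sketch, assuming the transformers library is installed and using the published base checkpoint ID "microsoft/deberta-base":

```python
# Minimal sketch: load a DeBERTa checkpoint from the Hugging Face Model Hub.
# Assumes the transformers library; "microsoft/deberta-base" is the published
# base checkpoint ID.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModel.from_pretrained("microsoft/deberta-base")

inputs = tokenizer("DeBERTa decouples content and position.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```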
gptkbp:basedOn
|
transformer architecture
|
gptkbp:citation
|
1000+
|
gptkbp:developedBy
|
gptkb:Microsoft_Research
|
gptkbp:github
|
https://github.com/microsoft/DeBERTa
|
gptkbp:hasVersion
|
gptkb:DeBERTa-Base
gptkb:DeBERTa-Large
gptkb:DeBERTa-XLarge
gptkb:DeBERTa-XXLarge
gptkb:DeBERTa-v2
gptkb:DeBERTa-v3
|
https://www.w3.org/2000/01/rdf-schema#label
|
DeBERTa
|
gptkbp:improves
|
gptkb:BERT
gptkb:RoBERTa
|
gptkbp:introducedIn
|
2020 (arXiv preprint; published at ICLR 2021)
|
gptkbp:language
|
English
|
gptkbp:license
|
gptkb:MIT_License
|
gptkbp:notableFeature
|
better handling of word order
disentangled attention
enhanced mask decoder
improved generalization
|
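The disentangled attention feature listed above decomposes the attention score between positions i and j into content and relative-position terms. A sketch of the decomposition in the notation of the ICLR 2021 paper, where Q^c and K^c are content projections, Q^r and K^r are projections of relative-position embeddings, and delta(i, j) is the bucketed relative distance:

```latex
\tilde{A}_{i,j} =
  \underbrace{Q^{c}_{i}\,{K^{c}_{j}}^{\top}}_{\text{content-to-content}}
+ \underbrace{Q^{c}_{i}\,{K^{r}_{\delta(i,j)}}^{\top}}_{\text{content-to-position}}
+ \underbrace{K^{c}_{j}\,{Q^{r}_{\delta(j,i)}}^{\top}}_{\text{position-to-content}},
\qquad
H_{o} = \operatorname{softmax}\!\left(\frac{\tilde{A}}{\sqrt{3d}}\right) V^{c}
```

The 1/sqrt(3d) scaling generalizes the usual 1/sqrt(d) factor to the three score terms.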
gptkbp:notablePublication
|
gptkb:DeBERTa:_Decoding-enhanced_BERT_with_Disentangled_Attention
|
gptkbp:openSource
|
true
|
gptkbp:parameter
|
up to 1.5B
|
gptkbp:publishedIn
|
gptkb:ICLR_2021
|
gptkbp:standsFor
|
gptkb:Decoding-enhanced_BERT_with_Disentangled_Attention
|
gptkbp:supports
|
transfer learning
fine-tuning
|
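For the transfer-learning setting above, a minimal fine-tuning sketch with PyTorch and transformers; the two-example batch, labels, and learning rate are placeholders for illustration:

```python
# Hedged sketch: one fine-tuning step of DeBERTa for binary classification.
# A randomly initialized classification head is attached to the pretrained encoder.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-base", num_labels=2)  # new, untrained head

batch = tokenizer(["a great movie", "a terrible movie"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])  # toy labels for illustration

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss  # cross-entropy over the new head
loss.backward()
optimizer.step()
print(float(loss))
```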
gptkbp:tokenizerType
|
byte-level BPE (DeBERTa v1); SentencePiece (DeBERTa v2 and v3)
|
gptkbp:trainedOn
|
large text corpora (Wikipedia, BookCorpus, OpenWebText, Stories; ~78 GB of deduplicated text for DeBERTa v1)
|
gptkbp:usedFor
|
question answering
natural language understanding
text classification
named entity recognition
|
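Fine-tuned DeBERTa checkpoints can be applied to the tasks listed above through the transformers pipeline API. A sketch assuming the MNLI-finetuned checkpoint "microsoft/deberta-large-mnli" is available on the Hub:

```python
# Sketch: zero-shot text classification with an NLI-finetuned DeBERTa model.
# "microsoft/deberta-large-mnli" is assumed to be the published checkpoint ID;
# substitute any DeBERTa-based NLI model.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="microsoft/deberta-large-mnli")
print(classifier("DeBERTa tops the SuperGLUE leaderboard.",
                 candidate_labels=["machine learning", "sports", "cooking"]))
```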
gptkbp:usedIn
|
gptkb:GLUE
gptkb:SQuAD
gptkb:SuperGLUE
NLP benchmarks
|
gptkbp:uses
|
disentangled attention mechanism
enhanced mask decoder
|
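As a rough illustration of the disentangled attention mechanism listed above, here is a toy single-head score computation in PyTorch. It is a sketch under simplifying assumptions (no batching, no multi-head reshaping, random weights), not the official implementation:

```python
# Toy sketch of DeBERTa-style disentangled attention scores.
# Simplifications: single head, no batch dimension, random projections.
import math
import torch

def rel_bucket(n, k):
    # delta(i, j): relative distance i - j clipped into the bucket range [0, 2k)
    i = torch.arange(n)[:, None]
    j = torch.arange(n)[None, :]
    return torch.clamp(i - j, -k, k - 1) + k

def disentangled_scores(H, P, Wqc, Wkc, Wqr, Wkr, k):
    """H: (n, d) content states; P: (2k, d) relative-position embeddings."""
    n, d = H.shape
    idx = rel_bucket(n, k)                # (n, n) bucketed distances
    Qc, Kc = H @ Wqc, H @ Wkc             # content projections
    Qr, Kr = P @ Wqr, P @ Wkr             # relative-position projections
    c2c = Qc @ Kc.T                       # content-to-content term
    c2p = (Qc @ Kr.T).gather(1, idx)      # content-to-position: Qc[i] . Kr[d(i,j)]
    p2c = (Kc @ Qr.T).gather(1, idx).T    # position-to-content: Kc[j] . Qr[d(j,i)]
    return (c2c + c2p + p2c) / math.sqrt(3 * d)

n, d, k = 6, 8, 4
H, P = torch.randn(n, d), torch.randn(2 * k, d)
W = lambda: torch.randn(d, d) / math.sqrt(d)
attn = torch.softmax(disentangled_scores(H, P, W(), W(), W(), W(), k), dim=-1)
print(attn.shape)  # torch.Size([6, 6])
```

The three terms mirror the score decomposition shown earlier; the position-to-position term is dropped, as in the paper.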
gptkbp:bfsParent
|
gptkb:transformation
|
gptkbp:bfsLayer
|
5
|