gptkbp:instance_of
|
gptkb:language_model
|
gptkbp:application
|
language translation
text generation
text summarization
text classification
|
gptkbp:architecture
|
gptkb:Transformers
|
gptkbp:attention_heads
|
gptkb:16
|
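The 16 attention heads above operate on BERT-Large's 1024-dimensional hidden states, so each head works in a 64-dimensional subspace. A minimal sketch of the shape arithmetic (pure Python, illustrative only — not a framework implementation):

```python
# BERT-Large attention geometry: 1024-dim hidden states split across 16 heads.
HIDDEN_SIZE = 1024
NUM_HEADS = 16

# Each head attends in its own lower-dimensional subspace.
head_dim = HIDDEN_SIZE // NUM_HEADS  # 64 dims per head

# Q, K, V projections plus the output projection are each hidden x hidden.
attention_weight_params_per_layer = 4 * HIDDEN_SIZE * HIDDEN_SIZE

assert head_dim * NUM_HEADS == HIDDEN_SIZE
```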
gptkbp:available_on
|
gptkb:Hugging_Face
gptkb:Py_Torch_Hub
gptkb:Tensor_Flow_Hub
|
gptkbp:batch_size
|
gptkb:32
|
gptkbp:num_layers
|
gptkb:24
|
gptkbp:community_support
|
high
|
gptkbp:developed_by
|
gptkb:Google
|
gptkbp:dropout_rate
|
0.1
|
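The 0.1 above is BERT's dropout probability, applied to attention weights and hidden states during training. A minimal sketch of inverted dropout (stdlib only; the scaling-at-train-time convention is standard but the helper itself is illustrative):

```python
import random

def inverted_dropout(values, p=0.1, rng=None, training=True):
    """Inverted dropout: zero each unit with probability p and rescale
    survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]
```

At inference time (`training=False`) the input passes through unchanged, which is why the rescaling happens during training.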
gptkbp:evaluates
|
gptkb:GLUE
gptkb:Co_NLL-2003
gptkb:SQu_AD
gptkb:MNLI
|
gptkbp:field_of_study
|
gptkb:Deep_Learning
gptkb:NLP
gptkb:AI_technology
|
gptkbp:hidden_size
|
1024
|
https://www.w3.org/2000/01/rdf-schema#label
|
BERT-Large
|
gptkbp:impact
|
gptkb:significant
|
gptkbp:influenced_by
|
gptkb:ELMo
gptkb:GPT
|
gptkbp:initialization
|
truncated normal initialization (stddev 0.02)
|
gptkbp:input_output
|
contextual embeddings
tokenized text
|
gptkbp:implemented_in
|
gptkb:Tensor_Flow
gptkb:Py_Torch
|
gptkbp:is_compared_to
|
gptkb:Ro_BERTa
gptkb:XLNet
gptkb:BERT-Base
ALBERT
|
gptkbp:is_open_source
|
gptkb:true
|
gptkbp:optimizer
|
gptkb:Adam
|
gptkbp:is_tasked_with
|
natural language understanding
question answering
sentiment analysis
named entity recognition
|
gptkbp:learning_rate
|
2e-5
|
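The 2e-5 above is the fine-tuning learning rate reported for BERT, paired in practice with linear warmup followed by linear decay. A sketch of that schedule shape (the `warmup_steps` default here is illustrative, not a published value):

```python
def lr_at(step, total_steps, peak_lr=2e-5, warmup_steps=100):
    """Linear warmup to peak_lr over warmup_steps, then linear decay
    to zero by total_steps (BERT-style triangular schedule)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```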
gptkbp:trained_on
|
gptkb:Wikipedia
gptkb:Book_Corpus
|
gptkbp:language
|
English
|
gptkbp:losses
|
cross-entropy loss
|
gptkbp:max_sequence_length
|
512
|
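Inputs longer than the 512-token maximum above must be truncated, and shorter ones padded with an attention mask marking the real tokens. A minimal sketch (pad id 0 matches BERT's [PAD] convention; [CLS]/[SEP] handling is omitted for brevity):

```python
def pad_or_truncate(token_ids, max_len=512, pad_id=0):
    """Clip a sequence to max_len, or right-pad a shorter one.
    Returns (padded ids, attention mask with 1 for real tokens)."""
    ids = token_ids[:max_len]
    attention_mask = [1] * len(ids) + [0] * (max_len - len(ids))
    return ids + [pad_id] * (max_len - len(ids)), attention_mask
```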
gptkbp:parameters
|
345 million
|
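The parameter count above can be roughly reproduced from the published hyperparameters (24 layers, hidden size 1024, 30,522-entry WordPiece vocabulary). A back-of-envelope sketch that lands in the same ballpark as the quoted figure (it omits task heads, so it comes in slightly lower):

```python
L, H = 24, 1024          # layers, hidden size (BERT-Large)
VOCAB = 30522            # English WordPiece vocabulary size
MAX_POS, SEG = 512, 2    # position and segment embeddings
FFN = 4 * H              # feed-forward inner size

embeddings = (VOCAB + MAX_POS + SEG) * H + 2 * H   # + embedding LayerNorm
per_layer = (
    4 * (H * H + H)      # Q, K, V, output projections (+ biases)
    + (H * FFN + FFN)    # FFN up-projection
    + (FFN * H + H)      # FFN down-projection
    + 2 * 2 * H          # two LayerNorms (scale + bias)
)
pooler = H * H + H
total = embeddings + L * per_layer + pooler  # roughly 3.35e8
```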
gptkbp:output_layer
|
softmax
|
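The softmax output layer above turns logits into a probability distribution over classes (or over the vocabulary, for masked-token prediction). A minimal numerically stable sketch:

```python
import math

def softmax(logits):
    """Numerically stable softmax: shift by the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]
```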
gptkbp:performance
|
state-of-the-art
|
gptkbp:release_date
|
gptkb:2018
|
gptkbp:variant
|
gptkb:BERT-Base
|
gptkbp:tokenizer
|
WordPiece
|
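WordPiece segments each word greedily, matching the longest vocabulary entry first and marking continuation pieces with a `##` prefix. A sketch of that longest-match loop over a toy vocabulary (the real tokenizer uses the full 30k vocabulary and extra normalization):

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first WordPiece segmentation of one word.
    Continuation pieces carry the '##' prefix; unmatchable words -> [UNK]."""
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                cur = piece
                break
            end -= 1
        if cur is None:
            return ["[UNK]"]
        pieces.append(cur)
        start = end
    return pieces
```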
gptkbp:training
|
self-supervised pre-training
supervised fine-tuning
|
gptkbp:training_data_size
|
3.3 billion words
|
gptkbp:tuning
|
possible
|
gptkbp:usage
|
widely adopted
|
gptkbp:uses
|
masked language modeling
next sentence prediction
|
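Masked language modeling, listed above, corrupts roughly 15% of input positions: of the selected tokens, 80% become [MASK], 10% a random token, and 10% stay unchanged, and the model must predict the originals. A minimal sketch of that corruption step (the mask id and vocabulary size are passed in; values here are illustrative):

```python
import random

def mask_for_mlm(token_ids, vocab_size, mask_id, rng, mask_prob=0.15):
    """BERT-style MLM corruption: pick ~mask_prob of positions; of those,
    80% -> [MASK], 10% -> random token, 10% left unchanged.
    Returns the corrupted sequence and the prediction-target positions."""
    corrupted, targets = list(token_ids), []
    for i in range(len(token_ids)):
        if rng.random() >= mask_prob:
            continue
        targets.append(i)
        r = rng.random()
        if r < 0.8:
            corrupted[i] = mask_id
        elif r < 0.9:
            corrupted[i] = rng.randrange(vocab_size)
        # else: keep the original token (still a prediction target)
    return corrupted, targets
```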
gptkbp:bfsParent
|
gptkb:BERT
|
gptkbp:bfsLayer
|
5
|