BERT-Large

GPTKB entity

Statements (63)
Predicate Object
gptkbp:instance_of gptkb:language_model
gptkbp:application language translation
text generation
text summarization
text classification
gptkbp:architecture gptkb:Transformers
gptkbp:attention_heads gptkb:16
gptkbp:available_on gptkb:Hugging_Face
gptkb:Py_Torch_Hub
gptkb:Tensor_Flow_Hub
gptkbp:batch_size gptkb:32
gptkbp:num_layers gptkb:24
gptkbp:community_support high
gptkbp:developed_by gptkb:Google
gptkbp:dropout 0.1
gptkbp:evaluates gptkb:GLUE
gptkb:Co_NLL-2003
gptkb:SQu_AD
gptkb:MNLI
gptkbp:field_of_study gptkb:Deep_Learning
gptkb:NLP
gptkb:AI_technology
gptkbp:hidden_size 1024
https://www.w3.org/2000/01/rdf-schema#label BERT-Large
gptkbp:impact gptkb:significant
gptkbp:influenced_by gptkb:ELMo
gptkb:GPT
gptkbp:initialization Xavier initialization
gptkbp:input_output contextual embeddings
tokenized text
gptkbp:is_a_framework_for gptkb:Tensor_Flow
gptkb:Py_Torch
gptkbp:is_compared_to gptkb:Ro_BERTa
gptkb:XLNet
gptkb:BERT-Base
ALBERT
gptkbp:is_open_source gptkb:true
gptkbp:is_optimized_for gptkb:Adam
gptkbp:is_tasked_with natural language understanding
question answering
sentiment analysis
named entity recognition
gptkbp:learning_rate 2e-5
gptkbp:is_trained_in gptkb:Wikipedia
gptkb:Book_Corpus
gptkbp:language English
gptkbp:losses cross-entropy loss
gptkbp:max_sequence_length 512
gptkbp:parameter_count 345 million
gptkbp:output_layer softmax
gptkbp:performance state-of-the-art
gptkbp:release_date gptkb:2018
gptkbp:smaller_variant gptkb:BERT-Base
gptkbp:tokenizer WordPiece
gptkbp:training supervised
unsupervised
gptkbp:training_data_size 3.3 billion words
gptkbp:tuning possible
gptkbp:usage widely adopted
gptkbp:uses masked language modeling
next sentence prediction
gptkbp:bfsParent gptkb:BERT
gptkbp:bfsLayer 5
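
The size-related statements above (hidden size 1024, 16 attention heads, maximum sequence length 512) can be tied together with a rough parameter count. The sketch below is illustrative only: it takes the listed value 24 as the number of encoder layers, and it assumes a vocabulary of 30522 tokens, a feed-forward width of 4096, and 2 segment types, none of which appear on this page. The function name `bert_param_count` is hypothetical.

```python
# Rough parameter count for a BERT-style encoder (a sketch, not an official formula).
# Taken from the statements above: hidden size 1024, 16 heads, max length 512.
# ASSUMPTIONS not stated on this page: 24 is the layer count, vocab_size=30522,
# intermediate (feed-forward) size=4096, type_vocab=2 segment embeddings.

def bert_param_count(hidden, layers, max_len, vocab_size, intermediate, type_vocab=2):
    layer_norm = 2 * hidden  # scale and bias vectors
    # Token, position, and segment embeddings, followed by one LayerNorm.
    embeddings = (vocab_size + max_len + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Two dense layers: hidden -> intermediate -> hidden, each with a bias.
    feed_forward = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden)
    per_layer = attention + layer_norm + feed_forward + layer_norm
    # Dense pooling layer applied to the first token's representation.
    pooler = hidden * hidden + hidden
    return embeddings + layers * per_layer + pooler

HIDDEN, HEADS, LAYERS, MAX_LEN = 1024, 16, 24, 512

head_dim = HIDDEN // HEADS  # width of each attention head
total = bert_param_count(HIDDEN, LAYERS, MAX_LEN, vocab_size=30522, intermediate=4096)

print(f"per-head dimension: {head_dim}")
print(f"approximate encoder parameters: {total:,}")
```

Under these assumptions the count comes to roughly 335 million, close to but not the same as the 345 million listed above. The exact figure depends on the vocabulary and on which task heads are counted.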