gptkbp:instance_of
|
gptkb:language_model
|
gptkbp:application
|
language translation
text generation
text summarization
text classification
|
gptkbp:architecture
|
gptkb:Transformers
|
gptkbp:attention_heads
|
gptkb:16
|
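The 16 attention heads above operate on BERT-Large's 1024-dimensional hidden states, so each head works in a 64-dimensional subspace. A minimal sketch of the shape arithmetic (pure Python, illustrative only — not a framework implementation):

```python
# BERT-Large attention geometry: 1024-dim hidden states split across 16 heads.
HIDDEN_SIZE = 1024
NUM_HEADS = 16

# Each head attends in its own lower-dimensional subspace.
head_dim = HIDDEN_SIZE // NUM_HEADS  # 64 dims per head

# Q, K, V projections plus the output projection are each hidden x hidden.
attention_weight_params_per_layer = 4 * HIDDEN_SIZE * HIDDEN_SIZE

assert head_dim * NUM_HEADS == HIDDEN_SIZE
```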
gptkbp:available_on
|
gptkb:Hugging_Face
gptkb:Py_Torch_Hub
gptkb:Tensor_Flow_Hub
|
gptkbp:batch_size
|
gptkb:32
|
gptkbp:num_layers
|
gptkb:24
|
gptkbp:community_support
|
high
|
gptkbp:developed_by
|
gptkb:Google
|
gptkbp:dropout_rate
|
0.1
|
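The 0.1 above is BERT's dropout probability, applied to attention weights and hidden states during training. A minimal sketch of inverted dropout (stdlib only; the scaling-at-train-time convention is standard but the helper itself is illustrative):

```python
import random

def inverted_dropout(values, p=0.1, rng=None, training=True):
    """Inverted dropout: zero each unit with probability p and rescale
    survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]
```

At inference time (`training=False`) the input passes through unchanged, which is why the rescaling happens during training.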
gptkbp:evaluates
|
gptkb:GLUE
gptkb:Co_NLL-2003
gptkb:SQu_AD
gptkb:MNLI
|
gptkbp:field_of_study
|
gptkb:Deep_Learning
gptkb:NLP
gptkb:AI_technology
|
gptkbp:hidden_size
|
1024
|
https://www.w3.org/2000/01/rdf-schema#label
|
BERT-Large
|
gptkbp:impact
|
gptkb:significant
|
gptkbp:influenced_by
|
gptkb:ELMo
gptkb:GPT
|
gptkbp:initialization
|
truncated normal initialization (stddev 0.02)
|
gptkbp:input_output
|
contextual embeddings
tokenized text
|
gptkbp:implemented_in
|
gptkb:Tensor_Flow
gptkb:Py_Torch
|
gptkbp:is_compared_to
|
gptkb:Ro_BERTa
gptkb:XLNet
gptkb:BERT-Base
ALBERT
|
gptkbp:is_open_source
|
gptkb:true
|
gptkbp:optimizer
|
gptkb:Adam
|
gptkbp:is_tasked_with
|
natural language understanding
question answering
sentiment analysis
named entity recognition
|
gptkbp:learning_rate
|
2e-5
|
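The 2e-5 above is the fine-tuning learning rate reported for BERT, paired in practice with linear warmup followed by linear decay. A sketch of that schedule shape (the `warmup_steps` default here is illustrative, not a published value):

```python
def lr_at(step, total_steps, peak_lr=2e-5, warmup_steps=100):
    """Linear warmup to peak_lr over warmup_steps, then linear decay
    to zero by total_steps (BERT-style triangular schedule)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```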
gptkbp:trained_on
|
gptkb:Wikipedia
gptkb:Book_Corpus
|
gptkbp:language
|
English
|
gptkbp:losses
|
cross-entropy loss
|
gptkbp:max_sequence_length
|
512
|
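Inputs longer than the 512-token maximum above must be truncated, and shorter ones padded with an attention mask marking the real tokens. A minimal sketch (pad id 0 matches BERT's [PAD] convention; [CLS]/[SEP] handling is omitted for brevity):

```python
def pad_or_truncate(token_ids, max_len=512, pad_id=0):
    """Clip a sequence to max_len, or right-pad a shorter one.
    Returns (padded ids, attention mask with 1 for real tokens)."""
    ids = token_ids[:max_len]
    attention_mask = [1] * len(ids) + [0] * (max_len - len(ids))
    return ids + [pad_id] * (max_len - len(ids)), attention_mask
```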
gptkbp:parameters
|
345 million
|
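The parameter count above can be roughly reproduced from the published hyperparameters (24 layers, hidden size 1024, 30,522-entry WordPiece vocabulary). A back-of-envelope sketch that lands in the same ballpark as the quoted figure (it omits task heads, so it comes in slightly lower):

```python
L, H = 24, 1024          # layers, hidden size (BERT-Large)
VOCAB = 30522            # English WordPiece vocabulary size
MAX_POS, SEG = 512, 2    # position and segment embeddings
FFN = 4 * H              # feed-forward inner size

embeddings = (VOCAB + MAX_POS + SEG) * H + 2 * H   # + embedding LayerNorm
per_layer = (
    4 * (H * H + H)      # Q, K, V, output projections (+ biases)
    + (H * FFN + FFN)    # FFN up-projection
    + (FFN * H + H)      # FFN down-projection
    + 2 * 2 * H          # two LayerNorms (scale + bias)
)
pooler = H * H + H
total = embeddings + L * per_layer + pooler  # roughly 3.35e8
```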
gptkbp:output_layer
|
softmax
|
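The softmax output layer above turns logits into a probability distribution over classes (or over the vocabulary, for masked-token prediction). A minimal numerically stable sketch:

```python
import math

def softmax(logits):
    """Numerically stable softmax: shift by the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]
```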
gptkbp:performance
|
state-of-the-art
|
gptkbp:release_date
|
gptkb:2018
|
gptkbp:variant
|
gptkb:BERT-Base
|
gptkbp:tokenizer
|
WordPiece
|
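WordPiece segments each word greedily, matching the longest vocabulary entry first and marking continuation pieces with a `##` prefix. A sketch of that longest-match loop over a toy vocabulary (the real tokenizer uses the full 30k vocabulary and extra normalization):

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first WordPiece segmentation of one word.
    Continuation pieces carry the '##' prefix; unmatchable words -> [UNK]."""
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                cur = piece
                break
            end -= 1
        if cur is None:
            return ["[UNK]"]
        pieces.append(cur)
        start = end
    return pieces
```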
gptkbp:training
|
self-supervised pre-training
supervised fine-tuning
|
gptkbp:training_data_size
|
3.3 billion words
|
gptkbp:tuning
|
possible
|
gptkbp:usage
|
widely adopted
|
gptkbp:uses
|
masked language modeling
next sentence prediction
|
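Masked language modeling, listed above, corrupts roughly 15% of input positions: of the selected tokens, 80% become [MASK], 10% a random token, and 10% stay unchanged, and the model must predict the originals. A minimal sketch of that corruption step (the mask id and vocabulary size are passed in; values here are illustrative):

```python
import random

def mask_for_mlm(token_ids, vocab_size, mask_id, rng, mask_prob=0.15):
    """BERT-style MLM corruption: pick ~mask_prob of positions; of those,
    80% -> [MASK], 10% -> random token, 10% left unchanged.
    Returns the corrupted sequence and the prediction-target positions."""
    corrupted, targets = list(token_ids), []
    for i in range(len(token_ids)):
        if rng.random() >= mask_prob:
            continue
        targets.append(i)
        r = rng.random()
        if r < 0.8:
            corrupted[i] = mask_id
        elif r < 0.9:
            corrupted[i] = rng.randrange(vocab_size)
        # else: keep the original token (still a prediction target)
    return corrupted, targets
```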
gptkbp:bfsParent
|
gptkb:BERT
|
gptkbp:bfsLayer
|
5
|