Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
URI: https://gptkb.org/entity/Transformer-XL:_Attentive_Language_Models_Beyond_a_Fixed-Length_Context
GPTKB entity
Statements (29)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:academic_journal |
| gptkbp:arXivID | 1901.02860 |
| gptkbp:author | gptkb:Zihang_Dai, gptkb:Zhilin_Yang, gptkb:Yiming_Yang, gptkb:Jaime_Carbonell, gptkb:Quoc_V._Le, gptkb:Ruslan_Salakhutdinov |
| gptkbp:bench | gptkb:WikiText-103, One Billion Word, enwik8 |
| gptkbp:citation | high (over 2000 citations as of 2024) |
| gptkbp:contribution | improves language modeling performance; introduces a novel relative positional encoding scheme; introduces a segment-level recurrence mechanism (see the sketch after this table); enables learning dependencies beyond a fixed-length context |
| gptkbp:field | gptkb:machine_learning, deep learning, natural language processing |
| gptkbp:improves | previous state-of-the-art models on language modeling benchmarks |
| gptkbp:influenced | subsequent transformer architectures |
| gptkbp:language | English |
| gptkbp:proposedBy | Transformer-XL model |
| gptkbp:publicationYear | 2019 |
| gptkbp:publishedIn | gptkb:arXiv |
| gptkbp:repository | https://github.com/kimiyoung/transformer-xl |
| gptkbp:bfsParent | gptkb:Transformer-XL |
| gptkbp:bfsLayer | 8 |
| https://www.w3.org/2000/01/rdf-schema#label | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
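The contribution statements above mention a segment-level recurrence mechanism. As a rough illustration only (not code from the paper or the linked repository), the minimal PyTorch sketch below shows the core idea: hidden states from the previous segment are cached with gradients stopped and reused as extra attention context for the next segment, so dependencies can span segment boundaries. All class and variable names here are hypothetical, and the paper's relative positional encoding is omitted for brevity.

```python
import torch
import torch.nn as nn
from typing import Optional


class RecurrentSegmentEncoder(nn.Module):
    """Minimal sketch of segment-level recurrence (illustrative, not the authors' code)."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Linear(d_model, d_model)

    def forward(self, segment: torch.Tensor, memory: Optional[torch.Tensor]):
        # Keys/values attend over [cached memory ; current segment];
        # queries come only from the current segment.
        context = segment if memory is None else torch.cat([memory, segment], dim=1)
        out, _ = self.attn(segment, context, context, need_weights=False)
        out = self.ff(out)
        # Cache the new hidden states with gradients stopped: the memory is
        # reused by the next segment but not back-propagated into.
        new_memory = out.detach()
        return out, new_memory


if __name__ == "__main__":
    # Usage: process a long sequence segment by segment, carrying the memory forward.
    torch.manual_seed(0)
    encoder = RecurrentSegmentEncoder()
    memory = None
    long_input = torch.randn(1, 4 * 16, 64)  # batch=1, four segments of length 16
    for seg in long_input.split(16, dim=1):
        out, memory = encoder(seg, memory)
    print(out.shape)  # torch.Size([1, 16, 64])
```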