Sparse Transformer: Generating Text with Sparse Attention

GPTKB entity