Statements (24)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:tokenization_algorithm |
| gptkbp:advantage | handles unknown words, reduces vocabulary size |
| gptkbp:developedBy | gptkb:Google |
| gptkbp:input | gptkb:text |
| gptkbp:introducedIn | 2016 |
| gptkbp:output | tokens |
| gptkbp:purpose | subword tokenization |
| gptkbp:relatedTo | gptkb:Byte_Pair_Encoding, gptkb:SentencePiece |
| gptkbp:splitsWordsInto | subword units |
| gptkbp:usedFor | natural language processing |
| gptkbp:usedIn | gptkb:BERT, gptkb:ALBERT, gptkb:DistilBERT, machine translation, speech recognition, question answering, text classification |
| gptkbp:bfsParent | gptkb:BPE, gptkb:Tokenizers_library, gptkb:BERT |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | WordPiece |
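
The statements above say that WordPiece splits words into subword units and handles unknown words. The sketch below illustrates one common way this is done at tokenization time: greedy longest-match-first lookup against a fixed vocabulary, with a `##` prefix marking word-internal pieces and an `[UNK]` fallback, following the convention used by BERT-style tokenizers. The function names, the whitespace pre-splitting, and the toy vocabulary are assumptions made for illustration. They are not taken from any particular library, and the sketch does not cover how the vocabulary itself is learned.

```python
def wordpiece_split(word, vocab, unk_token="[UNK]", prefix="##"):
    """Split one word into subword units by greedy longest-match-first lookup."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, shrinking from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that do not begin the word carry the continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No vocabulary entry covers this position: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, for illustration only.
TOY_VOCAB = {"un", "##aff", "##able", "play", "##ing", "##s"}


def tokenize(text, vocab=TOY_VOCAB):
    """Lowercase, split on whitespace, then apply WordPiece splitting per word."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece_split(word, vocab))
    return tokens


if __name__ == "__main__":
    print(tokenize("unaffable playing"))
```

With this toy vocabulary, `unaffable` is covered by three pieces and `playing` by two, while a word with no matching pieces collapses to a single `[UNK]` token. A small vocabulary of reusable pieces can therefore cover many surface forms, which is the vocabulary-size advantage listed in the table.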