Statements (24)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:software |
| gptkbp:developedBy | gptkb:Google |
| gptkbp:documentation | https://github.com/google/sentencepiece |
| gptkbp:feature | Unicode support, language independent, subword tokenization, trainable on raw text |
| gptkbp:firstReleased | 2018 |
| gptkbp:license | gptkb:Apache_License_2.0 |
| gptkbp:npmPackage | sentencepiece |
| gptkbp:platform | cross-platform |
| gptkbp:programmingLanguage | gptkb:Python, gptkb:C++ |
| gptkbp:purpose | text segmentation, unsupervised text tokenizer |
| gptkbp:repository | https://github.com/google/sentencepiece |
| gptkbp:supportsAlgorithm | gptkb:Byte-Pair_Encoding, gptkb:Unigram_Language_Model |
| gptkbp:usedIn | machine translation, natural language processing, language modeling |
| gptkbp:bfsParent | gptkb:T5 |
| gptkbp:bfsLayer | 6 |
| https://www.w3.org/2000/01/rdf-schema#label | SentencePiece |