Statements (28)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:software |
gptkbp:developedBy | gptkb:Google |
gptkbp:documentation | https://github.com/google/sentencepiece |
gptkbp:feature | Unicode support, language independent, subword tokenization, trainable on raw text |
gptkbp:firstReleased | 2018 |
https://www.w3.org/2000/01/rdf-schema#label | SentencePiece |
gptkbp:license | gptkb:Apache_License_2.0 |
gptkbp:npmPackage | sentencepiece |
gptkbp:platform | cross-platform |
gptkbp:programmingLanguage | gptkb:Python, gptkb:C++ |
gptkbp:purpose | text segmentation, unsupervised text tokenizer |
gptkbp:repository | https://github.com/google/sentencepiece |
gptkbp:supportsAlgorithm | gptkb:Byte-Pair_Encoding, gptkb:Unigram_Language_Model |
gptkbp:usedIn | machine translation, natural language processing, language modeling |
gptkbp:bfsParent | gptkb:WordPiece, gptkb:T5, gptkb:XLM-R, gptkb:Mixtral_8x7B, gptkb:Mixtral |
gptkbp:bfsLayer | 6 |
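
The statements above list Byte-Pair Encoding as a supported algorithm and describe the tokenizer as trainable on raw text. As a rough illustration of that idea, here is a minimal pure-Python sketch of BPE merge learning. It is a toy, not SentencePiece's implementation or API: the function names (`to_symbols`, `merge_pair`, `train_bpe`, `encode`, `decode`) are hypothetical, the use of "▁" (U+2581) as a visible whitespace marker is a simplification assumed here, and merges are allowed to cross word boundaries for brevity.

```python
from collections import Counter

SPACE = "\u2581"  # "▁": visible stand-in for a space (assumption for this sketch)


def to_symbols(text):
    """Treat raw text as one character sequence, with spaces made visible."""
    return list(text.replace(" ", SPACE))


def merge_pair(symbols, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of raw sentences."""
    sequences = [to_symbols(line) for line in corpus]
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole corpus.
        counts = Counter()
        for seq in sequences:
            counts.update(zip(seq, seq[1:]))
        if not counts:
            break  # nothing left to merge
        # Pick the most frequent pair; ties are broken by the pair itself.
        best = max(counts, key=lambda p: (counts[p], p))
        merges.append(best)
        sequences = [merge_pair(seq, best) for seq in sequences]
    return merges


def encode(text, merges):
    """Apply the learned merges, in training order, to new text."""
    symbols = to_symbols(text)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


def decode(pieces):
    """Concatenate pieces and turn the whitespace marker back into spaces."""
    return "".join(pieces).replace(SPACE, " ")
```

Because whitespace is kept as an ordinary symbol, `decode(encode(text, merges))` returns the original text for any input that does not already contain "▁", with no language-specific pre-tokenization step.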