Statements (29)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:software |
| gptkbp:availableOn | gptkb:GitHub |
| gptkbp:developedBy | gptkb:Hugging_Face |
| gptkbp:feature | decoding, Unicode support, normalization, customizable pipelines, serialization, export to JSON, fast tokenization, integration with Python via bindings, multi-threaded processing, pre-tokenization, training new tokenizers |
| gptkbp:integratesWith | gptkb:Transformers_library |
| gptkbp:license | gptkb:Apache_License_2.0 |
| gptkbp:npmPackage | tokenizers |
| gptkbp:openSource | true |
| gptkbp:programmingLanguage | gptkb:Python, gptkb:Rust |
| gptkbp:purpose | text tokenization |
| gptkbp:supports | gptkb:WordPiece, gptkb:Byte-Pair_Encoding, gptkb:SentencePiece, Unigram |
| gptkbp:usedFor | natural language processing |
| gptkbp:bfsParent | gptkb:Hugging_Face |
| gptkbp:bfsLayer | 6 |
| https://www.w3.org/2000/01/rdf-schema#label | Tokenizers library |
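The statements above list training new tokenizers, pre-tokenization, fast tokenization, serialization, and export to JSON among the library's features. A minimal sketch of how these might be exercised through the Python bindings is shown below; it assumes the `tokenizers` package is installed from PyPI, and the corpus file name `corpus.txt` is a hypothetical placeholder.

```python
# Minimal sketch using the Hugging Face Tokenizers Python bindings.
# Assumes `pip install tokenizers`; "corpus.txt" is a hypothetical placeholder file.
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import Whitespace

# Build a Byte-Pair Encoding tokenizer with a whitespace pre-tokenizer.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Train a new tokenizer on a local text corpus.
trainer = BpeTrainer(vocab_size=30000, special_tokens=["[UNK]", "[CLS]", "[SEP]"])
tokenizer.train(files=["corpus.txt"], trainer=trainer)

# Fast tokenization: encode text to tokens and ids, then decode back.
encoding = tokenizer.encode("Tokenizers library from Hugging Face")
print(encoding.tokens)
print(encoding.ids)
print(tokenizer.decode(encoding.ids))

# Serialization: export the full tokenizer definition to JSON and reload it.
tokenizer.save("tokenizer.json")
reloaded = Tokenizer.from_file("tokenizer.json")
```

The same tokenizer definitions can be loaded by the Transformers library (listed under gptkbp:integratesWith) for use in downstream natural language processing pipelines.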