Hugging Face Tokenizers library
GPTKB entity
Statements (29)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:software |
gptkbp:availableOn | gptkb:PyPI |
gptkbp:developedBy | gptkb:Hugging_Face |
gptkbp:documentation | https://huggingface.co/docs/tokenizers |
gptkbp:feature | Unicode support, normalization, customizable pipelines, serialization, post-processing, fast tokenization, multi-threaded processing, pre-tokenization, training new tokenizers, pre-trained tokenizers |
https://www.w3.org/2000/01/rdf-schema#label | Hugging Face Tokenizers library |
gptkbp:integratesWith | gptkb:Transformers_library |
gptkbp:license | gptkb:Apache_License_2.0 |
gptkbp:programmingLanguage | gptkb:Python, gptkb:Rust |
gptkbp:purpose | text tokenization |
gptkbp:repository | https://github.com/huggingface/tokenizers |
gptkbp:supports | gptkb:WordPiece, gptkb:BPE, gptkb:SentencePiece, Unigram |
gptkbp:usedFor | gptkb:machine_learning, natural language processing |
gptkbp:bfsParent | gptkb:Hugging_Face_Model_Hub |
gptkbp:bfsLayer | 7 |
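
The feature and supports statements name normalization, pre-tokenization, training new tokenizers, and BPE. The following is a minimal standard-library sketch of how those stages can fit together. It is a toy illustration only, not the library's implementation or API: the function names (`normalize`, `pre_tokenize`, `train_bpe`, `merge_pair`, `encode`), the choice of NFKC plus lowercasing, and the word/punctuation split are all assumptions made for this example.

```python
import re
import unicodedata
from collections import Counter


def normalize(text):
    """Normalization stage (assumed here: Unicode NFKC plus lowercasing)."""
    return unicodedata.normalize("NFKC", text).lower()


def pre_tokenize(text):
    """Pre-tokenization stage: split into words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def merge_pair(symbols, pair):
    """Replace each non-overlapping occurrence of `pair` with the joined symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` BPE merge rules from an iterable of strings."""
    # Each distinct word starts as a tuple of characters, weighted by frequency.
    words = Counter()
    for line in corpus:
        for word in pre_tokenize(normalize(line)):
            words[tuple(word)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across all words.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is a single symbol; nothing left to merge
        # Most frequent pair wins; ties are broken by the pair itself
        # so that the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[merge_pair(symbols, best)] += freq
        words = new_words
    return merges


def encode(text, merges):
    """Tokenize `text` by replaying the learned merges in training order."""
    tokens = []
    for word in pre_tokenize(normalize(text)):
        symbols = tuple(word)
        for pair in merges:
            symbols = merge_pair(symbols, pair)
        tokens.extend(symbols)
    return tokens


if __name__ == "__main__":
    # Hypothetical toy corpus, chosen only to make the merges visible.
    corpus = ["low low low lower", "newest newest widest"]
    rules = train_bpe(corpus, num_merges=10)
    print(rules)
    print(encode("Lowest, newest!", rules))
```

Because the merge rules are plain tuples, they could be written to disk and reloaded, which is the idea behind the serialization feature listed above. The real library performs these stages in Rust and exposes them through its own classes; consult the documentation URL in the table for the actual interface.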