Hugging Face Tokenizers library

GPTKB entity

Statements (29)
Predicate Object
gptkbp:instanceOf gptkb:software
gptkbp:availableOn gptkb:PyPI
gptkbp:developedBy gptkb:Hugging_Face
gptkbp:documentation https://huggingface.co/docs/tokenizers
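gptkbp:feature Unicode support
gptkbp:feature normalization
gptkbp:feature customizable pipelines
gptkbp:feature serialization
gptkbp:feature post-processing
gptkbp:feature fast tokenization
gptkbp:feature multi-threaded processing
gptkbp:feature pre-tokenization
gptkbp:feature training new tokenizers
gptkbp:feature pre-trained tokenizers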
https://www.w3.org/2000/01/rdf-schema#label Hugging Face Tokenizers library
gptkbp:integratesWith gptkb:Transformers_library
gptkbp:license gptkb:Apache_License_2.0
gptkbp:programmingLanguage gptkb:Python
gptkbp:programmingLanguage gptkb:Rust
gptkbp:purpose text tokenization
gptkbp:repository https://github.com/huggingface/tokenizers
gptkbp:supports gptkb:WordPiece
gptkbp:supports gptkb:BPE
gptkbp:supports gptkb:SentencePiece
gptkbp:supports Unigram
gptkbp:usedFor gptkb:machine_learning
gptkbp:usedFor natural language processing
gptkbp:bfsParent gptkb:Hugging_Face_Model_Hub
gptkbp:bfsLayer 7
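
Illustrative sketch

The feature statements above name four pipeline stages: normalization, pre-tokenization, a subword model such as WordPiece, and post-processing. The following is a minimal standard-library sketch of how such stages might chain together. It does not use or reproduce the library's API; every function name, the toy vocabulary, and the [CLS]/[SEP] markers are assumptions chosen for illustration only.

```python
import re
import unicodedata


def normalize(text):
    # Normalization stage: NFD-decompose, drop combining marks, lowercase.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def pre_tokenize(text):
    # Pre-tokenization stage: split into word runs and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def wordpiece(word, vocab, unk="[UNK]"):
    # Model stage: greedy longest-match-first lookup in the WordPiece style.
    # Pieces that continue a word carry a "##" prefix.
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word becomes unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def post_process(tokens):
    # Post-processing stage: wrap the sequence in hypothetical special tokens.
    return ["[CLS]"] + tokens + ["[SEP]"]


def tokenize(text, vocab):
    tokens = []
    for word in pre_tokenize(normalize(text)):
        tokens.extend(wordpiece(word, vocab))
    return post_process(tokens)


# Hypothetical toy vocabulary; a real vocabulary would be learned from a corpus.
TOY_VOCAB = {"cafe", "token", "##izer", "##s", "fast", "!"}

print(tokenize("Caf\u00e9 tokenizers!", TOY_VOCAB))
```

In this sketch each stage is a plain function, so any one of them can be swapped without touching the others, which is the idea behind the "customizable pipelines" statement. The library itself implements these stages in Rust and should be consulted through its documentation for the actual interfaces.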