BPE Tokenizers NLP Tools
There are 5 BPE tokenizer tools tracked. The highest-rated is ml-rust/splintr at 44/100, with 57 stars and 112 monthly downloads.
Get all 5 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=bpe-tokenizers&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
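The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and because the shape of the JSON response is not documented here, the code returns the parsed payload as-is rather than assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="bpe-tokenizers", limit=20):
    """Assemble the query URL (hypothetical helper, not part of any official client)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Send a GET request and return the decoded JSON payload unchanged.

    The response schema is an unknown here, so no keys are accessed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` issues one request against the keyless quota of 100 requests/day, so a script that polls regularly may want to cache the result locally.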
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | ml-rust/splintr | A high-performance tokenizer (BPE + SentencePiece) built with Rust with... | 44/100 | Emerging |
| 2 | georg-jung/FastBertTokenizer | Fast and memory-efficient library for WordPiece tokenization as it is used by BERT. | | Emerging |
| 3 | sanderland/script_tok | Code for the paper "BPE stays on SCRIPT" | | Emerging |
| 4 | ash-01xor/bpe.c | Simple Byte pair Encoding mechanism used for tokenization process. written... | | Experimental |
| 5 | deepanprabhu/fastbpe | Java library implementing Byte-Pair Encoding Tokenization | | Experimental |