hardesttype/switch-tokenizer
A multilingual tokenization approach that maps different language tokenizers to a shared vocabulary space, enabling efficient parameter usage through context-dependent token interpretation.
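To make the idea of context-dependent token interpretation concrete, here is a toy sketch. It is not the repository's implementation: the `ToyTokenizer` class, the two tiny vocabularies, and the `encode`/`decode` helpers are all hypothetical. It only illustrates how two language-specific tokenizers can reuse the same ID range, so that an ID has a meaning only once the language context is known.

```python
from dataclasses import dataclass


@dataclass
class ToyTokenizer:
    """Hypothetical whitespace tokenizer with its own token-to-ID table."""
    vocab: dict

    def encode(self, text: str) -> list:
        # Split on whitespace and look up each token's ID.
        return [self.vocab[token] for token in text.split()]

    def decode(self, ids: list) -> str:
        # Invert the table to map IDs back to tokens.
        inverse = {idx: token for token, idx in self.vocab.items()}
        return " ".join(inverse[idx] for idx in ids)


# Assumption for illustration: both languages reuse the same ID range 0..2,
# so one embedding table of size 3 would serve both.
TOKENIZERS = {
    "en": ToyTokenizer({"hello": 0, "world": 1, "cat": 2}),
    "es": ToyTokenizer({"hola": 0, "mundo": 1, "gato": 2}),
}


def encode(lang: str, text: str) -> list:
    """Encode text with the tokenizer selected by the language context."""
    return TOKENIZERS[lang].encode(text)


def decode(lang: str, ids: list) -> str:
    """Decode shared IDs; the result depends on the language context."""
    return TOKENIZERS[lang].decode(ids)


en_ids = encode("en", "hello world")
es_ids = encode("es", "hola mundo")
```

In this sketch both sentences map to the same ID sequence, and the language tag passed to `decode` decides which surface tokens come back.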
No commits in the last 6 months.
Stars: n/a
Forks: n/a
Language: Jupyter Notebook
License: MIT
Category: n/a
Last pushed: Apr 20, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hardesttype/switch-tokenizer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
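The same request can be sketched in Python using only the standard library. The URL is the one from the curl command above; the helper names `quality_url` and `fetch_quality` are hypothetical, the owner/repo URL pattern is inferred from that single example, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base path taken from the curl example; the owner/repo suffix pattern is an
# assumption inferred from that single URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality endpoint URL for a repository."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key.

    Assumes the response body is JSON; adjust the parsing if it is not.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("hardesttype", "switch-tokenizer")` would issue the same GET request as the curl command and count against the keyless daily limit.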
Higher-rated alternatives
lenML/tokenizers
A lightweight, dependency-free fork of transformers.js (tokenizers only)
aiqinxuancai/TiktokenSharp
Token calculation for OpenAI models, using the `o200k_base`, `cl100k_base`, and `p50k_base` encodings.
dqbd/tiktokenizer
Online playground for OpenAI tokenizers
pkoukk/tiktoken-go
Go version of tiktoken
tryAGI/Tiktoken
This project implements token calculation for OpenAI's gpt-4 and gpt-3.5-turbo model,...