graykode/xlnet-Pytorch
Simple XLNet implementation with Pytorch Wrapper
Implements XLNet's permutation language modeling objective with Transformer-XL backbone, supporting configurable partial prediction masking, bidirectional data augmentation, and memory caching for long-context sequences. Uses SentencePiece/BERT tokenizers and exposes pretraining hyperparameters (sequence length, reuse length, permutation size, mask groups) as CLI arguments for reproducible experiments. Designed for PyTorch with minimal dependencies, enabling educational exploration of XLNet's two-stream self-attention and target-aware representations on custom datasets.
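The permutation language modeling objective mentioned above can be illustrated with a short sketch. This is not code from the repository; the function name and the NumPy formulation are illustrative. Given one factorization order (a permutation of positions), it builds the two attention masks behind XLNet's two-stream self-attention: the query stream may attend only to positions earlier in the permutation (never itself, since its own content is the prediction target), while the content stream additionally sees itself.

```python
import numpy as np

def permutation_masks(perm):
    """Build query- and content-stream attention masks for one
    factorization order `perm` (a permutation of 0..n-1).

    mask[i, j] == True means position i may attend to position j.
    """
    n = len(perm)
    # rank[p] = where position p falls in the factorization order
    rank = np.empty(n, dtype=int)
    for order, pos in enumerate(perm):
        rank[pos] = order
    # Query stream: attend to positions strictly earlier in the order.
    query_mask = rank[None, :] < rank[:, None]
    # Content stream: same, plus the position itself.
    content_mask = query_mask | np.eye(n, dtype=bool)
    return query_mask, content_mask

# Example: factorization order 2 -> 0 -> 3 -> 1 over a length-4 sequence.
q, c = permutation_masks([2, 0, 3, 1])
```

In the example, position 2 comes first in the order, so its query stream attends to nothing, while position 1 comes last and attends to all three other positions.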
581 stars. No commits in the last 6 months.
Stars: 581
Forks: 104
Language: Jupyter Notebook
License: Apache-2.0
Category: (not set)
Last pushed: Jul 03, 2019
Commits (last 30 days): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/graykode/xlnet-Pytorch"
Open to everyone: 100 requests/day with no key required. Get a free key for 1,000 requests/day.
Higher-rated alternatives
luozhouyang/transformers-keras
Transformer-based models implemented in TensorFlow 2.x (using Keras).
xv44586/toolkit4nlp
transformers implement (architecture, task example, serving and more)
ufal/neuralmonkey
An open-source tool for sequence learning in NLP built on TensorFlow.
uzaymacar/attention-mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language...
budzianowski/PyTorch-Beam-Search-Decoding
PyTorch implementation of beam search decoding for seq2seq models