graykode/xlnet-Pytorch
Simple XLNet implementation with Pytorch Wrapper
Implements XLNet's permutation language modeling objective with Transformer-XL backbone, supporting configurable partial prediction masking, bidirectional data augmentation, and memory caching for long-context sequences. Uses SentencePiece/BERT tokenizers and exposes pretraining hyperparameters (sequence length, reuse length, permutation size, mask groups) as CLI arguments for reproducible experiments. Designed for PyTorch with minimal dependencies, enabling educational exploration of XLNet's two-stream self-attention and target-aware representations on custom datasets.
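The permutation language modeling objective mentioned above can be illustrated with a short sketch. This is not code from the repository; the function name and the NumPy formulation are illustrative. Given one factorization order (a permutation of positions), it builds the two attention masks behind XLNet's two-stream self-attention: the query stream may attend only to positions earlier in the permutation (never itself, since its own content is the prediction target), while the content stream additionally sees itself.

```python
import numpy as np

def permutation_masks(perm):
    """Build query- and content-stream attention masks for one
    factorization order `perm` (a permutation of 0..n-1).

    mask[i, j] == True means position i may attend to position j.
    """
    n = len(perm)
    # rank[p] = where position p falls in the factorization order
    rank = np.empty(n, dtype=int)
    for order, pos in enumerate(perm):
        rank[pos] = order
    # Query stream: attend to positions strictly earlier in the order.
    query_mask = rank[None, :] < rank[:, None]
    # Content stream: same, plus the position itself.
    content_mask = query_mask | np.eye(n, dtype=bool)
    return query_mask, content_mask

# Example: factorization order 2 -> 0 -> 3 -> 1 over a length-4 sequence.
q, c = permutation_masks([2, 0, 3, 1])
```

In the example, position 2 comes first in the order, so its query stream attends to nothing, while position 1 comes last and attends to all three other positions.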
581 stars. No commits in the last 6 months.
Stars: 581
Forks: 104
Language: Jupyter Notebook
License: Apache-2.0
Category: (not set)
Last pushed: Jul 03, 2019
Commits (last 30 days): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/graykode/xlnet-Pytorch"
Open to everyone: 100 requests/day with no key required. Get a free key for 1,000 requests/day.
Higher-rated alternatives
luozhouyang/transformers-keras
Transformer-based models implemented in TensorFlow 2.x (using Keras).
xv44586/toolkit4nlp
transformers implement (architecture, task example, serving and more)
ufal/neuralmonkey
An open-source tool for sequence learning in NLP built on TensorFlow.
uzaymacar/attention-mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language...
budzianowski/PyTorch-Beam-Search-Decoding
PyTorch implementation of beam search decoding for seq2seq models