bytedance/lightseq
LightSeq: A High Performance Library for Sequence Processing and Generation
Archived. Built on CUDA with custom fused kernels optimized for Transformer architectures, LightSeq supports fp16 and int8 mixed-precision training and inference for BERT, GPT, ViT, and other sequence models. It integrates with Fairseq, Hugging Face, and DeepSpeed, provides a TensorRT Inference Server backend for production deployment, and includes decoding algorithms such as beam search and CRF.
3,304 stars. No commits in the last 6 months.
Stars: 3,304
Forks: 335
Language: C++
License: —
Category: —
Last pushed: May 16, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/bytedance/lightseq"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
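For scripted access, the curl command above can be reproduced in Python with only the standard library. This is a minimal sketch: the response schema is not documented on this page, so the example only builds the endpoint URL and decodes the JSON body, and the `repo_url`/`fetch_quality` helper names are illustrative, not part of the API.

```python
import json
import urllib.request

# Base endpoint as shown in the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def repo_url(owner: str, name: str) -> str:
    """Build the endpoint URL for a given GitHub repo slug."""
    return f"{BASE}/{owner}/{name}"

def fetch_quality(owner: str, name: str) -> dict:
    """Fetch the quality record as a dict.

    Anonymous access is rate-limited to 100 requests/day; how an API
    key is passed (header vs. query parameter) is not documented here.
    """
    with urllib.request.urlopen(repo_url(owner, name)) as resp:
        return json.load(resp)

print(repo_url("bytedance", "lightseq"))
```

Calling `fetch_quality("bytedance", "lightseq")` performs the same request as the curl line above and returns the parsed JSON.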
Higher-rated alternatives
ggml-org/ggml
Tensor library for machine learning
onnx/ir-py
Efficient in-memory representation for ONNX, in Python
SandAI-org/MagiCompiler
A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.
R-D-BioTech-Alaska/Qelm
Qelm - Quantum Enhanced Language Model
kekzl/imp
High-performance LLM inference engine in C++/CUDA for NVIDIA Blackwell GPUs (RTX 5090)