bytedance/lightseq
LightSeq: A High Performance Library for Sequence Processing and Generation
Archived. Built on CUDA with custom fused kernels optimized for Transformer architectures, LightSeq supports fp16 and int8 mixed-precision training and inference for BERT, GPT, ViT, and other sequence models. It integrates with Fairseq, Hugging Face, and DeepSpeed, provides a TensorRT Inference Server backend for production deployment, and includes decoding algorithms such as beam search and CRF.
3,304 stars. No commits in the last 6 months.
Stars: 3,304
Forks: 335
Language: C++
License: —
Category: —
Last pushed: May 16, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/bytedance/lightseq"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
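For scripted access, the curl command above can be reproduced in Python with only the standard library. This is a minimal sketch: the response schema is not documented on this page, so the example only builds the endpoint URL and decodes the JSON body, and the `repo_url`/`fetch_quality` helper names are illustrative, not part of the API.

```python
import json
import urllib.request

# Base endpoint as shown in the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def repo_url(owner: str, name: str) -> str:
    """Build the endpoint URL for a given GitHub repo slug."""
    return f"{BASE}/{owner}/{name}"

def fetch_quality(owner: str, name: str) -> dict:
    """Fetch the quality record as a dict.

    Anonymous access is rate-limited to 100 requests/day; how an API
    key is passed (header vs. query parameter) is not documented here.
    """
    with urllib.request.urlopen(repo_url(owner, name)) as resp:
        return json.load(resp)

print(repo_url("bytedance", "lightseq"))
```

Calling `fetch_quality("bytedance", "lightseq")` performs the same request as the curl line above and returns the parsed JSON.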
Higher-rated alternatives
ggml-org/ggml
Tensor library for machine learning
onnx/ir-py
Efficient in-memory representation for ONNX, in Python
SandAI-org/MagiCompiler
A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.
R-D-BioTech-Alaska/Qelm
Qelm - Quantum Enhanced Language Model
kekzl/imp
High-performance LLM inference engine in C++/CUDA for NVIDIA Blackwell GPUs (RTX 5090)