sophgo/LLM-TPU
Run generative AI models on Sophgo BM1684X/BM1688
This project integrates and runs generative AI models, primarily large language models (LLMs) and vision-language models (VLMs), on Sophgo's BM1684X and BM1688 AI chips. It takes pre-trained models as input and produces optimized versions that run efficiently on this specialized hardware, enabling on-device AI capabilities. It is aimed at AI application developers, hardware integrators, and embedded systems engineers who want to deploy advanced AI models on Sophgo hardware.
Use this if you need to deploy and run sophisticated generative AI models, especially large language models or multimodal models, efficiently on Sophgo BM1684X or BM1688 AI chips.
Not ideal if you are working with AI models that are not primarily generative LLMs or VLMs, or if your target deployment hardware is not a Sophgo BM1684X/BM1688 chip.
Stars: 271
Forks: 48
Language: C++
License: —
Category:
Last pushed: Mar 09, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sophgo/LLM-TPU"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
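For programmatic use, the curl call above can be wrapped in a small helper. This is a minimal sketch, not official client code: the endpoint path is taken verbatim from this page, but the response schema is not documented here, so the parsed JSON is returned as-is rather than mapped to named fields.

```python
import json
import urllib.request

# Base path copied from the curl example on this page.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-API URL for a given GitHub owner/repo pair."""
    return f"{BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """GET the quality record and parse the JSON body.

    Assumes the endpoint returns a JSON object; the exact fields
    (stars, forks, etc.) are not specified here, so inspect the
    result before relying on any particular key.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (performs a network request):
#   data = fetch_quality("sophgo", "LLM-TPU")
#   print(data)
```

Without a key this counts against the 100-requests/day limit; how a key is supplied (header vs. query parameter) is not stated on this page, so it is left out of the sketch.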
Related models
PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
skyzh/tiny-llm
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny...
ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
AXERA-TECH/ax-llm
Explore LLM model deployment based on AXera's AI chips