beehive-lab/GPULlama3.java
GPU-accelerated Llama3.java inference in pure Java using TornadoVM.
Leverages TornadoVM's JIT compilation to automatically translate native Java tensor operations to OpenCL or NVIDIA PTX, supporting multiple model architectures (Llama3, Mistral, Qwen, Phi, Granite) in GGUF format. Integrates as an official model provider in LangChain4j and Quarkus, enabling GPU-accelerated inference within existing Java AI frameworks without additional glue code.
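As background on the translation step: TornadoVM compiles ordinary Java loops into GPU kernels. A minimal sketch of the kind of tensor loop involved, written in plain JDK Java (the real project additionally uses TornadoVM's `@Parallel` annotation and task-graph API, shown here only as a comment since they require the TornadoVM dependency):

```java
public class MatVec {
    // Matrix-vector multiply over a row-major float matrix. In TornadoVM code
    // the outer loop would carry the @Parallel annotation, letting the JIT map
    // each row to a separate GPU work-item when generating OpenCL or PTX.
    static void matVec(float[] m, float[] x, float[] y, int rows, int cols) {
        for (int i = 0; i < rows; i++) { // @Parallel in TornadoVM code
            float sum = 0f;
            for (int j = 0; j < cols; j++) {
                sum += m[i * cols + j] * x[j];
            }
            y[i] = sum;
        }
    }

    public static void main(String[] args) {
        float[] m = {1, 2, 3, 4}; // 2x2 matrix, row-major
        float[] x = {1, 1};
        float[] y = new float[2];
        matVec(m, x, y, 2, 2);
        System.out.println(y[0] + " " + y[1]); // 3.0 7.0
    }
}
```

The same method body runs unchanged on the CPU, which is the point: TornadoVM's JIT retargets it rather than requiring a rewrite in a kernel language.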
Stars: 238
Forks: 28
Language: Java
License: MIT
Category:
Last pushed: Mar 11, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/beehive-lab/GPULlama3.java"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
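The same endpoint can be called from Java with the standard `java.net.http` client. A minimal sketch using only the curl URL above (the shape of the JSON response is not documented here, so this just builds and prints the request; the commented line performs the actual fetch):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class QualityApiClient {
    // Builds a GET request for the quality API shown in the curl example above.
    static HttpRequest buildRequest(String repo) {
        return HttpRequest.newBuilder()
                .uri(URI.create("https://pt-edge.onrender.com/api/v1/quality/transformers/" + repo))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildRequest("beehive-lab/GPULlama3.java");
        System.out.println(req.uri());
        // To actually fetch (counts against the 100 requests/day limit):
        // HttpResponse<String> resp = HttpClient.newHttpClient()
        //         .send(req, HttpResponse.BodyHandlers.ofString());
    }
}
```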
Related models
srgtuszy/llama-cpp-swift
Swift bindings for the llama.cpp library
gitkaz/mlx_gguf_server
A FastAPI-based LLM server that can load multiple models (MLX or llama.cpp) simultaneously...
JackZeng0208/llama.cpp-android-tutorial
llama.cpp tutorial on Android phone
dougeeai/llama-cpp-python-wheels
Pre-built wheels for llama-cpp-python across platforms and CUDA versions
RhinoDevel/mt_llm
Pure C wrapper library that makes llama.cpp as simple as possible to use on Linux and Windows.