TensorRT and onnx-tensorrt
TensorRT is the core inference engine, while onnx-tensorrt is the parser backend that lets TensorRT load models in the ONNX format directly—making them complements that work together.
About TensorRT
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
TensorRT provides a modular ONNX parser and a plugin ecosystem for custom operators, letting developers extend inference beyond the built-in operations. The SDK performs graph optimization, kernel auto-tuning, and precision calibration (now using explicit rather than implicit quantization) to compile trained models into optimized GPU engines. It integrates with TensorFlow, PyTorch, and ONNX through conversion tools and sample applications, and supports multi-device inference via NCCL.
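The typical workflow these descriptions imply—parse an ONNX model, configure precision, compile a GPU engine—can be sketched with TensorRT's Python API. This is a minimal sketch, assuming the tensorrt package and a CUDA-capable GPU are available; build_engine_from_onnx is a hypothetical helper name, not part of either library.

```python
# Sketch: compiling an ONNX model into a TensorRT engine via the ONNX parser.
# Assumes the `tensorrt` Python package (from the TensorRT SDK) is installed
# and a CUDA-capable GPU is present; not runnable without them.

def build_engine_from_onnx(onnx_path, fp16=False):
    """Parse an ONNX file with TensorRT's ONNX parser and return a serialized engine."""
    import tensorrt as trt  # ships with the TensorRT SDK

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # The ONNX parser requires an explicit-batch network (the flag is implied
    # or deprecated on recent TensorRT versions, required on older ones).
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    config = builder.create_builder_config()
    if fp16:
        # Opt in to reduced-precision kernels during auto-tuning.
        config.set_flag(trt.BuilderFlag.FP16)
    return builder.build_serialized_network(network, config)
```

The returned serialized engine would normally be written to disk and later deserialized by the TensorRT runtime for inference.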
About onnx-tensorrt
onnx/onnx-tensorrt
ONNX-TensorRT: TensorRT backend for ONNX