High-performance ONNX inference engine with CUDA GPU acceleration, written in pure Rust.
- ONNX opset 11–20 support - Modern operators for transformer models
- CUDA GPU acceleration - Native cuBLAS/cuDNN integration via garboard
- Pure Rust - No C++ dependencies, just Rust + CUDA
GpuGraphExecutor is !Sync, so it cannot be shared by reference across threads. Use one executor per thread.
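The one-executor-per-thread rule can be sketched with the standard library alone. The `Executor` type below is a hypothetical stand-in, not the real `GpuGraphExecutor`: its `Cell` field makes it `!Sync`, and its `run` method just doubles the input as a placeholder for inference. Each worker builds its own executor inside the thread, which works whether or not the real type is `Send`.

```rust
use std::cell::Cell;
use std::thread;

/// Hypothetical stand-in for `GpuGraphExecutor`. The `Cell` field makes this
/// type `!Sync`, so a `&Executor` cannot be handed to another thread.
struct Executor {
    runs: Cell<u32>,
}

impl Executor {
    fn new() -> Self {
        Executor { runs: Cell::new(0) }
    }

    /// Placeholder for inference: doubles every input value.
    fn run(&self, input: &[f32]) -> Vec<f32> {
        self.runs.set(self.runs.get() + 1);
        input.iter().map(|x| x * 2.0).collect()
    }
}

/// Spawns one worker per batch; each worker builds and owns its own executor.
fn run_batches(batches: Vec<Vec<f32>>) -> Vec<Vec<f32>> {
    let handles: Vec<_> = batches
        .into_iter()
        .map(|batch| {
            thread::spawn(move || {
                // Constructed inside the thread, never shared with another one.
                let executor = Executor::new();
                executor.run(&batch)
            })
        })
        .collect();

    // Join in spawn order so outputs line up with the input batches.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let outputs = run_batches(vec![vec![1.0, 2.0], vec![3.0]]);
    println!("{:?}", outputs); // prints [[2.0, 4.0], [6.0]]
}
```

With the real crate, each thread would presumably parse or receive the model and call `GpuGraphExecutor::from_model` itself. The cost of loading weights once per thread is not covered here.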
use std::collections::HashMap;
use iconnx::{OnnxParser, GpuGraphExecutor, Tensor};
let model = OnnxParser::parse_file("model.onnx")?;
let executor = GpuGraphExecutor::from_model(&model)?; // !Sync: one executor per thread
let mut inputs = HashMap::new();
inputs.insert("input".to_string(), Tensor::from_vec_f32(vec![1.0, 2.0, 3.0], vec![1, 3]));
let outputs = executor.run(inputs, vec!["output"])?;

Add to your Cargo.toml:
[dependencies]
iconnx = "0.1"

For GPU acceleration (enabled by default), you need:
- CUDA Toolkit 12.0+
- cuBLAS and cuDNN libraries
To build without CUDA:
[dependencies]
iconnx = { version = "0.1", default-features = false }

iconnx supports ONNX opset 11–20 operators including:
- Add, Sub, Mul, Div, Pow, Sqrt, Exp
- MatMul, Gemm
- Sigmoid, Tanh, Relu, LeakyRelu, Softmax
- LayerNormalization
- Reshape, Transpose, Squeeze, Unsqueeze, Concat, Slice, Gather, ScatterND
- ReduceSum, ReduceMean
- Conv, ConvTranspose
- LSTM
- STFT (Short-Time Fourier Transform)
- Equal, Greater, Less, GreaterOrEqual, Where, And
- Cast, Expand, Pad, Resize, NonZero
┌─────────────────────────────────────────┐
│  OnnxParser                             │ ← Parse .onnx files
│  - Load model weights                   │
│  - Extract computation graph            │
└────────────────┬────────────────────────┘
                 │
┌────────────────▼────────────────────────┐
│  GpuGraphExecutor (CUDA)                │ ← GPU inference
│  - Weights loaded via from_model()      │
│  - Per-inference CUDA streams           │
│  - cuBLAS/cuDNN via garboard            │
│  - Custom CUDA kernels                  │
└─────────────────────────────────────────┘
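The two stages in the diagram can be illustrated with a toy CPU-only pipeline. Everything in this sketch is hypothetical and does not reflect iconnx internals: `Model`, `Node`, `Op`, and this `Executor` are illustrative types, tensors are flat `Vec<f32>` values without shapes, and nodes are assumed to be stored in topological order.

```rust
use std::collections::HashMap;

/// Hypothetical operator set for this sketch.
#[derive(Debug, Clone, Copy)]
enum Op {
    Add,
    Mul,
    Relu,
}

/// One graph node: an operator, its input tensor names, and its output name.
struct Node {
    op: Op,
    inputs: Vec<String>,
    output: String,
}

/// Stand-in for a parsed model: weights plus nodes in topological order.
struct Model {
    weights: HashMap<String, Vec<f32>>,
    nodes: Vec<Node>,
}

/// Stand-in for the executor: copies the weights once at construction.
struct Executor<'m> {
    weights: HashMap<String, Vec<f32>>,
    nodes: &'m [Node],
}

impl<'m> Executor<'m> {
    fn from_model(model: &'m Model) -> Self {
        Executor { weights: model.weights.clone(), nodes: &model.nodes }
    }

    /// Runs every node in order; returns `None` if a named tensor is missing.
    fn run(&self, inputs: HashMap<String, Vec<f32>>, output: &str) -> Option<Vec<f32>> {
        let mut values = self.weights.clone();
        values.extend(inputs);
        for node in self.nodes {
            let a = values.get(node.inputs.first()?)?;
            let result: Vec<f32> = match node.op {
                Op::Relu => a.iter().map(|&x| x.max(0.0)).collect(),
                Op::Add => {
                    let b = values.get(node.inputs.get(1)?)?;
                    a.iter().zip(b).map(|(x, y)| x + y).collect()
                }
                Op::Mul => {
                    let b = values.get(node.inputs.get(1)?)?;
                    a.iter().zip(b).map(|(x, y)| x * y).collect()
                }
            };
            values.insert(node.output.clone(), result);
        }
        values.remove(output)
    }
}

fn node(op: Op, inputs: &[&str], output: &str) -> Node {
    Node {
        op,
        inputs: inputs.iter().map(|s| s.to_string()).collect(),
        output: output.to_string(),
    }
}

/// Builds a tiny graph computing output = relu(input * scale + bias).
fn demo_model() -> Model {
    let mut weights = HashMap::new();
    weights.insert("scale".to_string(), vec![2.0, 2.0, 2.0]);
    weights.insert("bias".to_string(), vec![-3.0, -3.0, -3.0]);
    Model {
        weights,
        nodes: vec![
            node(Op::Mul, &["input", "scale"], "scaled"),
            node(Op::Add, &["scaled", "bias"], "shifted"),
            node(Op::Relu, &["shifted"], "output"),
        ],
    }
}

fn main() {
    let model = demo_model();
    let executor = Executor::from_model(&model);
    let mut inputs = HashMap::new();
    inputs.insert("input".to_string(), vec![1.0, 2.0, 3.0]);
    println!("{:?}", executor.run(inputs, "output")); // prints Some([0.0, 1.0, 3.0])
}
```

The split mirrors the diagram: parsing yields an immutable description of the graph, and the executor takes its weights once at construction so that repeated `run` calls only supply inputs.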
iconnx is designed for high-throughput inference:
- Memory pooling - Avoid allocation fragmentation
- Kernel caching - Compiled CUDA kernels are reused
- Kernel fusion - MulAdd, DivMul patterns fused
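To illustrate what fusing a MulAdd pattern saves, here is a CPU analogue in plain Rust. It is a sketch of the general idea and not iconnx's CUDA kernels; the function names are made up for this example. The unfused version makes two passes and materializes an intermediate buffer, while the fused version computes the same result in one pass.

```rust
/// Unfused: two passes, with a temporary buffer between Mul and Add.
fn mul_then_add(a: &[f32], b: &[f32], c: &[f32]) -> Vec<f32> {
    let product: Vec<f32> = a.iter().zip(b).map(|(x, y)| x * y).collect();
    product.iter().zip(c).map(|(p, z)| p + z).collect()
}

/// Fused: one pass over the inputs, no temporary buffer.
fn fused_mul_add(a: &[f32], b: &[f32], c: &[f32]) -> Vec<f32> {
    a.iter()
        .zip(b)
        .zip(c)
        .map(|((x, y), z)| x * y + z)
        .collect()
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    let c = [0.5, 0.5, 0.5];
    println!("{:?}", mul_then_add(&a, &b, &c)); // prints [4.5, 10.5, 18.5]
    println!("{:?}", fused_mul_add(&a, &b, &c)); // prints [4.5, 10.5, 18.5]
}
```

On a GPU the same transformation also removes a kernel launch and a round trip through global memory for the intermediate tensor, which is typically where the benefit comes from.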
iconnx is thoroughly tested against ONNX Runtime:
# Run all tests
cargo test
# Run with CUDA tests (requires GPU)
cargo test --release
# Run specific operator tests
cargo test operators::

MIT License - see LICENSE for details.
Contributions are welcome! Please ensure all tests pass and add tests for new functionality.