Day-0 LLM Support
For Any Hardware
Eliminate months of development delay with our Triton and vLLM backend plugins, which integrate into any CPU, GPU, or NPU software stack.

AI Inference Beyond GPUs
A wave of NPU hardware specialized for LLM inference is challenging GPUs. We are building a platform, based on the Triton language and compiler, that will work across all of this hardware.
New model concepts are developed in Triton
PyTorch's TorchInductor backend automatically generates optimized Triton code when models are compiled
Inference platforms like vLLM and Ollama use Triton for high-performance kernels
Liger Kernel provides Triton kernels optimized for LLMs
Python-like syntax makes parallel programming accessible to ML engineers (see the kernel sketch below)
The open-source compiler generates optimized code for many hardware targets
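To make the last two points concrete, here is a minimal Triton vector-add kernel in the style of the official Triton tutorials. It is an illustrative sketch, not Kernelize production code:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance processes one BLOCK_SIZE-wide slice of the input.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the final, partially filled block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    # Launch a 1-D grid with one program per 1024-element block.
    add_kernel[(triton.cdiv(n, 1024),)](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

The same Python source flows through Triton's hardware-independent compiler, which is what allows a single kernel to target GPUs today and, with backend plugins like ours, NPUs as well.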
Why Kernelize?
Industry Standard
Open-source code built on industry-standard AI infrastructure prevents vendor lock-in and keeps you from falling behind
Compiler Experts
Our team has decades of experience building compilers for GPU and NPU AI hardware
Triton Community
Leverage the largest and most experienced AI compiler community in the world