Day-0 LLM Model Support
For Any Hardware

Eliminate months of development delays with our Triton and vLLM backend plugins. They integrate into any CPU, GPU, or NPU software stack.


AI Inference beyond GPUs

A wave of NPU hardware specialized for LLM inference is challenging GPUs. We are building a platform, based on the Triton language and compiler, that works across all of this hardware.

New model concepts are developed in Triton

PyTorch TorchInductor automatically generates Triton code for optimization (see the first sketch after this list)

Inference platforms like vLLM and Ollama use Triton for high-performance kernels

Liger Kernel provides Triton kernels optimized for LLMs

Python-like syntax makes parallel programming accessible to ML engineers (see the kernel sketch after this list)

Open-source compiler generates optimized code for various hardware
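
As a rough sketch of the TorchInductor path (illustrative code, not part of the Kernelize stack): torch.compile traces a Python function and, for GPU tensors, TorchInductor emits fused Triton kernels for it.

```python
import torch

def fused_op(x, y):
    # A pointwise chain that TorchInductor typically fuses into a single Triton kernel.
    return torch.relu(x + y) * 2.0

# torch.compile routes through TorchInductor, which generates Triton code for GPUs.
compiled = torch.compile(fused_op)

x = torch.randn(1024, device="cuda")
y = torch.randn(1024, device="cuda")
out = compiled(x, y)

# Run with TORCH_LOGS="output_code" to print the generated Triton source.
```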
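
And a minimal sketch of the Triton syntax itself: an element-wise vector add, assuming a CUDA device (the names add_kernel and BLOCK_SIZE are illustrative). Each program instance processes one block of elements in parallel.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                    # one program per block
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                    # guard the tail elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)             # number of program instances
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```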


Why Kernelize?

Industry Standard

Open-source code built on industry-standard AI infrastructure prevents vendor lock-in and keeps you from falling behind

Compiler Experts

Our team has decades of experience building compilers for GPU and NPU AI hardware

Triton Community

Leverage the largest and most experienced AI compiler community in the world