Open position
Engineering Architect, AI & HPC Frameworks
Why this role exists
Every AI chip ships with framework support. “Supports PyTorch” is on every datasheet. What that usually means is: someone ran the tutorial model, it produced correct output, the box got checked. The gap between “runs” and “runs fast enough to justify the silicon” is enormous — and it’s the gap where VRULL operates.
The same story plays out in HPC. Fortran codes that have run on x86 for twenty years need to run on RISC‑V — and they need to run fast, not just correctly. Meanwhile, Julia is emerging as the language that refuses to accept the two-language problem: write in a high-level language, get performance that rivals hand-tuned C. Making Julia deliver on that promise on new RISC‑V silicon is compilation and framework work that barely anyone is doing yet.
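The two-language problem is easy to demonstrate. In this minimal Python sketch (a generic illustration, not VRULL code), the same reduction is written as an interpreted loop and as a call into the runtime's C-implemented built-in: the results are identical, but the native version is typically an order of magnitude faster. Julia's bet is that the high-level version can be compiled to the fast one directly.

```python
import timeit

def naive_sum(values):
    # Interpreted loop: every iteration pays Python's dispatch overhead.
    total = 0.0
    for v in values:
        total += v
    return total

data = [float(i) for i in range(100_000)]

# Both compute the same result, in the same accumulation order...
assert naive_sum(data) == sum(data)

# ...but the built-in runs the loop in native code. Timings vary by
# machine, so no specific ratio is claimed here.
t_naive = timeit.timeit(lambda: naive_sum(data), number=20)
t_builtin = timeit.timeit(lambda: sum(data), number=20)
print(f"interpreted loop: {t_naive:.3f}s, native built-in: {t_builtin:.3f}s")
```

The conventional escape hatch is to rewrite the hot loop in C or Fortran; Julia (and, for AI workloads, framework-level compilation through MLIR/LLVM) aims to make that rewrite unnecessary.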
Internal framework teams at silicon companies are typically small, stretched across too many workloads, and lack the depth to trace a performance problem from a PyTorch operator or a Julia kernel through the runtime, the MLIR pipeline, the LLVM backend, and the generated code to the hardware instruction that’s causing the bottleneck. They know the framework. They don’t know the full stack below it.
We do. VRULL’s framework engineers work vertically: from the model definition or simulation code through the runtime to the machine code. When a customer’s inference pipeline is 3x slower than it should be, or a Fortran simulation won’t vectorise, or a Julia kernel isn’t hitting the hardware’s peak — we find the layer where the performance is lost and fix it there.
What you’ll do
- Port, optimise, and upstream AI framework support for RISC‑V silicon — PyTorch, TensorFlow, ONNX Runtime
- Optimise HPC frameworks and scientific computing stacks for RISC‑V — Fortran runtime libraries, numerical libraries, Julia packages
- Diagnose performance problems end-to-end: trace bottlenecks from model-level or application-level operations through MLIR/LLVM to generated machine code
- Build and optimise inference and computation kernels that exploit custom ISA extensions — matrix, vector, and application-specific instructions
- Work with compiler engineers on MLIR and LLVM to ensure framework-level patterns map to efficient code generation
- Explore Julia as a first-class target for RISC‑V AI and HPC workloads — its LLVM-based compilation model makes it a natural bridge between the two domains
- Drive upstream contributions to major frameworks and language ecosystems — not just patches, but architectural changes that benefit the RISC‑V ecosystem
What we’re looking for
- Deep knowledge of at least one major AI framework’s internals (PyTorch, TensorFlow, ONNX Runtime) or HPC runtime environment — not the API, but the runtime, the dispatch, the optimisation passes
- Understanding of how MLIR and LLVM sit beneath frameworks and how decisions at each level compound into the final generated code
- The ability to read compiler output and know whether a performance gap is a framework issue, a compiler issue, or a hardware limitation
- Experience optimising workloads on real hardware — whether that’s AI inference latency or HPC throughput on a vector machine
- Interest in modern language ecosystems — Julia’s compilation model, Fortran’s continuing relevance, and where the field is heading
- An established track record as an upstream contributor to at least one major open-source project
What sets you apart
- Experience bringing up AI or HPC frameworks on new silicon — not just porting, but making them competitive
- Knowledge of quantisation, operator fusion, custom runtime kernel development, or HPC-specific optimisation patterns
- Contributions to PyTorch, TensorFlow, ONNX Runtime core, or Julia’s compiler/package ecosystem
- Experience with Fortran optimisation for production HPC codes — you know what these workloads actually need
- The ability to tell a silicon partner exactly which instruction they need to add to make their framework story real — whether that framework is PyTorch or Julia
“Supports PyTorch” is a checkbox. Making PyTorch, Julia, and legacy Fortran codes fast on silicon that didn’t exist six months ago — that’s engineering.
Interested in this role?
Send your CV and a note about why this role interests you to careers@vrull.eu.