I am a Senior Member of Technical Staff in AMD’s Research and Advanced Development group. I received my PhD in computer science at the University of Washington. At UW, I was part of the architecture group and was advised by Mark Oskin.
I am interested in understanding and improving the performance of complex architectures. I’m currently working on agent-driven kernel optimization for NPUs. Check out our Triton compiler for NPUs here.
During graduate school, I focused on making complex architectures easier to program while maintaining high performance. I have worked on code generation targeting various parallel architectures, and I have created methodologies and built tools to better understand the behavior and performance of GPUs. My thesis work focused on generating code for graph applications on a manycore architecture that uses high-bandwidth memory.
More information: curriculum vitae, email
News
- February 2026: We open-sourced our project Triton-XDNA. Check it out on GitHub!
- January 2026: We're presenting our work "From Triton to AMD NPU: Compiler-Driven Kernel Generation with MLIR-AIR" at C4ML next month!
- July 2025: I was promoted to Senior Member of Technical Staff!
Publications and Presentations
From Triton to AMD NPU: Compiler-Driven Kernel Generation with MLIR-AIR.
In C4ML (co-located with CGO) (2026).
Code Generation and Optimization of Graph Programs on a Manycore Architecture.
In University of Washington Dissertations & Theses (2021).
paper (pdf), bibtex
Taming the Zoo: The Unified GraphIt Compiler Framework for Novel Architectures.
In Intl. Symposium on Computer Architecture (ISCA) (2021).
paper (pdf), bibtex
Profiling A GPU Database Implementation.
In DaMoN (2017).
paper (pdf), bibtex
Parallelizing Instance-Based Data Classifiers.
In Intl. Florida Artificial Intelligence Research Society Conference (FLAIRS) (2016).
Patents
Selecting Intermediate Representation Transformations for Compilations.
Patent Pending
US 20250004730 A1.
Filed 2023.
view patent
Automatic Data Layout for Operation Chains.
Patent Pending
US 20240272791 A1.
Filed 2023.
view patent