I am a Senior Member of Technical Staff in AMD’s Research and Advanced Development group. I received my PhD in computer science at the University of Washington. At UW, I was part of the architecture group and was advised by Mark Oskin.

I am interested in understanding and improving the performance of complex architectures. I’m currently working on agent-driven kernel optimization for NPUs. Check out our Triton compiler for NPUs here.

During graduate school, I focused on making complex architectures easier to program while maintaining high performance. I have worked on code generation targeting various parallel architectures, and I have created methodologies and built tools to better understand the behavior and performance of GPUs. My thesis work focused on generating code for graph applications on a manycore architecture that utilizes high-bandwidth memory.

More information: curriculum vitae, email

News

  • February 2026
    We open-sourced our project, Triton-XDNA. Check it out on GitHub!
  • January 2026
    We're presenting our work "From Triton to AMD NPU: Compiler-Driven Kernel Generation with MLIR-AIR" at C4ML next month!
  • July 2025
    I was promoted to Senior Member of Technical Staff!

Publications and Presentations

From Triton to AMD NPU: Compiler-Driven Kernel Generation with MLIR-AIR.
Erwei Wang, Emily Furst, Yiannis Papadopoulos, Aaron Knoll, Michael L. Chu, Joseph Melber, Stephen Neuendorffer, Samuel Bayliss.
In C4ML (co-located with CGO) (2026).

Code Generation and Optimization of Graph Programs on a Manycore Architecture.
Emily Furst.
In University of Washington Dissertations & Theses (2021).
paper (pdf), bibtex

Taming the Zoo: The Unified GraphIt Compiler Framework for Novel Architectures.
Ajay Brahmakshatriya, Emily Furst, Victor Ying, Claire Hsu, Changwan Hong, Max Ruttenberg, Yunming Zhang, Dai Cheol Jung, Dustin Richmond, Michael Taylor, Julian Shun, Mark Oskin, Daniel Sanchez, Saman Amarasinghe.
In Intl. Symposium on Computer Architecture (ISCA) (2021).
paper (pdf), bibtex

Profiling A GPU Database Implementation.
Emily Furst, Mark Oskin, Bill Howe.
In DaMoN (2017).
paper (pdf), bibtex

Parallelizing Instance-Based Data Classifiers.
Imad Rahal, Emily Furst, Ramzi Haraty.
In Intl. Florida Artificial Intelligence Research Society Conference (FLAIRS) (2016).

Patents

Selecting Intermediate Representation Transformations for Compilations.
Emily Furst, Robin Conradine Knauerhase, Sangeeta Chowdhary, Michael L. Chu.
Patent Pending US 20250004730 A1. Filed 2023.
view patent

Automatic Data Layout for Operation Chains.
Benjamin Youngjae Cho, Armand Bahram Behroozi, Michael L. Chu, Emily Furst.
Patent Pending US 20240272791 A1. Filed 2023.
view patent