Ryan Hanrui Wang

MIT

About

Ryan Hanrui Wang received his Ph.D. in CS from MIT, advised by Prof. Song Han. His research focuses on efficient AI computing and computer architecture. He has received several honors, including the Best Paper Award at QCE and ICML RL4RL, Best Paper Candidate at DATE, ACM SRC 1st Place, Best Poster at the NSF AI Institute, and fellowships from Qualcomm and the Unitary Fund. He was also named a Rising Star in both ML & Systems and ISSCC, and was a finalist for the NVIDIA Fellowship.

Ryan's work on SpAtten (Sparse Attention), an efficient GenAI compression framework that introduced cascade KV token pruning and quantization, has been widely adopted in both academia and industry. It is the most cited HPCA paper since 2020. He also introduced Hardware-Aware Transformers for optimized GenAI deployment. His open-source repositories and models have been downloaded over one million times and integrated into platforms such as IBM and the PyTorch Ecosystem. He co-founded the QuCS Forum to promote AI education. He earned his B.Eng. with highest honors from Fudan University.

Research Interests

Efficient Generative AI

Efficiency is a fundamental enabler for scaling generative AI to real-world deployment. Our research develops principled methods to improve the performance, cost-efficiency, and scalability of large models across modalities. We focus on Transformer and LLM optimization, pioneering techniques such as SpAtten (cascade KV cache pruning and quantization), SpAtten-Chip, Hardware-Aware Transformer, Lightning-Transformer, and SpArch for system-algorithm co-design. In computer vision, we introduced AMC and APQ for automated model compression. Our work leverages pruning, quantization, neural architecture search, reinforcement learning, and compiler-hardware-algorithm co-design to build highly optimized models that run efficiently on diverse hardware, from cloud GPUs to edge devices. These innovations have been widely adopted in both academia and industry, powering faster, more efficient, and more accessible foundation models.

Efficient AI Systems with Emerging Technology

We explore how emerging compute platforms, such as quantum computers and photonic accelerators, can advance AI efficiency and scalability. Our work spans AI-centric hardware-software co-design, photonic neural networks (Lightning-Transformer), hybrid quantum-classical AI acceleration (QuantumNAS, QuEst, QuantumNAT, QOC, Atomique), and system-level optimization to push the limits of model training and inference performance.

Selected Publications

SpArch: Efficient Architecture for Sparse Matrix Multiplication

Zhekai Zhang*, Hanrui Wang*, Song Han, William J. Dally
HPCA 2020

QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits

Hanrui Wang, Yongshan Ding, Jiaqi Gu, Zirui Li, Yujun Lin, David Z. Pan, Frederic T. Chong, Song Han
HPCA 2022

QuEst: Graph Transformer for Quantum Circuit Reliability Estimation

Hanrui Wang, Pengyu Liu, Jinglei Cheng, Zhiding Liang, Jiaqi Gu, Zirui Li, Yongshan Ding, Weiwen Jiang, Yiyu Shi, Xuehai Qian, David Z. Pan, Frederic T. Chong, Song Han
ICCAD 2022

PointAcc: Efficient Point Cloud Accelerator

Yujun Lin, Zhekai Zhang, Haotian Tang, Hanrui Wang, Song Han
MICRO 2021

Park: An Open Platform for Learning-Augmented Computer Systems

Hongzi Mao, Parimarjan Negi, Akshay Narayan, Hanrui Wang, Jiacheng Yang, Haonan Wang, Ryan Marcus, Mehrdad Khani Shirkoohi, Songtao He, Vikram Nathan, Frank Cangialosi, Shaileshh Venkatakrishnan, Wei-Hung Weng, Song Han, Tim Kraska, Mohammad Alizadeh
NeurIPS 2019

Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training and Inference

Shuqing Luo, Pingzhi Li, Jie Peng, Hanrui Wang, Yang (Katie) Zhao, Yu (Kevin) Cao, Yu Cheng, Tianlong Chen
ICML 2025

HEXA-MoE: Efficient and Heterogeneous-aware MoE Acceleration with ZERO Computation Redundancy

Shuqing Luo, Jie Peng, Pingzhi Li, Hanrui Wang, Tianlong Chen
arXiv 2025

HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han
ACL 2020

Atomique: A Quantum Compiler for Reconfigurable Neutral Atom Arrays

Hanrui Wang, Pengyu Liu, Daniel Bochen Tan, Yilian Liu, Jiaqi Gu, David Z. Pan, Jason Cong, Umut A. Acar, Song Han
ISCA 2024

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Hanrui Wang, Yujun Lin, Song Han
CVPR 2020

AMC: AutoML for Model Compression and Acceleration on Mobile Devices

Yihui He*, Ji Lin*, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han
ECCV 2018

Honors and Awards

2024 Rising Star in Solid-State Circuits at ISSCC (2/18/2024)
2023 Rising Stars in ML and Systems (8/17/2023)
MARC 2023 Best Pitch Award (1/25/2023)
Gold Medal of 2022 ACM Student Research Competition (11/1/2022)
2022 DAC Young Fellowship (7/10/2022)
2022 ACM Student Research Competition Award 1st Place (5/1/2022)
2021 Qualcomm Innovation Fellowship (5/1/2021)
2020 Nvidia Graduate Fellowship Finalist (5/1/2020)
2021 Analog Devices Outstanding Student Designer Award (5/1/2020)
2020 DAC Young Fellowship (5/1/2020)
Best Paper Candidate, qGDP (3/31/2025)
Best Demo Award, DAC University Demo, SpAtten (7/15/2023)
Best Poster Award, NSF AI Institute Annual Showcase, QuantumNAT (4/29/2023)
Best Paper Award, IEEE International Conference on Quantum Computing and Engineering (QCE), SnCQA (9/17/2022)
Best Poster Award, 2022 NSF Athena AI Institute, QuantumNAS (5/3/2022)
Best Presentation Award, DAC 2020 Young Fellow (12/15/2020)
Best Paper Award, ICML 2019 Reinforcement Learning for Real Life Workshop (6/9/2019)

Competition Awards

1st Place Award, ACM Quantum Computing for Drug Discovery Contest @ ICCAD 2023, QuantumNAS
First Place (1/150), ACM/IEEE TinyML Design Contest, Memory Occupation Track @ ICCAD 2022, Hardware-Aware Transformer
First Place, SemanticKITTI leaderboard, 3D semantic segmentation @ ECCV 2020, SPVNAS

Contact

Email: hanrui@mit.edu

If you work on efficient AI computing, quantum computing, or GenAI and are interested in working with me, please fill out the recruiting form.