Short bio:
I am a Senior Research Scientist at NVIDIA Research. My research focuses on computer systems and architecture for emerging applications. I received my S.M. and Ph.D. degrees in CS from MIT, advised by Prof. Daniel Sanchez at MIT CSAIL. My current research projects focus on tensor accelerator design and modeling, memory hierarchy optimizations for domain-specific accelerators, and hardware acceleration for autonomous machines. During my Ph.D., I also worked on memory hierarchy design, resource management in multi-core systems, and software/hardware co-optimization for object-based programming models.
Last update: October 2023
Publications
RM-STC: Row-Merge Dataflow Inspired GPU Sparse Tensor Core for Energy-Efficient Sparse Acceleration
Guyue Huang, Zhengyang Wang, Po-An Tsai, Chen Zhang, Yufei Ding, Yuan Xie
The 56th IEEE/ACM International Symposium on Microarchitecture (MICRO-56), October 2023.
HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity
Yannan Nellie Wu, Po-An Tsai, Saurav Muralidharan, Angshuman Parashar, Vivienne Sze, Joel Emer
The 56th IEEE/ACM International Symposium on Microarchitecture (MICRO-56), October 2023.
[paper][arxiv]
[MIT news]
Accelerating Sparse Data Orchestration via Dynamic Reflexive Tiling
Toluwanimi O Odemuyiwa, Hadi Asghari-Moghaddam, Michael Pellauer, Kartik Hegde, Po-An Tsai, Neal C Crago, Aamer Jaleel, John D Owens, Edgar Solomonik, Joel S Emer, Christopher W Fletcher
The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-28), March 2023.
[paper]
Demystifying Map Space Exploration for NPUs
Sheng-Chun Kao, Angshuman Parashar, Po-An Tsai, Tushar Krishna
2022 IEEE International Symposium on Workload Characterization (IISWC 2022), November 2022.
[paper]
Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling
Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel S Emer
The 55th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-55), October 2022.
[paper][arxiv]
SIMD^2: A Generalized Matrix Instruction Set for Accelerating Tensor Computation beyond GEMM
Yunan Zhang, Po-An Tsai, Hung-Wei Tseng
The 49th Annual International Symposium on Computer Architecture (ISCA'22), June 2022.
[paper]
Ruby: Improving Hardware Efficiency for Tensor Algebra Accelerators Through Imperfect Factorization
Mark Horeni, Pooria Taheri, Po-An Tsai, Angshuman Parashar, Joel Emer, Siddharth Joshi
2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2022), May 2022.
[paper]
Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators
Geonhwa Jeong, Gokcen Kestor, Prasanth Chatarasi, Angshuman Parashar, Po-An Tsai, Sivasankaran Rajamanickam, Roberto Gioiosa, and Tushar Krishna.
The 30th International Conference on Parallel Architectures and Compilation Techniques (PACT-30), September 2021.
[paper]
Leaking Secrets through Compressed Caches
Po-An Tsai, Andres Sanchez, Christopher W. Fletcher and Daniel Sanchez.
IEEE Micro's Top Picks from the Computer Architecture Conferences, May/June 2021.
[paper]
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search
Kartik Hegde, Po-An Tsai, Sitao Huang, Vikas Chandra, Angshuman Parashar, and Christopher W. Fletcher.
The 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-26), April 2021.
[paper]
[code]
Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators
Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel S. Emer.
2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (poster)
[paper]
[tutorial]
Hardware Abstractions for Targeting EDDO Architectures with the Polyhedral Model
Angshuman Parashar, Prasanth Chatarasi, and Po-An Tsai.
International Workshop on Polyhedral Compilation Techniques (IMPACT), January 2021.
[paper]
Safecracker: Leaking Secrets through Compressed Caches
Po-An Tsai, Andres Sanchez, Christopher W. Fletcher and Daniel Sanchez.
The 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-25), March 2020.
[paper]
[talk]
[video]
[code]
Compress Objects, Not Cache Lines: An Object-Based Compressed Memory Hierarchy
Po-An Tsai and Daniel Sanchez.
The 24th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-24), April 2019.
[paper]
[talk]
[poster]
[lightning]
[MIT news]
Rethinking the Memory Hierarchy for Modern Languages
Po-An Tsai, Yee Ling Gan, and Daniel Sanchez.
The 51st IEEE/ACM International Symposium on Microarchitecture (MICRO-51), October 2018.
[paper]
[talk]
[poster]
[lightning]
Adaptive Scheduling for Systems with Asymmetric Memory Hierarchies
Po-An Tsai, Changping Chen, and Daniel Sanchez.
The 51st IEEE/ACM International Symposium on Microarchitecture (MICRO-51), October 2018.
[paper]
[talk]
[poster]
[lightning]
KPart: A Hybrid Cache Partitioning-Sharing Technique for Commodity Multicores
Nosayba El-Sayed, Anurag Mukkara, Po-An Tsai, Harshad Kasture, Xiaosong Ma, and Daniel Sanchez.
The 24th International Symposium on High Performance Computer Architecture (HPCA-24), February 2018.
[paper]
[talk]
[code]
Nexus: A New Approach to Replication in Distributed Shared Caches
Po-An Tsai, Nathan Beckmann, and Daniel Sanchez.
The 26th International Conference on Parallel Architectures and Compilation Techniques (PACT-26), September 2017.
[paper]
[talk]
Jenga: Software-Defined Cache Hierarchies
Po-An Tsai, Nathan Beckmann, and Daniel Sanchez.
The 44th International Symposium on Computer Architecture (ISCA-44), June 2017.
[paper]
[talk]
[tech-report]
[MIT news]
Scaling Distributed Cache Hierarchies through Computation and Data Co-Scheduling
Nathan Beckmann, Po-An Tsai, and Daniel Sanchez.
The 21st International Symposium on High Performance Computer Architecture (HPCA-21), February 2015.
*Nominated for best paper award
[paper]
[talk]
[MIT news]
Hybrid Path-Diversity-Aware Adaptive Routing with Latency Prediction Model in Network-on-Chip Systems
Po-An Tsai, Yu-Hsin Kuo, En-Jui Chang, and An-Yeu Wu.
2013 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), March 2013.
[paper]
Path-Diversity-Aware Adaptive Routing in Network-on-Chip Systems
Yu-Hsin Kuo, Po-An Tsai, Hao-Ping Ho, En-Jui Chang, Hsien-Kai Hsin, and An-Yeu Wu.
The 6th International Symposium on Embedded Multicore SoCs (MCSoC), September 2012.
[paper]
Contact Me
Email: poant@nvidia.com
Press
MIT news about Zippads
Hacker News discussion on Zippads
The Morning Paper summary on Zippads
MIT news about Jenga
Hacker News discussion on Jenga
MIT news about CDCS
Techenablement article about CDCS
The Industry-Academia Partnership (IAP) post about my best poster award