Featured Publications
Zhaohui Yang, Dawen Ding, Chenghong Zhu, Jianxin Chen, Prof. Yuan Xie (2025). Phoenix: Pauli-based High-level Optimization Engine for Instruction Execution on NISQ devices. Design Automation Conference.
Recent Publications
A comprehensive list of publications is maintained on Prof. Xie’s Google Scholar profile.
Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao, et al. (2025). Enhancing Large-Scale AI Training Efficiency: The C4 Solution for Real-Time Anomaly Detection and Communication Optimization. 2025 IEEE International Symposium on High Performance Computer Architecture (HPCA).
Cong Li, Yihan Yin, Xintong Wu, Jingchen Zhu, Zhutianya Gao, Dimin Niu, Qiang Wu, Xin Si, Prof. Yuan Xie, Chen Zhang, et al. (2025). H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference. Proceedings of the 52nd Annual International Symposium on Computer Architecture.
Ling Liang, Zhen Gu, Fahong Zhang, Zhaohui Chen, Zhirui Li, Xin Fan, Dimin Niu, Meng LI, Zhiyong Li, Zongwei Wang, et al. (2025). Matrix: Multi-Cipher Structures Dataflow for Parallel and Pipelined TFHE Accelerator. ACM Transactions on Architecture and Code Optimization.
Tianchan Guan, Yijin Guan, Zhaoyang Du, Jiacheng Ma, Boyu Tian, Zhao Wang, Teng Ma, Zheng Liu, Yang Kong, Prof. Yuan Xie, et al. (2025). MemTunnel: a CXL-based Rack-Scale Host Memory Pooling Architecture for Cloud Service. IEEE Transactions on Parallel and Distributed Systems.
Yiquan Chen, Zhen Jin, Yijing Wang, Yi Chen, Jiexiong Xu, Dr. Hao Yu, Jinlong Chen, Wenhai Lin, Kanghua Fang, Keyao Zhang, et al. (2025). NVMePass: A Lightweight, High-performance and Scalable NVMe Virtualization Architecture with I/O Queues Passthrough. 2025 IEEE International Symposium on High Performance Computer Architecture (HPCA).
Zhaohui Yang, Dawen Ding, Chenghong Zhu, Jianxin Chen, Prof. Yuan Xie (2025). Phoenix: Pauli-based High-level Optimization Engine for Instruction Execution on NISQ devices. Design Automation Conference.
Jiayi Huang, Yanhua Chen, Zhe Wang, Christopher J Hughes, Yufei Ding, Prof. Yuan Xie (2025). Push Multicast: A Speculative and Coherent Interconnect for Mitigating Manycore CPU Communication Bottleneck. 2025 IEEE International Symposium on High Performance Computer Architecture (HPCA).
Guyue Huang, Hao Li, Le Qin, Jiayi Huang, Yangwook Kang, Yufei Ding, Prof. Yuan Xie (2025). TRACI: Network Acceleration of Input-Dynamic Communication for Large-Scale Deep Learning Recommendation Model. Proceedings of the 52nd Annual International Symposium on Computer Architecture.
Tongxin Xie, Dr. Zhenhua Zhu, Bing Li, Yukai He, Cong Li, Guangyu Sun, Huazhong Yang, Prof. Yuan Xie, Yu Wang (2025). UniNDP: A Unified Compilation and Simulation Tool for Near DRAM Processing Architectures. 2025 IEEE International Symposium on High Performance Computer Architecture (HPCA).
Meng Wu, Mingyu Yan, Wenming Li, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Prof. Yuan Xie (2024). A comprehensive survey on GNN characterization. arXiv e-prints.
Hao Zhang, Sicheng Li, Yupeng Gui, Zhiyong Li, Shusong Xu, Yanheng Lu, Dimin Niu, Hongzhong Zheng, Yen-Kuang Chen, Prof. Yuan Xie, et al. (2024). A Tightly Coupled AI-ISP Vision Processor. IEEE Transactions on Circuits and Systems for Video Technology.
Chen Zhang, Yang Wang, Zhiqiang Xie, Cong Guo, Yunxin Liu, Jingwen Leng, Guangyu Sun, Zhigang Ji, Runsheng Wang, Prof. Yuan Xie, et al. (2024). DSTC: Dual-side sparsity tensor core for DNNs acceleration on modern GPU architectures. IEEE Transactions on Computers.
Jiaming Xu, Shan Huang, Jinhao Li, Guyue Huang, Prof. Yuan Xie, Yu Wang, Guohao Dai (2024). Enabling efficient sparse multiplications on GPUs with heuristic adaptability. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
Zhaodong Chen, Andrew Kerr, Richard Cai, Jack Kosaian, Haicheng Wu, Yufei Ding, Prof. Yuan Xie (2024). EVT: Accelerating deep learning training with epilogue visitor tree. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3.
Dr. Chen Bai, Xuechao Wei, Youwei Zhuo, Yi Cai, Hongzhong Zheng, Bei Yu, Prof. Yuan Xie (2024). Klotski v2: Improved DNN Model Orchestration Framework for Dataflow Architecture Accelerators. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
Zhaodong Chen, Weiqin Zhao, Lei Deng, Yufei Ding, Qinghao Wen, Guoqi Li, Prof. Yuan Xie (2024). Large-scale self-normalizing neural networks. Journal of Automation and Intelligence.
Ruiyang Ma, Jiayi Huang, Shijian Zhang, Prof. Yuan Xie, Guojie Luo (2024). NoCFuzzer: Automating NoC Verification in UVM. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
Yiquan Chen, Prof. Yuan Xie, Yijing Wang, Jiexiong Xu, Zhen Jin, Anyu Li, Xiaoyan Fu, Qiang Liu, Wenzhi Chen (2024). Optimizing NVMe storage for large-scale deployment: Key technologies and strategies in Alibaba Cloud. IEEE Micro.
Zhaohui Chen, Zhen Gu, Yanheng Lu, Xuanle Ren, Ruiguang Zhong, Wen-Jie Lu, Jiansong Zhang, Yichi Zhang, Hanghang Wu, Xiaofu Zheng, et al. (2024). SAFE: A Scalable Homomorphic Encryption Accelerator for Vertical Federated Learning. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
Yu Zou, Yiran Li, Sheng Wang, Le Su, Zhen Gu, Yanheng Lu, Yijin Guan, Dimin Niu, Mingyu Gao, Prof. Yuan Xie, et al. (2024). Salus: A Practical Trusted Execution Environment for CPU-FPGA Heterogeneous Cloud Platforms. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4.
Nan Wu, Yingjie Li, Hang Yang, Hanqiu Chen, Steve Dai, Cong Hao, Cunxi Yu, Prof. Yuan Xie (2024). Survey of machine learning for software-assisted hardware design verification: Past, present, and prospect. ACM Transactions on Design Automation of Electronic Systems.
Haiyang Lin, Mingyu Yan, Xiaochun Ye, Dongrui Fan, Shirui Pan, Wenguang Chen, Prof. Yuan Xie (2023). A comprehensive survey on distributed training of graph neural networks. Proceedings of the IEEE.
Yanhong Wang, Tianchan Guan, Dimin Niu, Qiaosha Zou, Hongzhong Zheng, C-J Richard Shi, Prof. Yuan Xie (2023). Accelerating distributed GNN training by codes. IEEE Transactions on Parallel and Distributed Systems.
Zheng Qu (2023). Addressing Data Explosion Issue in Emerging Deep Learning Applications.
Guyue Huang, Yang Bai, Liu Liu, Yuke Wang, Bei Yu, Yufei Ding, Prof. Yuan Xie (2023). ALCOP: Automatic load-compute pipelining in deep learning compiler for AI-GPUs. Proceedings of Machine Learning and Systems.
Dr. Chen Bai, Jiayi Huang, Xuechao Wei, Yuzhe Ma, Sicheng Li, Hongzhong Zheng, Bei Yu, Prof. Yuan Xie (2023). ArchExplorer: Microarchitecture exploration via bottleneck analysis. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture.
Xuanle Ren, Zhaohui Chen, Zhen Gu, Yanheng Lu, Ruiguang Zhong, Wen-Jie Lu, Jiansong Zhang, Yichi Zhang, Hanghang Wu, Xiaofu Zheng, et al. (2023). CHAM: A customized homomorphic encryption accelerator for fast matrix-vector product. 2023 60th ACM/IEEE Design Automation Conference (DAC).
Zhaodong Chen, Zheng Qu, Yuying Quan, Liu Liu, Yufei Ding, Prof. Yuan Xie (2023). Dynamic N:M fine-grained structured sparse attention mechanism. Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming.
Guiming Wu, Qianwen He, Jiali Jiang, Zhenxiang Zhang, Yunfeng Shi, Xin Long, Linquan Jiang, Shuangchen Li, Prof. Yuan Xie, Changzheng Wei, et al. (2023). E-booster: A field-programmable gate array-based accelerator for secure tree boosting using additively homomorphic encryption. IEEE Micro.
Siqi Li, Fengbin Tu, Liu Liu, Jilan Lin, Zheng Wang, Yangwook Kang, Yufei Ding, Prof. Yuan Xie (2023). ECSSD: Hardware/data layout co-designed in-storage-computing architecture for extreme classification. Proceedings of the 50th Annual International Symposium on Computer Architecture.
Bizhao Shi, Jiaxi Zhang, Zhuolun He, Xuechao Wei, Sicheng Li, Guojie Luo, Hongzhong Zheng, Prof. Yuan Xie (2023). Efficient super-resolution system with block-wise hybridization and quantized Winograd on FPGA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
Nan Wu, Yingjie Li, Cong Hao, Steve Dai, Cunxi Yu, Prof. Yuan Xie (2023). Gamora: Graph learning based symbolic reasoning for large-scale Boolean networks. 2023 60th ACM/IEEE Design Automation Conference (DAC).
Ao Ren, Yuhao Wang, Tao Zhang, Jiaxing Shi, Duo Liu, Xianzhang Chen, Yujuan Tan, Prof. Yuan Xie (2023). HBP: Hierarchically balanced pruning and accelerator co-design for efficient DNN inference. 2023 60th ACM/IEEE Design Automation Conference (DAC).
Yiquan Chen, Zhen Jin, Yijing Wang, Yi Chen, Dr. Hao Yu, Jiexiong Xu, Jinlong Chen, Wenhai Lin, Kanghua Fang, Chengkun Wei, et al. (2023). High-performance and scalable software-based NVMe virtualization mechanism with I/O queues passthrough. arXiv preprint arXiv:2304.05148.
Dr. Chen Bai, Xuechao Wei, Youwei Zhuo, Yi Cai, Hongzhong Zheng, Bei Yu, Prof. Yuan Xie (2023). Klotski: DNN model orchestration framework for dataflow architecture accelerators. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD).
Jin Lin, Xiaotong Luo, Ming Hong, Yanyun Qu, Prof. Yuan Xie, Zongze Wu (2023). Memory-friendly scalable super-resolution via rewinding lottery ticket hypothesis. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Dr. Zhenhua Zhu, Hanbo Sun, Tongxin Xie, Yu Zhu, Guohao Dai, Lixue Xia, Dimin Niu, Xiaoming Chen, Xiaobo Sharon Hu, Yu Cao, et al. (2023). MNSIM 2.0: A behavior-level modeling tool for processing-in-memory architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
Xinfeng Xie, Peng Gu, Yufei Ding, Dimin Niu, Hongzhong Zheng, Prof. Yuan Xie (2023). MPU: Memory-centric SIMT processor via in-DRAM near-bank computing. ACM Transactions on Architecture and Code Optimization.
Yuanwei Fang, Zihao Liu, Yanheng Lu, Jiawei Liu, Jiajie Li, Yi Jin, Jian Chen, Yenkuang Chen, Hongzhong Zheng, Prof. Yuan Xie (2023). NPS: a framework for accurate program sampling using graph neural network. arXiv preprint arXiv:2304.08880.
Zhaoyang Du, Yijin Guan, Tianchan Guan, Dimin Niu, Nianxiong Tan, Xiaopeng Yu, Hongzhong Zheng, Jianyi Meng, Xiaolang Yan, Prof. Yuan Xie (2023). Predicting the output structure of sparse matrix multiplication with sampled compression ratio. 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS).
Guyue Huang, Zhengyang Wang, Po-an Tsai, Chen Zhang, Yufei Ding, Prof. Yuan Xie (2023). RM-STC: Row-merge dataflow inspired GPU sparse tensor core for energy-efficient sparse acceleration. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture.
Zhiyao Li, Jiaxiang Li, Taijie Chen, Dimin Niu, Hongzhong Zheng, Prof. Yuan Xie, Mingyu Gao (2023). Spada: Accelerating sparse matrix multiplication with adaptive dataflow. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2.
Ling Liang, Jilan Lin, Zheng Qu, Ishtiyaque Ahmad, Fengbin Tu, Trinabh Gupta, Yufei Ding, Prof. Yuan Xie (2023). SPG: Structure-private graph database via SqueezePIR. Proceedings of the VLDB Endowment.
Zheng Qu, Dimin Niu, Shuangchen Li, Hongzhong Zheng, Prof. Yuan Xie (2023). TT-GNN: Efficient on-chip graph neural network training via embedding reformation and hardware optimization. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture.
Prof. Yuan Xie, Jiawei Ren, Ji Xu (2022). Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform. Ocean Engineering.
Nan Wu, Prof. Yuan Xie, Cong Hao (2022). AI-assisted synthesis in next generation EDA: Promises, challenges, and prospects. 2022 IEEE 40th International Conference on Computer Design (ICCD).
Anbang Wu, Hezi Zhang, Gushu Li, Alireza Shabani, Prof. Yuan Xie, Yufei Ding (2022). AutoComm: A framework for enabling efficient communication in distributed quantum programs. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO).
Wenqin Huangfu, Krishna T Malladi, Andrew Chang, Prof. Yuan Xie (2022). BEACON: Scalable near-data-processing accelerators for genome analysis near memory pool with the CXL support. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO).
Mingyu Yan, Mo Zou, Xiaocheng Yang, Wenming Li, Xiaochun Ye, Dongrui Fan, Prof. Yuan Xie (2022). Characterizing and understanding HGNNs on GPUs. IEEE Computer Architecture Letters.
Minghai Qin, Tianyun Zhang, Fei Sun, Yen-Kuang Chen, Makan Fardad, Yanzhi Wang, Prof. Yuan Xie (2022). Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting. 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI).
Liu Liu, Zheng Qu, Zhaodong Chen, Fengbin Tu, Yufei Ding, Prof. Yuan Xie (2022). Dynamic sparse attention for scalable transformer acceleration. IEEE Transactions on Computers.
Linyong Huang, Zhe Zhang, Zhaoyang Du, Shuangchen Li, Hongzhong Zheng, Prof. Yuan Xie, Nianxiong Tan (2022). EPQuant: A Graph Neural Network compression approach based on product quantization. Neurocomputing.
Xuanle Ren, Le Su, Zhen Gu, Sheng Wang, Feifei Li, Prof. Yuan Xie, Song Bian, Chao Li, Fan Zhang (2022). HEDA: multi-attribute unbounded aggregation over homomorphically encrypted database. Proceedings of the VLDB Endowment.
Sicheng Li, Dr. Chen Bai, Xuechao Wei, Bizhao Shi, Yen-Kuang Chen, Prof. Yuan Xie (2022). ICCAD CAD Contest 2022.
Zejiang Hou, Fei Sun, Yen-Kuang Chen, Prof. Yuan Xie, Sun-Yuan Kung (2022). MILAN: Masked image pretraining on language assisted representation. arXiv preprint arXiv:2208.06049.
Gongjian Sun, Mingyu Yan, Duo Wang, Han Li, Wenming Li, Xiaochun Ye, Dongrui Fan, Prof. Yuan Xie (2022). Multi-node acceleration for large-scale GCNs. IEEE Transactions on Computers.
Tianxue Ma, Mingwei Bi, Jian Zhang, Wang Yuan, Zhizhong Zhang, Prof. Yuan Xie, Shouhong Ding, Lizhuang Ma (2022). Mutually reinforcing structure with proposal contrastive consistency for few-shot object detection. European Conference on Computer Vision.
Zhaoyang Du, Yijin Guan, Tianchan Guan, Dimin Niu, Linyong Huang, Hongzhong Zheng, Prof. Yuan Xie (2022). OpSparse: a highly optimized framework for sparse general matrix multiplication on GPUs. IEEE Access.
Jiangming Wang, Zhizhong Zhang, Mingang Chen, Yi Zhang, Cong Wang, Bin Sheng, Yanyun Qu, Prof. Yuan Xie (2022). Optimal transport for label-efficient visible-infrared person re-identification. European Conference on Computer Vision.
Fengbin Tu, Yiqi Wang, Zihan Wu, Ling Liang, Yufei Ding, Bongjin Kim, Leibo Liu, Shaojun Wei, Prof. Yuan Xie, Shouyi Yin (2022). ReDCIM: Reconfigurable digital computing-in-memory processor with unified FP/INT pipeline for cloud AI acceleration. IEEE Journal of Solid-State Circuits.
Jeong-Jun Lee, Wenrui Zhang, Prof. Yuan Xie, Peng Li (2022). SaARSP: An architecture for systolic-array acceleration of recurrent spiking neural networks. ACM Journal on Emerging Technologies in Computing Systems (JETC).
Yiqi Wang, Fengbin Tu, Leibo Liu, Shaojun Wei, Prof. Yuan Xie, Shouyi Yin (2022). SPCIM: Sparsity-balanced practical CIM accelerator with optimized spatial-temporal multi-macro utilization. IEEE Transactions on Circuits and Systems I: Regular Papers.
Zihao Zhao, Yanhong Wang, Qiaosha Zou, Tie Xu, Fangbo Tao, Jiansong Zhang, Xiaoan Wang, C-J Richard Shi, Junwen Luo, Prof. Yuan Xie (2022). The spike gating flow: A hierarchical structure-based spiking neural network for online gesture recognition. Frontiers in Neuroscience.
Ling Liang, Kaidi Xu, Xing Hu, Lei Deng, Prof. Yuan Xie (2022). Toward robust spiking neural network against adversarial perturbation. Advances in Neural Information Processing Systems.
Fengbin Tu, Zihan Wu, Yiqi Wang, Ling Liang, Liu Liu, Yufei Ding, Leibo Liu, Shaojun Wei, Prof. Yuan Xie, Shouyi Yin (2022). TranCIM: Full-digital bitline-transpose CIM-based sparse transformer accelerator with pipeline/parallel reconfigurable modes. IEEE Journal of Solid-State Circuits.
Prof. Yuan Xie, Jiawei Ren, Ji Xu (2022). Underwater-Art: Expanding information perspectives with text templates for underwater acoustic target recognition. The Journal of the Acoustical Society of America.