H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference

Jan 1, 2025

Cong Li, Yihan Yin, Xintong Wu, Jingchen Zhu, Zhutianya Gao, Dimin Niu, Qiang Wu, Xin Si, Prof. Yuan Xie, Chen Zhang, Others


Type: Conference paper

Publication: Proceedings of the 52nd Annual International Symposium on Computer Architecture

Last updated on Jan 1, 2025

Prof. Yuan Xie

Fang Professor of Engineering | Chair Professor | IEEE/ACM/AAAS Fellow
