Processing in Memory
04-03, 14:00–15:40 (CET), Mees


Session Chair: Timothy Pinkston (University of Southern California)

PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System
Yintao He (SKLP, Institute of Computing Technology, CAS, University of Chinese Academy of Sciences), Haiyu Mao (King's College London, ETH Zürich), Christina Giannoula (University of Toronto, Vector Institute), Mohammad Sadrosadati (ETH Zürich), Juan Gómez-Luna (NVIDIA), Huawei Li (SKLP, Institute of Computing Technology, CAS, University of Chinese Academy of Sciences), Xiaowei Li (SKLP, Institute of Computing Technology, CAS, University of Chinese Academy of Sciences), Ying Wang (SKLP, Institute of Computing Technology, CAS), Onur Mutlu (ETH Zürich)
Paper

PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference
Yufeng Gu (University of Michigan), Alireza Khadem (University of Michigan), Sumanth Umesh (University of Michigan), Ning Liang (University of Michigan), Xavier Servot (ETH Zürich), Onur Mutlu (ETH Zürich), Ravi Iyer (Google), Reetuparna Das (University of Michigan)
Paper

CINM (Cinnamon): A Compilation Infrastructure for Heterogeneous Compute In-Memory and Compute Near-Memory Paradigms
Asif Ali Khan (Technische Universität Dresden), Hamid Farzaneh (Technische Universität Dresden), Karl Friedrich Alexander Friebel (Technische Universität Dresden), Clément Fournier (Technische Universität Dresden), Lorenzo Chelini (Intel), Jeronimo Castrillon (Technische Universität Dresden)
Paper

Toleo: Scaling Freshness to Tera-scale Memory Using CXL and PIM
Juechu Dong (University of Michigan), Jonah Rosenblum (University of Michigan), Satish Narayanasamy (University of Michigan)
Paper

Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators
Shixin Zhao (Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences), Yuming Li (Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences), Bing Li (Institute of Microelectronics, Chinese Academy of Sciences), Yintao He (Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences), Mengdi Wang (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Yinhe Han (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Ying Wang (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences)
Paper