04-01, 15:40–17:00 (CET), Rotterdam hall 1B
Session Chair: Yu Hua (Huazhong Univ. of Science and Technology)
Composing Distributed Computations Through Task and Kernel Fusion
Rohan Yadav (Stanford University), Shiv Sundram (Stanford University), Wonchan Lee (NVIDIA), Michael Garland (NVIDIA), Michael Bauer (NVIDIA), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University)
Paper
CXLfork: Fast Remote Fork over CXL Fabrics
Chloe Alverti (University of Illinois Urbana-Champaign), Stratos Psomadakis (National Technical University of Athens), Burak Ocalan (University of Illinois Urbana-Champaign), Shashwat Jaiswal (University of Illinois Urbana-Champaign), Tianyin Xu (University of Illinois Urbana-Champaign), Josep Torrellas (University of Illinois Urbana-Champaign)
Paper
OS2G: A High-Performance DPU Offloading Architecture for GPU-based Deep Learning with Object Storage
Zhen Jin (Zhejiang University, Alibaba Group), Yiquan Chen (Alibaba Group), Mingxu Liang (Alibaba Group), Yijing Wang (Alibaba Group), Guoju Fang (Alibaba Group), Ao Zhou (Alibaba Group), Keyao Zhang (Zhejiang University), Jiexiong Xu (Zhejiang University), Wenhai Lin (Zhejiang University), Yiquan Lin (Zhejiang University), Shushu Zhao (Alibaba Group), Wenkai Shi (Alibaba Group), Zhenhua He (Alibaba Group), Shishun Cai (Alibaba Group), Wenzhi Chen (Zhejiang University)
Paper
pulse: Accelerating Distributed Pointer-Traversals on Disaggregated Memory
Yupeng Tang (Yale University), Seung-seob Lee (Yale University), Abhishek Bhattacharjee (Yale University), Anurag Khandelwal (Yale University)
Paper