HPCA 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia
co-located with HPCA/CGO/PPoPP/CC 2026
Tue 3 Feb 2026 17:35 - 17:55 at Coogee - Industry Track Chair(s): Pradip Bose

As a major provider of LLM inference services, ByteDance has continuously explored diverse accelerator options to meet the rapidly growing inference demands of heterogeneous LLM scenarios more cost-effectively, thereby enabling LLMs to serve more people worldwide. In the process, however, we have found that the complexity and opacity of cloud scenarios and their accelerators make it difficult for academia and many innovative chip startups to fully understand the real demands and challenges of these scenarios, which in turn severely restricts innovation and application potential in this field.

To bridge this gap, we first present and analyze the data and characteristics of the ByteDance Doubao LLM app across multiple dimensions, helping the community understand real-world cloud scenarios, and detail the challenges and opportunities we have identified. Second, we propose and plan to open-source our multi-level evaluation framework, ByteMLPerf, which includes benchmarks spanning instructions, operators, and models. This framework improves interpretability and trustworthiness, and helps promising new accelerator architectures gain wider adoption and development. Finally, we present comparative results of four accelerators currently deployed at scale, summarize their shortcomings and challenges, conduct in-depth analysis, and highlight numerous architectural and scheduling innovation opportunities we have observed.

Tue 3 Feb

Displayed time zone: Hobart

17:15 - 18:15
Industry Track at Coogee
Chair(s): Pradip Bose IBM
17:15
20m
Industry talk
Enterprise Class On-Chip Accelerator Integration
Industry Track
17:35
20m
Industry talk
Characterizing Cloud-Native LLM Inference at ByteDance and Exposing Optimization Challenges and Opportunities for Future AI Accelerators
Industry Track
Jingwei Cai ByteDance Seed, Dehao Kong, Huang Hantao ByteDance Seed, Zishan Jiang ByteDance Seed, Zixuan Ma ByteDance Seed, Qingyu Guo ByteDance Seed, Zhenxing Zhang ByteDance Seed, Guiming Shi Tsinghua University, Mingyu Gao Tsinghua University, Kaisheng Ma Tsinghua University, Minghui Yu ByteDance Seed
17:55
20m
Industry talk
eGPU: Production-Scale Elastic Sharing over 10,000 GPUs
Industry Track
Xiaochuan Tang Alibaba Group, Hao Qi, Jianbo Dong Alibaba Group, Yinghao Yu Alibaba Group, Zhennan Xue Alibaba Group, Zhengyu Zhang Alibaba Group, Daocheng Ying Alibaba Group, Zheng Cao Alibaba Group, Xiaoyi Lu UC Merced