HPCA 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia
co-located with HPCA/CGO/PPoPP/CC 2026
Tue 3 Feb 2026 17:55 - 18:15 at Coogee - Industry Track Chair(s): Pradip Bose

As the cost of GPUs continues to rise, GPU-sharing solutions have become increasingly important for improving efficiency and maximizing resource utilization. At the same time, large-scale operational deployments of such solutions remain relatively unexplored, especially in heterogeneous production environments where workload dynamics and orchestration complexity introduce new practical considerations. In this paper, we introduce eGPU, an elastic, efficient, and scalable GPU-sharing framework tailored for production-scale concurrent machine learning (ML) training and inference. eGPU enables fine-grained, runtime-adjustable sharing of GPUs across multiple jobs while preserving high resource utilization and fault isolation. To address communication bottlenecks, eGPU supports native NVLink/NCCL-based communication between shared GPU instances, a capability that is limited or unavailable in many existing designs. Built with production deployment in mind, eGPU integrates with Kubernetes (K8s) to support large-scale orchestration. It has been deployed in production clusters with over 10,000 GPUs and has run stably there for five years. Our evaluation results show that eGPU achieves elastic and precise control over instance sizes, improves job efficiency by 21% to 31% over state-of-the-art (SOTA) sharing solutions, reduces the number of GPUs required by up to 8×, and improves cluster GPU utilization by more than 3×.

Tue 3 Feb

Displayed time zone: Hobart

17:15 - 18:15
Industry Track at Coogee
Chair(s): Pradip Bose IBM
17:15
20m
Industry talk
Enterprise Class On-Chip Accelerator Integration
Industry Track
17:35
20m
Industry talk
Characterizing Cloud-Native LLM Inference at ByteDance and Exposing Optimization Challenges and Opportunities for Future AI Accelerators
Industry Track
Jingwei Cai ByteDance Seed, Dehao Kong, Huang Hantao ByteDance Seed, Zishan Jiang ByteDance Seed, Zixuan Ma ByteDance Seed, Qingyu Guo ByteDance Seed, Zhenxing Zhang ByteDance Seed, Guiming Shi Tsinghua University, Mingyu Gao Tsinghua University, Kaisheng Ma Tsinghua University, Minghui Yu ByteDance Seed
17:55
20m
Industry talk
eGPU: Production-Scale Elastic Sharing over 10,000 GPUs
Industry Track
Xiaochuan Tang Alibaba Group, Hao Qi, Jianbo Dong Alibaba Group, Yinghao Yu Alibaba Group, Zhennan Xue Alibaba Group, Zhengyu Zhang Alibaba Group, Daocheng Ying Alibaba Group, Zheng Cao Alibaba Group, Xiaoyi Lu UC Merced