ORANGE: Exploring Ockham's Razor for Neural Rendering by Accelerating 3DGS on NPUs with GEMM-Friendly Blending and Balanced Workloads (HPCA 2026 - Main Conference)

Who

Haomin Li, Yue Liang, Fangxin Liu, Bowen Zhu, Zongwu Wang, Yu Feng, Liqiang Lu, Li Jiang, Haibing Guan

Track

HPCA 2026 Main Conference

Time Zone

The program is currently displayed in (GMT+11:00) Hobart.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 2 Feb 2026 16:50 - 17:10 at Cronulla - 3D Graphics and Rendering Acceleration Chair(s): Yunho Oh

Abstract

3D Gaussian Splatting (3DGS) is an emerging neural rendering technique that delivers efficient and high-fidelity rendering, meeting the growing demands of applications such as AR/VR. As 3DGS is increasingly integrated into diverse applications, DNNs are often deployed alongside it to support tasks such as skeletal pose estimation for human avatars or semantic processing for 3D perception. Unfortunately, existing domain-specific accelerators (DSAs) designed for 3DGS excel at rendering but struggle to execute DNN workloads efficiently. Moreover, these DSAs incur significant design and fabrication costs, limiting their practicality. To address these challenges, we propose ORANGE, a novel approach that enables general-purpose DNN-oriented Neural Processing Units (NPUs) to efficiently execute 3DGS without requiring specialized accelerators. The key idea of ORANGE is a GEMM-friendly blending process, which reformulates the conventional 3DGS blending operation so that rendering fully utilizes the matrix multiplication units prevalent in NPUs. Additionally, to mitigate workload imbalances caused by variable execution latencies across tiles, we develop a sampling-based latency prediction method paired with a tile batching strategy to minimize idle computing resources. Experiments demonstrate that ORANGE achieves up to $1.67\times$ and $15.5\times$ speedup compared to state-of-the-art 3DGS accelerators and the NVIDIA Xavier NX GPU, respectively, in neural rendering tasks. Our approach offers a cost-effective and versatile solution, adhering to the principle of Ockham’s Razor by maximizing efficiency without specialized hardware.
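
To make the idea of "GEMM-friendly" blending concrete, here is a minimal sketch of how standard front-to-back 3DGS alpha blending can be written so that the final color accumulation is a single matrix product. This is not the ORANGE formulation, which the abstract does not spell out. The function names, array shapes, and the assumption that all pixels in a tile share the same depth-sorted Gaussian list are illustrative, and early termination on low transmittance is omitted.

```python
import numpy as np

def blend_sequential(alphas, colors):
    """Reference front-to-back alpha blending for a single pixel."""
    out = np.zeros(colors.shape[1])
    transmittance = 1.0
    for a, c in zip(alphas, colors):
        out += transmittance * a * c
        transmittance *= 1.0 - a
    return out

def blend_as_matmul(alphas, colors):
    """Blend a whole tile, with the color accumulation done as one matrix product.

    alphas: (P, N) opacity of each of N depth-sorted Gaussians at each of P pixels
    colors: (N, 3) RGB color of each Gaussian (assumed shared by all pixels)
    """
    # Exclusive cumulative product: transmittance in front of each Gaussian.
    ones = np.ones((alphas.shape[0], 1))
    trans = np.cumprod(np.hstack([ones, 1.0 - alphas[:, :-1]]), axis=1)
    weights = trans * alphas      # (P, N) blending weights
    return weights @ colors       # (P, N) x (N, 3) -> (P, 3)

# Illustrative tile: 4 pixels, 5 Gaussians, random values.
rng = np.random.default_rng(0)
alphas = rng.uniform(0.0, 0.9, size=(4, 5))
colors = rng.uniform(0.0, 1.0, size=(5, 3))
tile_rgb = blend_as_matmul(alphas, colors)
```

In this sketch only the last step maps onto a matrix multiplication unit; the cumulative product that builds the weights is still a sequential scan per pixel, which is one reason a hardware-oriented reformulation would need more than this.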

Haomin Li

Shanghai Jiao Tong University

China

Yue Liang

Shanghai Jiao Tong University

Fangxin Liu

Shanghai Jiao Tong University

Bowen Zhu

Shanghai Jiao Tong University

Zongwu Wang

Shanghai Jiao Tong University

Yu Feng

Shanghai Jiao Tong University

Liqiang Lu

Zhejiang University

Li Jiang

Shanghai Jiao Tong University

Haibing Guan

Shanghai Jiao Tong University

Session Program

Mon 2 Feb
Displayed time zone: Hobart

15:50 - 17:10	3D Graphics and Rendering Acceleration Main Conference at Cronulla Chair(s): Yunho Oh Korea University

15:50 20m Talk		GRTX: Efficient Ray Tracing for 3D Gaussian-Based Rendering Main Conference Junseo Lee Seoul National University, Sangyun Jeon Seoul National University, Jungi Lee Seoul National University, Junyong Park Seoul National University, Jaewoong Sim Seoul National University
16:10 20m Talk		Splatonic: Architecture Support for 3D Gaussian Splatting SLAM via Sparse Processing Main Conference Xiaotong Huang Shanghai Jiao Tong University, He Zhu Shanghai Jiao Tong University, Tianrui Ma Institute of Computing Technology, Chinese Academy of Sciences, Yuxiang Xiong Shanghai Jiao Tong University, Fangxin Liu Shanghai Jiao Tong University, Zhezhi He Shanghai Jiao Tong University, Yiming Gan Institute of Computing Technology, Chinese Academy of Sciences, Zihan Liu Shanghai Jiao Tong University, Jingwen Leng Shanghai Jiao Tong University, Yu Feng Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University
16:30 20m Talk		FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing Main Conference Yuzhe Fu Duke University, Changchun Zhou Duke University, Hancheng Ye Duke University, Bowen Duan Duke University, Qiyu Huang Yale University, Chiyue Wei Duke University, Cong Guo Duke University, Hai "Helen" Li Duke University, Yiran Chen Duke University
16:50 20m Talk		ORANGE: Exploring Ockham's Razor for Neural Rendering by Accelerating 3DGS on NPUs with GEMM-Friendly Blending and Balanced Workloads Main Conference Haomin Li Shanghai Jiao Tong University, Yue Liang Shanghai Jiao Tong University, Fangxin Liu Shanghai Jiao Tong University, Bowen Zhu Shanghai Jiao Tong University, Zongwu Wang Shanghai Jiao Tong University, Yu Feng Shanghai Jiao Tong University, Liqiang Lu Zhejiang University, Li Jiang Shanghai Jiao Tong University, Haibing Guan Shanghai Jiao Tong University