Public Research Seminar by Microelectronics Thrust, Function Hub, HKUST (GZ) - Building Network Abstractions on Specialized Hardware for Distributed Computing

3:00pm - 4:00pm
Zoom (Meeting ID: 868 8940 6840, Password: 846486)

In recent years, data centers have shifted significantly towards heterogeneity to accommodate ever-growing workloads. Specialized hardware, particularly FPGAs, is widely deployed for its custom circuit efficiency and reconfigurability. Despite this potential, the development of distributed FPGA-accelerated applications is hindered by the lack of suitable communication infrastructures and abstractions. To bridge this gap, this talk introduces a suite of open-source communication infrastructures tailored for hardware accelerators. These infrastructures support a variety of protocols, including TCP, RDMA, and MPI collectives, making them versatile across different platforms. With these infrastructures, specialized hardware can be used both as smartNICs, offloading networking tasks from the CPU, and as distributed accelerators that collectively handle large-scale applications. We will highlight the practical benefits and capabilities of these infrastructures through a case study on distributing deep learning recommendation model inference across a heterogeneous cluster.

Speaker/Performer:
Mr. Zhenhao HE
ETH Zurich

Zhenhao is a final-year PhD candidate in the Systems Group at ETH Zurich, where he also completed his master's degree after obtaining his bachelor's degree from Tongji University. His research focuses on enhancing data processing systems for large-scale workloads by leveraging heterogeneous hardware, distributed computing, and advanced data center networking. He develops specialized networking abstractions, including TCP, RDMA, and MPI, tailored for hardware accelerators to efficiently orchestrate heterogeneous clusters with smartNICs and in-network processors.

Language
English
Recommended for
Faculty and staff
Postgraduate students
Undergraduate students
Organizer
Function Hub, HKUST(GZ)