
Accommodating LLM Service over Heterogeneous Computational Resources

----------------------------------------------------------------------------------------------------







    Department of Systems Engineering and Engineering Management



                       The Chinese University of Hong Kong



----------------------------------------------------------------------------------------------------



Date: Friday, February 14, 4:00 pm – 5:30 pm



Venue: ERB 513, The Chinese University of Hong Kong



Title: Accommodating LLM Service over Heterogeneous Computational Resources



Speaker: Professor Binhang YUAN, HKUST





Abstract:



Serving large language models (LLMs) is crucial to contemporary AI
applications. We focus on deploying such services in a heterogeneous
and potentially decentralized setting to mitigate the substantial
costs typically associated with centralized data centers. Our work
relies on carefully designed scheduling algorithms: we precisely model
the computation capacity and inter-machine connections, and propose an
efficient search algorithm to find the optimal allocations for
different LLM serving paradigms. Our empirical study suggests that the
proposed method can efficiently reduce the service cost while
preserving the service quality.



Biography:



Binhang YUAN is an Assistant Professor at the Department of Computer
Science and Engineering (CSE), the Hong Kong University of Science and
Technology (HKUST). He received his Ph.D. and master's degrees from
Rice University and his bachelor's degree from Fudan University.
Before joining HKUST, he was a Postdoc at the Swiss Federal Institute
of Technology Zurich (ETH Zurich). His main research interests are in
data management systems for machine learning and distributed and
decentralized machine learning systems. He won the VLDB Best Paper
Honorable Mention Award in 2019 and the SIGMOD Research Highlight
Award in 2020.

Everyone is welcome to attend the talk!

Date: Friday, February 14, 2025 - 17:00