Meeting Minutes 2023-05-23

Time (mins)	Agenda Items	Presented By	Note /Links/
30		MIGU

MAR 230th, 2023 Attendance list: CMCC (Hanyu Ding), Arm (Tina), MIGU, Huawei (Jianpeng He), BUPT

MIGU introduced several problems with GPU scheduling:

1. In GPU exclusive mode, resource utilization is low. Is there a solution that uses intelligent scheduling to analyze the workload and dynamically adjust the number of instances, releasing GPUs so that they can be reused?

2. After GPU virtualization, multiple Pods can share one GPU card. Services on the same physical card compete for resources: because their traffic peaks and valleys may overlap, sharing leads to contention for GPU compute.

3. Different application services are developed against different cuda/cudnn versions, which causes compatibility problems.

4. The GPU resources used by a service cannot be adjusted dynamically.

5. Is there a solution for orchestrating Windows containers with k8s? Most rendering engines currently used by the rendering business are developed for Windows and have no Linux version.
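The contention described in item 2 can be illustrated with a minimal sketch. This was not presented in the meeting: the function name, the utilization figures, and the idea of modelling each service as a per-time-slot fraction of one card are all assumptions made for illustration only.

```python
from typing import List, Sequence


def overlapping_peak_slots(
    profile_a: Sequence[float],
    profile_b: Sequence[float],
    capacity: float = 1.0,
) -> List[int]:
    """Return the time-slot indices where combined demand exceeds the card's capacity.

    Each profile is a hypothetical series of GPU utilization fractions
    (0.0 to 1.0 of one physical card), one value per time slot.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same time slots")
    return [
        slot
        for slot, (a, b) in enumerate(zip(profile_a, profile_b))
        if a + b > capacity
    ]


# Made-up utilization profiles for two services sharing one card.
service_a = [0.2, 0.3, 0.7, 0.8, 0.4, 0.1]
service_b = [0.1, 0.2, 0.5, 0.6, 0.3, 0.2]

# Slots 2 and 3 are where both services peak at the same time.
print(overlapping_peak_slots(service_a, service_b))  # → [2, 3]
```

A check of this kind only flags when two services would collide on a shared card. Deciding which services to co-locate, or how to throttle them, is the open scheduling question raised above.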
