Shared GPU


Deployment

  • GPU Manager Pro

    • Function: manages GPU devices and shields the underlying hardware from K8s; implemented as a Device Plugin
  • GPU Scheduler Pro

    • Function: handles Pod scheduling in GPU scenarios; implemented as a Scheduler Extender
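Conceptually, a percent-based Device Plugin advertises each physical card to kubelet as 100 schedulable units of the custom resource. The sketch below only illustrates that bookkeeping; the function name and device-ID scheme are hypothetical, not gpu-manager-pro's real API:

```python
def advertise_devices(num_physical_gpus: int) -> list[str]:
    """Expose each physical GPU to kubelet as 100 virtual device IDs,
    so a 'tencent.com/gpu-percent' request maps to whole units."""
    return [
        f"gpu-{g}-share-{s}"
        for g in range(num_physical_gpus)
        for s in range(100)
    ]

devices = advertise_devices(2)
print(len(devices))  # 200 schedulable units for 2 physical GPUs
```

With this model, a Pod requesting tencent.com/gpu-percent: 30 simply consumes 30 of these units on one card.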

ClusterRole, ServiceAccount, ClusterRoleBinding

```yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: gpu-manager-pro
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - update
  - patch
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
  - update
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gpu-manager-pro
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: gpu-manager-pro
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gpu-manager-pro
subjects:
- kind: ServiceAccount
  name: gpu-manager-pro
  namespace: kube-system
```

DaemonSet

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gpu-manager-pro
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: gpu-manager-pro
  template:
    metadata:
      labels:
        app: gpu-manager-pro
    spec:
      serviceAccount: gpu-manager-pro
      hostNetwork: true
      nodeSelector:
        nvidia-device-enable: "enable"
      containers:
      - image: ccr.ccs.tencentyun.com/tkeimages/gpu-manager-pro:v0.0.1
        name: gpu-manager-pro
        command:
        - gpu-manager-pro
        resources:
          limits:
            memory: "300Mi"
            cpu: "0.5"
          requests:
            memory: "300Mi"
            cpu: "0.5"
        env:
        - name: KUBECONFIG
          value: /etc/kubernetes/kubelet.conf
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins
```

ClusterRole, ServiceAccount, ClusterRoleBinding

```yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: gpu-scheduler-pro
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - update
  - patch
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - bindings
  - pods/binding
  verbs:
  - create
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gpu-scheduler-pro
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: gpu-scheduler-pro
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gpu-scheduler-pro
subjects:
- kind: ServiceAccount
  name: gpu-scheduler-pro
  namespace: kube-system
```

Deployment, Service

```yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: gpu-scheduler-pro
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-scheduler-pro
  template:
    metadata:
      labels:
        app: gpu-scheduler-pro
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      hostNetwork: true
      tolerations:
      - effect: NoSchedule
        operator: Exists
        key: node-role.kubernetes.io/master
      serviceAccount: gpu-scheduler-pro
      containers:
      - name: gpu-scheduler-pro
        image: ccr.ccs.tencentyun.com/tkeimages/gpu-scheduler-pro:v0.0.2
        command: ["gpu-scheduler-pro"]
        args: ["-priority", "binpack"]
        env:
        - name: PORT
          value: "12345"
---
apiVersion: v1
kind: Service
metadata:
  name: gpu-scheduler-pro
  namespace: kube-system
  labels:
    app: gpu-scheduler-pro
spec:
  ports:
  - port: 12345
    name: http
    targetPort: 12345
  selector:
    app: gpu-scheduler-pro
```

Modifying the kube-scheduler configuration

Create the scheduler policy file. Note that the extender's urlPrefix must point at the gpu-scheduler-pro Service deployed above:

```shell
vim /etc/kubernetes/scheduler-policy-config.json
```

```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    {
      "urlPrefix": "http://gpu-scheduler-pro.kube-system:12345/scheduler",
      "filterVerb": "filter",
      "prioritizeVerb": "priorities",
      "weight": 2,
      "bindVerb": "bind",
      "enableHttps": false,
      "nodeCacheCapable": true,
      "managedResources": [
        {
          "name": "tencent.com/gpu-percent",
          "ignoredByScheduler": false
        }
      ],
      "ignorable": false
    }
  ]
}
```

Then edit the kube-scheduler static Pod manifest:

```shell
vim /etc/kubernetes/manifests/kube-scheduler.yaml
```

Add the policy flag to the scheduler's command:

```yaml
    - --policy-config-file=/etc/kubernetes/scheduler-policy-config.json
```

Mount the policy file into the Pod, and set dnsPolicy so the extender's Service name still resolves under hostNetwork:

```yaml
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler-policy-config.json
      name: scheduler-policy-config
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler-policy-config.json
      type: FileOrCreate
    name: scheduler-policy-config
  dnsPolicy: ClusterFirstWithHostNet
```
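The extender registered in the policy file answers the filter, priorities, and bind verbs over HTTP. As a rough illustration of the filter step only (the free-capacity map and function name are assumptions for this sketch, not the extender's actual code):

```python
def filter_nodes(nodes_free_percent: dict[str, int], requested: int) -> list[str]:
    """Keep only nodes whose free tencent.com/gpu-percent capacity
    can satisfy the Pod's request."""
    return [
        node
        for node, free in nodes_free_percent.items()
        if free >= requested
    ]

free = {"node-a": 40, "node-b": 70, "node-c": 100}
print(filter_nodes(free, 60))  # ['node-b', 'node-c']
```

The real extender additionally tracks per-card allocations so a shared request never straddles two physical GPUs.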
Finally, label the GPU nodes so the gpu-manager-pro DaemonSet is scheduled onto them:

```shell
kubectl label node *.*.*.* nvidia-device-enable=enable
```

Exclusive mode

  1. tencent.com/gpu-percent: "100" occupies one full card

  2. tencent.com/gpu-percent: "320" occupies four full cards (rounded up)
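The card count implied by a request above 100 is a ceiling division; the helper below is a hypothetical illustration, not part of the scheduler's code:

```python
import math

def cards_for_request(gpu_percent: int) -> int:
    """Number of whole GPUs occupied by a tencent.com/gpu-percent
    request of 100 or more; partial cards round up."""
    return math.ceil(gpu_percent / 100)

print(cards_for_request(100))  # 1 full card
print(cards_for_request(320))  # 4 full cards (rounded up)
```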

Shared mode

  1. tencent.com/gpu-percent: "60" occupies 60% of one card's resources (here "resources" can be read as either GPU memory or compute)

  2. tencent.com/gpu-percent: "35" occupies 35% of one card's resources

Sharing policies

This solution provides three sharing policies:

  • Spread: balance resource usage across the GPU cards in the cluster

  • Random: assign tasks to GPUs at random

  • Binpack: prefer to "fill up" GPU cards that already have resources allocated

The policy is selected via the -priority flag of gpu-scheduler-pro; valid values are "spread", "random", and "binpack".
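The difference between binpack and spread can be pictured as opposite sort orders over per-GPU utilization. This is a toy sketch of the idea; the function and scoring rule are assumptions, not the scheduler's actual code:

```python
import random

def rank_gpus(used_percent: list[int], policy: str) -> list[int]:
    """Return GPU indices in preferred placement order.
    binpack: most-used cards first; spread: least-used cards first;
    random: arbitrary order."""
    indices = list(range(len(used_percent)))
    if policy == "binpack":
        return sorted(indices, key=lambda i: -used_percent[i])
    if policy == "spread":
        return sorted(indices, key=lambda i: used_percent[i])
    random.shuffle(indices)
    return indices

used = [80, 0, 40]  # current utilization of three GPUs
print(rank_gpus(used, "binpack"))  # [0, 2, 1]
print(rank_gpus(used, "spread"))   # [1, 2, 0]
```

Binpack keeps whole cards free for future exclusive-mode requests, at the cost of more contention on shared cards; spread does the opposite.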

Usage

```yaml
        resources:
          limits:
            tencent.com/gpu-percent: 30
```

Reference: docs.qq.com/doc/DSlFZWE…

Reference: cloud.tencent.com/developer/a…

Testing

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: vcuda-test
    qcloud-app: vcuda-test
  name: vcuda-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: vcuda-test
  template:
    metadata:
      labels:
        k8s-app: vcuda-test
        qcloud-app: vcuda-test
    spec:
      containers:
      - command:
        - sleep
        - 360000s
        env:
        - name: PATH
          value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        image: menghe.tencentcloudcr.com/public/tensorflow-gputest:0.2
        imagePullPolicy: IfNotPresent
        name: tensorflow-test
```

Run the training job inside the Pod:

```shell
cd /data/tensorflow/mnist && python convolutional.py
```

On the physical machine, watch per-process GPU usage with:

```shell
nvidia-smi pmon -s u -d 1
```

Limiting GPU power (GPU Fan ERR)

Enable the persistence daemon, then cap the power draw of each card:

```shell
/usr/bin/nvidia-persistenced --verbose

# Limit GPUs 2, 1, and 0 to 240 W (-i selects the GPU index)
nvidia-smi -pl 240 -i 2
nvidia-smi -pl 240 -i 1
nvidia-smi -pl 240 -i 0
```

Installing the driver

Uninstall the old driver, download the new one, then run the installer:

```shell
nvidia-uninstall
wget us.download.nvidia.com/tesla/440.9…
chmod +x NVIDIA-Linux-x86_64-440.95.01.run
./NVIDIA-Linux-x86_64-440.95.01.run
```

GPU management commands

Manually controlling fan speed

```shell
# Generate /etc/X11/xorg.conf
nvidia-xconfig

# Enable all GPUs
nvidia-xconfig --enable-all-gpus

# Allow manual fan-speed control (Coolbits)
nvidia-xconfig --cool-bits=4

reboot
```

```shell
pip install coolgpus

$(which coolgpus) --temp 15 85 --speed 20 90
```

As the GPU temperature rises from 15 °C to 85 °C, the fan speed increases linearly from 20% to 90%.
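That linear mapping can be sketched as follows; this is a simplified illustration, and clamping at the endpoints is an assumption about coolgpus's behavior:

```python
def fan_speed(temp: float, t_min=15, t_max=85, s_min=20, s_max=90) -> float:
    """Linearly interpolate fan speed (%) between s_min and s_max as the
    GPU temperature moves from t_min to t_max, clamped at the endpoints."""
    if temp <= t_min:
        return s_min
    if temp >= t_max:
        return s_max
    return s_min + (temp - t_min) * (s_max - s_min) / (t_max - t_min)

print(fan_speed(15))  # 20
print(fan_speed(85))  # 90
print(fan_speed(50))  # 55.0 (midpoint of the temperature range)
```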

```shell
# Switch each GPU to manual fan control, then set per-fan speeds (%)
nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
nvidia-settings -a "[gpu:1]/GPUFanControlState=1"
nvidia-settings -a "[gpu:2]/GPUFanControlState=1"
nvidia-settings -a "[gpu:3]/GPUFanControlState=1"
nvidia-settings -a "[fan:0]/GPUCurrentFanSpeed=80"
nvidia-settings -a "[fan:1]/GPUCurrentFanSpeed=85"
nvidia-settings -a "[fan:2]/GPUCurrentFanSpeed=86"
nvidia-settings -a "[fan:3]/GPUCurrentFanSpeed=90"
```

Enabling persistence mode

```shell
nvidia-smi -pm 1
```

Limiting GPU power:

```shell
/usr/bin/nvidia-smi -pl 240 -i 0
```

Reference: mp.weixin.qq.com/s/jnn4IVHwR…