Kubernetes CSI初探

170 阅读4分钟

概述

CSI是Kubernetes比较复杂的开放接口设计,实际上存储并没有普通人想象得那么复杂。CSI的核心就是Controller接口和Node接口,为什么需要Controller,因为Volume需要动态Provision,需要提前准备,需要有快照功能。

details

volumeLifecycleModes

  • Ephemeral: 只在本地Node上挂载的临时持久性或非持久性存储,所有的参数都写在 Pod Spec 里,由 CSI 的 NodeServer 全权负责管理Volume,插件只需要实现IdentityServer和NodeServer
  • Persistence: 可在远程持久化的持久性存储,通过 PV/PVC 机制运行的,需要实现全面的接口IdentityServer、ControllerServer和NodeServer

k8s controller

external

  • k8s官方很贴心地把CSI中的一些机械操作实现成了sidecar,只需要把这些sidecar和CSI插件容器部署在一起,就能自动完成一些很酷的能力
  • registry.k8s.io/sig-storage/livenessprobe 帮助 NodeServer 暴露出 CSI 的 health 状况
  • registry.k8s.io/sig-storage/csi-node-driver-registrar 帮助 NodeServer 将CSI注册socket放置在kubelet监听的目录下
  • external-provisioner 监听了 APIServer 里的 PVC 对象。当一个 PVC 被创建时,它就会调用 CSI Controller 的 CreateVolume 方法,为你创建对应 PV。
  • external-snapshotter:是一个由官方 K8s sig 小组维护的辅助容器(sidecar),主要功能是实现持久卷的快照(VolumeSnapshot)、备份恢复等能力;
  • external-resizer:是一个由官方 K8s sig 小组维护的辅助容器(sidecar),主要功能是实现持久卷的弹性扩缩容,需要云厂商插件提供相应的能力

attach/dettach

  • 通过 StatefulSet 在任意一个节点上再启动一个 CSI 插件,为 External Components 提供 CSI Controller 服务。StatefulSet 是严格保证顺序的,即:只有在前一个 Pod 停止并删除之后,它才会创建并启动下一个 Pod。
  • 对于in-tree,Volume Controller(AttachDetachController)会不断检查已绑定的PV/PVC
  • 对于out-tree,k8s官方提供了external-attacher 监听到 VolumeAttachment informer,调用 RPC-ControllerPublishVolume 实现 AttachVolume;
  • 在实际使用 CSI 插件的时候,我们会将这三个 External Components 作为 sidecar 容器和 CSI 插件放置在同一个 Pod 中。由于 External Components 对 CSI 插件的调用非常频繁,所以这种 sidecar 的部署方式非常高效。

mount/unmount

  • VolumeManager 是 kubelet 组件的一部分,是一个独立于 kubelet 主循环的 Goroutine。
  • 在定义 DaemonSet Pod 的时候,我们需要把宿主机的 /var/lib/kubelet 以 Volume 的方式挂载进 CSI 插件容器的同名目录下,然后设置这个 Volume 的 mountPropagation=Bidirectional,即开启双向挂载传播,从而将容器在这个目录下进行的挂载操作“传播”给宿主机,反之亦然。
type VolumeManager interface {
    Run(sourcesReady config.SourcesReady, stopCh <-chan struct{})
	WaitForAttachAndMount(pod *v1.Pod) error
	GetMountedVolumesForPod(podName types.UniquePodName) container.VolumeMap
	GetExtraSupplementalGroupsForPod(pod *v1.Pod) []int64
	GetVolumesInUse() []v1.UniqueVolumeName
	ReconcilerStatesHasBeenSynced() bool
	VolumeIsAttached(volumeName v1.UniqueVolumeName) bool
	MarkVolumesAsReportedInUse(volumesReportedAsInUse []v1.UniqueVolumeName)
}

func (rc *reconciler) reconcile() {
	// Unmounts are triggered before mounts so that a volume that was
	// referenced by a pod that was deleted and is now referenced by another
	// pod is unmounted from the first pod before being mounted to the new
	// pod.
	rc.unmountVolumes()

	// Next we mount required volumes. This function could also trigger
	// attach if kubelet is responsible for attaching volumes.
	// If underlying PVC was resized while in-use then this function also handles volume
	// resizing.
	rc.mountAttachVolumes()

	// Ensure devices that should be detached/unmounted are detached/unmounted.
	rc.unmountDetachDevices()
}

hostpath csi

csi.IdentityServer

type IdentityServer interface {
	GetPluginInfo(context.Context, *GetPluginInfoRequest) (*GetPluginInfoResponse, error)
	GetPluginCapabilities(context.Context, *GetPluginCapabilitiesRequest) (*GetPluginCapabilitiesResponse, error)
	Probe(context.Context, *ProbeRequest) (*ProbeResponse, error)
}

csi.ControllerServer

type ControllerServer interface {
	CreateVolume(context.Context, *CreateVolumeRequest) (*CreateVolumeResponse, error)
	DeleteVolume(context.Context, *DeleteVolumeRequest) (*DeleteVolumeResponse, error)
	ControllerPublishVolume(context.Context, *ControllerPublishVolumeRequest) (*ControllerPublishVolumeResponse, error)
	ControllerUnpublishVolume(context.Context, *ControllerUnpublishVolumeRequest) (*ControllerUnpublishVolumeResponse, error)
	ValidateVolumeCapabilities(context.Context, *ValidateVolumeCapabilitiesRequest) (*ValidateVolumeCapabilitiesResponse, error)
	ListVolumes(context.Context, *ListVolumesRequest) (*ListVolumesResponse, error)
	GetCapacity(context.Context, *GetCapacityRequest) (*GetCapacityResponse, error)
	ControllerGetCapabilities(context.Context, *ControllerGetCapabilitiesRequest) (*ControllerGetCapabilitiesResponse, error)
	CreateSnapshot(context.Context, *CreateSnapshotRequest) (*CreateSnapshotResponse, error)
	DeleteSnapshot(context.Context, *DeleteSnapshotRequest) (*DeleteSnapshotResponse, error)
	ListSnapshots(context.Context, *ListSnapshotsRequest) (*ListSnapshotsResponse, error)
	ControllerExpandVolume(context.Context, *ControllerExpandVolumeRequest) (*ControllerExpandVolumeResponse, error)
	ControllerGetVolume(context.Context, *ControllerGetVolumeRequest) (*ControllerGetVolumeResponse, error)
	ControllerModifyVolume(context.Context, *ControllerModifyVolumeRequest) (*ControllerModifyVolumeResponse, error)
}
  • CSI Controller 服务的实际调用者,并不是 Kubernetes,而是 External Provisioner 和 External Attacher。这两个 External Components,分别通过监听 PVC 和 VolumeAttachement 对象,来跟 Kubernetes 进行协作。

CreateVolume:

  • hp.state.GetVolumeByName
  • hp.createVolume
  • hp.loadFromSnapshot

CreateSnapshot:

  • hostPathVolume, err := hp.state.GetVolumeByID(volumeID)
  • snapshotID := uuid.NewUUID().String()
  • file := hp.getSnapshotPath(snapshotID)
  • hp.createSnapshotFromVolume(hostPathVolume, file)

csi.NodeServer

type NodeServer interface {
	NodeStageVolume(context.Context, *NodeStageVolumeRequest) (*NodeStageVolumeResponse, error)
	NodeUnstageVolume(context.Context, *NodeUnstageVolumeRequest) (*NodeUnstageVolumeResponse, error)
	NodePublishVolume(context.Context, *NodePublishVolumeRequest) (*NodePublishVolumeResponse, error)
	NodeUnpublishVolume(context.Context, *NodeUnpublishVolumeRequest) (*NodeUnpublishVolumeResponse, error)
	NodeGetVolumeStats(context.Context, *NodeGetVolumeStatsRequest) (*NodeGetVolumeStatsResponse, error)
	NodeExpandVolume(context.Context, *NodeExpandVolumeRequest) (*NodeExpandVolumeResponse, error)
	NodeGetCapabilities(context.Context, *NodeGetCapabilitiesRequest) (*NodeGetCapabilitiesResponse, error)
	NodeGetInfo(context.Context, *NodeGetInfoRequest) (*NodeGetInfoResponse, error)
}

NodeStageVolume:

  • vol.Staged.Add(stagingTargetPath)

NodePublishVolume:

  • mounter := mount.New("")
  • mounter.Mount(path, targetPath, "", options)

csi.GroupControllerServer

type GroupControllerServer interface {
	GroupControllerGetCapabilities(context.Context, *GroupControllerGetCapabilitiesRequest) (*GroupControllerGetCapabilitiesResponse, error)
	CreateVolumeGroupSnapshot(context.Context, *CreateVolumeGroupSnapshotRequest) (*CreateVolumeGroupSnapshotResponse, error)
	DeleteVolumeGroupSnapshot(context.Context, *DeleteVolumeGroupSnapshotRequest) (*DeleteVolumeGroupSnapshotResponse, error)
	GetVolumeGroupSnapshot(context.Context, *GetVolumeGroupSnapshotRequest) (*GetVolumeGroupSnapshotResponse, error)
}

practice

开源实现

安装host-path

pre

host-path实现比较全面,需要提前安装好官方的二方CRD


wget https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/release-6.3/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
wget https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/release-6.3/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
wget https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/release-6.3/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml

wget https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v6.3.3/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
wget https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v6.3.3/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml

deploy

必须切换到default namespace

deploy/kubernetes-1.24/deploy.sh
deploy/kubernetes-1.24/destroy.sh

red deployment guide