Background
- InMemory: A temporary storage driver using a local in-memory map. This exists solely for reference and testing.
- FileSystem: A local storage driver configured to use a directory tree in the local filesystem.
- S3: A driver storing objects in an Amazon Simple Storage Service (S3) bucket.
- Azure: A driver storing objects in Microsoft Azure Blob Storage.
- Swift: A driver storing objects in OpenStack Swift.
- OSS: A driver storing objects in Aliyun OSS.
- GCS: A driver storing objects in a Google Cloud Storage bucket.
$ docker run -d \
    -p 5000:5000 \
    --restart=always \
    --name registry \
    -v /mnt/registry:/var/lib/registry \
    registry:2
Production challenges
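- Performance: A Docker Registry process backed by a local disk filesystem has high read latency and cannot serve highly concurrent, high-throughput image requests. It is also bounded by a single machine's disk, CPU, and network resources, so it cannot handle the load of hundreds of machines pulling images at the same time.
- Capacity: A single machine's disk capacity is limited, which makes storage a bottleneck. Zhihu's production environment currently holds on the order of ten thousand or more image versions. A single copy takes about 15 TB, and replication adds considerably more.
- Access control: In production, the image registry needs proper authentication and authorization. A registry without authentication is like a Git repository without authentication: it easily leads to information leaks or code tampering.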
Zhihu's solution
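- Repository-directory-based access control: Different repository directories get different access rules. For example, /v2/path1 is a public repository directory that can be accessed directly, while /v2/path2 is a private repository directory that can only be accessed after authentication.
- Machine-based access control: Only specific machines are allowed to pull/push images.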
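The two rules above can be sketched as a small authorization check in Go. This is only an illustration under assumptions: the `policy` type, the `allowed` method, the host allowlist, and the `authenticated` flag are hypothetical names, and combining both rules in one check is our simplification rather than a description of the production setup. Only the `/v2/path1` and `/v2/path2` prefixes come from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// policy is a hypothetical access policy combining both rules.
type policy struct {
	publicPrefixes []string        // repository paths readable without authentication
	allowedHosts   map[string]bool // machines permitted to pull/push at all
}

// allowed reports whether a request from host for urlPath may proceed.
func (p policy) allowed(host, urlPath string, authenticated bool) bool {
	// Machine-based rule: machines not on the allowlist are rejected outright.
	if !p.allowedHosts[host] {
		return false
	}
	// Directory-based rule: public prefixes need no authentication.
	for _, prefix := range p.publicPrefixes {
		if strings.HasPrefix(urlPath, prefix) {
			return true
		}
	}
	// Every other path is treated as private and requires authentication.
	return authenticated
}

func main() {
	p := policy{
		publicPrefixes: []string{"/v2/path1/"},
		allowedHosts:   map[string]bool{"10.0.0.1": true},
	}
	fmt.Println(p.allowed("10.0.0.1", "/v2/path1/app/manifests/latest", false))
	fmt.Println(p.allowed("10.0.0.1", "/v2/path2/app/manifests/latest", false))
	fmt.Println(p.allowed("10.0.0.9", "/v2/path1/app/manifests/latest", true))
}
```

In practice a check like this would more likely live in the reverse proxy in front of the registry than in application code; the sketch only shows the decision logic.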
proxy_cache_path /dev/shm/registry-cache levels=1:2 keys_zone=registry-cache:10m max_size=124G;
type StorageDriver interface {
	// Name returns the human-readable "name" of the driver.
	Name() string

	// GetContent retrieves the content stored at "path" as a []byte.
	GetContent(ctx context.Context, path string) ([]byte, error)

	// PutContent stores the []byte content at a location designated by "path".
	PutContent(ctx context.Context, path string, content []byte) error

	// Reader retrieves an io.ReadCloser for the content stored at "path"
	// with a given byte offset.
	Reader(ctx context.Context, path string, offset int64) (io.ReadCloser, error)

	// Writer returns a FileWriter which will store the content written to it
	// at the location designated by "path" after the call to Commit.
	Writer(ctx context.Context, path string, append bool) (FileWriter, error)

	// Stat retrieves the FileInfo for the given path, including the current
	// size in bytes and the creation time.
	Stat(ctx context.Context, path string) (FileInfo, error)

	// List returns a list of the objects that are direct descendants of the
	// given path.
	List(ctx context.Context, path string) ([]string, error)

	// Move moves an object stored at sourcePath to destPath, removing the
	// original object.
	Move(ctx context.Context, sourcePath string, destPath string) error

	// Delete recursively deletes all objects stored at "path" and its subpaths.
	Delete(ctx context.Context, path string) error

	URLFor(ctx context.Context, path string, options map[string]interface{}) (string, error)
}

type FileWriter interface {
	io.WriteCloser

	// Size returns the number of bytes written to this FileWriter.
	Size() int64

	// Cancel removes any written content from this FileWriter.
	Cancel() error

	// Commit flushes all content written to this FileWriter and makes it
	// available for future calls to StorageDriver.GetContent and
	// StorageDriver.Reader.
	Commit() error
}
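Note the append parameter of StorageDriver's Writer method: it requires the storage backend and its client to support appending to existing content. The colinmarc/hdfs HDFS client did not implement an append method, so we added one ourselves.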
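To make the append semantics concrete, here is a minimal in-memory sketch of a writer that either starts from empty content or continues from what is already stored. It is not the HDFS driver: `memStore` and `memWriter` are hypothetical types, the `context.Context` argument is dropped for brevity, and having `Size` include the resumed content is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// memStore is a hypothetical in-memory backend used only for illustration.
type memStore struct {
	files map[string][]byte
}

// memWriter buffers writes and publishes them to the store on Commit.
type memWriter struct {
	store  *memStore
	path   string
	buf    []byte
	closed bool
}

// Writer mirrors the shape of StorageDriver.Writer. With appendMode set,
// the writer continues from the content already stored at path;
// otherwise it starts from empty content.
func (s *memStore) Writer(path string, appendMode bool) *memWriter {
	w := &memWriter{store: s, path: path}
	if appendMode {
		w.buf = append(w.buf, s.files[path]...)
	}
	return w
}

func (w *memWriter) Write(p []byte) (int, error) {
	if w.closed {
		return 0, errors.New("memWriter: already closed")
	}
	w.buf = append(w.buf, p...)
	return len(p), nil
}

// Size reports the buffered length, including content resumed via append.
func (w *memWriter) Size() int64 { return int64(len(w.buf)) }

func (w *memWriter) Close() error { w.closed = true; return nil }

// Cancel discards everything buffered in this writer.
func (w *memWriter) Cancel() error {
	w.buf = nil
	w.closed = true
	return nil
}

// Commit makes the buffered content visible to readers of the store.
func (w *memWriter) Commit() error {
	w.store.files[w.path] = append([]byte(nil), w.buf...)
	return nil
}

func main() {
	s := &memStore{files: map[string][]byte{}}

	w := s.Writer("/blob", false)
	w.Write([]byte("chunk1,"))
	w.Commit()
	w.Close()

	// A second writer resumes the same path in append mode.
	w2 := s.Writer("/blob", true)
	w2.Write([]byte("chunk2"))
	w2.Commit()
	w2.Close()

	fmt.Println(string(s.files["/blob"]), w2.Size())
}
```

A backend whose client cannot append would have to rewrite the whole object on every resumed write, which is why the missing append method in the HDFS client had to be filled in.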
Image cleanup
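In our continuous integration system, every production code release builds and publishes a container image, so the registry's storage usage keeps growing and unused images must be cleaned up promptly to free space. Docker Registry itself has no mechanism for configuring an image TTL, so a scheduled cleanup script has to be developed in-house.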
Docker Registry provides two ways to delete an image. One is to delete the image manifest:
DELETE /v2/<name>/manifests/<reference>
The other is to delete the image layer blob data directly:
DELETE /v2/<name>/blobs/<digest>
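Because image layers reference and depend on one another, we recommend the first approach: remove the references held by expired images, and let Docker Registry itself determine that a layer's data is no longer referenced before physically deleting it.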