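Problem description
After creating an rc with the kubectl create -f command and then checking the pod status with kubectl get pods, I found that the pod stayed stuck in the ContainerCreating state. The steps were as follows: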
# kubectl create -f mysql-rc.yaml
replicationcontroller "mysql" created
# kubectl get pods
NAME READY STATUS RESTARTS AGE
mysql-nznsb 0/1 ContainerCreating 0 12m
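When a namespace holds more than one pod, it can help to list only the pods that are not Running. The following is a minimal sketch, not part of the original steps: the function name not_running_pods is hypothetical, and it assumes the default column layout shown above (NAME READY STATUS RESTARTS AGE).

```shell
# Print the name and status of every pod whose STATUS column is not "Running".
# Reads `kubectl get pods` output on stdin and skips the header line.
# Assumption: default column order, so STATUS is the third field.
not_running_pods() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Usage (requires access to a cluster):
#   kubectl get pods | not_running_pods
```

Note that this also lists pods in states such as Completed, so it is only a rough filter.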
My mysql-rc.yaml:
apiVersion: v1
kind: ReplicationController
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql/mysql-server:8.0.18-1.1.13
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "123456"
Troubleshooting and solution
Use the kubectl describe command to view the pod's recent events:
# kubectl describe pod mysql
Name: mysql-dkh46
Namespace: default
Node: 127.0.0.1/127.0.0.1
Start Time: Sun, 04 Jul 2021 19:48:13 +0800
Labels: app=mysql
Status: Pending
IP:
Controllers: ReplicationController/mysql
Containers:
mysql:
Container ID:
Image: mysql/mysql-server:8.0.18-1.1.13
Image ID:
Port: 3306/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Volume Mounts: <none>
Environment Variables:
MYSQL_ROOT_PASSWORD: 123456
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 {default-scheduler } Normal Scheduled Successfully assigned mysql-dkh46 to 127.0.0.1
1m 28s 4 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request. details: (open /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt: no such file or directory)"
1m 2s 5 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"registry.access.redhat.com/rhel7/pod-infrastructure:latest\""
While pulling the image, the following error was reported:
Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request. details: (open /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt: no such file or directory)"
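Note that the image that fails here is the pod-infrastructure image used for the "POD" container, not the MySQL image from mysql-rc.yaml. As a rough sketch, the failing image can be pulled out of the event text with sed. The function name failed_image is hypothetical, and the pattern assumes the exact "image pull failed for <image>," wording shown in the message above.

```shell
# Extract the image reference from kubelet messages of the form
#   ... image pull failed for <image>, this may be because ...
# Reads text on stdin and prints one image per matching line.
# Assumption: the message wording matches the events shown above.
failed_image() {
  sed -n 's/.*image pull failed for \([^,"]*\),.*/\1/p'
}

# Usage (requires access to a cluster):
#   kubectl describe pod mysql | failed_image | sort -u
```

Running docker pull on the printed image from the node is one way to reproduce the failure outside of Kubernetes.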
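The cause is that the certificate /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt cannot be found. Running ll on it showed that this path is a symbolic link, and that the file it points to does not exist either: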
# ll /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt
lrwxrwxrwx. 1 root root 27 7月 4 17:26 /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt -> /etc/rhsm/ca/redhat-uep.pem
# ll /etc/rhsm/ca | grep redhat | wc -l
0
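The same check can be done in one step with the shell's file tests: -L is true for a symbolic link, while -e follows the link and is false when the target is missing. A minimal sketch, with check_link as a hypothetical helper name:

```shell
# Report whether a path exists, is missing, or is a symlink with a missing target.
check_link() {
  path="$1"
  if [ -L "$path" ] && [ ! -e "$path" ]; then
    # -L: the path itself is a symlink; ! -e: its target does not exist
    echo "dangling: $path -> $(readlink "$path")"
  elif [ -e "$path" ]; then
    echo "ok: $path"
  else
    echo "missing: $path"
  fi
}

# Usage:
#   check_link /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt
```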
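From searching online, I learned that the rhsm packages belong to Red Hat's subscription service. CentOS is rebuilt from Red Hat's releases, so it also relies on rhsm. The certificate path reported in the error is only a symbolic link; the certificate that is actually missing is /etc/rhsm/ca/redhat-uep.pem. Before a certain version this certificate was provided by the python-rhsm-certificates package, but on CentOS 7 that package is reported as replaced by subscription-manager-rhsm-certificates. The catch is that the replacement package has a bug: it reports a successful installation but does not actually install the certificate. Anyone interested can see the issue for details.
The issue describes an approach that requires neither downloading a package nor extracting the certificate from an old python-rhsm-certificates package. Running the following command is enough: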
openssl s_client -showcerts -servername registry.access.redhat.com -connect registry.access.redhat.com:443 </dev/null 2>/dev/null | openssl x509 -text > /etc/rhsm/ca/redhat-uep.pem
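Because the command above redirects straight into the target file, a failed connection would leave an empty redhat-uep.pem behind. The following is a more defensive sketch of the same idea, not the method from the issue itself: it writes to a temporary file, checks that openssl can parse the result as a certificate, and only then writes the destination. The function names is_valid_cert and fetch_registry_cert are hypothetical.

```shell
# Succeed only if the file is non-empty and openssl can parse a certificate from it.
is_valid_cert() {
  [ -s "$1" ] && openssl x509 -in "$1" -noout >/dev/null 2>&1
}

# Fetch the certificate presented by <host>:443 and store it at <dest>,
# leaving <dest> untouched if nothing usable was received.
fetch_registry_cert() {
  host="$1"
  dest="$2"
  tmp="$(mktemp)" || return 1
  openssl s_client -showcerts -servername "$host" -connect "$host:443" \
    </dev/null 2>/dev/null | openssl x509 -text > "$tmp"
  if is_valid_cert "$tmp"; then
    # Write through a redirect so the file is created in the destination directory.
    cat "$tmp" > "$dest"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    echo "no usable certificate received from $host" >&2
    return 1
  fi
}

# Usage (run as root on the node):
#   fetch_registry_cert registry.access.redhat.com /etc/rhsm/ca/redhat-uep.pem
```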
Once the certificate has been extracted, the pod starts normally:
# kubectl get pods
NAME READY STATUS RESTARTS AGE
mysql-nznsb 1/1 Running 0 29m
Reference links: