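Today I logged in to my dev environment and found that kubectl had stopped working. It looks like something is wrong with the api-server: connections to port 6443 first hit a TLS handshake timeout and are then refused.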
[root@dev-1 root]# kubectl get pod
E0731 16:12:11.008773 3160 memcache.go:265] couldn't get current server API group list: Get "https://10.40.30.125:6443/api?timeout=32s": net/http: TLS handshake timeout
E0731 16:12:14.576349 3160 memcache.go:265] couldn't get current server API group list: Get "https://10.40.30.125:6443/api?timeout=32s": dial tcp 10.40.30.125:6443: connect: connection refused - error from a previous attempt: read tcp 10.40.30.125:41220->10.40.30.125:6443: read: connection reset by peer
E0731 16:12:14.576932 3160 memcache.go:265] couldn't get current server API group list: Get "https://10.40.30.125:6443/api?timeout=32s": dial tcp 10.40.30.125:6443: connect: connection refused
E0731 16:12:14.578884 3160 memcache.go:265] couldn't get current server API group list: Get "https://10.40.30.125:6443/api?timeout=32s": dial tcp 10.40.30.125:6443: connect: connection refused
E0731 16:12:14.580557 3160 memcache.go:265] couldn't get current server API group list: Get "https://10.40.30.125:6443/api?timeout=32s": dial tcp 10.40.30.125:6443: connect: connection refused
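Checking the status of the api-server container: wow, it has died and been restarted many times. Going straight to the logs, it looks like a certificate problem.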
[root@dev-1 root]# docker ps -a |grep api
ae18b7325416 771ffcf9ca63 "kube-apiserver --ad…" 2 seconds ago Up 1 second k8s_kube-apiserver_kube-apiserver-dev-1_kube-system_f3db9fee3bdc5df10da40f9cf7397e2d_143
07692db807ed 771ffcf9ca63 "kube-apiserver --ad…" 41 seconds ago Exited (1) 20 seconds ago k8s_kube-apiserver_kube-apiserver-dev-1_kube-system_f3db9fee3bdc5df10da40f9cf7397e2d_142
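The trailing number in the container names above (`_143`, `_142`) is the attempt counter that the kubelet appends to container names under the Docker runtime, so it gives a rough idea of how many times the api-server has been restarted. A minimal sketch for pulling that number out; the function name `attempt_of` is made up for illustration, and the `docker ps` pipeline in the comment assumes the node uses the Docker CLI as shown in this post.

```shell
# attempt_of NAME: print the trailing attempt number of a kubelet-managed
# container name of the form k8s_<container>_<pod>_<namespace>_<uid>_<attempt>.
attempt_of() {
  name="$1"
  # Strip everything up to and including the last underscore.
  printf '%s\n' "${name##*_}"
}

# Example against a live node (not run here):
#   docker ps -a --filter name=k8s_kube-apiserver --format '{{.Names}}' |
#     while read -r n; do attempt_of "$n"; done
```

Fed the name of the running container from the listing above, it prints 143.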
[root@dev-1 root]# docker logs 92c8421d3ad3 -f --tail 10
W0731 08:13:15.960932 1 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate has expired or is not yet valid: current time 2023-07-31T08:13:15Z is after 2023-07-28T05:38:26Z". Reconnecting...
W0731 08:13:16.528727 1 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate has expired or is not yet valid: current time 2023-07-31T08:13:16Z is after 2023-07-28T05:38:26Z". Reconnecting...
W0731 08:13:17.601180 1 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate has expired or is not yet valid: current time 2023-07-31T08:13:17Z is after 2023-07-28T05:38:26Z". Reconnecting...
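The x509 error in these logs can also be confirmed straight from the certificate files with openssl. This is only a sketch: the helper name `cert_expires_within` is hypothetical, and the path in the comment is the kubeadm default location, which I am assuming applies to this node.

```shell
# cert_expires_within FILE SECONDS: exit 0 if the certificate in FILE has
# already expired or will expire within SECONDS, exit non-zero otherwise.
# `openssl x509 -checkend N` itself exits 0 when the certificate is still
# valid N seconds from now, so the result is inverted here.
# Note: an unreadable or missing FILE is also reported as "expiring".
cert_expires_within() {
  ! openssl x509 -checkend "$2" -noout -in "$1" >/dev/null 2>&1
}

# Example on a kubeadm node (default pki path, assumed; not run here):
#   cert_expires_within /etc/kubernetes/pki/apiserver-etcd-client.crt 0 && echo expired
#   openssl x509 -enddate -noout -in /etc/kubernetes/pki/apiserver-etcd-client.crt
```

Passing 0 checks whether the certificate is expired right now; a larger value such as 2592000 (30 days) turns the same helper into an early warning.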
Use kubeadm to check the certificate status and confirm whether the certificates have expired.
[root@dev-1 root]# kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Jul 28, 2023 05:38 UTC   <invalid>                               no
apiserver                  Jul 28, 2023 05:38 UTC   <invalid>       ca                      no
apiserver-etcd-client      Jul 28, 2023 05:38 UTC   <invalid>       etcd-ca                 no
apiserver-kubelet-client   Jul 28, 2023 05:38 UTC   <invalid>       ca                      no
controller-manager.conf    Jul 28, 2023 05:38 UTC   <invalid>                               no
etcd-healthcheck-client    Jul 28, 2023 05:38 UTC   <invalid>       etcd-ca                 no
etcd-peer                  Jul 28, 2023 05:38 UTC   <invalid>       etcd-ca                 no
etcd-server                Jul 28, 2023 05:38 UTC   <invalid>       etcd-ca                 no
front-proxy-client         Jul 28, 2023 05:38 UTC   <invalid>       front-proxy-ca          no
scheduler.conf             Jul 28, 2023 05:38 UTC   <invalid>                               no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Jul 25, 2032 05:38 UTC   8y              no
etcd-ca                 Jul 25, 2032 05:38 UTC   8y              no
front-proxy-ca          Jul 25, 2032 05:38 UTC   8y              no
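To list only the expired entries from this output, the rows marked `<invalid>` can be filtered with awk. A small sketch, assuming the output format shown in this post; `expired_certs` is a name I made up.

```shell
# expired_certs: read `kubeadm certs check-expiration` output on stdin and
# print the name (first column) of every row whose residual time is <invalid>.
expired_certs() {
  awk '/<invalid>/ { print $1 }'
}

# Example on the node (not run here):
#   kubeadm certs check-expiration 2>/dev/null | expired_certs
```

On the output above this would print the ten leaf certificates and none of the three CAs, which are still valid until 2032.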
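Sure enough, the certificates have expired. Renewing them with kubeadm and copying the regenerated admin.conf over /root/.kube/config gets kubectl working again. As the renew output says, the control-plane components also need a restart to pick up the new certificates.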
[root@dev-1 root]# kubeadm certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[renew] Error reading configuration from the Cluster. Falling back to default configuration
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
Done renewing certificates. You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificates.
[root@dev-1 root]# cp /etc/kubernetes/admin.conf /root/.kube/config
cp: overwrite ‘/root/.kube/config’? yes
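The renew output asks for a restart of kube-apiserver, kube-controller-manager, kube-scheduler and etcd. On a kubeadm node these run as static pods, and one common way to restart them is to move their manifests out of the manifest directory for a short while and then put them back. The sketch below assumes the default kubeadm manifest directory and a wait of about 20 seconds; the function name `bounce_static_pods` is hypothetical, and this is not something the steps above actually ran.

```shell
# bounce_static_pods MANIFEST_DIR WAIT_SECONDS: move every *.yaml manifest out
# of MANIFEST_DIR into a temporary directory, wait WAIT_SECONDS so the kubelet
# can notice and stop the pods, then move the manifests back.
bounce_static_pods() {
  dir="$1"
  wait_s="$2"
  stash="$(mktemp -d)" || return 1
  for f in "$dir"/*.yaml; do
    [ -e "$f" ] || continue          # glob matched nothing
    mv "$f" "$stash"/ || return 1
  done
  sleep "$wait_s"
  for f in "$stash"/*.yaml; do
    [ -e "$f" ] || continue
    mv "$f" "$dir"/ || return 1
  done
  rmdir "$stash"
}

# Example on a kubeadm control-plane node (assumed default path; not run here):
#   bounce_static_pods /etc/kubernetes/manifests 20
```

Moving the files rather than deleting them keeps the original manifests intact, so the pods come back with the same configuration and the renewed certificates.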