systemd version does not support ability to start a slice as transient unit
Basic pods such as proxy and calico were frequently going not ready.
systemctl status sshd
pam_systemd(sshd:session): Failed to create session: No buffer space available
systemctl status systemd-logind.service
Failed to start session scope session-97997.scope: The maximum number of pending replies per connection has been reached
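Posts online say this is caused by upgrading dbus, but dbus had never been upgraded on this machine. I tried restarting systemd-logind.service, and along the way also tested whether restarting that service would affect existing tmux connections.
I then found an article describing a temporary workaround: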
systemctl daemon-reexec
systemctl start systemd-logind.service
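This also resolved the docker problem that had been bothering me for the past few days.
A node had gone not ready; logging in to check, docker reported this error: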
accept unix /var/run/docker.sock: accept4: too many open files
lsof -p "$(pgrep -o -x dockerd)" | wc -l
The docker process already had more than 655350 files open.
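The count above can also be read straight from /proc, together with the limit the process is running under. This is a minimal sketch, assuming a Linux /proc filesystem and root access to the target process; fd_usage is a hypothetical helper name, not something from docker or systemd.

```shell
#!/bin/sh
# fd_usage PID (hypothetical helper): print "<open fds> <soft limit>"
# for the given process, using only /proc.
fd_usage() {
    pid=$1
    # Each entry under /proc/<pid>/fd is one open file descriptor.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # "Max open files  <soft>  <hard>  files": the soft limit is field 4.
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits" 2>/dev/null)
    echo "$open $limit"
}
```

For the docker daemon this would be called as: fd_usage "$(pgrep -o -x dockerd)". If the first number is close to the second, the "too many open files" error is expected.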
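I could not find a fix at the time, so I restarted docker, after which every docker command stopped working. With no other option, I upgraded docker from 18.06 to 20.10. After that docker ps ran again, but all containers were in the created state, and stop, restart, inspect, and start each hung as soon as they were run.
kubectl reported: PLEG is not healthy
When starting a container, the log showed one useful message: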
The maximum number of pending replies per connection has been reached
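A search online showed this is also related to dbus, and it matches the error from the sshd service.
Known cause
Memory exhaustion caused the org.freedesktop.systemd module to crash.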
busctl
org.freedesktop.systemd1 - - - (activatable)
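The busctl line above shows org.freedesktop.systemd1 with no owning PID, only "(activatable)". A minimal sketch of checking for that state automatically, assuming the column layout shown above (name first, PID second); systemd1_owned is a hypothetical helper name.

```shell
#!/bin/sh
# systemd1_owned (hypothetical helper): read `busctl list` output on stdin.
# Exit 0 if org.freedesktop.systemd1 is owned by a process (numeric PID in
# column 2), exit 1 if it is missing or only listed as activatable.
systemd1_owned() {
    awk '
        $1 == "org.freedesktop.systemd1" {
            found = 1
            if ($2 ~ /^[0-9]+$/) owned = 1
        }
        END { exit !(found && owned) }
    '
}
```

It could be used as: busctl list --no-pager | systemd1_owned || echo "systemd1 has no owner on the bus". On the broken node described here the message would be printed, which is the state where systemctl daemon-reexec was the workaround.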