Compiling and Debugging Kubernetes from Source

The build and debugging environment used in this article:

  • Operating system: Debian 11
  • Kubernetes version: v1.24.0

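Installing dependencies

  • Go 1.16+: check which Go version the Kubernetes release you are building requires (v1.24.0, for example, is built with Go 1.18)
  • Docker: in practice, containerd alone is sufficient
  • etcd: the binary only needs to be on PATH; it does not need to be running (installation guide)
  • rsync: file synchronization and transfer tool (installation guide)
  • openssl and cfssl: tools for generating certificates (cfssl installation guide)
  • pyyaml and jq: tools for processing YAML and JSON
  • delve: the Go debugger (installation guide)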
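
Before building, it can help to confirm that the tools above are reachable. The following is a minimal sketch, not part of the Kubernetes tooling: the function name missing_tools is hypothetical, and the binary names (for example dlv for delve) are assumptions about how each dependency is installed. pyyaml is a Python module rather than a binary, so it is not covered here.

```shell
#!/bin/sh
# missing_tools prints every name in its argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Binary names assumed for the dependencies listed above.
missing=$(missing_tools go containerd etcd rsync openssl cfssl jq dlv)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing from PATH:" $missing
else
  echo "All required tools are on PATH."
fi
```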

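Downloading the source

# Clone the source
~# cd $GOPATH/src/k8s.io
~/go/src/k8s.io# git clone git@github.com:kubernetes/kubernetes.git

# Create a working branch from the v1.24.0 tag (-C runs git inside the cloned directory)
~/go/src/k8s.io# git -C kubernetes checkout -b v1.24.0-gaofubao v1.24.0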

Building from source

~/go/src/k8s.io# cd kubernetes

# Build a single component, for example only the apiserver
~/go/src/k8s.io/kubernetes# make WHAT="cmd/kube-apiserver"

# Build all components
~/go/src/k8s.io/kubernetes# make all

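Running locally

# Start a local cluster with local-up-cluster.sh; pass -O to reuse previously built binaries and skip compilation
~/go/src/k8s.io/kubernetes# ./hack/local-up-cluster.sh

The startup sequence is shown below. It builds the binaries, starts etcd, generates certificates, starts each component, and applies the initial configuration.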

~/go/src/k8s.io/kubernetes# ./hack/local-up-cluster.sh
make: Entering directory '/root/go/src/k8s.io/kubernetes'
make[1]: Entering directory '/root/go/src/k8s.io/kubernetes'
+++ [0622 13:43:33] Building go targets for linux/amd64
    k8s.io/kubernetes/hack/make-rules/helpers/go2make (non-static)
make[1]: Leaving directory '/root/go/src/k8s.io/kubernetes'
+++ [0622 13:43:40] Building go targets for linux/amd64
    k8s.io/kubernetes/cmd/kubectl (static)
    k8s.io/kubernetes/cmd/kube-apiserver (static)
    k8s.io/kubernetes/cmd/kube-controller-manager (static)
    k8s.io/kubernetes/cmd/cloud-controller-manager (non-static)
    k8s.io/kubernetes/cmd/kubelet (non-static)
    k8s.io/kubernetes/cmd/kube-proxy (static)
    k8s.io/kubernetes/cmd/kube-scheduler (static)
make: Leaving directory '/root/go/src/k8s.io/kubernetes'
API SERVER secure port is free, proceeding...
Detected host and ready to start services.  Doing some housekeeping first...
Using GO_OUT /root/go/src/k8s.io/kubernetes/_output/local/bin/linux/amd64
Starting services now!
Starting etcd
etcd --advertise-client-urls http://127.0.0.1:2379 --data-dir /tmp/tmp.ReyfG9jljT --listen-client-urls http://127.0.0.1:2379 --log-level=warn 2> "/tmp/etcd.log" >/dev/null
Waiting for etcd to come up.
+++ [0622 13:44:22] On try 2, etcd: : {"health":"true","reason":""}
{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"2","raft_term":"2"}}Generating a RSA private key
.....................+++++
.+++++
writing new private key to '/var/run/kubernetes/server-ca.key'
-----
Generating a RSA private key
............................................+++++
.............+++++
writing new private key to '/var/run/kubernetes/client-ca.key'
-----
Generating a RSA private key
............+++++
..................................................................+++++
writing new private key to '/var/run/kubernetes/request-header-ca.key'
-----
2022/06/22 13:44:23 [INFO] generate received request
2022/06/22 13:44:23 [INFO] received CSR
2022/06/22 13:44:23 [INFO] generating key: rsa-2048
2022/06/22 13:44:24 [INFO] encoded CSR
2022/06/22 13:44:24 [INFO] signed certificate with serial number 316484342648151277788223520987798181372669204023
2022/06/22 13:44:24 [INFO] generate received request
2022/06/22 13:44:24 [INFO] received CSR
2022/06/22 13:44:24 [INFO] generating key: rsa-2048
2022/06/22 13:44:24 [INFO] encoded CSR
2022/06/22 13:44:24 [INFO] signed certificate with serial number 483402128848834578820238608702910701792117615120
2022/06/22 13:44:24 [INFO] generate received request
2022/06/22 13:44:24 [INFO] received CSR
2022/06/22 13:44:24 [INFO] generating key: rsa-2048
2022/06/22 13:44:24 [INFO] encoded CSR
2022/06/22 13:44:24 [INFO] signed certificate with serial number 643619994566886900285939593322108881683622809329
2022/06/22 13:44:24 [INFO] generate received request
2022/06/22 13:44:24 [INFO] received CSR
2022/06/22 13:44:24 [INFO] generating key: rsa-2048
2022/06/22 13:44:24 [INFO] encoded CSR
2022/06/22 13:44:24 [INFO] signed certificate with serial number 687198736284224653860822250521043475532728156987
2022/06/22 13:44:24 [INFO] generate received request
2022/06/22 13:44:24 [INFO] received CSR
2022/06/22 13:44:24 [INFO] generating key: rsa-2048
2022/06/22 13:44:24 [INFO] encoded CSR
2022/06/22 13:44:24 [INFO] signed certificate with serial number 459931259388187713827201725437722810587624043676
2022/06/22 13:44:24 [INFO] generate received request
2022/06/22 13:44:24 [INFO] received CSR
2022/06/22 13:44:24 [INFO] generating key: rsa-2048
2022/06/22 13:44:24 [INFO] encoded CSR
2022/06/22 13:44:24 [INFO] signed certificate with serial number 124310695926746580002510955205416073606619827050
2022/06/22 13:44:24 [INFO] generate received request
2022/06/22 13:44:24 [INFO] received CSR
2022/06/22 13:44:24 [INFO] generating key: rsa-2048
2022/06/22 13:44:25 [INFO] encoded CSR
2022/06/22 13:44:25 [INFO] signed certificate with serial number 24647499205355822568100299681233210589893301809
2022/06/22 13:44:25 [INFO] generate received request
2022/06/22 13:44:25 [INFO] received CSR
2022/06/22 13:44:25 [INFO] generating key: rsa-2048
2022/06/22 13:44:25 [INFO] encoded CSR
2022/06/22 13:44:25 [INFO] signed certificate with serial number 334203862632984583752933093166287671219818601350
Waiting for apiserver to come up
+++ [0622 13:44:29] On try 4, apiserver: : ok
clusterrolebinding.rbac.authorization.k8s.io/kube-apiserver-kubelet-admin created
clusterrolebinding.rbac.authorization.k8s.io/kubelet-csr created
Cluster "local-up-cluster" set.
use 'kubectl --kubeconfig=/var/run/kubernetes/admin-kube-aggregator.kubeconfig' to use the aggregated API server
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
coredns addon successfully deployed.
Checking CNI Installation at /opt/cni/bin
WARNING : The kubelet is configured to not fail even if swap is enabled; production deployments should disable swap unless testing NodeSwap feature.
2022/06/22 13:44:32 [INFO] generate received request
2022/06/22 13:44:32 [INFO] received CSR
2022/06/22 13:44:32 [INFO] generating key: rsa-2048
2022/06/22 13:44:32 [INFO] encoded CSR
2022/06/22 13:44:32 [INFO] signed certificate with serial number 279362741530389144417621340526570102701309197245
kubelet ( 449358 ) is running.
wait kubelet ready
No resources found
No resources found
No resources found
No resources found
No resources found
No resources found
No resources found
127.0.0.1   NotReady   <none>   1s    v1.25.0-alpha.1.65+3beb8dc5967801
2022/06/22 13:44:47 [INFO] generate received request
2022/06/22 13:44:47 [INFO] received CSR
2022/06/22 13:44:47 [INFO] generating key: rsa-2048
2022/06/22 13:44:47 [INFO] encoded CSR
2022/06/22 13:44:47 [INFO] signed certificate with serial number 22613956019520304841489646238773318865523346776
Create default storage class for
storageclass.storage.k8s.io/standard created
Local Kubernetes cluster is running. Press Ctrl-C to shut it down.

Logs:
  /tmp/kube-apiserver.log
  /tmp/kube-controller-manager.log

  /tmp/kube-proxy.log
  /tmp/kube-scheduler.log
  /tmp/kubelet.log

To start using your cluster, you can open up another terminal/tab and run:

  export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
  cluster/kubectl.sh

Alternatively, you can write to the default kubeconfig:

  export KUBERNETES_PROVIDER=local

  cluster/kubectl.sh config set-cluster local --server=https://localhost:6443 --certificate-authority=/var/run/kubernetes/server-ca.crt
  cluster/kubectl.sh config set-credentials myself --client-key=/var/run/kubernetes/client-admin.key --client-certificate=/var/run/kubernetes/client-admin.crt
  cluster/kubectl.sh config set-context local --cluster=local --user=myself
  cluster/kubectl.sh config use-context local
  cluster/kubectl.sh

Testing the cluster:

# Point kubectl at the local cluster and list its nodes
~/go/src/k8s.io/kubernetes# export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
~/go/src/k8s.io/kubernetes# ./cluster/kubectl.sh get nodes
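
The startup log above shows the node briefly in the NotReady state, so it can be useful to filter the node list for anything that is not yet ready. This is only a sketch: the function names not_ready_nodes and check_nodes are hypothetical, and the filter assumes the column layout printed by kubectl get nodes --no-headers (name first, status second).

```shell
#!/bin/sh
# not_ready_nodes reads "kubectl get nodes --no-headers" output on stdin and
# prints the name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# check_nodes queries the local cluster; it needs the running cluster and
# KUBECONFIG exported as shown above, so it is defined here but not called.
check_nodes() {
  ./cluster/kubectl.sh get nodes --no-headers | not_ready_nodes
}
```

Running check_nodes from the Kubernetes source root prints nothing once every node reports Ready.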

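Notes:

  1. If an error occurs during startup, check the logs of the relevant component under /tmp.
  2. The error I ran into was kubelet failing to start, caused mainly by a broken containerd configuration. Regenerating the configuration with containerd config default > /etc/containerd/config.toml and then restarting the containerd service resolved it.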
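
For the first note, the component logs can be scanned for error lines instead of being read in full. The sketch below assumes the standard klog line prefix, in which error lines start with E and fatal lines with F followed by the month and day; the function name klog_errors is hypothetical, and the file list is taken from the Logs section of the startup output.

```shell
#!/bin/sh
# klog_errors prints the error (E) and fatal (F) lines of one klog-formatted log file.
klog_errors() {
  grep -E '^[EF][0-9]{4} ' "$1"
}

# Scan the component logs that local-up-cluster.sh writes under /tmp,
# showing at most the last five error lines of each file that exists.
for log in /tmp/kube-apiserver.log /tmp/kube-controller-manager.log \
           /tmp/kube-proxy.log /tmp/kube-scheduler.log /tmp/kubelet.log; do
  [ -f "$log" ] || continue
  echo "== $log"
  klog_errors "$log" | tail -n 5
done
```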

Debugging the source

The apiserver is used as the example here.
First start the cluster with the local-up-cluster.sh script.

Restarting the apiserver under dlv

  1. Find the apiserver process PID and the command line it was started with.

  2. Kill the apiserver.

# Kill the apiserver (758814 is the PID found in step 1)
~/go/src/k8s.io/kubernetes# kill -9 758814

  3. Start the apiserver under dlv. The general form is: dlv exec <apiserver binary> --headless --listen=:12345 --api-version=2 --log --log-output=debugger,gdbwire,lldbout,debuglineerr,rpc,dap,fncall,minidump --log-dest=/tmp/deleve.log -- <apiserver flags>

# Start the apiserver under dlv
~/go/src/k8s.io/kubernetes# dlv exec /root/go/src/k8s.io/kubernetes/_output/local/bin/linux/amd64/kube-apiserver --headless --listen=:12345 --api-version=2 --log --log-output=debugger,gdbwire,lldbout,debuglineerr,rpc,dap,fncall,minidump --log-dest=/tmp/deleve.log -- --authorization-mode=Node,RBAC  --cloud-provider= --cloud-config=   --v=3 --vmodule= --audit-policy-file=/tmp/kube-audit-policy-file --audit-log-path=/tmp/kube-apiserver-audit.log --authorization-webhook-config-file= --authentication-token-webhook-config-file= --cert-dir=/var/run/kubernetes --egress-selector-config-file=/tmp/kube_egress_selector_configuration.yaml --client-ca-file=/var/run/kubernetes/client-ca.crt --kubelet-client-certificate=/var/run/kubernetes/client-kube-apiserver.crt --kubelet-client-key=/var/run/kubernetes/client-kube-apiserver.key --service-account-key-file=/tmp/kube-serviceaccount.key --service-account-lookup=true --service-account-issuer=https://kubernetes.default.svc --service-account-jwks-uri=https://kubernetes.default.svc/openid/v1/jwks --service-account-signing-key-file=/tmp/kube-serviceaccount.key --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,Priority,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,NodeRestriction --disable-admission-plugins= --admission-control-config-file= --bind-address=0.0.0.0 --secure-port=6443 --tls-cert-file=/var/run/kubernetes/serving-kube-apiserver.crt --tls-private-key-file=/var/run/kubernetes/serving-kube-apiserver.key --storage-backend=etcd3 --storage-media-type=application/vnd.kubernetes.protobuf --etcd-servers=http://127.0.0.1:2379 --service-cluster-ip-range=10.0.0.0/24 --feature-gates=AllAlpha=false --external-hostname=localhost --requestheader-username-headers=X-Remote-User --requestheader-group-headers=X-Remote-Group --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-client-ca-file=/var/run/kubernetes/request-header-ca.crt \
--requestheader-allowed-names=system:auth-proxy --proxy-client-cert-file=/var/run/kubernetes/client-auth-proxy.crt --proxy-client-key-file=/var/run/kubernetes/client-auth-proxy.key --cors-allowed-origins="/127.0.0.1(:[0-9]+)?$,/localhost(:[0-9]+)?$"
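
Step 1 above (finding the PID and the startup command line) can be sketched as follows. Run it before killing the process in step 2, since the flags it prints are the ones to pass after -- in step 3. It is a Linux-only sketch that reads /proc; the function name proc_cmdline is hypothetical, and it assumes pgrep is installed and that the process is named exactly kube-apiserver.

```shell
#!/bin/sh
# proc_cmdline prints the command line of a PID, one argument per line,
# by decoding the NUL-separated /proc/<pid>/cmdline file (Linux only).
proc_cmdline() {
  tr '\0' '\n' < "/proc/$1/cmdline"
}

# Look up the newest process whose name is exactly kube-apiserver, if any.
pid=$(pgrep -n -x kube-apiserver 2>/dev/null || true)
if [ -n "$pid" ]; then
  echo "kube-apiserver PID: $pid"
  # Drop argv[0]; the remaining lines are the flags to pass after "--" to dlv exec.
  proc_cmdline "$pid" | tail -n +2
else
  echo "kube-apiserver is not running"
fi
```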

Connecting to the debug server

Connecting from the command line

~# dlv connect localhost:12345

Once connected, type help at the dlv prompt to list the available debugging commands.
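
As a small illustration of a first debugging session, the commands can be placed in an init file that the dlv client runs on connect. This is a sketch under assumptions: /tmp/dlv-init.txt is a hypothetical path, and breaking at main.main only works if the binary still carries debug information.

```shell
#!/bin/sh
# Write a delve init file: each line is a command typed at the (dlv) prompt.
# /tmp/dlv-init.txt is a hypothetical path chosen for this example.
init_file=/tmp/dlv-init.txt
cat > "$init_file" <<'EOF'
break main.main
continue
EOF

# To run these commands automatically on connect:
#   dlv connect localhost:12345 --init "$init_file"
echo "wrote $init_file"
```

Because dlv exec starts the target in a stopped state, continue runs the apiserver until it reaches the breakpoint at main.main.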

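Connecting from VS Code

Open the source tree in VS Code on your desktop, switch to the same branch as on the server, and add the following launch.json configuration.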

# launch.json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Connect to server",
      "type": "go",
      "request": "attach",
      "mode": "remote",
      "port": 12345,
      "host": "42.192.13.253"
    }
  ]
}

Set breakpoints in VS Code, then start the debug session so that the apiserver runs (the apiserver's main file is cmd/kube-apiserver/apiserver.go).

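Sending requests to the apiserver with Postman

kubectl keeps a local cache when talking to the apiserver, so some requests may never reach the server. When debugging, it is therefore better to send requests with a tool such as Postman.
Before an external client can access the apiserver, authentication and authorization must be configured: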

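  1. Create a ServiceAccount. ServiceAccount tokens are one of the authentication mechanisms the apiserver provides.
~/go/src/k8s.io/kubernetes# ./cluster/kubectl.sh create sa postman
  2. Create a Secret. For security reasons, Kubernetes 1.24 and later no longer create a Secret automatically for each ServiceAccount, so it has to be created manually.
~/go/src/k8s.io/kubernetes# ./cluster/kubectl.sh apply -f /root/postman-sa-secret.yaml

The contents of postman-sa-secret.yaml are as follows: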

apiVersion: v1
kind: Secret
metadata:
  name: postman-sa-secret
  annotations:
    kubernetes.io/service-account.name: postman
type: kubernetes.io/service-account-token
  3. Create a RoleBinding to authorize the ServiceAccount so that it can operate on API objects. The command below binds the cluster-admin ClusterRole to it within the default namespace.
~/go/src/k8s.io/kubernetes# ./cluster/kubectl.sh create rolebinding postman-admin --clusterrole cluster-admin --serviceaccount default:postman
  4. Get the token from the Secret.
~/go/src/k8s.io/kubernetes# ./cluster/kubectl.sh describe secret postman-sa-secret
  5. Extract the CA certificate from the Secret (the dot in the key name is escaped for JSONPath).
~/go/src/k8s.io/kubernetes# ./cluster/kubectl.sh get secret postman-sa-secret -o jsonpath='{.data.ca\.crt}' | base64 -d > /tmp/ca.crt
  6. Configure Postman with the token and the exported certificate.
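
What Postman ends up sending is an HTTPS request with a bearer token, verified against the exported CA certificate. The same request can be sketched with curl. The function names bearer_header and list_pods are hypothetical; the server address https://localhost:6443 is taken from the startup output, and the request path assumes the RoleBinding in the default namespace created above.

```shell
#!/bin/sh
# bearer_header turns a base64-encoded token (as stored in .data.token of the
# Secret) into the HTTP header that Postman or curl must send.
bearer_header() {
  printf 'Authorization: Bearer %s' "$(printf '%s' "$1" | base64 -d)"
}

# list_pods sends an authenticated request; it needs the running local cluster
# and /tmp/ca.crt from the steps above, so it is defined here but not called.
list_pods() {
  encoded=$(./cluster/kubectl.sh get secret postman-sa-secret -o jsonpath='{.data.token}')
  curl --cacert /tmp/ca.crt -H "$(bearer_header "$encoded")" \
    https://localhost:6443/api/v1/namespaces/default/pods
}
```

Calling list_pods from the Kubernetes source root should hit any breakpoint set on the pod list path in the apiserver.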

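Additional notes

Changing the build flags

Edit hack/lib/golang.sh so that the toolchain keeps debug information and does not optimize the code:

  • Remove the -s -w linker flags (passed via -ldflags, not -gcflags), which strip the symbol table and DWARF debug information, so that file names and line numbers are preserved
  • Add -gcflags="all=-N -l" to disable optimizations and inlining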

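Modifying the apiserver certificate

To let kubectl reach the cluster from another host, the host name (or IP address) must be written into the apiserver certificate. In hack/local-up-cluster.sh, append that host name as an extra argument to the kube::util::create_serving_certkey call.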
