Implementing GitOps with Tekton and Argo CD

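Tekton and Argo CD handle CI/CD. Argo Rollouts, combined with Istio and Prometheus, shifts traffic automatically and, after analyzing traffic health, promotes or rolls back the release.

This makes unattended releases possible: a rollout is promoted or rolled back automatically depending on whether the request metrics stay healthy.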

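Key points of GitOps:

GitOps takes DevOps a step further in practice. In terms of Git management:

  • Traditional DevOps keeps application code and deployment configuration in the same Git repository and releases with a push model; sensitive configuration is also exposed unencrypted.
  • GitOps manages the code repository and the configuration repository separately: the code repository is for developers only, and the configuration repository is for operations/administrators only. Core application configuration is managed in a more Kubernetes-native way, using Secrets directly. Releases use a pull model, so an erroneous push to the code repository does not change the deployed application version.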

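Possible improvements:

  • Have Tekton distinguish GitLab event types in detail: whether a merge request (PR) was merged into the main branch, and whether a tag was created or deleted. (Currently every push, merge request, and tag event on the repository is processed, without distinguishing tag creation from deletion. A small program could classify the event before handing it to Tekton, or Knative Eventing could process the events.)
  • Configure Argo CD for multiple deployment environments; only one environment is configured so far.
  • Traffic health analysis only has metrics when someone is actually calling the service. A small program or a Kubernetes Job could send requests continuously for a while during the release (a template that calls the relevant API, with the request content generated by the CI stage), be removed once the release completes, and notify the responsible people if the release fails.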
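
The first item could start from something as small as the sketch below. It assumes the GitLab push and tag-push webhook payloads, in which `before` is all zeros when a ref is created and `after` is all zeros when it is deleted. The function name `classify_ref_event` is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: classify a GitLab push / tag-push event from the
# "ref", "before" and "after" fields of the webhook payload.
ZERO_SHA=$(printf '%040d' 0)   # 40 zeros: GitLab's placeholder for "no commit"

classify_ref_event() {
  ref=$1 before=$2 after=$3
  case "$ref" in
    refs/tags/*)  kind=tag ;;
    refs/heads/*) kind=branch ;;
    *)            kind=unknown ;;
  esac
  if [ "$after" = "$ZERO_SHA" ]; then
    action=deleted
  elif [ "$before" = "$ZERO_SHA" ]; then
    action=created
  else
    action=updated
  fi
  printf '%s-%s\n' "$kind" "$action"
}

# Example:
# classify_ref_event refs/tags/v1.0.0 "$ZERO_SHA" "$some_commit_sha"
```

A wrapper in front of Tekton could then forward only the `tag-created` events and drop the rest.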

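Example code repository:

Clone the example code into your own GitLab instance so that it is easy to modify.

Walkthrough

Prerequisites:

Create the SSH key pair used to access GitLab and the login credentials for the private Harbor registry in advance.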

#ssh
kubectl create secret generic ssh-secret --from-file ~/.ssh/id_rsa --from-file ~/.ssh/id_rsa.pub --from-file ~/.ssh/known_hosts

#harbor
kubectl create secret docker-registry harbor-http-secret --docker-server=2.2.2.67 --docker-username=admin --docker-password=123456
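
The pipeline later in this walkthrough mounts Secrets named docker-config and harbor-cert, which the commands above do not create. Since kaniko reads registry credentials from /kaniko/.docker/config.json, one way to produce the docker-config Secret is sketched below. The function names, the temporary directory, and the reuse of the Harbor account from the previous command are assumptions.

```shell
#!/bin/sh
# Print a Docker config.json with basic-auth credentials for one registry.
docker_config_json() {
  registry=$1 user=$2 password=$3
  auth=$(printf '%s:%s' "$user" "$password" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"auth":"%s"}}}\n' "$registry" "$auth"
}

# Create the Secret that the kaniko step mounts at /kaniko/.docker.
create_docker_config_secret() {
  tmpdir=$(mktemp -d)
  docker_config_json "$1" "$2" "$3" > "$tmpdir/config.json"
  kubectl create secret generic docker-config \
    --from-file=config.json="$tmpdir/config.json"
  rm -rf "$tmpdir"
}

# Example (same registry and account as harbor-http-secret):
# create_docker_config_secret 2.2.2.67 admin 123456
```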

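1) Create the CI task manifests (Tekton)

These tasks clone the code, build and push the image, then modify and push the configuration repository. (For safety, changes to the configuration repository should be reviewed manually.)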

 apiVersion: tekton.dev/v1
 kind: Task
 metadata:
   name: git-clone
 spec:
   description: Clone the code repository to the workspace.
   params:
   - name: url
     type: string
     default: ""
   - name: branch
     type: string
     default: "main"
   workspaces:
   - name: source
   - name: ssh-file
     mountPath: /tmp/ssh
   steps:
   - name: git-clone
     image: alpine/git:2.40.1
     script: |
       sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories
       date
       apk add tzdata bash
       cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
       echo "Asia/Shanghai" > /etc/timezone
       install -m 700 -d /root/.ssh 
       install -m 644 $(workspaces.ssh-file.path)/id_rsa.pub ~/.ssh/id_rsa.pub
       install -m 600 $(workspaces.ssh-file.path)/id_rsa ~/.ssh/id_rsa
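       install -m 600 $(workspaces.ssh-file.path)/known_hosts ~/.ssh/known_hosts

       # Quote the heredoc delimiter so /bin/sh does not try to expand ${BASH_REMATCH[1]};
       # Tekton has already substituted $(params.branch) before the script runs.
       cat > tmp.sh <<'eof'
       #!/bin/bash
       if [[ $(params.branch) =~ ^refs/heads/(.*)$ ]] ;then
         echo "${BASH_REMATCH[1]}"
       else
         echo main
       fi
       eof
       branch=`bash -x tmp.sh`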
       echo "$branch -- $(params.url) -- $(workspaces.source.path)/code"
       git clone -v -b $branch $(params.url) $(workspaces.source.path)/code
       cd $(workspaces.source.path)/code
       git tag --sort=-v:refname |head
       ls -hltr 
 ---
 # Build and package
 apiVersion: tekton.dev/v1
 kind: Task
 metadata:
   name: build-to-pkg
 spec:
   description: build application and package the files to image
   workspaces:
   - name: source
   steps:
   - name: build
     image: maven:3.8.7-openjdk-18-slim
     workingDir: $(workspaces.source.path)/code
     script: |
       pwd
       # Insert an Aliyun mirror entry just before the closing </mirrors> tag
       sed -i -e '/<\/mirrors>/i\    <mirror>' \
       -e '/<\/mirrors>/i\      <id>aliyunmaven</id>' \
       -e '/<\/mirrors>/i\      <mirrorOf>*</mirrorOf>' \
       -e '/<\/mirrors>/i\      <name>Aliyun public repository</name>' \
       -e '/<\/mirrors>/i\      <url>https://maven.aliyun.com/repository/public</url>' \
       -e '/<\/mirrors>/i\    </mirror>' \
       /usr/share/maven/conf/settings.xml
       mvn clean install
     volumeMounts:
     - name: mvn-cache
       mountPath: /root/.m2
   volumes:
   - name: mvn-cache
     persistentVolumeClaim:
       claimName: mvn-cache
 ---
 # Generate the image version ID
 apiVersion: tekton.dev/v1
 kind: Task
 metadata:
   name: generate-build-id
 spec:
   params:
   - name: version
     type: string
     default: latest
   results:
   - name: datetime
   - name: buildId
   steps:
   - name: generate-datetime
     image: busybox
     script: |
       datetime=`date +%Y%m%d-%H%M%S`
       echo -n ${datetime} | tee $(results.datetime.path)
   - name: generate-buildid
     image: alpine
     script: |
       sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories
       date
       apk add tzdata bash
       cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
       echo "Asia/Shanghai" > /etc/timezone
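       # Quoted delimiter: keep ${BASH_REMATCH[1]} literal for bash instead of letting /bin/sh expand it
       cat > tmp.sh <<'eof'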
       #!/bin/bash
       if [[ $(params.version) =~ ^refs/tags/(.*)$ ]] ;then
         echo "${BASH_REMATCH[1]}"
       elif [[ $(params.version) =~ ^refs/heads/main$ ]] ;then
         echo latest
       else
         echo test
       fi
       eof
       tag=`bash -x tmp.sh`
       buildDatetime=`cat $(results.datetime.path)`
       [ `echo $tag|grep -w latest` ] || [ -z `echo $tag|grep -w test` ] && buildId="${tag}" || buildId="${tag}-${buildDatetime}"
       echo -n ${buildId} |tee $(results.buildId.path)
 ---
 # Build and push the image
 apiVersion: tekton.dev/v1
 kind: Task
 metadata:
   name: img-build
 spec:
   description: package the application files to image
   params:
   - name: dockerfile
     default: Dockerfile
   - name: image-url
   - name: image-tag
     default: latest
   workspaces:
   - name: source
   - name: dockerconfig
     # Mount the Docker credentials into the kaniko container so it can push the image to the registry
     mountPath: /kaniko/.docker
   - name: harbor-cert
     mountPath: /tmp/cert
   steps:
   - name: build-and-push-img
     image: m.daocloud.io/gcr.io/kaniko-project/executor:debug
     securityContext:
       runAsUser: 0
     command:
     - /kaniko/executor
     args:
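     # Flag reference: https://github.com/GoogleContainerTools/kaniko#additional-flags
     - -f=$(params.dockerfile)       # Dockerfile name
     - -c=$(workspaces.source.path)/code     # build context directory
     - -d=$(params.image-url):$(params.image-tag)    # destination image and tag
     - --cache=true
     # --cache-repo needs a value; left unset, kaniko infers the cache repository from the destination
     #- --cache-repo=<cache repository>
     - --cache-run-layers
     # Keep cached layers for 7 days (the default is two weeks); the unit is hours
     - --cache-ttl=168h
     #- --insecure
     #- --insecure-registry=2.2.2.67   # Harbor served over HTTP; may be given multiple times
     #- --insecure-registry=harbor.hj.com
     - --push-retry=3
     - --image-download-retry=2
     - --registry-certificate=2.2.2.67=/tmp/cert/harbor.hj.com.crt   # format: registry=certificate path
     #- --registry-certificate=2.2.2.67=$(workspaces.source.path)/ca.crt
     #- --skip-tls-verify    # skip TLS verification for every push/pull
     # Registries whose TLS certificates are not verified; may be given multiple times
     - --skip-tls-verify-registry=2.2.2.67
     - --skip-tls-verify-registry=harbor.hj.com
     # Skip the push permission check to speed things up
     - --skip-push-permission-check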
 ---
 apiVersion: tekton.dev/v1
 kind: Task
 metadata:
   name: update-config-repo
 spec:
   description: "Update the Git configuration repository"
   params:
   - name: url
     type: string
     default: ""
   - name: branch
     type: string
     default: "main"
   - name: image-url
   - name: image-tag
   - name: deploy-config-file
   workspaces:
   - name: source
   - name: ssh-file
     mountPath: /tmp/ssh
   steps:
   - name: update-config-repo
     image: alpine/git:2.40.1
     script: |
       sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories
       apk add tzdata bash
       cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
       echo "Asia/Shanghai" > /etc/timezone
       # Take the timestamp after the time zone is set so that it is in Asia/Shanghai time
       commit_time=`date +"%F %T"`
       install -m 700 -d /root/.ssh 
       install -m 644 $(workspaces.ssh-file.path)/id_rsa.pub ~/.ssh/id_rsa.pub
       install -m 600 $(workspaces.ssh-file.path)/id_rsa ~/.ssh/id_rsa
       install -m 600 $(workspaces.ssh-file.path)/known_hosts ~/.ssh/known_hosts

       # Naming convention: the configuration repository is "<code repository name>-deployment"
       url=$(params.url)
       if [ `echo $url | grep http` ] ;then
         # Strip the port, turn the first path slash into ":", and rewrite to an SSH URL
         url=`echo $url |sed -r -e 's#:[0-9]+##' -e 's#/#:#3' -e 's#.*//(.*)\.git#git@\1-deployment#'`
       else
         url=${url/%.git/-deployment}
       fi
       echo "main -- $url -- $(workspaces.source.path)/config"

       git clone -v -b main $url $(workspaces.source.path)/config
       cd $(workspaces.source.path)/config
       git tag --sort=-v:refname |head
       if [ -z `git branch -a | grep 'remotes/origin/develop$'` ] ;then
         git checkout -b develop
       else
         git checkout -b develop remotes/origin/develop
       fi
       sed -ri "s@(.*image:) .*spring-boot-helloworld.*@\1 $(params.image-url):$(params.image-tag)@g" "$(params.deploy-config-file)"
       echo "[$commit_time] $(params.image-url):$(params.image-tag)" > update.log
       cat update.log
       git config --global user.email admin@admin.com
       git add .
       git commit -m "update $commit_time"
       git push -u origin develop
       ls -hltr
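
The string handling inside generate-build-id and update-config-repo is easy to get wrong, so it can help to try it outside the cluster first. The sketch below restates both derivations as POSIX shell functions. The function names are invented here, and the `case` patterns stand in for the bash regex matches used in the tasks.

```shell
#!/bin/sh
# Map a Git ref to an image tag:
# tag refs keep their name, main becomes "latest", anything else becomes "test".
ref_to_tag() {
  case "$1" in
    refs/tags/*)     printf '%s\n' "${1#refs/tags/}" ;;
    refs/heads/main) printf 'latest\n' ;;
    *)               printf 'test\n' ;;
  esac
}

# Only "test" builds get a timestamp suffix, so each one is unique.
build_id() {
  tag=$1 datetime=$2
  if [ "$tag" = "test" ]; then
    printf '%s-%s\n' "$tag" "$datetime"
  else
    printf '%s\n' "$tag"
  fi
}

# Derive the SSH URL of "<repo>-deployment" from an HTTP(S) or SSH clone URL.
config_repo_url() {
  case "$1" in
    http*)
      printf '%s\n' "$1" |
        sed -E -e 's#:[0-9]+##' -e 's#/#:#3' -e 's#.*//(.*)\.git$#git@\1-deployment#'
      ;;
    *)
      printf '%s-deployment\n' "${1%.git}"
      ;;
  esac
}

# Example:
# build_id "$(ref_to_tag refs/heads/feature-x)" "$(date +%Y%m%d-%H%M%S)"
# config_repo_url http://2.2.2.15:80/root/spring-boot-helloworld.git
```

One difference from the task script: for an SSH URL without a `.git` suffix, this version still appends `-deployment`, whereas the original leaves the URL unchanged.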

2) Assemble the CI pipeline (Tekton)

 apiVersion: tekton.dev/v1
 kind: Pipeline
 metadata:
   name: source-to-img
   namespace: default
 spec:
   params:
     - name: git-url
       type: string
     - name: git-revision
       type: string
     - default: .
       name: img-build-context
       type: string
     - name: image-url
       type: string
     - default: rollouts/helloworld-canary-with-analysis/argo-rollouts-with-analysis.yaml
       name: deploy-config-file
       type: string
     - default: latest
       name: version
       type: string
     - name: docker-secret
       type: string
   tasks:
     - name: git-clone
       params:
         - name: url
           value: $(params.git-url)
         - name: branch
           value: $(params.version)
       taskRef:
         kind: Task
         name: git-clone
       workspaces:
         - name: source
           workspace: codebase
         - name: ssh-file
           workspace: ssh-file
     - name: build-to-pkg
       runAfter:
         - git-clone
       taskRef:
         kind: Task
         name: build-to-pkg
       workspaces:
         - name: source
           workspace: codebase
     - name: generate-build-id
       params:
         - name: version
           value: $(params.version)
       runAfter:
         - git-clone
       taskRef:
         kind: Task
         name: generate-build-id
     - name: img-build-push
       params:
         - name: image-url
           value: $(params.image-url)
         - name: image-tag
           value: $(tasks.generate-build-id.results.buildId)
       runAfter:
         - build-to-pkg
         - generate-build-id
       taskRef:
         kind: Task
         name: img-build
       workspaces:
         - name: source
           workspace: codebase
         - name: dockerconfig
           workspace: docker-config
         - name: harbor-cert
           workspace: harbor-cert
     - name: update-config-repo
       params:
         - name: url
           value: $(params.git-url)
         - name: deploy-config-file
           value: $(params.deploy-config-file)
         - name: image-url
           value: $(params.image-url)
         - name: image-tag
           value: $(tasks.generate-build-id.results.buildId)
       runAfter:
         - img-build-push
       taskRef:
         kind: Task
         name: update-config-repo
       workspaces:
         - name: source
           workspace: codebase
         - name: ssh-file
           workspace: ssh-file
   workspaces:
     - name: codebase
     - name: docker-config
     - name: harbor-cert
     - name: ssh-file

3) Extract parameters from GitLab events (Tekton)

 apiVersion: triggers.tekton.dev/v1beta1
 kind: TriggerBinding
 metadata:
   name: gitlab-binding
 spec:
   params:
   - name: git-revision
     value: $(body.checkout_sha)
   - name: git-repo-url
     value: $(body.repository.git_http_url)
   - name: image-url
     value: 2.2.2.67/spring-boot-helloworld/spring-boot-helloworld
   # Version to deploy; in a real setup the latest tag should be extracted from the GitLab event rather than specified by hand
   - name: version
     value: $(body.ref)
   - name: docker-secret
     #value: docker-config
     value: harbor-http-secret

4) Create the trigger template that runs the pipeline (Tekton)

 apiVersion: triggers.tekton.dev/v1beta1
 kind: TriggerTemplate
 metadata:
   name: gitlab-trigger-tt
 spec:
   params:  # parameter declarations
   - name: git-revision
   - name: git-repo-url
   - name: image-url
   - name: version
   - name: docker-secret
   resourcetemplates:
   - apiVersion: tekton.dev/v1beta1
     kind: PipelineRun
     metadata:
       generateName: gitlab-trigger-run-  # PipelineRun name prefix
     spec:
       serviceAccountName: default
       pipelineRef:
         name: source-to-img
       params:
         - name: git-url
           value: $(tt.params.git-repo-url)
         - name: git-revision
           value: $(tt.params.git-revision)
         - name: image-url
           value: $(tt.params.image-url)
         - name: version
           value: $(tt.params.version)
         - name: docker-secret
           value: $(tt.params.docker-secret)
       workspaces:
         - name: codebase
           volumeClaimTemplate:
             spec:
               accessModes:
                 - ReadWriteMany
               resources:
                 requests:
                   storage: 100Mi
               storageClassName: nfs-csi
         - name: docker-config
           secret:
             secretName: docker-config
         - name: harbor-cert
           secret:
             secretName: harbor-cert
         - name: ssh-file
           secret:
             secretName: ssh-secret

5) Create the EventListener that listens for GitLab events (Tekton)

 apiVersion: triggers.tekton.dev/v1beta1
 kind: EventListener
 metadata:
   name: gitlab-event-listener 
 spec:
   serviceAccountName: tekton-triggers-gitlab-sa
   triggers:
   - name: gitlab-push-events-trigger
     interceptors:
     - ref:
         name: "gitlab"
       params:
       - name: "secretRef"
         value:
           secretName: gitlab-webhook-token 
           secretKey: webhookToken
       - name: "eventTypes"
         value:
           - "Push Hook"
           - "Tag Push Hook"
           - "Merge Request Hook"
     bindings:
     - ref: gitlab-binding
     template:
       ref: gitlab-trigger-tt
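
The listener references a Secret named gitlab-webhook-token with a webhookToken key, which has to exist before GitLab can deliver events. A minimal sketch for creating it follows. The token generator and the function names are illustrative, and the same token value must be entered as the secret token of the GitLab webhook.

```shell
#!/bin/sh
# Generate a random 64-character hexadecimal token.
generate_token() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Store the token under the key that the GitLab interceptor reads.
create_webhook_secret() {
  kubectl create secret generic gitlab-webhook-token \
    --from-literal=webhookToken="$1"
}

# Example:
# token=$(generate_token)
# create_webhook_secret "$token"
# echo "$token"   # paste this value into the GitLab webhook settings
```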

6) Make the EventListener webhook address resolvable from GitLab

 ##### On the Kubernetes host
 kubectl port-forward deploy/el-gitlab-event-listener --address 0.0.0.0 8080

 # Add a virtual IP for Tekton
 ip link add vip0 type dummy
 ip addr add 2.2.2.67/32 dev vip0

 # Look up the EventListener address; it is added to the GitLab webhook later
 kubectl get svc -l app.kubernetes.io/managed-by=EventListener


 ##### On the GitLab host
 # Add a hosts entry on the GitLab host that resolves the Tekton EventListener address
 echo 2.2.2.67 el-gitlab-event-listener.default.svc.cluster.local >> /etc/hosts
 curl el-gitlab-event-listener.default.svc.cluster.local:8080
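
Before wiring up the real webhook, it can be useful to send a hand-made event to the listener. The sketch below builds a minimal tag-push payload containing only the fields that gitlab-binding reads (ref, checkout_sha, repository.git_http_url) and posts it with the headers GitLab sends. The function names, the sample SHA, and the token are placeholders.

```shell
#!/bin/sh
# Print a minimal GitLab "Tag Push Hook" payload.
tag_push_payload() {
  ref=$1 sha=$2 repo_url=$3
  printf '{"object_kind":"tag_push","ref":"%s","checkout_sha":"%s","repository":{"git_http_url":"%s"}}\n' \
    "$ref" "$sha" "$repo_url"
}

# POST the payload to the EventListener the way GitLab would.
send_tag_push() {
  listener=$1 token=$2 payload=$3
  curl -sS -X POST "$listener" \
    -H 'Content-Type: application/json' \
    -H 'X-Gitlab-Event: Tag Push Hook' \
    -H "X-Gitlab-Token: $token" \
    -d "$payload"
}

# Example (placeholder SHA and token):
# payload=$(tag_push_payload refs/tags/v1.0.0 0123abc \
#   http://2.2.2.15:80/root/spring-boot-helloworld.git)
# send_tag_push http://el-gitlab-event-listener.default.svc.cluster.local:8080 "$token" "$payload"
```

If the token and event type are accepted, a new PipelineRun with the gitlab-trigger-run- prefix should appear shortly afterwards.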


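7) Modify the code repository to simulate a version update and trigger the event

Create a new tag in the code repository in GitLab. This triggers the Tekton pipeline, which builds and pushes the image and updates the configuration repository.

The Tekton task picks up the latest tag of the code repository.

The configuration repository is updated successfully.

In GitLab, check the develop branch of the configuration repository. (In production the change needs manual review, so the updated deployment configuration is committed to the develop branch and merged into the main branch by hand once it has been reviewed.)

In Harbor, confirm that the container image was built and pushed successfully.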

8) Argo CD watches the configuration repository and completes the CD flow (Argo CD)

Application resource

 apiVersion: argoproj.io/v1alpha1
 kind: Application
 metadata:
   name: spring-boot-helloworld
   namespace: argocd
 spec:
   project: default
   source:
     repoURL: http://2.2.2.15:80/root/spring-boot-helloworld-deployment.git
     targetRevision: HEAD
     path: rollouts/helloworld-canary-with-analysis
   destination:
     server: https://kubernetes.default.svc
     # The manifests place their resources in the demo namespace, so this must match
     namespace: demo
   syncPolicy:
     automated: 
       prune: true
       selfHeal: true
       allowEmpty: false
     syncOptions:
     - Validate=false
     - CreateNamespace=true
     - PrunePropagationPolicy=foreground
     - PruneLast=true
     - ApplyOutOfSyncOnly=true
     retry:
       limit: 5
       backoff:
         duration: 5s
         factor: 2
         maxDuration: 3m
   ignoreDifferences:
   - group: networking.istio.io
     kind: VirtualService
     jsonPointers:
     - /spec/http/0

9) Watch Argo CD carry out the CD process (Argo CD)

 # Watch how Argo Rollouts manages the application versions
 kubectl-argo-rollouts get rollout -n demo -w rollouts-helloworld-with-analysis

 # Watch Istio shift the traffic
 watch -n.5 'kubectl describe vs -n demo'

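Argo CD compares the live cluster state with the desired state in the configuration repository, finds that no related pods are running, and applies the manifests from the main branch (first deployment).

Argo Rollouts runs the main-branch configuration successfully.

Istio traffic split.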

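10) Update the configuration repository and watch Argo CD carry out the CD process again (Argo CD)

So far the new configuration has only been pushed to the develop branch, and Argo CD is configured to track only the main branch, so the develop branch of the configuration repository has to be merged into the main branch by hand.

Argo CD detects the update to the configuration repository and starts deploying the new version.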

Note: detection can take a few minutes. You can also click the Refresh button and sync manually.
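
The delay comes from Argo CD polling Git on an interval (about three minutes by default). With the argocd CLI already logged in, the refresh and sync can also be done from a terminal. This is only a sketch, and the wrapper function name is made up.

```shell
#!/bin/sh
# Ask Argo CD to re-read Git for the application, then sync it.
refresh_and_sync() {
  app=$1
  argocd app get "$app" --refresh >/dev/null &&
    argocd app sync "$app"
}

# Example:
# refresh_and_sync spring-boot-helloworld
```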

Argo Rollouts is already running one pod of the new version.

Istio has also shifted 5% of the traffic to the new pod (the stable version is the old one; the new version is the canary).

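The manifest specifies that, from step 2 onward, traffic health is analyzed with Prometheus. The analysis succeeds here, so more new pods are deployed and the traffic share is raised to 10% and on up to 30% (there is no rollback unless the traffic turns unhealthy).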
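
As noted in the list of improvements, the Prometheus analysis only has data while requests are flowing. A throwaway loop like the one below can keep traffic going during a rollout. The URL in the example, the request count, and the function name are assumptions, and in a cluster this would more likely run as a short-lived Job.

```shell
#!/bin/sh
# Send COUNT requests to URL, pausing INTERVAL seconds between them,
# and report how many succeeded and how many failed.
generate_load() {
  url=$1 count=$2 interval=$3
  ok=0 failed=0 i=0
  while [ "$i" -lt "$count" ]; do
    # -f makes curl return non-zero on HTTP 4xx/5xx responses
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      ok=$((ok + 1))
    else
      failed=$((failed + 1))
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  printf 'ok=%s failed=%s\n' "$ok" "$failed"
}

# Example (hypothetical service URL):
# generate_load http://helloworld.demo.example/ 600 1
```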

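Keep watching: all pods have started, traffic migration is complete, and the old-version pods will be cleaned up in 16s.

Note: if the analysis reports unhealthy traffic while the old-version pods are still counting down to cleanup, the rollout is rolled back automatically and immediately. Once the cleanup has finished, it no longer is.

View from the command line.

Istio has also finished shifting all traffic; the stable version is now the latest version.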