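Notice: this AI application development tutorial series was first published on the WeChat official account of the same name, 王中阳. Reproduction without authorization is prohibited.
This guide explains how to deploy the Go-Eino Interview Agent platform on a Kubernetes cluster. The deployment runs every service in a container and relies on Kubernetes for orchestration, scaling, and monitoring.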
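Architecture Overview
The Kubernetes deployment follows a microservice pattern: every component is containerized and managed through Kubernetes primitives. The system consists of frontend, backend, database, cache, and vector store services, orchestrated as one cohesive unit.
Prerequisites
Before deploying to Kubernetes, make sure you have the following:
- A Kubernetes cluster (v1.24+)
- kubectl configured with access to the cluster
- Access to a container image registry (Docker Hub, AWS ECR, GCR, etc.)
- At least 8 GB of RAM and 4 CPU cores available
- A storage class configured for persistent volumes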
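Container Image Preparation
The project uses multi-stage Docker builds to keep the production images small:
Backend container
Based on backend/Dockerfile, the backend container is built with Go 1.24-alpine and uses Alpine Linux as its runtime, which gives a minimal footprint and basic security properties.
Frontend container
Based on frontend/Dockerfile, the frontend container builds the Next.js application on Node.js 18-alpine and serves it in production mode.
Build the images and push them to your registry:
# Backend
docker build -t your-registry/go-eino-backend:latest ./backend
docker push your-registry/go-eino-backend:latest
# Frontend
docker build -t your-registry/go-eino-frontend:latest ./frontend
docker push your-registry/go-eino-frontend:latest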
Kubernetes Manifests
Namespace and configuration
Create a dedicated namespace for the interview agent:
apiVersion: v1
kind: Namespace
metadata:
  name: interview-agent
ConfigMaps
Application configuration derived from docker-compose.yml and docker-compose-prod.yml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: interview-agent-config
  namespace: interview-agent
data:
  DB_HOST: "mysql-service"
  DB_PORT: "3306"
  DB_NAME: "interview_agent"
  REDIS_HOST: "redis-service"
  REDIS_PORT: "6379"
  ETCD_ENDPOINTS: "etcd-service:2379"
  TZ: "Asia/Shanghai"
Secrets
Sensitive data is kept in a Secret. Values under data must be base64-encoded:
apiVersion: v1
kind: Secret
metadata:
  name: interview-agent-secrets
  namespace: interview-agent
type: Opaque
data:
  DB_PASSWORD: <base64-encoded-password>
  REDIS_PASSWORD: <base64-encoded-password>
  MYSQL_ROOT_PASSWORD: <base64-encoded-password>
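If encoding each value by hand is inconvenient, the same Secret can be written with the stringData field, which accepts plain text and is base64-encoded by the API server on write. The sketch below is only an illustration: the password values are placeholders, not values from the project.

```yaml
# Sketch: same Secret as above, written with stringData instead of data.
# All values below are placeholders and must be replaced.
apiVersion: v1
kind: Secret
metadata:
  name: interview-agent-secrets
  namespace: interview-agent
type: Opaque
stringData:
  DB_PASSWORD: "replace-with-db-password"
  REDIS_PASSWORD: "replace-with-redis-password"
  MYSQL_ROOT_PASSWORD: "replace-with-root-password"
```

A file like this contains credentials in clear text, so it should stay out of version control.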
Database StatefulSet
MySQL deployment with persistent storage:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: interview-agent
spec:
  serviceName: mysql-service
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          ports:
            - containerPort: 3306
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: interview-agent-secrets
                  key: MYSQL_ROOT_PASSWORD
            - name: MYSQL_DATABASE
              valueFrom:
                configMapKeyRef:
                  name: interview-agent-config
                  key: DB_NAME
          volumeMounts:
            - name: mysql-storage
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: mysql-storage
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
Redis Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: interview-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          ports:
            - containerPort: 6379
          command: ["redis-server", "--appendonly", "yes", "--requirepass", "$(REDIS_PASSWORD)"]
          env:
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: interview-agent-secrets
                  key: REDIS_PASSWORD
          volumeMounts:
            - name: redis-storage
              mountPath: /data
      volumes:
        - name: redis-storage
          persistentVolumeClaim:
            claimName: redis-pvc
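The Redis Deployment refers to a claim named redis-pvc that has to exist before the Pod can start. A minimal sketch of such a claim follows; the 5Gi size is an assumption, and the cluster's default storage class is used because no storageClassName is set.

```yaml
# Sketch: claim backing the redis-storage volume.
# The requested size is an assumption, not a value from the project.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-pvc
  namespace: interview-agent
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi
```

Because the claim is ReadWriteOnce, only one node can mount it at a time, which is consistent with the single Redis replica above.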
Backend Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: interview-agent
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
        - name: backend
          image: your-registry/go-eino-backend:latest
          ports:
            - containerPort: 8888
          envFrom:
            - configMapRef:
                name: interview-agent-config
            - secretRef:
                name: interview-agent-secrets
          livenessProbe:
            httpGet:
              path: /health
              port: 8888
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8888
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
Frontend Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  namespace: interview-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: your-registry/go-eino-frontend:latest
          ports:
            - containerPort: 3000
          env:
            # Next.js inlines NEXT_PUBLIC_* values into the browser bundle at build time.
            # A cluster-internal hostname set here is only usable by server-side requests.
            - name: NEXT_PUBLIC_API_BASE_URL
              value: "http://backend-service:8888/api"
          livenessProbe:
            httpGet:
              path: /
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "250m"
Nginx Ingress Controller
Based on the nginx.conf configuration. Note that this is a plain Nginx reverse proxy running as a Deployment, not the ingress-nginx controller, so no Ingress resources are involved:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: interview-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: nginx-config
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
      volumes:
        - name: nginx-config
          configMap:
            name: nginx-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  namespace: interview-agent
data:
  default.conf: |
    upstream backend {
      server backend-service:8888;
    }
    upstream frontend {
      server frontend-service:3000;
    }
    server {
      listen 80;
      server_name localhost;
      client_max_body_size 100M;
      location /api/ {
        proxy_pass http://backend/api/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # HTTP/1.1 plus Upgrade/Connection headers allow WebSocket upgrades
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        # Buffering is disabled so streamed responses reach the client immediately
        proxy_buffering off;
        proxy_cache off;
        # Long timeouts keep long-lived connections open.
        # The Nginx documentation notes that the connect timeout usually cannot exceed 75 seconds.
        proxy_connect_timeout 7d;
        proxy_send_timeout 7d;
        proxy_read_timeout 7d;
      }
      location / {
        proxy_pass http://frontend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_buffering off;
        proxy_request_buffering off;
      }
      location /health {
        access_log off;
        return 200 "healthy\n";
        add_header Content-Type text/plain;
      }
    }
Services
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
  namespace: interview-agent
spec:
  selector:
    app: mysql
  ports:
    - port: 3306
      targetPort: 3306
  clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  namespace: interview-agent
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379
---
apiVersion: v1
kind: Service
metadata:
  name: backend-service
  namespace: interview-agent
spec:
  selector:
    app: backend
  ports:
    - port: 8888
      targetPort: 8888
---
apiVersion: v1
kind: Service
metadata:
  name: frontend-service
  namespace: interview-agent
spec:
  selector:
    app: frontend
  ports:
    - port: 3000
      targetPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: interview-agent
spec:
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
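Deployment Process
Step-by-step deployment
Deploy the services in dependency order to avoid startup failures. The database and cache services must be fully available before the application services are deployed.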
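Waiting for a Pod to become Ready only guarantees that the database is usable if the Pod reports readiness based on MySQL itself. Without a readiness probe, a Pod counts as Ready as soon as its container is running. The fragment below is a sketch of a probe that could be added to the mysql container of the StatefulSet; the timing values are assumptions.

```yaml
# Sketch: fragment for the mysql container (spec.template.spec.containers[0]).
# mysqladmin ships with the mysql:8.0 image; timing values are assumptions.
readinessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - mysqladmin ping -h 127.0.0.1 -uroot -p"$MYSQL_ROOT_PASSWORD"
  initialDelaySeconds: 15
  periodSeconds: 5
  timeoutSeconds: 3
```

With a probe like this in place, the Pod's Ready condition tracks whether the server answers over TCP rather than only whether the container process exists.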
Deployment commands
# 1. Create namespace
kubectl apply -f namespace.yaml
# 2. Deploy configuration
kubectl apply -f configmap.yaml
kubectl apply -f secrets.yaml
# 3. Deploy storage
kubectl apply -f persistent-volumes.yaml
# 4. Create Services first: the StatefulSet needs its headless Service,
#    and the backend resolves mysql-service and redis-service through DNS
kubectl apply -f services.yaml
# 5. Deploy database layer
kubectl apply -f mysql-statefulset.yaml
kubectl apply -f redis-deployment.yaml
# 6. Wait for database readiness
kubectl wait --for=condition=ready pod -l app=mysql -n interview-agent --timeout=300s
kubectl wait --for=condition=ready pod -l app=redis -n interview-agent --timeout=300s
# 7. Deploy application services
kubectl apply -f backend-deployment.yaml
kubectl apply -f frontend-deployment.yaml
# 8. Deploy the Nginx proxy
kubectl apply -f nginx-deployment.yaml
# 9. Verify deployment
kubectl get pods -n interview-agent
kubectl get services -n interview-agent
Scaling and High Availability
Horizontal Pod Autoscaling
Configure an HPA for the application services:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend-hpa
  namespace: interview-agent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
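Only the backend autoscaler is shown above. A matching sketch for the frontend Deployment could look like the following; the name frontend-hpa and the maxReplicas value are assumptions, while minReplicas mirrors the two replicas in the frontend Deployment.

```yaml
# Sketch: HPA for the frontend Deployment.
# frontend-hpa and maxReplicas are assumptions, not values from the project.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
  namespace: interview-agent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Utilization targets are computed against the container's resource requests, so the Deployment must declare CPU requests, and the cluster needs a metrics source such as metrics-server.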
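Database high availability
For production, consider a MySQL cluster. The InnoDBCluster resource below is a custom resource and requires MySQL Operator for Kubernetes to be installed in the cluster:
apiVersion: mysql.oracle.com/v2
kind: InnoDBCluster
metadata:
  name: mysql-cluster
  namespace: interview-agent
spec:
  instances: 3
  router:
    instances: 1
  secretName: mysql-secret
  tlsUseSelfSigned: true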
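Monitoring and Observability
Health checks
The backend and frontend Deployments define health checks based on the container configuration in backend/Dockerfile and frontend/Dockerfile. Kubernetes offers three probe types:
- Liveness probes: detect and restart unhealthy containers
- Readiness probes: ensure traffic is routed only to containers that are ready
- Startup probes: handle slow-starting applications (the manifests in this guide do not define one)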
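The backend and frontend manifests in this guide configure liveness and readiness probes but no startup probe. As a sketch, a startup probe for the backend container could reuse the existing /health endpoint; the period and failure threshold below are assumptions chosen to allow roughly 150 seconds of startup time.

```yaml
# Sketch: fragment for the backend container in the backend Deployment.
# Liveness and readiness probes are held back until this probe succeeds.
startupProbe:
  httpGet:
    path: /health
    port: 8888
  periodSeconds: 5
  failureThreshold: 30   # 30 attempts x 5s = up to 150s to start (assumed budget)
```

With a startup probe in place, the liveness probe's initialDelaySeconds can usually be reduced, since slow starts are handled separately.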
Logging configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: interview-agent
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/*interview-agent*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      format json
    </source>
    <match kubernetes.**>
      @type elasticsearch
      host elasticsearch-service
      port 9200
      index_name interview-agent-logs
    </match>
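Metrics collection
Deploy Prometheus monitoring. ServiceMonitor is a custom resource provided by the Prometheus Operator, so the operator must be installed first:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: interview-agent-metrics
  namespace: interview-agent
spec:
  selector:
    matchLabels:
      app: backend
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics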
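A ServiceMonitor selects Service objects by their labels and refers to a Service port by name. The backend-service defined earlier has neither a label nor a named port, so it would not be picked up as written. The sketch below shows one way to adjust it. It assumes the backend exposes Prometheus metrics on a separate port 9090, which the source does not confirm; adjust the port to wherever /metrics is actually served.

```yaml
# Sketch: backend-service with a label and a named metrics port.
# Port 9090 is hypothetical; the source does not say where /metrics is served.
apiVersion: v1
kind: Service
metadata:
  name: backend-service
  namespace: interview-agent
  labels:
    app: backend            # matched by the ServiceMonitor's spec.selector
spec:
  selector:
    app: backend
  ports:
    - name: http
      port: 8888
      targetPort: 8888
    - name: metrics         # referenced by endpoints[].port in the ServiceMonitor
      port: 9090
      targetPort: 9090
```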
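Security Configuration
Network policies
Implement network segmentation. The selectors use the kubernetes.io/metadata.name label, which Kubernetes sets automatically on every namespace. As written, the policy only admits traffic that originates inside the namespace, so external clients reaching nginx-service and outbound calls to external APIs need additional rules:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: interview-agent-netpol
  namespace: interview-agent
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: interview-agent
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: interview-agent
    # DNS lookups are allowed to any destination
    - to: []
      ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53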
Pod Security Policy
PodSecurityPolicy was deprecated in Kubernetes v1.21 and removed in v1.25, so the manifest below applies only to v1.24 clusters:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: interview-agent-psp
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
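The PodSecurityPolicy API was removed in Kubernetes v1.25. On newer clusters, the built-in Pod Security Admission controller covers similar ground through namespace labels. The sketch below enforces the baseline profile and only warns and audits against restricted. That split is an assumption made here so that existing images keep running while violations become visible.

```yaml
# Sketch: Pod Security Admission labels on the namespace (Kubernetes v1.25+).
# The choice of baseline for enforce and restricted for warn/audit is an assumption.
apiVersion: v1
kind: Namespace
metadata:
  name: interview-agent
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

Once the workloads pass the restricted warnings cleanly, the enforce label can be raised to restricted.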
Backup and Disaster Recovery
Database backup strategy
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-backup
  namespace: interview-agent
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: mysql-backup
              image: mysql:8.0
              command:
                - /bin/bash
                - -c
                - |
                  # Quote the password so special characters survive shell expansion
                  mysqldump -h mysql-service -u root -p"$MYSQL_ROOT_PASSWORD" \
                    --single-transaction --routines --triggers \
                    interview_agent > /backup/backup-$(date +%Y%m%d).sql
              env:
                - name: MYSQL_ROOT_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: interview-agent-secrets
                      key: MYSQL_ROOT_PASSWORD
              volumeMounts:
                - name: backup-storage
                  mountPath: /backup
          volumes:
            - name: backup-storage
              persistentVolumeClaim:
                claimName: backup-pvc  # must be created separately
          restartPolicy: OnFailure
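Troubleshooting Guide
Common problems and solutions
| Problem | Symptom | Solution |
|---|---|---|
| Pod stuck in Pending | Insufficient resources | Check node capacity and adjust resource requests |
| Database connection failures | Backend logs show connection errors | Verify Service names and network policies |
| High memory usage | OOMKilled events | Raise memory limits or optimize the application |
| Slow response times | High latency metrics | Scale horizontally or optimize database queries |
Debugging commands
# Check pod status
kubectl get pods -n interview-agent -o wide
# View pod logs
kubectl logs -f deployment/backend -n interview-agent
# Debug pod issues
kubectl exec -it deployment/backend -n interview-agent -- /bin/sh
# Check resource usage
kubectl top pods -n interview-agent
# Verify service connectivity
kubectl port-forward service/backend-service 8888:8888 -n interview-agent
Note: always test the deployment in a staging environment before applying it to production. Use canary deployments to roll out new versions gradually.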
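Performance Optimization
Resource tuning
Tune resource allocation based on the container configuration:
- Backend: start with 512Mi of memory and 250m CPU per replica
- Frontend: start with 256Mi of memory and 100m CPU per replica
- Database: at least 2Gi of memory and 1 CPU for MySQL
- Redis: at least 256Mi of memory and 100m CPU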
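The MySQL and Redis manifests in this guide do not declare resources, so the minimums listed above are not applied automatically. As a sketch, they could be expressed as container-level resources blocks. The requests come from the list above; the limits are assumptions, since the text gives none.

```yaml
# Sketch: fragment for the mysql container in the StatefulSet.
resources:
  requests:
    memory: "2Gi"
    cpu: "1"
  limits:               # assumption: limits are not given in the text
    memory: "4Gi"
    cpu: "2"
---
# Sketch: fragment for the redis container in the Redis Deployment.
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:               # assumption: limits are not given in the text
    memory: "512Mi"
    cpu: "250m"
```

If Redis is capped with maxmemory, the container memory limit should sit comfortably above that cap to leave room for overhead such as persistence buffers.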
Caching strategy
Use Redis caching for frequently accessed data:
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
  namespace: interview-agent
data:
  redis.conf: |
    maxmemory 256mb
    maxmemory-policy allkeys-lru
    save 900 1
    save 300 10
    save 60 10000
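The redis-config ConfigMap only takes effect once Redis is started with it. The Redis Deployment shown earlier passes its options on the command line and never mounts this file, so a sketch of the wiring follows. The mount path /usr/local/etc/redis is an assumption; redis-server accepts a config file path as its first argument, followed by command-line overrides.

```yaml
# Sketch: fragment of the Redis Deployment's pod template (spec.template.spec).
# The mount path is an assumption; the rest mirrors the Deployment shown earlier.
containers:
  - name: redis
    image: redis:7-alpine
    # Config file first, then overrides that should not live in the ConfigMap
    command: ["redis-server", "/usr/local/etc/redis/redis.conf", "--appendonly", "yes", "--requirepass", "$(REDIS_PASSWORD)"]
    env:
      - name: REDIS_PASSWORD
        valueFrom:
          secretKeyRef:
            name: interview-agent-secrets
            key: REDIS_PASSWORD
    volumeMounts:
      - name: redis-storage
        mountPath: /data
      - name: redis-config
        mountPath: /usr/local/etc/redis
volumes:
  - name: redis-storage
    persistentVolumeClaim:
      claimName: redis-pvc
  - name: redis-config
    configMap:
      name: redis-config
```

Keeping the password on the command line via the Secret means the ConfigMap itself never contains credentials.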
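Migration Strategy
Migrating from Docker Compose to Kubernetes
Moving from docker-compose.yml to Kubernetes involves:
- Service mapping: convert each service into a Kubernetes Deployment or StatefulSet
- Volume management: replace bind mounts with PersistentVolumes
- Networking: convert Compose networks into Kubernetes Services and NetworkPolicies
- Configuration: move environment variables into ConfigMaps and Secrets
- Health checks: implement Kubernetes-native health probes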
Rolling update strategy
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: interview-agent
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  # ... rest of deployment spec
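Next Steps
After completing the Kubernetes deployment, consider exploring:
- Performance optimization, to tune cluster performance
- Security implementation, for advanced security configuration
- Monitoring and observability, to build a comprehensive monitoring setup
The Kubernetes deployment gives the Go-Eino Interview Agent platform a scalable, resilient foundation, with the monitoring, scaling, and disaster recovery capabilities needed for production-grade operation.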