kubeadm 证书过期处理
目录
免责申明
本记录仅可用于测试环境,不保证可完全修复集群,有些操作是自己摸索的不代表正确。
说明
因为经常要做 Kubernetes 相关的测试,于是环境里有很多集群,有时候长时间不开机,集群默认 1 年的证书就过期了,过期后 kubelet 还有 API Server 等都无法正常启动,下面记录下修复过程。
renew API server 等证书
登录 master 节点,使用下列命令直接 renew 所有证书:
kubeadm certs renew all
# 将 kubelet.conf 复制为 bootstrap-kubelet.conf,欺骗一下 kubelet,这样可以让 API Server 启动(如果你在部署集群时使用了自定义的 kubeadm-config,则可以跳过下面的部分)
cp /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf
# 后续 kubelet 会自动尝试读取 bootstrap-kubelet.conf,然后 api Server 会起来,kubelet 会报下面的错误,这个错误可以忽略,未来会修复
Nov 10 22:12:38 master01 kubelet[2811]: E1110 22:12:38.962816 2811 kubelet.go:2456] "Error getting node" err="node \"master01\" not found"
Nov 10 22:12:39 master01 kubelet[2811]: E1110 22:12:39.063282 2811 kubelet.go:2456] "Error getting node" err="node \"master01\" not found"
kubeadm.conf 配置中证书已经被 kubeadm 续期了,证书的部分内容如下:
Subject Name=
====Organization=system:masters
====Common Name=kubernetes-admin
Issuer Name=
====Common Name=kubernetes
Serial Number=5755935322539516670
Version=3
Signature Algorithm=SHA-256 with RSA Encryption ( 1.2.840.113549.1.1.11 )
====Parameters=None
Not Valid Before=Sunday, August 14, 2022 at 18:47:31 China Standard Time
Not Valid After=Saturday, November 9, 2024 at 11:29:52 China Standard Time
使用 kubectl --kubeconfig /etc/kubernetes/admin.conf get node
命令可以看到所有节点。将此 config 复制到用户目录方便使用。
cp /etc/kubernetes/admin.conf ~/.kube/config
kubectl get node
kubeadm 配置文件导出
一般部署 kubeadm 时都会准备一个用于初始化的 kubeadm config 文件,文件中包含集群自定义的配置,恢复 kubelet.conf 需要此文件:
# cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
podSubnet: 10.39.0.0/16,2001::/64
serviceSubnet: 10.96.0.0/16,2002::/110
controllerManager:
extraArgs:
"node-cidr-mask-size-ipv4": "25"
"node-cidr-mask-size-ipv6": "80"
imageRepository: "registry.cn-hangzhou.aliyuncs.com/google_containers"
clusterName: "rhel-k8scluster"
kubernetesVersion: "v1.22.11"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: "10.29.8.171"
bindPort: 6443
nodeRegistration:
kubeletExtraArgs:
node-ip: 10.29.8.171,2000::171
如果集群部署时用的全部是默认配置,未使用自定义的配置,当 API Server 可用后,需要从下列 configmap 中导出 kubeadm-config.yaml 文件:
kubectl get cm kubeadm-config -n kube-system -o json | jq -r '.data.ClusterConfiguration' > kubeadm-config.yaml
# 导出时只需要保留 data.ClusterConfiguration 中的内容
在获取完 kubeadm-config 后,可以移除 bootstrap-kubelet.conf(此时 api Server 已经启动了,后续就是要去修复 kubelet.conf 了)
mv /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf.bak
renew master 节点 kubelet 证书
之后 systemctl restart kubelet
,观察到下列错误,提示 kubelet 证书过期:
Nov 09 22:05:57 master01 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Nov 09 22:05:57 master01 kubelet[2712]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Nov 09 22:05:57 master01 kubelet[2712]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Nov 09 22:05:57 master01 systemd[1]: Started Kubernetes systemd probe.
Nov 09 22:05:57 master01 kubelet[2712]: I1109 22:05:57.184996 2712 server.go:440] "Kubelet version" kubeletVersion="v1.22.11"
Nov 09 22:05:57 master01 kubelet[2712]: I1109 22:05:57.185198 2712 server.go:868] "Client rotation is on, will bootstrap in background"
Nov 09 22:05:57 master01 kubelet[2712]: E1109 22:05:57.186388 2712 bootstrap.go:265] part of the existing bootstrap client certificate in /etc/kubernetes/kubelet.conf is expired: 2023-08-14 10:47:33 +0000 UTC
Nov 09 22:05:57 master01 kubelet[2712]: E1109 22:05:57.186411 2712 server.go:294] "Failed to run kubelet" err="failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory"
Nov 09 22:05:57 master01 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Nov 09 22:05:57 master01 systemd[1]: Unit kubelet.service entered failed state.
Nov 09 22:05:57 master01 systemd[1]: kubelet.service failed.
/etc/kubernetes/kubelet.conf
中指定了 CA 证书信息以及 kubelet 使用的证书和秘钥,CA 证书是 10 年的还未过期,kubelet 证书已过期,需要修复。
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1EZ3hOREV3TkRjek1Wb1hEVE15TURneE1URXdORGN6TVZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTVJOCi9Nb1hnK2dCNTBZa0IvOXNlaWlPSjdOaXRETmRyS2thK2E0NUo3amg3OThuZjdSN2ZMYlBjTkxyZFgyN3lUT2wKZ1dKMTdCN1B3alNIWXZXejVGbGMyS3NBelFHaXlyOXZ3cDhUcnZVQUdRSkFaUllJMmcrYTFjWnZGSXl5SkdYdQpoSkIxcUQrN2ZPYWVCWkFYZ00wb0NvZjEvVVNIU1BUWGpVRjY4a09lUFUyS2RvSUthZWE5eXF5a3oxN1M0MWZhClFNS0hmVHZHbDJNMUg0SGU4WEhQMUVoV1dDai9MMEs0SDRIaENJYVV4MmdBZ3UwZy9oSDUyRFQ3anRnZG5KQWQKUUczdWxWOVdFeFVrSjlnNUZISEVCR1lDVmlKUkFGR011RVQyMmlFYlI3VWx0RDlaOGlBQ0x3ZURGOHJUdmN1bgpOSUlocXMwaTVCMkFlYitLZUZrQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZNVzJlZzgzM0I5SXI1b2I2cnF2T2Jqd2VIek1NQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQWw3RjR4bVo3cFkzeGJWODZaYQpTUnBkR054Z1Jka3VXNkFCOUhReHhlbFhDUUUwZmFoU3FCQTZ2blk2NEJwbFBOYURsdmNHNVFPZXNENVhaTXVCCjRzRGUvMjFYUU4rL215TEFaUmcyaDJLZ3l4ZUZuUkJqTWplOFRkQXVTWGxSN3hOVkdYSDhLU1VxOXc3dFN4QWUKUGRIWEozYTZiUkRhc2NhdWNmZFIvUy9INGZaa0RPS1picGl1UUk3RGRTQkEyVVJTeHBjL25lRnl2eVJ6djNGbQpiN1FCZ1RmdEpRU3NHQXArR2NCU29uaUVZYzdJSFJzaGk3ZWNLV3hiU2swZi9ZL1Z1eFptUUxYMWp0cVpuSFNwCjBmZXoyNW1MYWZ2Ri9Ba04wTkNhZldCb3Z3VFg3RWk3SGRHV0tsRGVXMDJTVkVidzdWR0h5UUZ0UWJPVDBYSXcKbVhjPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server: https://10.29.8.171:6443
name: rhel-k8scluster
contexts:
- context:
cluster: rhel-k8scluster
user: system:node:master01
name: system:node:master01@rhel-k8scluster
current-context: system:node:master01@rhel-k8scluster
kind: Config
preferences: {}
users:
- name: system:node:master01
user:
client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
先对原来的文件进行备份:
mkdir ~/bak
cp /etc/kubernetes/kubelet.conf ~/bak
cp /var/lib/kubelet/pki/kubelet-client* ~/bak
使用 kubeadm 生成新的 kubelet.conf,此配置会自动附加新的 kubelet 证书:
kubeadm kubeconfig user --org system:nodes --client-name system:node:master01 --config=kubeadm-config.yaml > /etc/kubernetes/kubelet.conf
# 重启 kubelet 服务
systemctl restart kubelet
之后查看到 master 节点为 Ready:
[root@master01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
master01 Ready control-plane,master 452d v1.22.11
renew 其他节点证书
接着使用相似操作,在 master 节点(包含 ca 根证书的节点)为其他 worker 节点创建 kubelet 配置文件:
kubeadm kubeconfig user --org system:nodes --client-name system:node:worker01 --config=kubeadm-config.yaml > kubelet-worker01.conf
kubeadm kubeconfig user --org system:nodes --client-name system:node:worker02 --config=kubeadm-config.yaml > kubelet-worker02.conf
... ...
创建完后依次 scp 到相应 worker 节点,然后重启 kubelet 服务:
scp kubelet-worker01.conf [email protected]:/etc/kubernetes/kubelet.conf
scp kubelet-worker02.conf [email protected]:/etc/kubernetes/kubelet.conf
... ...
systemctl restart kubelet
最终可以看到所有节点恢复正常:
[root@master01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
master01 Ready control-plane,master 452d v1.22.11
worker01 Ready <none> 452d v1.22.11
worker02 Ready <none> 452d v1.22.11
worker03 Ready <none> 452d v1.22.11
worker04 Ready <none> 452d v1.22.11
worker05 Ready <none> 255d v1.22.11
worker06 Ready <none> 253d v1.22.11
worker07 Ready <none> 168d v1.22.11