K8S Series 12: Upgrading a K8S Cluster with kubeadm


This article describes how to upgrade a K8S cluster using kubeadm.

I have previously written some posts on k8s fundamentals and cluster setup; readers who need them can take a look.

1. Overview

Upgrading a K8S cluster can be broken down into three main steps:

  1. Upgrade one primary control plane node
  2. Upgrade the remaining control plane nodes
  3. Upgrade the remaining worker nodes

The cluster being upgraded here consists of three control plane nodes and three worker nodes, uses cilium and containerd, and is running K8S version 1.25.4; the plan is to upgrade it to version 1.26.0.

$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-cilium-master-10-31-80-1.tinychen.io Ready control-plane 15d v1.25.4 10.31.80.1 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-2.tinychen.io Ready control-plane 15d v1.25.4 10.31.80.2 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-3.tinychen.io Ready control-plane 15d v1.25.4 10.31.80.3 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-4.tinychen.io Ready <none> 15d v1.25.4 10.31.80.4 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-5.tinychen.io Ready <none> 15d v1.25.4 10.31.80.5 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-6.tinychen.io Ready <none> 15d v1.25.4 10.31.80.6 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11

2. Preparation

Before starting the upgrade, let's get a few things in order:

  1. Read the release notes carefully: pay close attention to the changes between the current version and the target version, especially for larger version jumps;
  2. The K8S cluster must use a static control plane (control plane components hosted as static pods);
  3. The cluster's etcd must either be deployed as static pods or be an external etcd;
  4. Back up important data and workloads: although a kubeadm upgrade only touches K8S's own components, for important business data and app-level stateful services it is still better to be safe than sorry (see the sketch after this list);
  5. Disable SWAP on the cluster nodes
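
For item 4, a minimal sketch of backing up the static-pod etcd with etcdctl, plus the SWAP step from item 5 (the snapshot path and the default kubeadm PKI paths below are assumptions; adjust them to your environment):

# run on a control plane node; assumes etcdctl is installed and the default kubeadm certificate paths
$ ETCDCTL_API=3 etcdctl snapshot save /var/lib/etcd-backup/snapshot-$(date +%F).db \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key

# disable SWAP immediately, and remove/comment the swap entry in /etc/fstab to keep it off after reboot
$ swapoff -a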

A few caveats:

  • For any kubelet minor version upgrade, always drain the node first to evict all workloads on it; otherwise leftover critical workloads such as coredns may affect the stability of the whole cluster;
  • Because the container spec hash value changes after the cluster upgrade, all containers will be restarted once the upgrade completes;
  • You can use systemctl status kubelet or journalctl -xeu kubelet to inspect the kubelet logs and confirm that its upgrade succeeded;
  • It is not recommended to use --config to reconfigure the cluster at the same time as running kubeadm upgrade; if you need to change the cluster configuration, refer to the official reconfiguration tutorial

3. Upgrading kubeadm

Before upgrading the cluster, we first need to upgrade kubeadm on every node; here we simply use yum to upgrade it to 1.26.0.

# list all available kubeadm versions
$ yum list --showduplicates kubeadm --disableexcludes=kubernetes

# then upgrade kubeadm to 1.26.0
$ yum install -y kubeadm-1.26.0-0 --disableexcludes=kubernetes

Once that is done, check the kubeadm version; if it prints something like the following, the upgrade succeeded:

$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GitCommit:"b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d", GitTreeState:"clean", BuildDate:"2022-12-08T19:57:06Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}

4. Upgrading the Control Plane Nodes

4.1 Upgrading the K8S Components

First, pick one of the three control plane nodes to upgrade; here we start with 10.31.80.1.

Next, review the upgrade plan. It lists the components and APIs involved in the upgrade, the before/after versions, and whether any manual steps are required:

$ kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.25.4
[upgrade/versions] kubeadm version: v1.26.0
[upgrade/versions] Target version: v1.26.0
[upgrade/versions] Latest version in the v1.25 series: v1.25.5

W1223 17:23:27.554231 15530 configset.go:177] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: strict decoding error: unknown field "udpIdleTimeout"
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT TARGET
kubelet 6 x v1.25.4 v1.25.5

Upgrade to the latest version in the v1.25 series:

COMPONENT CURRENT TARGET
kube-apiserver v1.25.4 v1.25.5
kube-controller-manager v1.25.4 v1.25.5
kube-scheduler v1.25.4 v1.25.5
kube-proxy v1.25.4 v1.25.5
CoreDNS v1.9.3 v1.9.3
etcd 3.5.5-0 3.5.6-0

You can now apply the upgrade by executing the following command:

kubeadm upgrade apply v1.25.5

_____________________________________________________________________

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT TARGET
kubelet 6 x v1.25.4 v1.26.0

Upgrade to the latest stable version:

COMPONENT CURRENT TARGET
kube-apiserver v1.25.4 v1.26.0
kube-controller-manager v1.25.4 v1.26.0
kube-scheduler v1.25.4 v1.26.0
kube-proxy v1.25.4 v1.26.0
CoreDNS v1.9.3 v1.9.3
etcd 3.5.5-0 3.5.6-0

You can now apply the upgrade by executing the following command:

kubeadm upgrade apply v1.26.0

_____________________________________________________________________


The table below shows the current state of component configs as understood by this version of kubeadm.
Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require manual config upgrade or
resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually
upgrade to is denoted in the "PREFERRED VERSION" column.

API GROUP CURRENT VERSION PREFERRED VERSION MANUAL UPGRADE REQUIRED
kubeproxy.config.k8s.io v1alpha1 v1alpha1 no
kubelet.config.k8s.io v1beta1 v1beta1 no
_____________________________________________________________________

Since the version jump here is small, there is not much that needs to change, and we can proceed with the upgrade directly.

The kubeadm upgrade command also renews the cluster certificates during the upgrade; if you do not want them renewed, add the --certificate-renewal=false flag.

For more information see the certificate management guide.
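
If you prefer to keep the existing certificates, simply append the flag to the apply command (v1.26.0 is the target version used throughout this walkthrough):

# upgrade the control plane without renewing its certificates
$ kubeadm upgrade apply v1.26.0 --certificate-renewal=false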

$ kubeadm upgrade apply v1.26.0
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W1223 17:30:04.493109 17731 configset.go:177] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: strict decoding error: unknown field "udpIdleTimeout"
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.26.0"
[upgrade/versions] Cluster version: v1.25.4
[upgrade/versions] kubeadm version: v1.26.0
[upgrade] Are you sure you want to proceed? [y/N]: y
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.26.0" (timeout: 5m0s)...
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/staticpods] Preparing for "etcd" upgrade
[upgrade/staticpods] Renewing etcd-server certificate
[upgrade/staticpods] Renewing etcd-peer certificate
[upgrade/staticpods] Renewing etcd-healthcheck-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/etcd.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2022-12-23-17-31-20/etcd.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 3 Pods for label selector component=etcd
[upgrade/staticpods] Component "etcd" upgraded successfully!
[upgrade/etcd] Waiting for etcd to become available
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests3224088652"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2022-12-23-17-31-20/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 3 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2022-12-23-17-31-20/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 3 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2022-12-23-17-31-20/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 3 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
W1223 17:33:49.448558 17731 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.26.0". Enjoy!

[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.

If the output ends with something like the following, this node has been upgraded successfully.

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.26.0". Enjoy!

[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.

But don't stop here: only one node has been upgraded so far, and the same operation still needs to be performed on the remaining two control plane nodes.

Note that running kubeadm upgrade plan on the second node now shows different information, because upgrading the first control plane node already updated the kubeadm configmap in the cluster.

$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GitCommit:"b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d", GitTreeState:"clean", BuildDate:"2022-12-08T19:57:06Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}

$ kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.26.0
[upgrade/versions] kubeadm version: v1.26.0
[upgrade/versions] Target version: v1.26.0
[upgrade/versions] Latest version in the v1.26 series: v1.26.0

Accordingly, the upgrade command for the remaining control plane nodes changes to:

$ kubeadm upgrade node

When you see output like the following, the upgrade succeeded:

[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.

4.2 Upgrading kubelet and kubectl

The operations above only upgraded the relevant pods in the K8S cluster; kubelet itself has not been upgraded yet, so the node versions shown here are still 1.25.4:

$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-cilium-master-10-31-80-1.tinychen.io Ready control-plane 15d v1.25.4 10.31.80.1 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-2.tinychen.io Ready control-plane 15d v1.25.4 10.31.80.2 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-3.tinychen.io Ready control-plane 15d v1.25.4 10.31.80.3 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-4.tinychen.io Ready <none> 15d v1.25.4 10.31.80.4 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-5.tinychen.io Ready <none> 15d v1.25.4 10.31.80.5 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-6.tinychen.io Ready <none> 15d v1.25.4 10.31.80.6 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11

Before upgrading kubelet, drain the node to evict all workloads on it except those managed by daemonsets:

$ kubectl drain k8s-cilium-master-10-31-80-1.tinychen.io --ignore-daemonsets
node/k8s-cilium-master-10-31-80-1.tinychen.io cordoned
Warning: ignoring DaemonSet-managed Pods: kube-system/cilium-gj4vm, kube-system/kube-proxy-r7pj8, kube-system/kube-router-szdml
node/k8s-cilium-master-10-31-80-1.tinychen.io drained

Then upgrade kubelet and kubectl with yum:

$ yum install -y kubelet-1.26.0-0 kubectl-1.26.0-0 --disableexcludes=kubernetes
$ systemctl daemon-reload
$ systemctl restart kubelet

# check the logs to verify the service is healthy
$ systemctl status kubelet -l
$ journalctl -xeu kubelet

Checking the node status again now shows that this node has been upgraded and its version is 1.26.0:

$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-cilium-master-10-31-80-1.tinychen.io Ready,SchedulingDisabled control-plane 15d v1.26.0 10.31.80.1 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-2.tinychen.io Ready control-plane 15d v1.25.4 10.31.80.2 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-3.tinychen.io Ready control-plane 15d v1.25.4 10.31.80.3 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-4.tinychen.io Ready <none> 15d v1.25.4 10.31.80.4 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-5.tinychen.io Ready <none> 15d v1.25.4 10.31.80.5 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-6.tinychen.io Ready <none> 15d v1.25.4 10.31.80.6 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11

Once you have confirmed the node is healthy, make it schedulable again:

$ kubectl uncordon k8s-cilium-master-10-31-80-1.tinychen.io

Then repeat the same operations on the remaining two control plane nodes; a condensed recap of the per-node sequence follows.
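
The recap below just gathers the commands already shown above into one place; <node-name> is a placeholder for the control plane node currently being upgraded:

# on the control plane node being upgraded
$ kubeadm upgrade node

# from a machine with working kubectl access
$ kubectl drain <node-name> --ignore-daemonsets

# on the control plane node being upgraded
$ yum install -y kubelet-1.26.0-0 kubectl-1.26.0-0 --disableexcludes=kubernetes
$ systemctl daemon-reload
$ systemctl restart kubelet

# from a machine with working kubectl access
$ kubectl uncordon <node-name>

With both remaining nodes upgraded, the whole control plane is now on 1.26.0: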

$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-cilium-master-10-31-80-1.tinychen.io Ready control-plane 15d v1.26.0 10.31.80.1 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-2.tinychen.io Ready control-plane 15d v1.26.0 10.31.80.2 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-3.tinychen.io Ready control-plane 15d v1.26.0 10.31.80.3 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-4.tinychen.io Ready <none> 15d v1.25.4 10.31.80.4 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-5.tinychen.io Ready <none> 15d v1.25.4 10.31.80.5 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-6.tinychen.io Ready <none> 15d v1.25.4 10.31.80.6 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11

5. Upgrading the Worker Nodes

Upgrading a worker node is much simpler: since it runs no control plane components, only the kubelet configuration needs to be updated.

$ kubeadm upgrade node
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks
[preflight] Skipping prepull. Not a control plane node.
[upgrade] Skipping phase. Not a control plane node.
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.

Next, simply repeat the drain node --> upgrade kubelet --> check the service --> uncordon node steps from above.

Worker nodes usually do not have kubectl installed, so there is no need to upgrade kubectl on them.

# run with kubectl on a control plane node
$ kubectl drain k8s-cilium-worker-10-31-80-4.tinychen.io --ignore-daemonsets

# run on the worker node
$ yum install -y kubelet-1.26.0-0 --disableexcludes=kubernetes
$ systemctl daemon-reload
$ systemctl restart kubelet

# run with kubectl on a control plane node
$ kubectl uncordon k8s-cilium-worker-10-31-80-4.tinychen.io

Repeat the steps above on all remaining worker nodes to complete the upgrade of the entire K8S cluster:

# run on the worker node
$ kubeadm upgrade node

# run with kubectl on a control plane node
$ kubectl drain <node-name> --ignore-daemonsets

# run on the worker node
$ yum install -y kubelet-1.26.0-0 --disableexcludes=kubernetes
$ systemctl daemon-reload
$ systemctl restart kubelet

# run with kubectl on a control plane node
$ kubectl uncordon <node-name>

Finally, check the cluster status: all nodes have been upgraded to 1.26.0.

$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-cilium-master-10-31-80-1.tinychen.io Ready control-plane 15d v1.26.0 10.31.80.1 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-2.tinychen.io Ready control-plane 15d v1.26.0 10.31.80.2 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-master-10-31-80-3.tinychen.io Ready control-plane 15d v1.26.0 10.31.80.3 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-4.tinychen.io Ready <none> 15d v1.26.0 10.31.80.4 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-5.tinychen.io Ready <none> 15d v1.26.0 10.31.80.5 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11
k8s-cilium-worker-10-31-80-6.tinychen.io Ready <none> 15d v1.26.0 10.31.80.6 <none> CentOS Linux 7 (Core) 6.0.11-1.el7.elrepo.x86_64 containerd://1.6.11

6. Handling Upgrade Failures

Here we mainly consider the case of a control plane node failing to upgrade. The worker node upgrade is relatively simple, and as long as you drained the workloads from the worker beforehand, the worst case is removing that node from the cluster, reinstalling it, and joining it again, which has limited impact on the cluster as a whole. If a control plane node fails mid-upgrade, however, it can potentially bring down the entire cluster, so the official documentation offers several ways to deal with a failed upgrade.

  • Automatic rollback on failure: this is the silver lining. If a kubeadm node upgrade fails, it should in theory roll back automatically to the pre-upgrade version; if the rollback succeeds, the cluster should still work normally;

  • Upgrade interrupted unexpectedly: if the upgrade is interrupted by network problems or similar issues, simply run the previous upgrade command again. kubeadm's operations are described as idempotent, so you only need to verify that the cluster is healthy once the upgrade finishes;

  • Manual rollback after a failed upgrade: before upgrading, kubeadm backs up the relevant manifests and etcd data under /etc/kubernetes/tmp; in the worst case you have to restore the data yourself and restart the affected services (a sketch follows this list)
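
For the last case, a hedged sketch of a manual rollback using the manifest backups kubeadm leaves under /etc/kubernetes/tmp (the timestamped directory name below is taken from the upgrade output earlier in this article; substitute the one actually present on your node):

# list the backups created during the failed upgrade
$ ls /etc/kubernetes/tmp/

# restore the old static pod manifests; the kubelet watches this directory and will recreate the pods
$ cp /etc/kubernetes/tmp/kubeadm-backup-manifests-2022-12-23-17-31-20/*.yaml /etc/kubernetes/manifests/

# then confirm the control plane components come back up
$ kubectl get pods -n kube-system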