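I have a Raspberry Pi 5 (8 GB RAM) and an Orange Pi 3 LTS (2 GB RAM) at home, and I recently needed to get familiar with K8s operations at work, so I decided to put together a K8s cluster to play with.
The plan: the two ARM boards act as worker nodes, and one more machine is needed as the master node. Since I have no other boards, I created an Ubuntu (AMD64) virtual machine on my NAS for that role.
Below is a record of the cluster setup.
All machines run Ubuntu.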
ARM
Install Docker
    sudo apt install -y docker.io
Configure Docker
    sudo docker info
    Client:
     Version: 24.0.5
     Context: default
     Debug Mode: false
    Server:
     Containers: 0
      Running: 0
      Paused: 0
      Stopped: 0
     Images: 0
     Server Version: 24.0.5
     Storage Driver: overlay2
      Backing Filesystem: extfs
      Supports d_type: true
      Using metacopy: false
      Native Overlay Diff: true
      userxattr: false
     Logging Driver: json-file
     Cgroup Driver: systemd
     Cgroup Version: 2
Check whether Cgroup Driver is systemd. If it is not, create the file /etc/docker/daemon.json with the content below; if it is already systemd by default, skip this step.
    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m"
      },
      "storage-driver": "overlay2"
    }
After saving the file, restart Docker:
    sudo systemctl restart docker
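The cgroup driver check can also be scripted instead of read by eye. The sketch below parses `docker info` text from stdin; the function names `cgroup_driver_from_info` and `is_systemd_cgroup` are made up for this example.

```shell
# Extract the cgroup driver from `docker info` text read on stdin.
cgroup_driver_from_info() {
    awk -F': *' '/Cgroup Driver/ { print $2; exit }'
}

# Succeed only if the reported driver is systemd.
# Reads `docker info` output on stdin, so saved output works too.
is_systemd_cgroup() {
    [ "$(cgroup_driver_from_info)" = "systemd" ]
}
```

Typical use would be `sudo docker info | is_systemd_cgroup && echo "cgroup driver OK"`. Docker can also print the field directly with `sudo docker info --format '{{.CgroupDriver}}'`.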
Let iptables see bridged traffic
Create the file /etc/sysctl.d/k8s.conf with the following content:
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
Reload the system configuration:
    sudo sysctl --system
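The `net.bridge.*` keys only exist once the `br_netfilter` kernel module is loaded, so on a fresh install the reload may need that module first. A minimal sketch, assuming the file path used above; the function names are hypothetical, and `reload_bridge_sysctl` needs root.

```shell
# Load the bridge netfilter module, then re-read every sysctl.d file.
# Needs root: the net.bridge.* keys appear only after br_netfilter is loaded.
reload_bridge_sysctl() {
    sudo modprobe br_netfilter
    sudo sysctl --system
}

# Check that a sysctl config file sets both bridge keys to 1.
# $1: path to the file (this post uses /etc/sysctl.d/k8s.conf).
bridge_keys_enabled() {
    for key in net.bridge.bridge-nf-call-ip6tables net.bridge.bridge-nf-call-iptables; do
        grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$1" || return 1
    done
}
```

For example: `bridge_keys_enabled /etc/sysctl.d/k8s.conf && reload_bridge_sysctl`.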
Install Kubernetes (using the Aliyun mirror)
    sudo apt-get install -y apt-transport-https
    # Create the keyring directory in case it does not exist yet
    sudo mkdir -p /etc/apt/keyrings
    curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
    echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
    sudo apt-get update
    sudo apt-get install -y kubelet kubeadm kubectl
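The minor version (v1.28) appears in both the key URL and the repository line, so it is easy to change one and forget the other. The sketch below builds the repository line from a single argument. The helper name `k8s_repo_line` is hypothetical, and whether the mirror hosts a given minor version under the same path layout is an assumption to verify.

```shell
# Print the apt source line for a given Kubernetes minor version.
# $1: minor version such as v1.28 (same URL layout as used in this post).
k8s_repo_line() {
    case "$1" in
        v[0-9]*.[0-9]*) ;;
        *) echo "expected a version like v1.28, got: $1" >&2; return 1 ;;
    esac
    printf 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/%s/deb/ /\n' "$1"
}
```

Usage would look like `k8s_repo_line v1.28 | sudo tee /etc/apt/sources.list.d/kubernetes.list`.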
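It is generally recommended to upgrade these three packages manually; automatic upgrades can easily cause compatibility problems between cluster components. So pin their versions:
    sudo apt-mark hold kubelet kubeadm kubectl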
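To confirm the hold took effect on every machine, the output of `apt-mark showhold` can be checked. A small sketch; the function name `missing_holds` is made up for this example.

```shell
# Print which of the three packages are NOT in a hold list.
# Reads `apt-mark showhold` output (one package name per line) on stdin.
missing_holds() {
    held=$(cat)
    for pkg in kubelet kubeadm kubectl; do
        printf '%s\n' "$held" | grep -qx "$pkg" || printf '%s\n' "$pkg"
    done
}
```

Running `apt-mark showhold | missing_holds` prints nothing when all three packages are held.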
AMD64
Install Docker
Install the required system tools:
    sudo apt-get update
    sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common
Add the GPG key:
    curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
Add the package repository:
    sudo add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
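The repository line hardcodes `arch=amd64`, which is right for the VM but would be wrong if the same steps were reused on the ARM boards. As a sketch, the architecture name can be derived from `uname -m`; the helper name `deb_arch` is hypothetical, and whether the mirror carries Docker CE packages for every architecture is an assumption.

```shell
# Map a kernel machine type to the Debian/Ubuntu architecture name.
# $1: machine type; defaults to the output of `uname -m`.
deb_arch() {
    machine=${1:-$(uname -m)}
    case "$machine" in
        x86_64)        echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        armv7l)        echo armhf ;;
        *) echo "unsupported machine type: $machine" >&2; return 1 ;;
    esac
}
```

On Ubuntu, `dpkg --print-architecture` reports the same name directly, so either could be substituted into the `[arch=...]` field.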
Update the package index and install Docker CE:
    sudo apt-get -y update
    sudo apt-get -y install docker-ce
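Change the cgroup driver
    # Use sudo tee: a plain "cat >" redirect into /etc/docker fails without a root shell
    sudo tee /etc/docker/daemon.json > /dev/null <<EOF
    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m"
      },
      "storage-driver": "overlay2"
    }
    EOF
    # Restart Docker so the new settings take effect
    sudo systemctl restart docker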
Install Kubernetes (using the Aliyun mirror)
Same as in the ARM section.
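Create the K8s cluster
Run the following on the master node.
Create a token
    TOKEN=$(sudo kubeadm token generate)
    echo $TOKEN
It is worth saving the token in your bash profile so that later shell sessions can reuse it.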
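One way to persist the token is sketched below, assuming `~/.bash_profile` is the profile file in use. The function names are made up for this example; the format check relies on kubeadm tokens having the form of 6 characters, a dot, then 16 characters, all lowercase letters or digits.

```shell
# Succeed if $1 looks like a kubeadm bootstrap token.
valid_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Store the token as "export TOKEN=..." in a profile file.
# $1: token, $2: profile path (assumed default: ~/.bash_profile).
save_token() {
    token=$1
    profile=${2:-$HOME/.bash_profile}
    valid_token "$token" || { echo "not a kubeadm token: $token" >&2; return 1; }
    # Drop any earlier TOKEN export so the file holds a single entry.
    if [ -f "$profile" ]; then
        grep -v '^export TOKEN=' "$profile" > "$profile.tmp" || true
        mv "$profile.tmp" "$profile"
    fi
    printf 'export TOKEN=%s\n' "$token" >> "$profile"
}
```

After `save_token "$TOKEN"`, new login shells will have `$TOKEN` set. Bootstrap tokens expire (24 hours by default), so a stored value is only useful for a limited time.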
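Initialize the cluster
    sudo kubeadm init --token=${TOKEN} --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address 192.168.6.11
The address after --apiserver-advertise-address is the IP address of the master node.
On success, kubeadm prints a lot of information. Note down the sha256 value in that output; it is needed later when the other nodes join.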
Then copy the admin configuration into the current user's home directory:
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check the cluster status
    kubectl get nodes
    NAME            STATUS     ROLES    AGE     VERSION
    ubuntu-master   NotReady   master   2d10h   v1.19.4
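The cluster is not Ready yet, so install a CNI plugin (flannel here):
    kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Wait a while, then check the cluster status again; the node should now be Ready.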
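Rather than re-running the command by hand, the wait can be scripted. The sketch below parses `kubectl get nodes` output and polls until no node is left in a non-Ready state. The function names and the 10-second interval are choices made for this example.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes` output on stdin and skips the header row.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Poll until every node is Ready, or give up after $1 attempts (default 30).
wait_all_ready() {
    tries=${1:-30}
    while [ "$tries" -gt 0 ]; do
        if out=$(kubectl get nodes 2>/dev/null) &&
           [ -z "$(printf '%s\n' "$out" | not_ready_nodes)" ]; then
            return 0
        fi
        sleep 10
        tries=$((tries - 1))
    done
    return 1
}
```

Note that a cordoned node shows `Ready,SchedulingDisabled` and is counted as not Ready by this check. kubectl also has a built-in equivalent: `kubectl wait --for=condition=Ready nodes --all --timeout=300s`.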
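Join the other nodes
Run on each worker node, where <TOKEN> is the token created earlier and <DISCOVERY-TOKEN> is the sha256 value recorded from the kubeadm init output:
    sudo kubeadm join 192.168.6.11:6443 --token <TOKEN> --discovery-token-ca-cert-hash sha256:<DISCOVERY-TOKEN>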
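If the sha256 value was not written down, it can be recomputed on the master node from the cluster CA certificate. The pipeline below follows the one given in the kubeadm documentation and assumes an RSA CA key (the kubeadm default) at the default path; the wrapper name `ca_cert_hash` is made up for this example.

```shell
# Print the hex SHA-256 of the CA certificate's public key, which is the
# value that goes after "sha256:" in --discovery-token-ca-cert-hash.
# $1: CA certificate path (default: /etc/kubernetes/pki/ca.crt).
ca_cert_hash() {
    crt=${1:-/etc/kubernetes/pki/ca.crt}
    openssl x509 -pubkey -in "$crt" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}
```

Alternatively, `sudo kubeadm token create --print-join-command` on the master prints a complete join command with a fresh token and the hash filled in.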
References
部署ARM架构Kubernetes — Cloud Atlas beta 文档 (cloud-atlas.readthedocs.io)
kubernetes镜像_kubernetes下载地址_kubernetes安装教程-阿里巴巴开源镜像站 (aliyun.com)
基于阿里云镜像站安装Kubernetes | ljchen’s Notes
Problems encountered
kubeadm init fails with "rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
    sudo kubeadm init --token=${TOKEN} --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address 192.168.6.11
Error output:
    I0205 16:33:20.853342 16127 version.go:256] remote version is much newer: v1.29.1; falling back to: stable-1.28
    [init] Using Kubernetes version: v1.28.6
    [preflight] Running pre-flight checks
    error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR CRI]: container runtime is not running: output: time="2024-02-05T16:33:22+08:00" level=fatal msg="validate service connection: validate CRI v1 runtime API for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
    , error: exit status 1
    [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
    To see the stack trace of this error execute with --v=5 or higher
Fix
    sudo vi /etc/containerd/config.toml
    # Comment out this line: disabled_plugins = ["cri"]
    sudo systemctl restart containerd
    # Then run the kubeadm init command again
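On several machines the same edit can be applied without opening an editor. A minimal sketch, assuming the line reads `disabled_plugins = ["cri"]` as in the fix above; the function name `enable_cri_plugin` is hypothetical. If the list also names other plugins, commenting out the whole line re-enables those too, so check the file first.

```shell
# Comment out the disabled_plugins line that lists "cri".
# $1: config path (default: /etc/containerd/config.toml). Needs write access.
# Lines that are already commented out are left untouched.
enable_cri_plugin() {
    conf=${1:-/etc/containerd/config.toml}
    tmp=$(mktemp) || return 1
    sed 's/^\([[:space:]]*disabled_plugins[[:space:]]*=.*"cri".*\)$/# \1/' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Run it as root (for example from a script started with sudo), then restart containerd with `sudo systemctl restart containerd` before retrying `kubeadm init`.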