Here are 2 ways to set up a Rancher Kubernetes Engine (RKE2) cluster of 3 (or more) nodes.
Minimum requirements
- 3 up-to-date Ubuntu Server 22.04 hosts, up and running (4 GB RAM / 2 vCPU each)
- Ability to run the seashell Docker container on your workstation (first install option)
- No firewall between nodes
- NTP time synchronization (chronyd)
- Passwordless SSH and sudo on each host (SSH key installed, ssh-agent running to unlock the key; see the sketch after this list)
- DNS updated and resolution working
- 1 Ubuntu Server 22.04 host as external load balancer (1 GB RAM / 1 vCPU) (second install option only)
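A minimal sketch to satisfy the SSH and sudo requirements, assuming your user already exists on each node with sudo rights:

```bash
ssh-keygen -t ed25519                                # generate a key pair if you have none
eval $(ssh-agent -s) && ssh-add                      # start the agent and unlock the key
for h in proxy rke2{1..3}; do ssh-copy-id $h; done   # install the public key on each host
```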
Rancher made RKE2 straightforward to install. Here is an example configuration. For details, refer to docs.rke2.io
Lab description
| Hostname | IP Address | vCPU | RAM (GB) |
|---|---|---|---|
| proxy.home.pivert.org | 192.168.66.20 | 1 | 1 |
| rke21.home.pivert.org | 192.168.66.21 | 2 | 4 |
| rke22.home.pivert.org | 192.168.66.22 | 2 | 4 |
| rke23.home.pivert.org | 192.168.66.23 | 2 | 4 |
Sanity checks
Before going further, make sure the tests below run successfully from your workstation (adapt the IPs to your environment).
```bash
for x in 192.168.66.2{0..3}; do
  ping -W1 -c1 $x > /dev/null && echo "$x : OK" || echo "$x : Not OK"
done
for x in proxy rke2{1..3}; do
  ping -W1 -c1 $x > /dev/null && echo "$x : OK" || echo "$x : Not OK"
done
```
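The Ansible option also needs passwordless SSH and sudo, which a similar loop can verify (a sketch; BatchMode makes ssh fail instead of prompting for a password):

```bash
for x in proxy rke2{1..3}; do
  ssh -o BatchMode=yes $x 'sudo -n true' && echo "$x : ssh+sudo OK" || echo "$x : Not OK"
done
```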
First option: Ansible Galaxy
Example 1: Deploy a 3-node cluster
The Ansible Galaxy RKE2 role will set up a multi-node cluster in minutes. It will also install keepalived on the masters to maintain a virtual IP address (the proxy) between the nodes, so you do not need any external haproxy nor external keepalived. Make sure the virtual IP for proxy (192.168.66.20 in the example) is free.
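A quick, non-authoritative way to check that the VIP is not already taken (a host that does not answer ping could still hold the address):

```bash
ping -c1 -W1 192.168.66.20 > /dev/null && echo "192.168.66.20 is already in use" || echo "192.168.66.20 looks free"
```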
Start a seashell
The seashell container offers a Bash and Neovim environment with all the common Kubernetes tools (kubectl, oc, k9s, helm, …) and Ansible. seashell mounts the current folder for persistence (your config files). Also, if you need ssh-agent, start it before launching seashell, or run eval $(ssh-agent -s)
As user, create a new folder and cd into it.
curl -O https://gitlab.com/pivert/seashell/-/raw/main/seashell && chmod a+x seashell && seashell
Install the `lablabs.rke2` role.
ansible-galaxy install lablabs.rke2
Create the 2 configuration files
The inventory `hosts` file:

```ini
[masters]
rke21
rke22
rke23

[k8s_cluster:children]
masters
```
Just make sure you keep the group names in the inventory file.
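Before deploying, you can validate the inventory and the SSH connectivity with Ansible's ping module:

```bash
ansible -i hosts k8s_cluster -m ping
```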
The `deploy_rke2.yaml` playbook:

```yaml
- name: Deploy RKE2
  hosts: k8s_cluster
  become: yes
  vars:
    rke2_ha_mode: true
    rke2_api_ip: 192.168.66.20
    rke2_download_kubeconf: true
  roles:
    - role: lablabs.rke2
```
Run the playbook
ansible-playbook -i hosts deploy_rke2.yaml
Check
From seashell:

```bash
mv /tmp/rke2.yaml ./
export KUBECONFIG=$(pwd)/rke2.yaml
k9s
```
Example 2: Deploy a 6-node cluster with 3 masters and 3 workers
The `hosts` inventory:

```ini
[masters]
master1 rke2_type=server
master2 rke2_type=server
master3 rke2_type=server

[workers]
worker1 rke2_type=agent
worker2 rke2_type=agent
worker3 rke2_type=agent

[k8s_cluster:children]
masters
workers
```
The `deploy_rke2.yaml` playbook:

```yaml
- name: Deploy RKE2
  hosts: k8s_cluster
  become: yes
  vars:
    rke2_version: v1.27.3+rke2r1
    rke2_ha_mode: true
    rke2_ha_mode_keepalived: false
    rke2_ha_mode_kubevip: true
    rke2_api_ip: 192.168.66.60
    rke2_download_kubeconf: true
    rke2_additional_sans:
      - rke.home.pivert.org
      - 192.168.66.60
    rke2_loadbalancer_ip_range: 192.168.66.100-192.168.66.200
    rke2_server_node_taints:
      - "CriticalAddonsOnly=true:NoExecute"
  roles:
    - role: lablabs.rke2
```
group_vars
Create a ./group_vars/ subfolder with 3 files, and provide the rke2_token and rke2_agent_token.
The rke2_token for the workers must be equal to the rke2_agent_token of the masters. This is an additional security feature: workers join with a dedicated agent token and never hold the token reserved for masters joining the cluster.
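One way to generate two such random tokens (a sketch using openssl; any long random string works):

```bash
openssl rand -hex 32   # run it twice: once for rke2_token, once for rke2_agent_token
```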
First set your global variables in all.yml:

```yaml
---
ansible_user: pivert
rke2_token: ""
```
Generate 2 tokens, and place them in a masters.yml file:

```yaml
---
rke2_token: EQKZhm2klhTE0G3WGjlrSB8pHNejRWZlH4oW7y8mCW9xZN13OTMw7BF10mXdBPLN
rke2_agent_token: MIJH9VSJ4Pu3YTavEGOzClkLvvApvspjHCd4fugVEgSkJ0YQlha8ha6RcvOdMyZv
```
Create the workers.yml with only the “agent token” as rke2_token:

```yaml
---
rke2_token: MIJH9VSJ4Pu3YTavEGOzClkLvvApvspjHCd4fugVEgSkJ0YQlha8ha6RcvOdMyZv
```
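The resulting layout should look like this:

```
group_vars/
├── all.yml
├── masters.yml
└── workers.yml
```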
Run the playbook
ansible-playbook -i hosts deploy_rke2.yaml
Check
After 15-45 minutes:

```bash
mv /tmp/rke2.yaml ./
export KUBECONFIG=$(pwd)/rke2.yaml
k9s
```
Troubleshoot
If the cluster does not come up 30 minutes after everything has been downloaded (monitor your internet download usage), you can use crictl to ensure all your pods are running.
Create a crictl.yaml file locally containing:

```yaml
runtime-endpoint: unix:///run/k3s/containerd/containerd.sock
```
Then use Ansible to copy the file and to install cri-tools:

```bash
ansible k8s_cluster -m copy -a "dest=/etc/crictl.yaml content='runtime-endpoint: unix:///run/k3s/containerd/containerd.sock\n'" -b
VERSION="v1.28.0"
ansible k8s_cluster -m shell -a "wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz"
ansible k8s_cluster -m shell -a "tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin" -b
ansible k8s_cluster -m shell -a "rm -f crictl-$VERSION-linux-amd64.tar.gz"
```
crictl uses a syntax very close to docker's, so a command like this will show you all the running containers:
ansible k8s_cluster -a 'crictl ps'
Feel free to check logs with crictl if the cluster did not initialize and you did not get the /tmp/rke2.yaml file.
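For example, on one of the nodes (the container ID is illustrative; take one from the previous output):

```bash
crictl pods                  # list the pod sandboxes
crictl ps -a                 # list containers, including exited ones
crictl logs <container-id>   # show the logs of a given container
```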
Scratch the cluster
```bash
ansible k8s_cluster -a /usr/local/bin/rke2-killall.sh -b
ansible k8s_cluster -a /usr/local/bin/rke2-uninstall.sh -b
ansible k8s_cluster -a 'rm /etc/systemd/system/rke2-server.service' -b
ansible k8s_cluster -a 'systemctl daemon-reload' -b
```
The 2 last steps above, which remove the leftover rke2-server service unit, are important: a bug can leave the unit behind after uninstall.
Second option: Manual
This option is described in the docs.rke2.io documentation.
Set up your haproxy or load balancer
This option does not describe how to run keepalived on the hosts, so you need an external load balancer. Feel free to use any load balancer, or DNS round-robin. Here is an example for haproxy:
```
global
    log 127.0.0.1 local2
    pidfile /var/run/haproxy.pid
    maxconn 4000
    daemon

defaults
    mode http
    log global
    option dontlognull
    option http-server-close
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

# RKE2
listen rke2-api-server-6443
    bind 192.168.66.20:6443
    mode tcp
    option httpchk HEAD /readyz
    http-check expect status 200
    option ssl-hello-chk
    server rke21 192.168.66.21:6443 check inter 1s
    server rke22 192.168.66.22:6443 check inter 1s
    server rke23 192.168.66.23:6443 check inter 1s

listen rke2-machine-config-server-9345
    bind 192.168.66.20:9345
    mode tcp
    server rke21 192.168.66.21:9345 check inter 1s
    server rke22 192.168.66.22:9345 check inter 1s
    server rke23 192.168.66.23:9345 check inter 1s

listen rke2-ingress-router-80
    bind 192.168.66.20:80
    mode tcp
    balance source
    server rke21 192.168.66.21:80 check inter 1s
    server rke22 192.168.66.22:80 check inter 1s
    server rke23 192.168.66.23:80 check inter 1s

listen rke2-ingress-router-443
    bind 192.168.66.20:443
    mode tcp
    balance source
    server rke21 192.168.66.21:443 check inter 1s
    server rke22 192.168.66.22:443 check inter 1s
    server rke23 192.168.66.23:443 check inter 1s
```
This might be the most important part. Also make sure you do not have any firewall in the way, or carefully watch for DROPs and REJECTs.
This is not mandatory for this tutorial, but if you’re using haproxy, configure 2 of them in HA with keepalived.
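For reference, a minimal keepalived.conf sketch for such an haproxy pair (the interface name and VRRP parameters are illustrative):

```
vrrp_instance VI_1 {
    state MASTER                 # BACKUP on the second haproxy
    interface eth0               # adapt to your NIC name
    virtual_router_id 51
    priority 100                 # use a lower priority on the BACKUP node
    advert_int 1
    virtual_ipaddress {
        192.168.66.20/24
    }
}
```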
Ignore CNI-managed interfaces (NetworkManager)
Configure NetworkManager to ignore the calico/flannel related network interfaces on the 3 nodes:
```bash
cat <<EOF > /etc/NetworkManager/conf.d/rke2-canal.conf
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:flannel*
EOF
```
Reload:

```bash
systemctl reload NetworkManager
```
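Once the cluster is up, you can verify that the CNI interfaces are indeed ignored:

```bash
nmcli device status | grep -E 'cali|flannel'   # these should be listed as "unmanaged"
```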
Install the first node
```bash
mkdir -p /etc/rancher/rke2/
cat <<EOF > /etc/rancher/rke2/config.yaml
token: $(cat /proc/sys/kernel/random/uuid)
tls-san:
  - proxy.home.pivert.org
EOF
```
Install
curl -sfL https://get.rke2.io | sh -
Start
systemctl enable rke2-server.service --now
Activation should take 3-10 minutes since it will download container images. Watch the log
journalctl -u rke2-server.service -f
Check
```bash
kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get pods -A
```
Install and start other nodes
Proceed one node at a time. Do not move on to the next node until you see the newly added node in “Ready” state with:
kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
Copy the config.yaml and add the server line
Copy the above file to the other nodes, and add the server line at the top. Make sure the token is the same on all nodes. The file should look like this:
```yaml
server: https://proxy.home.pivert.org:9345
token: e0ea0aed-ccb8-4770-880a-2cc49175c0a2
tls-san:
  - proxy.home.pivert.org
```
curl -sfL https://get.rke2.io | sh -
systemctl enable rke2-server.service --now
Results
You should get something like this:
```
root@rke21:~# kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
NAME    STATUS   ROLES                       AGE   VERSION
rke21   Ready    control-plane,etcd,master   55m   v1.25.9+rke2r1
rke22   Ready    control-plane,etcd,master   29m   v1.25.9+rke2r1
rke23   Ready    control-plane,etcd,master   12m   v1.25.9+rke2r1
root@rke21:~# kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get pods -A
NAMESPACE     NAME                                                    READY   STATUS      RESTARTS   AGE
kube-system   cloud-controller-manager-rke21                          1/1     Running     0          56m
kube-system   cloud-controller-manager-rke22                          1/1     Running     0          29m
kube-system   cloud-controller-manager-rke23                          1/1     Running     0          13m
kube-system   etcd-rke21                                              1/1     Running     0          55m
kube-system   etcd-rke22                                              1/1     Running     0          29m
kube-system   etcd-rke23                                              1/1     Running     0          12m
kube-system   helm-install-rke2-canal-rq7m8                           0/1     Completed   0          56m
kube-system   helm-install-rke2-coredns-85d7v                         0/1     Completed   0          56m
kube-system   helm-install-rke2-ingress-nginx-c78xr                   0/1     Completed   0          56m
kube-system   helm-install-rke2-metrics-server-pfzln                  0/1     Completed   0          56m
kube-system   helm-install-rke2-snapshot-controller-2kj8x             0/1     Completed   1          56m
kube-system   helm-install-rke2-snapshot-controller-crd-qpbm8         0/1     Completed   0          56m
kube-system   helm-install-rke2-snapshot-validation-webhook-kk7bc     0/1     Completed   0          56m
kube-system   kube-apiserver-rke21                                    1/1     Running     0          56m
kube-system   kube-apiserver-rke22                                    1/1     Running     0          29m
kube-system   kube-apiserver-rke23                                    1/1     Running     0          13m
kube-system   kube-controller-manager-rke21                           1/1     Running     0          56m
kube-system   kube-controller-manager-rke22                           1/1     Running     0          29m
kube-system   kube-controller-manager-rke23                           1/1     Running     0          13m
kube-system   kube-proxy-rke21                                        1/1     Running     0          56m
kube-system   kube-proxy-rke22                                        1/1     Running     0          29m
kube-system   kube-proxy-rke23                                        1/1     Running     0          13m
kube-system   kube-scheduler-rke21                                    1/1     Running     0          56m
kube-system   kube-scheduler-rke22                                    1/1     Running     0          29m
kube-system   kube-scheduler-rke23                                    1/1     Running     0          13m
kube-system   rke2-canal-774ps                                        2/2     Running     0          30m
kube-system   rke2-canal-fftn5                                        2/2     Running     0          55m
kube-system   rke2-canal-mn24r                                        2/2     Running     0          13m
kube-system   rke2-coredns-rke2-coredns-6b9548f79f-cl9zt              1/1     Running     0          55m
kube-system   rke2-coredns-rke2-coredns-6b9548f79f-lpwzd              1/1     Running     0          30m
kube-system   rke2-coredns-rke2-coredns-autoscaler-57647bc7cf-6xxhj   1/1     Running     0          55m
kube-system   rke2-ingress-nginx-controller-nll9j                     1/1     Running     0          12m
kube-system   rke2-ingress-nginx-controller-nsjgf                     1/1     Running     0          29m
kube-system   rke2-ingress-nginx-controller-s2j8v                     1/1     Running     0          54m
kube-system   rke2-metrics-server-7d58bbc9c6-98j9s                    1/1     Running     0          54m
kube-system   rke2-snapshot-controller-7b5b4f946c-mw9l9               1/1     Running     0          54m
kube-system   rke2-snapshot-validation-webhook-7748dbf6ff-dzmjf       1/1     Running     0          54m
```
Troubleshoot
Check your certificate, and especially the «X509v3 Subject Alternative Name»
openssl s_client -connect localhost:6443 < /dev/null | openssl x509 -text
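With OpenSSL 1.1.1 or later, you can print just the SANs:

```bash
openssl s_client -connect localhost:6443 < /dev/null 2>/dev/null | openssl x509 -noout -ext subjectAltName
```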
Restart the installation or recreate the node.
```bash
cp /etc/rancher/rke2/config.yaml /root   # Save your config.yaml since it will be deleted by the uninstall below
# You must delete the node from the cluster to be able to re-join
kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml drain $HOSTNAME --delete-emptydir-data --force --ignore-daemonsets
kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml delete node $HOSTNAME
systemctl disable rke2-server.service --now
rke2-killall.sh
rke2-uninstall.sh
mkdir -p /etc/rancher/rke2/
cp /root/config.yaml /etc/rancher/rke2/
curl -sfL https://get.rke2.io | sh -
systemctl enable rke2-server.service --now
journalctl -u rke2-server.service -f
```
The kubectl delete might hang; just press Ctrl+C once. The node deletion and re-creation takes about 3-5 minutes.
What’s next?
Install Rancher to get a GUI
Make sure you’re in the above-mentioned seashell, and have a proper KUBECONFIG environment variable. For instance:

```bash
export KUBECONFIG=~/rke2.yaml
```
Or add or copy the rke2.yaml content to your ~/.kube/config.
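A sketch to merge it non-destructively (paths assumed; back up your config first):

```bash
cp ~/.kube/config ~/.kube/config.bak
KUBECONFIG=~/.kube/config:~/rke2.yaml kubectl config view --flatten > /tmp/merged-config
mv /tmp/merged-config ~/.kube/config
```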
From the seashell, you should be able to run aliases such as kgpoall (800 aliases starting with k…).
```bash
helm repo add jetstack https://charts.jetstack.io
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest   # register the Rancher chart repo used below
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --version v1.11.0
helm install rancher rancher-latest/rancher --namespace cattle-system --create-namespace --set hostname=proxy.home.pivert.org --set bootstrapPassword=admin --values values.yaml --set global.cattle.psp.enabled=false
echo https://proxy.home.pivert.org/dashboard/?setup=$(kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}')
# => Click on the link after 5-10 minutes
```
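Rancher takes a few minutes to start; you can watch the deployment before opening the link:

```bash
kubectl -n cattle-system rollout status deploy/rancher
```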
Also make sure you have enough RAM and vCPU to run your workloads. Plan for a minimum of 16 GB RAM and 4 vCPU.
Conclusion
RKE2 is very easy to set up, and the documentation is great. Both options allow you to install a cluster in minutes. Check the documentation for more.
2 responses to “RKE2 cluster on Ubuntu 22.04 in minutes – Ansible Galaxy and Manual options (2 ways)”
How about installing onto a single server which acts as server and agent?
The official documentation doesn’t explain it well!
Of course, you can always install both server and agent (or master and worker) on a single server; you’ll need 3 servers instead of 6. The reason to split servers and agents is to prevent a high workload on an agent from impacting the control plane and/or etcd.
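For instance, a sketch of Example 2 collapsed to 3 dual-role nodes: keep only the masters in the inventory, and drop the rke2_server_node_taints variable from the playbook so that workloads can schedule on the servers.

```ini
[masters]
rke21 rke2_type=server
rke22 rke2_type=server
rke23 rke2_type=server

[k8s_cluster:children]
masters
```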