Set up Kubernetes cluster with Ansible (Part 1)
I recently got my homelab set up with the Proxmox hypervisor and decided to create a Kubernetes cluster. I am using an Intel NUC 13 with 16 cores for my homelab. One issue I quickly faced was how to automate installing the necessary software packages to get Kubernetes up and running. Logging into each individual node is just out of the question. I am a lazy developer lol and automation is my best friend. This is where Ansible comes in handy.
What is Ansible? It is an agentless tool for managing remote nodes over SSH. Note that Ansible comes into play after the infrastructure has already been provisioned with tools like Terraform, Vagrant, or cloud-init. One thing I love about Ansible is that it does not require an agent running on the remote nodes the way Puppet and Chef do.
So how does Ansible work? First, you need an inventory file (inventory.ini) which defines the list of IP addresses or hostnames of the remote machines you want to manage. Here is an example:
[azure_nodes]
testvm1.azure.net
testvm2.azure.net
testvm3.azure.net
[homelab_nodes]
10.0.0.122
10.0.0.123
10.0.0.124
10.0.0.125
You can also put them in groups like this:
[azure_control_plane]
testvm1.azure.net
[azure_worker_nodes]
testvm2.azure.net
testvm3.azure.net
[azure_nodes:children]
azure_control_plane
azure_worker_nodes
...
Verify Ansible is able to connect to the nodes by pinging each host with the following command. You can ping a single group rather than all the hosts defined in the file, and use a different user with the -u flag.
ansible -m ping azure_nodes --private-key=~/.ssh/id_azure -u dev -i inventory.ini
Next you need a playbook. No, not that playbook that NFL teams use. A playbook is where you define the list of commands to run, in what order, and on what machines. It is declarative, meaning it describes the final state of the machine, as opposed to imperative Bash or PowerShell scripts where you provide step-by-step instructions on how to get to that state. I won’t dive too deep into Ansible as there is a lot of information already available on the internet. It follows a simple structure where you define a host and a list of tasks to perform. Each task runs a module, which is Ansible’s way of providing a cross-platform, declarative wrapper around common operations. You can still opt to run a raw shell command using the shell module, but if Ansible already provides a module for what you’re doing, I would encourage you to use it.
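For example, here is the same package install written imperatively with the shell module and declaratively with the apt module (a quick illustrative sketch, not part of the playbook below):

- name: Install git the imperative way
  shell: apt-get update && apt-get install -y git

- name: Install git the declarative way
  apt:
    name: git
    state: present
    update_cache: yes

The second version is idempotent: Ansible checks whether git is already installed and only reports a change when it actually did something.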
Let’s start with a playbook that has to run on all nodes. We’ll call it all-playbook.yml. My playbook has been tested on Debian 12 (Bookworm), but I’m sure it will run on any system using apt for package management. The first thing I do is make sure the remote machines are running Debian 12 before installing the required packages for Kubernetes.
---
- hosts: azure_nodes
  become: true
  tasks:
    - name: Fail if playbook is not running on Debian 12 machines
      fail:
        msg: "This playbook only supports Debian 12 (Bookworm) targets."
      when: ansible_distribution != 'Debian' or ansible_distribution_major_version != "12"

    - name: Install required packages
      apt:
        name: "{{ packages }}"
        state: present
        update_cache: yes
      vars:
        packages:
          - apt-transport-https
          - ca-certificates
          - curl
          - software-properties-common
          - gnupg-agent
          - gnupg2
          - vim
          - ufw
          - git
          - jq
          - build-essential
I want to point out the use of ansible_distribution and ansible_distribution_major_version. These are what Ansible calls facts about the remote machine: data and information about the system that Ansible gathers automatically. You can access them all in the playbook using the ansible_ prefix, or run ansible azure_nodes -m setup -i inventory.ini to get a list of all the system facts.
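If you only want a subset of facts, the setup module also accepts a filter argument with shell-style wildcards, for example:

ansible azure_nodes -m setup -a "filter=ansible_distribution*" -i inventory.ini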
Next, disable swap and load the overlay and br_netfilter kernel modules:
- name: Disable swap on all nodes
  command: swapoff -a

- name: Remove swap from /etc/fstab
  command: sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

- name: Set kernel parameters on all nodes
  blockinfile:
    create: true
    path: /etc/modules-load.d/k8s.conf
    block: |
      overlay
      br_netfilter

- name: Load kernel modules
  command: modprobe {{ item }}
  with_items:
    - overlay
    - br_netfilter

- name: Sysctl parameters required for Kubernetes networking
  lineinfile:
    create: true
    path: /etc/sysctl.d/k8s.conf
    line: "{{ item }}"
  with_items:
    - 'net.bridge.bridge-nf-call-iptables = 1'
    - 'net.bridge.bridge-nf-call-ip6tables = 1'
    - 'net.ipv4.ip_forward = 1'

- name: Apply sysctl parameters
  command: sysctl --system
Notice that what the blockinfile and lineinfile modules do here could also have been achieved with a shell command using cat or something similar. Using the built-in modules makes the YAML more readable in my opinion.
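For comparison, here is roughly what the sysctl task would look like as an imperative shell task (a sketch of the equivalent, not part of the playbook):

- name: Sysctl parameters required for Kubernetes networking (shell version)
  shell: |
    cat <<EOF > /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward = 1
    EOF

It works, but it rewrites the file and reports a change on every run, whereas lineinfile only touches lines that are missing.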
Now on to Docker and Kubernetes. The steps are similar: we first need to add the apt repositories and official GPG keys before grabbing the packages.
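One gotcha: the repository definition below expects Docker's GPG key at /etc/apt/keyrings/docker.asc, but the snippet itself never downloads it. If you don't already fetch that key elsewhere, a task like this (a sketch using Docker's official key URL for Debian) can go right before the repository task:

- name: Add Docker's official GPG key
  get_url:
    url: https://download.docker.com/linux/debian/gpg
    dest: /etc/apt/keyrings/docker.asc
    mode: '0644'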
- name: Add Docker repository
  apt_repository:
    repo: "deb [arch={{ 'amd64' if ansible_architecture == 'x86_64' else 'arm64' }} signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian {{ ansible_distribution_release }} stable"
    filename: docker
    update_cache: yes

- name: Install Docker and Containerd
  apt:
    name: "{{ packages }}"
    update_cache: yes
  vars:
    packages:
      - docker-ce
      - docker-ce-cli
      - containerd.io
      - docker-buildx-plugin
      - docker-compose-plugin

- name: Add Kubernetes official GPG key
  get_url:
    url: https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key
    dest: /etc/apt/keyrings/kubernetes-apt-keyring.asc
    mode: '0644'

- name: Add Kubernetes repository
  apt_repository:
    repo: "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.asc] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /"
    filename: kubernetes
    update_cache: yes

- name: Install Kubernetes packages
  apt:
    name: "{{ packages }}"
    update_cache: yes
  vars:
    packages:
      - kubelet=1.29.3-1.1
      - kubeadm=1.29.3-1.1
      - kubectl=1.29.3-1.1
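Because the versions are pinned, it may also be worth holding the packages so a routine apt upgrade doesn't move them. An optional task like this (not in the original playbook) would do it:

- name: Hold Kubernetes packages at the pinned version
  dpkg_selections:
    name: "{{ item }}"
    selection: hold
  with_items:
    - kubelet
    - kubeadm
    - kubectl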
We just have two more steps left. Bear with me. Kubernetes v1.29 needs a Container Network Interface (CNI) plugin for cluster networking. The CNI plugin is what actually implements the Kubernetes network model, which imposes the following rules straight from the documentation:
- Pods can communicate with all other pods on any other node without NAT
- Agents on a node (e.g. system daemons, kubelet) can communicate with all pods on that node
- name: Install Kubernetes CNI plugins
  unarchive:
    src: "https://github.com/containernetworking/plugins/releases/download/v1.4.0/cni-plugins-linux-{{ 'amd64' if ansible_architecture == 'x86_64' else 'arm64' }}-v1.4.0.tgz"
    dest: /opt/cni/bin
    remote_src: true
    mode: '0755'
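One caveat: the unarchive module expects the destination directory to already exist. If /opt/cni/bin isn't created by one of the packages installed earlier on your system, a file task placed before the download avoids a failure:

- name: Ensure the CNI bin directory exists
  file:
    path: /opt/cni/bin
    state: directory
    mode: '0755'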
The last step is to make sure containerd uses systemd as the cgroup driver. Cgroups (short for control groups) are a kernel feature that lets admins impose limits on the resources allocated to a process. It is more complicated than that and I honestly don’t know the intricate details. Starting with Kubernetes v1.22, kubeadm defaults the kubelet on the nodes to the systemd cgroup driver if it is not explicitly set. We just need to make sure our container runtime uses systemd too, so they both have the same view of the resources in the system, and then restart containerd.
- name: Use systemd Cgroup driver for containerd
  shell: |
    containerd config default > /etc/containerd/config.toml
    sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml

- name: Restart containerd
  systemd:
    name: containerd
    state: restarted
    enabled: yes
    daemon_reload: yes
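To sanity-check that the change took, a quick ad-hoc command (assuming the same inventory and SSH settings as before) works well:

ansible azure_nodes -m shell -a "grep SystemdCgroup /etc/containerd/config.toml" --private-key=~/.ssh/id_azure -u dev -i inventory.ini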
We can finally run our playbook with the following command:
ansible-playbook --private-key=~/.ssh/id_azure -u dev -i inventory.ini -f 7 all-playbook.yml
Note the -f 7 parameter. This controls the level of parallelism, or the number of forks, that Ansible uses; the default is 5. If you have the compute power and many nodes, you can experiment with this number and set it higher.
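If you'd rather not pass it on every run, the fork count can also be set once in ansible.cfg:

[defaults]
forks = 10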
I want to keep this short (well, kind of), so in the next article let’s set up the control plane and the worker nodes.
Next article in series: Set up Kubernetes cluster with Ansible (Part 2)