Overview

We will create a Talos Linux Kubernetes cluster running on Proxmox using Terraform infrastructure as code (IaC).

Prerequisites

You will need:

- A Proxmox VE server reachable over HTTPS
- Terraform installed locally
- kubectl and talosctl installed, for testing the cluster
- helm, if you want to install the optional KubeRay operator at the end

Create a Terraform Directory

mkdir k8s-tf-example
cd k8s-tf-example
touch main.tf

Add Proxmox and Talos providers

Edit main.tf and add the following, changing the endpoint to match your Proxmox URL.

terraform {
  required_providers {
    proxmox = {
      source = "bpg/proxmox"
      version = "~> 0.68.0"
    }
    talos = {
      source = "siderolabs/talos"
      version = "~> 0.6.1"
    }
  }
}

provider "proxmox" {
  endpoint = "https://192.168.1.21:8006/"
  # Skips TLS verification; acceptable for a lab with a self-signed certificate.
  insecure = true
}

Set Proxmox Secret Environment Variables

Use your Proxmox root password. If you prefer, create a dedicated user instead; as long as it has sufficient permissions, use that username here as well.

export PROXMOX_VE_USERNAME="root@pam"
export PROXMOX_VE_PASSWORD="super-secret"
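
If you would rather not export your password, the bpg/proxmox provider also accepts an API token through the PROXMOX_VE_API_TOKEN environment variable. The token name and secret below are placeholders; create a real token under Datacenter > Permissions > API Tokens in the Proxmox UI.

export PROXMOX_VE_API_TOKEN="root@pam!terraform=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"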

Create Talos Cluster Module

This uses talos, an open source Terraform module developed by BB Tech Systems. If you encounter issues or have feature requests, please open an issue on the GitHub repo.

Append the module and output blocks below to main.tf. Change the module inputs to match your needs; in particular, change the control_nodes and worker_nodes maps to match your desired VM names (keys) and Proxmox node names (values).

module "talos" {
  source  = "bbtechsys/talos/proxmox"
  version = "0.1.2"

  talos_cluster_name = "test-cluster"
  talos_version      = "1.8.3"

  control_nodes = {
    "test-control-0" = "pve1"
    "test-control-1" = "pve1"
    "test-control-2" = "pve1"
  }
  worker_nodes = {
    "test-worker-0" = "pve1"
    "test-worker-1" = "pve1"
    "test-worker-2" = "pve1"
  }
}

output "talos_config" {
    description = "Talos configuration file"
    value       = module.talos.talos_config
    sensitive   = true
}

output "kubeconfig" {
    description = "Kubeconfig file"
    value       = module.talos.kubeconfig
    sensitive   = true
}

Terraform Init

terraform init

Terraform Apply

terraform apply
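
Note: if you want to preview the changes without applying them, terraform plan prints the same execution plan that apply shows before prompting for confirmation:

terraform plan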

Download the Client Configurations and Test

terraform output -raw kubeconfig > kubeconfig
terraform output -raw talos_config > talos_config.yaml
export KUBECONFIG=$(pwd)/kubeconfig
export TALOSCONFIG=$(pwd)/talos_config.yaml

If you wish to make these configurations permanent, run the following (note that this overwrites any existing ~/.talos/config or ~/.kube/config):

mkdir -p ~/.talos
cp talos_config.yaml ~/.talos/config
mkdir -p ~/.kube
cp kubeconfig ~/.kube/config

Test your config.

kubectl get nodes

You should see something like:

NAME            STATUS   ROLES           AGE     VERSION
talos-4vt-cmm   Ready    control-plane   5m21s   v1.31.1
talos-dxq-l3o   Ready    <none>          5m24s   v1.31.1
talos-exw-zz9   Ready    <none>          5m15s   v1.31.1
talos-g1d-91g   Ready    <none>          5m21s   v1.31.1
talos-s1z-dgw   Ready    control-plane   5m14s   v1.31.1

Next, verify that talosctl can reach the nodes:

talosctl containers

You should see something like:

NODE            NAMESPACE   ID                     IMAGE   PID    STATUS
192.168.1.210   system      apid                           2122   RUNNING
192.168.1.210   system      ext-qemu-guest-agent           1938   RUNNING
192.168.1.211   system      apid                           2088   RUNNING
192.168.1.211   system      ext-qemu-guest-agent           1934   RUNNING
192.168.1.209   system      apid                           2094   RUNNING
192.168.1.209   system      ext-qemu-guest-agent           1939   RUNNING
192.168.1.213   system      apid                           2097   RUNNING
192.168.1.213   system      ext-qemu-guest-agent           1936   RUNNING
192.168.1.213   system      trustd                         2140   RUNNING
192.168.1.208   system      apid                           2098   RUNNING
192.168.1.208   system      ext-qemu-guest-agent           1940   RUNNING
192.168.1.208   system      trustd                         2146   RUNNING
192.168.1.212   system      apid                           2096   RUNNING
192.168.1.212   system      ext-qemu-guest-agent           1935   RUNNING
192.168.1.212   system      trustd                         2144   RUNNING
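
As a further check, Talos ships built-in cluster health checks. This assumes the generated talos_config.yaml has endpoints and nodes populated; if it does not, pass them explicitly with the -e and -n flags:

talosctl health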

Install KubeRay Operator (optional, will be used in future blog posts)

In future blog posts, we will explore running Ray for machine learning, distributed SQL using DataFusion Ray, and reinforcement learning using Ray RLlib. The steps below install KubeRay, a Kubernetes operator that makes it possible to run Ray on Kubernetes.

From the KubeRay RayCluster quick start:

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update

# Install both CRDs and KubeRay operator v1.2.2.
helm install kuberay-operator kuberay/kuberay-operator --version 1.2.2

# Confirm that the operator is running in the namespace `default`.
kubectl get pods
# NAME                                READY   STATUS    RESTARTS   AGE
# kuberay-operator-7fbdbf8c89-pt8bk   1/1     Running   0          27s
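
To sanity-check the operator end to end, the same quick start deploys a small sample cluster with the ray-cluster Helm chart. The release name and chart version below mirror the operator install above but are otherwise illustrative:

helm install raycluster kuberay/ray-cluster --version 1.2.2

# The operator should reconcile the new RayCluster into head and worker pods.
kubectl get rayclusters
kubectl get pods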

Destroy (if desired)

WARNING - This will delete everything we just created. Use this only if you want to start over.

terraform destroy