Efficiently scaling RKE with Terraform

In the previous blog post, we set up a quick-start environment with one of everything. One of everything is excellent for a quick test or check, but you might want to up those rookie numbers. In this blog post, we'll make the setup scale easily.

If you haven't destroyed your previous setup, you should. If you can't or don't want to, create a new, clean account instead. It's essential that your infrastructure can be reproduced from the ground up at every change, and that's why we start from the beginning.

Creating multiple instances

First, let's move the instance definition to its own file called local_instances.tf. I've also changed the resource name to "local_nodes".

Now let's scale this instance up to two. We start with two and will scale up to three at the end, to see what happens while the cluster is running. Scaling up is as easy as adding count = 2 to the instance configuration. However, a few more changes are needed in other parts of the Terraform config to make everything work with the scaled instances.

After adding count = 2 to the instance config, I've also changed name = "test" to name = "local-node${count.index + 1}", so each node gets a recognizable name in the CloudStack UI.

The complete local_instances.tf file now looks like this:

resource "cloudstack_instance" "local_nodes" {
  count              = 2
  name               = "local-node${count.index + 1}"
  service_offering   = "VM 4G/4C"
  network_id         = "g56cf51f-93ab-2351-a222-9c9525dc8533"
  template           = "Ubuntu 20.04"
  zone               = "zone.ams.net"
  keypair            = cloudstack_ssh_keypair.testkey.id
  expunge            = true
  security_group_ids = [cloudstack_security_group.Default-SG.id]
  root_disk_size     = 20

  connection {
    type        = "ssh"
    user        = "root"
    private_key = file("../test_rsa")
    host        = self.ip_address
  }
  
  provisioner "remote-exec" {
    inline  = ["curl https://releases.rancher.com/install-docker/20.10.sh | sh"]
  }
}
New local_instances.tf file

Now is the time to set up the DNS of the domain you'll use for Rancher.

Dynamic RKE nodes

To let RKE know it needs to install multiple nodes instead of one, we need to expose the IP address of every node. To automate this, we'll use a splat expression ([*]).

The outputs block is now changed to the following:

output "ip_address" {
  value = cloudstack_instance.local_nodes[*].ip_address
}
Output data

Now apply this config first. If you hadn't removed the old node already, Terraform will remove it now, and it will create the two new nodes.

To make RKE dynamically add nodes based on the cloud config's output data, we need a dynamic block that iterates over the exposed IP addresses.
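
That output data reaches the RKE config through a terraform_remote_state data source. If you don't already have one from the previous post, a minimal sketch could look like this, assuming a local backend and a cloud config living in a ../cloud directory (both assumptions, adjust to your own layout):

# Sketch of the remote state data source used by the dynamic block below.
# The local backend and the ../cloud path are assumptions; point it at your cloud config's state.
data "terraform_remote_state" "cloud" {
  backend = "local"

  config = {
    path = "../cloud/terraform.tfstate"
  }
}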

resource "rke_cluster" "cluster_local" {
  dynamic "nodes" {
    for_each = data.terraform_remote_state.cloud.outputs.ip_address
    content {
      address = nodes.value
      user    = "root"
      role    = ["controlplane", "worker", "etcd"]
      ssh_key = file("../test_rsa")
    }
  }
}

The word "nodes" on the second line indicates the name of the block, but also the iterator name. The naming can be confusing, but the dynamic block label should match the wanted block's name. You can change the iterator name by adding iterator = "anothername" before content.

Before you run terraform plan, be sure to delete the terraform.tfstate file so you start over. The removal is necessary because the old RKE cluster doesn't exist anymore. Alternatively, you can remove just the old cluster from the state with terraform state rm.
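
Assuming the old cluster used the same resource address as the config above, that command would look like this:

terraform state rm rke_cluster.cluster_local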

When you run terraform plan, you'll see that the nodes blocks are created dynamically.

      + nodes {
          + address        = "1.2.3.4"
          + role           = [
              + "controlplane",
              + "worker",
              + "etcd",
            ]
          + ssh_agent_auth = (known after apply)
          + ssh_key        = (sensitive value)
          + user           = (sensitive value)
        }
      + nodes {
          + address        = "1.2.3.5"
          + role           = [
              + "controlplane",
              + "worker",
              + "etcd",
            ]
          + ssh_agent_auth = (known after apply)
          + ssh_key        = (sensitive value)
          + user           = (sensitive value)
        }
    }

Before we apply this config, we need to open up the firewall between them.

Security groups

To allow communication between the RKE nodes, we need to open up the firewall both between the nodes themselves and towards the world. I've created the following security group rules based on this page from Rancher.

resource "cloudstack_security_group_rule" "Default-SG-RKEs-Ruleset" {
  security_group_id = cloudstack_security_group.Default-SG.id

  rule {
    cidr_list = [for s in cloudstack_instance.local_nodes : format("%s/32", s.ip_address)]
    protocol  = "tcp"
    ports     = ["2379", "2380", "10250", "6443"]
  }

  rule {
    cidr_list = [for s in cloudstack_instance.local_nodes : format("%s/32", s.ip_address)]
    protocol  = "udp"
    ports     = ["8472"]
  }
}

I've also added the NodePort range 30000-32767 to Default-SG-Home-Ruleset.
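
For reference, a sketch of what that extra rule could look like. The Default-SG-Home-Ruleset resource comes from the previous post, so keep the rules you already have in it; the CIDR below is only a placeholder for your own address:

# Hypothetical sketch: exposing the Kubernetes NodePort range towards your own network.
# The existing rules in Default-SG-Home-Ruleset stay in place; only this entry is new.
resource "cloudstack_security_group_rule" "Default-SG-Home-Ruleset" {
  security_group_id = cloudstack_security_group.Default-SG.id

  rule {
    cidr_list = ["198.51.100.1/32"] # placeholder for your own IP
    protocol  = "tcp"
    ports     = ["30000-32767"]     # Kubernetes NodePort range
  }
}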

Let's apply the cloud configuration now. Only the new security_group_rule will be created.

Scaling RKE

Now that the firewall is set up, you can run terraform apply. Once the two-node RKE cluster is up, check whether it functions correctly using the generated kubeconfig.yaml file.

marco@DESKTOP-WS:~/terra/rke$ export KUBECONFIG="./kubeconfig.yaml"
marco@DESKTOP-WS:~/terra/rke$ kubectl get nodes
NAME            STATUS   ROLES                      AGE   VERSION
1.2.3.4         Ready    controlplane,etcd,worker   46m   v1.21.7
1.2.3.5         Ready    controlplane,etcd,worker   46m   v1.21.7

Now let's check whether everything works as expected when we bump count = 2 to count = 3! First, move back to the cloud config, then up the count and run apply again.

Plan: 1 to add, 1 to change, 0 to destroy.

Changes to Outputs:
  ~ ip_address = [
        # (1 unchanged element hidden)
        "1.2.3.5",
      + (known after apply),
    ]

As expected, one extra instance will be created. We also see that Terraform modifies the existing security group rule, which could briefly disrupt traffic in a high-traffic production environment. Using dynamic block configuration might avoid this, as sketched below.
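
A possible (untested) variation generates one rule per node with dynamic blocks, so that adding a node appends new rule entries instead of editing the shared cidr_list:

# Hypothetical variation of Default-SG-RKEs-Ruleset: one rule entry per node.
resource "cloudstack_security_group_rule" "Default-SG-RKEs-Ruleset" {
  security_group_id = cloudstack_security_group.Default-SG.id

  # One TCP rule per node instead of one shared cidr_list for all nodes.
  dynamic "rule" {
    for_each = cloudstack_instance.local_nodes[*].ip_address
    content {
      cidr_list = [format("%s/32", rule.value)]
      protocol  = "tcp"
      ports     = ["2379", "2380", "10250", "6443"]
    }
  }

  # Same idea for the UDP (flannel VXLAN) rule.
  dynamic "rule" {
    for_each = cloudstack_instance.local_nodes[*].ip_address
    content {
      cidr_list = [format("%s/32", rule.value)]
      protocol  = "udp"
      ports     = ["8472"]
    }
  }
}

For now, the single shared rule works fine, so let's continue.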

Run the apply command, watch the extra node, local-node3, being created, and then move back to the RKE directory.

Applying the RKE config looks slightly off, and I think this is caused by the changing order of the IP addresses coming from the cloud config's output. We could change the output into a map, which would also come in handy for setting the node_name, which is currently missing.
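
A rough sketch of that idea, keeping the output name ip_address; whether the nodes block accepts exactly node_name should be verified against the RKE provider documentation:

# In the cloud config: expose the addresses as a map keyed by instance name,
# so the ordering of the list no longer matters.
output "ip_address" {
  value = { for node in cloudstack_instance.local_nodes : node.name => node.ip_address }
}

# In the RKE config: nodes.key now carries the instance name, nodes.value the IP address.
# node_name is the argument mentioned above; check the exact name in the provider docs.
resource "rke_cluster" "cluster_local" {
  dynamic "nodes" {
    for_each = data.terraform_remote_state.cloud.outputs.ip_address
    content {
      address   = nodes.value
      node_name = nodes.key
      user      = "root"
      role      = ["controlplane", "worker", "etcd"]
      ssh_key   = file("../test_rsa")
    }
  }
}

For now, though, let's continue with the list-based output and finish the scale-up.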

After 3 minutes and 35 seconds, the cluster is expanded. Let's check the node ages.

local_sensitive_file.kube_config_yaml: Creating...
local_sensitive_file.kube_config_yaml: Creation complete after 0s [id=f5e0de88e06ce7c347247247f69d69a1268830732]

Apply complete! Resources: 1 added, 1 changed, 1 destroyed.
marco@DESKTOP-WS:~/tests/rke$ kubectl get nodes
NAME            STATUS   ROLES                      AGE    VERSION
1.2.3.4         Ready    controlplane,etcd,worker   60m    v1.21.7
1.2.3.5         Ready    controlplane,etcd,worker   60m    v1.21.7
1.2.3.6         Ready    controlplane,etcd,worker   103s   v1.21.7
marco@DESKTOP-WS:~/tests/rke$ kubectl get pods -n ingress-nginx
NAME                             READY   STATUS    RESTARTS   AGE
nginx-ingress-controller-88pf4   1/1     Running   0          62m
nginx-ingress-controller-h2glc   1/1     Running   0          4m35s
nginx-ingress-controller-srkwg   1/1     Running   0          62m

Conclusion

The cluster scaled up efficiently and without downtime for the running pods. I would make a couple more changes to the configuration before taking it to production. For example, you could:

  • Move the configuration to modules and include the three directories from there
  • Keep your state files in an S3 bucket (a sketch follows below)
  • Move the variables to a separate tfvars file, making the config setup-agnostic
  • Map the IP addresses to their corresponding node names
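
As an example of the state file point, a minimal sketch of an S3 backend block; the bucket, key and region are placeholders:

# Hypothetical sketch: keep the state in an S3 bucket instead of local tfstate files.
terraform {
  backend "s3" {
    bucket = "my-terraform-states"   # placeholder bucket name
    key    = "rke/terraform.tfstate" # placeholder state path
    region = "eu-west-1"             # placeholder region
  }
}

The terraform_remote_state data source in the RKE config would then use backend = "s3" with matching settings instead of a local path.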