From Terraform monolith to modules

Moving config from a large monolith config to multiple modules. I'm looking into taking run order control and errors I ran into.

In the last two posts, I've explored Terraform and its terraforming capabilities in enabling quick, repeatable environments in the cloud. This time I want to take the three previous subdirectories and put an over-arcing Terraform configuration which will use the other directories as modules. The config will probably allow me to easily manage their dependencies on each other.

Let's first start by getting a main.tf that tries to run them from the parent directory.

module "nodes" {
  source = "./nodes"
}

module "rke" {
  source = "./rke"
}

module "rancher" {
  source = "./rancher"
}
The over-arcing main Terraform configuration

Just run terraform init and terraform plan and you'll notice it won't work immediately.

Error: Unable to find remote state
 
   with module.rke.data.terraform_remote_state.nodes,
   on rke/main.tf line 19, in data "terraform_remote_state" "nodes":
   19: data "terraform_remote_state" "nodes" {
 
 No stored state was found for the given workspace in the given backend.
Error because of missing dependencies

We first need the cloud machines to run and generate the relevant configs before we can plan further. In this case, we're hitting a dependency order problem.

First, let's clean things up in the subdirectories. I just removed all Terraform generated stuff like this:

rm -rf terraform.tfstate* .terraform*
 
# only for RKE
rm -rf kubeconfig.yaml rke_debug.log 
Remove unnecessary files

Be sure that you've deleted all infrastructure. It's easier when you've created a separate account.

Next, I've added the depends_on argument, and this resulted in another error:

 Error: Module module.rancher contains provider configuration
 
 Providers cannot be configured within modules using count, for_each or depends_on.


 Error: Module module.rke contains provider configuration
 
 Providers cannot be configured within modules using count, for_each or depends_on.
Error after depends_on

It wasn't a good idea to have the providers defined in the modules anyway. Let's rip them out and give them a nice place of their own.

I moved all provider configs into their own file, which caused multiple issues, which I will write below.

If you get the following error:

Error: Failed to query available provider packages

Could not retrieve the list of available versions for provider hashicorp/rke: provider registry registry.terraform.io does not have a provider named registry.terraform.io/hashicorp/rke

Did you intend to use rancher/rke? If so, you must specify that source address in each module which requires that provider. To see which modules are currently depending on hashicorp/rke, run the following
command:
    terraform providers

You should add the required_providers to the modules themselves too. Just don't add the provider block.

The first plan immediately showed the following issue:

Error: Provider configuration not present

To work with module.rancher.rancher2_bootstrap.admin its original provider configuration at module.rancher.provider["registry.terraform.io/rancher/rancher2"].bootstrap is required, but it has been removed.

This occurs when a provider configuration is removed while objects created by that provider still exist in the state. Re-add the provider configuration to destroy module.rancher.rancher2_bootstrap.admin,
after which you can remove the provider configuration again.

To fix this, make the module block of Rancher look like this:

module "rancher" {
  source = "./modules/rancher"
  depends_on = [
    module.rke
  ]
  providers = {
    rancher2.bootstrap = rancher2.bootstrap
    rancher2.admin = rancher2.admin    
  }
}

And in the Rancher main.tf:

terraform {
  required_providers {
    local = {
      source  = "hashicorp/local"
      version = "2.2.2"
    }
    rancher2 = {
      source  = "rancher/rancher2"
      version = "1.22.2"
      configuration_aliases = [ rancher2.bootstrap, rancher2.admin ]
    }
    helm = {
      source  = "hashicorp/helm"
      version = "2.4.1"
    }
  }
}

Now let's run terraform apply for the fourth time and have my fingers crossed.

It didn't work, and the RKE module stopped when it couldn't read the remote state of "cloud", which shouldn't be needed anymore.

The error:

No stored state was found for the given workspace in the given backend.

I've changed the following lines:

# In file rke.tf
 - for_each = data.terraform_remote_state.cloud.outputs.ip_address
 + for_each = var.rke_cluster_ips

# RKE module in file 00-main.tf
module "rke" {
  source = "./modules/rke"
  depends_on = [
    module.cloud
  ]

  rke_cluster_ips = module.cloud.ip_address
}

Running apply again gave me some more relative directory issues, which were easily fixed by adding ${path.module}/ or just removing the ../

The final directory structure is this:

.
├── modules
│   ├── cloud
│   │   ├── local_instances.tf
│   │   ├── main.tf
│   │   ├── provision-docker.sh
│   │   └── security_groups.tf
│   ├── rancher
│   │   ├── certmanager.tf
│   │   ├── main.tf
│   │   └── rancher.tf
│   └── rke
│       ├── main.tf
│       └── rke.tf
├── 00-main.tf
├── 01-cloud.tf
├── 02-rke.tf
├── 03-rancher-bootstrap.tf
├── rke_debug.log
├── terraform.tfstate
├── terraform.tfstate.backup
├── test_rsa
└── test_rsa.pub

4 directories, 18 files

Conclusion

With terraform apply running, we've squashed three runs into one single run. We can hit deploy and get something to drink, pet the cat, and return to an entirely freshly provisioned environment.