Creating a golden image with Packer

Packer is a tool that automates building machine images. I've started looking into Packer to generate a golden image. Using a golden image lets me quickly set up a fresh development environment without keeping an EC2 instance around for long periods or worrying about configuration drift. Fixing an issue can be as easy as recreating the EC2 instance. It also makes it easy to share your environment with others and to experiment with different setups.

The GitHub repo for this blog article can be found here.

Getting started

Building machine images for AWS by hand is time-consuming and error-prone. Packer automates the whole process, which comes down to a handful of steps:

  1. Booting an existing AMI image
  2. Running some commands over SSH
  3. Shutting it down
  4. Taking a snapshot
  5. Creating an AMI from the snapshot
  6. Cleaning up

Each step is simple on its own; Packer's value is in running them all for you, the same way every time. You can also use Packer to build for multiple platforms and architectures, which helps when you run on both AWS Graviton (Arm) instances like the t4g and x86 instances like the t3 or t3a.
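As a rough sketch of the multi-architecture idea, one build block can fan out over one source per architecture. Everything specific here is an assumption for illustration and not taken from the repository: the source names, the Ubuntu 22.04 AMI name filters, the instance sizes, and the AMI naming scheme. Credentials are expected to come from the default AWS credential chain, and the usual required_plugins block is left out for brevity.

```hcl
locals {
  # Computed in one place so both AMI names use the same expression
  # (hypothetical naming scheme).
  stamp = formatdate("YYYY-MM-DD-hh-mm-ss", timestamp())
}

# x86_64 image, built on a t3 instance.
source "amazon-ebs" "ubuntu_amd64" {
  region        = "eu-central-1"
  instance_type = "t3.micro"
  ssh_username  = "ubuntu"
  ami_name      = "golden-amd64-${local.stamp}"

  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["099720109477"] # Canonical's AWS account
  }
}

# arm64 image, built on a Graviton-based t4g instance.
source "amazon-ebs" "ubuntu_arm64" {
  region        = "eu-central-1"
  instance_type = "t4g.micro"
  ssh_username  = "ubuntu"
  ami_name      = "golden-arm64-${local.stamp}"

  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-arm64-server-*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["099720109477"] # Canonical's AWS account
  }
}

# Both sources run through the same build (and the same provisioners, if any).
build {
  sources = [
    "source.amazon-ebs.ubuntu_amd64",
    "source.amazon-ebs.ubuntu_arm64",
  ]
}
```

The source instance has to match the architecture of the base AMI, which is why each source pairs its own instance type with its own name filter.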

I've gone to some lengths to create a separate role for Packer in AWS IAM. You can easily strip it out if you'd rather use your almighty admin user.

Installing Packer

As usual with modern commercial tools, the installation is straightforward. On Debian or Ubuntu:

wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor | sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install packer

You can test Packer by running the following: packer version

IAM policy and role for Packer in AWS

If you don't want to create IAM policies and would rather use your existing admin user, you can skip this part. Using an IAM role is particularly effective when Packer runs from an EC2 instance. I used an IAM role rather than the admin user for security reasons. When Packer runs from a local machine, it assumes the IAM role we've created, which has limited permissions. This limits the damage if the AWS credentials are ever accidentally exposed.

While using an IAM group might seem more straightforward, it has some drawbacks. We'll need an IAM role if we eventually want to use EC2 to build EC2 images. Additionally, using an IAM role is more secure, as it allows us to limit the permissions of Packer to only what it needs. Packer also provides a chroot builder, which uses a continuously running machine. It should be faster and able to leverage the IAM role we're creating. This is out of the scope of this blog post, however.

We can find the needed IAM policy here. It grants a lot of privileges. I've read this article by Stefan Koch about reducing the privileges required but did not implement it here. Skipping Stefan's approach is quicker but violates the principle of least privilege. Imagine a pipeline running with these privileges and accidentally exposing its credentials. That's acceptable for development but not for production. Also, consider dedicating a specific AWS account to building AMIs if you have many of them.

Using Terraform to create IAM resources

We can easily manage and clean up the privileges handed out by leveraging infrastructure as code to create IAM resources. It also allows us to collaborate with others by using git, giving us version control. You can find the resulting Terraform config in my GitHub repository.

To gain the privileges needed to build images with Packer, we need a couple of Terraform IAM resources:

Dependency tree based on Terraform graph beautifier

According to Terraform:

The recommended approach to building AWS IAM policy documents within Terraform is the highly customizable aws_iam_policy_document data source.
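To give an idea of what that looks like, here is a minimal sketch built around the aws_iam_policy_document data source. It is not the configuration from the repository: the resource labels and the policy name are hypothetical, the user and role names are borrowed from the values mentioned elsewhere in this article, and the action list is deliberately abbreviated (the full list is in the policy linked earlier).

```hcl
# The user Packer authenticates as (name assumed from this article).
resource "aws_iam_user" "packer" {
  name = "packer"
}

# Trust policy: only the packer user may assume the role.
data "aws_iam_policy_document" "packer_assume" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "AWS"
      identifiers = [aws_iam_user.packer.arn]
    }
  }
}

# Permissions policy: abbreviated subset of the EC2 actions Packer needs.
data "aws_iam_policy_document" "packer_build" {
  statement {
    effect = "Allow"
    actions = [
      "ec2:RunInstances",
      "ec2:StopInstances",
      "ec2:TerminateInstances",
      "ec2:CreateImage",
      "ec2:CreateSnapshot",
      "ec2:CreateTags",
      "ec2:DescribeImages",
      "ec2:DescribeInstances",
    ]
    resources = ["*"]
  }
}

resource "aws_iam_role" "packer" {
  name               = "packer_role"
  assume_role_policy = data.aws_iam_policy_document.packer_assume.json
}

resource "aws_iam_policy" "packer_build" {
  name   = "packer_build" # hypothetical policy name
  policy = data.aws_iam_policy_document.packer_build.json
}

# Glue: attach the permissions policy to the role.
resource "aws_iam_role_policy_attachment" "packer_build" {
  role       = aws_iam_role.packer.name
  policy_arn = aws_iam_policy.packer_build.arn
}
```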

That is a lot of resource blocks for just one IAM role. I created my current IAM policies by hand while playing with them, but now I have to clean them up.

To confirm deletion, enter the policy name in the text input field.

After cleaning up, I ran the Terraform module that creates the resources visualized above. It also writes a file named packer.pkrvar.hcl. I now have a packer user that can create AMIs for me, controlled and managed by a Terraform config written as code.
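One way Terraform could write such a variables file is sketched below. This is an assumption about the mechanism, not a copy of the module: it uses the local_sensitive_file resource from the hashicorp/local provider, and the input variable names and resource labels are hypothetical.

```hcl
# Hypothetical inputs: the IAM user and role are assumed to exist already.
variable "packer_user_name" {
  type = string
}

variable "packer_role_arn" {
  type = string
}

variable "packer_region" {
  type    = string
  default = "eu-central-1"
}

# Access key for the Packer user.
resource "aws_iam_access_key" "packer" {
  user = var.packer_user_name
}

# Render the Packer variables file next to the Terraform module.
resource "local_sensitive_file" "packer_vars" {
  filename        = "${path.module}/packer.pkrvar.hcl"
  file_permission = "0600"

  content = <<-EOT
    packer_access_key = "${aws_iam_access_key.packer.id}"
    packer_secret_key = "${aws_iam_access_key.packer.secret}"
    packer_region     = "${var.packer_region}"
    packer_role_arn   = "${var.packer_role_arn}"
  EOT
}
```

Keep in mind that with this approach the secret key is stored both in the Terraform state and in the generated file, so neither should be committed to git.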

Preparing and validating Packer

We can now create the Packer build file. I called it aws-k3s.pkr.hcl, and you can find it in the GitHub repository.

The variables are read from the file Terraform just created (packer.pkrvar.hcl). If you skipped that part, you need to fill it in yourself like this:

packer_access_key = "my-access-key"
packer_secret_key = "my-secret-key"
packer_region     = "eu-central-1"
packer_role_arn   = "arn:aws:iam::123456789012:role/packer_role"

packer.pkrvar.hcl

The main Packer file refers to these variables in the source "amazon-ebs" "ubuntu" block. This block selects one of the builders, which are listed under the plugins section of the documentation. We're using the Amazon EC2 EBS builder for now.
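A fragment along these lines shows how the four variables could flow into the source block. It is a sketch, not the file from the repository: the AMI name pattern is inferred from the build log in this article, while the instance type, the session name, and the Ubuntu 22.04 filter are assumptions.

```hcl
variable "packer_access_key" {
  type      = string
  sensitive = true
}

variable "packer_secret_key" {
  type      = string
  sensitive = true
}

variable "packer_region" {
  type = string
}

variable "packer_role_arn" {
  type = string
}

source "amazon-ebs" "ubuntu" {
  # Authenticate as the packer user...
  access_key = var.packer_access_key
  secret_key = var.packer_secret_key
  region     = var.packer_region

  # ...then switch to the limited-privilege role for the actual build.
  assume_role {
    role_arn     = var.packer_role_arn
    session_name = "packer-build" # hypothetical session name
  }

  ami_name      = "ebpf-xdp-dev-${formatdate("YYYY-MM-DD-hh-mm-ss", timestamp())}"
  instance_type = "t3.micro" # assumed size
  ssh_username  = "ubuntu"

  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["099720109477"] # Canonical's AWS account
  }
}
```

Marking the key variables as sensitive keeps their values out of Packer's output.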

Preparing images with provisioners

Most Packer setups use the shell provisioner. With it, Packer runs a script on the temporary source instance over SSH, which differs from the cloud-init we'll later use to initialize an instance running this image.

For Ubuntu, it's essential that the first line of the provisioning script is:

cloud-init status --wait

If you don't wait for cloud-init to finish, files that are still being created by the first-boot scripts may not be available yet, and tools like APT can run into issues as a result. More about this here.

The current configuration only tests the build since the variable skip_create_ami is set to true. This is an important setting when testing, as it doesn't create an actual AMI after it's done. You can immediately set the variable skip_create_ami to false if you're not changing the provisioner.
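Put together, a dry-run template could look roughly like the sketch below. It is not the aws-k3s.pkr.hcl from the repository: the plugin version constraint, AMI name, instance type, base-image filter, and the packages installed are assumptions, and credentials are expected to come from the default AWS credential chain. Only the build name k3s, the source name, and the skip_create_ami variable come from this article.

```hcl
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.2.0" # assumed constraint
    }
  }
}

variable "skip_create_ami" {
  type    = bool
  default = true # dry run: provision, but do not register an AMI
}

source "amazon-ebs" "ubuntu" {
  region          = "eu-central-1"
  instance_type   = "t3.micro"
  ssh_username    = "ubuntu"
  ami_name        = "golden-test-${formatdate("YYYY-MM-DD-hh-mm-ss", timestamp())}"
  skip_create_ami = var.skip_create_ami

  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["099720109477"] # Canonical's AWS account
  }
}

build {
  name    = "k3s"
  sources = ["source.amazon-ebs.ubuntu"]

  provisioner "shell" {
    inline = [
      # Block until cloud-init has finished its first-boot work.
      "cloud-init status --wait",
      # Placeholder provisioning steps.
      "sudo apt-get update",
      "sudo apt-get install -y curl",
    ]
  }
}
```

Because the setting is a variable, a real build should not require editing the file: passing -var skip_create_ami=false to packer build overrides the default for that run.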

You can check the Packer config by running the following:

marco@DESKTOP:~/ebpf-xdp-dev/packer$ packer validate -var-file=packer.pkrvar.hcl aws-k3s.pkr.hcl 
The configuration is valid.
marco@DESKTOP:~/ebpf-xdp-dev/packer$ packer build -var-file=packer.pkrvar.hcl aws-k3s.pkr.hcl 
k3s.amazon-ebs.ubuntu: output will be in this color.

==> k3s.amazon-ebs.ubuntu: Prevalidating any provided VPC information
==> k3s.amazon-ebs.ubuntu: Prevalidating AMI Name: ebpf-xdp-dev-2023-01-30-17-54-28
    k3s.amazon-ebs.ubuntu: Found Image ID: ami-12a3456c325f02ab
==> k3s.amazon-ebs.ubuntu: Creating temporary keypair: packer_12r18272-21d6-f24f-fb33-bc4e2f50e00a
==> k3s.amazon-ebs.ubuntu: Creating temporary security group for this instance: packer_9217wegf9e-b58b-4acc-d18e-2b19ad93bc96
==> k3s.amazon-ebs.ubuntu: Authorizing access to port 22 from [0.0.0.0/0] in the temporary security groups...
==> k3s.amazon-ebs.ubuntu: Launching a source AWS instance...
    k3s.amazon-ebs.ubuntu: Adding tag: "Creator": "Packer"
    k3s.amazon-ebs.ubuntu: Adding tag: "Creator": "Packer"
    k3s.amazon-ebs.ubuntu: Instance ID: i-04db1b92a25d2ede6

If everything is correct, change the variable skip_create_ami to false and rerun the build. This performs a complete run, including AMI creation. Most of the time is spent in the snapshotting phase, as the timetable shows:

00:00-00:00 - Prevalidation
00:00-00:01 - Creating keypair & security group
00:01-00:02 - Launching a source AWS instance
00:02-00:27 - Connected to SSH and running provisioning script
00:27-01:37 - Stopping the source instance
01:37-08:56 - Creating AMI (creating snapshot is included in this step)
08:56-09:13 - Cleaning up

Timetable for Packer run

As you can see, almost 80% of the time is spent waiting on AWS for the AMI to become ready. This is why I recommend setting skip_create_ami to true while testing. Fast test iterations are necessary to reach your goal quickly, and with the chroot builder, the iterations could be even quicker.

Costs

Just like when stopping an EC2 instance, you have to pay a fee for saving EBS snapshots. This snapshot has to exist as long as you're holding on to that AMI. The costs for this are approximately $0.05 per GB per month.

For my single image of 8GB, I'll pay $0.62 per 31-day month.

Remember to set up AMI deprecation and deregistration if you're creating images on a schedule. $0.62 isn't much, but run this daily without cleaning up, and after 31 days you'll be storing 31 images at a rate of $19.22 per month.
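For the deprecation half, the Amazon EBS builder has a deprecate_at setting that can be filled in at build time. The sketch below assumes a 30-day lifetime; that window, the AMI name, the instance type, and the base-image filter are all illustrative choices, not values from the repository.

```hcl
locals {
  build_time = timestamp()
}

source "amazon-ebs" "ubuntu" {
  region        = "eu-central-1"
  instance_type = "t3.micro"
  ssh_username  = "ubuntu"
  ami_name      = "golden-${formatdate("YYYY-MM-DD-hh-mm-ss", local.build_time)}"

  # Mark the AMI as deprecated 30 days (720 hours) after the build started.
  deprecate_at = timeadd(local.build_time, "720h")

  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["099720109477"] # Canonical's AWS account
  }
}
```

Deprecation on its own does not stop the snapshot charges, since the AMI and its snapshot still exist. Deregistering old AMIs and deleting their snapshots needs a separate cleanup step, for example a scheduled job or Amazon Data Lifecycle Manager.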

This is just a little cheaper than holding on to a stopped EC2 instance's EBS volume at $0.0952 per GB per month. However, that volume might be three to four times larger than the image we've created, to prevent disk space issues while running.

Especially when using EBS type io2 instead of gp3, it can be much cheaper to throw the volume away and recreate it when needed, for example, during the auto-scaling of EC2 instances.

Final thoughts on Packer

After exploring Packer and the golden image principle, I am convinced this workflow is a valuable addition to my toolkit.

Using Packer and AMIs can help maintain consistency and improve collaboration in a project, especially when multiple team members are involved.

Another advantage of Packer and AMIs is the ability to quickly restore an instance to its original state in case of issues.

However, it's worth noting that launching a new EC2 instance from an AMI can take longer than starting an existing, stopped instance. In some cases, it may be more efficient to simply stop and start instances. Combining both approaches gives you the best of both worlds.