AWS CDK vs. Terraform
Introduction
We are cloud enthusiasts at Metosin, and we have used or tried quite a few infrastructure as code (IaC) tools in all "big three" cloud platforms (AWS, Azure, and GCP). Kari has written some blog posts regarding these experiences:
- Comparing AWS CloudFormation and Terraform
- Comparing Azure ARM and Terraform
- Comparing GCP Deployment Manager and Terraform
- Terraform vs. Pulumi Experiences
This autumn, Kari got a chance to try the AWS Cloud Development Kit (CDK) since he had to implement a data pipeline using one relatively new AWS service, which still does not have Terraform support. So, we thought we would create the data pipeline using AWS CDK and the rest of the AWS infra using Terraform. This blog post describes our experiences using AWS CDK and how it compares to Terraform, our preferred IaC tool (at least for now).
Terraform
Let's first refresh our memory regarding the old IaC kid on the block, Terraform.
Terraform is an excellent declarative Infrastructure as Code tool. The main advantage of Terraform is that you can use it with all major cloud providers (AWS, Azure, and GCP). Terraform also offers a wide variety of providers. Using these providers, you can create infrastructure for all kinds of services (e.g., the Terraform provider for MasterCard).
The Terraform language, built on HCL, is well suited for expressing the configuration of infrastructure objects as data, in a way that can also be expressed in JSON. The latest stable version is 1.0.9 (as of writing this blog post).
Terraform's power is that it is almost purely declarative and, therefore, an excellent way to declare cloud infrastructures. However, there are times when some procedural logic is needed, and in these situations, a procedural language would be easier to work with.
resource "aws_subnet" "public_subnet" {
  count = var.public-subnet-count
  availability_zone = data.aws_availability_zones.main.names[count.index]
  cidr_block = "10.0.${count.index}.0/24"
  map_public_ip_on_launch = true
  vpc_id = aws_vpc.vpc.id
  tags = {
    Name = "${local.res_prefix}-public-subnet-${count.index}"
    SubnetType = "public"
  }
}
Terraform language example: a subnet configuration.
Kari's AWS Cloud Development Kit Initial Experience
With the AWS Cloud Development Kit, you can use your favorite programming language to define cloud infrastructure (as of writing, the supported languages are TypeScript, JavaScript, Python, Java, and C#; it might be interesting to try Clojure with CDK). Since I use the programming language as scripting glue to create and integrate cloud resources, I chose Python, which, based on my short experience with CDK, turned out to be an excellent fit.
I was astonished at how easy it was to start using AWS CDK. The learning curve was almost non-existent (which you definitely cannot say about Terraform). I'm pretty excited about the new IoT project, and I didn't have time to read lengthy tutorials about using AWS CDK, so I just created an AWS CDK app using the initialization script (provided in Your first AWS CDK app):
cdk init app --language python
This command creates a new empty CDK app. Then I just read the AWS CDK Python Reference on creating the AWS resources and integrating them. I must point out that the AWS documentation is excellent: I had almost no issues whatsoever creating a data pipeline (AWS IoT Analytics => S3) as described in my previous blog post AWS IoT First Reflections. My biggest surprise was that I could implement the whole pipeline in one day, even though I was using a new IaC tool (without any previous AWS CDK experience) and AWS IoT Analytics, a service I had experimented with only once, in the AWS Console.
# iot_a is presumably an alias such as: from aws_cdk import aws_iotanalytics as iot_a
my_name = f'{self.etsi_system_name}_{self.etsi_env_name}_{self.module}_channel'
iot_analytics_channel = iot_a.CfnChannel(
    self,
    my_name,
    channel_name = my_name,
    channel_storage = iot_a.CfnChannel.ChannelStorageProperty(
        customer_managed_s3 = iot_a.CfnChannel.CustomerManagedS3Property(
            bucket = iot_analytics_bucket.bucket_name,
            role_arn = iot_analytics_storage_iam_role.role_arn,
            key_prefix = 'raw/',
        ),
    ),
)
# Make sure the bucket and the IAM role exist before the channel is created.
iot_analytics_channel.node.add_dependency(iot_analytics_bucket)
iot_analytics_channel.node.add_dependency(iot_analytics_storage_iam_role)
AWS CDK language example using Python: an IoT Analytics Channel.
Comparison
Configuration management
You can handle configuration management quite easily using both tools:
Terraform: Create a file in which you can give default values for each environment (e.g., RDS instance size…) as a map, and then provide other maps for each environment to override the default values. Merge the maps — and you have simple configuration management using Terraform.
AWS CDK: It's easy to use the programming language constructs. E.g., you can create a method that populates the default tags to each resource and provide the AWS CDK construct (the resource) as an argument to the method.
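The defaults-plus-overrides idea can be sketched in a few lines of Python. This is only an illustration: the setting names, the values, and the `config_for` helper are assumptions made up for this sketch, not taken from our project.

```python
# Hypothetical default values shared by every environment.
DEFAULTS = {"rds_instance_class": "db.t3.micro", "multi_az": False}

# Hypothetical per-environment overrides; only the differences are listed.
OVERRIDES = {
    "dev": {},
    "prod": {"rds_instance_class": "db.m5.large", "multi_az": True},
}


def config_for(env: str) -> dict:
    """Return a new dict: the defaults with the overrides of `env` on top."""
    if env not in OVERRIDES:
        raise KeyError(f"unknown environment: {env}")
    # Later keys win, so an override replaces the default of the same name.
    return {**DEFAULTS, **OVERRIDES[env]}
```

Calling `config_for("prod")` yields the production values, while `config_for("dev")` falls back to the defaults. This is roughly what merging the maps achieves on the Terraform side.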
Modularity
Modularity is simple using both tools:
Terraform: You can easily create re-usable Terraform modules and pass parameters to those modules. You can even store the modules in git repositories.
AWS CDK: You have the power of your favorite programming language to modularize your solution any way you want, and even share and use the modules as libraries.
Consistent Resource Naming
We are pretty stringent about naming our cloud resources in a certain way: e.g., providing the prefix + environment in every resource name. This way, we can easily search resources using either prefix (e.g., “SystemX”) or environment (e.g., “dev”) or the specific deployment (e.g., “SystemX-dev”).
Terraform: It is pretty easy to create a local variable for naming all entities for the current module and then use this local variable to concatenate consistent resource names for each resource entity.
AWS CDK: Very straightforward since you can use a real programming language. In our current project, we create the AWS IoT => S3 pipeline using AWS CDK and the other parts of the AWS infrastructure using Terraform. We used the same resource naming scheme in both solutions.
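On the CDK side, such a naming scheme could be wrapped in a small helper like the one below. It is a minimal sketch: the `resource_name` function and its parameters are hypothetical. The separator is configurable because the examples in this post use both hyphens and underscores in resource names.

```python
def resource_name(prefix: str, env: str, module: str, kind: str, sep: str = "-") -> str:
    """Join the naming parts in a fixed order so every resource name looks alike."""
    parts = [prefix, env, module, kind]
    if any(not part for part in parts):
        raise ValueError("every naming part must be non-empty")
    return sep.join(parts)


# Hyphen-separated name for a hypothetical bucket.
bucket_name = resource_name("SystemX", "dev", "iotanalytics", "bucket")
# Underscore-separated name, as used for the channel earlier in this post.
channel_name = resource_name("SystemX", "dev", "iotanalytics", "channel", sep="_")
```

Since every name starts with the prefix and the environment, searching for "SystemX" or "SystemX-dev" finds all the resources of that deployment.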
Consistent Resource Tagging
Just as we are pretty stringent with resource naming, we also want to be strict with resource tagging. We want to add a set of default tags to all resources created by the infra tool, plus some extra tags for certain specific resources.
Terraform: Quite easy. You can provide the default tags, e.g., in the common environment configuration. Then you merge the default and custom tags when creating the actual resource (and provide the merged tags). Nowadays, the AWS provider also supports default tags directly.
AWS CDK: Also relatively easy. After creating a resource, we just call a method to add the same default tags to every resource, e.g.:
self.add_default_tags(iot_analytics_channel)
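A helper of this kind could look roughly like the sketch below. This is not our actual method: the signature, the tag keys, and the `add_tag` callable are assumptions chosen so that the example runs without the CDK installed. In a CDK app, the callable might be something like `Tags.of(resource).add` from the `aws_cdk` package.

```python
from typing import Callable, Mapping, Optional


def add_default_tags(
    add_tag: Callable[[str, str], object],
    prefix: str,
    env: str,
    extra_tags: Optional[Mapping[str, str]] = None,
) -> None:
    """Merge the default tags with any extras and apply each one via add_tag."""
    tags = {"Prefix": prefix, "Environment": env}  # hypothetical default tag keys
    tags.update(extra_tags or {})  # an extra tag wins if a key is repeated
    for key, value in tags.items():
        add_tag(key, value)


# Stand-in for a real resource: record the applied tags in a dict.
recorded = {}
add_default_tags(recorded.__setitem__, "SystemX", "dev", {"Module": "iotanalytics"})
```

Passing the tagging function as an argument keeps the merge logic independent of the CDK, so the same helper works for any resource that can accept key/value tags.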
Language
How powerful is the language? How easy is it to provide a declarative configuration of cloud infrastructure? How easy is it to provide customization and conditional situations (e.g., create this entity only if condition X exists…)?
Terraform: The Terraform language is almost purely declarative. That is both its strength and its weakness. Terraform guides you to a particular declarative configuration that makes infrastructure code clean. On the other hand, certain conditional situations are a bit clumsy (though usually possible). One example is creating conditional resources by using the count meta-argument.
AWS CDK: You can use a real programming language, so the solutions are often as different as the developers creating them. In this respect, Terraform steers you towards a more uniform declarative infrastructure solution. On the other hand, you don't need various tricks to handle conditional situations (as with Terraform); you can write any conditional logic in your favorite programming language.
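To make the contrast concrete, here is a plain-Python sketch of the logic that the Terraform subnet example earlier in this post expresses with count. The function name and the `enabled` flag are hypothetical, and no CDK constructs are created here. The sketch only shows how ordinary loops and conditions replace the count trick.

```python
def public_subnet_cidrs(count: int, enabled: bool = True) -> list:
    """Return one /24 CIDR block per subnet, or an empty list when disabled."""
    if not enabled:
        # In Terraform this case is typically written as count = 0.
        return []
    if not 0 <= count <= 256:
        raise ValueError("count must be between 0 and 256")
    return [f"10.0.{index}.0/24" for index in range(count)]


cidrs = public_subnet_cidrs(3)
```

In a real CDK app, each CIDR block in the list would be passed to a subnet construct inside the loop.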
IDE Support
We are using IntelliJ IDEA and Emacs. The IDE support is excellent in both editors. Highlighting, navigation, suggestions, etc. work out of the box (the IntelliJ Terraform plugin, and the Terraform LSP servers that you can use with Emacs).
Deployment Process
Can you see a plan phase before actual deployment — i.e., what resources will be added/modified/deleted? How complicated is the deployment process? What happens if something goes wrong?
Terraform: Terraform supports a plan phase. The actual deployment (apply) is pretty straightforward in 99% of cases. Terraform does not try to roll back if something goes haywire during the deployment. Usually this is a good thing.
AWS CDK: AWS CDK is just an abstraction over AWS CloudFormation. There are two steps involved in an AWS CDK deployment:
- CloudFormation stack creation: cdk synth. In this phase, you can see the CloudFormation stack that is going to be created/updated.
- Deploying the actual CF stack: cdk deploy.
Example:
λ> cdk synth
...
Resources:
  XXXdevt1iotanalyticsbucketYYYYY:
    Type: AWS::S3::Bucket
...
λ> cdk deploy
IotanalyticsStack: deploying...
IotanalyticsStack: creating CloudFormation changeset...
...
✅ IotanalyticsStack
Stack ARN:
arn:aws:cloudformation:xx-yyyy-1:9999999999:stack/IotanalyticsStack/xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx
If something goes wrong during the deployment, CDK rolls back the created stack. This might be a good thing, or not.
Multi-cloud Capability
Can you use the IaC tool with all three major clouds (AWS, Azure, and GCP)? If you do multi-cloud work, there is a significant benefit to using either Terraform or Pulumi. If you are using AWS, you can decide to use AWS CDK or Terraform/Pulumi.
AWS CDK or CloudFormation?
We have also used native CloudFormation, and we have to say that we didn't much like working with those JSON or YAML files. If you decide to use CloudFormation, we strongly suggest using AWS CDK instead of native CloudFormation JSON/YAML to create your cloud infrastructure code.
Terraform and CDK Integration
If you use Terraform to create your infrastructure but realize that you need an AWS service that Terraform does not yet support, you can create only that part of the infra using CDK and then integrate it into Terraform.
In the example below, we have created IoT Analytics using CDK since Terraform does not yet support it. In the Terraform code, we query the ARN of the IoT Analytics channel IAM role created on the CDK side, which we need to inject into an IoT Core Topic Rule:
#############################
# RESOURCES FROM THE CDK SIDE
# This is the integration point to the resources created on the CDK side.
# Since as of writing this, Terraform does not yet support IoT Analytics.
# Query the arn.
data "external" "cdk-iotanalytics-role-arn-program" {
  program = ["aws", "iam", "get-role",
    "--role-name", "${var.prefix}-${terraform.workspace}-iotanalytics-channel-iam-role",
    "--query", "{\"role-arn\": Role.Arn}"]
}
...
# And use the arn.
resource "aws_iot_topic_rule" "test-topic-rule" {
  name = "${var.prefix}_${terraform.workspace}_test_topic_rule"
  enabled = true
  sql = "SELECT * FROM '${local.test_topic}/1'"
  sql_version = "2016-03-23"
  iot_analytics {
    channel_name = local.iotanalytics-channel-name
    role_arn = data.external.cdk-iotanalytics-role-arn-program.result["role-arn"]
  }
}
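The --query expression in the example above matters because Terraform's external data source expects the program to print a single JSON object whose values are all strings. The Python sketch below mimics that reshaping step. The `external_result` helper and the sample response are made up for illustration, and the real `aws iam get-role` response contains more fields.

```python
import json


def external_result(get_role_response: dict) -> str:
    """Reduce a get-role style response to the flat JSON object Terraform reads."""
    arn = get_role_response["Role"]["Arn"]
    # The external data source accepts only string values in the result object.
    return json.dumps({"role-arn": arn})


# Made-up response, trimmed to the fields relevant for this sketch.
sample = {
    "Role": {
        "RoleName": "example-role",
        "Arn": "arn:aws:iam::123456789012:role/example-role",
    }
}
print(external_result(sample))
```

On the Terraform side, the printed object becomes the result map of the data source, which is why the topic rule can read result["role-arn"].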
Conclusions
Both Terraform and AWS CDK are excellent IaC tools for creating and maintaining cloud infrastructure. Terraform is more purely declarative, while with AWS CDK, you can use your favorite programming language in an imperative style.