An AWS tagging strategy for cost tracking, security, and DataDog integration
Comprehensively applied AWS tags will pay dividends many times over
By Doug Ireton
AWS Tags are a lightweight way to apply valuable, company-specific context across AWS resources. Each tag becomes another dimension by which you can slice your AWS resources programmatically into collections you’d like to act upon. Once you’ve achieved this, you can easily manage thousands of resources with a very small team.
In this post, I’ll show you a complete tagging strategy that will help you accurately assign costs to your apps and services via AWS Cost Explorer, categorize data stores according to security and privacy requirements, and group apps and services in line with DataDog’s Unified Service Tagging guidelines.
Tagging enables dynamic management
A good tagging classification system allows you to programmatically filter and group resources by tag or other easy-to-read properties. Imagine you have no outside context, only what's available by combining tags, e.g., env:staging && app:website && service:db. With tags like these, you can:
- Use AWS Cost Explorer to group costs by product, service, or team. This would help you calculate AWS infrastructure costs for your team’s main app across dev, staging, and prod environments.
- Identify AWS resources that need security/compliance controls to meet SOX or PCI DSS standards.
- Use AWS Backup, which relies on tags to apply the correct backup plan for RDS databases.
- Map AWS resource costs to cost centers for yearly budget planning.
- Use tags for automation, e.g., stopping dev or staging ECS Clusters from 5:00 PM to 7:00 AM.
- Automatically page the correct on-call team for a given AWS resource.
- In Terraform, query AWS resources (e.g., private subnets) via tags instead of relying on Remote State, which leads to tightly coupled infrastructure.
- Include tags required for DataDog’s Unified Service Tagging.
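To make the automation bullet concrete, here is a minimal sketch of the decision a scheduler could make from tags alone. The function name, the set of stoppable environments, and the use of a naive local time are all assumptions for illustration; actually stopping an ECS cluster would be a separate step through whatever tooling you already use.

```python
from datetime import datetime, time

# Hypothetical policy: only these environments may be stopped overnight.
STOPPABLE_ENVS = {"dev", "staging"}
STOP_AT = time(17, 0)   # 5:00 PM
START_AT = time(7, 0)   # 7:00 AM


def should_be_stopped(tags: dict, now: datetime) -> bool:
    """Return True if a resource with these tags is inside the off-hours window."""
    if tags.get("env") not in STOPPABLE_ENVS:
        return False
    current = now.time()
    # The window wraps past midnight, so it is "at or after 5 PM OR before 7 AM".
    return current >= STOP_AT or current < START_AT


cluster_tags = {"app": "website", "service": "web", "env": "staging"}
print(should_be_stopped(cluster_tags, datetime(2024, 1, 15, 22, 30)))  # True
print(should_be_stopped(cluster_tags, datetime(2024, 1, 15, 10, 0)))   # False
```

Because the decision depends only on the env tag, production resources and untagged resources are never candidates for shutdown.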
You’ll know you’ve made a mistake in your tag design if you can’t just search for a single component of a specific app via an AND’d together set of tags.
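As a quick way to test a tag design against that rule, the sketch below ANDs a set of tags together over a small in-memory inventory. The inventory shape and the helper names are hypothetical; in practice the tags would come from whichever AWS API or IaC tool you query.

```python
def matches(tags: dict, selector: dict) -> bool:
    """True when every key/value pair in the selector is present in the tags (logical AND)."""
    return all(tags.get(key) == value for key, value in selector.items())


def select(resources: list, selector: dict) -> list:
    """Return the resources whose tags satisfy the whole selector."""
    return [r for r in resources if matches(r["tags"], selector)]


# Hypothetical inventory for illustration only.
inventory = [
    {"id": "db-1", "tags": {"env": "staging", "app": "website", "service": "db"}},
    {"id": "db-2", "tags": {"env": "prod", "app": "website", "service": "db"}},
    {"id": "cache-1", "tags": {"env": "staging", "app": "website", "service": "cache"}},
]

staging_db = select(inventory, {"env": "staging", "app": "website", "service": "db"})
print([r["id"] for r in staging_db])  # ['db-1']
```

If a selector like this returns more than the one component you meant, the tag set is probably missing a dimension.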
I recommend you use these tags on all AWS-taggable resources. Tags should always be lowercase, with hyphens instead of spaces, for consistency, since AWS treats “app:MyApp” and “App:myApp” as completely separate tags.
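One way to keep that convention from drifting is to normalize tag text before it is ever applied. This is a minimal sketch; the normalize helper is hypothetical, and you may want to exempt values such as URLs or the Title-Cased Name key rather than run everything through it.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and turn runs of whitespace or underscores into single hyphens."""
    return re.sub(r"[\s_]+", "-", text.strip().lower())


raw = {"App": "My App", "Owning Team Email": "email@example.com"}
clean = {normalize(key): normalize(value) for key, value in raw.items()}
print(clean)
# {'app': 'my-app', 'owning-team-email': 'email@example.com'}

# Case-sensitive keys behave the same way in a Python dict as they do in AWS:
print("App" in clean, "app" in clean)  # False True
```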
Tags to apply to all resources
- app, e.g., app: website. Application monolith or logical grouping of related services which together provide a unit of business value. I like to think of this as what a business person or customer refers to when they talk about an application.
- service, e.g., service: coredb. Logical components of an application, like “database, cache, request-queue, etc.” This also maps nicely to the ECS and Kubernetes ideas of services.
- role, e.g., role: read-replica. If a Service has multiple components to provide the service, Role is a logical component of the service. For example, in a DB Cluster (e.g., RDS), you might have two database Roles, primary and read-replica. If you had three Read Replica nodes, they would all be tagged with role: read-replica. Note that for a service with sidecar containers, each sidecar container would have its own role, e.g.,
- env, e.g.,
- owning-team-email, e.g., owning-team-email: email@example.com. Using an email is handy because you can also use it as a destination for alerts.
- managed-by, e.g., managed-by: terraform for resources you’re managing with Terraform. Substitute with “pulumi” or “cloudformation” as appropriate.
- iac-repo, e.g., iac-repo: https://github.com/mycorp/vpc. Allows you to trace this infrastructure resource back to its corresponding Infrastructure as Code (Terraform) git repo.
- data-classification, e.g., data-classification: confidential. For data classification, the best overview I’ve found is the CMU Guidelines for Data Classification or Data Classification for Compliance. I’d recommend limiting your data-classification values to
- cost-center, e.g., cost-center: 1234. If you want a monthly cost breakdown of AWS costs per cost-center, you’ll need to activate this tag as a Cost Allocation Tag.
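To illustrate what grouping costs by a tag means, here is a small sketch that totals hypothetical billing line items by an arbitrary tag key. The line-item shape and the "untagged" bucket name are assumptions; Cost Explorer does the equivalent grouping for you once a tag is activated as a Cost Allocation Tag.

```python
from collections import defaultdict
from decimal import Decimal


def costs_by_tag(line_items: list, tag_key: str) -> dict:
    """Sum line-item costs per value of tag_key; items without the tag land in 'untagged'."""
    totals = defaultdict(Decimal)  # Decimal() starts each bucket at 0
    for item in line_items:
        group = item["tags"].get(tag_key, "untagged")
        totals[group] += Decimal(item["cost"])
    return dict(totals)


# Hypothetical line items for illustration only.
line_items = [
    {"cost": "120.00", "tags": {"cost-center": "1234", "app": "website"}},
    {"cost": "30.50", "tags": {"cost-center": "1234", "app": "website"}},
    {"cost": "75.25", "tags": {"cost-center": "5678", "app": "billing"}},
    {"cost": "9.99", "tags": {}},
]

for group, total in costs_by_tag(line_items, "cost-center").items():
    print(group, total)
```

The same function works for app, service, or env, which is the point of treating each tag as another dimension. The size of the "untagged" bucket is a rough measure of how much spend you cannot yet attribute.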
AWS resource-specific tags
Some AWS resources need resource-specific tags to be more easily managed, like a subnet-type tag to differentiate between public and private subnets. The following tags apply to subnets only.
- subnet-type, e.g., subnet-type: public. Distinguishes public from private subnets; the value is either public or private. In Terraform, you can query public or private subnets via tags instead of relying on Remote State (which leads to tightly coupled infrastructure).
- kubernetes.io/role/internal-elb, e.g., kubernetes.io/role/internal-elb: 1. To allow Kubernetes to use your private subnets for internal load balancers, tag all private subnets in your VPC with this key-value pair. See the AWS EKS VPC Subnet Discovery docs for more information.
- Name, e.g., Name: My EC2 instance. Note that “Name” is Title-Cased because AWS exposes the “Name” tag in the AWS Console. It’s convenient to see a human-friendly name of your AWS resources in the Console for quick identification.
Tag reporting and enforcement
Tag reporting and enforcement are an entire topic in their own right, but I’d recommend using the AWS Config required-tags managed rule to report on tag compliance and attaching an AWS Organizations tag policy to your organization root to enforce tag compliance across your AWS Organization.
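Before wiring up either of those, it can help to run a local check against an export of your resources. The sketch below is only a stand-in for that kind of report, not the AWS Config rule itself; the required-key list mirrors the "all resources" tags in this post, and the resource shape is hypothetical.

```python
# Tag keys this post recommends on every resource.
REQUIRED_TAGS = (
    "app", "service", "role", "env", "owning-team-email",
    "managed-by", "iac-repo", "data-classification", "cost-center",
)


def missing_tags(tags: dict) -> list:
    """Return the required keys that are absent or have an empty value."""
    return [key for key in REQUIRED_TAGS if not tags.get(key)]


def compliance_report(resources: list) -> dict:
    """Map each non-compliant resource id to the tags it is missing."""
    report = {}
    for resource in resources:
        missing = missing_tags(resource["tags"])
        if missing:
            report[resource["id"]] = missing
    return report


# Hypothetical resources for illustration only.
resources = [
    {"id": "vpc-a", "tags": {key: "x" for key in REQUIRED_TAGS}},
    {"id": "db-1", "tags": {"app": "website", "service": "coredb", "env": "prod"}},
]
print(compliance_report(resources))
```

Here vpc-a is fully tagged and drops out of the report, while db-1 is listed with the six keys it still needs.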
A tagging strategy for 10x growth
In this post I showed you a complete tagging strategy to help accurately assign costs to your apps and services via AWS Cost Explorer, categorize data stores according to security and privacy requirements, and group apps and services in line with DataDog’s Unified Service Tagging guidelines.
A comprehensively applied AWS tagging strategy will pay dividends many times over. At Allma, we have a very small team managing test and production infrastructure. Since implementing a comprehensive tagging strategy, we’ve gained accurate insight into our costs by application and service, and we can automatically monitor our apps and services via DataDog’s Unified Service Tagging. I’m confident that the time spent applying a consistent and well-thought-out set of AWS tags will enable us to scale up our infrastructure ten-fold and more as we grow while maintaining a small, focused engineering team.
Doug Ireton is an experienced SRE/Infrastructure Engineer for Allma.io. He has AWS and DevOps experience from over 15 years as a software tester, software developer, Principal DevOps Engineer, Chef team lead, and AWS consultant. He founded the largest active DevOps meet-up in the U.S., and is an open source contributor and conference speaker.