Cloud bill details matter. A lesson learnt from a mistake I found recently while working with #Terraform, which keeps its state files in S3. Plus some good practices to keep billing sane. A thread.

like/rt for reach, thx

#devopslearning #devops
So recently, while browsing billing reports in AWS, I found that the bill for our "terraform" account was already around $103 (on the 14th day of the month). I decided to dig into it, even though $100 is a very small part of the organization's total bill, because it was not normal.
So this is what I found in the billing details (image attached). Weird thing: DynamoDB usage was very high on an account that is used only for Terraform resources (S3 buckets and DynamoDB tables). Why? I was afraid of some problem between Terragrunt, Terraform and our CI pipelines.
So I navigated to the DynamoDB statistics and CloudWatch and realized that we were using provisioned mode for billing, and that it was heavily overprovisioned. We had provisioned 20/20 write/read requests/s and used 0, with spikes to 1 or 2 (when CI was processing a change). Wow.
80 DynamoDB tables times $12 monthly (estimated cost per table) adds up to a lot of money. So I navigated to our Terraform code and confirmed it (image attached).
The patch was deployed quickly: the fix is to switch billing_mode to "PAY_PER_REQUEST" (called "on-demand" in AWS). Just make sure you have billing budgets and cost anomaly detection turned on, because "on-demand" mode is not capped and its cost can skyrocket.
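A minimal sketch of what such a fix can look like, assuming the tables are the usual Terraform state-lock tables. The resource name and table name are made up for illustration, and the commented-out lines only mirror the 20/20 provisioning described above:

```hcl
# Hypothetical lock table; "terraform_locks" and "terraform-locks" are
# illustrative names, not taken from the thread.
resource "aws_dynamodb_table" "terraform_locks" {
  name = "terraform-locks"

  # Before: provisioned capacity that sat almost entirely idle.
  # billing_mode   = "PROVISIONED"
  # read_capacity  = 20
  # write_capacity = 20

  # After: on-demand billing, so only actual requests are charged.
  # read_capacity / write_capacity are not set in this mode.
  billing_mode = "PAY_PER_REQUEST"

  # The S3 backend's DynamoDB locking expects a string hash key named "LockID".
  hash_key = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

For tables that see a handful of requests only when CI runs, paying per request should come out far cheaper than paying for idle capacity around the clock.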
So now a few lessons learnt:

1. I assume this mistake was a "copy/paste" one that slipped through review. When reviewing TF resources, always think about the impact on billing.
2. Turn on AWS Cost Anomaly Detection. It's very helpful. You can switch it on for specific services, cost allocation tags, accounts and more. Pity, it can't be terraformed (at least at the time of writing), but still, do it manually.
3. Enable billing budgets. Probably 99% of users have done it already, but some startups may not have. In the beginning, use estimated values and review and adjust them daily or weekly.
4. For some specific, global workloads and resources (global for the whole organization) I prefer creating a separate AWS account. That's why we have an account for Terraform resources. It's decoupled, has its own IAM policies and makes it very easy to spot mistakes like this.
5. Use cost allocation tags and Cost Explorer. These tags give you more granularity when you break down your costs in Cost Explorer. Teach your finance team to use Cost Explorer.
6. Finally, give your engineers access to view and understand billing. That way they will understand how much application workloads cost. This is very important: when they can see and understand the costs, they will feel more connected to them and more responsible for keeping them sane.
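Lessons 3 and 5 can live in Terraform as well. A rough sketch, assuming a reasonably recent AWS provider. The region, tag keys and values, budget name, $50 limit and email address are all placeholders for illustration:

```hcl
# Tags every taggable resource created through this provider, so costs can
# be grouped by these keys. All values here are illustrative assumptions.
provider "aws" {
  region = "eu-west-1"

  default_tags {
    tags = {
      CostCenter = "platform"
      ManagedBy  = "terraform"
    }
  }
}

# Monthly cost budget with an email alert when the forecast passes 80%
# of the limit. Start with an estimate and adjust it as you learn.
resource "aws_budgets_budget" "terraform_account" {
  name         = "terraform-account-monthly"
  budget_type  = "COST"
  limit_amount = "50"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "FORECASTED"
    subscriber_email_addresses = ["platform-team@example.com"]
  }
}
```

Note that tags applied this way still have to be activated as cost allocation tags in the Billing console before they show up in Cost Explorer.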
That's it for today. Hope you enjoyed it. I think the next thread will be about CI pipelines for Terraform/infrastructure in a multicloud setup, Terragrunt, tgenv, tfenv and more. Enjoy the weekend!
You can follow @docent_net.