One-click autoscaling of DynamoDB

If you are here, then there is no need to explain the benefit of autoscaling to you.

We at Neptune use DynamoDB extensively. Given that our co-founder was one of the early engineers who built DynamoDB at AWS, we certainly have an affinity for it.

Today's challenge

As most of you know, fixed provisioned capacity on tables leads to throttled requests when consumed capacity exceeds the pre-planned provisioned capacity. When that happens today, we wake up at midnight to manually update the provisioned capacity. By the time we do, however, we may already have lost some transactions and customer goodwill.

To deal with this, most people simply over-provision capacity, which results in severe under-utilization and overpayment to AWS.

This problem is a perfect fit for our own event-driven automation platform, Neptune. So we built a DynamoDB autoscaling feature that we dogfood ourselves. We have observed 30-40% cost savings on our tables, depending on traffic patterns. The best part is that we never have to wake up again to manually update table capacity.

Our solution

It works by creating CloudWatch alarms on the consumed-capacity metrics. When an alarm fires, we call the UpdateTable API to scale the provisioned capacity up or down. AWS recently started supporting 1-minute granularity on DynamoDB metrics, so we can respond more quickly than before :)
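The scaling step itself is simple arithmetic on the current throughput. As a minimal sketch (the function name, factor, and limits are our illustrative assumptions, not Neptune's actual code), the clamped capacity computation could look like:

```python
def next_capacity(current, factor, minimum, maximum, scale_up=True):
    """Return the new provisioned capacity: multiply (scale up) or divide
    (scale down) by the scaling factor, then clamp to [minimum, maximum]."""
    proposed = current * factor if scale_up else current / factor
    return max(minimum, min(maximum, int(proposed)))
```

The alarm-triggered rule would then pass the returned value to DynamoDB's UpdateTable API, which is why user-configured minimum and maximum limits matter: they bound what an alarm storm can do to your bill.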

Load based autoscaling

Step 1: Sign up with Neptune and add your AWS IAM keys with the required privileges for DynamoDB and CloudWatch.

Step 2: Go to the AWS DynamoDB page, select a table or one of its secondary indexes, and enable autoscaling.

DynamoDB_List_Tables

Step 3: In the configuration page, select the thresholds and the scaling factors. You can also specify minimum and maximum limits for scaling.

DynamoDB_Autoscale_Table

In the background, Neptune creates two CloudWatch alarms and two rules that scale your provisioned capacity up or down, in an event-driven fashion, when those alarms trigger. The end result looks like this:
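The two alarms map directly onto CloudWatch's PutMetricAlarm parameters. Here is a sketch of how their definitions could be built for a table's read capacity (the alarm names, threshold percentages, and evaluation periods are illustrative assumptions, not Neptune's exact settings):

```python
def capacity_alarms(table, provisioned_rcu, upper_pct=80, lower_pct=30):
    """Build the upper/lower ConsumedReadCapacityUnits alarm definitions.

    CloudWatch reports consumed capacity as a per-minute Sum, so the
    per-second provisioned value is multiplied by 60 before comparing."""
    common = {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "ConsumedReadCapacityUnits",
        "Dimensions": [{"Name": "TableName", "Value": table}],
        "Statistic": "Sum",
        "Period": 60,  # 1-minute metric granularity
    }
    upper = dict(common,
                 AlarmName=f"{table}-rcu-high",
                 ComparisonOperator="GreaterThanThreshold",
                 Threshold=provisioned_rcu * 60 * upper_pct / 100,
                 EvaluationPeriods=2)    # a few minutes of violation -> scale up fast
    lower = dict(common,
                 AlarmName=f"{table}-rcu-low",
                 ComparisonOperator="LessThanThreshold",
                 Threshold=provisioned_rcu * 60 * lower_pct / 100,
                 EvaluationPeriods=120)  # hours of violation -> scale down slowly
    return upper, lower
```

Each definition would be passed to CloudWatch's PutMetricAlarm API; a matching pair on ConsumedWriteCapacityUnits covers the write side.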

DynamoDB_Autoscale_Result_Graph

Time based autoscaling

If you have very predictable traffic bursts on a schedule, it is easy to scale up and down using a simple cron expression.

Step 1: Click create rule, and in the trigger section, select the cron trigger and specify the time in the dropdown, for example every Monday at 8 pm UTC.

DynamoDB_Cron_Trigger

Step 2: In the action section, select CLI_ACTION on AWS, then search for the DynamoDB autoscaling runbook. Simply change the table parameters in the shell script of AWS CLI commands.

DynamoDB_CLI_Runbook

Step 3: Repeat the above two steps for the scale-down rule, for example Tuesday at 8 am.

It's as simple as that. Your DynamoDB table will now scale up every Monday at 8 pm and scale back down automatically every Tuesday at 8 am.
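For reference, a scale-up runbook of this kind is just a short shell script of AWS CLI calls. This is a sketch with placeholder table and capacity values (not Neptune's actual runbook), assuming the CLI is configured with the IAM keys from Step 1:

```shell
#!/bin/sh
# Scale-up runbook: raise provisioned throughput ahead of the Monday 8pm burst.
# TABLE and the capacity values are illustrative placeholders - edit for your table.
TABLE=my-table

aws dynamodb update-table \
    --table-name "$TABLE" \
    --provisioned-throughput ReadCapacityUnits=400,WriteCapacityUnits=200

# The table sits in UPDATING state for a short while after this call;
# verify the new throughput once it returns to ACTIVE.
aws dynamodb describe-table --table-name "$TABLE" \
    --query 'Table.[TableStatus,ProvisionedThroughput]'
```

The scale-down rule's script is identical except for lower capacity values.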

Autoscaling recommendations

  1. First, analyze your traffic over a week or a month to determine the average and maximum consumed capacities. This analysis should inform your scaling factors and your minimum and maximum autoscaling limits.

  2. As a best practice, the minimum limit should keep your average traffic at no more than about 50% of provisioned capacity. For example, if your average is 35, we recommend 70 as the minimum limit. (If you are nerdy, use mean + sigma to cover roughly 84% of your traffic without needing to autoscale, and mean + 2 sigma to cover about 98% of it.)

  3. We recommend scaling up quickly, using higher scaling factors, when the upper threshold is violated for two or three minutes. (DynamoDB provides a non-guaranteed five-minute burst buffer, which can potentially cover your traffic during those 2-3 minutes.) On the other hand, scale down slowly: wait 2-3 hours of lower-threshold violation to be sure your traffic is really down. This also helps you stay within AWS's limit of four scale-downs per table per day (don't ask us why they do that!).

  4. Finally, read and fully understand DynamoDB's multi-partition behavior and hot-key issues before autoscaling, especially if your table is larger than 10 GB or if your read/3000 + write/1000 is greater than 1 (which means your table has multiple partitions). Refer to Part II of this blog post for more dos and don'ts of DynamoDB autoscaling.

It takes just two minutes to autoscale your tables, and since Neptune is a SaaS platform, there is nothing for you to maintain. Go ahead, autoscale DynamoDB, and sleep peacefully knowing that your traffic needs are handled and that you are saving costs :)

You can try it out at www.neptune.io.

We would love to hear your comments and feedback below.
