Handling Unpredictable Traffic Demand – AWS AutoScaling
You just spent weeks (months, even) planning marketing campaigns to drive traffic to your website – and are hoping to see that translate into new users. What if your prayers are answered? Are you prepared to handle TWICE the number of expected users? What about 200 times the number of expected users?
These are not conjured-up scenarios – we do not have to look that far back. Just two years ago, the launch of Healthcare.gov (the insurance marketplace) faced just such a challenge. In spite of highly provisioned servers, their infrastructure slowed to a crawl due to unanticipated demand.
Unpredictable Traffic Patterns
When a huge traffic spike hits your servers, will they be able to withstand it? The more successful your marketing campaign, the more likely it is to cause traffic spikes. Not just on launch day – but any day from there on. You might become the victim of your own success!
Fear not. This is where Cloud Computing – in particular, the elastic nature of cloud resources – really comes to the rescue. A feature called Auto Scaling (on AWS) helps you deal with unpredictable traffic spikes. You DO NOT need to accurately predict or understand the exact demand your website will experience. You DO NOT need to provision high-capacity servers in advance – like you would have to in a non-cloud environment.
AWS AutoScaling
Imagine you are watering the lone plant in your backyard with a hose. Now, all of a sudden, you see not one, but a dozen small plants – all needing water. They are spread out across a large area, so you are not sure your hose can even reach all of them.
The solution is simple – you buy a portable sprinkler head – attach it to your hose – and turn up the water pressure. This gets the water sprinkling to all the plants.
Auto Scaling Groups work in a somewhat similar manner. Think of a ‘load balancer’ as your sprinkler. The load balancer will distribute load (water) across all the instances (plants) in your environment. The best part is that you do not need to have all the instances in your Auto Scaling Group up and running all the time. It is almost as if your SPRINKLER was smart enough to detect new plants – and automatically adjusted the water pressure to get the water to them.
You can monitor certain instance-level metrics (CPU out of the box; memory and disk usage once the CloudWatch agent is installed on the instance) and decide whether your ONE poor, overloaded server needs help. If so, the Auto Scaling Group (ASG) will AUTOMATICALLY spin up another instance (or however many you want it to, up to a maximum you set). The only requirement is an AMI (Amazon Machine Image) from which new Amazon EC2 instances can be created. Most popular OSes (Windows, Linux) and the software that runs on them (SQL Server, Oracle, Tableau…) have AMIs available on the AWS Marketplace.
What happens when the peak demand dies down? What do you do with all those running instances that are no longer being used? As you probably guessed, the ASG can AUTOMATICALLY remove them (based on low CPU or memory usage, for example). Note that an ASG scales in by terminating instances rather than switching them off, so anything that must survive should not live only on the instance. You can also lower the group's desired capacity by hand if you wish.
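To make the pieces of an ASG concrete, here is a minimal sketch of the parameters you might hand to boto3's create_auto_scaling_group call. The group name, launch template name and subnet IDs are made up for illustration, and the sketch assumes a launch template (which is where the AMI ID and instance type are recorded) already exists. It only builds the request, so it runs without AWS credentials.

```python
def build_asg_request(name, launch_template_name, subnet_ids,
                      min_size=1, max_size=4):
    """Return keyword arguments for boto3's create_auto_scaling_group."""
    if not 0 <= min_size <= max_size:
        raise ValueError("need 0 <= min_size <= max_size")
    return {
        "AutoScalingGroupName": name,
        # The launch template records the AMI ID and instance type.
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        "MinSize": min_size,
        "MaxSize": max_size,
        # Start small; scaling policies can raise this later.
        "DesiredCapacity": min_size,
        # boto3 expects the subnet IDs as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


# All names and IDs below are hypothetical placeholders.
request = build_asg_request(
    "web-asg", "web-template", ["subnet-aaaa1111", "subnet-bbbb2222"]
)
print(request)

# With credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("autoscaling").create_auto_scaling_group(**request)
```

MinSize and MaxSize are the guard rails: however bad the spike, the group will not grow past the maximum you chose.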
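The scale-out and scale-in behaviour can be illustrated with a toy simulation. This is not how AWS implements it; it is a simplified sketch in which the group adds one instance when average CPU is high and removes one when it is low. The 50% and 20% thresholds and the size limits are assumptions picked for the example, and the CPU readings are taken as given rather than reacting to the extra capacity.

```python
def desired_capacity(current, avg_cpu, min_size=1, max_size=4,
                     scale_out_at=50.0, scale_in_at=20.0):
    """Return the instance count after one scaling decision."""
    if avg_cpu > scale_out_at:
        target = current + 1      # overloaded: add an instance
    elif avg_cpu < scale_in_at:
        target = current - 1      # mostly idle: remove an instance
    else:
        target = current          # in the comfortable band: do nothing
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, target))


def simulate(cpu_readings, start=1, **limits):
    """Apply one scaling decision per CPU reading; return the counts."""
    history = []
    current = start
    for cpu in cpu_readings:
        current = desired_capacity(current, cpu, **limits)
        history.append(current)
    return history


# A traffic spike followed by a quiet period.
print(simulate([30, 70, 85, 90, 95, 40, 15, 10, 5]))
# -> [1, 2, 3, 4, 4, 4, 3, 2, 1]
```

The group climbs to its maximum of four during the spike and drifts back to a single instance once the load disappears, which is exactly the pay-for-what-you-use behaviour described above.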
Horizontal versus Vertical Scalability in the Cloud (on AWS)
Using CloudWatch (another AWS offering), it is possible to monitor CPU usage, memory utilization (and available memory) and disk space, among other attributes. CPU metrics are published for EC2 by default; memory and disk metrics have to be published from the instance, typically by the CloudWatch agent. Based on the average values for these across the group, you configure when your Auto Scaling Group adds or removes instances. For example, if average CPU utilization exceeds 50%, it may be wise to spin up an extra VM. Or if available memory is down to 20% of total memory, it may be time to spin up additional VMs. Keep in mind that if the underlying resource issue is in your application (e.g. poor coding leading to high memory usage), the issue will exist across all the instances in your Auto Scaling Group.
Still, an ASG will help you avoid any downtime – and stay up and running – while you figure out what is causing the high memory usage.
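One way to express the 50% CPU rule is a target tracking scaling policy, sketched below. Note that target tracking is a slightly different technique from a plain "add one when above 50%" alarm: the ASG adds and removes instances to keep the group's average CPU near the target. The group and policy names are hypothetical, and the sketch only builds the arguments for boto3's put_scaling_policy call, so it runs without AWS credentials.

```python
def build_cpu_target_policy(asg_name, target_cpu=50.0):
    """Return keyword arguments for boto3's put_scaling_policy."""
    if not 0 < target_cpu <= 100:
        raise ValueError("target_cpu must be a percentage in (0, 100]")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # illustrative name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            # Predefined metric: average CPU across the whole group.
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu),
        },
    }


policy = build_cpu_target_policy("web-asg", target_cpu=50)
print(policy)

# With credentials configured, the policy could be attached like this:
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```

A memory-based rule would need a custom metric instead of the predefined CPU one, since memory is not reported by default.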
An Auto Scaling Group is an example of HORIZONTAL scaling: you scale your application across multiple instances, adding them when demand rises and removing them when it falls. That automatic growing and shrinking is what is usually meant by ELASTIC computing. VERTICAL scaling works on a SINGLE instance instead (e.g. by increasing its provisioned memory, CPU etc.). On EC2 that means changing the instance type, which generally requires stopping the instance first.
Vertical scaling is available for any instance you create in AWS. Which means you can combine both VERTICAL scaling (a bigger instance type) and HORIZONTAL scaling (Auto Scaling Groups) to get the best of both worlds.
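Resizing a single instance (the vertical scaling described above) can be sketched as a stop, modify, start sequence. The function below takes the EC2 client as an argument, so the same code works with a real boto3.client("ec2") or, as here, with a small stand-in class that only records calls. The instance ID and target type are placeholders, and the sketch assumes an EBS-backed instance that can tolerate a restart.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an instance, change its type, and start it again."""
    ec2.stop_instances(InstanceIds=[instance_id])
    # The type can only be changed once the instance is fully stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )
    ec2.start_instances(InstanceIds=[instance_id])


class RecordingClient:
    """Hypothetical stand-in for boto3.client("ec2"); it only logs calls."""

    def __init__(self):
        self.calls = []

    def stop_instances(self, **kwargs):
        self.calls.append(("stop_instances", kwargs))

    def get_waiter(self, name):
        self.calls.append(("get_waiter", name))
        return self  # reuse this object as the waiter

    def wait(self, **kwargs):
        self.calls.append(("wait", kwargs))

    def modify_instance_attribute(self, **kwargs):
        self.calls.append(("modify_instance_attribute", kwargs))

    def start_instances(self, **kwargs):
        self.calls.append(("start_instances", kwargs))


fake = RecordingClient()
resize_instance(fake, "i-0123456789abcdef0", "t3.large")
print([name for name, _ in fake.calls])
```

For instances that belong to an ASG, you would normally change the instance type in the launch template instead, because the group may treat a stopped instance as unhealthy and replace it.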
Summary
Netflix, one of the highest-traffic media services in the world, runs on AWS and relies on the kinds of constructs described in this post. So do parts of the U.S. Federal Government. If done right, not only do you get seamlessly scalable resources, you can get them at a lower cost than with traditional hosting, because you pay only for the capacity you actually run.