High Availability with Auto Scaling and Load Balancing

Overview

Highly available systems are reliable: they continue to function even if essential components fail. Auto Scaling and Elastic Load Balancing are AWS features that can be used separately or together to provide elasticity and high availability. Auto Scaling automates launching and terminating EC2 instances based on the traffic your application receives. It helps ensure that the required number of instances is available to maintain steady, predictable performance at the lowest possible cost, which makes the application both elastic and scalable.
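
To make the idea of scaling between bounds concrete, here is a minimal sketch of a proportional scaling rule. It is a simplified model, not the algorithm AWS uses internally. The 50% CPU target and the function name `desired_capacity` are assumptions for illustration; the bounds of 2 and 4 instances mirror the capacities used later in this walkthrough.

```python
import math


def desired_capacity(current: int, cpu_percent: float,
                     target_percent: float = 50.0,
                     minimum: int = 2, maximum: int = 4) -> int:
    """Return an instance count that would bring average CPU near the target.

    Simplified proportional model (an assumption, not the AWS algorithm):
    scale the current count by observed/target load, then clamp the
    result to the group's minimum and maximum capacity.
    """
    if current <= 0:
        return minimum
    proposed = math.ceil(current * cpu_percent / target_percent)
    return max(minimum, min(maximum, proposed))


if __name__ == "__main__":
    print(desired_capacity(current=2, cpu_percent=100.0))  # scale out under load
    print(desired_capacity(current=4, cpu_percent=10.0))   # scale in when idle
```

Clamping is the important part: however high the load climbs, the group never exceeds its maximum, and however idle it gets, it never drops below its minimum.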

Elastic Load Balancing automatically distributes incoming application traffic across multiple targets such as EC2 instances, containers, and IP addresses. An Elastic Load Balancer can handle the varying load of your application traffic in a single Availability Zone or across multiple Availability Zones. In this project, I used an Application Load Balancer, which is best suited for load balancing HTTP and HTTPS traffic.
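
As a rough illustration of how traffic ends up spread across Availability Zones, the sketch below routes requests round robin over two targets. The instance IDs are hypothetical placeholders, and the code models only the rotation itself, not health checks or any other load balancer behavior.

```python
from collections import Counter
from itertools import cycle

# Hypothetical targets: one web server in each Availability Zone.
TARGETS = [("i-aaaa", "us-east-1a"), ("i-cccc", "us-east-1c")]


def route(num_requests, targets=TARGETS):
    """Assign each request to the next target in turn (round robin)."""
    rotation = cycle(targets)
    return [next(rotation) for _ in range(num_requests)]


def requests_per_zone(assignments):
    """Count how many requests landed in each Availability Zone."""
    return Counter(zone for _instance, zone in assignments)


if __name__ == "__main__":
    print(requests_per_zone(route(10)))
```

With an even number of requests and two healthy targets, each zone receives the same share, which is what refreshing the browser against the load balancer should show in practice.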

Architecture
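
The network layout used in the steps below is one VPC spanning two Availability Zones, each with one public and one private subnet. The following sketch shows one way such an address space could be divided. The VPC CIDR `10.0.0.0/16` and the `/24` subnet size are assumptions, since the walkthrough does not state the ranges it used.

```python
import ipaddress


def plan_subnets(vpc_cidr="10.0.0.0/16",
                 zones=("us-east-1a", "us-east-1c"),
                 new_prefix=24):
    """Carve one public and one private subnet per zone out of the VPC range.

    The CIDR values are illustrative assumptions, not the project's real ranges.
    """
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # lazily yields /24 blocks
    plan = []
    for zone in zones:
        for tier in ("public", "private"):
            plan.append({"zone": zone, "tier": tier, "cidr": str(next(blocks))})
    return plan


if __name__ == "__main__":
    for subnet in plan_subnets():
        print(subnet["zone"], subnet["tier"], subnet["cidr"])
```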

Steps to create a highly available architecture with auto scaling and load balancing

  1. Create a VPC in the us-east-1 region. To design for high availability, configure the VPC with subnets in two Availability Zones, us-east-1a and us-east-1c, so that each Availability Zone has one public subnet and one private subnet

  1. Create a NAT gateway in one of the Availability Zones so that instances in the private subnets can initiate connections to servers outside the VPC, while services outside the VPC cannot initiate connections to those instances

  1. Launch an EC2 instance in public subnet 1 using the Amazon Linux AMI and the t2.micro instance type, with a security group that allows SSH and HTTP traffic from any source. In the advanced settings, set the user data to an initialization script that updates existing packages, installs a LAMP stack, the web pages, and the AWS SDK, and starts the web server

  1. After the instance is launched, the web page should be reachable at the instance's public IP address

  1. Create an IAM instance profile for Systems Manager by attaching the AmazonSSMManagedInstanceCore policy to a role, and assign it to the EC2 instance. The web server will later use this IAM role to retrieve the credentials it needs to connect to RDS

  1. Create an Amazon Machine Image (AMI) from the EC2 instance created earlier. Once the image is available, the original EC2 instance is no longer needed and can be terminated. The new AMI will be used when configuring the Auto Scaling Group to launch instances automatically

  1. Create a target group, then configure and set up the Application Load Balancer (ALB) to load balance HTTP requests

  1. Now that the ALB is created, the instances must be placed behind it. To launch EC2 instances through an Auto Scaling Group, a launch template must be created first

  1. Create a security group for the Auto Scaling Group instances and add an inbound rule whose source is the ALB security group created earlier. This configuration allows HTTP traffic only from the ALB. Then create a launch template that uses the AMI created earlier and this security group

  1. Create an Auto Scaling Group with the launch template created earlier. Set the desired capacity and minimum capacity to 2 instances and the maximum capacity to 4 instances

  1. With the Auto Scaling Group configured, the desired number of EC2 instances is launched automatically

  1. Verify the load balancer by pasting its DNS name into a web browser and refreshing the page several times. The responses should alternate between instances in both Availability Zones, us-east-1a and us-east-1c, showing that traffic is distributed across them

  1. Run a load test against the web servers to drive CPU utilization up to 100%. The Auto Scaling Group then launches 2 more instances, reaching the maximum capacity of 4, and terminates 2 instances once the load decreases

  1. Create a security group for the RDS instance with an inbound rule whose source is the security group of the Auto Scaling Group, so that only the web servers in the Auto Scaling Group can access the RDS instance

  1. Create an RDS instance with Amazon Aurora (MySQL-compatible edition) as the database engine. Specify the database identifier and administrator credentials, and choose db.r5.large as the instance class. Select the VPC created for this project and the security group created for the RDS instance

  1. Store the RDS credentials in AWS Secrets Manager. The web server created initially contains sample code that keeps a simple address book in RDS, so it needs these credentials. The secret contains the database connection information, and the web server will later be given permission to retrieve it

  1. Create a policy that allows the web server to read the secret, and attach this policy to the role that was created earlier and assigned to the web server (SSM-InstanceProfile)

  1. Now that the web server has all the necessary permissions, opening the DNS name of the load balancer in a browser should show the address book backed by the RDS database, where contacts can be added and edited
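
The policy that lets the web server read the secret can be sketched as a small IAM policy document. This is a minimal example under stated assumptions, not the exact policy used in this project: the account ID and secret name in the ARN are hypothetical placeholders, and the only permission granted is `secretsmanager:GetSecretValue` on that single secret.

```python
import json


def secret_read_policy(secret_arn):
    """Build an IAM policy document granting read access to one secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": secret_arn,
            }
        ],
    }


# Hypothetical ARN for illustration; replace with the ARN of the real secret.
EXAMPLE_ARN = "arn:aws:secretsmanager:us-east-1:123456789012:secret:example-db-secret"

if __name__ == "__main__":
    print(json.dumps(secret_read_policy(EXAMPLE_ARN), indent=2))
```

Scoping `Resource` to one secret rather than `*` keeps the web server's role limited to the database credentials it actually needs.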

With this, we have built a web service designed for high availability across two Availability Zones.