Scalability and High Availability
- Scalability is the ability of a system to handle a growing workload by adding resources. There are two kinds of scalability:
- Vertical Scaling
- Horizontal Scaling (= elasticity)
- Vertical Scaling
- Vertical scaling refers to adding more resources (CPU/RAM/DISK) to your server
- For example, your application runs on a t2.micro. Scaling that application vertically means changing it to a t2.large.
- RDS, ElastiCache are services that can scale vertically
- Horizontal Scaling
- Horizontal scaling, commonly referred to as scale-out, is the capability to automatically add systems/instances in a distributed manner in order to handle an increase in load
- High Availability
- High Availability (HA) describes a system that keeps operating continuously, with minimal downtime, even when individual components fail.
- In AWS, high availability means running your application in at least 2 data centers (Availability Zones)
- The goal of high availability is to survive a data center loss
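As a rough illustration of horizontal scaling combined with the "at least 2 data centers" rule above, the sketch below estimates how many identical instances to run. The function name, the request-rate figures, and the even spread across zones are assumptions made for this example; it does not reserve extra headroom for losing a zone.

```python
import math


def instances_needed(expected_rps: int, rps_per_instance: int, zones: int = 2) -> dict:
    """Return how many identical instances to run per zone and in total.

    All parameters are hypothetical planning inputs, not AWS settings.
    """
    if expected_rps < 0 or rps_per_instance <= 0 or zones < 1:
        raise ValueError("load must be >= 0, capacity and zones must be positive")
    # Scale out: add instances until their combined capacity covers the load.
    total = max(1, math.ceil(expected_rps / rps_per_instance))
    # Spread evenly so every zone runs at least one instance.
    per_zone = math.ceil(total / zones)
    return {"per_zone": per_zone, "total": per_zone * zones}


print(instances_needed(expected_rps=900, rps_per_instance=200, zones=2))
# {'per_zone': 3, 'total': 6}
```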
AWS Elastic Load Balancing
- Elastic Load Balancing distributes your incoming traffic across multiple targets, such as EC2 instances, containers, and IP addresses, in one or more Availability Zones
- It monitors the health of its registered targets, and routes traffic only to the healthy targets.
- Elastic Load Balancing supports the following load balancers:
- Classic Load Balancers
- Application Load Balancers
- Network Load Balancers
- Gateway Load Balancers
Load Balancer Benefits
- A load balancer distributes workloads across multiple compute resources, such as virtual servers.
- Increases the availability and fault tolerance of your applications.
- You can add and remove compute resources from your load balancer as your needs change, without disrupting the overall flow of requests to your applications.
- You can configure health checks, which monitor the health of the compute resources
- Enforce stickiness with cookies
- High availability across zones
- Separate public traffic from private traffic
ELB Listener
- A listener is a process that checks for connection requests.
- It is configured with a protocol and a port for front-end connections, and a protocol and a port for back-end connections.
- Elastic Load Balancing supports the following protocols:
- HTTP
- HTTPS (secure HTTP)
- TCP
- SSL (secure TCP)
- The HTTPS protocol uses the SSL protocol to establish secure connections over the HTTP layer.
- The SSL protocol can also be used to establish secure connections over the TCP layer.
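A listener as described above can be modelled as a pair of (protocol, port) settings. The following is a minimal sketch, not the ELB API: the `Listener` class and `validate` function are hypothetical names, and the same-layer check (HTTP/HTTPS pair only with HTTP/HTTPS, TCP/SSL only with TCP/SSL) reflects the Classic Load Balancer listener documentation rather than the notes above.

```python
from dataclasses import dataclass

APPLICATION_LAYER = {"HTTP", "HTTPS"}
TRANSPORT_LAYER = {"TCP", "SSL"}


@dataclass(frozen=True)
class Listener:
    front_protocol: str  # client -> load balancer
    front_port: int
    back_protocol: str   # load balancer -> instance
    back_port: int


def validate(listener: Listener) -> None:
    """Raise ValueError if the listener settings are inconsistent."""
    protocols = {listener.front_protocol, listener.back_protocol}
    if not protocols <= APPLICATION_LAYER | TRANSPORT_LAYER:
        raise ValueError("protocol must be HTTP, HTTPS, TCP or SSL")
    for port in (listener.front_port, listener.back_port):
        if not 1 <= port <= 65535:
            raise ValueError("port must be between 1 and 65535")
    # Front-end and back-end protocols must sit at the same layer.
    if not (protocols <= APPLICATION_LAYER or protocols <= TRANSPORT_LAYER):
        raise ValueError("front-end and back-end protocols must be at the same layer")


# HTTPS terminated at the load balancer, plain HTTP to the instances.
validate(Listener("HTTPS", 443, "HTTP", 80))
```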
Classic Load Balancer
- A load balancer distributes incoming application traffic across multiple EC2 instances in multiple Availability Zones.
- Load balancer serves as a single point of contact for clients.
- Increases the fault tolerance of your applications.
- Increases the availability of your application
- Elastic Load Balancing detects unhealthy instances and routes traffic only to healthy instances.
- You can add and remove instances from your load balancer as your needs change, without disrupting the overall flow of requests to your application.
- Elastic Load Balancing scales your load balancer as traffic to your application changes over time. Elastic Load Balancing can scale to the vast majority of workloads automatically.
- A listener checks for connection requests from clients, using the protocol and port that you configure, and forwards requests to one or more registered instances using the protocol and port number that you configure.
- You can configure health checks, which are used to monitor the health of the registered instances so that the load balancer only sends requests to the healthy instances.
- By default, the load balancer distributes traffic evenly across the Availability Zones. Enable cross-zone load balancing to distribute traffic evenly across all registered instances.
Classic Load Balancer Types
Internet Facing Load Balancer
- An internet-facing load balancer has a publicly resolvable DNS name
- It can route requests from clients over the internet to the private IP addresses of EC2 instances that are registered with the load balancer.
- You need one “Public” subnet in each AZ where the internet facing ELB will be defined.
Internal Load Balancer
- The nodes of an internal load balancer have only private IP addresses.
- The DNS name of an internal load balancer is publicly resolvable to the private IP addresses of the nodes.
- Internal load balancers can only route requests from clients with access to the VPC for the load balancer.
Internal Load Balancer Use Case
- If your application has multiple tiers, for example web servers that must be connected to the internet and application servers, you can design an architecture that uses both internal and internet-facing load balancers.
- Create an internet-facing load balancer and register the web servers with it.
- Create an internal load balancer and register the application servers with it.
- The web servers receive requests from the internet-facing load balancer and send requests for the application servers to the internal load balancer.
- The application servers receive requests from the internal load balancer.
Listener Configurations
- https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/using-elb-listenerconfig-quickref.html
- Protocol and port supported:
- HTTP
- HTTPS
- TCP
- SSL
Assign Security Groups
- In a VPC, you must ensure that the security groups for your instances allow the load balancer to communicate with your instances on both the listener port and the health check port.
- In a VPC, your security groups and network access control lists (ACL) must allow traffic in both directions on these ports.
- The security group of the load balancer itself must also allow traffic in both directions on the listener and health check ports.
Configure Health Check
- The load balancer monitors the health of its registered instances and ensures that it routes traffic only to healthy instances
- A healthy instance shows as “InService” under the ELB
- When the ELB detects an unhealthy instance, it stops routing traffic to that instance. An unhealthy instance shows as “OutOfService” under the ELB
- When the ELB detects that the EC2 instance is healthy again, it resumes routing traffic to that instance
- Default health check settings
- Ping Protocol: HTTP
- Ping Port: 80
- Ping Path: /index.html
- You can replace any of these with custom values
- Advanced settings (defaults)
- Response Timeout: 5 seconds. For an HTTP check, a registered instance must respond with “200 OK” within this period; otherwise the check counts as failed
- Interval: 30 seconds between health checks
- Unhealthy threshold: 2. The number of consecutive failed health checks that must occur before the instance is declared unhealthy
- Healthy threshold: 10. The number of consecutive successful health checks that must occur before the instance is considered healthy
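The threshold logic above can be sketched as a small state machine. This is an illustrative model only, assuming the default thresholds of 2 and 10; the `HealthTracker` class is a hypothetical name, and the state strings follow the ELB API spelling ("InService", "OutOfService").

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    unhealthy_threshold: int = 2
    healthy_threshold: int = 10
    state: str = "InService"
    streak: int = 0  # consecutive results that contradict the current state

    def record(self, passed: bool) -> str:
        """Feed one health check result and return the resulting state."""
        in_service = self.state == "InService"
        if passed == in_service:
            # Result agrees with the current state: the streak is broken.
            self.streak = 0
            return self.state
        self.streak += 1
        limit = self.unhealthy_threshold if in_service else self.healthy_threshold
        if self.streak >= limit:
            self.state = "OutOfService" if in_service else "InService"
            self.streak = 0
        return self.state


tracker = HealthTracker()
tracker.record(False)         # first failure, still InService
print(tracker.record(False))  # second consecutive failure -> OutOfService
```

With the default interval of 30 seconds, this means an instance needs ten passing checks in a row, roughly five minutes, before it receives traffic again.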
Cross Zone Load Balancing
- With cross-zone load balancing enabled, the CLB distributes traffic evenly across all registered EC2 instances in all enabled AZs.
- This ensures that each registered, healthy instance gets an equal share of traffic from the CLB
- Example with 8 EC2 instances in one AZ and 2 in another:
- Cross-zone load balancing enabled: each of the 10 instances receives about 10% of the traffic
- Cross-zone load balancing disabled: each AZ receives 50% of the traffic, so each of the 8 instances receives about 6.25% and each of the 2 instances about 25%
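The arithmetic behind the 8 and 2 instance example can be checked with a short calculation. This is a sketch that assumes every instance is healthy and that, without cross-zone load balancing, each enabled AZ receives an equal share of traffic; the function and zone names are made up for the example.

```python
from typing import Dict


def traffic_share(instances_per_zone: Dict[str, int], cross_zone: bool) -> Dict[str, float]:
    """Return the fraction of total traffic that ONE instance in each zone receives."""
    zones = {zone: count for zone, count in instances_per_zone.items() if count > 0}
    if not zones:
        raise ValueError("at least one zone must contain an instance")
    if cross_zone:
        # Every instance is treated equally, regardless of its zone.
        total = sum(zones.values())
        return {zone: 1 / total for zone in zones}
    # Traffic is first split evenly per zone, then per instance inside the zone.
    zone_share = 1 / len(zones)
    return {zone: zone_share / count for zone, count in zones.items()}


fleet = {"az-a": 8, "az-b": 2}
print(traffic_share(fleet, cross_zone=True))   # {'az-a': 0.1, 'az-b': 0.1}
print(traffic_share(fleet, cross_zone=False))  # {'az-a': 0.0625, 'az-b': 0.25}
```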
Routing Algorithm
- Uses the round robin routing algorithm for TCP listeners
- Uses the least outstanding requests routing algorithm for HTTP and HTTPS listeners
- Supports the X-Forwarded HTTP request headers:
- X-Forwarded-For
- X-Forwarded-Proto
- X-Forwarded-Port
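The two routing algorithms named above can be contrasted in a few lines. This is a simplified sketch with made-up instance IDs; how the load balancer tracks outstanding requests internally is not described in these notes, so the dictionary of counters is an assumption.

```python
from itertools import cycle
from typing import Dict, Iterable, Iterator


def round_robin(instances: Iterable[str]) -> Iterator[str]:
    """Hand out instances in a fixed repeating order (TCP listeners)."""
    return cycle(list(instances))


def least_outstanding(outstanding: Dict[str, int]) -> str:
    """Pick the instance with the fewest in-flight requests (HTTP/HTTPS listeners)."""
    return min(outstanding, key=outstanding.get)


rr = round_robin(["i-a", "i-b", "i-c"])
print([next(rr) for _ in range(4)])  # ['i-a', 'i-b', 'i-c', 'i-a']

print(least_outstanding({"i-a": 3, "i-b": 1, "i-c": 2}))  # i-b
```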
Configure Sticky Sessions
- By default, a Classic Load Balancer routes each request independently to the registered instance with the smallest load.
- You can use the sticky session feature (also known as session affinity), which enables the load balancer to bind a user’s session to a specific instance.
- This ensures that all requests from the user during the session are sent to the same instance.
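Session affinity can be illustrated with a cookie-to-instance lookup. The sketch below is a conceptual model only: the `StickyRouter` class, the random cookie value, and the server-side table are assumptions for the example and do not describe how ELB encodes its stickiness cookie.

```python
import uuid
from typing import Dict, List, Optional, Tuple


class StickyRouter:
    def __init__(self, instances: List[str]) -> None:
        self.instances = list(instances)
        self.sessions: Dict[str, str] = {}  # cookie value -> instance
        self._next = 0

    def route(self, cookie: Optional[str] = None) -> Tuple[str, str]:
        """Return (instance, cookie); a known cookie keeps its instance."""
        bound = self.sessions.get(cookie) if cookie is not None else None
        if bound in self.instances:
            return bound, cookie
        # No valid session yet: pick an instance and bind a new cookie to it.
        instance = self.instances[self._next % len(self.instances)]
        self._next += 1
        new_cookie = uuid.uuid4().hex
        self.sessions[new_cookie] = instance
        return instance, new_cookie


router = StickyRouter(["i-a", "i-b"])
first_instance, cookie = router.route()       # new session
repeat_instance, _ = router.route(cookie)     # same user returns with the cookie
print(first_instance == repeat_instance)      # True
```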
Connection Draining
- To ensure that a Classic Load Balancer stops sending requests to instances that are de-registering or unhealthy, while keeping the existing connections open, use connection draining.
- This enables the load balancer to complete in-flight requests made to instances that are de-registering or unhealthy.
- When you enable connection draining, you can specify a maximum time for the load balancer to keep connections alive before reporting the instance as de-registered.
- Timeout range: 1 to 3,600 seconds (default: 300)
- When the maximum time limit is reached, the load balancer forcibly closes connections.
- During connection draining, the back-end instance state is “InService: Instance deregistration currently in progress”
- AWS Auto Scaling also honors the connection draining setting for unhealthy instances
- During the connection draining period, the ELB does not send new requests to the de-registering or unhealthy instance
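The draining behaviour can be summarised as two questions: does the instance still get new requests, and when is deregistration complete? The following is a minimal sketch under the timeout range stated above; the `Backend` class and its fields are hypothetical, and time is passed in explicitly as seconds so the example stays deterministic.

```python
from dataclasses import dataclass
from typing import Optional

MIN_TIMEOUT, MAX_TIMEOUT, DEFAULT_TIMEOUT = 1, 3600, 300  # seconds


@dataclass
class Backend:
    in_flight: int = 0                      # requests still being served
    draining_since: Optional[float] = None  # None means not draining
    timeout: int = DEFAULT_TIMEOUT

    def start_draining(self, now: float, timeout: int = DEFAULT_TIMEOUT) -> None:
        if not MIN_TIMEOUT <= timeout <= MAX_TIMEOUT:
            raise ValueError("timeout must be between 1 and 3600 seconds")
        self.draining_since = now
        self.timeout = timeout

    def accepts_new_requests(self) -> bool:
        # A draining instance only finishes what it already has.
        return self.draining_since is None

    def is_deregistered(self, now: float) -> bool:
        if self.draining_since is None:
            return False
        timed_out = now - self.draining_since >= self.timeout
        # Done when nothing is in flight, or when the limit forces a close.
        return self.in_flight == 0 or timed_out


backend = Backend(in_flight=2)
backend.start_draining(now=0.0)
print(backend.accepts_new_requests())  # False
print(backend.is_deregistered(100.0))  # False, two requests still in flight
```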
Idle Connection Timeout
- For each request, the load balancer maintains two connections.
- Front-end connection is between the client and the load balancer.
- Back-end connection is between the load balancer and a registered EC2 instance.
- The load balancer has a configured idle timeout period that applies to its connections.
- If no data has been sent or received by the time that the idle timeout period elapses, the load balancer closes the connection.
- Idle timeout range: 1 to 4,000 seconds
- Default: 60 seconds
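The idle timeout rule amounts to comparing the time since the last data transfer with the configured period. A minimal sketch, assuming the range and default listed above; the `TrackedConnection` class is a hypothetical name and timestamps are plain seconds supplied by the caller.

```python
class TrackedConnection:
    def __init__(self, opened_at: float, idle_timeout: int = 60) -> None:
        if not 1 <= idle_timeout <= 4000:
            raise ValueError("idle timeout must be between 1 and 4000 seconds")
        self.idle_timeout = idle_timeout
        self.last_activity = opened_at

    def on_data(self, now: float) -> None:
        """Any data sent or received resets the idle clock."""
        self.last_activity = now

    def should_close(self, now: float) -> bool:
        return now - self.last_activity >= self.idle_timeout


conn = TrackedConnection(opened_at=0.0)
conn.on_data(now=59.0)           # traffic just before the timeout
print(conn.should_close(100.0))  # False, only 41 seconds idle
print(conn.should_close(119.0))  # True, 60 seconds without data
```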
Monitoring Load Balancer
- CloudWatch Metrics
- Elastic Load Balancing publishes data points to Amazon CloudWatch for your load balancers and your back-end instances.
- Amazon CloudWatch can be used to trigger an SNS notification if a threshold you define is reached
- Elastic Load Balancing Access Logs
- Disabled by default
- You can obtain request information such as requester, time a request was received, the client’s IP address, latencies, request path, and server responses
- You can choose to store the access logs in an S3 bucket that you specify
- CloudTrail logs
- You can use it to capture all API calls for your ELB
- You can store these logs in an S3 bucket that you specify
Quotas for your Classic Load Balancer
- Load balancers per Region: 20
- Listeners per load balancer: 100
- Security groups per load balancer: 5
- Registered instances per load balancer: 1,000
- Subnets per Availability Zone per load balancer: 1
Application Load Balancer
- Automatically distributes your incoming traffic across multiple targets, such as EC2 instances, Containers, Lambda Functions, and IP addresses, in one or more Availability Zones.
- It monitors the health of its registered targets, and routes traffic only to the healthy targets.
- Scales your load balancer as your incoming traffic changes over time. It can automatically scale to the vast majority of workloads.
- Cross zone load balancing is enabled by default
- Supports enhanced health checks and enhanced CloudWatch metrics
- ALB provides additional information in Access Logs compared to CLB
- Internet facing ALB supports IPv4 and DualStack
- Internal ALB supports IPv4 only (no dual-stack support at the time of writing)
Application Load Balancer Components
- Listeners
- Target Groups
- Targets
- Rules (Condition, Action, Priority)
ALB Listeners
- A listener is a process that checks for connection requests, using the configured protocol and port.
- The rules that you define for a listener determine how the load balancer routes requests to its registered targets.
- Protocols: HTTP, HTTPS
- Ports: 1-65535
- WebSockets protocol support
- Application Load Balancers provide native support for WebSockets.
- WebSockets allow full-duplex communication
- Supported on both HTTP and HTTPS listeners.
- Enabled by default
- HTTP/2 support
- Allows multiple requests to be sent in parallel over a single connection
ALB Listener Rules
- Each listener has a default rule
- You can optionally define additional rules.
- Each rule consists of:
- A priority
- One or more actions
- One or more conditions
- Rules are evaluated in priority order, from the lowest value to the highest value.
- When the conditions of a rule are met, its actions are performed, for example forwarding the traffic to a target group
- The default rule is evaluated last.
- You can delete the non-default rules for a listener at any time.
- You cannot delete the default rule for a listener
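The evaluation order described above (lowest priority value first, default rule last) can be sketched as follows. The `Rule` class, the request dictionary, and the target group names are hypothetical; real ALB rules support several action types, whereas this example only models forwarding.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    priority: int
    condition: Callable[[dict], bool]  # returns True when the request matches
    target_group: str                  # where a matching request is forwarded


def choose_target_group(request: dict, rules: List[Rule], default_target_group: str) -> str:
    # Lowest priority value is evaluated first.
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.condition(request):
            return rule.target_group
    # The default rule is evaluated last and always applies.
    return default_target_group


rules = [
    Rule(20, lambda req: req["path"].startswith("/groups"), "groups-tg"),
    Rule(10, lambda req: req["path"].startswith("/users"), "users-tg"),
]
print(choose_target_group({"path": "/users/42"}, rules, "default-tg"))  # users-tg
print(choose_target_group({"path": "/about"}, rules, "default-tg"))     # default-tg
```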
ALB Target Groups
- A target group is used to route requests to one or more registered targets.
- When a listener rule is created, a target group and conditions are specified.
- When a rule condition is met, traffic is forwarded to the corresponding target group.
- You can create different target groups for different types of requests.
- Health check settings are defined on a per-target-group basis.
- The load balancer continually monitors the health of all targets registered with the target group
- The ALB routes requests to its targets using the protocol and port number that is specified in the target group
- Target type
- Instance – The targets are specified by instance ID.
- IP – The targets are IP addresses.
- Lambda – The target is a Lambda function
- Protocols
- HTTP
- HTTPS
- Ports
- 1 ~ 65535
ALB Routing
- Listener rules can route requests to different target groups:
- Routing based on path in URL
- Ex: abc.com/users, abc.com/groups/main
- Routing based on hostname in URL
- Ex: app.abc.com, web.abc.com
- Routing based on query strings and headers
- Ex: abc.com/users?id=ID0089
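To make the host, path, and query string conditions concrete, the sketch below checks a URL against each kind of condition using only the standard library. The `matches` function is a hypothetical helper, and `fnmatchcase` only approximates ALB wildcard patterns (`*` and `?`); it is not the matching code ALB itself uses.

```python
from fnmatch import fnmatchcase
from typing import Dict, Optional
from urllib.parse import parse_qs, urlsplit


def matches(url: str,
            host: Optional[str] = None,
            path: Optional[str] = None,
            query: Optional[Dict[str, str]] = None) -> bool:
    """Return True when the URL satisfies every condition that was given."""
    parts = urlsplit(url)
    if host is not None and not fnmatchcase(parts.hostname or "", host):
        return False
    if path is not None and not fnmatchcase(parts.path, path):
        return False
    if query:
        params = parse_qs(parts.query)  # e.g. {'id': ['ID0089']}
        for key, value in query.items():
            if value not in params.get(key, []):
                return False
    return True


print(matches("http://abc.com/groups/main", path="/groups/*"))       # True
print(matches("http://app.abc.com/", host="app.abc.com"))            # True
print(matches("http://abc.com/users?id=ID0089", query={"id": "X"}))  # False
```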
ALB – Monitoring
- CloudWatch Metrics
- Published every 1 minute if there are requests flowing through the ALB
- Access logs
- You can use access logs to capture detailed information about the requests made to your load balancer and store them as log files in Amazon S3.
- You can use these access logs to analyze traffic patterns and to troubleshoot issues with your targets.
- CloudTrail logs
- You can use AWS CloudTrail to capture detailed information about the calls made to the ELB API and store them as log files in Amazon S3.
- You can use these CloudTrail logs to determine which calls were made, the source IP address where the call came from, who made the call, when the call was made, and so on.
Network Load Balancer
- A Network Load Balancer functions at the 4th layer (Transport Layer) of the Open Systems Interconnection (OSI) model.
- It can handle millions of requests per second, typically with lower latency than an ALB
- After the load balancer receives a connection request, it selects a target from the target group for the default rule.
- It attempts to open a TCP connection to the selected target on the port specified in the listener configuration.
- Elastic Load Balancing creates a network interface for each Availability Zone you enable.
- Each load balancer node in the Availability Zone uses this network interface to get a static IP address.
- When you create an Internet-facing load balancer, you can optionally associate one Elastic IP address per subnet.
Benefits of NLB over CLB
- Ability to handle volatile workloads and scale to millions of requests per second.
- Support for static IP addresses for the load balancer. You can also assign one Elastic IP address per subnet enabled for the load balancer.
- Support for registering targets by IP address, including targets outside the VPC for the load balancer.
- Support for routing requests to multiple applications on a single EC2 instance. You can register each instance or IP address with the same target group using multiple ports.
- Support for containerized applications.
- Support for monitoring the health of each service independently
How NLB Works
- Your client makes a request to your application.
- The load balancer receives the request either directly or through an endpoint for private connectivity
- The listeners in your load balancer receive requests of matching protocol and port, and route these requests based on the default action that you specify. You can use a TLS listener to offload the work of encryption and decryption to your load balancer.
- Healthy targets in one or more target groups receive traffic according to the flow hash algorithm.
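The idea behind flow hashing is that all packets of one connection map to the same target. The sketch below hashes a flow tuple (protocol, source IP and port, destination IP and port, TCP sequence number) and uses the result to index the list of healthy targets. SHA-256 and the modulo step are assumptions chosen for illustration; the hash function NLB actually uses is not stated in these notes, and the addresses are made up.

```python
import hashlib
from typing import List, Tuple

Flow = Tuple[str, str, int, str, int, int]


def pick_target(flow: Flow, healthy_targets: List[str]) -> str:
    """Map a flow tuple deterministically onto one healthy target."""
    if not healthy_targets:
        raise ValueError("no healthy targets available")
    digest = hashlib.sha256(repr(flow).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(healthy_targets)
    return healthy_targets[index]


targets = ["10.0.1.10", "10.0.2.10", "10.0.3.10"]
flow = ("tcp", "203.0.113.10", 51514, "10.0.0.5", 443, 1000)

# The same flow always lands on the same target.
print(pick_target(flow, targets) == pick_target(flow, targets))  # True
```

A new connection from the same client uses a different source port and sequence number, so it may be routed to a different target.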
Monitoring Network Load Balancers
- CloudWatch metrics
- Amazon CloudWatch can be used to retrieve statistics about data points (metrics) for your load balancers and targets.
- VPC Flow Logs
- VPC Flow Logs can be used to capture detailed information about the traffic going to and from your Network Load Balancer.
- Create a flow log for each network interface for your load balancer. There is one network interface per load balancer subnet.
- Access logs
- You can use access logs to capture detailed information about TLS requests made to your load balancer. The log files are stored in Amazon S3.
- CloudTrail logs
- AWS CloudTrail can be used to capture detailed information about the calls made to the Elastic Load Balancing API and store them as log files in Amazon S3.
- You can use these CloudTrail logs to determine which calls were made, the source IP address where the call came from, who made the call, when the call was made, and so on.