NLB AZ down

The NLB (Network Load Balancer) AZ (Availability Zone) down fault makes one or more AZs of a target network load balancer unavailable, which can disrupt service delivery. The fault restricts access to the specified availability zones by blocking traffic at the subnet ACL (Access Control List) for a defined duration. Simulating this scenario lets you assess the resilience and performance of your system when an AZ becomes inaccessible.

Use cases

  • Evaluate how the application behaves, and how well it recovers, when traffic from a particular AZ is blocked.
  • Test the application by deliberately blocking incoming and outgoing traffic for a specific AZ on the network load balancer, so that this traffic cannot reach the application through the load balancer.

Prerequisites

  • Kubernetes >= 1.17
  • ECS cluster running with the desired tasks and containers and familiarity with ECS service update and deployment concepts.
  • A Kubernetes secret in the CHAOS_NAMESPACE that holds the AWS access configuration (access key). Below is a sample secret file:
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  cloud_config.yml: |-
    # Add the cloud AWS credentials respectively
    [default]
    aws_access_key_id = XXXXXXXXXXXXXXXXXXX
    aws_secret_access_key = XXXXXXXXXXXXXXX
Tip

It is recommended to use the same secret name, that is, cloud-secret. Otherwise, you will need to update the AWS_SHARED_CREDENTIALS_FILE environment variable in the fault template and you may be unable to use the default health check probes.
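
If you do use a different secret name, the override mentioned in the tip might look like the following env entry in the fault template. This is a minimal sketch: the path shown is a hypothetical placeholder, not a documented default, and must be replaced with the location where your secret's cloud_config.yml is actually mounted.

```yaml
# Sketch only: env entry for the fault template when the secret is not named cloud-secret
env:
  # Hypothetical path; replace it with the mount location of your own secret
  - name: AWS_SHARED_CREDENTIALS_FILE
    value: '/path/to/your/cloud_config.yml'
```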

Permissions required

Here is an example AWS policy to execute the fault.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:DescribeLoadBalancers",
        "ec2:DescribeSubnets",
        "ec2:CreateNetworkAcl",
        "ec2:CreateNetworkAclEntry",
        "ec2:DescribeNetworkAcls",
        "ec2:ReplaceNetworkAclAssociation",
        "ec2:DeleteNetworkAcl"
      ],
      "Resource": "*"
    }
  ]
}

Fault tunables

Mandatory tunables

| Tunable | Description | Notes |
| --- | --- | --- |
| LOAD_BALANCER_ARN | ARN of the target load balancer whose AZ should be detached | For example, arn:aws:elasticloadbalancing:us-east-1:11111111111:loadbalancer/net/test-nlb/09121290906ffab7. |
| ZONES | Target zones that should be detached from the NLB | For example, us-east-1a. |
| REGION | Region name of the target load balancer | For example, us-east-1. |

Optional tunables

| Tunable | Description | Notes |
| --- | --- | --- |
| TOTAL_CHAOS_DURATION | Duration to insert chaos (in seconds) | Default: 30 s. For more information, go to duration of the chaos. |
| CHAOS_INTERVAL | Interval between successive chaos iterations (in seconds) | Default: 30 s. For more information, go to chaos interval. |
| SEQUENCE | Sequence of chaos execution for multiple zones | Default: parallel. Supports serial and parallel. For more information, go to sequence of chaos execution. |
| RAMP_TIME | Duration to wait before and after injecting chaos (in seconds) | For example, 30 s. For more information, go to ramp time. |
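
The optional tunables are set as env entries on the fault, in the same way as the mandatory ones. The sketch below assumes the ChaosEngine layout used elsewhere on this page. The values are illustrative choices rather than recommendations, and the mandatory tunables are left out to keep the example short.

```yaml
# Sketch only: optional tunables for nlb-az-down (values are illustrative)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: nlb-az-down
      spec:
        components:
          env:
            # LOAD_BALANCER_ARN, ZONES, and REGION must also be set here
            # total time to inject chaos, in seconds
            - name: TOTAL_CHAOS_DURATION
              value: '60'
            # interval between chaos iterations, in seconds
            - name: CHAOS_INTERVAL
              value: '30'
            # run chaos on multiple zones one at a time instead of in parallel
            - name: SEQUENCE
              value: 'serial'
            # wait time before and after injecting chaos, in seconds
            - name: RAMP_TIME
              value: '30'
```

Env values are quoted because Kubernetes expects environment variable values to be strings.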

Target zones

Comma-separated list of the availability zones to target, for example us-east-1a,us-east-1b. Tune it by using the ZONES environment variable.

The following YAML snippet illustrates the use of this environment variable:

# contains nlb az down for given zones
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: nlb-az-down
      spec:
        components:
          env:
            # load balancer arn for chaos
            - name: LOAD_BALANCER_ARN
              value: 'arn:aws:elasticloadbalancing:us-east-1:11111111111:loadbalancer/net/test-nlb/09121290906ffab7'
            # target zones for the chaos
            - name: ZONES
              value: 'us-east-1a,us-east-1b'
            # region for chaos
            - name: REGION
              value: 'us-east-1'