RDS instance reboot

Introduction

RDS instance reboot reboots one or more RDS instances. The fault targets the instances you specify by identifier, or derives the instances under chaos from an RDS cluster.

Use cases

RDS instance reboot determines the resilience of an application when an RDS instance, specified directly or derived from an RDS cluster, is rebooted.

Prerequisites

Kubernetes version 1.17 or later is required to execute this fault.
You need AWS access to reboot RDS instances.
The RDS instance must be in a healthy state.

A Kubernetes secret in the CHAOS_NAMESPACE must hold the AWS access configuration (key). A sample secret file looks like:

apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  cloud_config.yml: |-
    # Add your AWS credentials below
    [default]
    aws_access_key_id = XXXXXXXXXXXXXXXXXXX
    aws_secret_access_key = XXXXXXXXXXXXXXX

Harness recommends using the same secret name, that is, cloud-secret. Otherwise, you must update the AWS_SHARED_CREDENTIALS_FILE environment variable in the fault template and you won't be able to use the default health check probes.
Go to AWS named profile for chaos to use a different profile for AWS faults.
Go to the superset permission/policy to execute all AWS faults.
Go to the common tunables and AWS-specific tunables to tune the settings shared by all faults and the settings specific to AWS faults.

Below is an example AWS policy to execute the fault.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstanceStatus",
                "ec2:DescribeInstances",
                "rds:DescribeDBClusters",
                "rds:DescribeDBInstances",
                "rds:RebootDBInstance"
            ],
            "Resource": "*"
        }
    ]
}

Fault tunables

Mandatory tunables

Tunable	Description	Notes
CLUSTER_NAME	Name of the target RDS cluster	For example, rds-cluster-1
RDS_INSTANCE_IDENTIFIER	Name of the target RDS instances	For example, rds-cluster-1-instance
REGION	The AWS region of the target RDS cluster	For example, us-east-1

Optional tunables

Tunable	Description	Notes
TOTAL_CHAOS_DURATION	Duration through which chaos is injected into the target resource (in seconds).	Default: 30 s
INSTANCE_AFFECTED_PERC	Percentage of the RDS instances in the RDS cluster to target.	Default: 0 (corresponds to 1 instance). Provide a numeric value only.
SEQUENCE	Sequence of chaos execution for multiple instances.	Default: parallel. Supports serial and parallel.
AWS_SHARED_CREDENTIALS_FILE	Path to the AWS secret credentials file.	Default: `/tmp/cloud_config.yml`
RAMP_TIME	Period to wait before and after injecting chaos (in seconds).	For example, 30 s

RDS cluster name

Name of the target RDS cluster. Tune it by using the CLUSTER_NAME environment variable. If this variable is not provided, the fault selects the instances specified in RDS_INSTANCE_IDENTIFIER.

The following YAML snippet illustrates the use of this environment variable:

# reboot the RDS instances
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: rds-instance-reboot
    spec:
      components:
        env:
        # provide the name of RDS cluster
        - name: CLUSTER_NAME
          value: 'rds-demo-cluster'
        - name: REGION
          value: 'us-east-2'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
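
The optional tunables go in the same env list as CLUSTER_NAME. The following sketch is illustrative rather than a recommended configuration: it reuses the example cluster name from the snippet above, and the values '50' and '30' are assumptions chosen for the example. How a percentage is rounded to a whole number of instances is not described here, so verify the number of instances selected in your environment.

```yaml
# reboot a share of the instances in an RDS cluster (illustrative values)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: rds-instance-reboot
    spec:
      components:
        env:
        - name: CLUSTER_NAME
          value: 'rds-demo-cluster'
        # percentage of the cluster's instances to target (assumed value)
        - name: INSTANCE_AFFECTED_PERC
          value: '50'
        # wait period before and after chaos injection, in seconds (assumed value)
        - name: RAMP_TIME
          value: '30'
        - name: REGION
          value: 'us-east-2'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```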

RDS instance identifier

Identifier (name) of the target RDS instance. To target multiple instances, provide a comma-separated list. Tune it by using the RDS_INSTANCE_IDENTIFIER environment variable.

The following YAML snippet illustrates the use of this environment variable:

# reboot the RDS instances
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: rds-instance-reboot
    spec:
      components:
        env:
        # provide the RDS instance identifier
        - name: RDS_INSTANCE_IDENTIFIER
          value: 'rds-demo-instance-1,rds-demo-instance-2'
        - name: INSTANCE_AFFECTED_PERC
          value: '100'
        - name: REGION
          value: 'us-east-2'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
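
When several instance identifiers are listed, the SEQUENCE tunable from the optional tunables table selects between serial and parallel execution. A minimal sketch, assuming the same two example instances as in the snippet above and choosing serial purely for illustration:

```yaml
# reboot the listed RDS instances in serial order (illustrative values)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: rds-instance-reboot
    spec:
      components:
        env:
        - name: RDS_INSTANCE_IDENTIFIER
          value: 'rds-demo-instance-1,rds-demo-instance-2'
        - name: INSTANCE_AFFECTED_PERC
          value: '100'
        # supported values per the tunables table: serial, parallel (default)
        - name: SEQUENCE
          value: 'serial'
        - name: REGION
          value: 'us-east-2'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```

If SEQUENCE is omitted, the documented default is parallel.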
