Pod network corruption
Introduction
Pod network corruption is a Kubernetes pod-level chaos fault that injects corrupted packets of data into the specified container. This is achieved by starting a traffic control (tc) process with netem rules to add egress packet corruption.
Use cases
Pod network corruption:
- Simulates a degraded network in which a configurable percentage of packets is corrupted between microservices (corrupted packets are typically dropped at the destination).
- Tests the application's resilience to a lossy or flaky network.
Prerequisites
- Kubernetes > 1.16 is required to execute this fault.
- The application pods should be in the running state before and after injecting chaos.
Fault tunables
Optional tunables
Tunable | Description | Notes |
---|---|---|
NETWORK_INTERFACE | Name of the ethernet interface considered to shape the traffic. | Default: eth0. For more information, go to network interface. |
TARGET_CONTAINER | Name of the container subject to network corruption. | Applicable to the containerd and crio runtimes only. With these runtimes, if the value is not provided, the fault injects chaos into the first container of the pod. For more information, go to target specific container. |
NETWORK_PACKET_CORRUPTION_PERCENTAGE | Packet corruption in percentage. | Default: 100. For more information, go to network packet corruption. |
CONTAINER_RUNTIME | Container runtime interface for the cluster. | Default: containerd. Supports docker, containerd and crio. For more information, go to container runtime. |
SOCKET_PATH | Path of the containerd, crio, or docker socket file. | Default: /run/containerd/containerd.sock. For more information, go to socket path. |
TOTAL_CHAOS_DURATION | Duration for which to insert chaos (in seconds). | Default: 60 s. For more information, go to duration of the chaos. |
TARGET_PODS | Comma-separated list of application pod names subject to pod network corruption. | If this value is not provided, the fault selects the target pods randomly based on the provided appLabels. For more information, go to target specific pods. |
DESTINATION_IPS | Comma-separated IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted. | If this value is not provided, the fault induces network chaos for all IPs or destinations. For more information, go to destination IPs. |
DESTINATION_HOSTS | DNS names or FQDNs of the services whose accessibility is impacted. | If this value is not provided, the fault induces network chaos for all IPs and destinations, or for DESTINATION_IPS if that is defined. For more information, go to destination hosts. |
SOURCE_PORTS | Ports of the target application to which access is impacted. | Accepts comma-separated ports. If not provided, the fault induces network chaos for all ports. |
DESTINATION_PORTS | Ports of the destination services, pods, or CIDR blocks (ranges of IPs) to which access is impacted. | Accepts comma-separated ports. If not provided, the fault induces network chaos for all ports. |
PODS_AFFECTED_PERC | Percentage of the total pods to target. Provide numeric values. | Default: 0 (corresponds to 1 replica). For more information, go to pod affected percentage. |
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30 s. For more information, go to ramp time. |
SEQUENCE | Sequence of chaos execution for multiple target pods. | Default: parallel. Supports serial and parallel. For more information, go to sequence of chaos execution. |
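Several tunables in the table (TARGET_CONTAINER, PODS_AFFECTED_PERC, RAMP_TIME, SEQUENCE) have no dedicated example on this page. The following is a minimal sketch of how they might be combined, assuming they are set through the same env list as the other tunables. The container name and the specific values are illustrative assumptions, not recommendations.

```yaml
# sketch: combines tunables that have no dedicated example on this page
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-corruption
    spec:
      components:
        env:
        # container to target inside each pod (hypothetical name)
        - name: TARGET_CONTAINER
          value: 'nginx'
        # percentage of the matching pods to target (illustrative value)
        - name: PODS_AFFECTED_PERC
          value: '50'
        # wait period before and after injecting chaos, in seconds
        - name: RAMP_TIME
          value: '30'
        # inject into the target pods one after another instead of in parallel
        - name: SEQUENCE
          value: 'serial'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```

With SEQUENCE set to serial, the overall run can take longer than TOTAL_CHAOS_DURATION suggests when more than one pod is targeted, so it is worth checking the observed duration on a test cluster first.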
Network packet corruption
Percentage of network packets that the fault corrupts for the target application. Its default value is 100. Tune it by using the NETWORK_PACKET_CORRUPTION_PERCENTAGE environment variable.
The following YAML snippet illustrates the use of this environment variable:
# it injects network-corruption for the egress traffic
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-corruption
    spec:
      components:
        env:
        # network packet corruption percentage
        - name: NETWORK_PACKET_CORRUPTION_PERCENTAGE
          value: '100' # in percentage
        - name: TOTAL_CHAOS_DURATION
          value: '60'
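A value of 100 corrupts every egress packet, which behaves closer to an outage than to a flaky link. To exercise retry and checksum handling under partial corruption, a lower percentage may be more useful. The fragment below is a sketch of only the env entries that would change; the value 25 is an arbitrary illustration.

```yaml
# sketch: env entries for partial corruption (value chosen for illustration)
env:
# corrupt roughly a quarter of the egress packets
- name: NETWORK_PACKET_CORRUPTION_PERCENTAGE
  value: '25'
- name: TOTAL_CHAOS_DURATION
  value: '60'
```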
Destination IPs and destination hosts
IPs and hosts whose traffic is affected by the network fault. If neither is set, the fault applies to all destinations. Tune them by using the DESTINATION_IPS and DESTINATION_HOSTS environment variables, respectively.
- DESTINATION_IPS: Contains the IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted.
- DESTINATION_HOSTS: Contains the DNS names or FQDNs of the services whose accessibility is impacted.
The following YAML snippet illustrates the use of these environment variables:
# it injects the chaos for the egress traffic for specific ips/hosts
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-corruption
    spec:
      components:
        env:
        # supports comma separated destination ips
        - name: DESTINATION_IPS
          value: '8.8.8.8,192.168.5.6'
        # supports comma separated destination hosts
        - name: DESTINATION_HOSTS
          value: 'nginx.default.svc.cluster.local,google.com'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
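The example above lists individual addresses only, while DESTINATION_IPS is also described as accepting CIDR blocks. The fragment below sketches how a single address and a range might be mixed, assuming standard CIDR notation is accepted in the same comma-separated list. The 10.0.2.0/24 range is a made-up illustration.

```yaml
# sketch: env entries mixing a single IP with a CIDR block (addresses are illustrative)
env:
# one address plus a range of IPs in CIDR notation
- name: DESTINATION_IPS
  value: '8.8.8.8,10.0.2.0/24'
- name: TOTAL_CHAOS_DURATION
  value: '60'
```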
Source and destination ports
By default, the network experiments disrupt traffic for all the source and destination ports. Restrict the disruption to specific ports by using the SOURCE_PORTS and DESTINATION_PORTS environment variables.
- SOURCE_PORTS: Contains the ports of the target application to which access is impacted.
- DESTINATION_PORTS: Contains the ports of the destination services, pods, or CIDR blocks (ranges of IPs) to which access is impacted.
Use the following example to tune this:
# it injects the chaos for the egress traffic for specific ports
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-corruption
    spec:
      components:
        env:
        # supports comma separated source ports
        - name: SOURCE_PORTS
          value: '80'
        # supports comma separated destination ports
        - name: DESTINATION_PORTS
          value: '8080,9000'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
Network interface
Name of the ethernet interface considered to shape the traffic. Its default value is eth0. Tune it by using the NETWORK_INTERFACE environment variable.
The following YAML snippet illustrates the use of this environment variable:
# provide the network interface
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-corruption
    spec:
      components:
        env:
        # name of the network interface
        - name: NETWORK_INTERFACE
          value: 'eth0'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
Container runtime and socket path
Use the CONTAINER_RUNTIME and SOCKET_PATH environment variables to set the container runtime and the socket file path, respectively.
- CONTAINER_RUNTIME: Supports the docker, containerd, and crio runtimes. The default value is containerd.
- SOCKET_PATH: Contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For docker, specify the path as /var/run/docker.sock. For crio, specify the path as /var/run/crio/crio.sock.
The following YAML snippet illustrates the use of these environment variables:
## provide the container runtime and socket file path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-corruption
    spec:
      components:
        env:
        # runtime for the container
        # supports docker, containerd, crio
        - name: CONTAINER_RUNTIME
          value: 'containerd'
        # path of the socket file
        - name: SOCKET_PATH
          value: '/run/containerd/containerd.sock'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
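The example above covers the containerd defaults only. For a cluster that runs crio, the two values would change together. The fragment below is a sketch of the env entries, using the crio socket path stated in this section. The actual path can differ between distributions, so verify it on the node before relying on it.

```yaml
# sketch: env entries for a crio cluster (socket path taken from this section)
env:
# runtime for the container
- name: CONTAINER_RUNTIME
  value: 'crio'
# path of the crio socket file; may differ on your nodes
- name: SOCKET_PATH
  value: '/var/run/crio/crio.sock'
- name: TOTAL_CHAOS_DURATION
  value: '60'
```

For docker, the same pattern applies with CONTAINER_RUNTIME set to docker and SOCKET_PATH set to /var/run/docker.sock.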