SLO probe
Service Level Objective (SLO) probes let you validate the error budget of an SLO while the corresponding application is under chaos, and derive a verdict from the percentage change in that error budget. The probe uses the API of the Service Reliability Management (SRM) module to fetch the error budget values for the chaos execution period. The probe succeeds or fails depending on how far the error budget drops, and you set the acceptable percentage drop in the probe configuration.
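As a rough illustration of the percentage drop described above, the change over the chaos window could be written as follows. The exact formula used by the SRM API is not given here, so the symbols B_start and B_end (the error budget at the start and end of the chaos window) and the normalization are assumptions:

```latex
\Delta_{\%} = \frac{B_{\text{start}} - B_{\text{end}}}{B_{\text{start}}} \times 100
```

Under this reading, an error budget that falls from 80 to 76 during the chaos window gives a drop of 5, and the probe passes only if that drop satisfies the comparator set in the probe configuration.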
Defining the probe
You can define probes at the .spec.experiments[].spec.probe path inside the chaos engine.
kind: Workflow
apiVersion: argoproj.io/v1alpha1
spec:
  templates:
    - inputs:
        artifacts:
          - raw:
              data: |
                apiVersion: litmuschaos.io/v1alpha1
                kind: ChaosEngine
                spec:
                  experiments:
                    - spec:
                        probe:
                          ####################################
                          # Probes are defined here
                          ####################################
Schema
Listed below is the probe schema for the SLO probe, with properties shared across all the probes and properties unique to the SLO probe.
Field | Description | Type | Range | Notes |
name | Flag to hold the name of the probe | Mandatory | N/A type: string | The name holds the name of the probe. Set it based on the use case. |
type | Flag to hold the type of the probe | Mandatory | httpProbe, k8sProbe, cmdProbe, promProbe, sloProbe | The type selects the kind of probe. For the SLO probe, set it to sloProbe. The other probe types are httpProbe, k8sProbe, cmdProbe, and promProbe. |
mode | Flag to hold the mode of the probe | Mandatory | SOT, EOT, Edge, Continuous, OnChaos | The mode supports five modes of probes. The SLO probe supports EOT mode, because the SRM API is called after the chaos execution. |
platformEndpoint | Flag to hold the platform endpoint | Mandatory | N/A type: string | The platformEndpoint stores the value of the NG manager platform endpoint. For example: https://app.harness.io/gateway/cv/api |
sloIdentifier | Flag to hold the identifier of the SLO | Mandatory | N/A type: string | The sloIdentifier is the identifier of the SLO for which the error budget is calculated. |
Source Metadata
Field | Description | Type | Range | Notes |
apiTokenSecret | Flag to hold the API token secret | Mandatory | N/A type: string | The apiTokenSecret contains the API token. Add the secret with X-API-KEY as the key, in the same namespace where the experiment is running. |
accountIdentifier | Flag to hold Account ID | Mandatory | N/A type: string | Account ID of the entity |
orgIdentifier | Flag to hold Org ID | Mandatory | N/A type: string | Organization ID of the entity |
projectIdentifier | Flag to hold Project Identifier | Mandatory | N/A type: string | Project ID of the entity |
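The apiTokenSecret requirement above could be met with a Kubernetes Secret along these lines. This is a minimal sketch: the Secret name slo-probe-api-token and the placeholder values are hypothetical, and it assumes that apiTokenSecret refers to the Secret by name, which the table does not state explicitly.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Hypothetical name; assumed to be the value passed as apiTokenSecret
  name: slo-probe-api-token
  # Must be the namespace where the experiment runs
  namespace: "<experiment-namespace>"
type: Opaque
stringData:
  # The key must be X-API-KEY; the value is the API token
  X-API-KEY: "<api-token-value>"
```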
Comparator
Field | Description | Type | Range | Notes |
type | Flag to hold the type of the data used for comparison | Optional | float | The type is the data type of the values compared in the comparison operation. |
criteria | Flag to hold the criteria for the comparison | Mandatory | Supports >=, <=, ==, >, <, !=, oneOf, between for int and float types, and equal, notEqual, contains, matches, notMatches, oneOf for the string type. | The criteria is the condition that must be fulfilled in the comparison operation. |
value | Flag to hold the value for the comparison | Mandatory | N/A type: string | The value is the value used in the comparison operation, evaluated against the given criteria. |
Run properties
Field | Description | Type | Range | Notes |
probeTimeout | Flag to hold the timeout of the probe | Mandatory | N/A type: integer | The probeTimeout is the time limit for the probe to execute the specified check and return the expected data. |
attempt | Flag to hold the attempt count of the probe | Mandatory | N/A type: integer | The attempt contains the number of times a check is re-run after the first attempt fails, before the probe status is declared failed. |
stopOnFailure | Flag to stop or continue the experiment on probe failure | Optional | N/A type: boolean | Set stopOnFailure to true to stop, or false to continue, the experiment execution after the probe fails. |
evaluationTimeout | Flag to hold the total evaluation time for the probe | Optional | N/A type: string | The evaluationTimeout is the time period over which the error budget values are fetched. The percentage change is calculated from these values based on the chaos execution time period. |
Definition
probe:
  - name: "slo-probe"
    type: "sloProbe"
    sloProbe/inputs:
      platformEndpoint: "<platform-endpoint>"
      sloIdentifier: "<slo-identifier>"
      sloSourceMetadata:
        apiTokenSecret: "<api-token>"
        scope:
          accountIdentifier: "<account-identifier>"
          orgIdentifier: "<org-identifier>"
          projectIdentifier: "<project-identifier>"
      comparator:
        type: float
        criteria: <
        value: "0.1"
    mode: "EOT"
    runProperties:
      evaluationTimeout: 5m
      attempt: 2
      probeTimeout: 1000ms
      stopOnFailure: false