CodeQL Scanner Reference
You can scan your code repositories using CodeQL, an analysis engine used by developers to automate security checks, and by security researchers to perform variant analysis.
The following steps outline the basic workflow:
1. Run a CodeQL scan, either externally or as part of a Run step, and publish the results to SARIF.
2. Add the SARIF data to your pipeline. If you ran the scan outside the pipeline, do the following:
   - In the stage where you ingest the results, go to Overview > Shared Paths and create a folder under /shared, such as /shared/customer_artifacts.
   - Use a Run step to add your scan results to the shared folder.
3. Use a CodeQL step to ingest the results.
Note: Currently, the CodeQL scanner template is behind the feature flag STO_STEP_PALETTE_CODEQL. Contact Harness Support to enable this feature. Alternatively, you can use a Custom Ingest step to ingest your data instead.
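As a rough sketch of the externally run case described above, the stage fragment below declares the shared folder and uses a Run step to copy a SARIF file into it. The step name, identifier, and source file path (/harness/codeql-results.sarif) are hypothetical; how the file reaches the pipeline workspace depends on your setup. The remaining keys follow the pipeline example in this topic.

```yaml
# Stage spec fragment (sketch, not a complete pipeline).
spec:
  cloneCodebase: true
  sharedPaths:
    - /shared/customer_artifacts/          # folder created under /shared
  execution:
    steps:
      - step:
          type: Run
          name: add_codeql_results         # hypothetical step name
          identifier: add_codeql_results   # hypothetical identifier
          spec:
            connectorRef: account.harnessImage
            image: ubuntu:20.04
            shell: Sh
            command: |-
              # Assumption: the externally generated SARIF file is already
              # available in the workspace at this hypothetical path.
              cp /harness/codeql-results.sarif /shared/customer_artifacts/
```

A CodeQL step later in the same stage can then read the file from /shared/customer_artifacts/.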
This topic includes an end-to-end YAML pipeline that illustrates this workflow.
Before you begin
Root access requirements
You need to run the scan step with root access if you need to add trusted certificates to your scan images at runtime.
You can set up your STO scan images and pipelines to run scans as non-root and establish trust for your own proxies using self-signed certificates. For more information, go to Configure STO to Download Images from a Private Registry.
CodeQL step configuration
The recommended workflow is to add a CodeQL step to a Security Tests or CI Build stage and then configure it as described below. You can also configure CodeQL scans programmatically by copying, pasting, and editing the YAML definition.
- UI configuration support is currently limited to a subset of scanners. Extending UI support to additional scanners is on the Harness engineering roadmap.
- Each scanner template shows only the options that apply to a specific scan. If you're setting up a repository scan, for example, the UI won't show Container Image settings.
- Docker-in-Docker is not required for these steps unless you're scanning a container image. If you're scanning a repository using Bandit, for example, you don't need to set up a Background step running DinD.
- Support is currently limited to Kubernetes and Harness Cloud AMD64 build infrastructures only.
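The two supported infrastructure types are declared differently in stage YAML. The Kubernetes form appears in the pipeline example in this topic. The Harness Cloud form below is a sketch based on the usual Harness Cloud stage syntax and is an assumption here, so compare it with the YAML that the Harness UI generates for your own stage.

```yaml
# Stage spec fragment (sketch): Harness Cloud build infrastructure, Linux AMD64.
# Assumed syntax; it replaces the "infrastructure:" block used for Kubernetes.
platform:
  os: Linux
  arch: Amd64
runtime:
  type: Cloud
  spec: {}
```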
CodeQL Step Palette
Scan Mode
- Ingestion: Ingestion scans are not orchestrated. The Security step ingests results from a previous scan (such as a scan run in a previous step) and then normalizes and compresses the results.
Scan Configuration
The predefined configuration to use for the scan. All scan steps have at least one configuration.
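In YAML, these two settings correspond to the mode and config keys of the CodeQL step. The fragment below is taken from the pipeline example in this topic:

```yaml
# CodeQL step fragment (not a complete step definition).
spec:
  mode: ingestion   # Scan Mode
  config: default   # Scan Configuration
```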
Target
Type
- Repository: Scan a codebase repo.
Name
The identifier that you want to assign to the target you’re scanning in the pipeline. Use a unique, descriptive name such as codebaseAlpha or jsmith/myalphaservice. Using descriptive target names will make it much easier to navigate your scan data in the STO UI.
Variant
An identifier for a specific variant to scan, such as the branch name or image tag. This identifier is used to differentiate or group results for a target. Harness maintains a historical trend for each variant.
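The target settings map to a target block in the step YAML. The values below come from the pipeline example in this topic; in practice the variant would usually be a branch name or tag rather than the placeholder "test".

```yaml
# CodeQL step fragment: target settings.
target:
  name: codeql-dvpwa   # Name: unique, descriptive identifier for the target
  type: repository     # Type: the only target type listed for this step
  variant: test        # Variant: for example, the branch name
```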
You can see the target name, type, and variant in the Test Targets UI:
Ingestion file
The results data file to use when running an Ingestion scan. STO steps can ingest scan data in SARIF and Harness Custom JSON format. Generally an Ingestion scan consists of a scan step (to generate the data file) and an ingestion step (to ingest the data file).
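In the pipeline example in this topic, the ingestion file is set as follows. The path points into a folder that the stage declares under sharedPaths, so that the step producing the file and the step ingesting it can both see it.

```yaml
# CodeQL step fragment: the SARIF file written by the earlier Run step.
ingestion:
  file: /shared/customer_artifacts/dvpwa-codeql-results.sarif
```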
Log Level, CLI flags, and Fail on Severity
Log Level
The minimum severity of the messages you want to include in your scan logs. You can specify one of the following:
- DEBUG
- INFO
- WARNING
- ERROR
Additional CLI flags
You can use this field to customize the scan with specific command-line arguments supported by the scanner.
Fail on Severity
Every Security step has a Fail on Severity setting. If the scan finds any vulnerability with the specified severity level or higher, the pipeline fails automatically. You can specify one of the following:
- CRITICAL
- HIGH
- MEDIUM
- LOW
- INFO
- NONE (do not fail on severity)
The YAML definition looks like this:
    fail_on_severity: critical # | high | medium | low | info | none
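For context, the pipeline example in this topic places both the log level and the fail-on-severity setting under the advanced block of the CodeQL step:

```yaml
# CodeQL step fragment: advanced settings.
advanced:
  log:
    level: info                # Log Level
  fail_on_severity: critical   # fail the pipeline on CRITICAL findings
```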
Settings
You can add a tool_args setting to run the CodeQL scanner binary with specific command-line arguments. For example, you can skip certain tests using -skip followed by a list of test IDs:
    tool_args = -skip testID_1, testID_3, testID_5
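In YAML, a setting like this would typically be a key-value pair in a settings map on the step. The placement below is an assumption, since the pipeline example in this topic does not include a settings block. The argument string is copied from the example above and has not been checked against the CodeQL CLI.

```yaml
# CodeQL step fragment (sketch): the "settings" placement is assumed.
spec:
  mode: ingestion
  config: default
  settings:
    tool_args: "-skip testID_1, testID_3, testID_5"
```

The value is quoted so that the leading hyphen and the commas are read as part of a single string.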
Additional Configuration
In the Additional Configuration settings, you can use the following options:
Advanced settings
In the Advanced settings, you can use the following options:
CodeQL pipeline example
The following pipeline example is an ingestion workflow. It consists of two steps. A Run step installs CodeQL, scans the repository defined in the Codebase object, and publishes the scan results to a SARIF file. A CodeQL step then ingests the SARIF file.
pipeline:
  projectIdentifier: STO
  orgIdentifier: default
  tags: {}
  properties:
    ci:
      codebase:
        connectorRef: wwdvpwa
        repoName: dvpwa
        build: <+input>
  stages:
    - stage:
        name: codeql
        identifier: codeql
        type: CI
        spec:
          cloneCodebase: true
          execution:
            steps:
              - step:
                  type: Run
                  name: codeql_analyze
                  identifier: codeql_analyze
                  spec:
                    connectorRef: account.harnessImage
                    image: ubuntu:20.04
                    shell: Sh
                    command: |-
                      #!/bin/bash
                      # Change the working directory to the app directory.
                      mkdir /app
                      cd /app
                      # Update and upgrade the apt packages.
                      apt update -y -q
                      apt upgrade -y -q
                      # Install the wget and tar packages and python3.
                      export DEBIAN_FRONTEND="noninteractive"
                      apt install -y -q wget tar python3.9-venv python3.9 build-essential
                      # Download the CodeQL bundle.
                      wget -q https://github.com/github/codeql-action/releases/latest/download/codeql-bundle-linux64.tar.gz
                      # Extract the CodeQL bundle.
                      tar -xvzf ./codeql-bundle-linux64.tar.gz -C /app/
                      # Set the PATH environment variable to include the CodeQL directory.
                      export PATH="${PATH}:/app/codeql"
                      # Resolve the CodeQL packs
                      codeql resolve qlpacks
                      # Move back to the code folder before scanning
                      cd /harness
                      # Create a CodeQL database.
                      codeql database create python_database --language=python
                      # Run the CodeQL analyzer.
                      codeql database analyze python_database --format=sarif-latest --output=/shared/customer_artifacts/dvpwa-codeql-results.sarif
                    imagePullPolicy: Always
                    resources:
                      limits:
                        memory: 2G
                        cpu: 2000m
              - step:
                  type: CodeQL
                  name: CodeQL_ingest
                  identifier: CodeQL_1
                  spec:
                    mode: ingestion
                    config: default
                    target:
                      name: codeql-dvpwa
                      type: repository
                      variant: test
                    advanced:
                      log:
                        level: info
                      fail_on_severity: critical
                    ingestion:
                      file: /shared/customer_artifacts/dvpwa-codeql-results.sarif
          sharedPaths:
            - /shared/customer_artifacts/
          infrastructure:
            type: KubernetesDirect
            spec:
              connectorRef: hyharnesslegate
              namespace: harness-delegate-ng
              automountServiceAccountToken: true
              nodeSelector: {}
              os: Linux
  identifier: CodeQLv3
  name: CodeQL-v3