If you upgrade your GitLab instance while the GitLab Runner is processing jobs, the trace updates will fail. Once GitLab is back online, then the trace updates should self-heal. However, depending on the error, the GitLab Runner will either retry or eventually terminate job handling.
If you upgrade your GitLab instance while the GitLab Runner is processing jobs, the trace updates fail. When GitLab is back online, the trace updates should self-heal. However, depending on the error, the GitLab Runner either retries or eventually terminates job handling.
As for the artifacts, the GitLab Runner will attempt to upload them three times, after which the job will eventually fail.
As for the artifacts, the GitLab Runner attempts to upload them three times, after which the job eventually fails.
To address the above two scenario's, it is advised to do the following prior to upgrading:
To address the above two scenario's, it is advised to do the following prior to upgrading:
...
@@ -206,7 +206,7 @@ upgrade paths.
...
@@ -206,7 +206,7 @@ upgrade paths.
Upgrading the *major* version requires more attention.
Upgrading the *major* version requires more attention.
Backward-incompatible changes and migrations are reserved for major versions.
Backward-incompatible changes and migrations are reserved for major versions.
Follow the directions carefully as we
Follow the directions carefully as we
cannot guarantee that upgrading between major versions will be seamless.
cannot guarantee that upgrading between major versions is seamless.
It is required to follow the following upgrade steps to ensure a successful *major* version upgrade:
It is required to follow the following upgrade steps to ensure a successful *major* version upgrade:
...
@@ -402,7 +402,7 @@ Git 2.31.x and later is required. We recommend you use the
...
@@ -402,7 +402,7 @@ Git 2.31.x and later is required. We recommend you use the
### 13.9.0
### 13.9.0
We've detected an issue [with a column rename](https://gitlab.com/gitlab-org/gitlab/-/issues/324160)
We've detected an issue [with a column rename](https://gitlab.com/gitlab-org/gitlab/-/issues/324160)
that will prevent upgrades to GitLab 13.9.0, 13.9.1, 13.9.2 and 13.9.3 when following the zero-downtime steps. It is necessary
that prevents upgrades to GitLab 13.9.0, 13.9.1, 13.9.2, and 13.9.3 when following the zero-downtime steps. It is necessary
to perform the following additional steps for the zero-downtime upgrade:
to perform the following additional steps for the zero-downtime upgrade:
1. Before running the final `sudo gitlab-rake db:migrate` command on the deploy node,
1. Before running the final `sudo gitlab-rake db:migrate` command on the deploy node,
...
@@ -423,7 +423,7 @@ to perform the following additional steps for the zero-downtime upgrade:
...
@@ -423,7 +423,7 @@ to perform the following additional steps for the zero-downtime upgrade:
```
```
If you have already run the final `sudo gitlab-rake db:migrate` command on the deploy node and have
If you have already run the final `sudo gitlab-rake db:migrate` command on the deploy node and have
encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you will
encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you
@@ -22,7 +22,7 @@ All Geo sites have the following settings:
...
@@ -22,7 +22,7 @@ All Geo sites have the following settings:
| Setting | Description |
| Setting | Description |
| --------| ----------- |
| --------| ----------- |
| Primary | This marks a Geo site as **primary** site. There can be only one **primary** site. |
| Primary | This marks a Geo site as **primary** site. There can be only one **primary** site. |
| Name | The unique identifier for the Geo site. It's highly recommended to use a physical location as a name. Good examples are "London Office" or "us-east-1". Avoid words like "primary", "secondary", "Geo", or "DR". This makes the failover process easier because the physical location does not change, but the Geo site role can. All nodes in a single Geo site use the same site name. Nodes use the `gitlab_rails['geo_node_name']` setting in `/etc/gitlab/gitlab.rb` to lookup their Geo site record in the PostgreSQL database. If `gitlab_rails['geo_node_name']` is not set, then the node's `external_url` with trailing slash is used as fallback. The value of `Name` is case-sensitive, and most characters are allowed. |
| Name | The unique identifier for the Geo site. It's highly recommended to use a physical location as a name. Good examples are "London Office" or "us-east-1". Avoid words like "primary", "secondary", "Geo", or "DR". This makes the failover process easier because the physical location does not change, but the Geo site role can. All nodes in a single Geo site use the same site name. Nodes use the `gitlab_rails['geo_node_name']` setting in `/etc/gitlab/gitlab.rb` to lookup their Geo site record in the PostgreSQL database. If `gitlab_rails['geo_node_name']` is not set, the node's `external_url` with trailing slash is used as fallback. The value of `Name` is case-sensitive, and most characters are allowed. |
| URL | The instance's user-facing URL. |
| URL | The instance's user-facing URL. |
The site you're currently browsing is indicated with a blue `Current` label, and
The site you're currently browsing is indicated with a blue `Current` label, and
...
@@ -35,32 +35,32 @@ the **primary** node is listed first as `Primary site`.
...
@@ -35,32 +35,32 @@ the **primary** node is listed first as `Primary site`.
| Setting | Description |
| Setting | Description |
|---------------------------|-------------|
|---------------------------|-------------|
| Selective synchronization | Enable Geo [selective sync](../../administration/geo/replication/configuration.md#selective-synchronization) for this **secondary** site. |
| Selective synchronization | Enable Geo [selective sync](../../administration/geo/replication/configuration.md#selective-synchronization) for this **secondary** site. |
| Repository sync capacity | Number of concurrent requests this **secondary** site will make to the **primary** site when backfilling repositories. |
| Repository sync capacity | Number of concurrent requests this **secondary** site makes to the **primary** site when backfilling repositories. |
| File sync capacity | Number of concurrent requests this **secondary** site will make to the **primary** site when backfilling files. |
| File sync capacity | Number of concurrent requests this **secondary** site makes to the **primary** site when backfilling files. |
## Geo backfill
## Geo backfill
**Secondary** sites are notified of changes to repositories and files by the **primary** site,
**Secondary** sites are notified of changes to repositories and files by the **primary** site,
and will always attempt to synchronize those changes as quickly as possible.
and always attempt to synchronize those changes as quickly as possible.
Backfill is the act of populating the **secondary** site with repositories and files that
Backfill is the act of populating the **secondary** site with repositories and files that
existed *before* the **secondary** site was added to the database. Since there may be
existed *before* the **secondary** site was added to the database. Because there may be
extremely large numbers of repositories and files, it's infeasible to attempt to
extremely large numbers of repositories and files, it's not feasible to attempt to
download them all at once, so GitLab places an upper limit on the concurrency of
download them all at once; so, GitLab places an upper limit on the concurrency of
these operations.
these operations.
How long the backfill takes is a function of the maximum concurrency, but higher
How long the backfill takes is dependent on the maximum concurrency, but higher
values place more strain on the **primary** site. From [GitLab 10.2](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/3107),
values place more strain on the **primary** site. From [GitLab 10.2](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/3107),
the limits are configurable. If your **primary** site has lots of surplus capacity,
the limits are configurable. If your **primary** site has lots of surplus capacity,
you can increase the values to complete backfill in a shorter time. If it's
you can increase the values to complete backfill in a shorter time. If it's
under heavy load and backfill is reducing its availability for normal requests,
under heavy load and backfill reduces its availability for normal requests,
you can decrease them.
you can decrease them.
## Using a different URL for synchronization
## Using a different URL for synchronization
The **primary** site's Internal URL is used by **secondary** sites to contact it
The **primary** site's Internal URL is used by **secondary** sites to contact it
(to sync repositories, for example). The name Internal URL distinguishes it from
(to sync repositories, for example). The name Internal URL distinguishes it from
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/323423) in GitLab 13.12.
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/323423) in GitLab 13.12.
WARNING:
WARNING:
This product is an early access and is considered a [beta](https://about.gitlab.com/handbook/product/gitlab-the-product/#beta) feature.
This product is in an early-access stage and is considered a [beta](https://about.gitlab.com/handbook/product/gitlab-the-product/#beta) feature.
GitLab DAST's new browser-based crawler is a crawl engine built by GitLab to test Single Page Applications (SPAs) and traditional web applications.
GitLab DAST's new browser-based crawler is a crawl engine built by GitLab to test Single Page Applications (SPAs) and traditional web applications.
Due to the reliance of modern web applications on JavaScript, handling SPAs or applications that are dependent on JavaScript is paramount to ensuring proper coverage of an application for Dynamic Application Security Testing (DAST).
Due to the reliance of modern web applications on JavaScript, handling SPAs or applications that are dependent on JavaScript is paramount to ensuring proper coverage of an application for Dynamic Application Security Testing (DAST).
The browser-based crawler works by loading the target application into a specially-instrumented Chromium browser. A snapshot of the page is taken prior to a search to find any actions that a user might perform,
The browser-based crawler works by loading the target application into a specially-instrumented Chromium browser. A snapshot of the page is taken before a search to find any actions that a user might perform,
such as clicking on a link or filling in a form. For each action found, the crawler will execute it, take a new snapshot and determine what in the page changed from the previous snapshot.
such as clicking on a link or filling in a form. For each action found, the crawler executes it, takes a new snapshot, and determines what in the page changed from the previous snapshot.
Crawling continues by taking more snapshots and finding subsequent actions.
Crawling continues by taking more snapshots and finding subsequent actions.
The benefit of crawling by following user actions in a browser is that the crawler can interact with the target application much like a real user would, identifying complex flows that traditional web crawlers don't understand. This results in better coverage of the website.
The benefit of crawling by following user actions in a browser is that the crawler can interact with the target application much like a real user would, identifying complex flows that traditional web crawlers don't understand. This results in better coverage of the website.
...
@@ -57,17 +57,17 @@ The browser-based crawler can be configured using CI/CD variables.
...
@@ -57,17 +57,17 @@ The browser-based crawler can be configured using CI/CD variables.
| `DAST_BROWSER_IGNORED_HOSTS` | List of strings | `site.com,another.com` | Hostnames included in this variable are accessed but not reported against. |
| `DAST_BROWSER_IGNORED_HOSTS` | List of strings | `site.com,another.com` | Hostnames included in this variable are accessed but not reported against. |
| `DAST_BROWSER_MAX_ACTIONS` | number | `10000` | The maximum number of actions that the crawler performs. For example, clicking a link, or filling a form. |
| `DAST_BROWSER_MAX_ACTIONS` | number | `10000` | The maximum number of actions that the crawler performs. For example, clicking a link, or filling a form. |
| `DAST_BROWSER_MAX_DEPTH` | number | `10` | The maximum number of chained actions that the crawler takes. For example, `Click -> Form Fill -> Click` is a depth of three. |
| `DAST_BROWSER_MAX_DEPTH` | number | `10` | The maximum number of chained actions that the crawler takes. For example, `Click -> Form Fill -> Click` is a depth of three. |
| `DAST_BROWSER_NUMBER_OF_BROWSERS` | number | `3` | The maximum number of concurrent browser instances to use. For shared runners on GitLab.com we recommended a maximum of three. Private runners with more resources may benefit from a higher number, but will likely produce little benefit after five to seven instances. |
| `DAST_BROWSER_NUMBER_OF_BROWSERS` | number | `3` | The maximum number of concurrent browser instances to use. For shared runners on GitLab.com, we recommended a maximum of three. Private runners with more resources may benefit from a higher number, but are likely to produce little benefit after five to seven instances. |
| `DAST_BROWSER_COOKIES` | dictionary | `abtesting_group:3,region:locked` | A cookie name and value to be added to every request. |
| `DAST_BROWSER_COOKIES` | dictionary | `abtesting_group:3,region:locked` | A cookie name and value to be added to every request. |
| `DAST_BROWSER_LOG` | List of strings | `brows:debug,auth:debug` | A list of modules and their intended log level. |
| `DAST_BROWSER_LOG` | List of strings | `brows:debug,auth:debug` | A list of modules and their intended log level. |
| `DAST_BROWSER_NAVIGATION_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `15s` | The maximum amount of time to wait for a browser to navigate from one page to another |
| `DAST_BROWSER_NAVIGATION_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `15s` | The maximum amount of time to wait for a browser to navigate from one page to another. |
| `DAST_BROWSER_ACTION_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `7s` | The maximum amount of time to wait for a browser to complete an action |
| `DAST_BROWSER_ACTION_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `7s` | The maximum amount of time to wait for a browser to complete an action. |
| `DAST_BROWSER_STABILITY_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `7s` | The maximum amount of time to wait for a browser to consider a page loaded and ready for analysis |
| `DAST_BROWSER_STABILITY_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `7s` | The maximum amount of time to wait for a browser to consider a page loaded and ready for analysis. |
| `DAST_BROWSER_NAVIGATION_STABILITY_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `7s` | The maximum amount of time to wait for a browser to consider a page loaded and ready for analysis after a navigation completes |
| `DAST_BROWSER_NAVIGATION_STABILITY_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `7s` | The maximum amount of time to wait for a browser to consider a page loaded and ready for analysis after a navigation completes. |
| `DAST_BROWSER_ACTION_STABILITY_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `800ms` | The maximum amount of time to wait for a browser to consider a page loaded and ready for analysis after completing an action |
| `DAST_BROWSER_ACTION_STABILITY_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `800ms` | The maximum amount of time to wait for a browser to consider a page loaded and ready for analysis after completing an action. |
| `DAST_BROWSER_SEARCH_ELEMENT_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `3s` | The maximum amount of time to allow the browser to search for new elements or navigations |
| `DAST_BROWSER_SEARCH_ELEMENT_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `3s` | The maximum amount of time to allow the browser to search for new elements or navigations. |
| `DAST_BROWSER_EXTRACT_ELEMENT_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `5s` | The maximum amount of time to allow the browser to extract newly found elements or navigations |
| `DAST_BROWSER_EXTRACT_ELEMENT_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `5s` | The maximum amount of time to allow the browser to extract newly found elements or navigations. |
| `DAST_BROWSER_ELEMENT_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `600ms` | The maximum amount of time to wait for an element before determining it is ready for analysis |
| `DAST_BROWSER_ELEMENT_TIMEOUT` | [Duration string](https://golang.org/pkg/time/#ParseDuration) | `600ms` | The maximum amount of time to wait for an element before determining it is ready for analysis. |
The [DAST variables](index.md#available-cicd-variables)`SECURE_ANALYZERS_PREFIX`, `DAST_FULL_SCAN_ENABLED`, `DAST_AUTO_UPDATE_ADDONS`, `DAST_EXCLUDE_RULES`, `DAST_REQUEST_HEADERS`, `DAST_HTML_REPORT`, `DAST_MARKDOWN_REPORT`, `DAST_XML_REPORT`,
The [DAST variables](index.md#available-cicd-variables)`SECURE_ANALYZERS_PREFIX`, `DAST_FULL_SCAN_ENABLED`, `DAST_AUTO_UPDATE_ADDONS`, `DAST_EXCLUDE_RULES`, `DAST_REQUEST_HEADERS`, `DAST_HTML_REPORT`, `DAST_MARKDOWN_REPORT`, `DAST_XML_REPORT`,
@@ -80,27 +80,27 @@ While the browser-based crawler crawls modern web applications efficiently, vuln
...
@@ -80,27 +80,27 @@ While the browser-based crawler crawls modern web applications efficiently, vuln
The crawler runs the target website in a browser with DAST/ZAP configured as the proxy server. This ensures that all requests and responses made by the browser are passively scanned by DAST/ZAP.
The crawler runs the target website in a browser with DAST/ZAP configured as the proxy server. This ensures that all requests and responses made by the browser are passively scanned by DAST/ZAP.
When running a full scan, active vulnerability checks executed by DAST/ZAP do not use a browser. This difference in how vulnerabilities are checked can cause issues that require certain features of the target website to be disabled to ensure the scan works as intended.
When running a full scan, active vulnerability checks executed by DAST/ZAP do not use a browser. This difference in how vulnerabilities are checked can cause issues that require certain features of the target website to be disabled to ensure the scan works as intended.
For example, for a target website that contains forms with Anti-CSRF tokens, a passive scan will scan as intended because the browser displays pages/forms as if a user is viewing the page.
For example, for a target website that contains forms with Anti-CSRF tokens, a passive scan works as intended because the browser displays pages and forms as if a user is viewing the page.
However, active vulnerability checks run in a full scan will not be able to submit forms containing Anti-CSRF tokens. In such cases we recommend you disable Anti-CSRF tokens when running a full scan.
However, active vulnerability checks that run in a full scan cannot submit forms containing Anti-CSRF tokens. In such cases, we recommend you disable Anti-CSRF tokens when running a full scan.
## Managing scan time
## Managing scan time
It is expected that running the browser-based crawler will result in better coverage for many web applications, when compared to the normal GitLab DAST solution.
It is expected that running the browser-based crawler results in better coverage for many web applications, when compared to the normal GitLab DAST solution.
This can come at a cost of increased scan time.
This can come at a cost of increased scan time.
You can manage the trade-off between coverage and scan time with the following measures:
You can manage the trade-off between coverage and scan time with the following measures:
- Limit the number of actions executed by the browser with the [variable](#available-cicd-variables)`DAST_BROWSER_MAX_ACTIONS`. The default is `10,000`.
- Limit the number of actions executed by the browser with the [variable](#available-cicd-variables)`DAST_BROWSER_MAX_ACTIONS`. The default is `10,000`.
- Limit the page depth that the browser-based crawler will check coverage on with the [variable](#available-cicd-variables)`DAST_BROWSER_MAX_DEPTH`. The crawler uses a breadth-first search strategy, so pages with smaller depth are crawled first. The default is `10`.
- Limit the page depth that the browser-based crawler will check coverage on with the [variable](#available-cicd-variables)`DAST_BROWSER_MAX_DEPTH`. The crawler uses a breadth-first search strategy, so pages with smaller depth are crawled first. The default is `10`.
- Vertically scaling the runner and using a higher number of browsers with [variable](#available-cicd-variables)`DAST_BROWSER_NUMBER_OF_BROWSERS`. The default is `3`.
- Vertically scale the runner and use a higher number of browsers with [variable](#available-cicd-variables)`DAST_BROWSER_NUMBER_OF_BROWSERS`. The default is `3`.
## Timeouts
## Timeouts
Due to poor network conditions or heavy application load, the default timeouts may not be applicable to your application.
Due to poor network conditions or heavy application load, the default timeouts may not be applicable to your application.
Browser-based scans offer the ability to adjust various timeouts to ensure it continues smoothly as it transitions from one page to the next. These values are configured using a [Duration string](https://golang.org/pkg/time/#ParseDuration) which allow you to configure durations with a prefix: `m` for minutes, `s` for seconds, and `ms` for milliseconds.
Browser-based scans offer the ability to adjust various timeouts to ensure it continues smoothly as it transitions from one page to the next. These values are configured using a [Duration string](https://golang.org/pkg/time/#ParseDuration), which allow you to configure durations with a prefix: `m` for minutes, `s` for seconds, and `ms` for milliseconds.
Navigations, or the act of loading a new page, usually require the most amount of time as they are
Navigations, or the act of loading a new page, usually require the most amount of time because they are
loading multiple new resources such as JavaScript or CSS files. Depending on the size of these resources, or the speed at which they are returned, the default `DAST_BROWSER_NAVIGATION_TIMEOUT` may not be sufficient.
loading multiple new resources such as JavaScript or CSS files. Depending on the size of these resources, or the speed at which they are returned, the default `DAST_BROWSER_NAVIGATION_TIMEOUT` may not be sufficient.
Stability timeouts, such as those configurable with `DAST_BROWSER_NAVIGATION_STABILITY_TIMEOUT`, `DAST_BROWSER_STABILITY_TIMEOUT`, and `DAST_BROWSER_ACTION_STABILITY_TIMEOUT` can also be configured. Stability timeouts determine when browser-based scans consider
Stability timeouts, such as those configurable with `DAST_BROWSER_NAVIGATION_STABILITY_TIMEOUT`, `DAST_BROWSER_STABILITY_TIMEOUT`, and `DAST_BROWSER_ACTION_STABILITY_TIMEOUT` can also be configured. Stability timeouts determine when browser-based scans consider
...
@@ -110,11 +110,11 @@ a page fully loaded. Browser-based scans consider a page loaded when:
...
@@ -110,11 +110,11 @@ a page fully loaded. Browser-based scans consider a page loaded when:
1. There are no open or outstanding requests that are deemed important, such as JavaScript and CSS. Media files are usually deemed unimportant.
1. There are no open or outstanding requests that are deemed important, such as JavaScript and CSS. Media files are usually deemed unimportant.
1. Depending on whether the browser executed a navigation, was forcibly transitioned, or action:
1. Depending on whether the browser executed a navigation, was forcibly transitioned, or action:
- There are no new Document Object Model (DOM) modification events after the `DAST_BROWSER_NAVIGATION_STABILITY_TIMEOUT`, `DAST_BROWSER_STABILITY_TIMEOUT` or `DAST_BROWSER_ACTION_STABILITY_TIMEOUT` durations
- There are no new Document Object Model (DOM) modification events after the `DAST_BROWSER_NAVIGATION_STABILITY_TIMEOUT`, `DAST_BROWSER_STABILITY_TIMEOUT`, or `DAST_BROWSER_ACTION_STABILITY_TIMEOUT` durations.
After these events have occurred, browser-based scans consider the page loaded and ready and attempt the next action.
After these events have occurred, browser-based scans consider the page loaded and ready, and attempt the next action.
If your application experiences latency or returns many navigation failures, consider adjusting the timeout values such in this example:
If your application experiences latency or returns many navigation failures, consider adjusting the timeout values such as in this example:
```yaml
```yaml
include:
include:
...
@@ -132,7 +132,7 @@ dast:
...
@@ -132,7 +132,7 @@ dast:
```
```
NOTE:
NOTE:
Adjusting these values may impact scan time as they adjust how long each browser waits for various activities to complete.
Adjusting these values may impact scan time because they adjust how long each browser waits for various activities to complete.
## Debugging scans using logging
## Debugging scans using logging
...
@@ -168,7 +168,7 @@ The modules that can be configured for logging are as follows:
...
@@ -168,7 +168,7 @@ The modules that can be configured for logging are as follows:
| Log module | Component overview |
| Log module | Component overview |
| ---------- | ----------- |
| ---------- | ----------- |
| `AUTH` | Used for creating an authenticated scan. |
| `AUTH` | Used for creating an authenticated scan. |
| `BROWS` | Used for querying the state/page of the browser. |
| `BROWS` | Used for querying the state or page of the browser. |
| `BPOOL` | The set of browsers that are leased out for crawling. |
| `BPOOL` | The set of browsers that are leased out for crawling. |
| `CRAWL` | Used for the core crawler algorithm. |
| `CRAWL` | Used for the core crawler algorithm. |
| `DATAB` | Used for persisting data to the internal database. |
| `DATAB` | Used for persisting data to the internal database. |