Commit 91cb8c31 authored by Adam Hegyi

Merge branch 'docs-post-merge-review-dev-guidelines-db-transactions' into 'master'

Docs: Post-merge review DB transactions doc

See merge request gitlab-org/gitlab!71282
parents 356730c6 5e13d66e
...@@ -8,17 +8,21 @@ info: To determine the technical writer assigned to the Stage/Group associated w ...@@ -8,17 +8,21 @@ info: To determine the technical writer assigned to the Stage/Group associated w
This document gives a few examples of the usage of database transactions in application code. This document gives a few examples of the usage of database transactions in application code.
For further reference please check PostgreSQL documentation about [transactions](https://www.postgresql.org/docs/current/tutorial-transactions.html). For further reference, check PostgreSQL documentation about [transactions](https://www.postgresql.org/docs/current/tutorial-transactions.html).
## Database decomposition and sharding ## Database decomposition and sharding
The [sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding/) plans to split the main GitLab database and move some of the database tables to other database servers. The [sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding/) plans
to split the main GitLab database and move some of the database tables to other database servers.
The group will start decomposing the `ci_*` related database tables first. To maintain the current application development experience, tooling and static analyzers will be added to the codebase to ensure correct data access and data modification methods. By using the correct form for defining database transactions, we can save significant refactoring work in the future. We'll start decomposing the `ci_*`-related database tables first. To maintain the current application
development experience, we'll add tooling and static analyzers to the codebase to ensure correct
data access and data modification methods. By using the correct form for defining database transactions,
we can save significant refactoring work in the future.
## The transaction block ## The transaction block
The `ActiveRecord` library provides a convenient way to group database statements into a transaction. The `ActiveRecord` library provides a convenient way to group database statements into a transaction:
```ruby ```ruby
issue = Issue.find(10) issue = Issue.find(10)
...@@ -30,16 +34,19 @@ ApplicationRecord.transaction do ...@@ -30,16 +34,19 @@ ApplicationRecord.transaction do
end end
``` ```
This transaction involves two database tables, in case of an error, each `UPDATE` statement will be rolled back to the previous, consistent state. This transaction involves two database tables. In case of an error, each `UPDATE`
statement rolls back to the previous consistent state.
NOTE: NOTE:
Avoid referencing the `ActiveRecord::Base` class and use `ApplicationRecord` instead. Avoid referencing the `ActiveRecord::Base` class and use `ApplicationRecord` instead.
## Transaction and database locks ## Transaction and database locks
When a transaction block is opened, the database will try to acquire the necessary locks on the resources. The type of locks will depend on the actual database statements. When a transaction block is opened, the database tries to acquire the necessary
locks on the resources. The type of locks depends on the actual database statements.
Consider a concurrent update scenario where the following code is executed at the same time from two different processes: Consider a concurrent update scenario where the following code is executed at the
same time from two different processes:
```ruby ```ruby
issue = Issue.find(10) issue = Issue.find(10)
...@@ -51,15 +58,22 @@ ApplicationRecord.transaction do ...@@ -51,15 +58,22 @@ ApplicationRecord.transaction do
end end
``` ```
The database will try to acquire the `FOR UPDATE` lock for the referenced `issue` and `project` records. In our case, we have two competing transactions for these locks, one of them will successfully acquire them. The other transaction will have to wait in the lock queue until the first transaction finishes. The execution of the second transaction is blocked at this point. The database tries to acquire the `FOR UPDATE` lock for the referenced `issue` and
`project` records. In our case, we have two competing transactions for these locks,
and only one of them will successfully acquire them. The other transaction will have
to wait in the lock queue until the first transaction finishes. The execution of the
second transaction is blocked at this point.
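The waiting behavior described above can be imitated in plain Ruby. This is only a rough sketch: a `Mutex` stands in for the row lock that the database manages internally, and the names `row_lock`, `events`, and `acquired` are hypothetical, not part of any GitLab or `ActiveRecord` API.

```ruby
# Stand-in for the row-level lock; a real database manages this internally.
row_lock = Mutex.new
events = Queue.new   # thread-safe log of what happened, in order
acquired = Queue.new # used to signal that the first "transaction" holds the lock

first = Thread.new do
  row_lock.synchronize do
    events << :tx1_acquired
    acquired << true
    sleep 0.05 # simulates work done while the transaction is open
    events << :tx1_finished
  end
end

# Start the second "transaction" only after the first one holds the lock.
acquired.pop

second = Thread.new do
  # Blocks here until the first thread releases the lock.
  row_lock.synchronize { events << :tx2_acquired }
end

[first, second].each(&:join)

# Drain the queue into an array to inspect the order of events.
order = Array.new(events.size) { events.pop }
```

In this sketch the second thread can record its event only after the first one has finished, which mirrors how the second transaction waits in the lock queue.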
## Transaction speed ## Transaction speed
To prevent lock contention and maintain stable application performance, the transaction block should finish as fast as possible. When a transaction acquires locks, it will hold on to them until the transaction finishes. To prevent lock contention and maintain stable application performance, the transaction
block should finish as fast as possible. When a transaction acquires locks, it holds
on to them until the transaction finishes.
Apart from application performance, long-running transactions can also affect the application upgrade processes by blocking database migrations. Apart from application performance, long-running transactions can also affect application
upgrade processes by blocking database migrations.
### Dangerous example: 3rd party API calls ### Dangerous example: third-party API calls
Consider the following example: Consider the following example:
...@@ -73,20 +87,29 @@ Member.transaction do ...@@ -73,20 +87,29 @@ Member.transaction do
end end
``` ```
Here, we ensure that the `notification_email_sent` column is updated only when the `send_notification_email` method succeeds. The `send_notification_email` method executes a network request to an email sending service. If the underlying infrastructure does not specify timeouts or the network call takes too long time, the database transaction will stay open. Here, we ensure that the `notification_email_sent` column is updated only when the
`send_notification_email` method succeeds. The `send_notification_email` method
executes a network request to an email sending service. If the underlying infrastructure
does not specify timeouts or the network call takes too long, the database transaction
stays open.
Ideally, a transaction should only contain database statements. Ideally, a transaction should only contain database statements.
Avoid doing the following in a `transaction` block: Avoid doing the following in a `transaction` block:
- External network requests such as: triggering Sidekiq jobs, sending emails, HTTP API calls and running database statements using a different connection. - External network requests such as:
- Triggering Sidekiq jobs.
- Sending emails.
- HTTP API calls.
- Running database statements using a different connection.
- File system operations. - File system operations.
- Long, CPU-intensive computation. - Long, CPU-intensive computation.
- Calling `sleep(n)`. - Calling `sleep(n)`.
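One possible way to follow this advice is to run the slow call before opening the transaction, so the transaction contains only the database statement. The sketch below uses plain Ruby stand-ins: `FakeConnection` and this version of `send_notification_email` are hypothetical helpers written for illustration, not `ActiveRecord` or GitLab code.

```ruby
# Hypothetical stand-in for a database connection. It only records
# when a transaction begins, commits, or rolls back.
class FakeConnection
  attr_reader :log

  def initialize
    @log = []
  end

  def transaction
    @log << :begin
    yield
    @log << :commit
  rescue StandardError
    @log << :rollback
    raise
  end

  def update_column(name)
    @log << [:update, name]
  end
end

# Hypothetical email client; a real one would perform a network request.
def send_notification_email(log)
  log << :email_sent
  true
end

connection = FakeConnection.new

# The slow network call runs first, outside any transaction.
email_sent = send_notification_email(connection.log)

# The transaction now contains only the database statement.
if email_sent
  connection.transaction do
    connection.update_column(:notification_email_sent)
  end
end
```

The trade-off in this arrangement is that the email and the column update are no longer atomic: if the update fails after the email has gone out, the flag stays unset. Whether that is acceptable depends on the feature.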
## Explicit model referencing ## Explicit model referencing
If a transaction modifies records from the same database table, it's advised to use the `Model.transaction` block: If a transaction modifies records from the same database table, we advise using the
`Model.transaction` block:
```ruby ```ruby
build_1 = Ci::Build.find(1) build_1 = Ci::Build.find(1)
...@@ -98,7 +121,8 @@ Ci::Build.transaction do ...@@ -98,7 +121,8 @@ Ci::Build.transaction do
end end
``` ```
The transaction above will use the same database connection for the transaction as the models in the `transaction` block. In a multi-database environment the following example would be dangerous: The transaction above uses the same database connection for the transaction as the models
in the `transaction` block. In a multi-database environment, the following example is dangerous:
```ruby ```ruby
# `ci_builds` table is located on another database # `ci_builds` table is located on another database
...@@ -114,4 +138,6 @@ ActiveRecord::Base.transaction do ...@@ -114,4 +138,6 @@ ActiveRecord::Base.transaction do
end end
``` ```
The `ActiveRecord::Base` class uses a different database connection than the `Ci::Build` records. The two statements in the transaction block will not be part of the transaction and will not be rolled back in case something goes wrong. They act as 3rd part calls. The `ActiveRecord::Base` class uses a different database connection than the `Ci::Build` records.
The two statements in the transaction block will not be part of the transaction and will not be
rolled back in case something goes wrong. They act as third-party calls.
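To make the failure mode concrete, here is a minimal plain-Ruby simulation. It assumes two independent in-memory "databases". `FakeDatabase`, `main_db`, and `ci_db` are hypothetical names used only for this illustration, and the snapshot-based rollback is a simplification of what a real database does.

```ruby
# Hypothetical in-memory stand-in for one database connection.
class FakeDatabase
  attr_reader :rows

  def initialize
    @rows = {}
  end

  # Restores the rows as they were before the block if the block raises.
  def transaction
    snapshot = @rows.dup
    yield
  rescue StandardError
    @rows = snapshot
    raise
  end

  def write(key, value)
    @rows[key] = value
  end
end

main_db = FakeDatabase.new # plays the role of the ActiveRecord::Base connection
ci_db = FakeDatabase.new   # plays the role of the database holding ci_builds

begin
  # The transaction is opened on main_db only.
  main_db.transaction do
    main_db.write(:issue_title, "new title")
    ci_db.write(:build_status, "failed") # goes through a different connection
    raise "something went wrong"
  end
rescue RuntimeError
  # The error is swallowed here only so the script can continue.
end
```

After the error, `main_db.rows` is empty again, while `ci_db.rows` still holds the `build_status` write: the second connection never took part in the transaction, so nothing rolled it back.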