Commit e23a470f authored by Yorick Peterse

Support custom tag formats for changelogs

When generating a changelog without an explicit start commit, we try to
find the tag of the previous release. Prior to this commit, the regex
used for this was fixed. This creates a problem for projects that use a
different tag format, including our very own Omnibus project. Omnibus
uses RC tags in the following format:

    13.10.0+rc41.ee.0

According to semantic versioning, this isn't a pre-release tag. Instead,
the `rc41.ee.0` suffix is part of the build metadata.

Changing Omnibus to use a correct tag format would be time consuming,
and delay rolling out the use of the new changelog API for our own
projects. In addition, other projects could suffer from similar problems
where a tag perhaps includes a valid version, but our fixed regex
doesn't match it.

In this commit we fix this by adding support for custom regular
expressions to extract versions from tag names. We use the re2 engine
for this instead of Ruby's regex engine, because of the issues with
Ruby's engine outlined in our documentation [1]. Using a re2 pattern, we try to
extract the major, minor, patch, and build metadata components. We still
skip tags that produce a prerelease component. If a tag doesn't produce
at least the major, minor and patch components, it's ignored.

The default pattern we use is based on the official semver regex, with
added support for tags starting with the letter "v" (to keep the change
backwards compatible). Users wishing to use a custom format can probably
use a much simpler pattern, as they only need to support their own
formats. For example, for Omnibus we could simply use something along
the lines of the following pattern (minus newlines):

    ^
    (?P<major>\d+)
    \.(?P<minor>\d+)
    \.(?P<patch>\d+)
    (\+(?P<pre>rc\d+))?
    (\.(?P<meta>\w+\.\d+))?
    $

See https://gitlab.com/gitlab-com/gl-infra/delivery/-/issues/1551 for
more information.
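
As a rough illustration of what the Omnibus pattern above extracts, here is a minimal sketch. It assumes Ruby's built-in `Regexp` as a stand-in for re2, so named groups are written `(?<name>...)` instead of `(?P<name>...)`, and it names the optional groups `pre` and `meta`, which are the names the finder in this commit reads. The constant `OMNIBUS_TAG_SKETCH` and the helper `omnibus_components` are hypothetical names used only for this example.

```ruby
# Assumption: Ruby's Regexp stands in for re2 here. The commit itself runs
# patterns through Gitlab::UntrustedRegexp, which is not used in this sketch.
OMNIBUS_TAG_SKETCH = /
  \A
  (?<major>\d+)
  \.(?<minor>\d+)
  \.(?<patch>\d+)
  (?:\+(?<pre>rc\d+))?
  (?:\.(?<meta>\w+\.\d+))?
  \z
/x

# Hypothetical helper: returns a Hash of capture group names to values,
# or nil when the tag name does not match the pattern at all.
def omnibus_components(tag_name)
  match = OMNIBUS_TAG_SKETCH.match(tag_name)
  return nil unless match

  match.named_captures
end

components = omnibus_components('13.10.0+rc41.ee.0')

# A tag whose `pre` group has a value is skipped by the finder.
puts "pre:  #{components['pre'].inspect}"
puts "meta: #{components['meta'].inspect}"
```

With the groups named this way, the RC tag fills in `pre`, so the finder would skip it when looking for the previous stable release.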

[1]: https://docs.gitlab.com/ee/development/secure_coding_guidelines.html#regular-expressions-guidelines
parent e4cd9e2b
......@@ -2,7 +2,7 @@
module Repositories
# A finder class for getting the tag of the last release before a given
# version.
# version, used when generating changelogs.
#
# Imagine a project with the following tags:
#
......@@ -13,36 +13,61 @@ module Repositories
# If the version supplied is 2.1.0, the tag returned will be v2.0.0. And when
# the version is 1.1.1, or 1.2.0, the returned tag will be v1.1.0.
#
# This finder expects that all tags to consider meet the following
# requirements:
# To obtain the tags, this finder requires a regular expression (using the re2
# syntax) to be provided. This regex must produce the following named
# captures:
#
# * They start with the letter "v" followed by a version, or immediately start
# with a version
# * They use semantic versioning for the version format
# - major (required)
# - minor (required)
# - patch (required)
# - pre
# - meta
#
# Tags not meeting these requirements are ignored.
class PreviousTagFinder
TAG_REGEX = /\Av?(?<version>#{Gitlab::Regex.unbounded_semver_regex})\z/.freeze
def initialize(project)
# If the `pre` group has a value, the tag is ignored. If any of the required
# capture groups don't have a value, the tag is also ignored.
class ChangelogTagFinder
def initialize(project, regex: Gitlab::Changelog::Config::DEFAULT_TAG_REGEX)
@project = project
@regex = regex
end
def execute(new_version)
tags = {}
versions = [new_version]
begin
regex = Gitlab::UntrustedRegexp.new(@regex)
rescue RegexpError => ex
# The error messages produced by default are not very helpful, so we
# raise a better one here. We raise the specific error here so its
# message is displayed in the API (where we catch this specific
# error).
raise(
Gitlab::Changelog::Error,
"The regular expression to use for finding the previous tag for a version is invalid: #{ex.message}"
)
end
@project.repository.tags.each do |tag|
matches = tag.name.match(TAG_REGEX)
matches = regex.match(tag.name)
next unless matches
# When using this class for generating changelog data for a range of
# commits, we want to compare against the tag of the last _stable_
# release; not some random RC that came after that.
next if matches[:prerelease]
next if matches[:pre]
major = matches[:major]
minor = matches[:minor]
patch = matches[:patch]
build = matches[:meta]
next unless major && minor && patch
version = "#{major}.#{minor}.#{patch}"
version += "+#{build}" if build
version = matches[:version]
tags[version] = tag
versions << version
end
......
......@@ -61,14 +61,14 @@ module Repositories
# rubocop: enable Metrics/ParameterLists
def execute
from = start_of_commit_range
config = Gitlab::Changelog::Config.from_git(@project)
from = start_of_commit_range(config)
# For every entry we want to only include the merge request that
# originally introduced the commit, which is the oldest merge request that
# contains the commit. We fetch these merge requests in batches, reducing
# the number of SQL queries needed to get this data.
mrs_finder = MergeRequests::OldestPerCommitFinder.new(@project)
config = Gitlab::Changelog::Config.from_git(@project)
release = Gitlab::Changelog::Release
.new(version: @version, date: @date, config: config)
......@@ -98,10 +98,12 @@ module Repositories
.commit(release: release, file: @file, branch: @branch, message: @message)
end
def start_of_commit_range
def start_of_commit_range(config)
return @from if @from
if (prev_tag = PreviousTagFinder.new(@project).execute(@version))
finder = ChangelogTagFinder.new(@project, regex: config.tag_regex)
if (prev_tag = finder.execute(@version))
return prev_tag.target_commit.id
end
......
---
title: Support custom tag formats for changelogs
merge_request: 56889
author:
type: added
......@@ -312,8 +312,9 @@ Supported attributes:
If the `from` attribute is unspecified, GitLab uses the Git tag of the last
stable version that came before the version specified in the `version`
attribute. For this to work, your project must create Git tags for versions
using one of the following formats:
attribute. This requires that Git tag names follow a specific format, allowing
GitLab to extract a version from the tag names. By default, GitLab considers
tags using these formats:
- `vX.Y.Z`
- `X.Y.Z`
......@@ -622,3 +623,51 @@ In an entry, the following variables are available (here `foo.bar` means that
The `author` and `merge_request` objects might not be present if the data
couldn't be determined. For example, when a commit is created without a
corresponding merge request, no merge request is displayed.
### Customize the tag format when extracting versions
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/56889) in GitLab 13.11.
GitLab uses a regular expression (using the
[re2](https://github.com/google/re2/) engine and syntax) to extract a semantic
version from tag names. The default regular expression is:
```plaintext
^v?(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)(?:-(?P<pre>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+(?P<meta>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
```
This regular expression is based on the official
[semantic versioning](https://semver.org/) regular expression, and also includes
support for tag names that start with the letter `v`.
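
To get a feel for which capture groups the default pattern fills in, the following sketch may help. It is not how GitLab evaluates the pattern: it assumes Ruby's built-in `Regexp` as a stand-in for re2 and rewrites the re2 group syntax `(?P<name>...)` to Ruby's `(?<name>...)`. Ruby also treats `^` and `$` as line anchors, which makes no difference for single-line tag names. The names `DEFAULT_TAG_REGEX_RE2`, `RUBY_TAG_REGEX`, and `capture_groups` are made up for this example.

```ruby
# The default pattern from the documentation, kept as an re2-style String.
DEFAULT_TAG_REGEX_RE2 =
  '^v?(?P<major>0|[1-9]\d*)' \
  '\.(?P<minor>0|[1-9]\d*)' \
  '\.(?P<patch>0|[1-9]\d*)' \
  '(?:-(?P<pre>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))' \
  '?(?:\+(?P<meta>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$'

# Assumption: converting the group syntax is enough to evaluate this
# particular pattern with Ruby's engine. This is not true for re2 in general.
RUBY_TAG_REGEX = Regexp.new(DEFAULT_TAG_REGEX_RE2.gsub('(?P<', '(?<'))

# Hypothetical helper: Hash of group name => value, or nil without a match.
def capture_groups(tag_name)
  match = RUBY_TAG_REGEX.match(tag_name)
  return nil unless match

  match.named_captures
end

%w[v1.2.3 1.2.3 v0.8.0-pre1 0.5.0+42.ee.0 1.2].each do |name|
  puts "#{name}: #{capture_groups(name).inspect}"
end
```

Tag names that lack a patch component, such as `1.2`, produce no match at all and are therefore ignored.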
If your project uses a different format for tags, you can specify a different
regular expression. The regular expression used _must_ produce the following
capture groups. If any of these capture groups are missing, the tag is ignored:
- `major`
- `minor`
- `patch`
The following capture groups are optional:
- `pre`: If set, the tag is ignored. Ignoring `pre` tags ensures release candidate
tags and other pre-release tags are not considered when determining the range of
commits to generate a changelog for.
- `meta`: Specifies build metadata.
Using this information, GitLab builds a map of Git tags and their release
versions. It then determines what the latest tag is, based on the version
extracted from each tag.
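
The selection described above can be sketched roughly as follows. This is a simplified illustration, not GitLab's implementation. It assumes Ruby's built-in `Regexp` instead of re2, a deliberately loose stand-in pattern, and a comparison on the numeric `major`, `minor`, and `patch` values only (build metadata is ignored for ordering). `SketchTag`, `SKETCH_TAG_PATTERN`, and `previous_tag` are hypothetical names.

```ruby
# Hypothetical stand-in for a Git tag object; only the name is needed here.
SketchTag = Struct.new(:name)

# Assumption: a loose pattern in Ruby syntax, not the default re2 pattern.
SKETCH_TAG_PATTERN = /
  \Av?
  (?<major>\d+)\.(?<minor>\d+)\.(?<patch>\d+)
  (?:-(?<pre>[0-9A-Za-z.-]+))?
  (?:\+(?<meta>[0-9A-Za-z.-]+))?
  \z
/x

# Returns the tag with the highest stable version below new_version, or nil.
def previous_tag(tags, new_version)
  target = new_version.split('.').first(3).map(&:to_i)

  stable = tags.filter_map do |tag|
    match = SKETCH_TAG_PATTERN.match(tag.name)
    next unless match     # tag name does not contain a version
    next if match[:pre]   # pre-release tags are ignored

    [[match[:major].to_i, match[:minor].to_i, match[:patch].to_i], tag]
  end

  older = stable.select { |version, _tag| (version <=> target) == -1 }
  older.max_by { |version, _tag| version }&.last
end

tags = %w[v1.0.0 0.9.0 v0.8.0-pre1 v0.7.0 0.5.0+42.ee.0].map { |n| SketchTag.new(n) }
puts previous_tag(tags, '0.9.0').name
```

Comparing integer triples keeps the sketch free of dependencies. A real implementation would need a version comparison that also handles build metadata and equal versions.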
To specify a custom regular expression, use the `tag_regex` setting in your
changelog configuration YAML file. For example, this pattern matches tag names
such as `version-1.2.3` but not `version-1.2`.
```yaml
---
tag_regex: '^version-(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$'
```
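
Before committing such a configuration, a quick local check along these lines can confirm the pattern behaves as intended. This sketch only approximates what GitLab does: it assumes Ruby's built-in `Regexp` in place of re2 (rewriting `(?P<` to `(?<`), and `CUSTOM_CONFIG_YAML`, `CUSTOM_TAG_PATTERN`, and `version_from` are hypothetical names for this example.

```ruby
require 'yaml'

# The configuration from the example above, embedded for a self-contained check.
CUSTOM_CONFIG_YAML = <<~'YAML'
  ---
  tag_regex: '^version-(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$'
YAML

# Assumption: rewriting the named-group syntax is sufficient for this pattern.
CUSTOM_TAG_PATTERN = Regexp.new(
  YAML.safe_load(CUSTOM_CONFIG_YAML).fetch('tag_regex').gsub('(?P<', '(?<')
)

# Hypothetical helper: returns "major.minor.patch", or nil if the tag name
# does not match or a required group has no value.
def version_from(pattern, tag_name)
  match = pattern.match(tag_name)
  return nil unless match && match[:major] && match[:minor] && match[:patch]

  "#{match[:major]}.#{match[:minor]}.#{match[:patch]}"
end

puts version_from(CUSTOM_TAG_PATTERN, 'version-1.2.3').inspect
puts version_from(CUSTOM_TAG_PATTERN, 'version-1.2').inspect
```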
To test if your regular expression is working, you can use websites such as
[regex101](https://regex101.com/). If the regular expression syntax is invalid,
an error is produced when generating a changelog.
......@@ -17,7 +17,24 @@ module Gitlab
# The default template to use for generating release sections.
DEFAULT_TEMPLATE = File.read(File.join(__dir__, 'template.tpl'))
attr_accessor :date_format, :categories, :template
# The regex to use for extracting the version from a Git tag.
#
# This regex is based on the official semantic versioning regex (as found
# on https://semver.org/), with the addition of allowing a "v" at the
# start of a tag name.
#
# We default to a strict regex as we simply don't know what kind of data
# users put in their tags. As such, using simpler patterns (e.g. just
# `\d+` for the major version) could lead to unexpected results.
#
# We use a String here as `Gitlab::UntrustedRegexp` is a mutable object.
DEFAULT_TAG_REGEX = '^v?(?P<major>0|[1-9]\d*)' \
'\.(?P<minor>0|[1-9]\d*)' \
'\.(?P<patch>0|[1-9]\d*)' \
'(?:-(?P<pre>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))' \
'?(?:\+(?P<meta>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$'
attr_accessor :date_format, :categories, :template, :tag_regex
def self.from_git(project)
if (yaml = project.repository.changelog_config)
......@@ -46,6 +63,10 @@ module Gitlab
end
end
if (regex = hash['tag_regex'])
config.tag_regex = regex
end
config
end
......@@ -54,6 +75,7 @@ module Gitlab
@date_format = DEFAULT_DATE_FORMAT
@template = Parser.new.parse_and_transform(DEFAULT_TEMPLATE)
@categories = {}
@tag_regex = DEFAULT_TAG_REGEX
end
def contributor?(user)
......
......@@ -35,6 +35,10 @@ module Gitlab
matches
end
def match(text)
scan_regexp.match(text)
end
def match?(text)
text.present? && scan(text).present?
end
......
......@@ -2,11 +2,18 @@
require 'spec_helper'
RSpec.describe Repositories::PreviousTagFinder do
RSpec.describe Repositories::ChangelogTagFinder do
let(:project) { build_stubbed(:project) }
let(:finder) { described_class.new(project) }
describe '#execute' do
context 'when the regular expression is invalid' do
it 'raises Gitlab::Changelog::Error' do
expect { described_class.new(project, regex: 'foo+*').execute('1.2.3') }
.to raise_error(Gitlab::Changelog::Error)
end
end
context 'when there is a previous tag' do
it 'returns the previous tag' do
tag1 = double(:tag1, name: 'v1.0.0')
......@@ -15,10 +22,11 @@ RSpec.describe Repositories::PreviousTagFinder do
tag4 = double(:tag4, name: '0.9.0')
tag5 = double(:tag5, name: 'v0.8.0-pre1')
tag6 = double(:tag6, name: 'v0.7.0')
tag7 = double(:tag7, name: '0.5.0+42.ee.0')
allow(project.repository)
.to receive(:tags)
.and_return([tag1, tag3, tag2, tag4, tag5, tag6])
.and_return([tag1, tag3, tag2, tag4, tag5, tag6, tag7])
expect(finder.execute('2.1.0')).to eq(tag3)
expect(finder.execute('2.0.0')).to eq(tag2)
......@@ -26,6 +34,7 @@ RSpec.describe Repositories::PreviousTagFinder do
expect(finder.execute('1.0.1')).to eq(tag1)
expect(finder.execute('1.0.0')).to eq(tag4)
expect(finder.execute('0.9.0')).to eq(tag6)
expect(finder.execute('0.6.0')).to eq(tag7)
end
end
......
......@@ -37,7 +37,8 @@ RSpec.describe Gitlab::Changelog::Config do
project,
'date_format' => 'foo',
'template' => 'bar',
'categories' => { 'foo' => 'bar' }
'categories' => { 'foo' => 'bar' },
'tag_regex' => 'foo'
)
expect(config.date_format).to eq('foo')
......@@ -45,6 +46,7 @@ RSpec.describe Gitlab::Changelog::Config do
.to be_instance_of(Gitlab::Changelog::AST::Expressions)
expect(config.categories).to eq({ 'foo' => 'bar' })
expect(config.tag_regex).to eq('foo')
end
it 'raises Error when the categories are not a Hash' do
......
......@@ -136,4 +136,22 @@ RSpec.describe Gitlab::UntrustedRegexp do
end
end
end
describe '#match' do
context 'when there are matches' do
it 'returns a match object' do
result = described_class.new('(?P<number>\d+)').match('hello 10')
expect(result[:number]).to eq('10')
end
end
context 'when there are no matches' do
it 'returns nil' do
result = described_class.new('(?P<number>\d+)').match('hello')
expect(result).to be_nil
end
end
end
end
......@@ -130,13 +130,14 @@ RSpec.describe Repositories::ChangelogService do
describe '#start_of_commit_range' do
let(:project) { build_stubbed(:project) }
let(:user) { build_stubbed(:user) }
let(:config) { Gitlab::Changelog::Config.new(project) }
context 'when the "from" argument is specified' do
it 'returns the value of the argument' do
service = described_class
.new(project, user, version: '1.0.0', from: 'foo', to: 'bar')
expect(service.start_of_commit_range).to eq('foo')
expect(service.start_of_commit_range(config)).to eq('foo')
end
end
......@@ -145,12 +146,12 @@ RSpec.describe Repositories::ChangelogService do
service = described_class
.new(project, user, version: '1.0.0', to: 'bar')
finder_spy = instance_spy(Repositories::PreviousTagFinder)
finder_spy = instance_spy(Repositories::ChangelogTagFinder)
tag = double(:tag, target_commit: double(:commit, id: '123'))
allow(Repositories::PreviousTagFinder)
allow(Repositories::ChangelogTagFinder)
.to receive(:new)
.with(project)
.with(project, regex: an_instance_of(String))
.and_return(finder_spy)
allow(finder_spy)
......@@ -158,18 +159,18 @@ RSpec.describe Repositories::ChangelogService do
.with('1.0.0')
.and_return(tag)
expect(service.start_of_commit_range).to eq('123')
expect(service.start_of_commit_range(config)).to eq('123')
end
it 'raises an error when no tag is found' do
service = described_class
.new(project, user, version: '1.0.0', to: 'bar')
finder_spy = instance_spy(Repositories::PreviousTagFinder)
finder_spy = instance_spy(Repositories::ChangelogTagFinder)
allow(Repositories::PreviousTagFinder)
allow(Repositories::ChangelogTagFinder)
.to receive(:new)
.with(project)
.with(project, regex: an_instance_of(String))
.and_return(finder_spy)
allow(finder_spy)
......@@ -177,7 +178,7 @@ RSpec.describe Repositories::ChangelogService do
.with('1.0.0')
.and_return(nil)
expect { service.start_of_commit_range }
expect { service.start_of_commit_range(config) }
.to raise_error(Gitlab::Changelog::Error)
end
end
......