Commit d82571bf authored by Yorick Peterse's avatar Yorick Peterse

Replace the changelog regex parser with Parslet

In https://gitlab.com/gitlab-org/gitlab/-/merge_requests/50063 we
introduced code for generating Markdown changelogs using the API, using
a custom template language. The compiler for this language would convert
a custom syntax into ERB, making sure arbitrary code execution isn't
possible; or so we thought. In
https://gitlab.com/gitlab-org/gitlab/-/issues/300224 we found a way to
bypass the template engine's harness, and run arbitrary Ruby code.

In response to this issue, I decided to investigate replacing the setup
with something more secure. We always planned on doing so when deemed
necessary, unfortunately that need arrived sooner than expected.

In this commit we replace the regex/ERB based setup with a parser built
using Parslet (http://kschiess.github.io/parslet/). Parslet makes it
pretty easy to write a parser, and was already an indirect dependency of
GitLab (through the license_finder Gem). This new parser doesn't allow
for arbitrary code execution, doesn't depend on ERB, and is less fragile
compared to the old setup. Templates are executed by walking and
evaluating the AST nodes the parser produces. While this won't break any
speed records, it's easy to maintain and understand, and fast enough for
our needs.

In this new setup there is a slight difference compared to the old
setup. In the old setup, expression tags on their own line don't add a
new line. So this:

    foo
    {% if something %}{% end %}
    bar

Compiles into this:

    foo
    bar

Getting this right using the Parslet parser proved difficult, so we took
a slightly different approach: any newline following an expression tag
(if, else, end, and each) consumes the newline that directly follows it
(if any). This requires only a small change in the template, is easy to
implement, and still intuitive to the user.

As part of this commit we also fix a small bug that would lead to empty
entries being included in the template, and add a note about using the
right YAML syntax to preserve newlines.
parent 1982e005
...@@ -519,3 +519,5 @@ gem 'webauthn', '~> 2.3' ...@@ -519,3 +519,5 @@ gem 'webauthn', '~> 2.3'
# IPAddress utilities # IPAddress utilities
gem 'ipaddress', '~> 0.8.3' gem 'ipaddress', '~> 0.8.3'
gem 'parslet', '~> 1.8'
...@@ -1464,6 +1464,7 @@ DEPENDENCIES ...@@ -1464,6 +1464,7 @@ DEPENDENCIES
omniauth_openid_connect (~> 0.3.5) omniauth_openid_connect (~> 0.3.5)
org-ruby (~> 0.9.12) org-ruby (~> 0.9.12)
parallel (~> 1.19) parallel (~> 1.19)
parslet (~> 1.8)
peek (~> 1.1) peek (~> 1.1)
pg (~> 1.1) pg (~> 1.1)
pg_query (~> 1.3.0) pg_query (~> 1.3.0)
......
...@@ -410,6 +410,7 @@ follows: ...@@ -410,6 +410,7 @@ follows:
- [{{ title }}]({{ commit.reference }})\ - [{{ title }}]({{ commit.reference }})\
{% if author.contributor %} by {{ author.reference }}{% end %}\ {% if author.contributor %} by {{ author.reference }}{% end %}\
{% if merge_request %} ([merge request]({{ merge_request.reference }})){% end %} {% if merge_request %} ([merge request]({{ merge_request.reference }})){% end %}
{% end %} {% end %}
{% end %} {% end %}
...@@ -457,11 +458,40 @@ If a line ends in a backslash, the next newline is ignored. This allows you to ...@@ -457,11 +458,40 @@ If a line ends in a backslash, the next newline is ignored. This allows you to
wrap code across multiple lines, without introducing unnecessary newlines in the wrap code across multiple lines, without introducing unnecessary newlines in the
Markdown output. Markdown output.
Tags that use `{%` and `%}` (known as expression tags) consume the newline that
directly follows them, if any. This means that this:
```plaintext
---
{% if foo %}
bar
{% end %}
---
```
Compiles into this:
```plaintext
---
bar
---
```
Instead of this:
```plaintext
---
bar
---
```
You can specify a custom template in your configuration like so: You can specify a custom template in your configuration like so:
```yaml ```yaml
--- ---
template: > template: |
{% if categories %} {% if categories %}
{% each categories %} {% each categories %}
### {{ title }} ### {{ title }}
...@@ -469,6 +499,7 @@ template: > ...@@ -469,6 +499,7 @@ template: >
{% each entries %} {% each entries %}
- [{{ title }}]({{ commit.reference }})\ - [{{ title }}]({{ commit.reference }})\
{% if author.contributor %} by {{ author.reference }}{% end %} {% if author.contributor %} by {{ author.reference }}{% end %}
{% end %} {% end %}
{% end %} {% end %}
...@@ -477,6 +508,9 @@ template: > ...@@ -477,6 +508,9 @@ template: >
{% end %} {% end %}
``` ```
Note that when specifying the template you should use `template: |` and not
`template: >`, as the latter doesn't preserve newlines in the template.
### Template data ### Template data
At the top level, the following variable is available: At the top level, the following variable is available:
......
...@@ -228,7 +228,7 @@ module API ...@@ -228,7 +228,7 @@ module API
service.execute service.execute
status(200) status(200)
rescue => ex rescue Gitlab::Changelog::Error => ex
render_api_error!("Failed to generate the changelog: #{ex.message}", 500) render_api_error!("Failed to generate the changelog: #{ex.message}", 500)
end end
end end
......
# frozen_string_literal: true
module Gitlab
module Changelog
# AST nodes to evaluate when rendering a template.
#
# Evaluating an AST is done by walking over the nodes and calling
# `evaluate`. This method takes two arguments:
#
# 1. An instance of `EvalState`, used for tracking data such as the number
# of nested loops.
# 2. An object used as the data for the current scope. This can be an Array,
# Hash, String, or something else. It's up to the AST node to determine
# what to do with it.
#
# While tree walking interpreters (such as implemented here) aren't usually
# the fastest type of interpreter, they are:
#
# 1. Fast enough for our use case
# 2. Easy to implement and maintain
#
# In addition, our AST interpreter doesn't allow for arbitrary code
# execution, unlike existing template engines such as Mustache
# (https://github.com/mustache/mustache/issues/244) or ERB.
#
# Our interpreter also takes care of limiting the number of nested loops.
# And unlike Liquid, our interpreter is much smaller and thus has a smaller
# attack surface. Liquid isn't without its share of issues, such as
# https://github.com/Shopify/liquid/pull/1071.
#
# We also evaluated using Handlebars using the project
# https://github.com/SmartBear/ruby-handlebars. Sadly, this implementation
# of Handlebars doesn't support control of whitespace
# (https://github.com/SmartBear/ruby-handlebars/issues/37), and the project
# didn't appear to be maintained that much.
#
# This doesn't mean these template engines aren't good, instead it means
# they won't work for our use case. For more information, refer to the
# comment https://gitlab.com/gitlab-org/gitlab/-/merge_requests/50063#note_469293322.
module AST
# An identifier in a selector.
Identifier = Struct.new(:name) do
def evaluate(state, data)
return data if name == 'it'
data[name] if data.is_a?(Hash)
end
end
# An integer used in a selector.
Integer = Struct.new(:value) do
def evaluate(state, data)
data[value] if data.is_a?(Array)
end
end
# A selector used for loading a value.
Selector = Struct.new(:steps) do
def evaluate(state, data)
steps.reduce(data) do |current, step|
break if current.nil?
step.evaluate(state, current)
end
end
end
# A tag used for displaying a value in the output.
Variable = Struct.new(:selector) do
def evaluate(state, data)
selector.evaluate(state, data).to_s
end
end
# A collection of zero or more expressions.
Expressions = Struct.new(:nodes) do
def evaluate(state, data)
nodes.map { |node| node.evaluate(state, data) }.join('')
end
end
# A single text node.
Text = Struct.new(:text) do
def evaluate(*)
text
end
end
# An `if` expression, with an optional `else` clause.
If = Struct.new(:condition, :true_body, :false_body) do
def evaluate(state, data)
result =
if truthy?(condition.evaluate(state, data))
true_body.evaluate(state, data)
elsif false_body
false_body.evaluate(state, data)
end
result.to_s
end
def truthy?(value)
# We treat empty collections and such as false, removing the need for
# some sort of `if length(x) > 0` expression.
value.respond_to?(:empty?) ? !value.empty? : !!value
end
end
# An `each` expression.
Each = Struct.new(:collection, :body) do
def evaluate(state, data)
values = collection.evaluate(state, data)
return '' unless values.respond_to?(:each)
# While unlikely to happen, it's possible users attempt to nest many
# loops in order to negatively impact the GitLab instance. To make
# this more difficult, we limit the number of nested loops a user can
# create.
state.enter_loop do
values.map { |value| body.evaluate(state, value) }.join('')
end
end
end
# A class for transforming a raw Parslet AST into a more structured/easier
# to work with AST.
#
# For more information about Parslet transformations, refer to the
# documentation at http://kschiess.github.io/parslet/transform.html.
class Transformer < Parslet::Transform
rule(ident: simple(:name)) { Identifier.new(name.to_s) }
rule(int: simple(:name)) { Integer.new(name.to_i) }
rule(text: simple(:text)) { Text.new(text.to_s) }
rule(exprs: subtree(:nodes)) { Expressions.new(nodes) }
rule(selector: sequence(:steps)) { Selector.new(steps) }
rule(selector: simple(:step)) { Selector.new([step]) }
rule(variable: simple(:selector)) { Variable.new(selector) }
rule(each: simple(:values), body: simple(:body)) do
Each.new(values, body)
end
rule(if: simple(:cond), true_body: simple(:true_body)) do
If.new(cond, true_body)
end
rule(
if: simple(:cond),
true_body: simple(:true_body),
false_body: simple(:false_body)
) do
If.new(cond, true_body, false_body)
end
end
end
end
end
...@@ -4,8 +4,6 @@ module Gitlab ...@@ -4,8 +4,6 @@ module Gitlab
module Changelog module Changelog
# A class used for committing a release's changelog to a Git repository. # A class used for committing a release's changelog to a Git repository.
class Committer class Committer
CommitError = Class.new(StandardError)
def initialize(project, user) def initialize(project, user)
@project = project @project = project
@user = user @user = user
...@@ -25,7 +23,7 @@ module Gitlab ...@@ -25,7 +23,7 @@ module Gitlab
# When retrying, we need to reprocess the existing changelog from # When retrying, we need to reprocess the existing changelog from
# scratch, otherwise we may end up throwing away changes. As such, all # scratch, otherwise we may end up throwing away changes. As such, all
# the logic is contained within the retry block. # the logic is contained within the retry block.
Retriable.retriable(on: CommitError) do Retriable.retriable(on: Error) do
commit = Gitlab::Git::Commit.last_for_path( commit = Gitlab::Git::Commit.last_for_path(
@project.repository, @project.repository,
branch, branch,
...@@ -57,7 +55,7 @@ module Gitlab ...@@ -57,7 +55,7 @@ module Gitlab
result = service.execute result = service.execute
raise CommitError.new(result[:message]) if result[:status] != :success raise Error.new(result[:message]) if result[:status] != :success
end end
end end
......
...@@ -4,8 +4,6 @@ module Gitlab ...@@ -4,8 +4,6 @@ module Gitlab
module Changelog module Changelog
# Configuration settings used when generating changelogs. # Configuration settings used when generating changelogs.
class Config class Config
ConfigError = Class.new(StandardError)
# When rendering changelog entries, authors are not included. # When rendering changelog entries, authors are not included.
AUTHORS_NONE = 'none' AUTHORS_NONE = 'none'
...@@ -37,17 +35,14 @@ module Gitlab ...@@ -37,17 +35,14 @@ module Gitlab
end end
if (template = hash['template']) if (template = hash['template'])
# We use the full namespace here (and further down) as otherwise Rails config.template = Parser.new.parse_and_transform(template)
# may use the wrong constant when autoloading is used.
config.template =
::Gitlab::Changelog::Template::Compiler.new.compile(template)
end end
if (categories = hash['categories']) if (categories = hash['categories'])
if categories.is_a?(Hash) if categories.is_a?(Hash)
config.categories = categories config.categories = categories
else else
raise ConfigError, 'The "categories" configuration key must be a Hash' raise Error, 'The "categories" configuration key must be a Hash'
end end
end end
...@@ -57,8 +52,7 @@ module Gitlab ...@@ -57,8 +52,7 @@ module Gitlab
def initialize(project) def initialize(project)
@project = project @project = project
@date_format = DEFAULT_DATE_FORMAT @date_format = DEFAULT_DATE_FORMAT
@template = @template = Parser.new.parse_and_transform(DEFAULT_TEMPLATE)
::Gitlab::Changelog::Template::Compiler.new.compile(DEFAULT_TEMPLATE)
@categories = {} @categories = {}
end end
......
# frozen_string_literal: true
module Gitlab
module Changelog
# An error raised when a changelog couldn't be generated.
Error = Class.new(StandardError)
end
end
# frozen_string_literal: true
module Gitlab
module Changelog
# A class for tracking state when evaluating a template
class EvalState
MAX_LOOPS = 4
def initialize
@loops = 0
end
def enter_loop
if @loops == MAX_LOOPS
raise Error, "You can only nest up to #{MAX_LOOPS} loops"
end
@loops += 1
retval = yield
@loops -= 1
retval
end
end
end
end
# frozen_string_literal: true
module Gitlab
module Changelog
# A parser for the template syntax used for generating changelogs.
#
# As a quick primer on the template syntax, a basic template looks like
# this:
#
# {% each users %}
# Name: {{name}}
# Age: {{age}}
#
# {% if birthday %}
# This user is celebrating their birthday today! Yay!
# {% end %}
# {% end %}
#
# For more information, refer to the Parslet documentation found at
# http://kschiess.github.io/parslet/.
class Parser < Parslet::Parser
root(:exprs)
rule(:exprs) do
(
variable | if_expr | each_expr | escaped | text | newline
).repeat.as(:exprs)
end
rule(:space) { match('[ \\t]') }
rule(:whitespace) { match('\s').repeat }
rule(:lf) { str("\n") }
rule(:newline) { lf.as(:text) }
# Escaped newlines are ignored, allowing the user to control the
# whitespace in the output. All other escape sequences are treated as
# literal text.
#
# For example, this:
#
# foo \
# bar
#
# Is parsed into this:
#
# foo bar
rule(:escaped) do
backslash = str('\\')
(backslash >> lf).ignore | (backslash >> chars).as(:text)
end
# A sequence of regular characters, with the exception of newlines and
# escaped newlines.
rule(:chars) do
char = match("[^{\\\\\n]")
# The rules here are such that we do treat single curly braces or
# non-opening tags (e.g. `{foo}`) as text, but not opening tags
# themselves (e.g. `{{`).
(
char.repeat(1) | curly_open >> (curly_open | percent).absent?
).repeat(1)
end
rule(:text) { chars.as(:text) }
# An integer, limited to 10 digits (= a 32 bits integer).
#
# The size is limited to prevents users from creating integers that are
# too large, as this may result in runtime errors.
rule(:integer) { match('\d').repeat(1, 10).as(:int) }
# An identifier to look up in a data structure.
#
# We only support simple ASCII identifiers as we simply don't have a need
# for more complex identifiers (e.g. those containing multibyte
# characters).
rule(:ident) { match('[a-zA-Z_]').repeat(1).as(:ident) }
# A selector is used for reading a value, consisting of one or more
# "steps".
#
# Examples:
#
# name
# users.0.name
# 0
# it
rule(:selector) do
step = ident | integer
whitespace >>
(step >> (str('.') >> step).repeat).as(:selector) >>
whitespace
end
rule(:curly_open) { str('{') }
rule(:curly_close) { str('}') }
rule(:percent) { str('%') }
# A variable tag.
#
# Examples:
#
# {{name}}
# {{users.0.name}}
rule(:variable) do
curly_open.repeat(2) >> selector.as(:variable) >> curly_close.repeat(2)
end
rule(:expr_open) { curly_open >> percent >> whitespace }
rule(:expr_close) do
# Since whitespace control is important (as Markdown is whitespace
# sensitive), we default to stripping a newline that follows a %} tag.
# This is less annoying compared to having to opt-in to this behaviour.
whitespace >> percent >> curly_close >> lf.maybe.ignore
end
rule(:end_tag) { expr_open >> str('end') >> expr_close }
# An `if` expression, with an optional `else` clause.
#
# Examples:
#
# {% if foo %}
# yes
# {% end %}
#
# {% if foo %}
# yes
# {% else %}
# no
# {% end %}
rule(:if_expr) do
else_tag =
expr_open >> str('else') >> expr_close >> exprs.as(:false_body)
expr_open >>
str('if') >>
space.repeat(1) >>
selector.as(:if) >>
expr_close >>
exprs.as(:true_body) >>
else_tag.maybe >>
end_tag
end
# An `each` expression, used for iterating over collections.
#
# Example:
#
# {% each users %}
# * {{name}}
# {% end %}
rule(:each_expr) do
expr_open >>
str('each') >>
space.repeat(1) >>
selector.as(:each) >>
expr_close >>
exprs.as(:body) >>
end_tag
end
def parse_and_transform(input)
AST::Transformer.new.apply(parse(input))
rescue Parslet::ParseFailed => ex
# We raise a custom error so it's easier to catch different changelog
# related errors. In addition, this ensures the caller of this method
# doesn't depend on a Parslet specific error class.
raise Error.new("Failed to parse the template: #{ex.message}")
end
end
end
end
...@@ -54,14 +54,16 @@ module Gitlab ...@@ -54,14 +54,16 @@ module Gitlab
end end
def to_markdown def to_markdown
state = EvalState.new
data = { 'categories' => entries_for_template }
# While not critical, we would like release sections to be separated by # While not critical, we would like release sections to be separated by
# an empty line in the changelog; ensuring it's readable even in its # an empty line in the changelog; ensuring it's readable even in its
# raw form. # raw form.
# #
# Since it can be a bit tricky to get this right using Liquid, we # Since it can be a bit tricky to get this right in a template, we
# enforce an empty line separator ourselves. # enforce an empty line separator ourselves.
markdown = markdown = @config.template.evaluate(state, data).strip
@config.template.render('categories' => entries_for_template).strip
# The release header can't be changed using the Liquid template, as we # The release header can't be changed using the Liquid template, as we
# need this to be in a known format. Without this restriction, we won't # need this to be in a known format. Without this restriction, we won't
...@@ -80,14 +82,20 @@ module Gitlab ...@@ -80,14 +82,20 @@ module Gitlab
end end
def entries_for_template def entries_for_template
@entries.map do |category, entries| rows = []
{
@entries.each do |category, entries|
next if entries.empty?
rows << {
'title' => category, 'title' => category,
'count' => entries.length, 'count' => entries.length,
'single_change' => entries.length == 1, 'single_change' => entries.length == 1,
'entries' => entries 'entries' => entries
} }
end end
rows
end end
end end
end end
......
...@@ -6,6 +6,7 @@ ...@@ -6,6 +6,7 @@
- [{{ title }}]({{ commit.reference }})\ - [{{ title }}]({{ commit.reference }})\
{% if author.contributor %} by {{ author.reference }}{% end %}\ {% if author.contributor %} by {{ author.reference }}{% end %}\
{% if merge_request %} ([merge request]({{ merge_request.reference }})){% end %} {% if merge_request %} ([merge request]({{ merge_request.reference }})){% end %}
{% end %} {% end %}
{% end %} {% end %}
......
# frozen_string_literal: true
module Gitlab
module Changelog
module Template
# Compiler is used for turning a minimal user templating language into an
# ERB template, without giving the user access to run arbitrary code.
#
# The template syntax is deliberately made as minimal as possible, and
# only supports the following:
#
# * Printing a value
# * Iterating over collections
# * if/else
#
# The syntax looks as follows:
#
# {% each users %}
#
# Name: {{user}}
# Likes cats: {% if likes_cats %}yes{% else %}no{% end %}
#
# {% end %}
#
# Newlines can be escaped by ending a line with a backslash. So this:
#
# foo \
# bar
#
# Is the same as this:
#
# foo bar
#
# Templates are compiled into ERB templates, while taking care to make
# sure the user can't run arbitrary code. By using ERB we can let it do
# the heavy lifting of rendering data; all we need to provide is a
# translation layer.
#
# # Security
#
# The template syntax this compiler exposes is safe to be used by
# untrusted users. Not only are they unable to run arbitrary code, the
# compiler also enforces a limit on the integer sizes and the number of
# nested loops. ERB tags added by the user are also disabled.
class Compiler
# A pattern to match a single integer, with an upper size limit.
#
# We enforce a limit of 10 digits (= a 32 bits integer) so users can't
# trigger the allocation of infinitely large bignums, or trigger
# RangeError errors when using such integers to access an array value.
INTEGER = /^\d{1,10}$/.freeze
# The name/path of a variable, such as `user.address.city`.
#
# It's important that this regular expression _doesn't_ allow for
# anything but letters, numbers, and underscores, otherwise a user may
# use those to "escape" our template and run arbirtary Ruby code. For
# example, take this variable:
#
# {{') ::Kernel.exit #'}}
#
# This would then be compiled into:
#
# <%= read(variables, '') ::Kernel.exit #'') %>
#
# Restricting the allowed characters makes this impossible.
VAR_NAME = /([\w\.]+)/.freeze
# A variable tag, such as `{{username}}`.
VAR = /{{ \s* #{VAR_NAME} \s* }}/x.freeze
# The opening tag for a statement.
STM_START = /{% \s*/x.freeze
# The closing tag for a statement.
STM_END = /\s* %}/x.freeze
# A regular `end` closing tag.
NORMAL_END = /#{STM_START} end #{STM_END}/x.freeze
# An `end` closing tag on its own line, without any non-whitespace
# preceding or following it.
#
# These tags need some special care to make it easier to control
# whitespace.
LONELY_END = /^\s*#{NORMAL_END}\s$/x.freeze
# An `else` tag.
ELSE = /#{STM_START} else #{STM_END}/x.freeze
# The start of an `each` tag.
EACH = /#{STM_START} each \s+ #{VAR_NAME} #{STM_END}/x.freeze
# The start of an `if` tag.
IF = /#{STM_START} if \s+ #{VAR_NAME} #{STM_END}/x.freeze
# The pattern to use for escaping newlines.
ESCAPED_NEWLINE = /\\\n$/.freeze
# The start tag for ERB tags. These tags will be escaped, preventing
# users from using ERB directly.
ERB_START_TAG = /<\\?\s*\\?\s*%/.freeze
def compile(template)
transformed_lines = ['<% it = variables %>']
# ERB tags must be stripped here, otherwise a user may introduce ERB
# tags by making clever use of whitespace. See
# https://gitlab.com/gitlab-org/gitlab/-/issues/300224 for more
# information.
template = template.gsub(ERB_START_TAG, '<%%')
template.each_line { |line| transformed_lines << transform(line) }
# We use the full namespace here as otherwise Rails may use the wrong
# constant when autoloading is used.
::Gitlab::Changelog::Template::Template.new(transformed_lines.join)
end
def transform(line)
line.gsub!(ESCAPED_NEWLINE, '')
# This replacement ensures that "end" blocks on their own lines
# don't add extra newlines. Using an ERB -%> tag sadly swallows too
# many newlines.
line.gsub!(LONELY_END, '<% end %>')
line.gsub!(NORMAL_END, '<% end %>')
line.gsub!(ELSE, '<% else -%>')
line.gsub!(EACH) do
# No, `it; variables` isn't a syntax error. Using `;` marks
# `variables` as block-local, making it possible to re-assign it
# without affecting outer definitions of this variable. We use
# this to scope template variables to the right input Hash.
"<% each(#{read_path(Regexp.last_match(1))}) do |it; variables| -%><% variables = it -%>"
end
line.gsub!(IF) { "<% if truthy?(#{read_path(Regexp.last_match(1))}) -%>" }
line.gsub!(VAR) { "<%= #{read_path(Regexp.last_match(1))} %>" }
line
end
def read_path(path)
return path if path == 'it'
args = path.split('.')
args.map! { |arg| arg.match?(INTEGER) ? "#{arg}" : "'#{arg}'" }
"read(variables, #{args.join(', ')})"
end
end
end
end
end
# frozen_string_literal: true
module Gitlab
module Changelog
module Template
# Context is used to provide a binding/context to ERB templates used for
# rendering changelogs.
#
# This class extends BasicObject so that we only expose the bare minimum
# needed to render the ERB template.
class Context < BasicObject
MAX_NESTED_LOOPS = 4
def initialize(variables)
@variables = variables
@loop_nesting = 0
end
def get_binding
::Kernel.binding
end
def each(value, &block)
max = MAX_NESTED_LOOPS
if @loop_nesting == max
::Kernel.raise(
::Template::TemplateError.new("You can only nest up to #{max} loops")
)
end
@loop_nesting += 1
result = value.each(&block) if value.respond_to?(:each)
@loop_nesting -= 1
result
end
# rubocop: disable Style/TrivialAccessors
def variables
@variables
end
# rubocop: enable Style/TrivialAccessors
def read(source, *steps)
current = source
steps.each do |step|
case current
when ::Hash
current = current[step]
when ::Array
return '' unless step.is_a?(::Integer)
current = current[step]
else
break
end
end
current
end
def truthy?(value)
value.respond_to?(:any?) ? value.any? : !!value
end
end
end
end
end
# frozen_string_literal: true
module Gitlab
module Changelog
module Template
# A wrapper around an ERB template user for rendering changelogs.
class Template
TemplateError = Class.new(StandardError)
def initialize(erb)
# Don't change the trim mode, as this may require changes to the
# regular expressions used to turn the template syntax into ERB
# tags.
@erb = ERB.new(erb, trim_mode: '-')
end
def render(data)
context = Context.new(data).get_binding
# ERB produces a SyntaxError when processing templates, as it
# internally uses eval() for this.
@erb.result(context)
rescue SyntaxError
raise TemplateError.new("The template's syntax is invalid")
end
end
end
end
end
# frozen_string_literal: true
require 'spec_helper'
RSpec.describe Gitlab::Changelog::AST::Identifier do
let(:state) { Gitlab::Changelog::EvalState.new }
describe '#evaluate' do
it 'evaluates a selector' do
data = { 'number' => 10 }
expect(described_class.new('number').evaluate(state, data)).to eq(10)
end
it 'returns nil if the key is not set' do
expect(described_class.new('number').evaluate(state, {})).to be_nil
end
it 'returns nil if the input is not a Hash' do
expect(described_class.new('number').evaluate(state, 45)).to be_nil
end
it 'returns the current data when using the special identifier "it"' do
expect(described_class.new('it').evaluate(state, 45)).to eq(45)
end
end
end
RSpec.describe Gitlab::Changelog::AST::Integer do
let(:state) { Gitlab::Changelog::EvalState.new }
describe '#evaluate' do
it 'evaluates a selector' do
expect(described_class.new(0).evaluate(state, [10])).to eq(10)
end
it 'returns nil if the index is not set' do
expect(described_class.new(1).evaluate(state, [10])).to be_nil
end
it 'returns nil if the input is not an Array' do
expect(described_class.new(0).evaluate(state, {})).to be_nil
end
end
end
RSpec.describe Gitlab::Changelog::AST::Selector do
let(:state) { Gitlab::Changelog::EvalState.new }
let(:data) { { 'numbers' => [10] } }
describe '#evaluate' do
it 'evaluates a selector' do
ident = Gitlab::Changelog::AST::Identifier.new('numbers')
int = Gitlab::Changelog::AST::Integer.new(0)
expect(described_class.new([ident, int]).evaluate(state, data)).to eq(10)
end
it 'evaluates a selector that returns nil' do
int = Gitlab::Changelog::AST::Integer.new(0)
expect(described_class.new([int]).evaluate(state, data)).to be_nil
end
end
end
RSpec.describe Gitlab::Changelog::AST::Variable do
let(:state) { Gitlab::Changelog::EvalState.new }
let(:data) { { 'numbers' => [10] } }
describe '#evaluate' do
it 'evaluates a variable' do
node = Gitlab::Changelog::Parser
.new
.parse_and_transform('{{numbers.0}}')
.nodes[0]
expect(node.evaluate(state, data)).to eq('10')
end
it 'evaluates an undefined variable' do
node =
Gitlab::Changelog::Parser.new.parse_and_transform('{{foobar}}').nodes[0]
expect(node.evaluate(state, data)).to eq('')
end
it 'evaluates the special variable "it"' do
node =
Gitlab::Changelog::Parser.new.parse_and_transform('{{it}}').nodes[0]
expect(node.evaluate(state, data)).to eq(data.to_s)
end
end
end
RSpec.describe Gitlab::Changelog::AST::Expressions do
let(:state) { Gitlab::Changelog::EvalState.new }
describe '#evaluate' do
it 'evaluates all expressions' do
node = Gitlab::Changelog::Parser
.new
.parse_and_transform('{{number}}foo')
expect(node.evaluate(state, { 'number' => 10 })).to eq('10foo')
end
end
end
RSpec.describe Gitlab::Changelog::AST::Text do
let(:state) { Gitlab::Changelog::EvalState.new }
describe '#evaluate' do
it 'returns the text' do
expect(described_class.new('foo').evaluate(state, {})).to eq('foo')
end
end
end
RSpec.describe Gitlab::Changelog::AST::If do
let(:state) { Gitlab::Changelog::EvalState.new }
describe '#evaluate' do
it 'evaluates a truthy if expression without an else clause' do
node = Gitlab::Changelog::Parser
.new
.parse_and_transform('{% if thing %}foo{% end %}')
.nodes[0]
expect(node.evaluate(state, { 'thing' => true })).to eq('foo')
end
it 'evaluates a falsy if expression without an else clause' do
node = Gitlab::Changelog::Parser
.new
.parse_and_transform('{% if thing %}foo{% end %}')
.nodes[0]
expect(node.evaluate(state, { 'thing' => false })).to eq('')
end
it 'evaluates a falsy if expression with an else clause' do
node = Gitlab::Changelog::Parser
.new
.parse_and_transform('{% if thing %}foo{% else %}bar{% end %}')
.nodes[0]
expect(node.evaluate(state, { 'thing' => false })).to eq('bar')
end
end
describe '#truthy?' do
it 'returns true for a non-empty String' do
expect(described_class.new.truthy?('foo')).to eq(true)
end
it 'returns true for a non-empty Array' do
expect(described_class.new.truthy?([10])).to eq(true)
end
it 'returns true for a Boolean true' do
expect(described_class.new.truthy?(true)).to eq(true)
end
it 'returns false for an empty String' do
expect(described_class.new.truthy?('')).to eq(false)
end
it 'returns true for an empty Array' do
expect(described_class.new.truthy?([])).to eq(false)
end
it 'returns false for a Boolean false' do
expect(described_class.new.truthy?(false)).to eq(false)
end
end
end
RSpec.describe Gitlab::Changelog::AST::Each do
let(:state) { Gitlab::Changelog::EvalState.new }
describe '#evaluate' do
it 'evaluates the expression' do
data = { 'animals' => [{ 'name' => 'Cat' }, { 'name' => 'Dog' }] }
node = Gitlab::Changelog::Parser
.new
.parse_and_transform('{% each animals %}{{name}}{% end %}')
.nodes[0]
expect(node.evaluate(state, data)).to eq('CatDog')
end
it 'returns an empty string when the input is not a collection' do
data = { 'animals' => 10 }
node = Gitlab::Changelog::Parser
.new
.parse_and_transform('{% each animals %}{{name}}{% end %}')
.nodes[0]
expect(node.evaluate(state, data)).to eq('')
end
it 'disallows too many nested loops' do
data = {
'foo' => [
{
'bar' => [
{
'baz' => [
{
'quix' => [
{
'foo' => [{ 'name' => 'Alice' }]
}
]
}
]
}
]
}
]
}
template = <<~TPL
{% each foo %}
{% each bar %}
{% each baz %}
{% each quix %}
{% each foo %}
{{name}}
{% end %}
{% end %}
{% end %}
{% end %}
{% end %}
TPL
node =
Gitlab::Changelog::Parser.new.parse_and_transform(template).nodes[0]
expect { node.evaluate(state, data) }
.to raise_error(Gitlab::Changelog::Error)
end
end
end
...@@ -88,7 +88,7 @@ RSpec.describe Gitlab::Changelog::Committer do ...@@ -88,7 +88,7 @@ RSpec.describe Gitlab::Changelog::Committer do
end end
context "when the changelog changes before saving the changes" do context "when the changelog changes before saving the changes" do
it 'raises a CommitError' do it 'raises a Error' do
release1 = Gitlab::Changelog::Release release1 = Gitlab::Changelog::Release
.new(version: '1.0.0', date: Time.utc(2020, 1, 1), config: config) .new(version: '1.0.0', date: Time.utc(2020, 1, 1), config: config)
...@@ -121,7 +121,7 @@ RSpec.describe Gitlab::Changelog::Committer do ...@@ -121,7 +121,7 @@ RSpec.describe Gitlab::Changelog::Committer do
branch: 'master', branch: 'master',
message: 'Test commit' message: 'Test commit'
) )
end.to raise_error(described_class::CommitError) end.to raise_error(Gitlab::Changelog::Error)
end end
end end
end end
......
...@@ -41,13 +41,15 @@ RSpec.describe Gitlab::Changelog::Config do ...@@ -41,13 +41,15 @@ RSpec.describe Gitlab::Changelog::Config do
) )
expect(config.date_format).to eq('foo') expect(config.date_format).to eq('foo')
expect(config.template).to be_instance_of(Gitlab::Changelog::Template::Template) expect(config.template)
.to be_instance_of(Gitlab::Changelog::AST::Expressions)
expect(config.categories).to eq({ 'foo' => 'bar' }) expect(config.categories).to eq({ 'foo' => 'bar' })
end end
it 'raises ConfigError when the categories are not a Hash' do it 'raises Error when the categories are not a Hash' do
expect { described_class.from_hash(project, 'categories' => 10) } expect { described_class.from_hash(project, 'categories' => 10) }
.to raise_error(described_class::ConfigError) .to raise_error(Gitlab::Changelog::Error)
end end
end end
......
# frozen_string_literal: true
require 'spec_helper'
RSpec.describe Gitlab::Changelog::Parser do
let(:parser) { described_class.new }
describe '#root' do
it 'parses an empty template' do
expect(parser.root).to parse('')
end
it 'parses a variable with a single identifier step' do
expect(parser.root).to parse('{{foo}}')
end
it 'parses a variable with a single integer step' do
expect(parser.root).to parse('{{0}}')
end
it 'parses a variable with multiple selector steps' do
expect(parser.root).to parse('{{foo.bar}}')
end
it 'parses a variable with an integer selector step' do
expect(parser.root).to parse('{{foo.bar.0}}')
end
it 'parses the special "it" variable' do
expect(parser.root).to parse('{{it}}')
end
it 'parses a text node' do
expect(parser.root).to parse('foo')
end
it 'parses an if expression' do
expect(parser.root).to parse('{% if foo %}bar{% end %}')
end
it 'parses an if-else expression' do
expect(parser.root).to parse('{% if foo %}bar{% else %}baz{% end %}')
end
it 'parses an each expression' do
expect(parser.root).to parse('{% each foo %}foo{% end %}')
end
it 'parses an escaped newline' do
expect(parser.root).to parse("foo\\\nbar")
end
it 'parses a regular newline' do
expect(parser.root).to parse("foo\nbar")
end
it 'parses the default changelog template' do
expect(parser.root).to parse(Gitlab::Changelog::Config::DEFAULT_TEMPLATE)
end
it 'raises an error when parsing an integer selector that is too large' do
expect(parser.root).not_to parse('{{100000000000}}')
end
end
describe '#parse_and_transform' do
it 'parses and transforms a template' do
node = parser.parse_and_transform('foo')
expect(node).to be_instance_of(Gitlab::Changelog::AST::Expressions)
end
it 'raises parsing errors using a custom error class' do
expect { parser.parse_and_transform('{% each') }
.to raise_error(Gitlab::Changelog::Error)
end
end
end
...@@ -93,6 +93,28 @@ RSpec.describe Gitlab::Changelog::Release do ...@@ -93,6 +93,28 @@ RSpec.describe Gitlab::Changelog::Release do
OUT OUT
end end
end end
context 'when a category has no entries' do
it "isn't included in the output" do
config.categories['kittens'] = 'Kittens'
config.categories['fixed'] = 'Bug fixes'
release.add_entry(
title: 'Entry title',
commit: commit,
category: 'fixed'
)
expect(release.to_markdown).to eq(<<~OUT)
## 1.0.0 (2021-01-05)
### Bug fixes (1 change)
- [Entry title](#{commit.to_reference(full: true)})
OUT
end
end
end end
describe '#header_start_position' do describe '#header_start_position' do
......
# frozen_string_literal: true
require 'spec_helper'
RSpec.describe Gitlab::Changelog::Template::Compiler do
def compile(template, data = {})
Gitlab::Changelog::Template::Compiler.new.compile(template).render(data)
end
describe '#compile' do
it 'compiles an empty template' do
expect(compile('')).to eq('')
end
it 'compiles a template with an undefined variable' do
expect(compile('{{number}}')).to eq('')
end
it 'compiles a template with a defined variable' do
expect(compile('{{number}}', 'number' => 42)).to eq('42')
end
it 'compiles a template with the special "it" variable' do
expect(compile('{{it}}', 'values' => 10)).to eq({ 'values' => 10 }.to_s)
end
it 'compiles a template containing an if statement' do
expect(compile('{% if foo %}yes{% end %}', 'foo' => true)).to eq('yes')
end
it 'compiles a template containing an if/else statement' do
expect(compile('{% if foo %}yes{% else %}no{% end %}', 'foo' => false))
.to eq('no')
end
it 'compiles a template that iterates over an Array' do
expect(compile('{% each numbers %}{{it}}{% end %}', 'numbers' => [1, 2, 3]))
.to eq('123')
end
it 'compiles a template that iterates over a Hash' do
output = compile(
'{% each pairs %}{{0}}={{1}}{% end %}',
'pairs' => { 'key' => 'value' }
)
expect(output).to eq('key=value')
end
it 'compiles a template that iterates over a Hash of Arrays' do
output = compile(
'{% each values %}{{key}}{% end %}',
'values' => [{ 'key' => 'value' }]
)
expect(output).to eq('value')
end
it 'compiles a template with a variable path' do
output = compile('{{foo.bar}}', 'foo' => { 'bar' => 10 })
expect(output).to eq('10')
end
it 'compiles a template with a variable path that uses an Array index' do
output = compile('{{foo.values.1}}', 'foo' => { 'values' => [10, 20] })
expect(output).to eq('20')
end
it 'compiles a template with a variable path that uses a Hash and a numeric index' do
output = compile('{{foo.1}}', 'foo' => { 'key' => 'value' })
expect(output).to eq('')
end
it 'compiles a template with a variable path that uses an Array and a String based index' do
output = compile('{{foo.numbers.bla}}', 'foo' => { 'numbers' => [10, 20] })
expect(output).to eq('')
end
it 'ignores ERB tags provided by the user' do
input = '<% exit %> <%= exit %> <%= foo -%>'
expect(compile(input)).to eq(input)
end
it 'removes newlines introduced by end statements on their own lines' do
output = compile(<<~TPL, 'foo' => true)
{% if foo %}
foo
{% end %}
TPL
expect(output).to eq("foo\n")
end
it 'supports escaping of trailing newlines' do
output = compile(<<~TPL)
foo \
bar\
baz
TPL
expect(output).to eq("foo barbaz\n")
end
# rubocop: disable Lint/InterpolationCheck
it 'ignores embedded Ruby expressions' do
input = '#{exit}'
expect(compile(input)).to eq(input)
end
# rubocop: enable Lint/InterpolationCheck
it 'ignores ERB tags inside variable tags' do
input = '{{<%= exit %>}}'
expect(compile(input)).to eq(input)
end
it 'ignores malicious code that tries to escape a variable' do
input = "{{') ::Kernel.exit # '}}"
expect(compile(input)).to eq(input)
end
it 'ignores malicious code that makes use of whitespace' do
input = "x<\\\n%::Kernel.system(\"id\")%>"
expect(Kernel).not_to receive(:system).with('id')
expect(compile(input)).to eq('x<%::Kernel.system("id")%>')
end
end
end
...@@ -672,7 +672,7 @@ RSpec.describe API::Repositories do ...@@ -672,7 +672,7 @@ RSpec.describe API::Repositories do
allow(spy) allow(spy)
.to receive(:execute) .to receive(:execute)
.and_raise(Gitlab::Changelog::Committer::CommitError.new('oops')) .and_raise(Gitlab::Changelog::Error.new('oops'))
post( post(
api("/projects/#{project.id}/repository/changelog", user), api("/projects/#{project.id}/repository/changelog", user),
......
...@@ -34,6 +34,7 @@ require 'rspec-parameterized' ...@@ -34,6 +34,7 @@ require 'rspec-parameterized'
require 'shoulda/matchers' require 'shoulda/matchers'
require 'test_prof/recipes/rspec/let_it_be' require 'test_prof/recipes/rspec/let_it_be'
require 'test_prof/factory_default' require 'test_prof/factory_default'
require 'parslet/rig/rspec'
rspec_profiling_is_configured = rspec_profiling_is_configured =
ENV['RSPEC_PROFILING_POSTGRES_URL'].present? || ENV['RSPEC_PROFILING_POSTGRES_URL'].present? ||
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment