Commit e9ac3df7 authored by Nick Thomas's avatar Nick Thomas

Fall back to the raw diff if base64 encoding fails

A small number of merge request diff files have `binary: true` set
inappropriately. During the external diffs migration on GitLab.com,
approximately 30 in every million (0.003%) fail because the `diffs`
column is the plain diff, rather than the base64-encoded diff.

It would be nicer to correct the data, but a quick and easy fix is to
fall back to the raw column data if the diff is unreadable as base64.

This is particularly safe because the data must *also* be processed as
a diff before being shown to the user, and two characters very commonly
seen in diffs - `-` and `@` - are not valid base64 characters.

This change will fix display and migration of these broken diffs, and
as a happy side effect, the migration to external storage, along with
this change, will also fix the `binary` column for us, so the error
will correct over time.
parent a60e3a28
......@@ -25,6 +25,16 @@ class MergeRequestDiffFile < ApplicationRecord
super
end
binary? ? content.unpack1('m0') : content
return content unless binary?
# If the data isn't valid base64, return it as-is, since it's almost certain
# to be a valid diff. Parsing it as a diff will fail if it's something else.
#
# https://gitlab.com/gitlab-org/gitlab/-/issues/240921
begin
content.unpack1('m0')
rescue ArgumentError
content
end
end
end
---
title: Fix reading some merge request diffs
merge_request: 40598
author:
type: fixed
......@@ -25,6 +25,14 @@ RSpec.describe MergeRequestDiffFile do
it 'unpacks from base 64' do
expect(subject.diff).to eq(unpacked)
end
context 'invalid base64' do
let(:packed) { '---/dev/null' }
it 'returns the raw diff' do
expect(subject.diff).to eq(packed)
end
end
end
context 'when the diff is not marked as binary' do
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment