Commit 334a9775 authored by Kamil Trzciński, committed by Aleksei Lipniagov

Make constant-memory export serializer

This makes serialization run lazily, using a
memory-optimised approach that is fast to recycle.

The biggest benefit comes from no longer storing
the full Hash of the serialized objects. That is
expensive, as it often also keeps the original
representations of the objects in memory.

This solves that by serializing in batches and
writing the exact raw JSON content to the generated string.

The JSON generator has an efficient string
appender, and this change makes use of it.
parent 7fcd35ff
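The approach described in the commit message can be illustrated outside Rails with a minimal sketch. It assumes only Ruby's standard `json` library, whose generator calls `#to_json` on objects it does not know and appends the returned string as-is. `RawJsonBatch` and `generate_flattened` are hypothetical names used for illustration, and plain hashes stand in for ActiveRecord rows; this is not the code from the commit.

```ruby
require 'json'

# Hypothetical stand-in for a batch of records. It holds only the
# comma-joined raw JSON of its items, never an intermediate Array of Hashes.
class RawJsonBatch
  def initialize(items)
    @items = items
  end

  # Builds "item1,item2,..." once and reuses the string afterwards.
  def raw_json
    @raw_json ||= @items.each_with_object(+'') do |item, result|
      result << ',' unless result.empty?
      result << JSON.generate(item)
    end
  end

  # JSON.generate calls #to_json on unknown objects and splices the
  # returned string directly into its output buffer.
  def to_json(*_args)
    raw_json
  end
end

# Splits records into batches and lets JSON.generate stitch them together.
def generate_flattened(records, batch_size)
  batches = records.each_slice(batch_size).map { |slice| RawJsonBatch.new(slice) }
  JSON.generate({ 'issues' => batches })
end

puts generate_flattened([{ 'id' => 1 }, { 'id' => 2 }, { 'id' => 3 }], 2)
```

Because each batch returns its items without surrounding brackets, the array that holds the batches reads as one flat JSON array of records in the output. The sketch relies on every batch being non-empty; an empty batch would emit an empty string and break the surrounding array.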
@@ -26,6 +26,51 @@ module Gitlab
     class FastHashSerializer
       attr_reader :subject, :tree

+      # Usage of this class results in delayed
+      # serialization of relation. The serialization
+      # will be triggered when the `JSON.generate`
+      # is executed.
+      #
+      # This class uses memory-optimised, lazily
+      # initialised, fast to recycle relation
+      # serialization.
+      #
+      # The `JSON.generate` does use `#to_json`,
+      # that returns raw JSON content that is written
+      # directly to file.
+      class JSONBatchRelation
+        include Gitlab::Utils::StrongMemoize
+
+        def initialize(relation, options, preloads)
+          @relation = relation
+          @options = options
+          @preloads = preloads
+        end
+
+        def raw_json
+          strong_memoize(:raw_json) do
+            result = +''
+            batch = @relation
+            batch = batch.preload(@preloads) if @preloads
+
+            batch.each do |item|
+              result.concat(",") unless result.empty?
+              result.concat(item.to_json(@options))
+            end
+
+            result
+          end
+        end
+
+        def to_json(options = {})
+          raw_json
+        end
+
+        def as_json(*)
+          raise NotImplementedError
+        end
+      end
+
       BATCH_SIZE = 100

       def initialize(subject, tree, batch_size: BATCH_SIZE)
@@ -34,8 +79,6 @@ module Gitlab
         @tree = tree
       end

-      # Serializes the subject into a Hash for the given option tree
-      # (e.g. Project#as_json)
       def execute
         simple_serialize.merge(serialize_includes)
       end
@@ -85,12 +128,10 @@ module Gitlab
           return record.as_json(options)
         end

-        # has-many relation
         data = []

         record.in_batches(of: @batch_size) do |batch| # rubocop:disable Cop/InBatches
-          batch = batch.preload(preloads[key]) if preloads&.key?(key)
-          data += batch.as_json(options)
+          data.append(JSONBatchRelation.new(batch, options, preloads[key]).tap(&:raw_json))
         end

         data
......
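The `.tap(&:raw_json)` call in the hunk above forces each batch to serialize inside the `in_batches` block, so only the memoized string needs to outlive the block. A rough sketch of that behaviour in plain Ruby follows. `CountingBatch` and `collect_batches` are hypothetical names, a `defined?` check stands in for `strong_memoize`, and arrays of hashes stand in for ActiveRecord batches.

```ruby
require 'json'

# Hypothetical stand-in for JSONBatchRelation that counts how often
# the expensive serialization step actually runs.
class CountingBatch
  attr_reader :serialize_calls

  def initialize(items)
    @items = items
    @serialize_calls = 0
  end

  # Memoized: the string is built on the first call and reused afterwards.
  def raw_json
    return @raw_json if defined?(@raw_json)

    @serialize_calls += 1
    @raw_json = @items.map { |item| JSON.generate(item) }.join(',')
  end

  def to_json(*_args)
    raw_json
  end
end

# Mirrors the loop in the diff: each batch is serialized eagerly via
# tap(&:raw_json) while the loop still holds it.
def collect_batches(slices)
  data = []
  slices.each do |slice|
    data.append(CountingBatch.new(slice).tap(&:raw_json))
  end
  data
end

data = collect_batches([[{ 'id' => 1 }], [{ 'id' => 2 }]])
puts JSON.generate(data)
```

When `JSON.generate` later calls `#to_json`, it only reads the memoized string, so each batch is serialized exactly once.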
@@ -20,7 +20,8 @@ module Gitlab
         project_tree = serialize_project_tree
         fix_project_tree(project_tree)

-        File.write(full_path, project_tree.to_json)
+        project_tree_json = JSON.generate(project_tree)
+        File.write(full_path, project_tree_json)

         true
       rescue => e
......
 require 'spec_helper'

 describe Gitlab::ImportExport::FastHashSerializer do
-  subject { described_class.new(project, tree).execute }
+  subject { JSON.parse(JSON.generate(described_class.new(project, tree).execute)) }

   let!(:project) { setup_project }
   let(:user) { create(:user) }
......