1. 03 Aug, 2016 1 commit
    • Yorick Peterse's avatar
      Improve AutolinkFilter#text_parse performance · dd35c3dd
      Yorick Peterse authored
      By using clever XPath queries we can quite significantly improve the
      performance of this method. The actual improvement depends a bit on the
      amount of links used but in my tests the new implementation is usually
      around 8 times faster than the old one. This was measured using the
      following benchmark:
      
          require 'benchmark/ips'
      
          text = '<p>' + Note.select("string_agg(note, '') AS note").limit(50).take[:note] + '</p>'
          document = Nokogiri::HTML.fragment(text)
          filter = Banzai::Filter::AutolinkFilter.new(document, autolink: true)
      
          puts "Input size: #{(text.bytesize.to_f / 1024 / 1024).round(2)} MB"
      
          filter.rinku_parse
      
          Benchmark.ips(time: 15) do |bench|
            bench.report 'text_parse' do
              filter.text_parse
            end
      
            bench.report 'text_parse_fast' do
              filter.text_parse_fast
            end
      
            bench.compare!
          end
      
      Here the "text_parse_fast" method is the new implementation and
      "text_parse" the old one. The input size was around 180 MB. Running this
      benchmark outputs the following:
      
          Input size: 181.16 MB
          Calculating -------------------------------------
                    text_parse     1.000  i/100ms
               text_parse_fast     9.000  i/100ms
          -------------------------------------------------
                    text_parse     13.021  (±15.4%) i/s -    188.000
               text_parse_fast    112.741  (± 3.5%) i/s -      1.692k
      
          Comparison:
               text_parse_fast:      112.7 i/s
                    text_parse:       13.0 i/s - 8.66x slower
      
      Again the production timings may (and most likely will) vary depending
      on the input being processed.
      dd35c3dd
  2. 02 Aug, 2016 16 commits
  3. 01 Aug, 2016 23 commits