1. 19 Sep, 2018 3 commits
    • Kirill Smelkov's avatar
      tests: Split TestDecode into main driver + worker that tests 1 particular pickle · 89d4f00e
      Kirill Smelkov authored
      We are going to modify the main tests table and for every case (e.g.
      int(1) add pickles with various encoding that all represent the same
      object when decoded. The main TestDecode driver will iterate over all
      those input data and hand it over for particular testing to testDecode
      worker.
      
      Use t.Run for spawning each case - it is less typing (we do not need to
      manually print test.name on errors, etc), and it allows to select which
      subtests to run via e.g. `go test -run Decode/int`
      89d4f00e
    • Kirill Smelkov's avatar
      tests: Move Graphite and "long line" data definition out from main test table · 81e5b5e7
      Kirill Smelkov authored
      Having that long data makes the table clumsy and harder to understand.
      By moving such data definition out of the table, we make it a bit easier
      to understand.
      
      In the case of "long line", the pickle input and the line itself were
      almost duplicating each other, so instead of having two long lines
      explicitly pasted, let's have the test input be defined as
      
      +	{"too long line", "V" + longLine + "\n.", longLine},
      
      In the case of Graphite messages, one long Graphite object was also
      duplicated in encode_test.go .
      
      Just a cleanup, no semantic change.
      81e5b5e7
    • Kirill Smelkov's avatar
      Drop support for Go1.6 · 283146f0
      Kirill Smelkov authored
      It will be handy to use testing.T.Run in the upcoming patches that add
      more decode/encode tests. However since testing.T.Run was was added in
      Go1.7 the code won't work with Go1.6 .
      
      Since Go1.6 was released in the beginning of 2016 - more than 2.5 years
      ago, and upstream Go policy is to support only current stable (currently
      Go1.11) and two previous releases (currently Go1.10 and Go1.9), as of
      today Go1.6 is both not supported and outdated. So let's drop explicit
      support for it in the name of progress.
      
      By above logic we could also drop support for 1.7 and 1.8, but since
      there is no pressing particular need to do so I'm not dropping them for
      now.
      283146f0
  2. 18 Sep, 2018 6 commits
    • Kirill Smelkov's avatar
      encoder: Raise signal/noise ratio when writing data out · 423fe587
      Kirill Smelkov authored
      Currently we often use constructs like
      
      	_, err := e.w.Write([]byte{opNone})
      	return err
      
      however with added Encoder.emit helper it can become only
      
      	return e.emit(opNone)
      
      which is more clearly readable.
      
      We can do similarly for formatted output:
      
      -	_, err := fmt.Fprintf(e.w, "%c%dL\n", opLong, b)
      -	return err
      +	return e.emitf("%c%dL\n", opLong, b)
      
      Due to much boilerplate in one place (encodeInt) the return of
      fmt.Fprintf was not even checked. We change the code there with
      
      -		_, err = e.w.Write([]byte{opInt})
      -		if err != nil {
      -			return err
      -		}
      -		fmt.Fprintf(e.w, "%d\n", i)
      +		return e.emitf("%c%d\n", opInt, i)
      
      which is now hopefully more visible for what is going on and is easier
      to catch potential mistake by eyes.
      
      A corresponding test for "int64 encoded as string" will be added later.
      423fe587
    • Kirill Smelkov's avatar
      Add definition for Protocol 3 and missing from Protocol 4 opcodes · b21a5f61
      Kirill Smelkov authored
      Protocol 3 adds opcodes to represent Python bytes.
      For Protocol 4 we were missing definitions of several of the opcodes,
      such as BINUNICODE8, EMPTY_SET etc.
      
      Add definitions for them all.
      b21a5f61
    • Kirill Smelkov's avatar
      Split "Protocol 1" opcodes out from "Protocol 0" opcodes · da584909
      Kirill Smelkov authored
      We already keep "Protocol 2" and "Protocol 4" opcodes separately, and
      for a good reason - it must be clear for a person just by looking at
      opcodes definition from which protocol version an opcode comes. However
      at present Protocol 0 and Protocol 1 opcodes come all intermixed with
      each other.
      
      In the upcoming patches we'll be teaching Encoder to take desired pickle
      protocol version into account (similarly to how pickle.dumps accepts desired
      protocol version in Python), and then the encoder will have to use
      different opcodes for different versions: for example if user asks to
      produce "protocol 0" pickle stream it will be invalid to use "protocol
      1" opcodes - e.g. BININT, EMPTY_TUPLE, etc.
      
      So I went through
      
      	https://github.com/python/cpython/blob/master/Lib/pickletools.py
      
      for each opcode searching its protocol version and separated the opcodes
      with proto=1 from protocol 0 opcodes.
      da584909
    • Kirill Smelkov's avatar
      Make all opcode constants to have type byte · f6ba06e8
      Kirill Smelkov authored
      It was probably a thinko in d00e99e7 (Add more opcodes so we can read
      Graphite's Carbon stream) which changed opcodes from string ("X") to
      character ('X') constants and marked the first opcode with byte type,
      probably with the idea that following opcodes will have the same type.
      
      Unfortunately it is not so as the following program demonstrates:
      
      	package main
      
      	const (
      	        opAAA byte = 'a'
      	        opBBB      = 'b'
      	)
      
      	func main() {
      	        op := opBBB
      	        if true {
      	                op = opAAA	// <-- line 11
      	        }
      	        println(op)
      	}
      
      	--> ./bc.go:11:6: cannot use opAAA (type byte) as type rune in assignment
      
      Similarly if we try comparing opcodes, it also results in compile error:
      
      	func main() {
      	        op := opBBB
      	        if op == opAAA {	// <-- line 10
      	                panic(0)
      	        }
      	        println(op)
      	}
      
      	--> ./bc.go:10:8: invalid operation: op == opAAA (mismatched types rune and byte)
      
      Since in the following patches it will be handy for encoder to e.g. set default
      opcode first, and then change it under some conditions to another opcode,
      exactly the same compile error(s) will pop up.
      
      So let's fix all opcode constants to have their type as byte to avoid such
      unexpected errors.
      f6ba06e8
    • Kirill Smelkov's avatar
      decoder: Use .push instead of open-coded .stack append where appropriate · 0d8e88fa
      Kirill Smelkov authored
      For example
      
      	d.push(math.Float64frombits(u))
      
      looks more clear compared to
      
      	d.stack = append(d.stack, math.Float64frombits(u))
      
      and we already use push in many other places.
      0d8e88fa
    • Kirill Smelkov's avatar
      decoder/loadFloat: No need to explicitly cast v to interface{} · f04563c4
      Kirill Smelkov authored
      v is concrete type here (float64) and Go automatically converts it to
      interface{} on call to Decoder.push().
      f04563c4
  3. 17 Sep, 2018 1 commit
    • Kirill Smelkov's avatar
      decoder: Teach loadUnicode to return ErrUnexpectedEOF, not paper it over in tests (#44) · 49c79ab9
      Kirill Smelkov authored
      While doing a5094338 (encoder: More unexpected EOF handling) I was a bit
      disappointed that https://golang.org/cl/37052 was not going to be
      accepted and somehow added hack to tests that was doing
      
      	strconv.ErrSyntax -> io.ErrUnexpectedEOF
      
      error conversion if the test name contained "unicode".
      
      Today I was refactoring tests and noticed that it is better we teach loadUnicode
      to return io.ErrUnexpectedEOF in the first place, and the next easy way for this
      (after fixing strconv.UnquoteChar itself) is to append many "0" to string and
      see if strconv.UnquoteChar still returns ErrSyntax.
      
      So do it in our custom unquoteChar wrapper.
      49c79ab9
  4. 11 Sep, 2018 1 commit
    • Kirill Smelkov's avatar
      decoder: Cleanup decodeLong tests (#43) · 9eb8c4c6
      Kirill Smelkov authored
      Instead of copy-pasting full test driver for each value, make it to be
      only one test with many values in table each tested by common driver.
      
      TestZeroLengthData was checking output.BitLen() == 0 which is another
      way to test for equality with big.Int(0) according to
      
      	https://golang.org/pkg/math/big/#Int.BitLen
      
      So join it too to the rest of decodeLong tests.
      
      While at decodeLong topic, rename BenchmarkSpeed -> BenchmarkDecodeLong
      as this benchmark checks speed of decodeLong only, nothing else.
      9eb8c4c6
  5. 10 Sep, 2018 1 commit
  6. 05 Sep, 2018 1 commit
    • Kirill Smelkov's avatar
      decoder: Fix thinko in PROTO opcode handling (#41) · 5c420f85
      Kirill Smelkov authored
      Due to := in
      
      	v, err := d.r.ReadByte()
      
      newly declared err was shadowing outside err and thus even though the
      code intent was to treat all protocols != 2 as ErrInvalidPickleVersion,
      after leaving the switch err was always =nil and no error was reported.
      
      Fix it by explicitly declaring v and not using := not to shadow err.
      
      We also change the condition for supported protocol version to be in [0, 4] because:
      
      - we already have messages in our testsuite with PROTO opcode and
        version 1, and if we don't adjust the version condition the tests will fail.
      
      - the highest protocol version we support opcodes for is 4.
      - even though the documentation for PROTO opcode says version must be >= 2,
        in practice CPython decodes pickles with PROTO and lower version just ok:
      
      	In [18]: pickle.loads("\x80\x02I5\n.")
      	Out[18]: 5
      
      	In [19]: pickle.loads("\x80\x01I5\n.")
      	Out[19]: 5
      
      	In [20]: pickle.loads("\x80\x00I5\n.")
      	Out[20]: 5
      
        so we don't complain about those lower versions too.
      5c420f85
  7. 29 Aug, 2018 1 commit
  8. 17 Jul, 2018 2 commits
    • Kirill Smelkov's avatar
      Allow to hook-in application-level logic to handle persistent references (#38) · d51cc146
      Kirill Smelkov authored
      Continuing 84598fe1 (Add support for persistent references) theme let's
      develop support for persistent references more: the usual situation (at
      least in ZODB context, which is actually the original reason
      persistent references exist) is that decode converts a reference of
      (type, oid) tuple form into corresponding "ghost" object - an instance
      of type, but not yet loaded with database object's data. This way
      resulted object tree is created right away with types application
      expects (and ZODB/py further cares to automatically load object data
      when that ghost objects are accessed).
      
      To mimic this on Go side, with current state, after decoding, one would
      need to make a full reflect pass on the resulted decoded objects, find
      Refs, and convert them to instances of other types, also caring to
      patch pointers in other objects consistently that were pointing to such
      Refs. In other words it is some work and it coincides almost 100% with
      what Decoder already does by itself.
      
      Thus, not to duplicate that work and conducting parallel with Python
      world, let's add ability to configure Decoder and Encoder so that
      persistent references could be handled with user-specified application
      logic right in the process, and e.g. Decoder would return resulted
      object with references converted to corresponding Go types.
      
      To do so we introduce DecoderConfig and EncoderConfig - configurations
      to tune decode/encode process. To maintain backward compatibility
      NewDecoder and NewEncoder signatures are not adjusted and instead
      NewDecoderWithConfig and NewEncoderWithConfig are introduces. We should
      be sure we won't need e.g. NewDecoderWithConfigAndXXX in the future,
      since from now on we could be adding fields to configuration structs
      without breaking backward compatibility.
      
      For decoding there is DecoderConfig.PersistentLoad which mimics
      Unpickler.persistent_load in Python:
      
      https://docs.python.org/3/library/pickle.html#pickle.Unpickler.persistent_load
      
      For encoding there is EncoderConfig.PersistentRef which mimics
      Pickler.persistent_id in Python:
      
      https://docs.python.org/3/library/pickle.html#pickle.Pickler.persistent_id
      
      ( for Persistent{Load,Ref}, following suggestion from @kisielk, the
        choice was made to explicitly pass in/out Ref, instead of raw
        interface{} pid, because that makes the API cleaner. )
      
      Then both Decoder and Encoder are adjusted correspondingly with tests
      added.
      
      About tests: I was contemplating to patch-in support for decoder and
      encoder configurations, and handling errors, into our main test
      TestDecode, but for now decided not to go that way and to put the test
      as separate TestPersistentRefs.
      
      By the way, with having configurations in place, we could start to add
      other things there - e.g. the protocol version to use for Encoder.
      Currently we have several places where we either don't use opcodes from
      higher protocol versions (fearing to break compatibility), or instead
      always use higher-version opcodes without checking we should be able to
      do so by allowed protocol version.
      d51cc146
    • Kirill Smelkov's avatar
      .travis.yml += go1.9, go1.10 (#39) · 56e0ff77
      Kirill Smelkov authored
      "1.10" is put into quotes because otherwise Travis parses that as 1.1 -
      see e.g. https://github.com/travis-ci/travis-ci/issues/9247.
      56e0ff77
  9. 01 Jul, 2018 1 commit
    • Kirill Smelkov's avatar
      Add support for persistent references (#37) · 84598fe1
      Kirill Smelkov authored
      * Add docstrings to types exposed in public API
      
      I was refreshing my knowled of ogórek and added them along the way.
      
      * Add support for persistent references
      
      Python has the mechanism for objects in one pickle to reference objects
      in another pickle. This mechanism is primarily used in ZODB object
      database for one object to "persistently" reference another object(*).
      
      I needed a way to load and understand objects from a ZODB database in
      Go, and hence comes this patch for ogórek that teaches it to understand
      such persistent references.
      
      Like it is already with Call and Class we add a dedicated Ref type to
      represent persistent references in pickle stream.
      
      (*) in ZODB when one object is changed, there is no need to resave
      unchanged object that is referencing it because its reference stays
      valid and pointing to changed object: the "persistent ID" in ZODB is
      only object ID and which revision of the referenced object it points to
      is implicitly follows by current transaction number view of the database.
      84598fe1
  10. 25 Apr, 2017 1 commit
  11. 05 Apr, 2017 6 commits
    • Kamil Kisiel's avatar
      Fix typo in test variable name · ec792bc6
      Kamil Kisiel authored
      ec792bc6
    • Kamil Kisiel's avatar
      Merge pull request #35 from navytux/fix1 · d9aae4c0
      Kamil Kisiel authored
      decoder: Small speedups
      d9aae4c0
    • Kirill Smelkov's avatar
      decoder: Avoid allocation in readLine · 15d618fd
      Kirill Smelkov authored
      readLine is used in many text decoders, so it pays off if we care not to
      do allocation for result on every readLine call:
      
          name      old time/op    new time/op    delta
          Speed-4      378ns ± 0%     379ns ± 1%     ~     (p=0.397 n=5+5)
          Decode-4    70.1µs ± 0%    67.3µs ± 0%   -3.97%  (p=0.008 n=5+5)
          Encode-4    16.6µs ± 0%    16.6µs ± 0%     ~     (p=0.548 n=5+5)
      
          name      old alloc/op   new alloc/op   delta
          Speed-4       280B ± 0%      280B ± 0%     ~     (all equal)
          Decode-4    32.3kB ± 0%    31.0kB ± 0%   -3.87%  (p=0.008 n=5+5)
          Encode-4    6.54kB ± 0%    6.54kB ± 0%     ~     (all equal)
      
          name      old allocs/op  new allocs/op  delta
          Speed-4       8.00 ± 0%      8.00 ± 0%     ~     (all equal)
          Decode-4       428 ± 0%       326 ± 0%  -23.83%  (p=0.008 n=5+5)
          Encode-4       297 ± 0%       297 ± 0%     ~     (all equal)
      15d618fd
    • Kirill Smelkov's avatar
      decoder: Preallocate .buf capacity when we know (approximate) size of output to it · 14aaa14f
      Kirill Smelkov authored
      This reduces allocations a bit:
      
          name      old time/op    new time/op    delta
          Speed-4      383ns ± 4%     378ns ± 0%    ~     (p=0.302 n=5+5)
          Decode-4    72.1µs ± 1%    70.1µs ± 0%  -2.76%  (p=0.008 n=5+5)
          Encode-4    16.5µs ± 0%    16.6µs ± 0%    ~     (p=0.095 n=5+5)
      
          name      old alloc/op   new alloc/op   delta
          Speed-4       280B ± 0%      280B ± 0%    ~     (all equal)
          Decode-4    35.7kB ± 0%    32.3kB ± 0%  -9.72%  (p=0.008 n=5+5)
          Encode-4    6.54kB ± 0%    6.54kB ± 0%    ~     (all equal)
      
          name      old allocs/op  new allocs/op  delta
          Speed-4       8.00 ± 0%      8.00 ± 0%    ~     (all equal)
          Decode-4       429 ± 0%       428 ± 0%  -0.23%  (p=0.008 n=5+5)
          Encode-4       297 ± 0%       297 ± 0%    ~     (all equal)
      14aaa14f
    • Kirill Smelkov's avatar
      decode: Use Decoder by pointer always · cad218c8
      Kirill Smelkov authored
      Most of the functions in ogorek.go already use Decoder by pointer, but
      NewDecoder() and Decoder.Decode() exceptionally used it by value.
      
      This was probably an oversight because when Decoder struct is used by
      value it is copied on every Decode call and also it is not possible to
      change and retain state in between calls as all changed happen to copy
      and then forgotten.
      
      This also leads to many unnnecessary allocations, i.e. memory allocated
      for stack one Decode call is not reused on next Decode call etc.
      
      So use Decoder by pointer only. This leads to the following speedup:
      
          name      old time/op    new time/op    delta
          Speed-4      377ns ± 0%     383ns ± 4%     ~     (p=0.214 n=5+5)
          Decode-4    80.4µs ± 3%    72.1µs ± 1%  -10.38%  (p=0.008 n=5+5)
          Encode-4    16.5µs ± 1%    16.5µs ± 0%     ~     (p=0.690 n=5+5)
      
          name      old alloc/op   new alloc/op   delta
          Speed-4       280B ± 0%      280B ± 0%     ~     (all equal)
          Decode-4    44.0kB ± 0%    35.7kB ± 0%  -18.77%  (p=0.008 n=5+5)
          Encode-4    6.54kB ± 0%    6.54kB ± 0%     ~     (all equal)
      
          name      old allocs/op  new allocs/op  delta
          Speed-4       8.00 ± 0%      8.00 ± 0%     ~     (all equal)
          Decode-4       502 ± 0%       429 ± 0%  -14.54%  (p=0.008 n=5+5)
          Encode-4       297 ± 0%       297 ± 0%     ~     (all equal)
      cad218c8
    • Kirill Smelkov's avatar
      tests: Add benchmarks for Decode and Encode · 3979c069
      Kirill Smelkov authored
      - use existing tests vector as input data
      - for Decode we prepare one large pickle stream from testv and decode all pickle from there
      - for Encode we prepare one lirge slice with all objects from testv and encode it
      
      On an i7-6600U currently it looks like:
      
          BenchmarkSpeed-4         5000000               387 ns/op             280 B/op          8 allocs/op
          BenchmarkDecode-4          20000             81892 ns/op           43989 B/op        502 allocs/op
          BenchmarkEncode-4         100000             16711 ns/op            6536 B/op        297 allocs/op
      3979c069
  12. 09 Mar, 2017 2 commits
  13. 08 Mar, 2017 7 commits
    • Kamil Kisiel's avatar
      Merge pull request #33 from navytux/fix2 · 1cb847a8
      Kamil Kisiel authored
      Make decode(encode(v)) to be identity + tuple
      1cb847a8
    • Kirill Smelkov's avatar
      encoder/decoder: Teach ogórek about tuple · f98f54b1
      Kirill Smelkov authored
      It is true that tuples and lists are very similar, and when e.g. a
      python function accepts tuple as input, and then uses only indexes
      access, it can be substituted with list.
      
      However some functions do `isinstance(..., tuple)` etc, and e.g. for a
      go program which wants to produce pickle for such python programs there
      is currently no way to generate pickle with tuple opcodes.
      
      So to solve this teach ogórek about tuples:
      
      - introduce simple Tuple type which has []interface{} as underlying
      - when decoding tuple opcodes produce this Tuple instead of
        []interface{} used previously.
      - when encoding encode Tuple with tuple opcodes.
      
      Otherwise Tuple can be seen and behaves like a regular go []interface{}
      slice. In fact the only difference is that runtime type of Tuple is
      different from runtime type of []interface{} but otherwise both values
      are the same and can be casted to each other if/when needed freely.
      
      NOTE opReduce decoder is adjusted to always require tuple, not list,
      because with e.g. cPickle:
      
          "c__main__\nf\n]R." -> TypeError('argument list must be a tuple', ...)
          "c__main__\nf\n)R." -> ok
      
      ( the first one uses empty list - "]", and the second one empty tuple - ")" )
      
      So it is ok and compatible to always require args to be tuple for
      reduce.
      f98f54b1
    • Kirill Smelkov's avatar
      encoder: Adjust it so that decode(encode(v)) == v · 4fd6be93
      Kirill Smelkov authored
      That is so that decode·encode becomes identity.
      
      For the other way
      
      	encode(decode(pickle)) == pickle
      
      it is generally not possible to do, because pickle machine is almost
      general machine, and it is possible to e.g. have some prefix in program
      which is NOP or result of which is no longer used, use different opcodes
      to build list or tuple etc.
      
      Adjustments to encoder are:
      
      - teach it to encode big.Int back to opLong (not some struct)
      - teach it to encode Call back to opReduce (not some struct)
      
      Tests are adjusted to verify that `decode(encode(v)) == v` holds for
      all inputs in test vector.
      4fd6be93
    • Kirill Smelkov's avatar
      decoder: tests: Don't forget to print test input name on "no EOF" case · fd5b2c0c
      Kirill Smelkov authored
      Else it prints just "no EOF" and it is not clear for which input the
      problem is there.
      fd5b2c0c
    • Kirill Smelkov's avatar
      decoder: tests: When decode result is unexpected - report it with full details · 8b9c271e
      Kirill Smelkov authored
      that is with %#v instead of %q. Because e.g. %q for nil []byte will be
      printing still as "", not nil etc. When some decode test fails here we
      want to get full information, not some pretty one.
      8b9c271e
    • Kirill Smelkov's avatar
      decoder: Don't skip errors in ASCII long format · 6d5eaca2
      Kirill Smelkov authored
      1. the number has to be terminated with 'L' - check it
         (previously last byte was simply discarded)
      
      2. big.Int.SetString() ok code has to be checked too to catch invalid
         input.
      
      big.Int.SetString() usage in test adjusted to also not skip errors
      silently.
      6d5eaca2
    • Kirill Smelkov's avatar
      decoder: Fix crashes found by fuzzer (#32) · da5f0342
      Kirill Smelkov authored
      * decoder: Don't forget to check memo for key not there on read access
      
      If we do not do reading from memo will return memo's zero value (= nil),
      which is
      
      a) not correct - many memo keys could be read this way, and
      b) nil there on stack breaks invariants of stack containing only good
         values.
      
      Furthermore, it can even lead to crashes on e.g. calling
      reflect.TypeOf(stack.top()) in the next patch.
      
      Anyway getting something from memo must be checked for whether it there
      or not for correctness.
      
      Noticed while working on fix for #30.
      
      * decoder: Fix "panic: runtime error: hash of unhashable type ..."
      
      Go specification requires that only comparable types could be used as
      map keys:
      
          https://golang.org/ref/spec#Map_types
      
      For map[interface{}]... this is checked at runtime, and if concrete
      value used as a key is not comparable it results in runtime panic, e.g.:
      
          panic: runtime error: hash of unhashable type ogórek.Call
      
          goroutine 1 [running]:
          github.com/kisielk/og-rek.(*Decoder).loadDict(0xc420084360, 0x64, 0x0)
                  /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:655 +0x18c
          github.com/kisielk/og-rek.Decoder.Decode(0xc42001c3c0, 0x5a9300, 0x0, 0x0, 0xc4200164b0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
                  /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:187 +0x172b
          github.com/kisielk/og-rek.Fuzz(0x7ff901592000, 0xa, 0x200000, 0x3)
                  /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/fuzz.go:12 +0x108
          go-fuzz-dep.Main(0x50d830)
                  /tmp/go-fuzz-build561441550/goroot/src/go-fuzz-dep/main.go:49 +0xd9
          main.main()
                  /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/go.fuzz.main/main.go:10 +0x2d
          exit status 2
      
      so when decoding dict and friends - all places where maps are populated - we
      have to check whether an object on stack we are going to use as key is suitable.
      
      Issue #30 contains comparison of 2 ways to do such check - catch
      runtime panic in exception style manner or use
      reflect.TypeOf(v).Comparable(). Since reflect-way turns out to be faster
      
      https://github.com/kisielk/og-rek/issues/30#issuecomment-283609067
      
      and likely will become more faster in the future:
      
      https://github.com/golang/go/issues/19361
      
      it was decided to go the reflect way (which is also a canonical way in
      go land).
      
      So audit all places where map items are set and add appropriate checks
      before them.
      
      I've verified that if we remove any of the added checks, via so far found
      crash vectors, at least one crash case will reappear in tests. This
      means that all added checks are actually test covered.
      
      Updates: #30
      
      * decoder: Don't forget to check for odd #elements in loadDict & friends
      
      e.g. for Dict opcode the expected stack state is
      
      	MARK key1 obj1 key2 obj2 ... keyN objN DICT
      
      so if in between MARK and DICT there is odd number of elements this is
      an error in input data. We also used to crash on such cases, e.g.:
      
              "(\x88d"
      
              panic: runtime error: index out of range
      
              goroutine 1 [running]:
              github.com/kisielk/og-rek.(*Decoder).loadDict(0xc420082990, 0xc42000e464, 0x0)
                      /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/ogorek.go:652 +0x21d
              github.com/kisielk/og-rek.Decoder.Decode(0xc42001c7e0, 0x5a9320, 0x0, 0x0, 0xc4200167b0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
                      /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/ogorek.go:188 +0x172d
              github.com/kisielk/og-rek.Fuzz(0x7f6d1f310000, 0x3, 0x200000, 0x3)
                      /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/fuzz.go:12 +0x108
              go-fuzz-dep.Main(0x50d798)
                      /tmp/go-fuzz-build403415384/goroot/src/go-fuzz-dep/main.go:49 +0xd9
              main.main()
                      /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/go.fuzz.main/main.go:10 +0x2d
      
      I've audited whole decoder and regarding odd #(elements) there are only 2
      places to care: loadDict and loadSetItems and crashers for all of them were
      already found by fuzz testing.
      
      Fixes #30 (for all known cases so far).
      da5f0342
  14. 01 Mar, 2017 3 commits
  15. 28 Feb, 2017 4 commits
    • Kamil Kisiel's avatar
      Update README.md · e3ff867d
      Kamil Kisiel authored
      e3ff867d
    • Kamil Kisiel's avatar
      Expanded corpus after more fuzzing · 55591958
      Kamil Kisiel authored
      55591958
    • Kamil Kisiel's avatar
      Ignore fuzz build products. · ea193550
      Kamil Kisiel authored
      ea193550
    • Kamil Kisiel's avatar
      Fix Go stack overflows resulting from printing types generated by malformed data. · 572d0a6a
      Kamil Kisiel authored
      For example:
      
      runtime: goroutine stack exceeds 1000000000-byte limit
      fatal error: stack overflow
      
      runtime stack:
      runtime.throw(0x100a3d, 0xe)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/runtime/panic.go:566 +0x95
      runtime.newstack()
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/runtime/stack.go:1061 +0x416
      runtime.morestack()
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/runtime/asm_amd64.s:366 +0x7f
      
      goroutine 1 [running]:
      runtime.resolveNameOff(0xdeec0, 0x1916, 0x260e6ecb713)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/runtime/type.go:168 fp=0xc4400f02c0 sp=0xc4400f02b8
      reflect.resolveNameOff(0xdeec0, 0x1916, 0x0)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/runtime/runtime1.go:493 +0x33 fp=0xc4400f02e8 sp=0xc4400f02c0
      reflect.(*rtype).nameOff(0xdeec0, 0xc400001916, 0x0)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/reflect/type.go:663 +0x4b fp=0xc4400f0310 sp=0xc4400f02e8
      reflect.(*rtype).String(0xdeec0, 0xc42000e9e0, 0x98)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/reflect/type.go:733 +0x4e fp=0xc4400f0350 sp=0xc4400f0310
      fmt.(*pp).badVerb(0xc4200180c0, 0x74)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:324 +0x590 fp=0xc4400f03e8 sp=0xc4400f0350
      fmt.(*pp).fmtString(0xc4200180c0, 0x0, 0x0, 0x74)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:441 +0x119 fp=0xc4400f0420 sp=0xc4400f03e8
      fmt.(*pp).printValue(0xc4200180c0, 0xdeec0, 0xc42000e9e0, 0x98, 0x74, 0x108419)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:715 +0x294d fp=0xc4400f0610 sp=0xc4400f0420
      fmt.(*pp).printValue(0xc4200180c0, 0xea6e0, 0xc42000e9e0, 0x99, 0xc400000074, 0x108418)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:764 +0x2ec6 fp=0xc4400f0800 sp=0xc4400f0610
      fmt.(*pp).printValue(0xc4200180c0, 0xe5960, 0xc42000a5d0, 0x194, 0x74, 0x108417)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:777 +0x24a3 fp=0xc4400f09f0 sp=0xc4400f0800
      fmt.(*pp).printValue(0xc4200180c0, 0xdd140, 0xc42000ea00, 0x97, 0xc400000074, 0x108416)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:824 +0xf51 fp=0xc4400f0be0 sp=0xc4400f09f0
      fmt.(*pp).printValue(0xc4200180c0, 0xe5960, 0xc42000ea30, 0x194, 0x74, 0x108415)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:777 +0x24a3 fp=0xc4400f0dd0 sp=0xc4400f0be0
      fmt.(*pp).printValue(0xc4200180c0, 0xdd140, 0xc42000ea40, 0x97, 0xc400000074, 0x108414)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:824 +0xf51 fp=0xc4400f0fc0 sp=0xc4400f0dd0
      fmt.(*pp).printValue(0xc4200180c0, 0xe5960, 0xc4200105a0, 0x194, 0x74, 0x108413)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:777 +0x24a3 fp=0xc4400f11b0 sp=0xc4400f0fc0
      fmt.(*pp).printValue(0xc4200180c0, 0xdd140, 0xc42000ea80, 0x97, 0xc400000074, 0x108412)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:824 +0xf51 fp=0xc4400f13a0 sp=0xc4400f11b0
      fmt.(*pp).printValue(0xc4200180c0, 0xe5960, 0xc4200105b0, 0x194, 0x74, 0x108411)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:777 +0x24a3 fp=0xc4400f1590 sp=0xc4400f13a0
      fmt.(*pp).printValue(0xc4200180c0, 0xdd140, 0xc42000ea80, 0x97, 0xc400000074, 0x108410)
              /var/folders/dx/9jnx9d2j2kg8n35xgcsxjfyr0000gn/T/go-fuzz-build069870222/goroot/src/fmt/print.go:824 +0xf51 fp=0xc4400f1780 sp=0xc4400f1590
      
      ... etc
      572d0a6a