    zodbdump: Fix pickle disassembly if state part of zpickle refers to class part · fbb2a3d9
    Kirill Smelkov authored
    I tried to run `zodb dump --pretty=zpickledis` on the wendelin.core test
    data in WCFS(*) and hit the following failure:
    
        (z-dev) kirr@deca:~/src/wendelin/wendelin.core/wcfs/internal/zdata/testdata$ zodb dump --pretty=zpickledis zblk.fs
        ...
        obj 0000000000000005 685 sha1:865171b709f575b355afd2cc9e1f32b9781c6510
        Traceback (most recent call last):
          File "/home/kirr/src/wendelin/venv/z-dev/bin/zodb", line 11, in <module>
            load_entry_point('zodbtools', 'console_scripts', 'zodb')()
          File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main
            return command_module.main(argv)
          File "<decorator-gen-3>", line 2, in main
          File "/home/kirr/src/wendelin/venv/z-dev/lib/python2.7/site-packages/golang/__init__.py", line 103, in _
            return f(*argv, **kw)
          File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbdump.py", line 341, in main
            zodbdump(stor, tidmin, tidmax, hashonly, pretty)
          File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbdump.py", line 167, in zodbdump
            pickletools.dis(dataf, disf) # state
          File "/usr/lib/python2.7/pickletools.py", line 2005, in dis
            raise ValueError(errormsg)
        ValueError: memo key 1 has never been stored into
    
    The problem turned out to be that the state part of the zpickle refers
    to another object of the same class as the one already saved in the
    class part of the zpickle. That class was therefore referred to via a
    GET matching the corresponding PUT done in the class part, but our
    zpickledis handler did not share the memo between those two parts, and
    so the GET became unmatched.
    
    In more detail, the problem is illustrated by the following zpickle,
    which corresponds to Object.value referring to the same Object. The
    first part of the zpickle is the class part: it refers to the
    __main__.Object global and puts it into memo[1]. The second part is the
    state part: it refers to that object via `(7, Object) PERSID`, where
    Object is retrieved from memo[1] via GET:
    
        obj 0000000000000007 41 sha1:7108c96ccb9cbeaab1164d533174c300e51309f9
              0: \x80 PROTO      2
              2: c    GLOBAL     '__main__ Object'
             19: q    BINPUT     1                   <-- NOTE
             21: .    STOP
          highest protocol among opcodes = 2
             22: \x80 PROTO      2
             24: U    SHORT_BINSTRING '\x00\x00\x00\x00\x00\x00\x00\x07'
             34: q    BINPUT     2
             36: h    BINGET     1                   <-- NOTE
             38: \x86 TUPLE2
             39: Q    BINPERSID
             40: .    STOP
          highest protocol among opcodes = 2
    
    To handle such zpickles correctly, we need to share the memo between
    the class and state disassemblies, similarly to how ZODB does in its
    ObjectWriter._dump:
    
    https://github.com/zopefoundation/ZODB/blob/5.8.1-0-g72cebe6bc/src/ZODB/serialize.py#L436-L443
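
For illustration only, here is a minimal Python 3 sketch of how one pickler
with a persistent memo produces such a pair of pickles. It is not ZODB's
actual code: `dump_class_and_state`, `load_class_and_state` and `opnames` are
hypothetical helpers, `fractions.Fraction` is just a stand-in for a persistent
class, and the state is a plain dict rather than a real persistent reference.

```python
import fractions
import io
import pickle
import pickletools


def dump_class_and_state(cls, state, protocol=2):
    """Write cls and state as two back-to-back pickles from ONE Pickler."""
    f = io.BytesIO()
    p = pickle.Pickler(f, protocol)
    p.dump(cls)        # class part: cls is stored into the pickler memo
    split = f.tell()   # offset where the class part ends
    p.dump(state)      # state part: memo is kept, so cls is fetched from it
    data = f.getvalue()
    return data[:split], data[split:]


def load_class_and_state(class_part, state_part):
    """Read both parts through ONE Unpickler, so its memo is shared as well."""
    u = pickle.Unpickler(io.BytesIO(class_part + state_part))
    return u.load(), u.load()


def opnames(data):
    """Names of the opcodes in a single pickle (no memo checking is done)."""
    return [op.name for op, arg, pos in pickletools.genops(data)]


cls = fractions.Fraction  # stand-in class, pickled by global reference
class_part, state_part = dump_class_and_state(cls, {"value": cls})
```

With this setup the class part is expected to contain a BINPUT for the class
and the state part a BINGET that only makes sense together with the class
part, which is the same shape as the zpickle shown above.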
    
    pickletools.dis has explicit support for using a shared memo. It was
    originally added in https://github.com/python/cpython/commit/62235e701e37
    and was likely motivated by the ZODB use case.
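
A minimal sketch of the idea, assuming Python 3 (the traceback above is from
Python 2.7) and not taken from the actual zodbdump.py change: the two byte
strings below are reconstructed from the disassembly of object 7 shown above,
and `disassemble` is a hypothetical helper. Passing the same dict as `memo` to
both `pickletools.dis` calls lets the BINGET in the state part find the BINPUT
done in the class part.

```python
import io
import pickletools

# Bytes reconstructed from the disassembly of obj 0000000000000007 above.
class_part = b"\x80\x02c__main__\nObject\nq\x01."
state_part = b"\x80\x02U\x08\x00\x00\x00\x00\x00\x00\x00\x07q\x02h\x01\x86Q."
zpickle = class_part + state_part


def disassemble(data, share_memo):
    """Disassemble a class+state zpickle, optionally sharing the memo."""
    dataf = io.BytesIO(data)
    out = io.StringIO()
    # memo=None makes pickletools.dis start from an empty memo on every call.
    memo = {} if share_memo else None
    pickletools.dis(dataf, out, memo)  # class part: stores memo[1]
    pickletools.dis(dataf, out, memo)  # state part: reads memo[1]
    return out.getvalue()


print(disassemble(zpickle, share_memo=True))

try:
    disassemble(zpickle, share_memo=False)
except ValueError as e:
    print("without shared memo:", e)
```

Because the disassembler reads from one file object, the second call continues
right after the first STOP, so the reported offsets stay relative to the start
of the whole zpickle.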
    
    (*) https://lab.nexedi.com/nexedi/wendelin.core/-/blob/07087ec8/wcfs/internal/zdata/testdata/zblk.fs
        generated by nexedi/wendelin.core@2c152d41
    
    /reviewed-by @jerome
    /reviewed-on nexedi/zodbtools!28