    bigfile/py: Teach loadblk() to automatically break reference cycles to pybuf · 9aa6a5d7
    Kirill Smelkov authored
    Because otherwise we bug on pybuf->ob_refcnt != 1.
    
    Such cycles can arise if, inside the loadblk implementation, an exception
    is raised and then caught, even in a deeply nested function that never
    receives pybuf as an argument, or in some other way. For example:
    
    After
    
    	_, _, exc_traceback = sys.exc_info()
    
    there is a reference loop created:
    
    	exc_traceback
    	  |        ^
    	  |        |
    	  v     .f_localsplus
    	 frame
    
    Since the exc_traceback object holds a reference to the deepest frame,
    which via f_back keeps alive all frames up to the one with the pybuf
    argument, an additional reference to pybuf is held until the above
    cycle is garbage collected.
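
    The effect can be reproduced in pure Python. The sketch below is only an
    illustration: Buf, deep_helper, loadblk and demo are hypothetical
    stand-ins, not the code in _bigfile.c, and the behaviour shown is that of
    CPython. A weak reference is used to observe whether the buffer is still
    alive after the caller drops its own reference.

```python
import gc
import sys
import weakref


class Buf:
    """Hypothetical stand-in for the pybuf object handed to loadblk."""


def deep_helper():
    # This function never receives buf, yet it ends up pinning it.
    try:
        raise RuntimeError("internal error")
    except RuntimeError:
        # The local variable refers to the traceback, and the traceback
        # refers back to this very frame: a reference cycle.
        _, _, exc_traceback = sys.exc_info()


def loadblk(buf):
    # buf lives in this frame's locals; deep_helper's frame reaches this
    # frame through f_back.
    deep_helper()


def demo():
    was_enabled = gc.isenabled()
    gc.disable()                  # keep automatic GC from hiding the effect
    try:
        buf = Buf()
        ref = weakref.ref(buf)
        loadblk(buf)
        del buf                   # drop the caller's own reference
        alive_before_gc = ref() is not None
        gc.collect()              # breaks the traceback <-> frame cycle
        alive_after_gc = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before_gc, alive_after_gc
```

    Under these assumptions demo() reports that the buffer survives the
    caller's del and disappears only after the explicit collection.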
    
    So to solve the problem, while leaving loadblk, if pybuf->ob_refcnt != 1,
    let's first do garbage collection and only then recheck the remaining
    references. After GC, reference loops created by exceptions should go
    away.
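
    The "collect first, then recheck" flow can be sketched in Python as
    follows. This is a minimal model under stated assumptions, not the C
    implementation: call_loadblk and the three example implementations are
    hypothetical names, and the refcount is compared against a baseline taken
    before user code runs rather than against the literal value 1.

```python
import gc
import sys

_stash = []  # hypothetical place where a buggy implementation keeps buf


def clean_loadblk(buf):
    buf[0] = 1


def noisy_loadblk(buf):
    # Catches an exception internally and leaves a traceback <-> frame cycle.
    def inner():
        try:
            raise ValueError("handled internally")
        except ValueError:
            tb = sys.exc_info()[2]
    inner()


def leaky_loadblk(buf):
    _stash.append(buf)  # a real leak: GC cannot remove this reference


def call_loadblk(impl):
    """Run impl(buf) and report (gc_needed, clean)."""
    buf = bytearray(16)
    baseline = sys.getrefcount(buf)
    impl(buf)
    # Only pay for a full collection when the count looks wrong.
    gc_needed = sys.getrefcount(buf) != baseline
    if gc_needed:
        gc.collect()
    # Recheck: anything still left is a genuine extra reference.
    clean = sys.getrefcount(buf) == baseline
    return gc_needed, clean
```

    With automatic GC out of the way, clean_loadblk needs no collection,
    noisy_loadblk is repaired by it, and leaky_loadblk is still reported as
    holding an extra reference.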
    
    NOTE PyGC_Collect() (C way to call gc.collect()) always performs
        GC - it is not affected by gc.disable() which disables only
        _automatic_ garbage collection.
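
    At the Python level the same distinction can be checked directly. The
    sketch below uses a hypothetical Node class and only demonstrates
    gc.collect(); newer CPython documentation describes the C-level
    PyGC_Collect() as collecting only when the collector is enabled, so the
    C-side behaviour should be verified against the targeted Python version.

```python
import gc
import weakref


class Node:
    """Hypothetical object used only to build a reference cycle."""


def collect_while_disabled():
    was_enabled = gc.isenabled()
    gc.disable()                     # turns off automatic collection only
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a      # a <-> b reference cycle
        ref = weakref.ref(a)
        del a, b                     # the cycle is now unreachable garbage
        alive_before = ref() is not None
        gc.collect()                 # explicit collection still runs
        collected = ref() is None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, collected
```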
    
    NOTE it turned out that the storeblk logic to unpin pybuf (see
        6da5172e "bigfile/py: Teach storeblk() how to correctly propagate
        traceback on error") is flawed: when e.g. a memoryview is created
        from pybuf, the internal pointer is copied, and clearing the original
        buf afterwards does not clear the copy.
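
    A loose pure-Python analogue of this pitfall, offered as an assumption
    rather than a description of the C code: memoryview.release() stands in
    for "clearing" the original buffer, since the internal pointer itself
    cannot be reset from Python. The function name is hypothetical.

```python
def copy_survives_release():
    backing = bytearray(b"abcd")
    original = memoryview(backing)
    copy = memoryview(original)      # carries its own view of the memory

    original.release()               # "clear" the original view

    try:
        original[0]
        original_usable = True
    except ValueError:
        # A released memoryview refuses further access.
        original_usable = False

    first_byte = copy[0]             # the copy still reads the memory
    copy.release()
    return original_usable, first_byte
```

    Here the original view becomes unusable while the copy keeps reading the
    first byte (97, i.e. ord("a")), which is the kind of surviving access the
    unpin logic failed to prevent.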
    
    NOTE it is ok to do gc.collect() from under sighandler - we have
        already been doing so for a long time by running non-trivial python
        code, which for sure triggers automatic GC from time to time (see also
        786d418d "bigfile: Simple test that we can handle GC from-under
        sighandler" for reference)
    
    Fixes: #7
_bigfile.c 27.4 KB