    bigarray: ArrayRef support for BigArray · 450ad804
    Kirill Smelkov authored
    Rationale
    ---------
    
    An array reference is useful when arrays need to be passed between
    processes: instead of copying array data, one can leverage the fact that
    the top-level array, for example ZBigArray, is already persisted
    separately, and send only a small amount of information referencing the
    data in question.
    
    Implementation
    --------------
    
    BigArray is not a regular NumPy array and so needs explicit support in
    the ArrayRef code to find the root object and indices. This patch adds
    such support in the following way:
    
    - when BigArray.__getitem__ creates a VMA, it remembers in the VMA
      the top-level BigArray object under which this VMA was created.
    
    - when ArrayRef is finding the root, it can detect such a VMA, because the
      topmost regular ndarray's .base points to it, and in turn it gets the
      top-level BigArray object from the VMA.
    
    - after that, all index computations are performed on the ndarrays root
      and a, exactly as in the case of purely regular ndarrays. At the end,
      .lo and .hi are adjusted by the offset at which root is located inside
      the whole BigArray.
    
    - there is no need to adjust .deref() at all.
    
    To remember information in a VMA, and also to be able to get (read-only)
    its mapping addresses, the _bigfile.c extension has to be extended a
    bit. Since we are now storing an arbitrary Python object attached to
    PyVMA, it can create reference cycles, and so PyVMA is adjusted
    accordingly to support the cyclic garbage collector.
    
    Please see the patch itself for more details and comments.