1. 01 Mar, 2023 38 commits
  2. 28 Feb, 2023 2 commits
    • Roque's avatar
      58070276
    • pandas: Fix unpickle np arrays with py2+pd>0.19.x · d223aede
      Levin Zimmermann authored
      Pandas 0.20.0 introduced a bug fix [1] that changed the behaviour of
      'DataFrame.to_records()': the dtype names of the resulting record array
      are now unicode if the data frame's column names are unicode. Before this
      bug fix, the dtype names were always str, regardless of whether the
      column names were str or unicode.
      
      Unfortunately, NumPy fails to unpickle arrays whose dtype names are
      unicode [2]. Since many of our data frame columns are unicode, loading
      arrays often fails. This is no longer a problem in Python 3, so until we
      have migrated we work around it with a simple monkey patch to pandas,
      which essentially reverts the mentioned bug fix.
      
      [1] https://github.com/pandas-dev/pandas/issues/11879
      [2] Small example to reproduce this error:
      
      ''
      import os
      
      import numpy as np
      import pandas as pd
      
      r = pd.DataFrame({u'A':[1,2,3]}).to_records()
      a = np.ndarray(shape=r.shape, dtype=r.dtype.fields)
      p = "t"
      
      try:
        os.remove(p)
      except OSError:
        # The file may not exist yet; nothing to clean up.
        pass
      
      with open(p, 'wb') as f:
        np.save(f, a)
      with open(p, 'rb') as f:
        np.load(f)
      ''
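
The monkey patch itself is not reproduced in this message. As a rough illustration only, a patch of this kind could wrap 'DataFrame.to_records()' and convert the dtype names back to the native str type. The helper names `_to_native_str` and `_to_records_with_str_names` below are hypothetical, and the actual change in d223aede may be implemented differently:

```python
import numpy as np
import pandas as pd


def _to_native_str(name):
    """Return `name` as the interpreter's native str type (hypothetical helper)."""
    if isinstance(name, str):
        # Python 3 str, or Python 2 byte str: already the native type.
        return name
    # Python 2 unicode object: encode it to a byte str.
    return name.encode("utf-8")


# Keep a reference to the unpatched method so the wrapper can delegate to it.
_original_to_records = pd.DataFrame.to_records


def _to_records_with_str_names(self, *args, **kwargs):
    """Call the original to_records(), then force native-str dtype names."""
    records = _original_to_records(self, *args, **kwargs)
    # Assumption: renaming the fields in place is sufficient, since
    # to_records() builds a fresh dtype for every call.
    records.dtype.names = tuple(_to_native_str(n) for n in records.dtype.names)
    return records


# Apply the monkey patch (an illustrative sketch, not the committed code).
pd.DataFrame.to_records = _to_records_with_str_names
```

With such a patch applied, the reproduction script above would see str dtype names on Python 2. On Python 3 the wrapper leaves the names unchanged, because they are already str.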
      
      /reviewed-on !1738
      /reviewed-by @jerome @klaus