    X golang_str: Fix iter(bstr) to yield byte instead of unicode character · cb0e6055
    Kirill Smelkov authored
    Things were initially implemented to follow Go semantics exactly, with
    bytestring iteration yielding unicode characters as explained in
    https://blog.golang.org/strings. However, this makes bstr not a 100%
    drop-in replacement for std str under py2, and even though my
    initial testing suggested that this change does not affect programs in
    practice, that turned out not to be the case.
    
    For example, with bstr.__iter__ yielding unicode characters, running
    gpython on py2 will sometimes break when importing uuid.
    
    There, uuid reads 16 bytes from /dev/random, iterates over them
    expecting to get single bytes, and then expects the resulting
    sequence to have length exactly 16:
    
         int = long(('%02x'*16) % tuple(map(ord, bytes)), 16)
    
         ( https://github.com/python/cpython/blob/2.7-0-g8d21aa21f2c/Lib/uuid.py#L147 )
    
    which breaks if some of the read bytes are higher than 0x7f.
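
    The failure mode can be reproduced in isolation. The sketch below is a
    rough Python 3 emulation and not pygolang code: `to_int`, `by_byte` and
    `by_char` are hypothetical names, and character-wise iteration is
    approximated with a plain UTF-8 decode of valid input.

```python
# Rough Python 3 emulation of the problem; none of this is pygolang code.
FMT = '%02x' * 16          # same format string as in the uuid.py line above


def to_int(items):
    """Format 16 one-item chunks as hex and parse the result, as uuid does."""
    return int(FMT % tuple(map(ord, items)), 16)


# 16 bytes whose first two bytes, 0xC3 0xA9, encode the single character U+00E9.
data = b'\xc3\xa9' + bytes(14)

# Byte-wise iteration (what py2 str gives): always 16 items.
by_byte = [data[i:i + 1] for i in range(len(data))]

# Character-wise iteration (Go-style): the two leading bytes become one item.
by_char = list(data.decode('utf-8'))

value = to_int(by_byte)        # works: 16 items for 16 placeholders

try:
    to_int(by_char)            # only 15 items for 16 placeholders
    failed = False
except TypeError:
    failed = True
```

    With byte-wise iteration the format string always receives 16 arguments;
    with character-wise iteration any multi-byte sequence shortens the tuple
    and the formatting step raises.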
    
    Even though this particular problem could be worked around by
    patching uuid, nothing guarantees that similar problems will not
    show up elsewhere later, and there could be many of them.
    
    -> So adjust bstr semantics instead to follow the semantics of str under
       py2, and introduce the uiter() primitive to still be able to iterate
       bytestrings as unicode characters.
    
    This hopefully makes bstr fully compatible with str on py2 while
    still providing a reasonably good approach to processing strings the
    Go way when needed.
    
    Add biter as well for symmetry.
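
    As an illustration of the two iteration modes, here is a minimal
    Python 3 sketch. `biter_sketch` and `uiter_sketch` are hypothetical
    stand-ins written for this example, not the actual biter/uiter
    implementation; the sketch assumes valid UTF-8 input and does not model
    how invalid bytes are handled.

```python
def biter_sketch(data: bytes):
    """Yield each byte of data as a length-1 bytes object."""
    for i in range(len(data)):
        yield data[i:i + 1]


def uiter_sketch(data: bytes):
    """Yield each UTF-8 encoded character of data as a str (valid UTF-8 only)."""
    yield from data.decode('utf-8')


data = 'h\u00e9llo'.encode('utf-8')    # 5 characters, 6 bytes

bytes_seen = list(biter_sketch(data))  # byte-wise view
chars_seen = list(uiter_sketch(data))  # character-wise view
```

    The byte-wise view always has as many items as the data has bytes, which
    is what py2 code written against str expects; the character-wise view
    remains available for Go-style processing.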