encode.go · 2fe0e8769202f80bbbf10cd0a944ab67df2092be · Kirill Smelkov / og-rek

Kirill Smelkov authored Sep 27, 2018
In Python, bytes is an immutable array of bytes. It is also hashable,
and so differs from Go's []byte in that it can be used as a dict key.
Thus the closest approximation for Python bytes in Go is a type derived
from Go's string: it is distinct from string, yet inherits string's
immutability and its ability to be used as a map key. So:

- add ogórek.Bytes type to represent Python bytes
- add support to decode BINBYTES* pickle opcodes (these are protocol 3 opcodes)
- add support to encode ogórek.Bytes via those BINBYTES* opcodes
- for protocols <= 2, where there are no opcodes to directly represent
  bytes, adopt the same approach as Python and pickle bytes as

	_codecs.encode(byt.decode('latin1'), 'latin1')

  this way unpickling it on Python3 will give bytes, while unpickling it
  on Python2 will give str:

	In [1]: sys.version
	Out[1]: '3.6.6 (default, Jun 27 2018, 14:44:17) \n[GCC 8.1.0]'

	In [2]: byt = b'\x01\x02\x03'

	In [3]: _codecs.encode(byt.decode('latin1'), 'latin1')
	Out[3]: b'\x01\x02\x03'

  ---

	In [1]: sys.version
	Out[1]: '2.7.15+ (default, Aug 31 2018, 11:56:52) \n[GCC 8.2.0]'

	In [2]: byt = b'\x01\x02\x03'

	In [3]: _codecs.encode(byt.decode('latin1'), 'latin1')
	Out[3]: '\x01\x02\x03'

- correspondingly, teach the decoder to recognize such calls to
  _codecs.encode as a representation of bytes and to decode them
  accordingly.

- since byt.decode('latin1') now has to be emitted as UNICODE, add a
  so far internal `type unicode(string)` that instructs the ogórek
  encoder to always emit the string with UNICODE opcodes (a regular
  string is encoded as a unicode pickle object only for protocol >= 3).

- for []byte encoding, preserve the current behaviour: even though
  dispatching in Encoder.encode changes, the end result is the same,
  and []byte is still encoded as just a regular string.

  This was added in 555efd8f "first draft of dumb pickle encoder", and
  even though that might not be a good choice, changing it is a topic
  for another patch.
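
The two ideas above (a string-derived bytes type that works as a map key,
and the latin1 text form used for protocols <= 2) can be illustrated with
a minimal Go sketch. It is not the code from encode.go: the Bytes type
here is a local stand-in for ogórek.Bytes, and latin1Decode and
latin1Encode are hypothetical helper names chosen for illustration.

```go
package main

import "fmt"

// Bytes is a local stand-in for ogórek.Bytes (assumption): a distinct type
// whose underlying type is string, so it is immutable and usable as a map key.
type Bytes string

// latin1Decode mirrors Python's byt.decode('latin1'): every byte value
// 0x00..0xff becomes the Unicode code point with the same number.
// Hypothetical helper, not part of ogórek's API.
func latin1Decode(b Bytes) string {
	runes := make([]rune, len(b))
	for i := 0; i < len(b); i++ {
		runes[i] = rune(b[i])
	}
	return string(runes)
}

// latin1Encode mirrors Python's s.encode('latin1'): it inverts latin1Decode
// and reports an error for code points above 0xff.
// Hypothetical helper, not part of ogórek's API.
func latin1Encode(s string) (Bytes, error) {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		if r > 0xff {
			return "", fmt.Errorf("latin1: code point %U out of range", r)
		}
		out = append(out, byte(r))
	}
	return Bytes(out), nil
}

func main() {
	// Bytes can be used as a map key, which []byte cannot.
	seen := map[Bytes]int{Bytes("\x01\x02\x03"): 1}
	seen[Bytes("\x01\x02\x03")]++
	fmt.Println(seen[Bytes("\x01\x02\x03")])

	// Round trip through the text form that is pickled for protocols <= 2.
	text := latin1Decode(Bytes("\xff\x01"))
	back, err := latin1Encode(text)
	fmt.Println(len(text), back == Bytes("\xff\x01"), err)
}
```

The round trip is lossless because latin1 maps each of the 256 byte values
to exactly one code point, which is why the decoder can turn the
_codecs.encode call back into the original bytes.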