ogorek_test.go · b429839d98e309ef2e9190eff0986688bedf354a · Kirill Smelkov / og-rek

tests: Show pickles in a way that can be copy-pasted into Python · b429839d

Kirill Smelkov authored Sep 21, 2018

When an encoding test fails, the "want" and "have" pickles are printed. It
is handy to copy-paste those pickles into a Python console and check them
further there.

Pickle printing currently uses %q. However, in Go, fmt's %q can emit \u and
\U escapes when a byte sequence forms a valid UTF-8 character. That poses a
problem: in a Python str (py2) or bytes (py3) literal, \uXXXX is not
processed as a unicode escape and enters the string as is. This results in
different pickle data being pasted into Python, and in further confusion.
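
The effect can be reproduced with a minimal sketch. It uses the %+q verb
(ASCII-only output) so that the \u escapes appear deterministically; that
choice, and the helper name quoteASCII, are assumptions made for illustration
and are not taken from the test code itself.

```go
package main

import "fmt"

// quoteASCII is a hypothetical helper: it quotes data with fmt's %+q verb,
// which escapes every valid non-ASCII rune as \uXXXX or \UXXXXXXXX and
// every invalid byte as \xNN.
func quoteASCII(data string) string {
	return fmt.Sprintf("%+q", data)
}

func main() {
	// 0x80 alone is not valid UTF-8, so it stays a \x escape, while the
	// valid Cyrillic letters that follow become \u escapes.
	fmt.Println(quoteASCII("\x80мир"))
}
```

Pasted into a Python str (py2) or bytes (py3) literal, the \u sequences in
that output stay as literal backslash-u text instead of the original bytes.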

Entering data into Python as unicode literals (where \u works) and then
adding .encode('utf-8') also does not work in general: pickle data is
arbitrary binary and may not be valid UTF-8, for example:

	"\x80\u043c\u0438\u0440"	(= "\x80мир"   = "\x80\xd0\xbc\xd0\xb8\xd1\x80")

and unicode-encoding it in Python also gives different data:

	In [1]: u"\x80\u043c\u0438\u0440".encode('utf-8')
	Out[1]: '\xc2\x80\xd0\xbc\xd0\xb8\xd1\x80'

(note leading extra \xc2)
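
The extra \xc2 appears because \x80 inside a unicode literal means code point
U+0080, not the byte 0x80, and UTF-8 encodes that code point as two bytes.
The same distinction can be sketched in Go; encodeCodePoint is a hypothetical
name used only for this illustration.

```go
package main

import "fmt"

// encodeCodePoint returns the UTF-8 encoding of code point r as a string.
func encodeCodePoint(r rune) string {
	return string(r)
}

func main() {
	raw := "\x80"                  // the single byte 0x80
	enc := encodeCodePoint(0x0080) // code point U+0080 encoded as UTF-8

	fmt.Printf("% x\n", raw)
	fmt.Printf("% x\n", enc)
}
```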

For this reason let's implement quoting that Python can understand
ourselves. This dumping functionality was very handy while debugging the
recent encoder fixes.
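
A minimal sketch of such quoting follows. It is not the code from
ogorek_test.go: the name pyQuote and the exact set of escapes are assumptions.
The idea is to work byte by byte and never emit \u or \U, so that every
non-ASCII byte becomes a \xNN escape.

```go
package main

import (
	"fmt"
	"strings"
)

// pyQuote is a hypothetical helper that quotes data byte by byte, using only
// escapes that mean the same thing in Python byte-string literals.
func pyQuote(data string) string {
	var b strings.Builder
	b.WriteByte('"')
	for i := 0; i < len(data); i++ {
		c := data[i]
		switch {
		case c == '\\' || c == '"':
			// Backslash-escape the quote and the backslash itself.
			b.WriteByte('\\')
			b.WriteByte(c)
		case c == '\n':
			b.WriteString(`\n`)
		case c == '\r':
			b.WriteString(`\r`)
		case c == '\t':
			b.WriteString(`\t`)
		case c >= 0x20 && c < 0x7f:
			// Printable ASCII is emitted as is.
			b.WriteByte(c)
		default:
			// Everything else, including each byte of a multi-byte
			// UTF-8 sequence, is emitted as \xNN.
			fmt.Fprintf(&b, `\x%02x`, c)
		}
	}
	b.WriteByte('"')
	return b.String()
}

func main() {
	fmt.Println(pyQuote("\x80мир"))
}
```

The result can be pasted into Python 2 directly, or into Python 3 with a b
prefix in front of the opening quote.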


ogorek_test.go 31.9 KB
