1. 28 Feb, 2020 3 commits
    • Kirill Smelkov's avatar
      strconv: Fix b & friends on macos/windows · 0561926a
      Kirill Smelkov authored
      On macos and windows, Python2 is built with --enable-unicode=ucs2, which
      makes it to use UTF-16 encoding for unicode characters, and so for
      characters higher than U+10000 it uses surrogate encoding with _2_
      unicode points, for example:
      
              >>> import sys
              >>> sys.maxunicode
              65535                       <-- NOTE indicates UCS2 build
              >>> s = u'\U00012345'
              >>> s
              u'\U00012345'
              >>> s.encode('utf-8')
              '\xf0\x92\x8d\x85'
              >>> len(s)
              2                           <-- NOTE _not_ 1
              >>> s[0]
              u'\ud808'
              >>> s[1]
              u'\udf45'
      
      This leads to e.g. b tests failing for
      
          # tbytes                        tunicode
          (b"\xf0\x90\x8c\xbc",           u'\U0001033c'),     # Valid 4 Octet Sequence '𐌼'
      
          >           assert b(tunicode) == tbytes
          E           AssertionError: assert '\xed\xa0\x80\xed\xbc\xbc' == '\xf0\x90\x8c\xbc'
          E             - \xed\xa0\x80\xed\xbc\xbc
          E             + \xf0\x90\x8c\xbc
      
      because on UCS2 python build u'\U0001033c' is represented as 2 unicode
      points:
      
          >>> s = u'\U0001033c'
          >>> len(s)
          2
          >>> s[0]
          u'\ud800'
          >>> s[1]
          u'\udf3c'
          >>> s[0].encode('utf-8')
          '\xed\xa0\x80'
          >>> s[1].encode('utf-8')
          '\xed\xbc\xbc'
      
      -> Fix it by detecting UCS2 build and working around by manually
      combining such surrogate unicode pairs appropriately.
      
      A reference on the subject:
      
      https://matthew-brett.github.io/pydagogue/python_unicode.html#utf-16-ucs2-builds-of-python-and-32-bit-unicode-code-points
      0561926a
    • Kirill Smelkov's avatar
      strconv: Switch _utf8_decode_rune to return rune ordinal instead of unicode character · 5cc679ac
      Kirill Smelkov authored
      This is a preparatory step for the next patch where we'll be fixing
      strconv for Python2 builds with --enable-unicode=ucs2, where a unicode
      character can be taking _2_ unicode points.
      
      In that general case relying on unicode objects to represent runes is
      not good, because many things generally do not work for U+10000 and
      above, e.g. ord breaks:
      
          >>> import sys
          >>> sys.maxunicode
          65535                       <-- NOTE indicates UCS2 build
          >>> s = u'\U00012345'
          >>> s
          u'\U00012345'
          >>> s.encode('utf-8')
          '\xf0\x92\x8d\x85'
          >>> len(s)
          2                           <-- NOTE _not_ 1
          >>> ord(s)
          Traceback (most recent call last):
            File "<stdin>", line 1, in <module>
          TypeError: ord() expected a character, but string of length 2 found
      
      so we switch to represent runes as integer, similarly to what Go does.
      5cc679ac
    • Kirill Smelkov's avatar
      pyx.build: Fix hang under gpython · cd67996e
      Kirill Smelkov authored
      Commit 8af78fc5 (pyx.build: v↑ setuptools_dso (1.2 -> 1.4)) upgraded
      setuptools_dso to 1.4, but since from
      
      https://github.com/mdavidsaver/setuptools_dso/commit/3f3ff746
      
      setuptools_dso started to use multiprocessing, pyx.build, when running
      under gpython, started to hang, which is a known gevent problem - see
      e.g. here: https://github.com/gevent/gevent/issues/993. The problem was
      manifesting itself as pyx.build unit test hanging under Python3.
      
      Fix it by installing gevent multiprocessing plugin which is
      automatically used/activated by gevent.monkey.patch_all().
      
      geventmp says it is pre-alpha, but by using it we can unhang pyx.build
      tests, which is better state than before. The other future possibility
      would be to use https://github.com/jgehrcke/gipc wrapped into
      multiprocessing compatible API.
      cd67996e
  2. 27 Feb, 2020 3 commits
  3. 20 Feb, 2020 1 commit
  4. 17 Feb, 2020 1 commit
    • Kirill Smelkov's avatar
      sync += RWMutex · 1ad3c2d5
      Kirill Smelkov authored
      Provide sync.RWMutex that can be useful for cases when there are
      multiple simultaneous readers and more seldom writer(s).
      
      This implements readers-writer mutex with preference for writers
      similarly to Go version.
      1ad3c2d5
  5. 12 Feb, 2020 1 commit
  6. 11 Feb, 2020 3 commits
    • Kirill Smelkov's avatar
      errors: Take .__cause__ into account · 03f88c0b
      Kirill Smelkov authored
      A Python error can have links to other errors by means of both .Unwrap()
      and .__cause__ . These ways are both explicit and so should be treated
      by e.g. errors.Is as present in error's error chain.
      
      It is a bit unclear, at least initially, how to linearise and order
      error chain traversal in divergence points - for exception objects where
      both .Unwrap() and .__cause__ are !None. However more closer look
      suggests linearisation rule to traverse into .__cause__ after going
      through .Unwrap() part - please see details in documentation added into
      _error.pyx
      
      -> Teach errors.Is to do this traversal, and this way now e.g. exception
      raised as
      
      	raise X from Y
      
      will be treated by errors.Is as being both X and Y, even if any of X or Y
      also has its own error chain via .Unwrap().
      
      Top-level documentation is TODO.
      03f88c0b
    • Kirill Smelkov's avatar
      golang, errors, fmt: Error chaining (Python) · 337de0d7
      Kirill Smelkov authored
      Following errors model in Go and fd95c88a (golang, errors, fmt: Error
      chaining (C++/Pyx)) let's add support at Python-level for errors to wrap
      each other and to be inspected/unwrapped:
      
      - an error can additionally provide way to unwrap itself, if it
        provides .Unwrap() method. .__cause__ is not taken into account yet,
        but will be in a follow-up patch;
      - errors.Is(err) tests whether an item in error's chain matches target;
      - `fmt.Errorf("... : %w", ... err)` is similar to `"... : %s" % (..., err)`
        but resulting error, when unwrapped, will return err.
      - errors.Unwrap is not exposed as chaining through both .Unwrap() and
        .__cause__ will need more than just "current element" as unwrapping
        state (i.e. errors.Unwrap API is insufficient - see next patch), and
        in practice users of errors.Unwrap() are very seldom.
      
      Support for error chaining through .__cause__ will follow in the next
      patch.
      
      Top-level documentation is TODO.
      
      See https://blog.golang.org/go1.13-errors for error chaining overview.
      337de0d7
    • Kirill Smelkov's avatar
      golang: Teach pyerror to be a base class · 78d0c76f
      Kirill Smelkov authored
      It is surprising to have an exception class that cannot be derived from.
      
      Besides, in the future we'll use subclassing from golang.error as an
      indicator that an error is a "well-defined" (in simple words - does not
      need traceback to be interpreted).
      78d0c76f
  7. 10 Feb, 2020 1 commit
    • Kirill Smelkov's avatar
      golang: Expose error at Py level · 17798442
      Kirill Smelkov authored
      The first step to expose errors and error chaining to Python:
      
      - Add pyerror that wraps a pyx/nogil C-level error and is exposed as golang.error at py level.
      - py errors must be compared by ==, not by "is"
      - Add (py) errors.New to create a new error from text.
      - a C-level error that has .Unwrap, is exposed with .Unwrap at py level,
        but full py-level chaining will be implemented in a follow-up patch.
      - py error does not support inheritance yet.
      
      Top-level documentation is TODO.
      17798442
  8. 06 Feb, 2020 1 commit
    • Kirill Smelkov's avatar
      golang, errors, fmt: Error chaining (C++/Pyx) · fd95c88a
      Kirill Smelkov authored
      Following errors model in Go, let's add support for errors to wrap other
      errors and to be inspected/unwrapped:
      
      - an error can additionally provide way to unwrap itself, if it
        implements errorWrapper interface;
      - errors.Unwrap(err) tries to extract wrapped error;
      - errors.Is(err) tests whether an item in error's chain matches target;
      - `fmt.errorf("... : %w", ... err)` is similar to `fmt.errorf("... : %s", ... err.c_str())`
        but resulting error, when unwrapped, will return err.
      
      Add C++ implementation for the above + tests.
      Python analogs will follow in the next patches.
      
      Top-level documentation is TODO.
      
      See https://blog.golang.org/go1.13-errors for error chaining overview.
      fd95c88a
  9. 04 Feb, 2020 16 commits
    • Kirill Smelkov's avatar
      cxx: Correct dict interface · 58fcdd87
      Kirill Smelkov authored
      Package cxx was added in 9785f2d3 (cxx: New package), but the interface
      that cxx:dict provided turned out to be not optimal:
      
          dict.get  was returning (v, ok), and
          dict.pop  ----//---
      
      Correct dict.get and dict.pop to return just value, and, similarly to
      channels API, provide additional dict.get_ and dict.pop_ - extended
      versions that also return ok:
      
          dict.get(k)  -> v
          dict.pop(k)  -> v
          dict.get_(k) -> (v, ok)
          dict.pop_(k) -> (v, ok)
      
      This time add tests.
      58fcdd87
    • Kirill Smelkov's avatar
      golang: fmt.pxd -> _fmt.pxd + fmt.pxd import redirector · dbd051f1
      Kirill Smelkov authored
      Follow the scheme established and used for all other packages, because
      we will soon have fmt pyx part which, if named as fmt.pyx, will
      intersect and conflict with fmt.py .
      dbd051f1
    • Kirill Smelkov's avatar
      errors: Test for New (C++) · 288e16a7
      Kirill Smelkov authored
      errors.New was added in a245ab56 (errors: New package) without test.
      288e16a7
    • Kirill Smelkov's avatar
    • Kirill Smelkov's avatar
    • Kirill Smelkov's avatar
    • Kirill Smelkov's avatar
      golang: testing: Provide file and line for a failing ASSERT_EQ · ff2ed5fe
      Kirill Smelkov authored
      Makes understanding which test is it and where when one fails.
      ff2ed5fe
    • Kirill Smelkov's avatar
      libgolang: tests: Factor out common testing functionality into shared place · 46a6f424
      Kirill Smelkov authored
      Currently libgolang_test.cpp contains tests for code in libgolang.cpp and
      for code that lives in other libgolang packages - sync, fmt, etc. It is
      becoming tight and we are going to split libgolang_test.cpp and move
      package tests to their corresponing files - e.g. to sync_test.cpp and
      the like.
      
      Move common assertion utilities into shared header before that as a
      preparatory step.
      46a6f424
    • Kirill Smelkov's avatar
      golang: qq: Don't depend on six · e028cf28
      Kirill Smelkov authored
      Just use builtins and cimported things that we have at pyx level.
      e028cf28
    • Kirill Smelkov's avatar
      golang: qq: Use u for UTF-8 decoding · 3073ac98
      Kirill Smelkov authored
      U is preffered way to make sure an object is unicode string.
      3073ac98
    • Kirill Smelkov's avatar
      gcompat: Move qq into golang · 8c459a99
      Kirill Smelkov authored
      This will allow to integrate qq with u in the next patch.
      
      Moving to compiled code for string processing functions is also
      generally better for performance.
      8c459a99
    • Kirill Smelkov's avatar
      golang: Provide b, u for strings · bcb95cd5
      Kirill Smelkov authored
      With Python3 I've got tired to constantly use .encode() and .decode();
      getting exception if original argument was unicode on e.g. b.decode();
      getting exception on raw bytes that are invalid UTF-8, not being able to
      use bytes literal with non-ASCII characters, etc.
      
      So instead of this pain provide two functions that make sure an object
      is either bytes or unicode:
      
      - b converts str/unicode/bytes s to UTF-8 encoded bytestring.
      
      	Bytes input is preserved as-is:
      
      	   b(bytes_input) == bytes_input
      
      	Unicode input is UTF-8 encoded. The encoding always succeeds.
      	b is reverse operation to u - the following invariant is always true:
      
      	   b(u(bytes_input)) == bytes_input
      
      - u converts str/unicode/bytes s to unicode string.
      
      	Unicode input is preserved as-is:
      
      	   u(unicode_input) == unicode_input
      
      	Bytes input is UTF-8 decoded. The decoding always succeeds and input
      	information is not lost: non-valid UTF-8 bytes are decoded into
      	surrogate codes ranging from U+DC80 to U+DCFF.
      	u is reverse operation to b - the following invariant is always true:
      
      	   u(b(unicode_input)) == unicode_input
      
      NOTE: encoding _and_ decoding *never* fail nor loose information. This
      is achieved by using 'surrogateescape' error handler on Python3, and
      providing manual fallback that behaves the same way on Python2.
      
      The naming is chosen with the idea so that b(something) resembles
      b"something", and u(something) resembles u"something".
      
      This, even being only a part of strings solution discussed in [1],
      should help handle byte- and unicode- strings in more robust and
      distraction free way.
      
      Top-level documentation is TODO.
      
      [1] nexedi/zodbtools!13
      bcb95cd5
    • Kirill Smelkov's avatar
      libgolang: Provide Nil as alias for std::nullptr_t · 230c81c4
      Kirill Smelkov authored
      This continues 60f6db6f (libgolang: Provide nil as alias for nullptr and
      NULL): I've tried to compile pygolang with Clang on my Debian 10
      workstation and got:
      
          $ CC=clang CXX=clang++ python setup.py build_dso -i
      
          In file included from ./golang/fmt.h:32:
          ./golang/libgolang.h:381:11: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
          constexpr nullptr_t nil = nullptr;
                    ^~~~~~~~~
                    std::nullptr_t
          /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8/bits/c++config.h:242:29: note: 'std::nullptr_t' declared here
            typedef decltype(nullptr)     nullptr_t;
                                          ^
          :
          In file included from ./golang/context.h
          In file included from golang/runtime/libgolang.cpp:30:
          ./golang/libgolang.h:381:11: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
          constexpr nullptr_t nil = nullptr;
                    ^~~~~~~~~
                    std::nullptr_t
          /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8/bits/c++config.h:242:29: note: 'std::nullptr_t' declared here
            typedef decltype(nullptr)     nullptr_t;
                                          ^
          :39:
          ./golang/libgolang.h:381:11: error: unknown type In file included from golang/fmt.cpp:25:
          In file included from ./golang/fmt.h:32:
          ./golang/libgolang.h:421:17: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
              inline chan(nullptr_t) { _ch = nil; }
                          ^~~~~~~~~
                          std::nullptr_t
      
          ...
      
      It seems with GCC and Clang under macOS nullptr_t is automatically provided in
      builtin namespace, while with older Clang on Linux (clang version 7.0.1-8) only
      in std:: namespace - rightfully as nullptr_t is described to be present there:
      
      https://en.cppreference.com/w/cpp/types/nullptr_t
      
      This way we either have to correct all occurrences of nullptr_t to
      std::nullptr_t, or do something similar with providing nil under golang:: .
      
      To reduce noise I prefer the later and let it be named as Nil.
      230c81c4
    • Kirill Smelkov's avatar
      golang: tests: pypanicWhenBlocked: Fix thinko in __exit__ · b5a2f9dc
      Kirill Smelkov authored
      The code was assigning nil to local, _not_ global _tblockforever. As a
      result _tblockforever was left set with a test hook even after leaving
      test context. Fix it.
      
      The bug was there starting from 3b241983 (Port/move channels to
      C/C++/Pyx).
      
      Had to change `= nil` to `= NULL` because with nil Cython complains as
      
              def __exit__(pypanicWhenBlocked t, typ, val, tb):
                  global _tblockforever
                  _tblockforever = nil
                                  ^
          ------------------------------------------------------------
      
          golang/_golang_test.pyx:86:25: Cannot assign type 'nullptr_t' to 'void (*)(void) nogil'
      
      This is https://github.com/cython/cython/issues/3314.
      b5a2f9dc
    • Kirill Smelkov's avatar
      time: Kill unused nullptr_t cimport · 9b63ec01
      Kirill Smelkov authored
      It's a leftover originating from b073f6df (time: Move/Port timers to
      C++/Pyx nogil).
      9b63ec01
    • Kirill Smelkov's avatar
      context: Fix thinko/typo in PyContext.__dealloc__ · 371d50b5
      Kirill Smelkov authored
      Instead of `pyctx.ctx = nil` it was just `ctx = nil` - i.e. assign nil
      to local variable instead of changing pyctx instance data. We were not
      observing this bug because Cython, for C++ fields of cdef classes,
      automatically emits in-place destructor calls in generated __dealloc__
      
          https://github.com/cython/cython/blob/0.29.14-11-g8c620c388/Cython/Compiler/ModuleNode.py#L1477-L1478
      
      and so this way there was no leak. However we want to be explicit and the
      code was not correct. Fix it.
      
      The bug was there from 2a359791 (context: Move/Port context package to
      C++/Pyx nogil).
      371d50b5
  10. 17 Jan, 2020 3 commits
    • Kirill Smelkov's avatar
      *: NULL -> nil (Pyx and only where we can) · 01ade7ac
      Kirill Smelkov authored
      Convert Pyx part of the project to use nil instead of NULL.
      
      Not every usage of NULL was converted and some places were left to use
      NULL where changing it to nil currently hits Cython compilation error:
      
      https://github.com/cython/cython/issues/3314
      01ade7ac
    • Kirill Smelkov's avatar
      *: NULL/nullptr -> nil (C++ only) · fc1c3e24
      Kirill Smelkov authored
      Convert C++ part of the project to use nil instead of NULL/nullptr.
      
      We do not convert pyx part yet, because Cython currently does not
      understand that nullptr_t has properties of NULL and with e.g. the
      following change
      
          --- a/golang/_context.pyx
          +++ b/golang/_context.pyx
          @@ -116,7 +116,7 @@ cdef cppclass _PyValue (_interface, gobject) nogil:
               __dealloc__():
                   with gil:
                       obj = <object>this.pyobj
          -            this.pyobj = NULL
          +            this.pyobj = nil
                       Py_DECREF(obj)
      
      errors as
      
          Error compiling Cython file:
          ------------------------------------------------------------
          ...
                  if __decref():
                      del self
              __dealloc__():
                  with gil:
                      obj = <object>this.pyobj
                      this.pyobj = nil
                                  ^
          ------------------------------------------------------------
      
          golang/_context.pyx:119:25: Cannot assign type 'nullptr_t' to 'PyObject *'
      
      https://github.com/cython/cython/issues/3314
      fc1c3e24
    • Kirill Smelkov's avatar
      libgolang: Provide nil as alias for nullptr and NULL · 60f6db6f
      Kirill Smelkov authored
      Nil is more native to Go.
      60f6db6f
  11. 13 Jan, 2020 1 commit
  12. 08 Jan, 2020 1 commit
  13. 06 Dec, 2019 1 commit
  14. 27 Nov, 2019 4 commits
    • Kirill Smelkov's avatar
      pygolang v0.0.5 · c5c3071b
      Kirill Smelkov authored
      This release is driven by wendelin.core v2 needs with one of the changes
      being that now most of the library was moved into nogil code and can be
      used fully from inside nogil world(*). Python modules are now just wrappers
      of their nogil counterparts. The way for Python and nogil worlds to
      communicate is also provided.
      
      The move to nogil required many other enhancements along the way. Please
      see CHANGELOG for overview.
      
      The move to nogil brought some speedup automatically.
      Below are benchmark results of this release compared to pygolang v0.0.4
      (1573d101) for python-level benchmarks (we have only those at present):
      
      	(on i7@2.6GHz)
      
      thread runtime:
      
          name             old time/op  new time/op  delta
          go               18.3µs ± 0%  18.3µs ± 1%     ~     (p=1.000 n=10+10)
          chan             2.91µs ± 3%  2.99µs ± 5%   +2.73%  (p=0.022 n=10+10)
          select           3.57µs ± 3%  3.57µs ± 4%     ~     (p=0.720 n=9+10)
          def              55.0ns ± 0%  54.0ns ± 0%   -1.82%  (p=0.002 n=8+10)
          func_def         43.8µs ± 2%  44.1µs ± 1%   +0.64%  (p=0.035 n=10+9)
          call             64.0ns ± 0%  66.3ns ± 1%   +3.59%  (p=0.000 n=10+10)
          func_call        1.05µs ± 1%  1.24µs ± 0%  +17.80%  (p=0.000 n=10+7)
          try_finally       138ns ± 0%   137ns ± 1%   -0.51%  (p=0.003 n=10+10)
          defer            2.32µs ± 1%  2.63µs ± 1%  +13.52%  (p=0.000 n=10+10)
          workgroup_empty  38.0µs ± 1%  24.1µs ± 1%  -36.43%  (p=0.000 n=10+10)
          workgroup_raise  47.7µs ± 1%  28.2µs ± 0%  -40.76%  (p=0.000 n=10+10)
      
      gevent runtime:
      
          name             old time/op  new time/op  delta
          go               16.9µs ± 1%  17.2µs ± 2%   +1.94%  (p=0.000 n=10+10)
          chan             7.43µs ± 0%  7.82µs ± 0%   +5.34%  (p=0.000 n=10+7)
          select           10.5µs ± 0%  11.2µs ± 0%   +6.74%  (p=0.000 n=10+10)
          def              63.0ns ± 0%  57.6ns ± 1%   -8.57%  (p=0.000 n=9+10)
          func_def         44.0µs ± 1%  44.2µs ± 1%     ~     (p=0.063 n=10+10)
          call             67.0ns ± 0%  64.0ns ± 0%   -4.48%  (p=0.002 n=8+10)
          func_call        1.06µs ± 1%  1.23µs ± 1%  +16.50%  (p=0.000 n=10+10)
          try_finally       144ns ± 0%   136ns ± 0%   -5.90%  (p=0.000 n=10+10)
          defer            2.37µs ± 1%  2.61µs ± 1%  +10.07%  (p=0.000 n=10+10)
          workgroup_empty  57.0µs ± 0%  55.0µs ± 2%   -3.53%  (p=0.000 n=10+9)
          workgroup_raise  72.4µs ± 0%  69.6µs ± 6%   -3.95%  (p=0.035 n=9+10)
      
      workgroup_* changes for thread runtime is the speedup I am talking about.
      defer/func_call slowdown is due to added exception chaining. We did not
      optimize Python-level defer yet, and if/when that would be needed, it should
      be possible to optimize by moving pydefer implementation into Cython.
      
      (*) go and channels were moved into nogil world in Pygolang v0.0.3 +
      v0.0.4 . Now it is the rest of the library that was moved with packages
      like context, time, sync etc.
      
      wendelin.core v2 needs nogil to run pinner thread on client side to
      support isolation property in cooperation with wcfs: since there is a
      `client -> wcfs -> pinner` loop:
      
            - - - - - -
           |           |
              pinner <------.
           |           |   wcfs
              client -------^
           |           |
            - - - - - -
           client process
      
      the pinner thread would deadlock if it tries to take the GIL because
      client thread can be holding GIL already while accessing wcfs-mmaped
      memory (think doing e.g. `x = A[i]` in Python).
      c5c3071b
    • Kirill Smelkov's avatar
      readme: Turn package references into links · 37ae291f
      Kirill Smelkov authored
      For base functionality we have overview in the readme itself, but for
      packages we have only their listing with brief overview and no
      documentation for in-package functionality.
      
      Let's have at least links to .h/.pxd/.py where package functionality is
      documented.
      37ae291f
    • Kirill Smelkov's avatar
      libgolang: Provide top-level overview for interfaces · 45c8cddd
      Kirill Smelkov authored
      See commit 5a99b769 (libgolang: Start providing interfaces) for context.
      45c8cddd
    • Kirill Smelkov's avatar
      libgolang: Provide top-level overview for automatic memory management · 7f0672aa
      Kirill Smelkov authored
      Provide top-level documentation for memory management facilities that
      was marked as TODO in refptr & co matches. See e.g. the following
      commits for context:
      
      - e82b4fab (libgolang: Objects refcounting (initial draft))
      - b2253abf (libgolang: Rename refobj -> object)
      - fd2a6fab (libgolang: Fix globals atexit race condition of ~refptr vs
        access from another thread)
      7f0672aa