1. 20 Jul, 2015 5 commits
  2. 19 Jul, 2015 7 commits
  3. 18 Jul, 2015 4 commits
    • Change attributes from strings to BoxedStrings · c5b1d41e
      Kevin Modzelewski authored
      The real benefit is that we intern the strings that
      end up getting used as attribute names, so we can compare
      them using pointer comparisons.  It should also reduce
      the size overhead of hidden classes, since we no longer
      have to copy the string data into the hidden class.
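
As a rough illustration of why interning helps, the sketch below uses Python's `sys.intern` as a stand-in for the interning done on BoxedStrings. The `HiddenClass` toy and its `offset_of` method are hypothetical names for illustration, not Pyston's actual classes. Once every attribute name is interned, a lookup can compare by identity (`is`) instead of by contents, and the table keeps a reference to the shared string rather than its own copy.

```python
import sys


class HiddenClass:
    """Toy attribute table that stores interned names (hypothetical)."""

    def __init__(self, names):
        # Keep a reference to the interned string rather than a private copy.
        self.names = [sys.intern(n) for n in names]

    def offset_of(self, name):
        # The caller must pass an interned string, so identity is enough.
        for i, candidate in enumerate(self.names):
            if candidate is name:
                return i
        return -1


hc = HiddenClass(["x", "y", "payload_size"])

# Build the name at runtime, then intern it before the lookup.
runtime_name = sys.intern("_".join(["payload", "size"]))
offset = hc.offset_of(runtime_name)
```

The lookup only works if both sides agree to intern, which is presumably why the change touches attribute handling as a whole rather than a single call site.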
    • Add internStringImmortal helper function · db4bc778
      Kevin Modzelewski authored
      And internStringMortal, which for now just resolves to
      internStringImmortal, but lets us mark strings that
      could eventually be collected.  (We could use the same
      approach that CPython uses and have a string destructor
      that removes mortal strings from the intern table.)
    • Our list format is the same as CPython's · f9326e16
      Kevin Modzelewski authored
      well, except that two fields were swapped, and there is an
      extra struct wrapper in there.  But with some small changes
      we can now let capi code use the list macros for faster list
      manipulation.
    • Remove list locks · 81b321dc
      Kevin Modzelewski authored
      They're not used any more, and even though they are
      empty NopLocks, they change the struct layout.
  4. 17 Jul, 2015 21 commits
    • Experimental: speed up calling of capi code · caa5000a
      Kevin Modzelewski authored
      The main capi calling convention is to box all the positional
      arguments into a tuple, and then pass the tuple to PyArg_ParseTuple
      along with a format string that describes how to parse out the
      arguments.
      
      This ends up being pretty wasteful and misses all of the fast
      argument-rearrangement that we are able to JIT out.  These unicode
      functions are particularly egregious, since they use a helper
      function that ends up having to dynamically generate the format
      string to include the function name.
      
      This commit is a very simple change that gets some of the common cases:
      in addition to the existing METH_O calling convention ('self' plus
      one positional arg), add the METH_O2 and METH_O3 calling
      conventions.  Plus add METH_D1/D2/D3 as additional flags that can
      be or'd into the calling convention flags, which specify that there
      should be some number of default arguments.
      
      This is pretty limited:
      - only handles up to 3 arguments / defaults
      - only handles "O" type specifiers (ie no unboxing of ints)
      - only allows NULL as the default value
      - doesn't give as much diagnostic info on error
      
      The first two could be handled by passing the format string as part
      of the function metadata instead of using it in the function body,
      though this would mean having to add the ability to understand the
      format strings.
      
      The last two issues are tricky from an API perspective since they
      would require a larger change to pass through variable-length data
      structures.
      
      So anyway, punt on those issues for now, and just use the simple
      flag approach.  This cuts the function call overhead by about 4x
      for the functions that it's applied to, which are some common ones:
      string.count, unicode.count, unicode.startswith.
      (endswith, [r]find, and [r]index should all get updated as well)
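
The flag scheme can be modeled in a few lines of Python. This is only a toy model under stated assumptions: the flag values, the `call_flagged` dispatcher, and the `NULL` sentinel are all made up for illustration, and the reading that `METH_O3 | METH_D2` means "three positional slots, the last two optional" is inferred from the description above. The point it shows is that the callee receives its arguments directly, padded with NULL, instead of unpacking a tuple with a format string.

```python
from enum import IntFlag


class Flags(IntFlag):
    # Illustrative values only; the real constants are not given in the commit.
    METH_O = 1
    METH_O2 = 2
    METH_O3 = 4
    METH_D1 = 8
    METH_D2 = 16
    METH_D3 = 32


NULL = object()  # stands in for a C NULL pointer

_ARG_COUNT = {Flags.METH_O: 1, Flags.METH_O2: 2, Flags.METH_O3: 3}
_DEFAULT_COUNT = {Flags.METH_D1: 1, Flags.METH_D2: 2, Flags.METH_D3: 3}


def call_flagged(func, flags, self, *args):
    """Check arity from the flags, pad missing defaults with NULL, and call."""
    max_args = next(n for flag, n in _ARG_COUNT.items() if flags & flag)
    defaults = next((n for flag, n in _DEFAULT_COUNT.items() if flags & flag), 0)
    min_args = max_args - defaults
    if not min_args <= len(args) <= max_args:
        raise TypeError(
            "expected %d to %d arguments, got %d" % (min_args, max_args, len(args))
        )
    padded = args + (NULL,) * (max_args - len(args))
    return func(self, *padded)


def str_count(self, sub, start, end):
    # The callee sees plain positional objects; NULL means "use the default".
    lo = 0 if start is NULL else start
    hi = len(self) if end is NULL else end
    return self.count(sub, lo, hi)


flags = Flags.METH_O3 | Flags.METH_D2
count_all = call_flagged(str_count, flags, "banana", "an")
count_from_2 = call_flagged(str_count, flags, "banana", "an", 2)
```

Note how the error message can only report counts, which matches the "less diagnostic info" limitation listed above.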
    • stat unicode allocations · d2ffecbe
      Kevin Modzelewski authored
    • Add a django_parsing microbenchmark · 79b09dc0
      Kevin Modzelewski authored
      ie django_template minus the lexing.  We are faster now on the lexing,
      but the parsing is where most of the time gets spent.
      
      Also, change this benchmark and django_lexing to have a unicode template.
      Usually django does that conversion automatically, but the templates bypass
      where that happens, and we end up doing a lot of extra unicode decoding.
    • Merge pull request #666 from rudi-c/gcfinalizers3 · e648044c
      Chris Toshok authored
      Preparation for finalizer code, refactoring some simple_destructor logic.
    • Use weakref's original tp_dealloc. · fb2515af
      Rudi Chen authored
    • Merge pull request #714 from kmod/perf4 · 65712e0e
      Kevin Modzelewski authored
      Convert "a in (b, c)" to "a == b or a == c"
    • Merge pull request #713 from kmod/perf3 · ef2d7ba1
      Kevin Modzelewski authored
      optimize regex handling
    • Merge pull request #712 from kmod/perf2 · baad8901
      Kevin Modzelewski authored
      some fixes and cleanups
    • Merge pull request #711 from kmod/fix_707 · 6e2c06a8
      Kevin Modzelewski authored
      Another 2.7.9 compatibility fix
    • f57db823
    • Convert "a in (b, c)" to "a == b or a == c" · f3e03b35
      Kevin Modzelewski authored
      Do this by adding "contains" to our codegen type system, and
      implement a special contains on the unboxedtuple type.
      
      This makes this operation quite a lot faster, but it looks like
      that is largely because we don't implement a couple of optimizations
      that we should:
      - we create a new tuple object every time we hit that line
      - our generic contains code goes through compare(), which returns
        a box (since "<" and friends can return non-bools), but contains
        will always return a bool, so we have a bunch of extra boxing/unboxing
      
      We probably should separate out the contains logic from the rest of the
      comparisons, since it works quite differently and doesn't
      gain anything by being there.
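
At the source level, the rewrite amounts to the following. The function names are hypothetical, and the sketch only compares the two forms in Python; it does not show the codegen change itself. One caveat worth keeping in mind: CPython's tuple containment checks identity before equality, so the two forms differ for objects that are not equal to themselves, such as NaN. The commit message does not say how that case is handled.

```python
def contains_generic(a, b, c):
    # Conceptually builds a tuple, then runs the generic containment protocol.
    return a in (b, c)


def contains_expanded(a, b, c):
    # The rewritten form: no tuple, just two comparisons.
    return bool(a == b or a == c)


cases = [(1, 1, 2), (2, 1, 2), (3, 1, 2), ("a", "b", "a"), (1.0, 1, 2)]
agree = all(contains_generic(*c) == contains_expanded(*c) for c in cases)

nan = float("nan")
nan_generic = contains_generic(nan, nan, 0)    # identity shortcut applies
nan_expanded = contains_expanded(nan, nan, 0)  # nan == nan is False
```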
    • cmake dependency tracking workaround · 5385cf7b
      Kevin Modzelewski authored
    • Optimize PySequence_GetSlice · 5bd967f2
      Kevin Modzelewski authored
      - copy CPython's implementation (that uses C slots)
      - implement the C slots for str and list
      - avoid doing a division for non-step slices
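
The last point can be sketched as a slice-length computation with a fast path for step 1. The `slice_length` helper is a hypothetical name and assumes `start` and `stop` have already been clamped to the sequence bounds; it is not the actual C code.

```python
def slice_length(start, stop, step=1):
    """Number of items in a slice whose bounds are already clamped."""
    if step == 1:
        # Fast path for plain a[i:j] slices: a subtraction, no division.
        return max(0, stop - start)
    if step > 0:
        return max(0, (stop - start + step - 1) // step)
    return max(0, (start - stop - step - 1) // (-step))


# Cross-check against range(), which follows the same arithmetic.
mismatches = [
    (start, stop, step)
    for start in range(0, 8)
    for stop in range(0, 8)
    for step in (1, 2, 3, -1, -2, -3)
    if slice_length(start, stop, step) != len(range(start, stop, step))
]
```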
    • Remove ObjLookupCache.objptr · 36581431
      Kevin Modzelewski authored
      It was unused
    • Reduce unnecessary string memsets · ba389a2e
      Kevin Modzelewski authored
      Particularly for string slicing, where we would
      always memset the string data to zero, and then
      immediately memcpy it.
    • Make listAppendInternal inlineable · b44f8a5b
      Kevin Modzelewski authored
      - put it into a header file (and start including it)
      - move the grow-the-array part into a separate function
        to encourage the fast-path to get inlined.
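
The shape of that split, shown with a toy list in Python. `TinyList` and its growth policy are assumptions for illustration; Python itself does no inlining, so this only shows the structure: a short `append` that handles the common case, and a separate `_grow` that holds the rarely executed reallocation.

```python
class TinyList:
    """Toy growable array with the slow path split out of append()."""

    def __init__(self):
        self._items = []
        self._capacity = 0
        self._size = 0

    def _grow(self):
        # Slow path: kept out of append() so the fast path stays small.
        new_capacity = max(4, self._capacity * 2)
        self._items.extend([None] * (new_capacity - self._capacity))
        self._capacity = new_capacity

    def append(self, value):
        # Fast path: one comparison, one store, one increment.
        if self._size == self._capacity:
            self._grow()
        self._items[self._size] = value
        self._size += 1

    def to_list(self):
        return self._items[: self._size]


tl = TinyList()
for i in range(10):
    tl.append(i)
```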
    • Optimization to CPython's regex library · d95b70fc
      Kevin Modzelewski authored
      This division is expensive; the divisor is always sizeof(char) or sizeof(Py_UNICODE),
      and it seems to be faster to do a branch and then possibly a shift.
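
A sketch of the idea, assuming the divisor is 1, 2, or 4 (the possible values of `sizeof(char)` and `sizeof(Py_UNICODE)`). The function names are hypothetical, and in Python there is no speed benefit; the code only shows that the branch-and-shift form gives the same result as the division for non-negative byte counts.

```python
def units_by_division(nbytes, charsize):
    return nbytes // charsize


def units_by_shift(nbytes, charsize):
    if charsize == 1:
        return nbytes  # common case: no arithmetic at all
    shift = 1 if charsize == 2 else 2  # assumes charsize is 2 or 4
    return nbytes >> shift


same_results = all(
    units_by_division(n, size) == units_by_shift(n, size)
    for n in range(0, 64)
    for size in (1, 2, 4)
)
```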
    • fix: need to check tp_getattro as well · 7c4c9095
      Kevin Modzelewski authored
    • int() continues to be tricky · 1505cc69
      Kevin Modzelewski authored
      int(str) and int(float) don't always return ints (they can return longs, doh).
      If we call int() on a subclass of int, we should call its __int__ method in
      case the subclass overrode it.
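
The subclass case can be shown with a small example. It is written for Python 3, which has a single `int` type, so it covers only the `__int__` override and not the int/long distinction mentioned above; the class name `Loud` is made up.

```python
class Loud(int):
    """An int subclass that overrides __int__ (hypothetical example)."""

    def __int__(self):
        return 42


value = Loud(5)
converted = int(value)  # goes through Loud.__int__, not the stored value
```

Here `value` still compares equal to 5, while `int(value)` yields whatever the override returns, which is why `int()` cannot simply return its argument when given an int subclass.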
    • add cpython regex tests · 069a7014
      Kevin Modzelewski authored
    • Makefile: make it easier to run custom cpythons · 7ad32c22
      Kevin Modzelewski authored
      And some other small cleanups
  5. 16 Jul, 2015 3 commits
    • Add a simplified version of django_template · 0130b77e
      Kevin Modzelewski authored
      This only does the lexing portion of the process.
      
      Further cut that down into a re.split ubench
    • Store simple_destructor in tp_dealloc. · 22b388e9
      Rudi Chen authored
      Replace the function pointer to the simple_destructor with a boolean
      indicating that the tp_dealloc function is safe to call wherever the
      simple_destructor used to be called.
      
      A few additional classes are also specified to have a safe_tp_dealloc.
      
      For exceptions, use a hack where we look for the creation of exception
      classes and store them in a list so we can set their destructor at the
      same time as other classes.
    • Merge pull request #662 from rudi-c/gcfinalizers2 · d9840980
      Chris Toshok authored
      Some refactors in GC code + class-freed-before-instance bug fix.