  1. 12 Jul, 2015 2 commits
    • Get rid of "interpreted CompiledFunctions" · 41c0273e
      Kevin Modzelewski authored
      This was vestigial from the old llvm interpreter, where it made a bit more
      sense.  Now it's just weird, and caused the tiering logic to be spread into
      weird places.  It was also the source of some bugs, since when we would deopt
      there's not really any relevant CompiledFunction that represents the deopt
      frame (this is why type speculation had to be temporarily disabled).
      
      So instead, make it so that the interpreter (and by extension, the baseline
      jit) work on the CLFunction directly, since the CompiledFunction didn't hold
      any relevant information anyway.  This required a whole bunch of refactoring and
      API changes.
      
      This commit doesn't actually change any of the tiering decisions (I think), but
      should just make them clearer; the actual behavioral changes will be done in
      upcoming commits.
      41c0273e
    • Get rid of void-returning functions · fc238ed8
      Kevin Modzelewski authored
      Module-level code used to return void, but this causes some
      special-casing, so switch to having them return None.
      Also, CPython has their module code objects return None, so
      it's a small compatibility gain as well.
      fc238ed8
  2. 02 Jul, 2015 1 commit
    • Add new JIT tier between the interpreter and the LLVM JIT tiers. · fc366564
      Marius Wachtler authored
      This JIT is tightly coupled to the ASTInterpreter; at every CFGBlock* entry/exit
      one can switch between interpreting and directly executing the generated code without having
      to do any translations.
      Generating the code is pretty fast compared to the LLVM tier, but the generated code is not as fast
      as code generated by the higher LLVM tiers.
      But because the JITed code can use runtime ICs, avoids a lot of interpretation overhead and
      stores CFGBlock-local symbols inside register/stack slots, it's much faster than the interpreter.
      fc366564
  3. 10 Jun, 2015 1 commit
    • Change the way we store and pass string data · 95ad7ffc
      Kevin Modzelewski authored
      Convert to BoxedString much sooner, and have any functions that
      might need to box a string take a BoxedString.  This means that
      on some paths, we will need to box when previously we didn't, but
      for callsites that we control we can just intern the string and
      not have to box again.  The much more common case is that we
      passed in unboxed string data, but then ran into a branch
      that required boxing.
      
      BoxedString shouldn't be that much more costly than std::string,
      and this should cut down on string allocations.
      
      For django-template.py, the number of strings allocated drops from
      800k to 525k; for virtualenv_test.py, it goes from 1.25M to 1.0M.
      
      A couple things made this not 100% mechanical:
      - taking advantage of places that we could eliminate unbox/rebox pairs
      - different null-termination assumptions between StringRef and the c api.
      95ad7ffc
  4. 29 May, 2015 1 commit
    • Python-level sampling profiler · 7596e61a
      Kevin Modzelewski authored
      Uses setitimer() to set a recurring signal, and prints
      a Python stacktrace at the next safepoint aka allowGLReadPreemption.
      This is not great since allowGLReadPreemption can happen a decent
      amount later than the signal.  (I'll play around with trying to get
      the signal to be acted on sooner, but it might be better to wait
      for full signal-handling support.)
      
      Still, it seems to provide some decent high-level info.  For example,
      half of the startup time of the django-template benchmark seems to be
      due to regular expressions.
      7596e61a
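The signal-then-safepoint pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the Pyston code: the choice of ITIMER_PROF/SIGPROF is an assumption (the commit only says setitimer()), and `startSampling`, `safepoint` and the globals are hypothetical names. The handler only sets a flag; the sample is taken later, at the next safepoint, which is why the sample can lag the signal.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/time.h>

// Set by the signal handler, consumed at the next safepoint.
static volatile sig_atomic_t g_sample_pending = 0;
// Counts samples taken; stands in for printing a Python stacktrace.
static int g_samples_taken = 0;

// Async-signal-safe handler: it only records that a sample was requested.
extern "C" void onProfileSignal(int) { g_sample_pending = 1; }

// Install the handler and arm a recurring timer (assumed: ITIMER_PROF -> SIGPROF).
bool startSampling(long interval_usec) {
    assert(interval_usec > 0);

    struct sigaction sa = {};
    sa.sa_handler = onProfileSignal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGPROF, &sa, nullptr) != 0)
        return false;

    struct itimerval tv;
    tv.it_interval.tv_sec = interval_usec / 1000000;
    tv.it_interval.tv_usec = interval_usec % 1000000;
    tv.it_value = tv.it_interval; // first expiry after one full interval
    return setitimer(ITIMER_PROF, &tv, nullptr) == 0;
}

// Hypothetical safepoint hook: returns true if a pending sample was handled here.
bool safepoint() {
    if (!g_sample_pending)
        return false;
    g_sample_pending = 0;
    ++g_samples_taken; // a real profiler would walk and print the Python stack here
    return true;
}
```

Any work done between the signal and the next `safepoint()` call is attributed to the later location, which matches the imprecision the commit message warns about.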
  5. 05 May, 2015 1 commit
    • Don't cache analysis results · b5823ff6
      Kevin Modzelewski authored
      On the simple test of `pyston -c "import pip"`, it reduces memory
      usage from 194MB to 153MB (20% of the memory previously was cached
      analysis data).  It also increases our benchmark geomean by ~1%,
      which isn't great, but I think we can get that back eventually and
      I don't think it's worth blocking this memory improvement for that.
      b5823ff6
  6. 23 Apr, 2015 1 commit
    • Move fn from module to code · 32a0dbff
      Kevin Modzelewski authored
      Modules have a __file__ attribute but that's only used for
      the module repr.  The filename that's used in tracebacks
      is stored in the code object.
      32a0dbff
  7. 22 Apr, 2015 1 commit
    • Fix analysis issue that virtualenv was running into · 81f00afe
      Kevin Modzelewski authored
      The issue was that the types analysis was osr-aware, where
      we would only type-analyze the sections of the function
      accessible from the osr entry point.  The phi and definedness
      analyses were not osr-aware, so they would think that phis
      existed in certain places where the type analysis knew that they
      were undefined.
      
      So this change makes definedness and phi analysis osr-aware,
      where they only analyze the appropriate section of the function.
      I think this means that we will do these analyses more, since we
      have to rerun them for each entry point, so hopefully analysis time
      doesn't increase too much.
      81f00afe
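One way to picture "only analyze the appropriate section of the function" is a reachability walk from the chosen entry block. The sketch below uses a toy CFG representation (a vector of successor lists) and a hypothetical `blocksReachableFrom` helper; Pyston's real CFGBlock and analysis classes are not shown in this log.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy CFG: successors[i] lists the blocks that block i can jump to.
using CFG = std::vector<std::vector<int>>;

// Collect every block reachable from `entry` (the function entry or an OSR entry).
// An osr-aware analysis would restrict itself to this set.
std::set<int> blocksReachableFrom(const CFG& successors, int entry) {
    assert(entry >= 0 && entry < static_cast<int>(successors.size()));

    std::set<int> seen;
    std::vector<int> worklist{entry};
    while (!worklist.empty()) {
        int block = worklist.back();
        worklist.pop_back();
        if (!seen.insert(block).second)
            continue; // already visited
        for (int succ : successors[block])
            worklist.push_back(succ);
    }
    return seen;
}
```

For a loop entered through OSR, the blocks before the loop are not in the returned set, so an analysis that agrees with the type analysis would skip them. Each distinct entry point needs its own walk, which is the extra cost the commit message mentions.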
  8. 20 Apr, 2015 2 commits
  9. 19 Apr, 2015 1 commit
  10. 07 Apr, 2015 1 commit
    • Greatly reduce verbosity in -v mode · 467c4fd5
      Kevin Modzelewski authored
      All the old prints still exist, but at -vv or even -vvv mode.
      
      The basic system is:
      -v mode gives information about the overall execution: what functions we
          run into, what things take longer than we expected, etc.
      -vv mode gives information about each function: the cfg, llvm ir, etc.
      -vvv gives information about each BB.
      467c4fd5
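The three-level scheme above could be gated along these lines. This is a sketch only: `g_verbosity`, `parseVerbosity`, `shouldLog` and `logAt` are hypothetical names, and the real option parsing is not part of this log.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical global verbosity level: 0 = quiet, 1 = -v, 2 = -vv, 3 = -vvv.
static int g_verbosity = 0;

// Count the 'v's in a flag such as "-v", "-vv" or "-vvv"; anything else yields 0.
int parseVerbosity(const std::string& flag) {
    if (flag.size() < 2 || flag[0] != '-')
        return 0;
    for (size_t i = 1; i < flag.size(); ++i)
        if (flag[i] != 'v')
            return 0;
    return static_cast<int>(flag.size()) - 1;
}

// True when a message tagged with `level` should be printed.
bool shouldLog(int level) {
    assert(level >= 1); // levels start at 1 (-v)
    return g_verbosity >= level;
}

// Print `msg` to stderr only if the current verbosity is high enough.
void logAt(int level, const char* msg) {
    if (shouldLog(level))
        std::fprintf(stderr, "%s\n", msg);
}
```

Under this sketch, per-function output such as the cfg would be tagged level 2 and per-BB output level 3, so plain -v stays limited to the overall execution.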
  11. 24 Mar, 2015 1 commit
  12. 23 Mar, 2015 1 commit
  13. 17 Mar, 2015 1 commit
  14. 24 Feb, 2015 1 commit
    • Rebase llvm to r230300 · ef1acbdc
      Marius Wachtler authored
      Had to change the comdat handling; otherwise the executable would crash on startup.
      The cause is that publicize renames symbols but keeps the old comdat around, so
      the linker would then replace the implementation of the symbol with a call to 0.
      
      You may need to run 'make llvm_up' if you are using the cmake build and see an error
      caused by debuginfo being renamed to debuginfodwarf.
      ef1acbdc
  15. 17 Feb, 2015 1 commit
  16. 14 Feb, 2015 5 commits
    • Reenable tier 2 for now · a3a12bb6
      Kevin Modzelewski authored
      We should do a more comprehensive investigation.  Removing t2 caused
      regressions on a number of benchmarks since we lost chances to do
      speculations, but making t3 easier to get to caused regressions
      due to the cost of our LLVM optimization set (which is pretty hefty
      since it's supposed to be hard to activate).
      a3a12bb6
    • Further distinguish OSR and non-osr compiles · 1bfb56e8
      Kevin Modzelewski authored
      A "FunctionSpecialization" object really makes no sense in the context of
      an OSR compile, since the FunctionSpecialization talks about the types
      of the input arguments, which no longer matter for OSR compiles.
      Now, their type information comes (almost) entirely from the OSREntryDescriptor,
      so in most places assert that we get exactly one or the other.
      1bfb56e8
    • Can kill all notion of partial-block-compilation · 0e60f0d3
      Kevin Modzelewski authored
      We only needed that for supporting the old deopt system
      0e60f0d3
    • Nuke the old "block guards" and the rest of the old deopt system · 8feae20e
      Kevin Modzelewski authored
      Long live new-deopt!
      8feae20e
    • For OSRs, do type analysis starting from OSR edge · ea673dfd
      Kevin Modzelewski authored
      Before we would do type analysis starting from the function entry
      (using the specialization of the previous function).  This makes things
      pretty complicated because we can infer different types than we are OSRing
      with!  For example, if the type analysis determines that we should speculate in an
      earlier BB, the types we have now might not reflect that speculation.
      
      So instead, start the type analysis from the BB that the OSR starts at.
      Should also have the side-benefit of requiring less type analysis work.
      
      But this should let us get rid of the OSR-entry guarding, and the rest of
      the old deopt system!
      ea673dfd
  17. 13 Feb, 2015 2 commits
  18. 11 Feb, 2015 2 commits
  19. 07 Feb, 2015 1 commit
  20. 04 Feb, 2015 1 commit
    • Intern most codegen strings · 325dbfeb
      Kevin Modzelewski authored
      Most importantly, intern all the strings we put into the AST* nodes.
      (the AST_Module* owns them)
      
      This should save us some memory, but it also improves performance pretty
      substantially since now we can do string comparisons very cheaply.  Performance
      of the interpreter tier is up by something like 30%, and JIT-compilation times
      are down as well (though not by as much as I was hoping).
      
      The overall effect on perf is more muted since we tier out of the interpreter
      pretty quickly; to see more benefit, we'll have to retune the OSR/reopt thresholds.
      
      For better or worse (mostly better IMO), the interned-ness is encoded in the type
      system, and things will not automatically convert between an InternedString and
      a std::string.  It means that this diff is quite large, but it also makes it a lot
      more clear where we are making our string copies or have other room for optimization.
      325dbfeb
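The two properties described above (cheap comparisons, and no automatic conversion to std::string) can be sketched as follows. This is an illustrative toy, not Pyston's InternedString: the pool class, its `get` method and the `str()` accessor are assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

class InternedStringPool;

// Handle to a pooled string. Equality is a pointer comparison, and there is
// deliberately no implicit conversion to std::string.
class InternedString {
public:
    bool operator==(const InternedString& rhs) const { return str_ == rhs.str_; }
    bool operator!=(const InternedString& rhs) const { return str_ != rhs.str_; }

    // Explicit access, so every place that reads or copies the text is visible.
    const std::string& str() const {
        assert(str_ != nullptr);
        return *str_;
    }

private:
    friend class InternedStringPool;
    explicit InternedString(const std::string* s) : str_(s) {}
    const std::string* str_;
};

// Owns the string storage (in the commit, the AST_Module* plays this role).
class InternedStringPool {
public:
    InternedString get(const std::string& s) {
        // insert() returns the existing element if `s` is already pooled.
        // unordered_set keeps element addresses stable across rehashing.
        return InternedString(&*pool_.insert(s).first);
    }

private:
    std::unordered_set<std::string> pool_;
};
```

Handles are only comparable within one pool and must not outlive it; encoding that in a distinct type is what makes a change like this touch so many call sites.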
  21. 22 Jan, 2015 1 commit
  22. 10 Jan, 2015 1 commit
  23. 07 Jan, 2015 2 commits
    • Rebase to LLVM r225000 · 83fb05b6
      Kevin Modzelewski authored
      Changes here due to the Metadata/Value split.
      
      Had some issues with intermediate commits due to leak
      detecting; not seeing them on this later commit.  Hopefully
      everything is ok...
      83fb05b6
    • Rebase to LLVM r223801 · 382d5095
      Kevin Modzelewski authored
      Primary challenge is rebasing past the JITEventListener changes.
      
      Some API changes, but also some considerable functionality changes,
      since we no longer get the loaded version of the object file.
      
      They added a feature to get the load address of a section, but not of
      a symbol; I think that makes sense so I'll submit a patch for that.
      382d5095
  24. 05 Jan, 2015 1 commit
  25. 04 Jan, 2015 1 commit
  26. 19 Dec, 2014 1 commit
    • Switch clang-format to C++11 mode · b0ead8ab
      Kevin Modzelewski authored
      Looks like for now this only changes double-closing-angle-brackets
      (now formats "> >" as ">>" in templates), but apparently it has
      effects on raw string literals as well.
      b0ead8ab
  27. 10 Dec, 2014 1 commit
    • Make 'is_generator' be a property of SourceInfo, not ScopeInfo · faa290b8
      Kevin Modzelewski authored
      ScopeInfo involves checking the whole function subtree to resolve
      scoping references.  SourceInfo has information about the specific
      function.  is_generator used to be in ScopeInfo, but accessing it
      would require doing the full subtree analysis, which can be
      unnecessary.
      
      This lets us avoid analyzing function subtrees that are never entered.
      faa290b8
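The benefit of moving the flag can be sketched with a lazily built ScopeInfo. The class shapes below are simplified stand-ins that reuse the names from the commit message; the counter, the constructor argument and `getScopeInfo` are assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>

// Counts how often the (expensive) scoping analysis runs; for illustration only.
static int g_scope_analyses_run = 0;

// Stand-in for the whole-subtree scoping analysis.
struct ScopeInfo {
    ScopeInfo() { ++g_scope_analyses_run; }
};

// Stand-in for per-function information that is cheap to compute.
class SourceInfo {
public:
    explicit SourceInfo(bool is_generator) : is_generator(is_generator) {}

    // Known from the function itself, so reading it needs no subtree analysis.
    const bool is_generator;

    // The scoping analysis is built on first use and cached afterwards.
    ScopeInfo* getScopeInfo() {
        if (!scoping_)
            scoping_.reset(new ScopeInfo());
        assert(scoping_);
        return scoping_.get();
    }

private:
    std::unique_ptr<ScopeInfo> scoping_;
};
```

A caller that only asks `is_generator` never triggers the subtree walk, so a function that is defined but never entered costs nothing beyond its SourceInfo.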
  28. 07 Dec, 2014 1 commit
    • add two passes which remove unnecessary boxing · 15921897
      Marius Wachtler authored
      * the first one removes boxInt, boxFloat and boxBool calls where the argument is coming from a corresponding unbox call
      * the second pass removes duplicate boxing calls inside the same BB
      
      together they improve the performance by about 10%
      15921897
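The two rewrites can be illustrated on a toy single-block IR. This is a sketch under stated assumptions: the real passes operate on LLVM IR calls to boxInt, boxFloat and boxBool, whereas the `Inst` struct, the `Op` enum and both function names below are hypothetical, and only the int case is modeled.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Toy IR for one basic block. Each instruction has at most one operand,
// named by the index of the earlier instruction that produces it (-1 = none).
enum class Op { Param, UnboxInt, BoxInt, Use };

struct Inst {
    Op op;
    int arg;
};

// Pass 1: a user of boxInt(unboxInt(x)) is rewritten to use x directly.
// Returns the number of operands rewritten.
int removeBoxOfUnbox(std::vector<Inst>& bb) {
    int rewritten = 0;
    for (Inst& inst : bb) {
        if (inst.arg < 0)
            continue;
        assert(inst.arg < static_cast<int>(bb.size()));
        const Inst& def = bb[inst.arg];
        if (def.op == Op::BoxInt && def.arg >= 0 && bb[def.arg].op == Op::UnboxInt) {
            inst.arg = bb[def.arg].arg;
            ++rewritten;
        }
    }
    return rewritten;
}

// Pass 2: within the block, a later boxInt of the same operand is replaced
// by the first one. Returns the number of duplicate box calls found.
int removeDuplicateBoxes(std::vector<Inst>& bb) {
    std::unordered_map<int, int> first_box;   // operand -> index of first BoxInt
    std::vector<int> replacement(bb.size());  // instruction -> instruction to use instead
    int duplicates = 0;
    for (int i = 0; i < static_cast<int>(bb.size()); ++i) {
        replacement[i] = i;
        if (bb[i].arg >= 0)
            bb[i].arg = replacement[bb[i].arg];
        if (bb[i].op != Op::BoxInt)
            continue;
        auto it = first_box.find(bb[i].arg);
        if (it == first_box.end()) {
            first_box[bb[i].arg] = i;
        } else {
            replacement[i] = it->second;
            ++duplicates;
        }
    }
    return duplicates;
}
```

Both functions only redirect uses; the bypassed box instructions are left in place and would be removed by a later dead-code elimination step.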
  29. 06 Dec, 2014 2 commits
  30. 22 Nov, 2014 1 commit