Commits · caa84b81aca2c56f21418ef2be63512e340dcc99 · Boxiang Sun / Pyston

12 Jul, 2015 2 commits

Get rid of "interpreted CompiledFunctions" · 41c0273e

This was vestigial from the old llvm interpreter, where it made a bit more
sense. Now it's just weird, and caused the tiering logic to be spread into
weird places. It was also the source of some bugs, since when we would deopt
there's not really any relevant CompiledFunction that represents the deopt
frame (this is why type speculation had to be temporarily disabled).

So instead, make it so that the interpreter (and by extension, the baseline
jit) works on the CLFunction directly, since the CompiledFunction didn't hold
any relevant information anyway. This required a whole bunch of refactoring and
API changes.

This commit doesn't actually change any of the tiering decisions (I think), but
should just make them clearer; the actual behavioral changes will be done in
upcoming commits.

41c0273e

Get rid of void-returning functions · fc238ed8

Kevin Modzelewski authored 9 years ago

Module-level code used to return void, but this causes some
special-casing, so switch to having them return None.
Also, CPython has their module code objects return None, so
it's a small compatibility gain as well.

fc238ed8
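
The CPython behavior mentioned above can be checked directly. The snippet below is a minimal sketch in plain CPython (not Pyston internals); the source text and the `<demo_module>` filename are made up for illustration.

```python
# Compile module-level source in "exec" mode, the mode used for module code.
module_code = compile("x = 1\ny = x + 1\n", "<demo_module>", "exec")

namespace = {}
# eval() on an exec-mode code object runs it and returns None.
result = eval(module_code, namespace)

print(result is None)
print(namespace["y"])
```

Because the module body yields None rather than "nothing", callers can treat it like any other code object.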

02 Jul, 2015 1 commit

Add new JIT tier between the interpreter and the LLVM JIT tiers. · fc366564

Marius Wachtler authored 9 years ago

This JIT is tightly coupled to the ASTInterpreter: at every CFGBlock* entry/exit
one can switch between interpreting and directly executing the generated code without having
to do any translations.
Generating the code is pretty fast compared to the LLVM tier, but the generated code is not as fast
as code generated by the higher LLVM tiers.
But because the JITed code can use runtime ICs, avoids a lot of interpretation overhead, and
stores CFGBlock-local symbols inside register/stack slots, it's much faster than the interpreter.

fc366564

10 Jun, 2015 1 commit

Change the way we store and pass string data · 95ad7ffc

Kevin Modzelewski authored 9 years ago

Convert to BoxedString much sooner, and have any functions that
might need to box a string take a BoxedString.  This means that
on some paths, we will need to box when previously we didn't, but
for callsites that we control we can just intern the string and
not have to box again.  The much more common case is that we
passed in unboxed string data, but then ran into a branch
that required boxing.

BoxedString shouldn't be that much more costly than std::string,
and this should cut down on string allocations.

For django-template.py, the number of strings allocated drops from
800k to 525k; for virtualenv_test.py, it goes from 1.25M to 1.0M

A couple things made this not 100% mechanical:
- taking advantage of places that we could eliminate unbox/rebox pairs
- different null-termination assumptions between StringRef and the c api.

95ad7ffc

29 May, 2015 1 commit

Python-level sampling profiler · 7596e61a

Kevin Modzelewski authored 9 years ago

Uses setitimer() to set a recurring signal, and prints
a Python stacktrace at the next safepoint aka allowGLReadPreemption.
This is not great since allowGLReadPreemption can happen a decent
amount later than the signal.  (I'll play around with trying to get
the signal to be acted on sooner, but it might be better to wait
for full signal-handling support.)

Still, it seems to provide some decent high-level info.  For example,
half of the startup time of the django-template benchmark seems to be
due to regular expressions.

7596e61a
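
The same idea can be sketched in plain CPython on a Unix system. This is an illustration only, not Pyston's implementation: the handler, the `busy_work` helper, and the 10 ms interval are assumptions. CPython also defers Python-level signal handlers until the interpreter reaches a safe point between bytecodes, which loosely mirrors the safepoint delay described above.

```python
import signal
import time
import traceback

samples = []

def record_sample(signum, frame):
    # Capture the Python stack of the interrupted code as one sample.
    samples.append(traceback.extract_stack(frame))

def busy_work(seconds):
    # Hypothetical workload: spin until the deadline passes.
    deadline = time.monotonic() + seconds
    total = 0
    while time.monotonic() < deadline:
        total += 1
    return total

signal.signal(signal.SIGALRM, record_sample)
# Fire SIGALRM after 10 ms, then every 10 ms (interval chosen arbitrarily).
signal.setitimer(signal.ITIMER_REAL, 0.01, 0.01)
try:
    busy_work(0.2)
finally:
    # Cancel the recurring timer.
    signal.setitimer(signal.ITIMER_REAL, 0, 0)

print(len(samples), "samples collected")
```

Counting how often each function name appears in `samples` gives the kind of high-level breakdown the commit message describes.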

05 May, 2015 1 commit

Don't cache analysis results · b5823ff6

Kevin Modzelewski authored 9 years ago

On the simple test of `pyston -c "import pip"`, it reduces memory
usage from 194MB to 153MB (20% of the memory previously was cached
analysis data). It also increases our benchmark geomean by ~1%,
which isn't great, but I think we can get that back eventually and
I don't think it's worth blocking this memory improvement for that.

b5823ff6

23 Apr, 2015 1 commit

Move fn from module to code · 32a0dbff

Kevin Modzelewski authored 9 years ago

Modules have a __file__ attribute but that's only used for
the module repr.  The filename that's used in tracebacks
is stored in the code object.

32a0dbff
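
The split between the two filenames is visible in CPython as well. A minimal sketch, with both paths invented for the demonstration:

```python
import traceback
import types

# Hypothetical module whose __file__ differs from the code's filename.
module = types.ModuleType("demo")
module.__file__ = "/somewhere/module_file.py"

source = "def fail():\n    raise ValueError('boom')\n"
code = compile(source, "/elsewhere/code_file.py", "exec")
exec(code, module.__dict__)

try:
    module.fail()
except ValueError:
    trace_text = traceback.format_exc()

print(repr(module))                     # uses module.__file__
print(module.fail.__code__.co_filename) # filename stored in the code object
print(trace_text)                       # traceback uses the code object's filename
```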

22 Apr, 2015 1 commit

Fix analysis issue that virtualenv was running into · 81f00afe

Kevin Modzelewski authored 9 years ago

The issue was that the types analysis was osr-aware, where
we would only type-analyze the sections of the function
accessible from the osr entry point.  The phi and definedness
analyses were not osr-aware, so they would expect phis
in certain places where the type analysis knew that they
were undefined.

So this change makes definedness and phi analysis osr-aware,
where they only analyze the appropriate section of the function.
I think this means that we will do these analyses more, since we
have to rerun them for each entry point, so hopefully analysis time
doesn't increase too much.

81f00afe
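
"Only analyze the appropriate section of the function" amounts to restricting an analysis to the blocks reachable from the chosen entry point. The sketch below assumes a CFG given as a successor map; the block numbering is hypothetical and this is not Pyston's actual data structure.

```python
from collections import deque

def reachable_blocks(cfg, entry):
    """Return the set of blocks reachable from `entry` in a successor map."""
    seen = {entry}
    worklist = deque([entry])
    while worklist:
        block = worklist.popleft()
        for succ in cfg.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return seen

# Hypothetical CFG: block 0 is the function entry, block 2 is a loop header.
cfg = {0: [1], 1: [2], 2: [3, 4], 3: [2], 4: []}

from_function_entry = reachable_blocks(cfg, 0)
from_osr_entry = reachable_blocks(cfg, 2)

print(sorted(from_function_entry))
print(sorted(from_osr_entry))
```

An analysis that starts at the OSR entry never visits blocks 0 and 1, which is also why it has to be rerun for each entry point.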

20 Apr, 2015 2 commits
- Emit PHI nodes in deterministic order to improve the effectiveness of the jit object cache · b47e2421
  Marius Wachtler authored 9 years ago
  
  b47e2421
- Add a cache for JITed object code. · ca7765ac
  Marius Wachtler authored 9 years ago
```
We still need to generate the IR, but if we can find
a cache file created for the exact same IR we will load it and
skip instruction selection etc.
```
  ca7765ac
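
A cache keyed on the exact IR might look roughly like the sketch below. The class, the SHA-256 key, the `.o` suffix, and `fake_compile` are all assumptions for illustration; the real cache plugs into LLVM rather than Python. Since the key depends on the exact IR text, any nondeterminism in IR generation (such as PHI node order) turns would-be hits into misses.

```python
import hashlib
import os
import tempfile

class ObjectCache:
    """Toy on-disk cache mapping IR text to compiled bytes."""

    def __init__(self, directory):
        self.directory = directory
        self.hits = 0
        self.misses = 0

    def _path_for(self, ir_text):
        digest = hashlib.sha256(ir_text.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".o")

    def get_or_compile(self, ir_text, compile_fn):
        path = self._path_for(ir_text)
        if os.path.exists(path):
            # Exact same IR seen before: skip the expensive backend work.
            self.hits += 1
            with open(path, "rb") as f:
                return f.read()
        self.misses += 1
        obj = compile_fn(ir_text)
        with open(path, "wb") as f:
            f.write(obj)
        return obj

def fake_compile(ir_text):
    # Stand-in for instruction selection and the rest of the backend.
    return ir_text.upper().encode("utf-8")

cache = ObjectCache(tempfile.mkdtemp())
ir = "define i64 @f() { ret i64 1 }"
first = cache.get_or_compile(ir, fake_compile)
second = cache.get_or_compile(ir, fake_compile)
print(cache.hits, cache.misses)
```
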
19 Apr, 2015 1 commit
- Turn down verbosity · 5b3cc779
  Kevin Modzelewski authored 9 years ago
  
  5b3cc779
07 Apr, 2015 1 commit

Greatly reduce verbosity in -v mode · 467c4fd5

Kevin Modzelewski authored 9 years ago

All the old prints still exist, but at -vv or even -vvv mode.

The basic system is:
-v mode gives information about the overall execution: what functions we
    run into, what things take longer than we expected, etc.
-vv mode gives information about each function: the cfg, llvm ir, etc.
-vvv gives information about each BB.

467c4fd5
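
One way to express this kind of leveled output is a counted `-v` flag plus a level check at each print site. The sketch below uses Python's `argparse`; `make_logger` and the messages are hypothetical and do not reflect Pyston's actual logging code.

```python
import argparse

def make_logger(verbosity):
    """Return a log function that prints only at or below `verbosity`."""
    def log(level, message):
        if verbosity >= level:
            print(message)
            return True
        return False
    return log

parser = argparse.ArgumentParser()
# Each repeated -v bumps the count: -v is 1, -vv is 2, -vvv is 3.
parser.add_argument("-v", action="count", default=0, dest="verbosity")
args = parser.parse_args(["-vv"])

log = make_logger(args.verbosity)
shown_overall = log(1, "compiling function f")   # overall execution info
shown_function = log(2, "cfg for f: ...")        # per-function info
shown_block = log(3, "entering BB 4")            # per-BB info
```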

24 Mar, 2015 1 commit
- with statements now handle exceptions properly · 7fb7aaa5
  Michael Arntzenius authored 9 years ago
  
  7fb7aaa5
23 Mar, 2015 1 commit
- exec · 2bfcd7d1
  Travis Hance authored 9 years ago
  
  2bfcd7d1
17 Mar, 2015 1 commit
- add missing case for CLOSURE · b2256538
  Chris Toshok authored 9 years ago
  
  b2256538
24 Feb, 2015 1 commit

Rebase llvm to r230300 · ef1acbdc

Marius Wachtler authored 9 years ago

Had to change the comdat handling, otherwise the executable would crash on startup.
The cause is that publicize renames symbols but keeps the old comdat around;
the linker would then replace the implementation of the symbol with a call to 0.

You may need to run 'make llvm_up' if you are using the cmake build and see an error
caused by debuginfo being renamed to debuginfodwarf.

ef1acbdc

17 Feb, 2015 1 commit
- Support passing generator objects through the args array in OSR · bff16616
  Kevin Modzelewski authored 10 years ago
```
Only gets hit when there are >=3 !is_defined names also set (other
fake names might also count towards this).
```
  bff16616
14 Feb, 2015 5 commits

Reenable tier 2 for now · a3a12bb6

Kevin Modzelewski authored 10 years ago

We should do a more comprehensive investigation. Removing t2 caused
regressions on a number of benchmarks since we lost chances to do
speculations, but making t3 easier to get to caused regressions
due to the cost of our LLVM optimization set (which is pretty hefty
since it's supposed to be hard to activate).

a3a12bb6

Further distinguish OSR and non-osr compiles · 1bfb56e8

Kevin Modzelewski authored 10 years ago

A "FunctionSpecialization" object really makes no sense in the context of
an OSR compile, since the FunctionSpecialization talks about the types
of the input arguments, which no longer matter for OSR compiles.
Now, their type information comes (almost) entirely from the OSREntryDescriptor,
so in most places assert that we get exactly one or the other.

1bfb56e8

Can kill all notion of partial-block-compilation · 0e60f0d3
Kevin Modzelewski authored 10 years ago
```
We only needed that for supporting the old deopt system
```
0e60f0d3
Nuke the old "block guards" and the rest of the old deopt system · 8feae20e
Kevin Modzelewski authored 10 years ago
```
Long live new-deopt!
```
8feae20e

For OSRs, do type analysis starting from OSR edge · ea673dfd

Kevin Modzelewski authored 10 years ago

Before we would do type analysis starting from the function entry
(using the specialization of the previous function).  This makes things
pretty complicated because we can infer different types than we are OSRing
with!  For example, if the type analysis determines that we should speculate in an
earlier BB, the types we have now might not reflect that speculation.

So instead, start the type analysis from the BB that the OSR starts at.
Should also have the side-benefit of requiring less type analysis work.

But this should let us get rid of the OSR-entry guarding, and the rest of
the old deopt system!

ea673dfd

13 Feb, 2015 2 commits

Cleanup: delete vestiges of old expr guarding system · 222f5a70
Kevin Modzelewski authored 10 years ago

222f5a70

Cleanup: add settable OSR/reopt thresholds, and get rid of tier 2 · 7cf92757

Kevin Modzelewski authored 10 years ago

Previously it was:
tier 0: ast interpreter
tier 1: llvm, no speculations, no llvm opts
tier 2: llvm, w/ speculations, no llvm opts
tier 3: llvm, w/ speculations, w/ llvm opts

tier 2 seemed pretty useless, and very little would stay in it.  Also,
OSR would always skip from tier 1 to tier 3.

Separately, add configurable OSR/reopt thresholds.  This is mostly for the
sake of tests, where we can set lower limits and force OSR/reopts to happen.

7cf92757
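
Why a settable threshold helps tests can be shown with a toy model of call-count tiering. Everything here is a simplified assumption (a single threshold, promotion on call counts only, the three remaining tiers numbered 0 to 2); it is not Pyston's tiering logic.

```python
# Tiers left after this commit, using the descriptions from the message.
TIERS = [
    "ast interpreter",
    "llvm, no speculations, no llvm opts",
    "llvm, w/ speculations, w/ llvm opts",
]

class TieredFunction:
    def __init__(self, reopt_threshold):
        self.reopt_threshold = reopt_threshold
        self.tier = 0
        self.calls_in_tier = 0

    def call(self):
        """Run one call; return the tier index that executed it."""
        executed = self.tier
        self.calls_in_tier += 1
        can_promote = self.tier < len(TIERS) - 1
        if can_promote and self.calls_in_tier >= self.reopt_threshold:
            # Hot enough: the next call runs in the next tier.
            self.tier += 1
            self.calls_in_tier = 0
        return executed

# A low threshold forces promotions after only a few calls.
fn = TieredFunction(reopt_threshold=2)
history = [fn.call() for _ in range(5)]
print(history)
```

With a production-sized threshold the same test would need many more iterations before reaching the top tier.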

11 Feb, 2015 2 commits
- Fix issues found by coverity · 0ba1cac6
  Kevin Modzelewski authored 10 years ago
```
Thanks to @denji for the report!
```
  0ba1cac6
- Fix irgen bug · e041df0a
  Kevin Modzelewski authored 10 years ago
```
Need to unbox-rebox bools like we do for ints and floats.
```
  e041df0a
07 Feb, 2015 1 commit
- give all CLFunctions a ParamNames (renamed from ArgNames) including builtin ones · 5de99b04
  Travis Hance authored 10 years ago
  
  5de99b04
04 Feb, 2015 1 commit

Intern most codegen strings · 325dbfeb

Kevin Modzelewski authored 10 years ago

Most importantly, intern all the strings we put into the AST* nodes.
(the AST_Module* owns them)

This should save us some memory, but it also improves performance pretty
substantially since now we can do string comparisons very cheaply.  Performance
of the interpreter tier is up by something like 30%, and JIT-compilation times
are down as well (though not by as much as I was hoping).

The overall effect on perf is more muted since we tier out of the interpreter
pretty quickly; to see more benefit, we'll have to retune the OSR/reopt thresholds.

For better or worse (mostly better IMO), the interned-ness is encoded in the type
system, and things will not automatically convert between an InternedString and
a std::string.  It means that this diff is quite large, but it also makes it a lot
more clear where we are making our string copies or have other room for optimization.

325dbfeb
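
The two properties described above (cheap comparison, no implicit conversion) can be illustrated with a small intern pool. This is a sketch; `StringPool` and this Python `InternedString` class are stand-ins, not the C++ types from the commit.

```python
class InternedString:
    """Wrapper type: deliberately not comparable to a plain str."""
    __slots__ = ("text",)

    def __init__(self, text):
        self.text = text

class StringPool:
    """Owns the interned strings, one object per distinct text."""

    def __init__(self):
        self._table = {}

    def intern(self, text):
        existing = self._table.get(text)
        if existing is None:
            existing = InternedString(text)
            self._table[text] = existing
        return existing

pool = StringPool()
a = pool.intern("foo")
b = pool.intern("foo")
c = pool.intern("bar")

print(a is b)        # same text, same object: comparison is an identity check
print(a is c)
print(a == "foo")    # no automatic conversion to or from a plain string
print(a.text)        # explicit access to the underlying data
```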

22 Jan, 2015 1 commit
- Fixes · d0672808
  Kevin Modzelewski authored 10 years ago
  
  d0672808
10 Jan, 2015 1 commit
- Add NONZERO support to irgen, and fix some bugs · 89c5f52b
  Kevin Modzelewski authored 10 years ago
  
  89c5f52b
07 Jan, 2015 2 commits

Rebase to LLVM r225000 · 83fb05b6

Kevin Modzelewski authored 10 years ago

Changes here due to the Metadata/Value split.

Had some issues with intermediate commits due to leak
detection; not seeing them on this later commit.  Hopefully
everything is ok...

83fb05b6

Rebase to LLVM r223801 · 382d5095

Kevin Modzelewski authored 10 years ago

Primary challenge is rebasing past the JITEventListener changes.

Some API changes, but also some considerable functionality changes,
since we no longer get the loaded version of the object file.

They added a feature to get the load address of a section, but not of
a symbol; I think that makes sense so I'll submit a patch for that.

382d5095

05 Jan, 2015 1 commit
- Happy new year! · d50ce553
  Kevin Modzelewski authored 10 years ago
```
(update copyright notices)
```
  d50ce553
04 Jan, 2015 1 commit
- Add sys.exit() · 189b1150
  Kevin Modzelewski authored 10 years ago
  
  189b1150
19 Dec, 2014 1 commit

Switch clang-format to C++11 mode · b0ead8ab

Kevin Modzelewski authored 10 years ago

Looks like for now this only changes double-closing-angle-brackets
(now formats "> >" as ">>" in templates), but apparently it has
effects on raw string literals as well.

b0ead8ab

10 Dec, 2014 1 commit

Make 'is_generator' be a property of SourceInfo, not ScopeInfo · faa290b8

Kevin Modzelewski authored 10 years ago

ScopeInfo involves checking the whole function subtree to resolve
scoping references.  SourceInfo has information about the specific
function.  is_generator used to be in ScopeInfo, but accessing it
would require doing the full subtree analysis, which can be
unnecessary.

This lets us avoid analyzing function subtrees that are never entered.

faa290b8

07 Dec, 2014 1 commit

add two passes which remove unnecessary boxing · 15921897

Marius Wachtler authored 10 years ago

* the first one removes boxInt, boxFloat and boxBool calls where the argument is coming from a corresponding unbox call
* the second pass removes duplicate boxing calls inside the same BB

together they improve the performance by about 10%

15921897
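
The two passes can be sketched on a toy single-block IR of `(dest, op, arg)` triples. The IR shape, the `use` op, and the function names are assumptions for illustration; the real passes operate on LLVM IR. The first pass assumes, as the commit does, that re-boxing an unboxed value can be replaced by the original boxed object.

```python
BOX_TO_UNBOX = {
    "boxInt": "unboxInt",
    "boxFloat": "unboxFloat",
    "boxBool": "unboxBool",
}

def remove_box_of_unbox(block):
    """Pass 1: replace box(unbox(x)) with x."""
    defs = {}      # dest -> (op, arg) for instructions kept so far
    renames = {}   # removed dest -> value that replaces it
    out = []
    for dest, op, arg in block:
        arg = renames.get(arg, arg)
        source = defs.get(arg)
        if op in BOX_TO_UNBOX and source and source[0] == BOX_TO_UNBOX[op]:
            renames[dest] = source[1]
            continue
        defs[dest] = (op, arg)
        out.append((dest, op, arg))
    return out

def remove_duplicate_boxes(block):
    """Pass 2: reuse the first boxing of a value within one block."""
    boxed = {}     # (op, arg) -> dest of the first boxing call
    renames = {}
    out = []
    for dest, op, arg in block:
        arg = renames.get(arg, arg)
        if op in BOX_TO_UNBOX:
            if (op, arg) in boxed:
                renames[dest] = boxed[(op, arg)]
                continue
            boxed[(op, arg)] = dest
        out.append((dest, op, arg))
    return out

block = [
    ("t0", "unboxInt", "a"),
    ("t1", "boxInt", "t0"),   # box of unbox: t1 is just a
    ("t2", "use", "t1"),
    ("t3", "boxInt", "n"),
    ("t4", "boxInt", "n"),    # duplicate of t3
    ("t5", "use", "t4"),
]
optimized = remove_duplicate_boxes(remove_box_of_unbox(block))
for instruction in optimized:
    print(instruction)
```

The now-unused `unboxInt` is left in place here; a separate dead-code pass would normally clean it up.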

06 Dec, 2014 2 commits

Fix a bug in OSR-entry guard checking · 860be3a2

Kevin Modzelewski authored 10 years ago

If we had to guard on the type of an object but the variable
ended up being undefined, we treated that as a guard failure
(not sure why).

Fixing that exposed another issue: if we guard that an object
is an int, we will try to unbox it if necessary.  If the variable
wasn't defined, then we will try to unbox some garbage memory.
Use the new handlePotentiallyUndefined to deal with that.

860be3a2

Refactor some common control-flow-creation code (NFC) · 7fe28ae0
Kevin Modzelewski authored 10 years ago

7fe28ae0

22 Nov, 2014 1 commit

Add the ability to use the LLVM "basic" register allocator · b917c7a6

Kevin Modzelewski authored 10 years ago

Controlled by the '-b' command-line flag.  That used to
be for "benchmark" mode but it looks like benchmark mode
has been broken for a while, so remove that.

b917c7a6