- 20 Jun, 2015 13 commits
-
-
Kevin Modzelewski authored
Involves a couple changes:
  - have the rewriter treat certain callsites as non-mutations
  - add special cases for wrapperdescr objects
-
Kevin Modzelewski authored
I renamed all the "about to enter jitted code" and "about to enter the interpreter" stats to "in_jitted_code" and "in_interpreter", respectively; I don't think the exact entry point ends up mattering that much. A lot of stuff is showing up as "in_jitted_code"; I tried to find some of it using the new itimer helper, and put some separate timers on those.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Seeing some corrupted-pyc-file issues on travis-ci that I can't reproduce locally. Add some more debugging output for when it happens again.
-
Kevin Modzelewski authored
If you define INVESTIGATE_STAT_TIMER to the name of the timer you want to investigate, we will set an itimer that raises a SIGTRAP if you are in that particular timer, but ignores the signal otherwise. There's no tooling on top of it, but just running that inside gdb is already helpful.
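A minimal sketch of the idea described above, written in Python rather than the project's own code. It assumes a periodic interval timer whose handler raises SIGTRAP only while the investigated timer is the innermost active one. The names `timer_stack`, `make_handler`, `install` and `raise_sigtrap` are hypothetical, and the module-level constant only stands in for the compile-time define.

```python
import signal

# Hypothetical stand-in for the INVESTIGATE_STAT_TIMER define.
INVESTIGATE_STAT_TIMER = "in_jitted_code"

# Names of the stat timers currently active, innermost last (assumed model).
timer_stack = []


def should_trap():
    """True only while the innermost active timer is the investigated one."""
    return bool(timer_stack) and timer_stack[-1] == INVESTIGATE_STAT_TIMER


def raise_sigtrap():
    # Under gdb, SIGTRAP stops the process here. Unix only, Python 3.8+.
    signal.raise_signal(signal.SIGTRAP)


def make_handler(trap=raise_sigtrap):
    """Build a signal handler that traps inside the investigated timer only."""
    def on_tick(signum, frame):
        if should_trap():
            trap()
        # Otherwise the tick is ignored.
    return on_tick


def install(interval_seconds=0.001, trap=raise_sigtrap):
    """Arm a repeating real-time itimer that delivers SIGALRM to the handler."""
    signal.signal(signal.SIGALRM, make_handler(trap))
    signal.setitimer(signal.ITIMER_REAL, interval_seconds, interval_seconds)
```

Calling `install()` and running the process under gdb would then stop at whatever happens to be executing while that timer is active. The `trap` callback is injectable so that the filtering logic can be exercised without sending a real signal.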
-
Kevin Modzelewski authored
Conflicts: src/codegen/unwinding.cpp
-
Kevin Modzelewski authored
make a couple more places successfully rewrite
-
Kevin Modzelewski authored
  - if we try guarding after a mutation
  - if we use all of our scratch space
Now, just set a "failed" flag internally, which prevents committing. The motivation for the first part is trying to rewrite calls to tp_getattro; if the rewrite comes from getattr then it will succeed, but if it comes from callattr then we will want to do some more guards after the tp_getattro. We could try to pass that state around, but for now just use the 'failed' approach.
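The "failed flag" approach can be sketched roughly as follows. This is a toy Python model, not the actual rewriter: the class `ToyRewriter` and its methods are hypothetical, and it only records actions instead of emitting machine code. The point it illustrates is that an invalid step marks the rewrite as failed, and the failure only takes effect at commit time.

```python
class ToyRewriter:
    """Toy model: records guards and calls, and refuses to commit if failed."""

    def __init__(self, scratch_slots):
        self.scratch_total = scratch_slots
        self.scratch_used = 0
        self.mutated = False   # set once a possibly-mutating call is emitted
        self.failed = False    # set instead of asserting
        self.actions = []

    def add_guard(self, description):
        if self.mutated:
            # A guard after a mutation could not safely bail out, so give up.
            self.failed = True
            return
        self.actions.append(("guard", description))

    def call(self, description, mutates=True):
        self.actions.append(("call", description))
        if mutates:
            self.mutated = True

    def alloc_scratch(self):
        if self.scratch_used == self.scratch_total:
            # Out of scratch space: mark as failed rather than aborting.
            self.failed = True
            return None
        slot = self.scratch_used
        self.scratch_used += 1
        return slot

    def commit(self):
        """Return the recorded actions, or None if the rewrite failed."""
        if self.failed:
            return None
        return list(self.actions)
```

In this model a getattr-style rewrite (guard, then call tp_getattro) commits, while a callattr-style rewrite that adds another guard after the tp_getattro call is silently dropped at commit.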
-
Kevin Modzelewski authored
We don't usually call callattr with null_on_nonexistent, but we do for __hasattr__ checking. We can rewrite those to just do the guards and then return NULL.
-
Kevin Modzelewski authored
kind of hacky but I think it's ok for now.
-
Kevin Modzelewski authored
We could also add more general rewriting, but
  - these new special cases catch something like 95% of the cases that we weren't rewriting
  - these special cases are faster than doing the generic nonzero mechanism (looking up the attribute, etc)
It'd be nice if we could get to the point that the generic rewrites we'd create would be as good as the hand-crafted ones, but that would require knowing that we don't need to guard on constant classes, and then inlining within rewrites.
-
Kevin Modzelewski authored
Some more perf hunting
-
Kevin Modzelewski authored
ie one of the common entrypoints to capi code.
-
- 19 Jun, 2015 12 commits
-
-
Kevin Modzelewski authored
ie when it's on a builtin method. We assumed at some point that we wouldn't need to look at the function object, but now that we can rewrite method_cls calls, that's not true.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Playing with stattimers
-
Chris Toshok authored
-
Chris Toshok authored
-
Kevin Modzelewski authored
ie roll up all the time into the most "avoidable" reason that we were doing it. For example, if we are doing something like calling slot_tp_getattro on a builtin type (very avoidable), roll up all the subsequent time (runtimeCall, etc) into the slot_tp_getattro timer. But if we call runtimeCall where we couldn't avoid it (ex from the interpreter), log that separately. Not sure how helpful it will be but for this specific investigation it seems to somewhat work. The idea of the "avoidability" is definitely pretty specific to the type of work that you are thinking of doing; the numbers I put in are for investigating slowpaths. Also, remove all the timers that we have on specific runtime functions (ex: listMul). I think we'll need another strategy for those.
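One way to picture the roll-up is the sketch below. It is a toy Python model under stated assumptions, not the project's implementation: the class `RollupTimers`, the injected `clock`, and the avoidability numbers in the usage are all hypothetical. A nested timer only takes over if it is more avoidable than the timer currently being charged; otherwise its time keeps going to the enclosing timer.

```python
class RollupTimers:
    """Toy stat timers that charge time to the most avoidable active reason."""

    def __init__(self, clock):
        self.clock = clock      # callable returning the current time
        self.totals = {}        # timer name -> accumulated time
        self.stack = []         # (name, avoidability) entries being charged
        self.last = clock()

    def _charge(self):
        # Attribute the time since the last event to the current top entry.
        now = self.clock()
        if self.stack:
            name = self.stack[-1][0]
            self.totals[name] = self.totals.get(name, 0) + (now - self.last)
        self.last = now

    def push(self, name, avoidability):
        self._charge()
        if self.stack and self.stack[-1][1] >= avoidability:
            # Roll up: keep charging the more avoidable enclosing timer.
            self.stack.append(self.stack[-1])
        else:
            self.stack.append((name, avoidability))

    def pop(self):
        self._charge()
        self.stack.pop()
```

With this model, a runtimeCall reached from slot_tp_getattro adds to the slot_tp_getattro total, while a runtimeCall reached from the interpreter is logged under its own name.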
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Switch super to tp_getattro
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Some small micro-optimizations
-
Kevin Modzelewski authored
Mostly related to making sure that some hot paths get inlined better, particularly for allocation/sweeping. These changes make a simple for-loop benchmark (which is just dependent on allocation speed) about 25% faster.
-
Kevin Modzelewski authored
Incremental traceback
-
- 18 Jun, 2015 15 commits
-
-
Kevin Modzelewski authored
constant loading optimizations in rewriter
-
Kevin Modzelewski authored
cmake testing
-
Travis Hance authored
-
Travis Hance authored
-
Travis Hance authored
-
Travis Hance authored
-
Travis Hance authored
-
Marius Wachtler authored
  - Reuse a register if it already contains the specified value
  - Generate LEA when beneficial
--> Generated code is smaller and has same or better performance
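The two optimizations can be illustrated with a toy constant loader. This is a Python sketch under assumptions, not the rewriter's code: `load_constant`, `fits_int32` and the register-tracking dict are hypothetical, the output is assembly-like text rather than encoded instructions, and "beneficial" is assumed here to mean that the constant does not fit a signed 32-bit immediate but lies within a signed 32-bit displacement of a value already held in a register.

```python
def fits_int32(x):
    """True if x is representable as a signed 32-bit integer."""
    return -2**31 <= x < 2**31


def load_constant(regs, dest, value):
    """Get `value` into a register.

    regs maps register names to the constant each is known to hold.
    Returns (register holding the value, list of emitted instructions).
    """
    # 1. Reuse a register that already contains the value: emit nothing.
    for reg, known in regs.items():
        if known == value:
            return reg, []

    # 2. For a large constant, derive it from a nearby known value with LEA.
    if not fits_int32(value):
        for reg, known in list(regs.items()):
            delta = value - known
            if fits_int32(delta):
                regs[dest] = value
                return dest, [f"lea {dest}, [{reg}{delta:+#x}]"]

    # 3. Fall back to a plain immediate load.
    regs[dest] = value
    return dest, [f"mov {dest}, {value:#x}"]
```

For example, with `rax` known to hold `0x7f0000001000`, asking for `0x7f0000001010` in `rcx` yields a single `lea rcx, [rax+0x10]` instead of a full 64-bit immediate move.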
-
Chris Toshok authored
make BoxedTraceback hold a single line, instead of a vector. rename BoxedTraceback::Here to BoxedTraceback::here
-
Chris Toshok authored
move the exception stat logging to PythonUnwindSession::logException, which is called from cxa_throw. remove raiseRaw and make raiseExc use throw
-
Chris Toshok authored
-
Chris Toshok authored
-
Chris Toshok authored
-
Chris Toshok authored
-
Chris Toshok authored
-