- 08 Jul, 2015 5 commits
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Put the common case (where we don't do any work) in an inlineable function, and keep the slow path hidden.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
At this point, this is only used for some rarely-used targets like "disassemble this bitcode".
-
Kevin Modzelewski authored
Plus a benchmark of the getattr() function. I have some ideas on how to make this faster, but I don't think it's that important at the moment.
-
- 07 Jul, 2015 6 commits
-
Kevin Modzelewski authored
Be more consistent about "ip" while unwinding
-
Kevin Modzelewski authored
rewriter optimization; switch things away from compareInternal
-
Travis Hance authored
rewriter: use lea to allocate stack space
-
Kevin Modzelewski authored
compareInternal should probably just call PyObject_RichCompareBool eventually, but the switch also has the benefit of avoiding an extra boxing step.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Unicode-creation optimizations
-
- 06 Jul, 2015 12 commits
-
Kevin Modzelewski authored
I think the ip we receive is the return address for that stack frame (at least in the non-signal case). This makes things a bit weird, since the "is this ip in this range" tests need a different half-openness than they normally have. At some point an "ip = ip - 1" snuck in, which I think was meant to address this issue, but I think it's better not to change the ip: the resulting address is not independently useful (it points into the middle of an instruction). We could pass it around as "ip_minus_one", but instead just convert the tests to handle return addresses correctly.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Since we no longer memset the unicode allocations, and we now use a precise GC handler, we have to be careful that unicode objects are valid whenever a collection could happen. The issue was that "unicode->str = gc_alloc(...)" could trigger a collection, in which case the collector would see a bogus value in the ->str field. Now we do the str allocation first (since it is UNTRACKED) and the precise allocation second.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Inlining the allocation + object initialization saves a decent amount of overhead, since most of the properties will be fixed. For example, the size of the main allocation is fixed, so we can directly allocate it from the correct SmallArena bucket. We can also skip all the indirect function calls.
-
Kevin Modzelewski authored
add babel to our extra tests
-
Kevin Modzelewski authored
baseline jit: patch block transitions to a direct jump.
-
Kevin Modzelewski authored
Add a new JIT tier which is tightly coupled to the AST Interpreter
-
Marius Wachtler authored
-
Marius Wachtler authored
Add small cleanup: add helper for mode calculation
-
Kevin Modzelewski authored
For issue #254, add lt, le, gt, ge to old-style class
-
- 05 Jul, 2015 1 commit
-
Kevin Modzelewski authored
assembler: Use 32-bit moves if the immediate fits in 32 bits
-
- 04 Jul, 2015 1 commit
-
Marius Wachtler authored
Before, we emitted a runtime check to see whether the block had been JITed or whether we had to fall back to the interpreter. Now we always generate an exit to the interpreter if the block is not yet JITed, and patch the exit to a direct jump later once we have successfully generated code for the new block. This also removes the epilogue and replaces it with a direct 'leave; ret' combo, which saves space and an additional jump.
-
- 02 Jul, 2015 7 commits
-
Marius Wachtler authored
-
Marius Wachtler authored
Makes the encoding 4-5 bytes shorter; this works because writing to a 32-bit register automatically clears the upper 32 bits of the full 64-bit register.
-
Marius Wachtler authored
This JIT is tightly coupled to the ASTInterpreter: at every CFGBlock* entry/exit, one can switch between interpreting and directly executing the generated code without having to do any translation. Generating the code is quite fast compared to the LLVM tier, though the generated code is not as fast as what the higher LLVM tiers produce. But because the JITed code can use runtime ICs, avoids a lot of interpretation overhead, and stores CFGBlock local symbols in register/stack slots, it's much faster than the interpreter.
-
Kevin Modzelewski authored
Use fewer large constants
-
Boxiang Sun authored
-
Boxiang Sun authored
-
Boxiang Sun authored
-
- 01 Jul, 2015 8 commits
-
Kevin Modzelewski authored
Preparations for the new JIT tier
-
Kevin Modzelewski authored
Conflicts: src/core/types.h
-
Marius Wachtler authored
IC sizes are guessed...
-
Marius Wachtler authored
It deleted the passed ICSlotRewrite*, and there was no way for a caller to know this without looking at the source. Make the ownership explicit by using a std::unique_ptr.
-
Marius Wachtler authored
and make code ready for the new JIT tier.
-
Marius Wachtler authored
-
Kevin Modzelewski authored
Conflicts: src/runtime/iterobject.cpp
-
Kevin Modzelewski authored
add -a flag which outputs assembly of ICs