<kenaan>
mattip py3.6 777ad5f524c1 /pypy/: merge default into py3.6
<kenaan>
mattip py3.6 60d99c821291 /pypy/module/_multiprocessing/test/test_semaphore.py: fix test for python3
ronan has quit [Ping timeout: 258 seconds]
ronan has joined #pypy
jacob22 has joined #pypy
<lritter>
i'm looking to augment the LLVM backend for my partially JITed language because compile perf is *awful*, and it occurred to me that folks here might have some experience with different backends.
<lritter>
the alternatives i'm seriously considering atm are libjit, libgccjit or libtcc. what i don't know is which one has the best bang for buck i.e. outperforms LLVM in compile speed, while still providing a reasonable amount of execution performance
<lritter>
libtcc currently looks the most enticing... the code it generates isn't terribly fast, but iirc it compiles quickly, and also quite importantly, in a pinch it could replace the clang C bridge that we use
<simpson>
lritter: It seems like you're looking at a very flexible range of choices; have you seen QBE, by chance? It's another LLVM-but-smaller library: https://c9x.me/compile/
<lritter>
libgccjit could perhaps even act as a permanent replacement, but gcc itself is still GPL and that's going to be a problem w. shipping
<lritter>
simpson, yes i did. tbh the API just made me go ???
<lritter>
i also looked at libfirm, same.
<lritter>
but QBE is the most liberally licensed option which definitely is a plus.
<simpson>
BTW, there's a thesis recently going around that only a small handful of transformations are necessary to get an 80/20 tradeoff in optimizers. It could be that you only need those optimizations during JIT; certainly ISTM that this list is precisely the list that RPython uses.
marky1991 has joined #pypy
<lritter>
the jit is mostly needed to produce AST and syntax transform functions, actual program code is supposed to be compiled with big boy pants ;)
<lritter>
we currently have 1.2s of startup time which is spent 90% in llvm... a lot of it is code that is executed once then thrown away
<lritter>
if i can't solve this with middleware i won't.
<simpson>
Have you checked which LLVM passes are being used? ISTR that by default they select passes that are heavy, aimed at C++.
<lritter>
simpson, it's definitely -O0. i've done everything i could there.
<lritter>
1.2s without any optimizations, 1.7s with.
<lritter>
i think what it comes down to is deciding in which order i'm going to write & test backends :|
<lritter>
starting with libtcc would have the most benefit as having a C backend is never bad
<lritter>
plus if it's already good enough i might even use it for the C bridge.
<lritter>
what also matters is the kind of output you get when you break in GDB. LLVM is pretty good in that regard.
<lritter>
but if that fails, what to try next? QBE doesn't really look like it does online compilation well.
<lritter>
libjit is smaller than libgccjit and has the better license so i'd try that i guess
<simpson>
I bet that QBE's author would support adding part of the online API, but they might ask you to provide the JIT bits and write some C.
<simpson>
They're targeting BSDs IIUC, where JIT memory needs brush up against the system security policy.
<lritter>
ok, so tcc's GDB support at runtime is non-existent. it does work offline though.
marky1991 has quit [Ping timeout: 246 seconds]
mattip has joined #pypy
Garen_ has joined #pypy
Garen has quit [Ping timeout: 255 seconds]
<arigato>
mattip: re issue #3011 (cpyext slow when calling Py_BuildValue)
<mattip>
?
<arigato>
I think that we could build and fill tuples entirely from C, the same way as we do that with PyIntObjects
<arigato>
unless I'm mistaken, C code calling PyInt_FromLong() gets a PyIntObject made from C which doesn't have any corresponding pypy side at first (or possibly ever)
<arigato>
doing that with tuples seems possible now that PyTupleObjects maintain their own list of "PyObject *" pointers too
<arigato>
ah, but note that PyTuple_SET_ITEM is already just a macro in C
<arigato>
I guess PyTuple_New() still needs to be turned into a C-allocation-only, and then care taken in tupleobject.py
<mattip>
right. On py3.6 we still need to improve PyLong_FromLong like PyInt_FromLong, that might be the difference in the benchmark there
<mattip>
nice
<mattip>
so one by one allow C obj creation from C without a w_obj until needed
oberstet has quit [Remote host closed the connection]
<antocuni>
note that we still need to solve the general case of creating objects in C and returning them to Python (e.g. ndarray.getitem returning a scalar)
<mattip>
antocuni: you mean the getitem benchmark in bench_np.py (0.06 in cpython2.7, 0.41 in pypy2.7)?