cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | if a pep adds a mere 25-30 [C-API] functions or so, it's a drop in the ocean (cough) - Armin
CrazyPython has quit [Read error: Connection reset by peer]
CrazyPython has joined #pypy
CrazyPython has quit [Ping timeout: 240 seconds]
Ai9zO5AP has quit [Quit: WeeChat 2.5]
jvesely has joined #pypy
xcm has quit [Killed (livingstone.freenode.net (Nickname regained by services))]
xcm has joined #pypy
bje_ has joined #pypy
jvesely has quit [Quit: jvesely]
<bje_> Hi. I've just tried running my package under PyPy and was stunned that it takes 3 times longer to run than under CPython. Is there a way I can dig into this?
<simpson> bje_: The first thing to consider is the high-level structure of your program. What does it spend the bulk of its time doing? Is it repetitive? Are there hot code paths, or is most code only run a few times?
<bje_> It's a time step simulation with a big triply nested loop.
<bje_> lots of repetition
<energizer> how long does it take in total?
<energizer> is it pure python?
<bje_> Pure Python (but uses some Numpy functions)
<bje_> It takes about 3.7 seconds in CPython and about three times that under Pypy.
<bje_> The main loop is in:
<bje_> function _sim
<simpson> It might sound counterintuitive, but my first attempt might be to use array.array instead of Numpy. (Maybe I'm stuck in an outdated mode of thinking though.)
<bje_> OK, will check that out, thanks
<energizer> there is also the pure-python tinynumpy
<simpson> The overall runtime's a little short, though, and the code's not too bad-looking, so you'd probably have to get down into the disassembly with vmprof: https://vmprof.readthedocs.io/en/latest/
<energizer> i wonder if this is the sort of issue that would've been solved by numpypy
<bje_> this is all a bit counterintuitive. I would have expected numpy to be the least of my worries.
<energizer> numpy, being written in C and Fortran, isn't pypy's strong suit
<bje_> I was just expecting you would jump into and out of the numpy C code ..
<energizer> also pypy can take a little while to warm up so on a short program it might not help
<bje_> I don't consider 3 seconds short. Is it?
<energizer> compared to something like a web server, sure
<bje_> ok
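A rough way to check whether JIT warm-up is a factor is to time the same workload several times in one process and compare the first run with later ones. The sketch below is only an illustration: `work` and `time_repeats` are hypothetical names, and the loop body is a stand-in, not bje_'s simulation.

```python
import time


def work(n):
    """Hypothetical stand-in for a loop-heavy, pure-Python workload."""
    total = 0.0
    for i in range(n):
        for j in range(10):
            total += (i * j) % 7
    return total


def time_repeats(func, arg, repeats=5):
    """Call func(arg) several times and return the wall-clock time of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        timings.append(time.perf_counter() - start)
    return timings


timings = time_repeats(work, 20_000)
for i, t in enumerate(timings):
    print(f"run {i}: {t:.4f} s")
```

If later runs are much faster than the first under PyPy, warm-up is likely part of the gap; if every run is equally slow, the cause is probably elsewhere.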
<bje_> AttributeError: module 'tinynumpy' has no attribute 'zeros'
<bje_> Wha? zeros is in the docs ..
<energizer> that is annoying
<bje_> tinynumpy 1.2.1
<bje_> >>>> import tinynumpy as np
<bje_> >>>> x = np.zeros(10)
<bje_> Traceback (most recent call last):
<bje_> File "<stdin>", line 1, in <module>
<bje_> AttributeError: module 'tinynumpy' has no attribute 'zeros'
<bje_> Does this work for you?
<energizer> >>> import tinynumpy.tinynumpy as np
<energizer> >>> np.zeros
<energizer> <function zeros at 0x7f6e679d06a8>
<bje_> Oh, tinynumpy.tinynumpy ...
<bje_> Well, that was unexpected :-)
<energizer> bje_: btw i'm mostly guessing here based on some but not a ton of experience trying to do that sort of thing - the experts will know better if you hang around a while
<bje_> OK, shall do :)
<bje_> Thanks
<bje_> I tried using Tinynumpy, but it's too incomplete
dddddd has quit [Remote host closed the connection]
bje_ has quit [Quit: Leaving]
ronan has quit [Remote host closed the connection]
ronan has joined #pypy
ronan has quit [Remote host closed the connection]
ronan has joined #pypy
wleslie has joined #pypy
<mattip> for the logs: tinynumpy is based on ctypes; perhaps a cffi based one would be faster
<mattip> and numpy/pandas is going to be slow on PyPy since it calls out to c via the C-API
<energizer> how far along did numpypy get?
ronan has quit [Remote host closed the connection]
ronan has joined #pypy
ronan has quit [Ping timeout: 246 seconds]
tsaka__ has quit [Ping timeout: 260 seconds]
bje_ has joined #pypy
<LarstiQ> bje_: > I was just expecting you would jump into and out of the numpy C code ..
<LarstiQ> bje_: if that happens once it's fine; if it happens a lot, the cost of crossing that boundary, which is higher for pypy than for cpython, starts to show
<bje_> LarstiQ, Ah, OK, thanks.
<LarstiQ> bje_: the reason for that is that the CPython C api exposes a lot of implementation details that pypy has to emulate. See https://morepypy.blogspot.com/2019/12/hpy-kick-off-sprint-report.html for an approach I hope will get rid of that
<bje_> LarstiQ, I am running under pypy3 -m cPython to try and get timings. Is this a useful starting point?
<LarstiQ> bje_: cProfile? Iirc that distorts the timing, I'd consider vmprof instead
<bje_> OK
<bje_> I've just noticed that my asserts, previously nulled out using PYTHONOPTIMIZE, appear to be being run. I might just have to comment those out for now
<LarstiQ> ah, make sure cpython and pypy are doing the same amount of work yes :)
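One way to confirm that both interpreters are doing the same amount of work is to run a small check under each and compare the output. This is a minimal sketch; `asserts_enabled` is a hypothetical helper, and it only reports whether assert statements run, without saying why.

```python
import sys


def asserts_enabled():
    """Return True if assert statements are executed in this interpreter."""
    try:
        assert False
    except AssertionError:
        return True
    return False


print(sys.implementation.name, sys.version_info[:3])
print("asserts enabled:", asserts_enabled())
print("optimize flag:", sys.flags.optimize)
```

Running the same file with each interpreter, with the same environment and command-line options as the benchmark, shows whether the asserts are active in both cases.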
<bje_> :-)
<bje_> LarstiQ, I posted a link to my source file above. Are there any glaring problems there?
<bje_> wrt PyPy, not in general :-)
<LarstiQ> bje_: not something that stood out at a glance to me. I agree with what simpson said
<LarstiQ> hour_demand is a pandas Series or numpy array?
<LarstiQ> ah, demand_ndarray suggests np I suppose :)
<bje_> it's an ndarray.
<bje_> I deliberately avoid Pandas in this file for performance reasons
ilbelkyr has left #pypy [#pypy]
<bje_> LarstiQ, I've got a lot of loops. I am surprised the performance gap is so wide.
ronan has joined #pypy
ronan has quit [Ping timeout: 246 seconds]
<bje_> LarstiQ, should I be trying to eliminate Numpy?
jvesely has joined #pypy
bje_ has quit [Quit: Leaving]
<LarstiQ> bje_: if you don't need it much, pypy is pretty good at lists and array.array, so that's an option. Another thing you could try is running it longer and see if it's a matter of the jit still warming up
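The lists / array.array option could look roughly like the sketch below, which keeps the nested loops in pure Python and stores the values in `array.array` instead of an ndarray. Everything here (`simulate`, `demand`, `capacities`, the dispatch rule) is made up for illustration and is not taken from bje_'s code.

```python
from array import array


def simulate(demand, capacities, steps):
    """Hypothetical time-step loop using array.array instead of NumPy."""
    n = len(demand)
    # array('d') stores C doubles in a flat buffer, similar to a 1-D float64 ndarray.
    unmet = array('d', [0.0]) * n
    for _ in range(steps):
        for hour in range(n):
            residual = demand[hour]
            for cap in capacities:
                # Each unit serves as much of the remaining demand as it can.
                residual -= min(cap, residual)
            unmet[hour] += residual
    return unmet


demand = array('d', [5.0, 12.0, 8.0])
capacities = array('d', [4.0, 3.0])
result = simulate(demand, capacities, steps=2)
print(list(result))
```

Because no call crosses into C extension code inside the loops, the whole hot path stays in pure Python, where the JIT can work on it. Whether that beats the NumPy version for a given program still has to be measured.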
bje_ has joined #pypy
bje_ has quit [Quit: Leaving]
wleslie has quit [Quit: ~~~ Crash in JIT!]
ronan has joined #pypy
tsaka__ has joined #pypy
ekaologik has joined #pypy
ronan has quit [Ping timeout: 245 seconds]
tsaka__ has quit [Ping timeout: 240 seconds]
dddddd has joined #pypy
tsaka__ has joined #pypy
bje_ has joined #pypy
bje_ has quit [Quit: Leaving]
ekaologik has quit [Quit: https://quassel-irc.org - Komfortabler Chat. Überall.]
tsaka__ has quit [Ping timeout: 260 seconds]
<mattip> tinynumpy is based on ctypes. A CFFI backend would be faster, and could also support structured dtypes
CrazyPython has joined #pypy
Taggnostr2 has joined #pypy
rubdos has joined #pypy
Taggnostr has quit [Ping timeout: 248 seconds]
CrazyPython has quit [Remote host closed the connection]
CrazyPython has joined #pypy
CrazyPython has quit [Read error: Connection reset by peer]
tsaka__ has joined #pypy
CrazyPython has joined #pypy
CrazyPython has quit [Read error: Connection reset by peer]
_whitelogger has joined #pypy
<michelp> hi @arigato I just wanted to thank you again for your help, I now have a fully working example of User Defined Types with GraphBLAS: https://github.com/michelp/pygraphblas/blob/master/pygraphblas/demo/User_Defined_Types.ipynb
<michelp> happy new year!
marvin has quit [Remote host closed the connection]
marvin_ has joined #pypy