cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | if a pep adds a mere 25-30 [C-API] functions or so, it's a drop in the ocean (cough) - Armin
jcea has quit [Remote host closed the connection]
jcea has joined #pypy
xcm has quit [Read error: Connection reset by peer]
xcm has joined #pypy
jcea has quit [Remote host closed the connection]
jcea has joined #pypy
BPL has quit [Quit: Leaving]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
jcea has quit [Quit: jcea]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
Ai9zO5AP has quit [Ping timeout: 276 seconds]
lritter has quit [Ping timeout: 268 seconds]
lritter has joined #pypy
dddddd has quit [Remote host closed the connection]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
jvesely has quit [Quit: jvesely]
oberstet has quit [Ping timeout: 240 seconds]
oberstet has joined #pypy
<mattip> anyone around to rubber-duck the no_collect issue in _siphash24?
<mattip> it seems any function that returns more than one result cannot be inside a @rgc.no_collect
oberstet_ has joined #pypy
oberstet has quit [Read error: Connection reset by peer]
kingsley has quit [Remote host closed the connection]
<fijal> mattip: returning 2 results creates a tuple
<fijal> so indeed, it can't be
<mattip> couldn't the inlined function push/pop the results rather than make a tuple?
<mattip> the function is marked @always_inline
<cfbolz> mattip: after inlining there is an optimization that is supposed to remove the malloc
jcea has joined #pypy
<mattip> maybe the analysis is done before malloc removal?
<fijal> mattip: it could if we had an assembler backend
<fijal> otherwise it's a bit hard
<mattip> ok
<fijal> but I *think* it's a good idea to rely on analysis before malloc removal, just in case
<mattip> in this case I think it is causing aarch64 translation of py3.6 to fail, so something needs to change
<mattip> what would be the cost of removing the @rgc.no_collect from _rsiphash24 ?
<cfbolz> mattip: I think removing the no collect will fail differently
<cfbolz> In any case, I can try to look in detail why the malloc is not removed, but probably only tonight
<mattip> cool, thanks
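(A minimal, hypothetical sketch of the pattern being discussed, not the actual _rsiphash24 code: a helper returning two values is lowered to "allocate a tuple, return it", so calling it from a @rgc.no_collect function only passes the collect analysis if inlining plus malloc removal manage to eliminate that tuple.)

```python
from rpython.rlib import rgc
from rpython.rlib.objectmodel import always_inline

@always_inline
def _mix(a, b):
    return a + b, a ^ b          # returning a pair allocates a tuple

@rgc.no_collect
def _round(a, b):
    hi, lo = _mix(a, b)          # safe only if the tuple gets malloc-removed
    return hi * 31 + lo
```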
xorAxAx has quit [Ping timeout: 240 seconds]
<cfbolz> mattip: if something is wrong with malloc removal on aarch64, that would be an important problem to find anyway
<cfbolz> fijal: feel like taking a look at my blog post?
<arigato> re no_collect: it's "easy" to write a test that fails, but that will only be because by default in tests there is no inlining and/or malloc removal
<mattip> ahh. bummer
<arigato> I suspect for some obscure reason the inlining decides not to inline that function on aarch64
<arigato> even though it says "always_inline"
<fijal> cfbolz: sure
<arigato> mattip: thanks a lot for caring about version tags for pypi
<mattip> :)
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
<arigato> cfbolz: fwiw, I get for this url: "403 Invalid security token"
<mattip> what about
<arigato> works, thanks
<Dejan> awesome
antocuni has joined #pypy
<Dejan> I am sure RapidJSON has some tradeoffs
BPL has joined #pypy
BPL has quit [Remote host closed the connection]
Ai9zO5AP has joined #pypy
<arigato> cfbolz: in case you didn't know, there is a missing picture (New York Times dataset)
<arigato> "Right now I only consider ASCII strings for caching that do not contain any escapes": would it be simple to cache in a dict {"non-parsed repr": "parsed string"} so that you cache all strings and also avoid re-parsing them?
<cfbolz> arigato: oh thanks
<cfbolz> arigato: yes that would be possible, but it seems there really aren't very many strings with escapes
dddddd has joined #pypy
<arigato> OK
<cfbolz> arigato: I see single-digit percentages
<arigato> I'm more afraid about non-English languages, but I guess that typically uses UTF-8 anyway and not an escape
<cfbolz> arigato: right, but eg the wikidata dataset has tons of languages
<cfbolz> But yes, it would definitely be a possible extension
<cfbolz> arigato: maybe I should write 'so far' :-)
<cfbolz> Ah, 'right now'
<arigato> I'm just saying, strings using random utf-8 characters are not considered as containing escapes, right?
<cfbolz> arigato: I don't remember, honestly
<cfbolz> Will check when I'm home
<arigato> :-)
<cfbolz> arigato: yes, it works exactly like you said
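(A hedged illustration of arigato's suggestion, not PyPy's actual decoder: cache decoded strings keyed by their raw, still-escaped representation, so a string that occurs repeatedly, with or without escapes, is parsed once and reused afterwards.)

```python
_string_cache = {}

def decode_string(raw):
    """raw: the bytes between the quotes, untouched."""
    try:
        return _string_cache[raw]
    except KeyError:
        value = raw.decode("utf-8")   # the real thing would also unescape \n, \uXXXX, ...
        _string_cache[raw] = value
        return value
```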
<cfbolz> arigato: did you like the draft?
BPL has joined #pypy
jcea has quit [Remote host closed the connection]
jcea has joined #pypy
BPL has quit [Quit: Leaving]
<arigato> cfbolz: yes
<cfbolz> arigato: cool :-)
<arigato> if nothing else it's also a good summary of the current situation in pypy
froztbyte has quit [Quit: Lost terminal]
antocuni has quit [Ping timeout: 240 seconds]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
<mattip> marmoute: truth be told, I don't really use the website for day-to-day work.
<mattip> Perhaps if we start to use the other pieces: issues, merge requests, we will discover problems
<tumbleweed> I've seen that aarch64 memory corruption bug on pypy3.6 too, and on pypy2.7 translated with pypy2.7 7.1.1 (no JIT). So it only translates correctly on 7.2.0
oberstet_ has quit [Ping timeout: 240 seconds]
oberstet has joined #pypy
<cfbolz> I posted the blog post
<cfbolz> however, the blog seems broken for me, can't click any links
<mattip> yes, something seems wrong
<cfbolz> grumble
<mattip> I published it again and now the links worked
<mattip> it seems suspicious that the preview link you posted did not work
<cfbolz> hrm, ok
<cfbolz> mattip: thanks for fixing it :)
<mattip> uhh, ok?
<cfbolz> blogger is really old and creaky
<mattip> yup
<wleslie> formatting code in it is challenging, I'm impressed you've done it so many times
<cfbolz> wleslie: I use https://highlightjs.org/
<wleslie> neat
<wleslie> I left blogspot behind for nikola which lets me write in reST, but I'll keep that in mind
<cfbolz> wleslie: we should have used our own domain from the start, the cost of migrating now is pretty high
<wleslie> I don't think it was an unreasonable choice at the time; we had no way to know that it'd become kind of abandonware. google hadn't abandoned anything at that point. now they have an entire cemetery of the stuff.
<Dejan> I can barely see the class name on that example on highlightjs.org
<wleslie> Oh right; and I have javascript disabled on this page.
<wleslie> at least it fails sensibly.
<wleslie> after enabling it, I see some prolog.
jvesely has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
<mattip> it turns out we can back up the entire blogger content as a 7.8MB XML file. Now all we need is an alternate site and a formatter to import it there
<fijal> cool!
<fijal> mattip: if you find a solution we can host that you like, I can host it
<mattip> hugo is all the rage now for static sites, I don't know how it does comments
<fijal> I'm happy not to do comments
<fijal> as in, "send us an email if you want to comment"
<wleslie> if it's like nikola you can use disqus or similar
<wleslie> unmodified example: http://william-ml-leslie.id.au/
<mattip> we should make it part of the new website deployment
<mattip> ahh, I see, you connect Disqus to it https://gohugo.io/content-management/comments/
<mattip> another thing to add to the TODO list
* mattip off
mattip has left #pypy ["Leaving"]
* Dejan has Carbon X1 too
<Dejan> Awesome laptop, but too powerful for me (LOL), I will soon get Pinebook Pro (aarch64) for myself and give my Carbon X1 to my wife
antocuni has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
BPL has joined #pypy
<antocuni> cfbolz: I think there is a mistake in the blog post: the table "number of keys/unique keys" is repeated twice
<antocuni> instead of putting the table containing the number of maps vs hashtables
<antocuni> (nice blog post, btw)
<antocuni> also, [in CPython it depends](https://...) looks like a markdown link which was not handled correctly
<cfbolz> antocuni: thanks!
<cfbolz> Will fix when I'm home
<antocuni> ok
<_aegis_> is there any technical reason pypy can't have a special cased allocator for stuff like parsers that is opt in? like marking a block "I know when I return from this function, 99% of the allocations will be freed"
<fijal> _aegis_: why would you want that?
<_aegis_> kinda like turning on nogc, but also switching to a temp gc just for new allocations
<_aegis_> because if you combine all of your allocations you can isolate them from the main gc and dealloc them all at once
<Alex_Gaynor> It sounds like you're asking for explicit arena management.
<simpson> fijal: There's a recent video in the C++ world which advocates for arenas or regions that can be repeatedly bump-allocated and then discarded entirely at the end of the computation.
<_aegis_> I've seen this in rust too
<_aegis_> you basically allocate a big special vector for your data then drop it all in one chunk and it is very fast
<fijal> but they don't have a GC
<fijal> _aegis_: so this is how our GC works
<fijal> it does bump pointer allocation and on minor collection freeing is free
<fijal> (for stuff that does not survive)
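(A toy sketch of the bump-pointer scheme fijal describes; PyPy's real nursery lives in rpython/memory/gc/incminimark.py. Allocation is a single pointer bump, and objects that die before the next minor collection cost nothing to reclaim because the bump pointer is simply reset.)

```python
class ToyNursery:
    def __init__(self, size):
        self.size = size
        self.top = 0                 # bump pointer, in bytes

    def malloc(self, nbytes):
        if self.top + nbytes > self.size:
            self.minor_collection()
        result = self.top
        self.top += nbytes           # the allocation itself is one addition
        return result

    def minor_collection(self):
        # the real GC copies surviving objects to the old generation here;
        # the dead ones are freed implicitly by resetting the bump pointer
        self.top = 0
```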
<_aegis_> I'm partly responding to the json parser article, which complains that the GC isn't handling the mass allocation very well
<fijal> but it's the mass allocation that survives
<fijal> not mass allocation that dies immediately
<Alex_Gaynor> It's not even clear to me how explicit epoch management would work without either a) being a full tracing GC, b) allowing UAF, c) RAII + lifetime management
<_aegis_> oh, right! so can you flip it and just not use the nursery?
<fijal> you could
<_aegis_> I should've read more closely
<fijal> but there are problems there too
<fijal> the general issue is that you don't know upfront how big the json is
<fijal> if you knew, there are ways to improve it I would think?
<_aegis_> count commas, colons, non WS chars?
<_aegis_> doesn't work streaming though
<_aegis_> just for loads
<fijal> seems pretty specialized for special GC support to me?
<_aegis_> parsers are more common than json parsers
<fijal> one way would be to keep the OG string and put pointers there
<_aegis_> that's cute but again streaming?
<fijal> yeah that would not work with streaming
<fijal> note that this is a passion project, we could not get anyone to pay for fast json parser
<cfbolz> fijal: it also doesn't work for escapes
<fijal> because you move to something else before needing fast json parser, a lot of the time
<fijal> cfbolz: yep!
<_aegis_> if you add useful tools for json they can be applied to other parsers and special gc cases too
<fijal> cfbolz: but the general "special GC support for json" smells bad to me
<fijal> _aegis_: I'm not sure
<fijal> I mean, maybe, in principle, but *specific* general tools?
<_aegis_> well, optimizing long lived allocations in a parser situation is useful for most parsers no?
<fijal> I'm not sure
<fijal> "this object will live longer" is maybe a useful hint - but it's also one that it's easy to mismanage
<fijal> and it's not entirely clear to me if you can always predict nicely
<_aegis_> well what's the downside of mispredicting that?
<_aegis_> if it just ends up slower that's on the person who tried to do it?
<fijal> adding hints that make stuff "sometimes faster sometimes slower" seems like a bad idea to me
<_aegis_> I was thinking more broad, like "allocations in this thread will tend to behave like so" (in this case live longer)
<fijal> I doubt you can say so
<fijal> which allocations?
<_aegis_> similar to the nogc() block
<fijal> I think this is very much against the pypy spirit - have manual hints that are hard to predict and can be off
<_aegis_> is the pypy spirit to automatically detect that this block generates mostly long lived allocations and should skip the nursery?
<fijal> definitely more so - but so far allocating straight outside the nursery didn't yield the benefits we expected
<fijal> it's not very easy to predict upfront what will live longer what will die shortly
<fijal> cfbolz: did you try using SSE/AVX for that btw?
<fijal> or is the bottleneck in the GC copying?
<_aegis_> is there a point where you can say "everything in the nursery is still alive, let's just memcpy it somewhere"?
<fijal> no
<fijal> because you need to update the pointers and you need to do a lot of other bookkeeping
<fijal> but there would be no benefit to that either
<fijal> _aegis_: I strongly suggest understanding how our GC works and playing with it a bit before trying to have ideas
<cfbolz> fijal: that's a bit too harsh ;-)
<cfbolz> Yes, something along these lines should help, somehow
<cfbolz> But I never managed to find the right approach that helped in practice
Ai9zO5AP has quit [Ping timeout: 240 seconds]
<_aegis_> ok, if this is more annoying than interesting I'll be quiet
<_aegis_> you're doing a _lot_ of minor collections during parsing right?
<simpson> It depends on the parsing technique, but in general, I imagine there's a lot of temporaries that the JIT can see are temporary. The JIT might not be able to avoid every temporary allocation.
<cfbolz> simpson: the jit is not involved
<simpson> cfbolz: Oh, okay.
<fijal> _aegis_: sorry that's not what I meant. I think reading about the GC would help. There are also parameters to play with and playing with them is interesting, we can help with that
<fijal> to recap: allocating in old generation has been tried a few times and didn't work. If you want to try to do it again, we can try to help. If you want to convince us to try, you need to come up with a more detailed proposal that requires understanding how the GC works.
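(A hedged example of the "parameters to play with": PyPy's GC reads tuning knobs from the environment at startup, e.g. PYPY_GC_NURSERY for the nursery size, documented in pypy/doc/gc_info.rst. The script name below is made up for illustration.)

```python
import os
import subprocess

# run the workload under a larger nursery to see how minor-collection
# frequency changes
env = dict(os.environ, PYPY_GC_NURSERY="16MB")
subprocess.run(["pypy", "parse_big_json.py"], env=env, check=True)
```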
Ai9zO5AP has joined #pypy
<fijal> we were thinking for example to enlarge the nursery for JIT tracing temporarily
<fijal> so you end up with a setup where you know when to stop and when to collect
<fijal> but it's a sizable research project I think?
<fijal> _aegis_: apologies, those are generally interesting questions
<_aegis_> ok, thank you, no worries
<_aegis_> my first thought on reading the gc doc was maybe the nursery size matters a lot for parsing a huge json file?
<_aegis_> and you could even dynamically resize it based on frequency of minor collections
<arigato> one other point is that I bet that most of the time, "many young objects survive" means something like "50% instead of only 10% of the objects survive"
<arigato> allocating everything immovable in the first place is still more costly than using the nursery and copying the 50% away
<_aegis_> yeah that's what I was wondering about re keeping the nursery
<cfbolz> It still feels to me like something ought to be possible
<cfbolz> But nothing I tried worked
<fijal> same
<arigato> same :-)
<cfbolz> Damn
<_aegis_> the other way to keep a nursery would be to allocate a new one, which wouldn't require updating pointers?
<_aegis_> like, promote the whole thing in place
<_aegis_> and use a new memory area for the next nursery allocation
<arigato> ah, interesting idea... you'd still need to update some flags in all objects
<_aegis_> which you could decide during minor collection
<_aegis_> based on how many objects lived
<arigato> but how do you know how many objects are alive? you'd need to walk the nursery another time first
<_aegis_> sure, you're just saving the object moving step
<arigato> maybe it's a good idea
<arigato> you'd walk the nursery, write a list of objects that are alive, and either the list overflows some fixed bound or not
<_aegis_> can't you know how many are alive by how many you either freed or would've promoted?
<arigato> no, we walk the alive objects (only)
<_aegis_> sure but you can keep track of how much of the nursery that is somehow
<arigato> yes
<_aegis_> if you either know how many objects / bytes are alive; or how many are dead; you can compute the other
<arigato> I fear that the slow-down for most "regular" minor collections won't justify the occasional speed-up, though
<_aegis_> can't this be faster than actually doing the minor collection?
<cfbolz> arigato: in unrelated GC news, I really liked this talk and think you might enjoy it: https://youtu.be/c1UBJbfR-H0
<cfbolz> arigato: it's about compacting C heaps without moving objects, by doing fun tricks with remapping pages
<arigato> you can't know how many objects or which total size is alive or dead, without walking all alive objects, and that's a sizeable cost
<arigato> cfbolz: thanks for reminding me
<cfbolz> arigato: oh, did I already link it?
<_aegis_> ok, but you at least walk once and keep a list then use that list for either the rest of the minor collection or the nursery promotion
<arigato> _aegis_: maybe we can activate the slower "do extra pass first to compute the total size" only if the previous minor collection turned out to move out most of objects?
<_aegis_> so the overhead is the list, and you don't need full pointer sizes for that, just nursery indices
<_aegis_> and yeah this can all be based on other heuristics
<cfbolz> arigato: I am wondering whether you can do a bit better than the talk if you control the allocator, and whether CPython could adopt something like it
<_aegis_> you can do both nursery resizing and nursery promotion based on tracking minor allocation success
<_aegis_> / frequency
<arigato> _aegis_: yes, it's probably worth playing around with. note that I fear that even a compact list is bad because it trashes at least the L1 cache
<arigato> meaning we'll load the L1 cache twice per surviving object, instead of once
<_aegis_> per object?
<arigato> per surviving object
<_aegis_> surely you'd get at least a few iterations of the list from a cache line
<arigato> no I mean, the same actual object will need to be loaded into the L1 cache twice, instead of just once
<arigato> (ignoring the cost of the list itself)
<_aegis_> I was also thinking maybe there's a way to do some of the work and just stop / roll it back cheaply midway
<arigato> (for reference, the sizes are on the order of 1MB for the nursery, of which 10%-20% typically survives)
<_aegis_> guarding this with a heuristic could work out well though, you don't even bookkeep unless the nursery would've been ok a certain % of times
<arigato> yes
<arigato> or just "the previous minor collection saw more than 50% surviving objects", which we can know very cheaply
<arigato> and which can be nice because it is relatively rare but can occur for a wide range of reasons
<arigato> (json, JIT, or really the user program allocating like crazy)
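(Hedged pseudocode for the heuristic arigato sketches: only pay for the extra counting pass when the previous minor collection already saw an unusually high survival rate, and in that case promote the nursery in place, as _aegis_ suggested, instead of copying everything out. All names here are made up for illustration; this is not PyPy's GC code.)

```python
HIGH_SURVIVAL = 0.5

def minor_collection(gc):
    if gc.last_survival_ratio > HIGH_SURVIVAL:
        alive = gc.walk_and_count_alive_bytes()      # the extra pass
        if alive > HIGH_SURVIVAL * gc.nursery_size:
            gc.promote_nursery_in_place()            # fix flags, keep addresses
            gc.switch_to_fresh_nursery()
            gc.last_survival_ratio = alive / float(gc.nursery_size)
            return
    copied = gc.copy_survivors_to_old_generation()   # the usual minor collect
    gc.last_survival_ratio = copied / float(gc.nursery_size)
```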
<_aegis_> and when you're coming from the jit instead of rpython you can remember which code areas do this and hit the first block too
<_aegis_> the corollary to all of this is "what do we do when the user starts freeing objects in the nursery"
<the_rat> cfbolz: thanks for the great blog post :) I have a msgpack parser in Python that could benefit from some of the techniques you describe. Are some of these tools exposed in a reusable way?
<_aegis_> the json parser is rpython? so jitted python can have different sorts of overhead too
<_aegis_> I have a really tiny kinda optimized python bson parser as well, which is part of my interest here
<_aegis_> in pypy3 it beats out the c ext for dump but not load
<_aegis_> (for a small object though)
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
antocuni has quit [Ping timeout: 240 seconds]
<cfbolz> the_rat: ah, interesting
<cfbolz> the_rat: not yet, but I'll think about it
<the_rat> Cool :)
<cfbolz> the_rat: are the keys repeated in the msgpack format? (I don't know how it works, sorry)
<the_rat> Yes, it's a similar structure to json, but packed in a binary format
<cfbolz> the_rat: OK, so it has the same problems
<the_rat> My use case is more about decoding many similar documents than a big file, though... gotta think how generalizable that is
<cfbolz> the_rat: yes, many similar documents is helped less well than the big file
<cfbolz> But it still helps
<cfbolz> I did measure it, but wanted to ship the post and so I didn't write it up
<cfbolz> the_rat: are the strings prefixed with their lengths? Ie they don't need escaping?
antocuni has joined #pypy
mwhudson has quit [Ping timeout: 250 seconds]
<_aegis_> it seems like cpython does far more aggressive string interning than pypy
xcm has quit [Read error: Connection reset by peer]
xcm has joined #pypy
<Alex_Gaynor> cfbolz: yes, msgpack strings are length prefixed
<cfbolz> _aegis_: in general? Yes
<cfbolz> Alex_Gaynor: ok
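(A hedged, pure-Python sketch of the "maps" idea from the blog post, which is what would help the_rat's "many similar documents" case: documents that use the same keys in the same order share one interned key tuple, so repeated keys are decoded and allocated only once. Illustration only, not a PyPy API.)

```python
_shapes = {}

def finish_object(keys, values):
    key_tuple = tuple(keys)
    key_tuple = _shapes.setdefault(key_tuple, key_tuple)  # intern the key layout
    return dict(zip(key_tuple, values))
```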
alexge50 has quit [Changing host]
alexge50 has joined #pypy
mwhudson has joined #pypy
mwhudson has quit [Changing host]
mwhudson has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
jcea has quit [Remote host closed the connection]
jcea has joined #pypy
Ai9zO5AP has quit [Ping timeout: 265 seconds]
Ai9zO5AP has joined #pypy
CrazyPython has joined #pypy
CrazyPython has quit [Remote host closed the connection]
inhahe has quit []
CrazyPython has joined #pypy
inhahe has joined #pypy
jcea has quit [Remote host closed the connection]
CrazyPython has quit [Remote host closed the connection]
jcea has joined #pypy
lritter has quit [Ping timeout: 268 seconds]
jvesely has quit [Quit: jvesely]
lritter has joined #pypy
jcea has quit [Remote host closed the connection]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
speeder39_ has joined #pypy
antocuni has quit [Ping timeout: 240 seconds]