<headius[m]> works on a regular JDK?
<chrisseaton[m]> It's a Ruby app
<chrisseaton[m]> Oh sorry I see what you mean - ah yes it takes Graal dumps
<headius[m]> GraalVM is much slower on everything I test but it would be good to figure out why
<chrisseaton[m]> I don't know if C2 can still generate the same BGV files or if it's historical?
<headius[m]> BGV?
<chrisseaton[m]> That's the file format IGV accepts
<headius[m]> ah, well I believe it still can
<chrisseaton[m]> I don't see any code for it in their repo, apart from the vendored Graal sources
<headius[m]> it would be strange if they removed it... I have hooked IGV up to Hotspot many times
<headius[m]> looks like it is in debug builds
ur5us_ has joined #jruby
<headius[m]> would be good to try on graal anyway and figure out why everything seems slower
<headius[m]> ok now that is more like it
<headius[m]> so with no arity-splitting of dig...
<headius[m]> normal:
<headius[m]> indy
<headius[m]> I have posted to mlvm-dev... responding to my email there that brought up this problem five years ago
<headius[m]> chrisseaton: FWIW I modified your bench to the "manual loop" form so the eval'ed loop harness is not skewing results so badly
<chrisseaton[m]> What does 'no arity-splitting' mean?
<headius[m]> manually providing unboxed digoverloads
<headius[m]> dig overloads
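For illustration, a rough Ruby sketch of the arity-splitting idea (the real overloads are Java methods inside JRuby; the names below are made up): a varargs entry point branches to fixed-arity helpers so the common one- and two-key cases don't have to go through a boxed argument array.
```ruby
# Rough sketch of arity splitting (illustrative only; the real change lives in
# JRuby's Java sources): the splat entry point immediately branches to
# fixed-arity helpers for the hot small-arity cases.
def my_dig(obj, *keys)
  case keys.length
  when 1 then dig1(obj, keys[0])
  when 2 then dig2(obj, keys[0], keys[1])
  else keys.reduce(obj) { |o, k| o && o[k] }  # general fallback
  end
end

def dig1(obj, k)
  obj && obj[k]
end

def dig2(obj, k1, k2)
  (o = obj && obj[k1]) && o[k2]
end

p my_dig({ a: { b: { c: 1 } } }, :a, :b, :c)  # => 1
```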
<headius[m]> so this is still using arg[] path but much much faster
<headius[m]> there was no perf issue in dig, really
<headius[m]> just bad varargs logic in method handles
<headius[m]> untold perf lost due to this but should be greatly improved across the board now
<chrisseaton[m]> Is this the thing we reported a year or so ago?
<chrisseaton[m]> Ah I see your email
<headius[m]> no that has been fixed in JDK and there was a modest workaround (make sure all types in target method signature get resolved by method's classloader first)
<headius[m]> wish we had some resources to set up perf regression testing
<chrisseaton[m]> Is this an identity thing - does the array have to be copied because it has identity so must exist?
<headius[m]> I don't see any good reason for it
<headius[m]> other than maybe they wanted to reuse logic that is untyped (Object, Object[]) for populating the array
<headius[m]> it all inlines but does not eliminate the middle man
<chrisseaton[m]> Ha, it has Duncan in the email chain!
<headius[m]> yeah and he was using Object[] and didn't see it
<headius[m]> I never dug back into it after that
<headius[m]> if people would just trust me when I say stuff is broken we'd save a lot of time
<chrisseaton[m]> Do you use BIPS with the string version of `report`?
<headius[m]> no, while loop
<headius[m]> or do you mean in general?
<chrisseaton[m]> In general, if you're just whipping up a benchmark?
<headius[m]> depends on what I am measuring
<headius[m]> I don't use BIPS for really small stuff
<headius[m]> for something like this I will use the while loop form
<headius[m]> if it is big I just use it like you did
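Roughly the two harness styles being compared here, with made-up data and iteration counts:
```ruby
# benchmark-ips form vs the "manual loop" form (illustrative numbers only)
require 'benchmark/ips'

DATA = { a: { b: { c: 1 } } }

# benchmark/ips: the generated harness loop and the block can optimize together
Benchmark.ips do |x|
  x.report('dig') { DATA.dig(:a, :b, :c) }
end

# manual loop: the timed while loop is plain, statically visible code
5.times do
  t = Time.now
  i = 0
  while i < 1_000_000
    DATA.dig(:a, :b, :c)
    i += 1
  end
  puts Time.now - t
end
```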
<headius[m]> this BIPS issue is real though... with TR optimizing the loop and the benchmark together you are getting a very different view of perf
<chrisseaton[m]> Yes they should optimise to literally the same machine code
<headius[m]> all that tells you is that doing the exact same thing N times optimizes to less work than N * something
<headius[m]> it doesn't give you a very good picture of the cost of something in isolation
<chrisseaton[m]> Yeah I don't think there's a great solution to that - we've got some blackholes now actually - I think there's even a node for it... but they're also artificial
<chrisseaton[m]> Generally I try to do real IO for input and real IO for output
<headius[m]> there needs to be some way to insert a boundary between the measurement and the work
<chrisseaton[m]> That does things that a blackhole doesn't even do - like flatten strings
<headius[m]> yeah that helps but it still can see the loop working against the same input unless you get new input per loop
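A sketch of that real-IO style, assuming a Unix-ish system: each iteration gets fresh input from outside the process and the result escapes through real output, so the body cannot be treated as loop-invariant or dead code.
```ruby
# Real input in, real output out (illustrative; Unix-only paths)
input = File.open('/dev/urandom', 'rb')
out   = File.open(File::NULL, 'wb')

TABLE = { 0 => { x: 'even' }, 1 => { x: 'odd' } }

100_000.times do
  key = input.read(1).unpack1('C') & 1   # fresh input every iteration
  out.write(TABLE.dig(key, :x))          # result escapes via real IO
end
```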
<headius[m]> this is one reason I was opposed to the BIPS change
<headius[m]> not only does it make it useless for comparing an impl like TR with other impls that can't eliminate the loop, it also gives you a bad view of perf
<headius[m]> the version that did not inline was better
<headius[m]> and for small things like this we won't even compile the loop, you have the added cost of a slowly interpreted loop surrounding the bench body
<chrisseaton[m]> Doesn't bips call the whole method every second?
<chrisseaton[m]> So it doesn't just enter the method and sit there - so you don't need OSR.
<headius[m]> no, the "call_times" method is generated per benchmark and contains the outer loop
<headius[m]> so it is evaluated code called once for every bench body (maybe twice for warmup + bench, but that is still below our JIT threshold)
<chrisseaton[m]> Yes it calls `call_times` with `times` set so it lasts about a second, and then calls it `n` times for your timing value, so `3` or whatever
<chrisseaton[m]> The loop inside `call_times` gets hot, and triggers compilation of `call_times`, which even without OSR, you then re-enter again
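A hedged sketch of that shape; this is not the gem's exact generated source, just the structure being described:
```ruby
# The per-benchmark timing method roughly looks like this: the whole inner
# loop lives in one method, and each measurement interval is a single call,
# so the method itself is entered only a handful of times per run.
def call_times(times)
  i = 0
  while i < times
    yield            # benchmark block body runs here
    i += 1
  end
end

# warmup + timing are just a few big calls:
3.times { call_times(1_000_000) { { a: { b: 1 } }.dig(:a, :b) } }
```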
<headius[m]> so for five seconds you get five calls... but our threshold is 50
<headius[m]> granted we have meant to make that threshold be based more on the work done but that is how it is now
<chrisseaton[m]> I thought the loop back-jumps would add to the threshold but maybe not
<headius[m]> enebo has played with that but not landed anything
<chrisseaton[m]> Maybe BIPS should start with `times = 1` and increase more slowly to a higher number
<headius[m]> well, perhaps warmup should include some heavy calls with low iters to ensure it actually warms up the harness
<chrisseaton[m]> Yes I mean during warmup, start with `times = 1` for the first second
<headius[m]> which may be the same thing you ... yeah
<headius[m]> but still... I don't want the loop to optimize with the bench block
<headius[m]> I don't think any of us really want that because it taints the measurement of the block
<headius[m]> so it looks like this does jit call_times but in the middle of the bench run
<headius[m]> I see the first dozen or so dots (printed when call_times is called) tick up slowly, and then they speed way up
<chrisseaton[m]> I have an approach I called continuous-adaptive-benchmarking... can't find it now - it's sort of designed to be unstable so it doesn't rely on becoming stable
<headius[m]> that is the main bench run with logging niblets added
<chrisseaton[m]> This never picks some `times` and doesn't have separate warmup and measurement - it just runs
<chrisseaton[m]> `times` always varies so it's not a constant
<headius[m]> yeah I think we need to do something about the BIPS numbers because there are a lot of benchmarks out there using it now
<headius[m]> some for really tiny units of work
<headius[m]> and it should probably try to subtract the cost of benchmarking an empty block to normalize across impls that can or cannot inline that
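A sketch of that normalization idea, assuming the report object exposes per-entry ips as recent benchmark-ips versions do:
```ruby
# Measure an empty block with the same harness and subtract its per-iteration
# cost, so implementations that can or cannot inline the harness are compared
# on the work inside the block (illustrative only).
require 'benchmark/ips'

report = Benchmark.ips do |x|
  x.report('empty') { }
  x.report('dig')   { { a: { b: 1 } }.dig(:a, :b) }
end

empty, dig = report.entries
net = (1.0 / dig.ips) - (1.0 / empty.ips)
puts "net seconds per iteration: #{net}"
```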
<chrisseaton[m]> Or some sort of weird debug yield operation that pastes in the code statically
<headius[m]> yeah
<chrisseaton[m]> Tricky not to become something unrepresentative the other way though
<headius[m]> it's also going through proc.call which has more overhead than either yield or a simple method call on JRuby and MRI
<headius[m]> been meaning to get that inlined through but yeah
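A quick way to see that overhead (numbers vary by implementation; the helper method names here are made up):
```ruby
# Dispatching the benchmark body via Proc#call vs a plain yield
require 'benchmark/ips'

def via_yield
  yield
end

def via_call(blk)
  blk.call
end

Benchmark.ips do |x|
  x.report('yield')     { via_yield { 1 + 1 } }
  x.report('proc.call') { via_call(proc { 1 + 1 }) }
  x.compare!
end
```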
<headius[m]> well good news is that the version of MH varargs logic that takes a collector function also fixes the problem, so I don't have to roll my own
justinmcp has quit [Quit: No Ping reply in 180 seconds.]
justinmcp has joined #jruby
<headius[m]> chrisseaton: thanks for bringing this to our attention... I had either forgotten this was a problem or assumed they had fixed it
<headius[m]> should speed up tons of method calls in JRuby
<headius[m]> there will be a separate PR to improve dig itself, but this addresses the issue you found
ur5us_ has quit [Ping timeout: 258 seconds]
<headius[m]> well I think 10x improvement is good enough for today, but I sure wish those boxed call paths optimized better on Graal JIT
<headius[m]> need to double check they are inlining, but with all three PRs in place they should be
<headius[m]> chrisseaton: additional improvements from dig overloading and another indy call tweak, I don't know how much if any of this will land in 9.2.17.0 but it is in the pipeline
valphilnagel has joined #jruby
drbobbeaty has quit [Ping timeout: 250 seconds]
drbobbeaty has joined #jruby
valphilnagel has quit [Quit: Leaving]
snickers has joined #jruby
<chrisseaton[m]> All looks good! I understand what you mean by arity splitting now.
<enebo[m]> chrisseaton: if you look at some generated Scala stuff they will overload some methods out to 30 params to avoid the arg boxing in a primitive array
<enebo[m]> Or they did anyways. I don't look at Scala very much :)
<headius[m]> enebo: some edge cases I have to look into on those PRs but it would be nice to have at least the collect fix in .17
<enebo[m]> collect fix and you fixed that getAndCache already right?
<headius[m]> that is in a PR still as well
<headius[m]> there are four of them 😀
<enebo[m]> ok that one seems like a no-brainer because it is obviously wrong
<headius[m]> that is an easy one yeah
<enebo[m]> dig split I don't personally care for .17 but it looks safe so I am indifferent
<enebo[m]> collectAs looks pretty substantial but as you said there are corners to work out still, but if you can be satisfied it seems like a pretty small change for a nice win
<headius[m]> yeah the whole set combined helps a lot of different configurations
<headius[m]> I will work on making sure everything is super green today and we can decide
<enebo[m]> ok
<headius[m]> enebo: found my issue on the collect PR... I was not adding 1 for a method_missing binding
<headius[m]> added tests and added JI specs with indy because that would have caught it too
<headius[m]> so that is back to 100% again 😀
<headius[m]> there is something in 9.2.16.0 that prevents -X+C -Xjit.threshold=0 from running optcarrot btw
<headius[m]> at least that release, maybe earlier
<headius[m]> visibility checking bug it seems
<enebo[m]> but only visibility on something bytecode compiled
<headius[m]> yeah
<headius[m]> so the compiled call is not properly checking visibility (comes up as private but should be callable)
<enebo[m]> I have been impressed how simple it has been to not have these mismatches since moving to IR
<enebo[m]> Or it feels like we have not had very many
<headius[m]> oh yeah it has been much cleaner that way
<headius[m]> enebo: the hits keep coming
<headius[m]> just ran into this by accident while working on these optimizations
<enebo[m]> hah
<enebo[m]> I suppose we do not need to worry about a java package becoming another object
<headius[m]> interestingly the subpackages gotten from package modules seem fine, as fast as this one is after the fix
<headius[m]> only affected these five that are special
<enebo[m]> makes sense, we probably do not use constants to access them
<headius[m]> I am looking to see if the other ones can be cached more but the cost of accessing them in that loop is almost zero already
<headius[m]> ahh right, the package modules add a method on first access so it is just a get then
<headius[m]> so that is already good
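The chat doesn't spell out which five packages are special, but this is the kind of top-level Java-integration access being discussed (requires JRuby):
```ruby
# Top-level package helpers like `java` and `org` are hit constantly in
# Java-integration code, so caching the package objects they return matters.
require 'java'

puts java.lang.System.currentTimeMillis   # top-level `java` helper, then nested packages
list = java.util.ArrayList.new
list.add('hello')
puts list.size                            # => 1
```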
<headius[m]> enebo: all those PRs are ready to go
<headius[m]> so just a matter of deciding
subbu is now known as subbu|away
<enebo[m]> ok
<enebo[m]> I just merged the getAndCache fix, we already decided on that
<enebo[m]> dig looks ok but I don't see the benefit of the names dig1 and dig2. Feels like the params point that out
<headius[m]> I had to use a different name because the arity splits have the same signature as dig
<headius[m]> I mean as dig1/dig2
<enebo[m]> oh on RubyObject since things are also RubyObjects
<enebo[m]> ah I see
<headius[m]> yeah
<headius[m]> they are static but signature was same so it rejects it
<headius[m]> could make them package private
<headius[m]> they are just unrolled versions of dig
<headius[m]> well sort of unrolled... they recurse but only once or twice
<enebo[m]> So I don't really see much risk on this one so I guess I don't see any harm here. I have been hoping we can reduce the changeset on 9.2 overall
<headius[m]> heh yeah
<headius[m]> it has very little impact on perf compared to the collect PR
<headius[m]> it could be punted to .18
<enebo[m]> Are these static on RubyObject due to history or visibility?
<headius[m]> history... the other dig was
<enebo[m]> Or even 9.3
<headius[m]> CRuby basically just has the one function so we put ours on object as static
<enebo[m]> public elsewhere would remove the naming artifact
<headius[m]> true
<enebo[m]> So I am ok if we land this but it is not part of the brief on 9.2.
<enebo[m]> since I saw chrisseaton was recently over on this tab...do you know of a particular use case where dig is used?
<headius[m]> it looked from your bench like you did some auditing
<chrisseaton[m]> Does the benchmark have the comments on how deep we found it being used? Can't remember.
<enebo[m]> Faster is great but it is more motivating if someone says "oh rails does it all the time"
<headius[m]> at 3 you had a comment "easy to find" and at 7 "possible to find"
<headius[m]> does not say where you found these levels though
<chrisseaton[m]> There wasn't some major obvious problem we were fixing. There are tons of them in Shopify's code-base but I don't think it's some inner-loop operation in Rails.
<chrisseaton[m]> It was also partly based on a conversation about pattern matching I think.
<enebo[m]> ah yeah if TR impl pattern matching in Ruby then dig is probably a reasonable way to search
<enebo[m]> although even for us we could use dig...we tend not to use our public exposed methods as much for that stuff because it usually involves argument parsing and stuff like that.
<enebo[m]> This PR does provide essentially the raw helpers though for us
<chrisseaton[m]> Not about directly using dig necessarily, but about recursion and JIT performance cliffs.
<chrisseaton[m]> There's no massive thing behind this - someone found it was slow, we fixed it, that's about it.
<enebo[m]> ah ok. yeah makes sense to fix stuff when you see it.
<chrisseaton[m]> I Tweeted it to amplify a beginner's contribution.
<enebo[m]> headius: so I guess I am neutral to ambivalent :)
<enebo[m]> chrisseaton: yeah
<headius[m]> on the dig PR?
<enebo[m]> yeah
<enebo[m]> it is not really worth this much so I am fine merging
<enebo[m]> I did not think this had any risk
<headius[m]> ok
<enebo[m]> It really would only have been risky if we both didn't realize we dropped the boxed path
<enebo[m]> or something like that
<headius[m]> ok, so how about collect PR
<enebo[m]> heh
<enebo[m]> I have been looking at both
<headius[m]> I know we wanted a minimal changeset for this one
<headius[m]> but these are nice
<enebo[m]> I am more nervous the more I read but I am also not sure how much more we can do to validate this
<enebo[m]> For something like RG upgrade I was hoping someone would notice an environmental sort of problem
<enebo[m]> It is complicated enough from an env perspective where it made me want some more time in the hope someones env happened to notice it
<enebo[m]> for indy though? Or changing invokers
<headius[m]> kalenp: I don't know if y'all can test out a prerelease easily yet but it would be nice for .17
<enebo[m]> I mean if our test suite does not catch this I feel it is less likely someone will just happen to hit it
<enebo[m]> especially if compile.invokedynamic is enabled
<headius[m]> yeah and workaround is disable it again
<enebo[m]> a large rails app is where I would expect something weird to fall out and I just don't think people roll with that flag in this case
<enebo[m]> like hitting bugs in either of these would be an esoteric enough use of Ruby that nothing we test would hit it
<enebo[m]> Seems like massive codebases are our only chance at that
<enebo[m]> (which is painting stuff in a not completely logical way, but I think it is generally true from a "will someone catch this problem on a dev branch in a week or two" perspective)
<enebo[m]> and also consider I doubt anyone runs any test suites with indy enabled
<enebo[m]> unless it is for perf regressions and they roll with it on in production perhaps?
<headius[m]> I did add spec:ji to indy for one of these PRs because it would have caught a regression
<headius[m]> we have most suites running with and without now
<enebo[m]> yeah I saw that. So that in itself may end up saving us in the future (as well as what you fixed in the PR)
<enebo[m]> So I am somewhat leery that we will add some risk but I don't know how to feel better about the risk
<enebo[m]> Waiting 2 months on jruby-9.2 won't do that either
<enebo[m]> So it is never or now in my mind
<enebo[m]> now == 9.3 no problems :)
<headius[m]> yeah that is true too
<enebo[m]> carrot and stick applies here maybe
<enebo[m]> If 9.3 has it then it is more motivation to move to 9.3
<enebo[m]> then the "never" would be: we want to do whatever we can to harden 9.2, so we decided to put that risk into 9.3
<headius[m]> heh well that is another angle
<headius[m]> we just won't be able to post about it until 9.3 then 😀
<enebo[m]> It is not some artificial carrot in my mind either since I do have some concerns something is broken and we will have no idea
<enebo[m]> It is good marketing whenever it is posted
<enebo[m]> and maybe better for 9.3 than the end of 9.2
<enebo[m]> I am willing to introduce the risk knowing it is adding it but I also think if we wait we will put it into something that people expect some risk with
<headius[m]> I guess I don't care where they land
<headius[m]> if collect doesn't land the rest are not really meaningful
<headius[m]> maybe the block thing
<enebo[m]> ok let's just put them on 9.3 for now
<enebo[m]> I would like 9.2 to be done and only address reported issues
<headius[m]> I will rebase and move them to 9.3
<headius[m]> the java package one is pretty simple but not a reported issue
<enebo[m]> I merged it
<enebo[m]> hahah I guess I didn't
<enebo[m]> The only risk I could possibly see here is if we have something else which is constantly putting new shit onto the package instance and by wiping it out we are wallpapering over some leak
<enebo[m]> but that leak only happens for the top-level method names
<headius[m]> it only does it for these four packages
<enebo[m]> seems extremely unlikely to me
<headius[m]> five
<headius[m]> the others are cached as methods by the package module logic
<enebo[m]> as you pointed out every other package is basically getting saved
<enebo[m]> So the other issue is someone is adding to package instance for whatever reason expecting it to go away
<enebo[m]> In that case I feel we are just fixing a bug rather than breaking a feature
<headius[m]> yeah
<enebo[m]> no one reported this but I don't see much risk and you opened it :)
<headius[m]> these might be cached somewhere else but it goes through a complicated process to get it
<headius[m]> so this skips the process
<enebo[m]> I did wonder
<enebo[m]> I mean I guess it is weird that the others work
<enebo[m]> but they are all rooted off of either Java or an existing package
<enebo[m]> so I assume that magic is in those two places
<enebo[m]> and these methods probably just do it a bit different and are lacking going through some code path
<headius[m]> yeah I did not investigate that but started to refactor it
<headius[m]> it could be cleaned up a lot on 9.3
<enebo[m]> I think it is at the end it does bindJavaPackageOrClassMethod
<headius[m]> yeah
<enebo[m]> singleton.addMethod(name.intern(), new JavaAccessor(singleton, packageOrClass, parentPackage, name));
<enebo[m]> almost feels like if we want to unconditionally bind, Ruby init could just bind this to Kernel and then we wouldn't need that code at all
<headius[m]> ahh I think these are like this so if you do not use java we do not initialize any of these packages at boot
<headius[m]> otherwise they could just get the package and bind it like that right away
<enebo[m]> yeah
<enebo[m]> I figured this is an attempt at not loading JI stuff but let's face it we do anyways
<headius[m]> well let's merge that one since there's a lot of accesses of those top level packages
<enebo[m]> yeah that one will be huge
<enebo[m]> for JI users it is sooo common
<enebo[m]> As for caching I cannot think of anything other than the above and neither of those are things we should worry about
<enebo[m]> My only comment is we maybe can share some different code later and not manually cache there
<headius[m]> yeah that is what I started but it was too big for 9.2
<enebo[m]> ah ok
<headius[m]> this is a simple fix for now
<headius[m]> merging!
<enebo[m]> cool. so we should be good for next week
<headius[m]> yeah nothing else new reported
<enebo[m]> I wish I had a cold beer waiting
<headius[m]> hah
<headius[m]> I do but they are all crowlers or imperial stouts
<enebo[m]> TGIF
<headius[m]> YOLO
<enebo[m]> TGIIS
<enebo[m]> "It's Imperial Stouts" is the end of that one
<headius[m]> I was wondering
<headius[m]> I should have chilled that Darkness 2014 I apparently have never checked in
<headius[m]> the crowlers are pilseners and a sour
<enebo[m]> 50 minutes in the freezer...good to go
<enebo[m]> I have a 2014. We can do a syrup chug
<headius[m]> don't forget to... nevermind
<enebo[m]> lol
<headius[m]> I was wrong, 2015
<headius[m]> the Bat
<enebo[m]> oh well I have that too
<enebo[m]> Wait I should check but I think I might have two of those
<headius[m]> I do which is why I'm cool just hitting it
<headius[m]> ok rebasing is going fine, and then I will just merge to master
<enebo[m]> barrel or no?
<headius[m]> regular
<headius[m]> I have barrel of a few other years but not that one
<enebo[m]> yeah I guess I don't have any of that year
<headius[m]> come on over, you can wear a mask and drink it through a straw
<headius[m]> collector is rebased and green, merging
<headius[m]> I will merge varargs tweak when green and then 9.2 and everything will be all set
drbobbeaty has quit [Read error: Connection reset by peer]
drbobbeaty has joined #jruby
<headius[m]> enebo: everything is merged everywhere
<headius[m]> time to watch Majin Buu wake up
drbobbeaty has quit [Read error: No route to host]
drbobbeaty has joined #jruby