<headius[m]> pretty good description
<headius[m]> Ruby flair
<headius[m]> kalenp: oh wow that's awesome, is there something public we could tweet from @jruby?
<headius[m]> I'd encourage you to go all the way to 9.2.11(.1) if possible, because every minor release improves performance (and .10 and .11 have some great stuff)
<headius[m]> kalenp: I have to run but if you tweet something at @jruby (or @headius) I'll see it and retweet
<chrisseaton[m]1> Hi is headius around?
<havenwood> headius[m]: ping
<headius[m]> Hey what's up
<chrisseaton[m]1> headius: remember this commit? https://github.com/jruby/jruby/commit/a7484ba4398365673d988a85157f4ce791e3248c. I see you using method handles to read for example `nil` out of the runtime, but how does it turn into a constant?
<headius[m]> An unchanged indy call site becomes a constant guarded by a safepoint
<chrisseaton[m]1> This call site reads `public final IRubyObject nil;` from `ThreadContext`, but the context is an argument, not a constant. Where do you add the logic to test anything?
<headius[m]> Test what?
<headius[m]> Once populated it should never change
<chrisseaton[m]1> If you get a different thread context, it's a different `nil` object isn't it?
<chrisseaton[m]1> Do you test that the context is what you expect it to be? How do you find that you've been called with a different context and so you need to re-read the nil object?
<headius[m]> That can't happen
<headius[m]> Well, it can but it doesn't matter
<headius[m]> It's the same nil
<headius[m]> Context is just used to get the nil instance, which can't be passed to a bootstrap method to be a "real" constant
<headius[m]> All contexts in a given runtime point to the same nil and the bytecode is not shared across instances
<chrisseaton[m]1> So the method handle expression is constant unless you told it not to be?
<chrisseaton[m]1> The method handles are to read a field from an object - does Java assume that's constant unless you tell it otherwise?
<chrisseaton[m]1> I see the `SwitchPoint` for normal calls
<headius[m]> That's correct
<headius[m]> SwitchPoint is just an explicit safepoint I control
<headius[m]> The handles don't read the field, I read it and set the call site to a constant value. If it never changes it propagates as a constant
<chrisseaton[m]1> So after compilation the `ALOAD 0` to get `ThreadContext` is redundant?
<headius[m]> Yes
<chrisseaton[m]1> And what you're using indy here for is really a kind of `load_runtime_constant` instruction
<headius[m]> It is dropped and does not appear in the native code
<headius[m]> Yes
<headius[m]> This is a pretty important characteristic of indy call sites, and we use it heavily
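The pattern described above (read the value once, then set the call site to a constant) can be sketched in plain Java. This is only an illustration, not JRuby's actual bootstrap code: `Context`, `NilSite`, and `NilSiteDemo` are hypothetical names, and `dynamicInvoker()` stands in for a real `invokedynamic` instruction, which Java source cannot emit directly.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical stand-in for JRuby's ThreadContext; only the `nil` field matters here.
final class Context {
    final Object nil;
    Context(Object nil) { this.nil = nil; }
}

// A call site of type (Context)Object that turns itself into a constant on first use.
final class NilSite extends MutableCallSite {
    private static final MethodHandle FALLBACK;
    static {
        try {
            FALLBACK = MethodHandles.lookup().findVirtual(
                NilSite.class, "fallback",
                MethodType.methodType(Object.class, Context.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    NilSite() {
        super(MethodType.methodType(Object.class, Context.class));
        setTarget(FALLBACK.bindTo(this));
    }

    // First call only: read nil through the context, then rebind the site.
    Object fallback(Context context) {
        Object nil = context.nil;
        MethodHandle constant = MethodHandles.constant(Object.class, nil);
        // Keep the (Context)Object signature, but ignore the argument from now on.
        setTarget(MethodHandles.dropArguments(constant, 0, Context.class));
        return nil;
    }
}

final class NilSiteDemo {
    static boolean sameNilAfterRebind() {
        try {
            Object nil = new Object();
            NilSite site = new NilSite();
            MethodHandle invoker = site.dynamicInvoker();
            Object first = (Object) invoker.invokeExact(new Context(nil));  // runs fallback
            Object second = (Object) invoker.invokeExact(new Context(nil)); // constant target
            return first == nil && second == nil;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameNilAfterRebind());
    }
}
```

After the first call the target no longer touches the context argument at all, which is why the JIT is free to drop the load that produced it once the site's target has stopped changing.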
<headius[m]> In theory we could treat larger expressions as guarded contestants too, but I haven't taken it that far. I have done prototypes of a compiler that emits method handles, which should specialize very well
<headius[m]> Constants
<headius[m]> (on mobile)
<chrisseaton[m]1> Yeah like template compilation - each instruction is an indy call site for the logic.
<headius[m]> Right, that's basically how we are using it in these cases. We could be using it a lot more, but more specialization penalizes startup and memory footprint so we have to balance it
<headius[m]> For literals, the cost of bootstrapping is worth it to be able to fold them as constants. For other things, it's a mixed bag
<chrisseaton[m]1> Yeah I can see constants literally become one instruction
<headius[m]> The downside is that this ties a piece of bytecode to a single JRuby runtime, which we mitigate by deferring bytecode compilation as long as we can
<headius[m]> The multiple runtime case is probably better served by not using invoke dynamic, if memory profile and warm up are more of a concern. In that case we can share one copy of the bytecode across runtimes
<headius[m]> Despite some early troubles, indy works really well for us now
<chrisseaton[m]1> I still see both control paths for fixnum and bignum, where I'd expect bignum to be left as a deoptimisation - for example a simple fib still has `subtractAsBignum` in the machine code.
<chrisseaton[m]1> Is indy not helping with that?
<chrisseaton[m]1> Or maybe C2 doesn't intrinsify `ExactMath`
<headius[m]> It's because fixnums are not value objects
<chrisseaton[m]1> Ah ok
<headius[m]> Every math operation creates a new object which could potentially not be a fixnum
<headius[m]> If you run this on Graal jit, you should see it reduced to long math
<chrisseaton[m]1> Yes the issue with unboxing
<headius[m]> And I believe floating-point math should optimize a bit better on C2
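A rough sketch of why both paths stay in the machine code, using plain Java rather than JRuby's own classes: `FixnumMath.subtract` is a hypothetical analogue of a fixnum subtract, and `Math.subtractExact` plus `BigInteger` stand in for the fixnum and bignum paths. Because the result is a boxed object that may be either type, callers cannot assume they are holding a primitive long.

```java
import java.math.BigInteger;

final class FixnumMath {
    // Hypothetical analogue of a fixnum subtract:
    // a Long on the fast path, a BigInteger when the long result overflows.
    static Number subtract(long a, long b) {
        try {
            return Long.valueOf(Math.subtractExact(a, b));
        } catch (ArithmeticException overflow) {
            return BigInteger.valueOf(a).subtract(BigInteger.valueOf(b));
        }
    }

    public static void main(String[] args) {
        System.out.println(subtract(5, 3));
        System.out.println(subtract(Long.MIN_VALUE, 1));
    }
}
```

Every operation here allocates a result whose type is only known at run time, so the overflow branch is reachable from the compiler's point of view unless it can eliminate the boxing entirely.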
<headius[m]> Remember that jruby existed for 5 years before I got involved and it was another 4h5 years before indy was viable, so there's a lot of decisions that were out of my hands (like nil having state)
<headius[m]> 4 to 5 years
<chrisseaton[m]1> Yeah I'm doing some history spelunking of my own here
<chrisseaton[m]1> Thinking about the idea of a Truffle interpreter for IR...
<chrisseaton[m]1> Seems like it'd be easier now we know a lot more about writing bytecode interpreters in Truffle due to Espresso and things
<headius[m]> It would be interesting to try
<headius[m]> The down side of IR is that shoving everything into a not-quite-SSA register machine loses context you had in the AST. It works great for going to bytecode but I can't be very smart in my jit
<chrisseaton[m]1> You have these complex 'operands' which are like little trees off the instructions, which suit Truffle.
<chrisseaton[m]1> Ideally Truffle wants structured control flow though.
<headius[m]> Yeah and over time there are more of those as we see what we can do with indy
<headius[m]> String interpolation, for example, has gone from being several separate IR instructions to a single large instruction so I can propagate that locality into invoke dynamic expressions
<headius[m]> Like Java string formatting and concatenation is doing now
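For reference, the Java side of that comparison is `java.lang.invoke.StringConcatFactory` (Java 9+), which javac targets for string concatenation. The sketch below calls it by hand to show a whole interpolation expressed as one call site with a recipe string. `InterpolationDemo` and `greet` are hypothetical names, and this says nothing about how JRuby's own interpolation instruction is built.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

final class InterpolationDemo {
    static String greet(String name, int age) {
        try {
            // \u0001 marks where each dynamic argument is spliced into the recipe.
            // A real invokedynamic instruction would bootstrap this site once;
            // building it on every call is only for demonstration.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                MethodHandles.lookup(),
                "interpolate",
                MethodType.methodType(String.class, String.class, int.class),
                "Hello, \u0001! You are \u0001.");
            return (String) site.dynamicInvoker().invokeExact(name, age);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(greet("Ruby", 27));
    }
}
```

The literal pieces and the argument positions travel together in the recipe, so the runtime sees the whole expression at once instead of a chain of separate append calls.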
<chrisseaton[m]1> Yeah do you break things up so optimisation passes can work on them, or keep them high-level so the backend can be intelligent.
<chrisseaton[m]1> Is Subbu still active?
<headius[m]> That's the trade-off yeah
<headius[m]> Subbu has not been active for many years now
<headius[m]> Ultimately we don't have the bandwidth to essentially duplicate what the jvm should be doing for us, so more and more instructions are getting condensed
<headius[m]> Indy allows us a middle ground that bypasses bytecode rather neatly
<chrisseaton[m]1> headius: have you ever worried about `moduleGeneration` overflowing in a long-running application that uses metaprogramming?
<headius[m]> yes, and because of that I have used other mechanisms in some places, like a new Object() instance and identity comparison
<headius[m]> practically though you'd have to generate a lot of modules/classes to overflow it
<headius[m]> indy doesn't use generation in any case
<headius[m]> it wouldn't be a large change to make it a long either
<chrisseaton[m]1> Seeeeems like we should just be able to switch it to `Object`... the identity comparison should be the same machine code.
<headius[m]> yeah, it has remained an int to avoid chewing through objects, but that's probably overoptimization
<headius[m]> define a million methods and create a million objects... are the objects more of a problem than the million methods?
<chrisseaton[m]1> Yeah I see indy already does exactly this
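A minimal sketch of the token idea discussed above, assuming a cache entry that remembers the token it was created under. `ModuleStub` and `CacheEntry` are hypothetical names for illustration, not JRuby classes.

```java
// Hypothetical module whose "generation" is a fresh object instead of an int counter.
final class ModuleStub {
    private volatile Object generation = new Object();

    Object generation() { return generation; }

    // Called whenever the method table changes; a new identity cannot overflow.
    void invalidate() { generation = new Object(); }
}

// Hypothetical cache entry that is valid only while the module's token is unchanged.
final class CacheEntry {
    private final Object token;
    private final String method;

    CacheEntry(ModuleStub module, String method) {
        this.token = module.generation();
        this.method = method;
    }

    boolean isValidFor(ModuleStub module) {
        return token == module.generation(); // identity comparison, like an int compare
    }

    String method() { return method; }
}

final class TokenDemo {
    public static void main(String[] args) {
        ModuleStub module = new ModuleStub();
        CacheEntry entry = new CacheEntry(module, "to_s");
        System.out.println(entry.isValidFor(module));
        module.invalidate();
        System.out.println(entry.isValidFor(module));
    }
}
```

The cost is one small allocation per invalidation, which is the trade-off weighed in the surrounding messages.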
<headius[m]> and every module has a SwitchPoint-based invalidator, which ignores generation (though they invalidate on the same boundaries)
<headius[m]> I have meant to rework invalidation, perhaps more like you guys do it, since what we have now pays a very large cost when invalidating root elements of the class hierarchy
<headius[m]> it's on a long todo list
<headius[m]> I have had to work around the cost of SwitchPoint invalidation as well, since at one point it always forced a safepoint even if never bound to anything ☹️
<headius[m]> that may not be the case anymore, but we still have places where the switchpoint creation itself is lazy
<headius[m]> and so on
<chrisseaton[m]1> I don't think you ever get 'extra safepoints' - the compiler inserts them always in the same place, regardless of what SwitchPoint or Assumption checks you have. The reason for this is that the invalidation can wait until you reach one, so they can be as late as you want and no point trying to respond to them more quickly.
<headius[m]> probably
<headius[m]> the only concerns are irreversible operations (memory writes, IO) which are always guarded
<headius[m]> safepoint is just an abstraction around being able to manually flip that bit
<headius[m]> I mean switchpoint
<headius[m]> I suppose that's why they didn't just call it safepoint... too much implication that you're creating a new one
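As a small illustration of "being able to manually flip that bit", here is `java.lang.invoke.SwitchPoint` used on its own. The class and method names (`SwitchPointDemo`, `cached`, `slow`) are made up for the example; the `SwitchPoint` calls are the standard JDK API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

final class SwitchPointDemo {
    static String cached() { return "cached"; }
    static String slow() { return "slow path"; }

    static String[] run() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);
            MethodHandle fast = lookup.findStatic(SwitchPointDemo.class, "cached", type);
            MethodHandle fallback = lookup.findStatic(SwitchPointDemo.class, "slow", type);

            // While the SwitchPoint is valid, the guarded handle behaves like `fast`.
            SwitchPoint switchPoint = new SwitchPoint();
            MethodHandle guarded = switchPoint.guardWithTest(fast, fallback);

            String before = (String) guarded.invokeExact();
            // Flip the bit: every handle guarded by this SwitchPoint now takes the fallback.
            SwitchPoint.invalidateAll(new SwitchPoint[] { switchPoint });
            String after = (String) guarded.invokeExact();
            return new String[] { before, after };
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println(result[0] + " -> " + result[1]);
    }
}
```

Invalidation is one-way: once flipped, a `SwitchPoint` never becomes valid again, so code that wants to re-optimize has to create a new one and rebind.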
<headius[m]> I could not be on the JSR-292 committee because I was working for Sun at the time, but we had Ola Bini represent JRuby
<headius[m]> yeah basically the same
<headius[m]> there are other reasons the generation has remained, like for embedding a serial number into generated code... but I think most of those experiments are irrelevant now