<daveg_lookout[m]> New Ubuntu 18 Kernel hasn't helped. I'll get a new set of stack dumps and gdb traces with it if that would be helpful.
<headius[m]> I am pretty sure this reproduction is the same issue
<headius[m]> once it gets into this state the child and parent just start tossing exceptions back and forth forever
<headius[m]> I am trying to sort out the proper handling along with some test cases
<daveg_lookout[m]> awesome. It's occurring frequently, so if the fix is monkey-patchable we can get it tested out pretty quickly. It's still not in staging -- I'm still working on expanding our load testing to apply enough different input types to reproduce this, since it won't happen with any single stimulus type, no matter the volume
<headius[m]> monkey-patchable probably not, this is deep plumbing within the fiber logic... but once we have a patch you could do a custom build of 9.2.14
<headius[m]> or run off a 9.2.15 snapshot if that isn't too much risk
<headius[m]> yeah anything you can do to reliably reproduce it will help, since I am still not sure how it gets into this state in your app
<headius[m]> but it seems like the key is trying to interrupt a thread doing heavy fiber stuff
<headius[m]> I think the interrupt0 and kernel stuff was a red herring and we just saw that in the traces because it is a heavy cross-thread operation
<daveg_lookout[m]> Are there instructions anywhere for using 9.2.15 snapshots? I see the jars on sonatype, but don't see jruby-jars gem with snapshots anywhere. Just planning ahead. Thanks
<headius[m]> jruby-jars snapshots may not be getting pushed anywhere... they don't typically get deployed to maven and that is how we are generating snapshots right now
<daveg_lookout[m]> ok, we can build our own jruby-jars gem and host it locally with special version number. thx
<headius[m]> enebo: you around?
<enebo[m]> yeah
<headius[m]> trying to puzzle out how to handle these scenarios
<headius[m]> basically the issue is having a thread waiting on a fiber and then someone interrupts the thread with a raise
<enebo[m]> and by thread you mean java thread
<headius[m]> since it is waiting on the fiber the logic to me is that the raise gets forwarded to the fiber and then we wait for it to do something about it
<headius[m]> well by thread I mean some consumer of a fiber
<headius[m]> which might be another fiber
<headius[m]> basically someone on the fiber.resume side gets interrupted waiting for the fiber to come back with something
<enebo[m]> So something concurrently raising on a thread waiting for a fiber
<headius[m]> right
<headius[m]> so, forward to the fiber and it can handle it or re-raise it... so caller uses thread interrupt to forward it and then loops back to wait again
<headius[m]> the problem seems to be that on the fiber side it might get this exception while trying to respond, in which case it looks like it was sent to the fiber and it forwards back to the parent
<headius[m]> because we use the same mechanism to signal fibers as we do to signal threads, it can't see that this was already forwarded once
<enebo[m]> this exception == the exception the thread uses to notify or the exception of the raise?
<enebo[m]> or are those the same
<headius[m]> same
<headius[m]> so this is the case that will get stuck and run forever:
<headius[m]> t = Thread.new { f = Fiber.new { loop { Fiber.yield; p 1 } }; loop { f.resume; p 2 } }; sleep 1; t.raise('foo'); t.join
<headius[m]> it will do 1, 2, 1, 2, 1, 2, and then stop because it gets in a loop forwarding the exception back and forth
<enebo[m]> ok so C sends to A and A will then send to B but B could get the same thing and then decide it needs to send back to A
<headius[m]> and never exits
<headius[m]> well it is even more tricky than that... A sends to B, and B sees it and thinks it was interrupted so it sends it to A
<headius[m]> since they both loop while awaiting a response they just pingpong the same exception forever
<enebo[m]> ok so A sending it is unclear to B. It cannot differentiate between delegating and just really being sent it
<headius[m]> what I am trying to avoid is having a thread interrupted waiting on a fiber that is running, and then walking away from that fiber without stopping it
<headius[m]> yeah that seems to be the main issue
<headius[m]> maybe this is just the wrong mechanism to forward to the fiber
<enebo[m]> The really obvious answer is differentiating the two types of sends
<enebo[m]> Like wrapping the raise of the forwarding case or something like that
<headius[m]> so what you would want to happen is that the exception is picked up by the fiber, it propagates it, that thread exits and has an exception in hand that it tosses back to the parent, which propagates it
<headius[m]> simulating that it is the same call stack basically
<enebo[m]> I am guessing the propagation itself is all done with throw too?
<headius[m]> yeah
<headius[m]> that would just be normal stack unrolling
<enebo[m]> So another mechanism maybe is just adding some state to mark it is a forward vs an honest interrupt
<headius[m]> so this logic is basically trying to forward the interrupt into the fiber's stack since that is what is "really" running
<headius[m]> mmm yeah
<headius[m]> perhaps ForwardedException that it knows to unwrap and throw rather than forward again
<enebo[m]> subtyping I guess if you want to construct an object
<headius[m]> that would only be thrown at this boundary so it knows it is due to the other side getting interrupted
<enebo[m]> I suppose this can be multiple exceptions
<enebo[m]> If anyone wants to see the beginning of something neat: https://github.com/enebo/jruby-launcherr/actions/runs/502105404
<enebo[m]> I will add releases and I think I may be able to support adding more than the big three
<enebo[m]> I am hoping to finish up some "gem" logic here to make a sample replacement gem for testing the new launcher
<enebo[m]> windows has an issue of orphaning a ruby script which runs rails which execs rails and then receives a ^C
<enebo[m]> but seemingly works well other than that :P
<headius[m]> I will try doing the forward as a special exception
<enebo[m]> ok
<enebo[m]> headius: imagine how nice it will be to have a launcher which can compile on all supported "binary" platforms via CI
<headius[m]> so launcher functionality is basically done?
<enebo[m]> well "done"
<enebo[m]> I want to make DLL overlay for windows and I have to fix the windows orphaning
<enebo[m]> The unix/mac/etc... from a Rust perspective is probably done
<enebo[m]> I want to go through all reported and closed issues on jruby-launcher and make sure none of that regresses
<enebo[m]> but I figured if I can make a gem anyone interested can install it and try it up until our release
<enebo[m]> At this point it is the old launcher logic for the most part which I think is overly complicated but I am testing each scenario
<enebo[m]> As I continue to add tests perhaps it will get simpler
<headius[m]> so this is still based on exec'ing a command line on unix yeah?
<headius[m]> I think only Windows tried to load jvm.dll
<headius[m]> in the old one
<enebo[m]> yeah exec everywhere atm
<enebo[m]> but we released 32 bit launcher on windows in our releases so even in the old one no one had been using dll launcher
<enebo[m]> unless they built the 64 bit one themselves
<enebo[m]> I do plan on fixing that though since we realized supporting 32 bit exe is something we can drop from our releases
<enebo[m]> no windows install anymore will fail to run 64 bit unless it is ancient, and there we can exec a 32 bit jvm
<enebo[m]> or CreateProcessW on windows...although I am just using Rust buitin Process stuff in std on windows
<enebo[m]> Having written a bit more in Rust I cannot fathom any reason to use C++ at this point other than some ABI inconvenience (which would need to be very inconvenient, since you can use the C calling convention from Rust) or having a legacy code base
<headius[m]> yeah 64 should be the default now and will work 99% of time
<headius[m]> and yeah agreed, every time I have to look into C++ I just see a dead end
<enebo[m]> That is merely my opinion but I just don't see any appeal to C++ any more. Especially if you are writing a multi-platform launcher :)
<headius[m]> if you can push a gem for this thing we could throw it into CI on a JRuby branch
<headius[m]> if it gets through that gauntlet it is pretty solid
<headius[m]> so I think I am doing an even simpler fix here to try it: only retry once
<enebo[m]> well I don't know if that would demonstrate very much
<headius[m]> so if caller is interrupted with a raise it forwards the raise and tries to wait once more... if it gets interrupted a second time either that is the fiber sending it back or someone is really trying to kill this
<enebo[m]> we always invoke from a single install location
<headius[m]> gonna need to get that loom fiber in place soon
<enebo[m]> I think it may help validate the module stuff but even then i think we always do the same thing
<headius[m]> this fixes it with a minimal change
<enebo[m]> reporter is on channel right? Hopefully they can give good confirmation as well
<enebo[m]> we can release 9.2.15.0 once it is solid
<headius[m]> yeah daveg_lookout and kroth_lookout both
<headius[m]> I will push this to a branch and see how it looks
<daveg_lookout[m]> Yep, we're here.
<enebo[m]> cool
<enebo[m]> headius: so your description is using a wrapper to let it know what it is?
<enebo[m]> it is unclear. it almost sounds like you are just looking to see if you receive two
<headius[m]> I backed off from the wrapper because I realized it could show up at other places in the target fiber, like if it was blocked on IO or something
<headius[m]> and it would bubble through exception handlers looking for RaiseException etc
<headius[m]> so now it will try to forward any interrupting exception once and then wait for a response again
<enebo[m]> yeah attribute would work better for that if you could assume only one exception type
<headius[m]> either the fiber will handle it and return a normal value, or it will get tossed back to the caller, which should then raise it normally
<enebo[m]> so sender will just block?
<headius[m]> so the fiber gets one opportunity to handle the exception that interrupted the caller
<enebo[m]> If the fiber dies for some other reason does that mean a dead blocked thread?
<headius[m]> rather than looping forever, which seems to be the cause of this issue
<headius[m]> waiting for a response is via a queue pop
<enebo[m]> or does the thread monitor the state of the fiber while it waits?
<headius[m]> it will pop to wait for a response initially... if that pop is interrupted by an exception it will send that to the fiber and try to pop again... whatever happens at that point it will either return a result from the fiber or propagate any further exception
<enebo[m]> but only if the fiber actually responds or another interrupt comes in from somewhere else
<headius[m]> right
<enebo[m]> So I guess so long as the fiber does something it is fine and perhaps that is fine. I am just talking through it to be helpful :)
<headius[m]> so there is a chance of orphaning a fiber if the calling thread gets interrupted twice while waiting, but that would be a pretty strong indication the thread needs to bug out
<enebo[m]> ok
<headius[m]> enebo: how big are the executables?
<enebo[m]> headius: I got 220k last I checked with a strip, but this is without using Xargo, LTO, and other options.
<enebo[m]> Xargo can get the size down quite a bit more but it means switching to unstable and 5 platforms as binaries will still be 1/2 of our gem size today so it may not be worth it
<headius[m]> yeah 220 is already a lot better
<enebo[m]> unstable in Rust is not really as scary as it sounds but at this point I am somewhat satisfied
<enebo[m]> if I end up doing as much in windows as the C++ launcher perhaps that will grow it but winapi or windows-rs appear to be very similar to most foreign bindings in that you ask for stuff to get pulled in and it generates the code for binding to those windows methods
<enebo[m]> So I don't expect very much growth in size there even if we do use 8-10 more windows methods
<headius[m]> ok cool
<headius[m]> I have pushed a PR with this fiber change (only try to forward interrupt exception once)
<headius[m]> it passed fiber specs locally so we shall see how it does in full ci
<headius[m]> daveg_lookout: you can make a build from that branch, or just apply my patch to 9.2.14 and do a build from there
<headius[m]> if this looks ok in CI I will merge later
<daveg_lookout[m]> sounds good, looking at the change now
<kroth_lookout[m]> I had a look at that PR (and the rest of the conversation here) and seems worth trying to me