ur5us has quit [Ping timeout: 258 seconds]
ur5us has joined #jruby
ur5us has quit [Ping timeout: 258 seconds]
Antiarc has quit [Ping timeout: 240 seconds]
Antiarc has joined #jruby
ur5us has joined #jruby
ur5us has quit [Ping timeout: 258 seconds]
mistergibson has joined #jruby
<headius[m]> good morning
<headius[m]> 9.2 branch merged to master
<boc_tothefuture[> Afternoon all
<headius[m]> hi there!
<boc_tothefuture[> I am trying to understand a bit how java.math.BigDecimal is supposed to work within the ruby ecosystem.
<boc_tothefuture[> for example, I see it implements coerce but if I do a math operation with it, I get an error.
<boc_tothefuture[> Like if I do "8 + BigDecimalVariable" I get an exception.
<boc_tothefuture[> Is there a best practice here, a way to convert BigDecimal to the Ruby version and then go from there? Or really to convert back and forth reliably?
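For context on the coerce behaviour asked about above: when an Integer meets an operand it does not recognise, Ruby calls that operand's coerce method and retries the operation on the returned pair. A minimal plain-Ruby sketch, using a hypothetical Wrapped class (an illustration only, not anything from JRuby or the framework in question):

```ruby
# Hypothetical wrapper type used only to illustrate the coerce protocol.
class Wrapped
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Addition accepts either another Wrapped or a plain number.
  def +(other)
    Wrapped.new(value + (other.is_a?(Wrapped) ? other.value : other))
  end

  # Called by Integer#+ when the right-hand side is not a built-in numeric.
  # Returns [left, right] converted to types that know how to add.
  def coerce(other)
    [Wrapped.new(other), self]
  end
end

sum = 8 + Wrapped.new(2) # Integer#+ calls Wrapped#coerce(8), then adds the pair
puts sum.value
```

If coerce returns objects whose + cannot handle each other, the retry fails, which is one way an expression like the one above can still raise.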
<headius[m]> you need to use Java's BigDecimal instead of Ruby's?
<boc_tothefuture[> well.. I am given Java BigDecimal
<boc_tothefuture[> From the framework
<boc_tothefuture[> I could convert it to Ruby's if that is the way to go..
<headius[m]> ah ok... and it isn't converting to Ruby automatically during the call
<boc_tothefuture[> no, it throws an error essentially saying it can't be cast to that.
<boc_tothefuture[> Java::JavaLang::ClassCastException (org.jruby.ext.bigdecimal.RubyBigDecimal cannot be cast to java.math.BigDecimal)
<boc_tothefuture[> but I didn't see a "to_ruby_big_decimal" type method.
<headius[m]> you can call to_d
<headius[m]> that is the Ruby coercion method for bigdecimals
<headius[m]> not common but it is there
<headius[m]> also need to have done require 'bigdecimal'
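A hedged sketch of the to_d suggestion. The JRuby branch assumes java.math.BigDecimal responds to to_d as described above; outside JRuby a String stands in for the framework-supplied value so the snippet still runs. The variable names are illustrative.

```ruby
require 'bigdecimal'
require 'bigdecimal/util' # adds to_d to core types such as String and Integer

# Stand-in for the value handed over by the framework (assumption: on JRuby
# this would be a java.math.BigDecimal; elsewhere we use a String).
given =
  if RUBY_ENGINE == 'jruby'
    java.math.BigDecimal.new('2.50')
  else
    '2.50'
  end

ruby_decimal = given.to_d # convert to Ruby's BigDecimal first
total = 8 + ruby_decimal  # plain Ruby arithmetic from here on
puts total.to_s('F')
```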
travis-ci has joined #jruby
<travis-ci> jruby/jruby (master:7abc7bb by Charles Oliver Nutter): The build was broken. https://travis-ci.com/jruby/jruby/builds/207813672 [201 min 36 sec]
travis-ci has left #jruby [#jruby]
<boc_tothefuture[> that works. thanks!
<headius[m]> excellent!
<headius[m]> hmm failures on sequel head... I wonder if those are our issues
<boc_tothefuture[> question I didn't ask... is there a way to convert it back?
<headius[m]> generically, all Ruby objects will have a to_java method that takes an optional Java type
<headius[m]> to_java should do the right thing for you
<headius[m]> the Ruby BigDecimal just wraps a Java one so the conversion should be fairly lightweight
<boc_tothefuture[> yep.. awesome, it does! :-)
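Going the other way might look like the sketch below. It assumes JRuby, where to_java accepts an optional Java type as described above; the helper name back_to_java is hypothetical, and on other Ruby engines it simply returns its argument so the snippet stays runnable.

```ruby
require 'bigdecimal'

# Hypothetical helper: hand a Ruby BigDecimal back to Java code.
def back_to_java(ruby_decimal)
  # Outside JRuby there is no Java type to convert to, so return as-is.
  return ruby_decimal unless RUBY_ENGINE == 'jruby'

  # On JRuby, ask for a java.math.BigDecimal explicitly.
  ruby_decimal.to_java(java.math.BigDecimal)
end

converted = back_to_java(BigDecimal('1.5'))
puts converted.class
```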
travis-ci has joined #jruby
<travis-ci> jruby/jruby (master:f49e970 by Charles Oliver Nutter): The build is still failing. https://travis-ci.com/jruby/jruby/builds/207821582 [200 min 12 sec]
travis-ci has left #jruby [#jruby]
subbu is now known as subbu|lunch
travis-ci has joined #jruby
<travis-ci> jruby/jruby (load_service_redux:cfe71a2 by Charles Oliver Nutter): The build failed. https://travis-ci.com/jruby/jruby/builds/207829506 [206 min 13 sec]
travis-ci has left #jruby [#jruby]
<headius[m]> hmm no jeremyevans in here today
enebo has joined #jruby
ChanServ changed the topic of #jruby to: Get 9.2.14.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com
<headius[m]> enebo: so the fzakaria eagain thing is triggered by these properties: -J-Djnr.ffi.asm.enabled=false -J-Djruby.compile.mode=OFF
<headius[m]> remove either one and it works
<headius[m]> compile.mode seems to be used by some Ruby FFI stuff but I do not see how it affects this waitpid call that happens via jnr-posix
<headius[m]> the ffi.asm thing obviously affects jnr-posix but that property alone is insufficient to trigger the bug... very strange
<headius[m]> perhaps something in jnr-ffi is also looking at the jruby.compile.mode flag? 🤔
<enebo[m]> hmm this is wsl and not windows proper?
subbu|lunch is now known as subbu
<enebo[m]> we don't detect that as windows do we?
<headius[m]> not windows, not wsl
<enebo[m]> oh this is nix stuff
<headius[m]> Linux but with the Nix package system/userspace in place
<headius[m]> yeah
<enebo[m]> WOT
<headius[m]> but I realized we have seen intermittent EAGAIN on waitpid on travis
<enebo[m]> the ffi thing I can see an angle in my head but why compile=off?
<headius[m]> yeah weird isn't it
<headius[m]> Ruby FFI does use that property to decide if it should generate bytecode stubs for FFI functions
<headius[m]> but this is just doing system('date') which does a jnr-posix waitpid internally
<headius[m]> it should not touch Ruby FFI stuff at all
<headius[m]> (and Ruby FFI probably should use a different property anyway)
<enebo[m]> could it possibly use OFF in FFI?
<headius[m]> I think I need to see if jnr-ffi is looking at the jruby property or something
<headius[m]> how else could jnr-posix be affected
<enebo[m]> hmm
<enebo[m]> If something is examining off outside the interpreter or whether jit is enabled it feels wrong
<enebo[m]> but if that is not the case how could using an interp cause a behavioral difference
<headius[m]> yeah it is not meant for things outside the jit
travis-ci has joined #jruby
<travis-ci> jruby/jruby (master:f52f741 by Charles Oliver Nutter): The build is still failing. https://travis-ci.com/jruby/jruby/builds/207829532 [200 min 48 sec]
travis-ci has left #jruby [#jruby]
<enebo[m]> We have seen interp and JIT do things differently over the years but they tend to just be basic ruby semantics usually in weird corners and it is hard to see how that would happen to make something in ffi do something differently
<enebo[m]> barring "fell out of the interpreter" and not running half a source file
<headius[m]> yeah this is literally just -e "system('date')"
<headius[m]> and it gets to the internal waitpid call
<enebo[m]> but for that to be true it would be super weird since JIT tends to only kick in after n attempts
<headius[m]> also reproducible with Process.waitpid spawn 'date'
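The second reproduction can be written out as a small script. This is only a sketch: the helper name run_and_wait is hypothetical, and the rescue assumes the failure surfaces in Ruby as Errno::EAGAIN, which the discussion suggests but does not show.

```ruby
# Hypothetical helper: spawn a command and wait for it, as in the repro.
def run_and_wait(command)
  pid = spawn(command)
  _, status = Process.waitpid2(pid) # returns [pid, Process::Status]
  status
rescue Errno::EAGAIN => e
  # Assumed failure mode from the discussion; report it and return nil.
  warn "waitpid raised #{e.class}: #{e.message}"
  nil
end

status = run_and_wait('date')
puts status ? "exited with #{status.exitstatus}" : 'waitpid failed'
```

Running it in a loop may help with a failure that turns out to be intermittent.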
<enebo[m]> even the asm part feels weird to me
<enebo[m]> I guess at some level posix will call down to jnr-ffi which will make it to jffi
<headius[m]> I could see asm having an effect but why only with compile.mode=OFF? If it is broken it should remain broken
<headius[m]> bleh
<enebo[m]> what prints out with native.enabled.verbose?
<headius[m]> native.verbose causes it to print out successfully loaded native POSIX impl
<enebo[m]> so we are getting LinuxPOSIX
<headius[m]> hmmm
<headius[m]> yeah that seems ok
<headius[m]> I wonder... could it be getting interrupted
<headius[m]> the logic for Process.waitpid sets up an interrupter using pthread_kill
<enebo[m]> comment it out
<headius[m]> it should not be firing because we don't interrupt the thread but if it did, that could cause this
<headius[m]> I'll do one better...I added an option to disable that feature
<headius[m]> native.pthread_kill=false
<enebo[m]> also you could put a print on wakeup
<enebo[m]> oh but you see it as InterruptedException in pthreadKillable?
<headius[m]> blast it doesn't help
<headius[m]> that was a good theory
<enebo[m]> I am just curious if you see this thing looping or what happens
<enebo[m]> but what I find weird is whether this is two problems or just one
<enebo[m]> the fact that two settings cause it may not mean it is the same issue
<headius[m]> hmm I just notice it may not be using the pthread_kill logic
<enebo[m]> so I guess errno is 0 without those two set
<enebo[m]> err without either set
<headius[m]> ah yeah nevermind it finishes the pthreadKillable call and then sees nonzero errno
<enebo[m]> yeah
<headius[m]> raiseErrnoIfSet should be inside the closure
<headius[m]> so it is immediately after the waitpid
<enebo[m]> pthreadKillable probably is not creating errno != 0 though
<enebo[m]> I guess unless as you say signalHandler interrupts
<headius[m]> date finishes executing too, it prints out in this output
<headius[m]> maybe errno is just not being cleared and due to the combination of flags it is seeing an EAGAIN from something else?
<headius[m]> oh no
<headius[m]> what if this is normal behavior because the subprocess runs so fast we don't have time to waitpid
<headius[m]> and we just slow down due to compile=off and asm=off
<enebo[m]> hmm
<enebo[m]> I assume compile=off will always be present at first
<headius[m]> well just sleeping between the spawn and waitpid does not fail
<headius[m]> so perhaps not
<enebo[m]> so the combo could maybe slow it down enough?
<headius[m]> does not fail without the properties set and explicit delay
<enebo[m]> but this is only on Nix too right?
<headius[m]> well this is the only place I have been able to repro, using fzakaria docker image
<enebo[m]> remove pthreadKillable from this and just call waitpid
<headius[m]> the option to turn off pthread_kill should be doing that
<headius[m]> does not appear to help
<headius[m]> I don't have this set up to rebuild in the container right now
<enebo[m]> ok. I guess if you are certain of that then that is not part of it
<enebo[m]> ah
sagax has joined #jruby
<enebo[m]> but I can confirm it should not happen from reading the code
<enebo[m]> unless applyAsInt is super weird :)
<headius[m]> right, should just go straightaway to the waitpid closure if that property is off
<headius[m]> heh yeah yay for erased generics
<enebo[m]> so both
<enebo[m]> I have to say I find the weirdest aspect of this is compile=off
<headius[m]> definitely
<enebo[m]> really only two theories have merit: 1) slower execution 2) a bug in interp
<headius[m]> there's the full output and command line
<enebo[m]> but what Ruby executes in that call?
<enebo[m]> something ruby starting up which toggles something
<headius[m]> oh no
<enebo[m]> ok so one thing to note here
<headius[m]> -Xdebug.parser fixes it
<enebo[m]> yeah I was going to suggest that :)
<headius[m]> I was getting there from your train of thought
<enebo[m]> there is no Ruby loaded in that script so for OFF to be effective it would have to do something to earlier Ruby code loaded
<enebo[m]> I will also say the other thing though...as a test case -e is normally forced to compile as the main script
<enebo[m]> so if you -r something_with_that system it probably would break without compile=off
<enebo[m]> if it had something to do with those lines
<headius[m]> yeah this is an interesting wrinkle
<enebo[m]> but this makes much more sense to me
<enebo[m]> so something in Ruby loading is working differently with compile=off
<enebo[m]> That in itself is remarkable
<enebo[m]> since nothing will compile normally past the main script which has not executed multiple times already
<enebo[m]> hmm
<enebo[m]> with default settings do we compile more than the default file/-e?
<headius[m]> not unless jit fires
<enebo[m]> ok so let's think through this
<headius[m]> nothing in prelude should be jitting in this short example
<enebo[m]> could we errantly execute something 20 times and not notice it is not working during bootstrapping ruby but it always ends up ok in the end because it JITs?
<headius[m]> there's a little bit of FFI use here but only on Windows and Solaris
<enebo[m]> and that is the other part of this it takes both to fail
<headius[m]> I don't see how we wouldn't notice it failing
<enebo[m]> well we do not see it anywhere but Nix so far
<headius[m]> what else does debug.parser turn off?
<enebo[m]> literally everything
<enebo[m]> all we do is init in Ruby
<enebo[m]> err initCore I think but we do not load gems or anything in kernel
<enebo[m]> as you know it exists to debug the parser/lexer so we will never execute any ruby
<headius[m]> right
<enebo[m]> how that does it I don't recall
<enebo[m]> It is super useful as it turns out
<headius[m]> aha
<headius[m]> --disable-gems also works
<enebo[m]> hmm
<enebo[m]> ok well that removed a thousand lines :)
<headius[m]> yeah but doesn't help narrow down much 😀
<enebo[m]> so something in gems or dependency of gems is doing something with ffi and in interp mode it fails?
<enebo[m]> but that comes back to what would not interp in the first place
<enebo[m]> and if it failed it would need to keep getting called so a JIT could fix it
<enebo[m]> headius: can you nuke all the gems?
<headius[m]> I probably can
<enebo[m]> one thing gems does is load a lot of crap in a loop
<headius[m]> I can confirm just disabling did_you_mean does not fix it
<enebo[m]> and maybe there is something really strange in there that the interp does not do well
<enebo[m]> but in full it JITs and continues enough where we do not notice not everything is loaded
<headius[m]> ok weird
<headius[m]> --disable-gems -rrubygems
<headius[m]> also is ok
<enebo[m]> HAHA
<headius[m]> that should only prevent gem related stuff at boot from loading
<enebo[m]> so it is not loading any gems but loading rubygems
<headius[m]> yeah
<headius[m]> and that is working ok
<enebo[m]> yeah I wonder if there is a problem loading some gems in that image and we normally JIT something and that "fixes" the bootstrap enough
<headius[m]> ack nevermind it is intermittent
<headius[m]> I may be wrong about all this now
<enebo[m]> with OFF?
<headius[m]> ok so requiring rubygems does fail
<headius[m]> just didn't at first
<enebo[m]> ok
<headius[m]> I have not gotten --disable-gems alone to fail
<headius[m]> so there does seem to be a race when it fails
<enebo[m]> my pet theory has an interesting problem with it. What gem not loaded properly with OFF would then cause something later to stop working because something else finally loaded
<headius[m]> yeah something that touches FFI and leaves a bad errno somewhere somehow?
<enebo[m]> hmm we do reset errno at times
<headius[m]> I bet errno is nonzero before this waitpid call and it isn't cleared
<headius[m]> I think I can check that
<enebo[m]> yeah that was what jumped out when you said that
<headius[m]> could be a libc behavior different on Nix?
<enebo[m]> but if that is true you should see this behavior potentially in calling many things?
<headius[m]> not clearing errno in the same places
<headius[m]> and some gem does an ffi call that leaves an errno set
<headius[m]> I don't know how to tie this together with compile=off
<enebo[m]> so it is too bad you cannot build easily since you could remove that errno check
<headius[m]> oh but compile=off does change how FFI works
<enebo[m]> I mean printing it before would also be a good check
<headius[m]> as mentioned earlier
<headius[m]> so if it started to cause some ffi call booted by rubygems to fail, leave an errno present, and then we don't clear it before this
<headius[m]> and weird libc just for extra spice
<enebo[m]> can you repeat how off with ffi is different?
<headius[m]> errno is 2 before the waitpid call
<headius[m]> enoent
<enebo[m]> how does FFI change with compile=off?
<headius[m]> it uses a generic invoker instead of a bytecode-generated invoker
<headius[m]> as with the asm property there may be bugs in the generic invoker code never seen because we typically don't run this way
<enebo[m]> ok that seems likely now
<headius[m]> I had to do some work on jffi/jnr-ffi to get the non-asm logic passing tests
<enebo[m]> So it seems very likely you can finish this up by resetting errno before the call but it raises a couple of questions
<headius[m]> I believe it falls through that code if compile=off
<headius[m]> yeah this could be a glitch in how jnr-ffi or jnr-posix or ffi handles errno
<headius[m]> like the generic invoker is supposed to clear errno but does not
<enebo[m]> 1. is there a bug here in jffi/jnr-ffi that is not working and the errno is just a sad side-effect?
<enebo[m]> 2. Should we be more defensive before calls to posix and reset errno?
<headius[m]> I had thought that in normal C we can rely on errno to be reset to zero on a successful call
<headius[m]> not having to clear it before that call
<enebo[m]> heh for all we know waitpid is not being invoked at all here
<headius[m]> but jnr-ffi or jnr-posix includes logic to cache errno so it doesn't get corrupted by an intervening call
<enebo[m]> we still don't know if resetting it would even make this work
<headius[m]> if that broke or was not being done properly in the generic invoker we could end up with errno remaining set across a successful call
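The stale-errno hypothesis can be illustrated with a toy model. POSIX does not require a successful call to reset errno to zero, so a check that looks at errno alone can blame a successful call for an earlier failure. Everything below (FakeLibc and both check methods) is hypothetical and only models the idea; it is not how jnr-posix or jnr-ffi is implemented.

```ruby
# Toy stand-in for libc with a single errno slot (illustration only).
class FakeLibc
  ENOENT = 2
  attr_accessor :errno

  def initialize
    @errno = 0
  end

  # A failing call: returns -1 and records why in errno.
  def open_missing_file
    @errno = ENOENT
    -1
  end

  # A successful call: returns the pid and leaves errno untouched.
  def waitpid(pid)
    pid
  end
end

# Fragile check: any nonzero errno counts as a failure of the last call.
def failed_by_errno_only?(libc, _result)
  libc.errno != 0
end

# Safer check: only a failing return value makes errno meaningful.
def failed_by_return_value?(_libc, result)
  result == -1
end

libc = FakeLibc.new
libc.open_missing_file       # earlier failure leaves errno = 2
result = libc.waitpid(1234)  # succeeds, errno is still 2

puts failed_by_errno_only?(libc, result)   # true: stale errno misattributed
puts failed_by_return_value?(libc, result) # false: the call succeeded
```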
<headius[m]> yeah well I can force it before waitpid
<headius[m]> trying now
<enebo[m]> yeah if it failed to invoke before it might just always be broken
<headius[m]> clearing before the call seems to work
<headius[m]> so it seems like this is a rogue errno value leftover in jnr-posix or something
<headius[m]> ugh sorry intermittent again
<headius[m]> and --disable-gems may be another red herring
<headius[m]> it is a timing issue of some kind
<headius[m]> ok disable-gems does still seem to be green, whew
<headius[m]> and clearing does not help
<enebo[m]> so perhaps it is just that the non-generated invoker is broken
<enebo[m]> headius: so compile=OFF but asm=true will go back to generated invokers?
<headius[m]> well there are two levels of generation
<headius[m]> asm=true allows jnr-ffi to generate ASM stubs for the native side of a call
<headius[m]> compile=off is being used by our Ruby FFI impl to decide whether to generate a Java stub for each FFI function
<headius[m]> so asm=true turns on the native stub again but the java stub would still be off
<headius[m]> I will try to really confirm that asm property actually is affecting this
<headius[m]> half dozen runs with asm=true all ok
<headius[m]> back to false, fails immediately
<headius[m]> seems the same with compile.mode... works ok six out of six runs with compile=default and fails again immediately when compile=off
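The four flag combinations being compared can be generated rather than typed by hand. A sketch that only builds and prints the command lines, using the two properties quoted earlier in the discussion; the default case is expressed by omitting a flag, and the helper name repro_commands is hypothetical.

```ruby
require 'shellwords'

# The two properties from the discussion that together trigger the failure.
ASM_OFF     = '-J-Djnr.ffi.asm.enabled=false'
COMPILE_OFF = '-J-Djruby.compile.mode=OFF'

# Hypothetical helper: one argv per combination of the two flags.
def repro_commands
  [[], [ASM_OFF], [COMPILE_OFF], [ASM_OFF, COMPILE_OFF]].map do |flags|
    ['jruby', *flags, '-e', "system('date')"]
  end
end

# Print shell-escaped command lines; nothing is executed here.
repro_commands.each { |argv| puts Shellwords.join(argv) }
```

Per the runs reported above, only the last combination (both flags set) is expected to fail.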
ur5us has joined #jruby
<headius[m]> I can confirm the errno does clear to 0 before waitpid
ur5us has quit [Remote host closed the connection]
<headius[m]> hmm
<headius[m]> bypassing Process.waitpid and going straight to jnr-posix seems to pass ok
<enebo[m]> ship it
<enebo[m]> funny though it seems like lots of stuff should be broken in this env
<headius[m]> oh well of course, there's no raise
<headius[m]> but errno does appear to be zero after the waitpid call in this configuration
ur5us has joined #jruby
<headius[m]> this nix setup appears to be using glibc btw
mistergibson has quit [Quit: Leaving]
ur5us has quit [Ping timeout: 260 seconds]