<chrisseaton[m]> I've been working with someone who's encountering the 2 GB string limit in JRuby in practice (not judging - TruffleRuby has the same limitation!) Do you have any tips or tricks to manage that limitation?
nirvdrum has quit [Remote host closed the connection]
<chrisseaton[m]> I'm recommending FFI to them.
<enebo[m]> chrisseaton: ffi is probably the simplest workaround
<chrisseaton[m]> Got an overflow bug we found coming as a PR as well.
<enebo[m]> chrisseaton: cool. I recently saw we have some tagged specs on large index values in ruby/spec. In that case I think we are supposed to realize it and raise
<chrisseaton[m]> Need to be more rigorous about using addExact etc.
<enebo[m]> overflow itself is different but possibly somewhat related
<enebo[m]> oh heh ok then different
<chrisseaton[m]> newLength = oldLength * 1.5, ends up being negative
<enebo[m]> hmm if we do that for String (ByteList?) we probably also do it for Array
<chrisseaton[m]> Anyway PR will make it clear - a comment was also misleading
<enebo[m]> ok
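A minimal sketch of the overflow described above, not the code from the PR. It assumes the growth is done in int arithmetic as `oldLength + (oldLength >> 1)`, since a cast of a `double` product to `int` would saturate rather than wrap. The class and method names are hypothetical. `Math.addExact` is the standard JDK call mentioned in the discussion.

```java
public class GrowthSketch {
    // Naive 1.5x growth in int arithmetic: wraps to a negative value
    // once the sum exceeds Integer.MAX_VALUE.
    public static int naiveGrow(int oldLength) {
        return oldLength + (oldLength >> 1);
    }

    // Same growth, but Math.addExact throws ArithmeticException
    // instead of silently wrapping.
    public static int checkedGrow(int oldLength) {
        return Math.addExact(oldLength, oldLength >> 1);
    }

    public static void main(String[] args) {
        int big = 1_500_000_000;
        // Prints a negative number because the int sum wrapped.
        System.out.println("naive:   " + naiveGrow(big));
        try {
            checkedGrow(big);
        } catch (ArithmeticException e) {
            System.out.println("checked: overflow detected");
        }
    }
}
```

The checked variant turns a silent negative length into an exception that the caller can convert into whatever error Ruby code should see.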
<travis-ci> jruby/jruby (jruby-9.2:e67c508 by Thomas E. Enebo): The build is still failing. https://travis-ci.com/jruby/jruby/builds/225762153 [173 min 58 sec]
travis-ci has left #jruby [#jruby]
travis-ci has joined #jruby
<headius[m]> gday
<headius[m]> enebo: the overflow patch seems fine but I was unable to allocate a string larger than MAX_VALUE - 2
<headius[m]> I'm not sure why MAX_VALUE and MAX_VALUE - 1 raise errors
<enebo[m]> heh interesting...that is unexpected
<headius[m]> see my comment there... if I use "*" * (MAX_VALUE - 2) it allocates and then fails on the first append, but anything larger than that can't complete the *
<enebo[m]> I do not know how this all opts out since most of the time this will never fire but I wondered if a negative value check is ok since we know how much we grow
<enebo[m]> hmm although I am wrong
<enebo[m]> if it just happens to hit max_int then 2*max_int would be what?
<headius[m]> yeah we have used manual checks like that elsewhere but we have moved almost all those to addExact by now
<headius[m]> it is a Hotspot intrinsic so it should be pretty efficient when not raising
<enebo[m]> ok I admit it felt hacky to even consider it
<headius[m]> using jump on overflow in asm
<enebo[m]> so one extra jump and this will not generate a trace I am guessing :)
<enebo[m]> I don't know enough on x86/64 assembly to know but there is probably some support for some IEEEish overflow in the instruction set
<headius[m]> That raises in next step
<headius[m]> this is not following the ensure size path either so we need to audit other calls to these size-based ByteList constructors
<headius[m]> and probably set the fallback size to MAX_VALUE - 2, but I am trying to confirm this in JVM spec
<headius[m]> chrisseaton: FWIW the only solution I have considered would be using a long[] instead of byte[] so the effective size would be 8 * MAX_VALUE but clearly that takes a lot of code changes and can't interact with byte[] APIs like IO
<headius[m]> from what I have heard from others this is about the best you can do to get around it without chaining together multiple arrays
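A rough sketch of the `long[]` idea, only to show the index arithmetic involved. The class name `PackedBytesSketch` and its methods are hypothetical and nothing like this exists in ByteList. Eight bytes are packed per `long`, so a `long` index is split into a word index and a bit shift.

```java
public class PackedBytesSketch {
    private final long[] words;
    private final long length;

    public PackedBytesSketch(long length) {
        this.length = length;
        // Round up to whole 8-byte words; toIntExact throws if even the
        // word count does not fit in an int.
        this.words = new long[Math.toIntExact((length + 7) >>> 3)];
    }

    public byte get(long index) {
        check(index);
        int shift = (int) (index & 7) << 3;          // bit offset inside the word
        return (byte) (words[(int) (index >>> 3)] >>> shift);
    }

    public void set(long index, byte value) {
        check(index);
        int word = (int) (index >>> 3);
        int shift = (int) (index & 7) << 3;
        // Clear the target byte, then OR in the new value.
        words[word] = (words[word] & ~(0xFFL << shift)) | ((value & 0xFFL) << shift);
    }

    private void check(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```

Every access pays for a shift and mask, and the contents cannot be handed to a `byte[]` API without copying, which is the interaction problem raised above.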
<enebo[m]> I feel like we considered linked list (segments) idea but it would end up changing tons of code
<enebo[m]> The fact RubyString is backed by ByteList does give us some freedom to change how bytes are backed but ByteList is so unencapsulated I feel this would mean rewriting everything :)
<enebo[m]> HAHA we will just stop using JVM for byte[] and use malloc
<enebo[m]> If all strings had an explicit destructor and did not rely on finalization it would almost work
subbu is now known as subbu|lunch
<headius[m]> yeah I was thinking about that too
<headius[m]> the trend on JVM has been toward making finalization less reliable and now it is actually deprecated
<headius[m]> so we would need to set up a reaper thread to do this practically going forward
<headius[m]> I believe *Reference logic is still blessed so we would basically have weak references that refer to the "NativeByteList" and also have a reference to the memory pointer to clean, and just scrub it when we detect the weak reference has been evacuated
<headius[m]> but still needs a reaper
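A sketch of the weak-reference-plus-reaper arrangement described above, under several assumptions: `NativeReaperSketch`, `Handle`, `track`, and `reapPending` are hypothetical names, the native address is just a `long`, and the actual free call is passed in as a callback because no native allocator is specified in the discussion. `WeakReference` and `ReferenceQueue` are the standard `java.lang.ref` classes. The JDK's `java.lang.ref.Cleaner` (Java 9+) packages a similar pattern.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongConsumer;

public class NativeReaperSketch {
    // Weak reference to the owning object that also remembers the
    // native address to release once the owner is gone.
    public static final class Handle extends WeakReference<Object> {
        final long address;

        Handle(Object owner, long address, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.address = address;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps the handles themselves strongly reachable until reaped.
    private final Set<Handle> live = ConcurrentHashMap.newKeySet();
    private final LongConsumer free;

    public NativeReaperSketch(LongConsumer free) {
        this.free = free;
    }

    public Handle track(Object owner, long address) {
        Handle handle = new Handle(owner, address, queue);
        live.add(handle);
        return handle;
    }

    // Non-blocking drain: frees memory for every handle already enqueued.
    public int reapPending() {
        int count = 0;
        Handle handle;
        while ((handle = (Handle) queue.poll()) != null) {
            if (live.remove(handle)) {
                free.accept(handle.address);
                count++;
            }
        }
        return count;
    }

    // Background reaper: blocks on the queue and frees as handles arrive.
    public Thread startReaper() {
        Thread thread = new Thread(() -> {
            try {
                while (true) {
                    Handle handle = (Handle) queue.remove();
                    if (live.remove(handle)) {
                        free.accept(handle.address);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "native-reaper");
        thread.setDaemon(true);
        thread.start();
        return thread;
    }
}
```

Release timing still depends on when the collector notices the owner is unreachable, so native memory can stay allocated long after the string is dead.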
<enebo[m]> The other thing I dislike though is the notion that standard Java tools will not see that memory
<headius[m]> enebo: so about fixing this and merging PR... 9.3 or 9.2?
<headius[m]> merges might get messy fixing it in 9.2 but it is no less an issue there
<enebo[m]> headius: yeah I am ok with this PR on 9.2. Most people will never hit it, and in the case you do hit it you will probably still break, but you might not
<headius[m]> do you mean 9.3?
<enebo[m]> for the bytelist change?
<headius[m]> yeah the PR is against master right now
<headius[m]> you said you are ok with it for 9.2 but then also said that most people will never hit it so I am confused
<enebo[m]> 9.2 also does not use bytelist artifact and self bundles right?
<enebo[m]> oh ok I mean I think it is not risky for 9.2 because most people will not see it
<headius[m]> ahh ok
<enebo[m]> The only possible problem would be some unexpected perf regression but that seems unlikely
<headius[m]> yeah we can retarget it to 9.2
<enebo[m]> you could just merge and cp since it will apply so cleanly
<headius[m]> yeah and then fix additional cases on 9.2
<enebo[m]> right I am guessing Array has same issue and probably hash
<headius[m]> and other paths to allocation in ByteList
<enebo[m]> oh right
<headius[m]> this only helps the case of growing existing
<enebo[m]> well an audit definitely seems like a good idea
<headius[m]> hey I am still having issues with the pom.xml schema URLs too
<enebo[m]> I fixed an issue on 9.2 today: https://github.com/jruby/jruby/issues/6668
<headius[m]> nice
<enebo[m]> Seems safe enough and even Ruby 3 still does this
<headius[m]> only seems to be happening on master but regenerated pom.xml are differing by just https
<headius[m]> show me your maven version again
<enebo[m]> I am usually pretty meh about subclasses working on core types but we already were doing this in at least one other place
<enebo[m]> 3.8.1!!!!!
<enebo[m]> I am showing 3.6.3...this angers me :)
<headius[m]> yeah so I don't get it
<enebo[m]> no you are using a newer one so I am betting we are swapping possibly
<enebo[m]> or kares and one of us
<enebo[m]> but yours wants https instead of http right?
<enebo[m]> My anger comes from me literally installing latest maven (or so I thought) like 2 months ago
<enebo[m]> I clearly didn't
<headius[m]> checking diff
<headius[m]> mine is adding https
<headius[m]> for the xsd at least
<enebo[m]> yep...so after 3.6.x I am guessing they update to https
<enebo[m]> I will get 3.8.1 so we are at least in sync
<headius[m]> I have not been committing this
<headius[m]> but I can if you can confirm 3.8.1 does it for you too
<enebo[m]> you know what? I think I have not done it either :)
<enebo[m]> let me swap to master and see if I see it on a full rebuild
<headius[m]> that line in pom.xml has not updated since 2014
<enebo[m]> we both are good at working around stuff :)
<enebo[m]> we have been seeing this for a few months I think
<headius[m]> yeah I thought I was behind on maven but I guess I am ahead
<headius[m]> overflow PR applied and cherry-picked
<enebo[m]> and I thought I was ahead
<enebo[m]> Errno::ENOENT: No such file or directory - /home/enebo/work/jruby/lib/ruby/gems/shared/bin/rake
<enebo[m]> haha
<enebo[m]> some bootstrap problem switching branches
<enebo[m]> I think I will go to full repos per branch now
<headius[m]> ah yeah maybe does not have that last round of fixes for default gem installs
<headius[m]> yeah it is easier
<headius[m]> also intellij will not fall apart when you switch branches
<enebo[m]> intellij gets a bit confused in debugger until you refresh maven
<headius[m]> ugh 75 calls to ByteList(int)
<enebo[m]> confused == compile error
<headius[m]> yeah
<headius[m]> ok most of these are fixed sizes, whew
<enebo[m]> So I guess commit those poms and we will both no longer see this
<headius[m]> ok
subbu|lunch is now known as subbu
travis-ci has joined #jruby
<travis-ci> jruby/jruby (master:460bb69 by Charles Oliver Nutter): The build was broken. https://travis-ci.com/jruby/jruby/builds/225776824 [204 min 59 sec]
travis-ci has left #jruby [#jruby]
<travis-ci> jruby/jruby (jruby-9.2:69525f5 by Alex Pilon): The build is still failing. https://travis-ci.com/jruby/jruby/builds/225777009 [173 min 20 sec]
travis-ci has left #jruby [#jruby]
travis-ci has joined #jruby
<headius[m]> hmmm that NPE is new
<headius[m]> intermittent, I filed a bug the other day
<chrisseaton[m]> TruffleRuby uses ropes, but they're designed to be able to collapse to byte[] for simplicity, so that doesn't actually work around the length issue for us
<headius[m]> yeah at some point you still have to work with byte[] APIs
<chrisseaton[m]> At least if you centralise it you can change it when someone complains!
<headius[m]> I was wondering about that
<headius[m]> I assumed it was due to header size
<headius[m]> I wonder if there is a way to query this without reflection
<headius[m]> when searching for this I did not see a single post or answer that used anything except MAX_VALUE
<chrisseaton[m]> Should have been encoded in the spec, really. Not really tractable to play whack-a-mole like this.
<headius[m]> for sure
<headius[m]> -8 might be safest since ArrayList uses that but this should be queryable
<headius[m]> I suppose if there were a way to query, ArrayList would be using it
<headius[m]> so yeah most of these alloc paths do actually compare with MAX_VALUE but clearly that is not the right value to use
<chrisseaton[m]> It's only an issue when we've decided to allocate more than the user actually asked for. If the user asked for that much it can be a natural allocation fail.
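A sketch of the policy the last few messages converge on, assuming a soft cap of `Integer.MAX_VALUE - 8` (the value `ArrayList` is said to use above). The names `CapacitySketch`, `SOFT_MAX`, and `newCapacity` are hypothetical. The cap applies only to speculative over-allocation. A request that itself exceeds the cap is passed through so the allocation can fail naturally.

```java
public class CapacitySketch {
    // Assumed soft limit; the real per-JVM maximum is not queryable.
    static final int SOFT_MAX = Integer.MAX_VALUE - 8;

    public static int newCapacity(int oldLength, int minRequired) {
        if (minRequired < 0) {
            // The caller's own length arithmetic already wrapped.
            throw new OutOfMemoryError("required length overflowed int");
        }
        // Compute 1.5x growth in long so it cannot wrap.
        long preferred = (long) oldLength + (oldLength >> 1);
        if (preferred <= minRequired) {
            return minRequired;              // exactly what was asked for
        }
        if (preferred <= SOFT_MAX) {
            return (int) preferred;          // normal speculative growth
        }
        // Growth would exceed the soft cap: clamp the speculative part,
        // but never return less than the caller actually needs.
        return Math.max(minRequired, SOFT_MAX);
    }
}
```

For example, growing a 2,000,000,000-byte buffer by one byte yields `SOFT_MAX` rather than a wrapped or unallocatable size, while an explicit request for `Integer.MAX_VALUE` bytes is returned unchanged.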
ur5us has joined #jruby
<travis-ci> jruby/jruby (jruby-9.2:035c42c by Charles Oliver Nutter): The build is still failing. https://travis-ci.com/jruby/jruby/builds/225794046 [163 min 26 sec]