<JulesIvanicGitte>
(OK, it's a bit unfair, because it's mostly the warmup. We removed it from prod because we found a bug in our app, not because of JRuby)
<headius[m]>
Yeah, after warm up it doesn't look too far off
<headius[m]>
Definitely not where I'd like to see it
<headius[m]>
I'm going to be experimenting the next couple days with some jvm flags to shorten that warm-up curve, but the jvm gets in our way a little bit there
<headius[m]>
If you've got time, I still think the first thing we should do is focus on single-threaded throughput and try to get back to a comfortable level
<JulesIvanicGitte>
Do you think tuning the JRuby JIT can improve the response time of the first requests? Something like this: `-Xjit.threshold=0`
<enebo[m]>
Jules Ivanic (Gitter): That will force every method in the system to JIT. Even though JIT compiles happen off thread, once they finish all that new code will still need to compile and warm up in the JVM. That should dramatically slow down warmup
<JulesIvanicGitte>
Can something like `BUNDLE_DISABLE_EXEC_LOAD=true` affect JRuby performance?
shellac has quit [Quit: Computer has gone to sleep.]
shellac has joined #jruby
<CharlesOliverNut>
hey I'm back now
<CharlesOliverNut>
I'm thinking low JRuby JIT threshold partially, but also turning off some startup-time features of JVM and reducing compile thresholds
<CharlesOliverNut>
-XX:-TieredCompilation -XX:Tier4CompileThreshold=15000 (that's default, lower might kick it off sooner)
<CharlesOliverNut>
unsure if this will help or not...may impact peak perf, definitely will impact startup time, but may warm up more quickly
<CharlesOliverNut>
where does timers.after come from?
<CharlesOliverNut>
wow, timers has no locks at all
<CharlesOliverNut>
oops
<CharlesOliverNut>
wrong channel
claudiuinberlin has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<JulesIvanicGitte>
> -XX:-TieredCompilation -XX:Tier4CompileThreshold=15000 (that's default, lower might kick it off sooner)
<JulesIvanicGitte>
Does it work with G1?
<JulesIvanicGitte>
> where does timers.after come from?
<JulesIvanicGitte>
What are you talking about? I don't understand 🤔
<headius[m]>
G1 shouldn't have any effect on tiered compilation
<headius[m]>
timers lines were wrong channel
xardion has quit [Remote host closed the connection]
xardion has joined #jruby
shellac has quit [Quit: Computer has gone to sleep.]
<lopex>
headius[m]: wrt that encoding length, I'm astonished how small the surface area for that issue is (given MRI prevalidates strings under its own semantics)
<lopex>
it's like 3 issues for us from not having that approximate length
<lopex>
something stinks in mri then
<headius[m]>
that ArrayIndexOOB thing?
<lopex>
and infinite loops
<lopex>
basically in MRI it all goes like this
<lopex>
IO, string literals, and regexp literals are all validated
<lopex>
so there's little chance broken strings get into the guts
<lopex>
onigenc_mbclen_approximate and that return 1
<headius[m]>
So where is the logic in MRI that deals with this
<lopex>
no, that's a place where onigmo uses it
<lopex>
and yet, in 99.9999...% of cases we don't need that, since MRI prevalidates the input that will be fed to onigmo
<lopex>
so, basically we have two choices
<headius[m]>
ok
<headius[m]>
I'm with you so far
<headius[m]>
I guess I'm not clear why this doesn't get kicked out when it has a bad leading byte
<lopex>
mirror MRI, and waste perf on that additional length logic that will almost never fail (even then, that function is used mostly for parsing afaik)
<headius[m]>
for example
<headius[m]>
ah maybe that answers my question
<lopex>
oh I forgot
<lopex>
for [0xA4].pack("C")
<lopex>
the cr is unknown I suppose
<lopex>
and then String#grapheme_clusters traverses that
<lopex>
for me it's an edge case where nonvalidated broken input is fed right into joni
<lopex>
and, as a second option, we could pass a safe encoding version here
<headius[m]>
ahh
<headius[m]>
that does extra validation as it walks characters
<lopex>
and returns 1 for that byte
<headius[m]>
we could pass both and redo the match with safe encoding if it blows up
<headius[m]>
providing a proper error then
<headius[m]>
hamfisted approach maybe
<lopex>
it's so centralized that I guess we could hardcode `enc.getSafeVersion()` etc
<headius[m]>
how much overhead are we talking?
<headius[m]>
ah sure, that would be cleaner
<lopex>
that onigenc_mbclen_approximate in the wiki
<headius[m]>
oh ok
<lopex>
I would add a length version to Encoding interface though
<lopex>
so we have our old length and preciseLength
<lopex>
but encoding instance would determine if it's that approx version
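A rough sketch of the shape of this proposal, written in Ruby for brevity even though the real change would live in the Java encoding code. Every name here (`ToyEncoding`, `SafeEncoding`, `precise_length`, `length`, `char_lengths`) is hypothetical and the toy encoding is deliberately tiny; it only illustrates how the encoding instance, rather than each caller, could decide whether a bad byte surfaces as -1 or as an approximate length of 1.

```ruby
# Hypothetical toy encoding: ASCII plus two-byte sequences (lead 0xC2..0xDF,
# one trailing byte 0x80..0xBF). This is NOT the jcodings API.
class ToyEncoding
  # Returns the character length at bytes[i], or -1 if the bytes are invalid.
  def precise_length(bytes, i)
    lead = bytes[i]
    return 1 if lead < 0x80
    if (0xC2..0xDF).cover?(lead)
      trail = bytes[i + 1]
      return 2 if trail && (0x80..0xBF).cover?(trail)
    end
    -1
  end

  # "Old" length: trusts its input, so the error value leaks to the caller.
  def length(bytes, i)
    precise_length(bytes, i)
  end
end

# Hypothetical safe variant: same interface, but length never returns < 1.
class SafeEncoding
  def initialize(enc)
    @enc = enc
  end

  def precise_length(bytes, i)
    @enc.precise_length(bytes, i)
  end

  # Approximate length: treat an undecodable byte as a one-byte character.
  def length(bytes, i)
    len = @enc.precise_length(bytes, i)
    len > 0 ? len : 1
  end
end

# A caller that walks characters. Which encoding instance it receives
# decides whether broken input is tolerated.
def char_lengths(enc, bytes)
  i = 0
  lengths = []
  while i < bytes.length
    len = enc.length(bytes, i)
    raise ArgumentError, "invalid byte at #{i}" if len < 1 # caller-side guard
    lengths << len
    i += len
  end
  lengths
end

broken = [0xA4, 0x61] # the byte from [0xA4].pack("C"), followed by "a"
p char_lengths(SafeEncoding.new(ToyEncoding.new), broken) # prints [1, 1]
```

With the plain `ToyEncoding` the same input hits the caller-side guard instead, which is the case where a call site is expected to handle `< 0` itself.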
<lopex>
headius[m]: I included descriptions for those usages in mri in wiki
<headius[m]>
yeah I'm parsing it
<headius[m]>
so the -1 case from precise_mbc_enc_len gets converted to 1 by onigenc_mbclen_approximate
<headius[m]>
where we just use the -1 and then blow up
<lopex>
basically, for this case
<lopex>
but it's all a mess
<enebo[m]>
lopex: are you saying above that there are a limited set of known paths where a string is not guaranteed valid and those must potentially use approximate length?
<lopex>
enebo[m]: yeah
<lopex>
good way to put it
<enebo[m]>
lopex: how about the parser?
<lopex>
enebo[m]: actually 2 now
<headius[m]>
so if it can't figure out mbc length it needs to just assume length 1 to advance safely
<lopex>
enebo[m]: hah, which parser :P
<headius[m]>
and this is also how we end up with infinite loops because we use the -1 blindly to increment index and then walk back into the character again
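To make the failure mode concrete, here is a toy model in Ruby. The helpers (`precise_length`, `approximate_length`, `naive_walk`, `safe_walk`) are hypothetical and are not the jcodings or Onigmo API; the UTF-8 check is simplified, and the handling of the "missing bytes" case is an assumption made for this sketch. It only shows why adding a raw -1 to the index revisits the same character, and how an approximate length of 1 always advances.

```ruby
# Toy model: hypothetical helpers, simplified UTF-8 (no overlong or
# surrogate checks). Not the jcodings/Onigmo API.
INVALID = -1

# Positive length when a whole character is found, -1 for an invalid
# sequence, and -1 - n when n more bytes would be needed.
def precise_length(bytes, i)
  need = case bytes[i]
         when 0x00..0x7F then 1
         when 0xC2..0xDF then 2
         when 0xE0..0xEF then 3
         when 0xF0..0xF4 then 4
         end
  return INVALID if need.nil?

  available = bytes.length - i
  (1...[need, available].min).each do |k|
    return INVALID unless (0x80..0xBF).cover?(bytes[i + k])
  end
  available < need ? -1 - (need - available) : need
end

# Approximate length: always a positive step, whatever the input.
def approximate_length(bytes, i)
  len = precise_length(bytes, i)
  return len if len > 0
  return bytes.length - i if len < INVALID # missing bytes: consume the rest (sketch's choice)
  1                                        # invalid byte: step over it
end

# Buggy walk: adds the raw result to the index, so -1 steps backwards.
# max_steps exists only to keep this demo from looping forever.
def naive_walk(bytes, max_steps)
  i = 0
  visited = []
  max_steps.times do
    break if i >= bytes.length
    visited << i
    i += precise_length(bytes, i)
  end
  visited
end

# Safe walk: the approximate length always moves the index forward.
def safe_walk(bytes)
  i = 0
  visited = []
  while i < bytes.length
    visited << i
    i += approximate_length(bytes, i)
  end
  visited
end

bytes = [0x61, 0xA4, 0x62] # "a", a stray 0xA4, "b"
p naive_walk(bytes, 6) # prints [0, 1, 0, 1, 0, 1]
p safe_walk(bytes)     # prints [0, 1, 2]
```

If the bad byte sits at index 0, the naive index becomes -1 instead of bouncing back; in Java a negative array index throws `ArrayIndexOutOfBoundsException`, which may be how the same bug shows up as the ArrayIndexOOB case mentioned earlier.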
<lopex>
enebo[m]: mri uses parser_mbclen
<enebo[m]>
lopex: ah sorry, I mean the Ruby parser, but I suppose regexps are mostly validated through the Ruby parser as well
havenwood has quit [Remote host closed the connection]
<lopex>
enebo[m]: it used to use preciseLength and have its own guards
<lopex>
in callers
<headius[m]>
well it would be worth seeing if the extra checks add enough overhead to worry
<lopex>
enebo[m]: it's hard to explain since there are so many versions used inconsistently
<lopex>
yeah, we can always change Encoding.length
<headius[m]>
so I'm getting that it's the difference between get length, advance character vs get length, validate length, advance char
havenwood has joined #jruby
havenwood has quit [Changing host]
<lopex>
but you need to be careful, since it can be on a per-site basis when there's a guard against <0
<headius[m]>
well my gut says we should do what MRI does and we'll measure it
<lopex>
I also wanted to get rid of those intermediate char length tables for utf-8
<enebo[m]>
lopex: just one more question: precise returns -1, so is the guard there to see that and then do approximate length so it continues?
<lopex>
enebo[m]: in the callers ?
<enebo[m]>
wherever we run into the endless loops I guess
<lopex>
enebo[m]: I mean the callers might have the guards
<enebo[m]>
callers is a bit vague to me but I only partially read the conversation
<enebo[m]>
or I read it all but pretty quickly
<enebo[m]>
I should not have asked any questions but I wanted to understand your original statement (which you answered already)
<headius[m]>
so I think you are saying not all places should normalize bad char length to approx
<headius[m]>
which is why we need the separate path
<headius[m]>
and that's why you were suggesting we pass safe encoding into those paths that should approx
<lopex>
yeah, additionally for unsafe paths we could use a very simplified length version for validated strings
<lopex>
much more efficient than we have now
<lopex>
so there's that
<headius[m]>
when we know that we're handling it appropriately from those callers
<headius[m]>
ok I think we're on the same page
<lopex>
we will need to change some call sites anyways though
<headius[m]>
sweet my travis changes worked
<headius[m]>
rubyspec is hanging in a concurrent autoload spec
<headius[m]>
so that needs to be tagged and fixed
<headius[m]>
the other hang was clearly in the case folding stuff, so I'm stumped for the moment
<headius[m]>
I'll put the queue tests back though
<lopex>
enebo[m]: I forgot, but precise also gives missing as (-n - 1), right?
<headius[m]>
enebo: I switched all our rake-based targets to run directly rather than via that mvn PHASE stuff
<headius[m]>
so they aren't triple buffering output or whatever