<jswenson[m]> Looks like most of the nashorn stuff has a more complex class name (`jdk.nashorn.internal.scripts.Script$181$\^eval\_`) while the jruby stuff is typically more simple.
<jswenson[m]> Top few without stripping the end off the class names
<headius[m]> yeah they are probably encoding it more
<jswenson[m]> for reference the OQL for the second one was : `SELECT s.member.clazz.getName() FROM java.lang.invoke.DirectMethodHandle s`
<jswenson[m]> I'm using eclipse mat, not sure if that matters for these.
<headius[m]> it does, the visualvm oql sucks
<headius[m]> I am realizing that now
<jswenson[m]> had to use bash to group and sort
<headius[m]> that query is syntax error in visualvm, lame
<headius[m]> ok so basically we could filter by member.clazz.getName =~ /jruby/
<headius[m]> however that is done in MAT OQL
<headius[m]> there does seem to be a lot of duplication of DMH just in your results though
<jswenson[m]> can do this: `SELECT s.member.clazz.getName() FROM java.lang.invoke.DirectMethodHandle s WHERE s.member.clazz.getName().startsWith("org.jruby")` which yields 39k results (of the 66k total)
<headius[m]> ok that's pretty good
<headius[m]> 39k direct handles to JRuby classes seems like a lot to start
<jswenson[m]> that's the className and memberName
<jswenson[m]> `SELECT s.member.clazz.getName(), s.member.name.toString() FROM java.lang.invoke.DirectMethodHandle s WHERE s.member.clazz.getName().startsWith("org.jruby")`
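The grouping and sorting done in bash earlier can also be sketched in Ruby. This is a minimal sketch, assuming the OQL results were exported as plain text with one class name per line; the export format, the helper name `count_classes`, and the sample rows are all assumptions for illustration, not data from the actual heap dump.

```ruby
# Count how often each class name appears in a list of exported rows.
# Assumes one class name per line (hypothetical export format).
def count_classes(lines)
  counts = Hash.new(0)
  lines.each do |line|
    name = line.strip
    next if name.empty?
    counts[name] += 1
  end
  # Most frequent first; ties are broken by name so the order is stable.
  counts.sort_by { |name, count| [-count, name] }
end

# Made-up sample rows standing in for the real export.
sample = [
  'org.jruby.RubyString',
  'org.jruby.ir.targets.Bootstrap',
  'org.jruby.RubyString',
  ''
]

# Mirror the startsWith("org.jruby") filter from the OQL query.
jruby_rows = sample.select { |row| row.start_with?('org.jruby') }

count_classes(jruby_rows).each do |name, count|
  puts format('%6d %s', count, name)
end
```

For a real export, the rows could be read with `File.readlines` on whatever file MAT produced.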
<headius[m]> these are the kinds of sites I would expect to see even with indy off because these are literals and constants
<jswenson[m]> Yeah I think we must not have any indy on
<jswenson[m]> as we're not explicitly adding it anywhere and only disabling indy.yield
<headius[m]> some of these are being created by JVM and would be reduced in newer versions
<headius[m]> like those references to fstring in Bootstrap should be almost entirely in static data for jitted Ruby code and could use the same entry, but probably doesn't in Java 8
<headius[m]> ok so this is a good report on direct handles
<headius[m]> it doesn't look like a symptom of a problem at the moment
<headius[m]> maybe we can look at LF trees that are rooted by nashorn somehow
<jswenson[m]> I can query the 113k lambda forms, but I'm not sure where I can go from there.
<headius[m]> jswenson: well we are getting closer
<headius[m]> basically we need a report of where all the lambda forms are and what is holding onto them to see which ones are JRuby and could be reduced
<jswenson[m]> let me see if I can wrangle mat / OQL into doing what I want
<jswenson[m]> not sure where on a lambda form (or around a lambda form) it is hiding the information about why it was created / what created it
<headius[m]> yeah I am trying to figure that out... seems like it is not in the object but would have to be based on who is referencing it
<headius[m]> that is exactly what we want though, who created it and why
<jswenson[m]> Darn.
<headius[m]> I may be able to pull in a method handle expert
<headius[m]> I pinged Vladimir Ivanov from Oracle, who has done a large part of the work to reduce and reuse lambda forms
<headius[m]> he will be able to give us some pointers on investigating and perhaps tuning this stuff
<headius[m]> I have to drop off for the night soon so I can't help continue but you can keep poking around with MAT and we will see what Vladimir comes back with
<jswenson[m]> :thu
<jswenson[m]> * 👍️
<jswenson[m]> thanks for the help!
<headius[m]> I am looking to see if it is possible to turn more indy off, but that might only be on master
<headius[m]> on master we have worked toward a completely indy-free mode to use for native compiles, but that is not available in the 9.2 line
<headius[m]> from what I see, though, those are not unexpected DMH to see for normal mode, and those numbers don't seem to indicate a problem
<headius[m]> jswenson: yeah thanks for your patience... I want to help you but also figure out if we can do more to reduce our metaspace load
travis-ci has joined #jruby
<travis-ci> jruby/jruby (master:5ccdc58 by Charles Oliver Nutter): The build is still failing. https://travis-ci.com/jruby/jruby/builds/216882491 [165 min 3 sec]
travis-ci has left #jruby [#jruby]
<headius[m]> bout ready to give up on Travis
<headius[m]> nonsensical maven failures... restarted
travis-ci has joined #jruby
<travis-ci> jruby/jruby (master:5ccdc58 by Charles Oliver Nutter): The build is still failing. https://travis-ci.com/jruby/jruby/builds/216882491 [162 min 16 sec]
travis-ci has left #jruby [#jruby]
<headius[m]> seems like something is still corrupting the shared maven cache on travis, but removing it would add a lot of time and noise to the builds
ur5us_ has joined #jruby
ur5us_ has quit [Remote host closed the connection]
ur5us_ has joined #jruby
<travis-ci> jruby/jruby (master:5ccdc58 by Charles Oliver Nutter): The build is still failing. https://travis-ci.com/jruby/jruby/builds/216882491 [159 min 9 sec]
travis-ci has left #jruby [#jruby]
travis-ci has joined #jruby
<headius[m]> hosed
ur5us_ has quit [Ping timeout: 264 seconds]
<headius[m]> clearing caches and restarting builds is working, but I guess it is just time to leave
ur5us_ has joined #jruby
<travis-ci> jruby/jruby (master:5ccdc58 by Charles Oliver Nutter): The build was fixed. https://travis-ci.com/jruby/jruby/builds/216882491 [225 min 22 sec]
travis-ci has joined #jruby
travis-ci has left #jruby [#jruby]
ur5us_ has quit [Ping timeout: 264 seconds]
travis-ci has joined #jruby
travis-ci has left #jruby [#jruby]
<travis-ci> jruby/jruby (jruby-9.2:2c3c16f by Charles Oliver Nutter): The build was fixed. https://travis-ci.com/jruby/jruby/builds/216887019 [177 min 59 sec]
<headius[m]> double green, good time to call it a night
<headius[m]> kalenp: copy_stream fix is in
rdubya[m] has quit [Quit: Idle for 30+ days]
travis-ci has joined #jruby
<travis-ci> jruby/jruby (kares-patch-joda+asm:f71bfe2 by Karol Bucek): The build was fixed. https://travis-ci.com/jruby/jruby/builds/216899731 [201 min 55 sec]
travis-ci has left #jruby [#jruby]
rdubya[m] has joined #jruby
<Freaky> headius[m]: got my json reproduction down to two files
<headius[m]> oh rad
<Freaky> ish
<Freaky> hmmm
<Freaky> this file loads a large JSON payload
<headius[m]> you want to try to mess with the JVM JIT and see if it goes away, try -XX:TieredStopAtLevel=1
<headius[m]> could still be jit but it would implicate C2
<headius[m]> flip side would be -XX:-TieredCompilation which will use only C2
<Freaky> is JIT partially time based?
<headius[m]> invocation count triggers it but it happens asynchronously once triggered
<headius[m]> normal tiered execution will compile a simpler form with profiling and then later recompile an optimized form
<Freaky> right, keeps running the interpreted/lower tiers until the new code is ready
<headius[m]> that is tier 3 (C1 client compiler with profiling) and tier 4 (C2 server compiler)
<headius[m]> but we can force it to use only one and maybe skip to the conclusion of this exciting story
<headius[m]> TBH I doubt it is JVM JIT just because it is pretty rare to find a bug
<Freaky> right
<Freaky> GIT2SVN = JSON.parse(IO.read('db/git2svn.json')).freeze
<Freaky> if I comment that line out, it goes away
<headius[m]> hmm
<Freaky> but it isn't enough on its own..
<Freaky> crash damn you
<Freaky> it's certainly harder to reproduce with this cut down environment
<Freaky> it's like it's sensitive to just the amount of code that happens to be floating about
<Freaky> meh
<Freaky> let's try jdb again, now I can reproduce it fairly reliably
<headius[m]> yeah need to look at the incoming string at the point it raises the error, checking whether it has a weird encoding or bad offsets or something
<headius[m]> first place to look anyway
<Freaky> hmm
<Freaky> │Breakpoint hit: "thread=main", json.ext.Parser$ParserSession.parseImplemetation(), line=2,331 bci=528
<Freaky> nearly ;)
<Freaky> off by 2
<Freaky> observation appears to make it go away
<headius[m]> of course it does
<Freaky> ran it a dozen times with the debugger, nothing
<Freaky> ran it once without, boom
<Freaky> I don't suppose there's some way I can dump the object from JRuby after the fact?
<Freaky> both JIT options make it go away too
<headius[m]> how certain are you of that?
<headius[m]> I think the next option might be to build a version of the json ext that has some extra logging
<Freaky> yeah, was thinking that
<headius[m]> but that might fix it too 🤷‍♂️
<Freaky> TieredStopAtLevel=1 on iteration 20 and still nothing, -TieredCompilation on 10 and nothing
<Freaky> baseline managed 2 iterations
<headius[m]> wouldn't be the first time we found a bug in tiered
<headius[m]> you could try level=3 and see if it comes back
<headius[m]> I have some folks that could look into a jit issue if we have a repro
<headius[m]> I can even open a bug on OpenJDK once we are sure 😀
<Freaky> as I recall from the compilation log just before it happened was a deopt from level 4 to 3, iirc?
<Freaky> 26634 14095 4 json.ext.Parser$ParserSession::parseObject (962 bytes) made not entrant
<headius[m]> yes we did see that
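To find deopt events like the one quoted above in a longer compilation log, a small filter can help. This is a rough sketch, assuming log lines have the same shape as the quoted one (timestamp, compile id, tier, method, size, then "made not entrant"); the helper name `parse_not_entrant` is hypothetical, and lines with extra columns (for example OSR compiles) are simply skipped.

```ruby
# Parse a compilation-log line that ends in "made not entrant".
# Returns a hash of fields, or nil when the line has a different shape.
def parse_not_entrant(line)
  pattern = /\A\s*(?<timestamp>\d+)\s+(?<compile_id>\d+)\s+(?:[%s!bn]+\s+)?(?<tier>\d)\s+(?<method>\S+)\s+\((?<size>\d+) bytes\)\s+made not entrant/
  m = pattern.match(line)
  return nil unless m

  {
    timestamp: m[:timestamp].to_i,   # first column of the log line
    compile_id: m[:compile_id].to_i,
    tier: m[:tier].to_i,             # 4 means the C2-compiled version was thrown away
    method: m[:method],
    size_bytes: m[:size].to_i
  }
end

# The line quoted in the conversation.
line = '26634 14095 4 json.ext.Parser$ParserSession::parseObject (962 bytes) made not entrant'
event = parse_not_entrant(line)
puts "tier #{event[:tier]} code for #{event[:method]} was invalidated" if event
```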
<Freaky> nothing with StopAtLevel=3
<headius[m]> ok
<Freaky> right
<Freaky> aside from the giant JSON file I have it pretty minimal now
<headius[m]> that is good
<headius[m]> once you get to a point you can share it just attach something to the issue or link a repo
<Freaky> yep
<kalenp[m]> exciting to see progress on this. hope it's relevant to our own json issues 🤞
<Freaky> eh
<headius[m]> seems unlikely they wouldn't be related 😀
<Freaky> yep
<Freaky> big JSON file, that seems to be all it takes
<headius[m]> sweeet
<headius[m]> bbl
<jswenson[m]> @kalen
<jswenson[m]> * @kalenp I wonder if we're doing any massive json parsing here.
<kalenp[m]> Some of our known instances are actually on fairly small json objects (2 keys and short values). but perhaps the large json just exercises it intensely enough to show up in a reasonable amount of time
<jswenson[m]> From the looks of the repro, it looks like simply loading the large JSON will make it so a smaller load will fail later.
<jswenson[m]> Loads the big bork.json (around 2.5MB) then loads a simple json payload in a loop.
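The shape of the repro described here (one large parse, then many small parses) might look roughly like the following. This is not Freaky's actual script: the generated document stands in for bork.json, the small payload and the helper names `build_large_json` and `run_repro` are made up, and rescuing `StandardError` is an assumption because the exact error class is not stated in the conversation.

```ruby
require 'json'

# Build a large JSON document in memory as a stand-in for the real file.
# The keys and values are invented; only the overall size matters here.
def build_large_json(entries)
  data = {}
  entries.times { |i| data["key#{i}"] = "value-#{i}" * 4 }
  JSON.generate(data)
end

# Parse the large document once, then parse a small payload repeatedly.
# Returns the iteration at which parsing failed, or nil if it never failed.
def run_repro(large_json, iterations)
  JSON.parse(large_json).freeze
  small = '{"a": 1, "b": "two"}'
  iterations.times do |i|
    begin
      JSON.parse(small)
    rescue StandardError => e
      warn "iteration #{i}: #{e.class}: #{e.message}"
      return i
    end
  end
  nil
end

failed_at = run_repro(build_large_json(45_000), 10_000)
puts failed_at ? "failed at iteration #{failed_at}" : 'no failure observed'
```

On a runtime without the bug this should simply report that no failure was observed.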
<jswenson[m]> Thus far I cannot repro, but I haven't tried on jdk15 yet
<kalenp[m]> well, we certainly do serialize large json objects. will have to think about de-serializing large json
<kalenp[m]> interesting, I'm able to get the repro on java 16 (no 15 available in my current environment), but on java 11, it's running just fine for several minutes. so, either we have two separate issues, or the particular details of how to trigger the bug have changed
<headius[m]> could be different heuristics for when and how it jits that aren't being hit with this example on 11
<Freaky> yeah, I've not seen it <15
<kalenp[m]> on further reflection, we definitely do parse some large json objects
<Freaky> the smaller the initial JSON the less reliably I can reproduce it
<Freaky> maybe it has different sensitivities on different versions
<headius[m]> I'm wrapping up another investigation and then I will try to repro locally too
<Freaky> kalenp[m]: any JVM tuning?
<headius[m]> ah that could make a difference
<Freaky> JRUBY_OPTS='-J-XX:+UnlockExperimentalVMOptions -J-XX:+UseShenandoahGC' seems to stop it as well
<Freaky> also SerialGC and ParallelGC from the look of it
<headius[m]> grr G1
<headius[m]> I would not be surprised if this ends up being a G1 bug
<jswenson[m]> We're definitely using G1
<headius[m]> worth throwing another GC at it if you can afford to try it
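The JIT and GC variations tried so far can be collected into one table of `JRUBY_OPTS` values to make re-running them less error-prone. This is only a sketch: the labels and helper names are made up, `-XX:+UseSerialGC` and `-XX:+UseParallelGC` are assumed to be the flags meant by "SerialGC and ParallelGC", and `repro.rb` in the commented-out line is a hypothetical script name.

```ruby
# JVM flag sets mentioned in the discussion, keyed by a made-up label.
def jvm_configs
  {
    'baseline'   => [],
    'c1-only'    => ['-XX:TieredStopAtLevel=1'],
    'stop-at-3'  => ['-XX:TieredStopAtLevel=3'],
    'c2-only'    => ['-XX:-TieredCompilation'],
    'shenandoah' => ['-XX:+UnlockExperimentalVMOptions', '-XX:+UseShenandoahGC'],
    'serial'     => ['-XX:+UseSerialGC'],   # assumed spelling of the SerialGC flag
    'parallel'   => ['-XX:+UseParallelGC']  # assumed spelling of the ParallelGC flag
  }
end

# JRuby passes JVM flags through when they are prefixed with -J.
def jruby_opts(flags)
  flags.map { |flag| "-J#{flag}" }.join(' ')
end

jvm_configs.each do |label, flags|
  env = { 'JRUBY_OPTS' => jruby_opts(flags) }
  puts "#{label}: JRUBY_OPTS='#{env['JRUBY_OPTS']}'"
  # To actually run a repro under this configuration (not executed here):
  # system(env, 'jruby', 'repro.rb')
end
```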
<headius[m]> jswenson: you probably didn't see this on Twitter but I got some assistance from Nashorn folks and there's a way to turn off its aggressive optimizations
<headius[m]> I'll add this to the issue