lopex has quit [Quit: Connection closed for inactivity]
_whitelogger has joined #jruby
neck has joined #jruby
throsturt has joined #jruby
<throsturt> I'm running marid integration server for opsgenie which supports ruby scripting via jruby, but I'm having this problem that the /tmp directory keeps getting filled up with jruby*.jar files, many of which are identical, and fuser shows that 1622 of 1904 are still being held. Is this a known bug with jruby?
drbobbeaty has joined #jruby
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
neck has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
<headius[m]> Hello there
<headius[m]> throsturt: What's in those files?
<headius[m]> I know we unpack at least one small binary for calling native code but I don't recall any jruby*.jar files for which we do that
<throsturt> headius[m]: a bunch of classes, what specifically should I be looking for?
<headius[m]> throsturt: It sounds like it's the JRuby jar itself...perhaps this marid/opsgenie thing is doing it? I don't believe this would be anything we're doing
<headius[m]> gist a listing of files for me to confirm
<headius[m]> if it's the JRuby jar contents then it's definitely not something we're doing
<throsturt> there's thousands of jars in here, any particular one you want to see the contents of?
<throsturt> well not thousands anymore since we restarted but here's an ls: https://gist.github.com/ThrosturX/63299d46e926ef20474936d4c2f4e1ae
<headius[m]> any one
<headius[m]> hmmm
<headius[m]> ok this might be something else then
<headius[m]> do you know what version of JRuby it's running?
<throsturt> 2.5.3 IIRC
<headius[m]> this looks like how we deal with nested jars; they get unpacked to a temp location so that we can add them to the JVM's classloaders correctly
<throsturt> but the file handles are never released
<headius[m]> that's the compatible Ruby version, but at least that tells me it's 9.2.x
<throsturt> leads to an out of memory error eventually
<headius[m]> yeah I can imagine
<throsturt> how do I check the jruby version?
<headius[m]> JRUBY_VERSION inside a Ruby script should show it
<headius[m]> I'm not familiar with this thing
<throsturt> 9.2.8.0
<throsturt> I ran the jar with -v
<throsturt> jruby 9.2.8.0 (2.5.3) 2019-08-12 a1ac7ff OpenJDK 64-Bit Server VM 11.0.4+11-LTS on 11.0.4+11-LTS +jit [linux-x86_64]
throsturt is now known as throstur
<headius[m]> in order to add the jar to our classloader it needs to have an accessible filesystem URL, so we unpack it to temp and then add that location
<headius[m]> But we do clean up those URLs when the JRuby instance gets shut down...IF it gets shut down
<headius[m]> the close() method later in that file triggers a cleanup of all those URLs and delete of the files
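The unpack-to-temp mechanism described above can be sketched with plain JDK APIs — a hypothetical illustration, not JRuby's actual internals: a jar nested inside another jar has no filesystem URL, so its bytes are copied to a temp file whose URL a classloader can use. Cleanup depends on that file later being deleted, which is exactly what fails if shutdown never runs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class NestedJarLoader {
    // Copy a nested jar's bytes out to a temp file so a URLClassLoader
    // can reference it by filesystem URL. (Hypothetical sketch.)
    public static URL unpackToTemp(InputStream nestedJar) throws IOException {
        Path temp = Files.createTempFile("jruby", ".jar");
        temp.toFile().deleteOnExit(); // only helps if the JVM actually exits
        Files.copy(nestedJar, temp, StandardCopyOption.REPLACE_EXISTING);
        return temp.toUri().toURL();
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeJar = {0x50, 0x4b}; // stand-in bytes, not a real jar
        URL url = unpackToTemp(new ByteArrayInputStream(fakeJar));
        System.out.println(url.getProtocol()); // file
    }
}
```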
<throstur> ah... IF it gets shut down
<headius[m]> if that's not firing then the JRuby instances aren't being shut down after use, so you accumulate more and more
<throstur> but we have a long-lived process that doesn't seem to kill jruby
<throstur> so do I report that as a bug to OpsGenie then/
<throstur> s/\//?/
<headius[m]> if it keeps making new JRuby instances and never calling org.jruby.Ruby.tearDown or ScriptingContainer.tearDown, this would happen
<headius[m]> yes I think so
<headius[m]> a bit later in that file we also set up a finalize method that will clean up non-singleton (non-global, basically) instances when they GC
<headius[m]> so that's another possible way to explain the bug...they are creating singleton JRuby instances and walking away from them without terminate
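The fix being suggested looks roughly like this on the embedding side — a hedged sketch, since the actual OpsGenie/Marid code isn't shown here. `MaridScriptRunner` and the script are hypothetical, but `ScriptingContainer`, `LocalContextScope`, and `terminate()` are the real JRuby embed API (JRuby must be on the classpath):

```java
import org.jruby.embed.LocalContextScope;
import org.jruby.embed.ScriptingContainer;

public class MaridScriptRunner {
    public static Object runScript(String script) {
        // Non-global context so the runtime can be GC'd, plus an explicit
        // terminate() so org.jruby.Ruby.tearDown runs and the unpacked
        // /tmp/jruby*.jar files are released and deleted.
        ScriptingContainer container =
            new ScriptingContainer(LocalContextScope.SINGLETHREAD);
        try {
            return container.runScriptlet(script);
        } finally {
            container.terminate();
        }
    }

    public static void main(String[] args) {
        System.out.println(runScript("1 + 1")); // prints 2
    }
}
```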
<headius[m]> Hopefully this is enough for them to fix the issue...feel free to tag @headius on your issue
<throstur> thanks
<throstur> I actually already created the issue assuming this was the problem but I'll go ahead and forward these details to them
<headius[m]> I can't think of an easy workaround for you now...you could try to shut this down from within Ruby code but that would be unpredictable since obviously you're still running code
<headius[m]> good luck!
<throstur> thanks headius! you've been incredibly helpful
<headius[m]> I hope they are able to tidy this up quickly! Unfortunately this is the only way to ship jars-in-jars and be able to access them
<throstur> I'm sure they'll do their best, I created the ticket at level 1 priority so I'm sure they'll get on it soon, obviously this is something that shouldn't fly in an enterprise solution
<throstur> as I understand it they are selling their product for copious amounts
<headius[m]> well I think we can expect some help then 🙂
<headius[m]> if they need any assistance they can stop by here or file an issue on jruby/jruby
lopex has joined #jruby
drbobbeaty has joined #jruby
sagax has quit [Quit: Konversation terminated!]
jrafanie has joined #jruby
sagax has joined #jruby
<headius[m]> Yeah I think I'm just going to change the way the JIT lazily initializes these cached values to always use indy
<headius[m]> the worst outcome will be the overhead of bootstrap+linking of those sites but every access of symbols, bytelists, fixnums, block bodies, and a dozen other things will optimize like constants then
<headius[m]> we're paying untold costs in unelidable static load + null check on these cache fields
<headius[m]> old JIT paid a different cost, initializing them all up front, but still static load on every access and likely other barriers to optimizing them as constants
<headius[m]> Not to mention every one of these caches emit a whole method body + caching bytecode
<headius[m]> re: meatspace
<headius[m]> method entries are definitely clogging up meatspace
<enebo[m]> can you measure the change in memory?
<headius[m]> I haven't gotten to that point yet for anything but call sites
<headius[m]> not that we have a great way to measure meatspace anyway
<enebo[m]> I am working on this from a different angle but in all cases we need something which we can measure from startup/warmup/eventual/memory perspective
<enebo[m]> And I am thinking more and more it is like n things
<enebo[m]> gem list, rails app, tighter loop bench
<enebo[m]> for metaspace I think the rg.org running 100ks of requests maybe will show something
<enebo[m]> but I am not in love with that idea
<enebo[m]> and my main issue with the rg.org bench from this spring was that I think it is too simple (we talked about this yesterday)
<enebo[m]> it is much better than a single controller app though
<headius[m]> well the other problem with rg.org is setting it up
<headius[m]> I mean once I have it working here it's working, I guess, but it's a lot of crap
<enebo[m]> oh yeah that is definitely true
jrafanie has quit [Quit: Textual IRC Client: www.textualapp.com]
<headius[m]> you know maybe we should look at redmine again
<headius[m]> because I know that worked well at some point
<headius[m]> and it's fairly self-contained
<enebo[m]> I have some notes and plan on setting it up for the nuking branch to see if memory radically decreases but it is a pain
<enebo[m]> yeah redmine would be a good one
<headius[m]> it may be very close to working right now
<headius[m]> it just wasn't maintained for JRuby
<headius[m]> I knew a guy selling packaged Redmine + JRuby + OpenSolaris servers for a while
jrafanie has joined #jruby
<enebo[m]> so I always wonder if people ever make their own bench suites for their own apps
<enebo[m]> like I am sure basecamp does from tweets but I don't know how common that is
<headius[m]> I would if I did any real work
<headius[m]> but that probably falls somewhere below testing if you're paid to build an app
<enebo[m]> if redmine had their own bench it would be perfect since it would not be made by us, so it would be more impartial, and of course we could compare and not have to make it
<enebo[m]> OSS apps maybe have less resources who care to do it
<headius[m]> so most of these things I want to cache already have indy paths that are pretty tight...like symbol just throws an invokedynamic + string into bytecode and from then on you get the symbol as a constant
<headius[m]> I can just remove the non-indy logic and they're done and won't emit all this extra bytecode
<headius[m]> oh and the way I'm caching these is that each literal is a small method
<headius[m]> so...how many literals in a typical method
<headius[m]> N * method * bytecode size, plus it's slow and non-constant
<enebo[m]> literals? you mean symbols or identifiers
<headius[m]> symbols, fixnums, bytelist for strings, all that
<headius[m]> frozen strings
<headius[m]> block bodies
<enebo[m]> ah in operand sense
<headius[m]> yeah operand
<enebo[m]> well it seems quite a few then
<enebo[m]> not most of a method but not uncommon
<headius[m]> so they'd all boil away to indy + call site + constant method handle and fold like any constant
<headius[m]> emitted native code should end up just being a MOV
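What "boil away to indy + call site + constant method handle" means can be shown with plain JDK invoke APIs — a sketch under assumed names (`symbolBootstrap` and `intern` are illustrative stand-ins, not JRuby's actual code):

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class LiteralBootstrap {
    // Hypothetical bootstrap for a literal: resolve the value once, then
    // bind the call site permanently to a constant MethodHandle so the
    // JVM JIT can fold every later access to a plain constant load.
    public static CallSite symbolBootstrap(MethodHandles.Lookup lookup,
                                           String name, MethodType type,
                                           String symbolName) {
        Object cached = intern(symbolName); // stand-in for runtime.newSymbol(...)
        MethodHandle constant =
            MethodHandles.constant(type.returnType(), cached);
        return new ConstantCallSite(constant);
    }

    static String intern(String s) { return (":" + s).intern(); }

    public static void main(String[] args) throws Throwable {
        // Simulate what the invokedynamic instruction would do:
        MethodType type = MethodType.methodType(Object.class);
        CallSite site = symbolBootstrap(MethodHandles.lookup(), "symbol", type, "foo");
        Object a = site.dynamicInvoker().invoke();
        Object b = site.dynamicInvoker().invoke();
        System.out.println(a + " " + (a == b)); // same cached object every time
    }
}
```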
<enebo[m]> yeah my only concern then would be warmup but my above sentences points out a pretty big need for this anyways
<enebo[m]> I want to prune how many things JIT and that requires knowing warmup and eventual speed
<headius[m]> I think it will be mostly a wash
<headius[m]> these lazily init via get field + null check + init + set right now anyway
<enebo[m]> but I definitely want something larger to use for that since the people who have traditionally complained about warmup has been larger apps
<headius[m]> and it's not like call sites which generate adapted handle chain...it's just a single method handle for the value
<headius[m]> I can POC making all literals indy and we can try something
<headius[m]> it's mostly just deleting the non-indy logic
<enebo[m]> well I am not trying to be discouraging on this either as much as I want to stop reasoning about it and try and measure it a bit more
<headius[m]> sure
<enebo[m]> and for my stuff it is impossible to reason I think
<enebo[m]> I do know much of our JITting is not needed for us to be fast although being two stacked runtimes complicates proving how true that is
<headius[m]> heh
<enebo[m]> but I can see things like pruning infrequent calls can slow down a runtime but I am "reasoning" that back edges of loops will get most or all of that back
<headius[m]> ok this is trivial
<enebo[m]> with that said I still have no reasonable benchmarks to know how good or bad the situation is
<headius[m]> they just restored a hard dep on rmagick and redcarpet
<headius[m]> so it may be like 95% working
<enebo[m]> so some platform: stuff
<enebo[m]> wow rmagick is still a thing isn't it
<headius[m]> yeah ffs
<enebo[m]> I wonder how well rmagick4j works now. It should be fine unless they added new features
<headius[m]> yeah it may just need some cleanup
<enebo[m]> Feels like that library is so old it must not change massively
<enebo[m]> if that
<enebo[m]> It was not 100% of rmagick either since it is huge
<headius[m]> also a question of what redmine actually needs from it
<headius[m]> like what, thumbnails?
<headius[m]> it's not instagram
<enebo[m]> I don't recall how much was done by the end of that gsoc but it was enough for most common things
<enebo[m]> yeah good question...image_voodoo + image_science for MRI side may be a nice contribution if it is just thumbnails
<headius[m]> it would make a way better app for profiling
<enebo[m]> and I can adapt image_voodoo to do anything but I don't want to implement a graphics library in it so it has to mostly overlap already
<headius[m]> sure
<enebo[m]> rmagick is pretty crazy as a dep
<headius[m]> yeah I am always amazed at the amount of crap people have to install to support all these exts
<enebo[m]> it generates some big string like you would generate for a command-line and then on back side gets an image back
<enebo[m]> Nov 10, 2017 is last time my redmine dir was mucked with
<enebo[m]> + when /jdbcmysql/
<enebo[m]> + gem "activerecord-jdbcmysql-adapter", "51.0"
<enebo[m]> ok so we have been here before :)
<enebo[m]> headius: we looking at 3.4.3 of redmine or are you looking back to see when we last worked
<enebo[m]> ./lib/redmine/thumbnail.rb:require 'mimemagic'
<enebo[m]> looks like only thumbnail but I see there is a font path
<enebo[m]> so image_science/voodoo do not add fonts
<enebo[m]> ok font is for some gantt chart
jrafanie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<headius[m]> ugh
<headius[m]> I forgot about dregexp /o
<headius[m]> I use an atomic reference that's lazily initialized under class synchronization
<headius[m]> talk about gross
<headius[m]> this should clearly just use indy because that's disgusting
<headius[m]> the whole /o atomicity thing is a real pain
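The /o atomicity pattern mentioned here (a once-only interpolated regexp must be built at most once, even under races) can be sketched with an `AtomicReference` — names are hypothetical, not JRuby's:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.regex.Pattern;

public class OnceRegexp {
    // Sketch of /o semantics: the first computed pattern wins and every
    // later evaluation reuses it, ignoring re-interpolated input.
    private final AtomicReference<Pattern> cached = new AtomicReference<>();

    Pattern get(String interpolated) {
        Pattern p = cached.get();
        if (p == null) {
            // compareAndSet keeps the first winner so all threads agree
            cached.compareAndSet(null, Pattern.compile(interpolated));
            p = cached.get();
        }
        return p;
    }

    public static void main(String[] args) {
        OnceRegexp r = new OnceRegexp();
        Pattern first = r.get("a+");
        Pattern second = r.get("b+"); // ignored: /o evaluates only once
        System.out.println(first == second); // true
    }
}
```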
<enebo[m]> I doubt we even have more than a dozen dregexps in a typical app
<headius[m]> yeah that's true I guess
<headius[m]> it's just a lot of code to emit for what should basically just be a constant after init
<headius[m]> ick
<headius[m]> booleans always go through context.runtime.getTrue etc
<headius[m]> I never updated this to use context.tru and fls
<headius[m]> fals
<headius[m]> so that's two field loads plus a virtual dispatch
<enebo[m]> my statement is not arguing for one impl or another. I think indy is totally valid in something not used much since it is not going to add much to warmup
<headius[m]> I guess this is a good exercise in light of our meatspace problems
<enebo[m]> oh and I am confused about redmine
<enebo[m]> I must have a non-normative redmine
<enebo[m]> 3.0.4 is latest on a pull but 4.0.4 is out
<headius[m]> hmm did they move repo?
<headius[m]> to gitlab or something
<enebo[m]> "You will need to download a copy of the current development-code. The official code repository is located in Subversion and can be downloaded by following the Download instructions.
<enebo[m]> "
<enebo[m]> I think their mirroring is busted maybe
<enebo[m]> I am cloning from github
<headius[m]> subversion ffs
<headius[m]> what decade is this
<enebo[m]> Author: Go MAEDA <maeda@farend.jp>
<enebo[m]> Date: Thu Sep 12 12:51:23 2019 +0000
<enebo[m]> wtf... maybe they changed tagging style
<headius[m]> ok weird
<enebo[m]> yeah 3.4.3 is last tag I see in my repo
<enebo[m]> I really don't want to bag on people's choices but SVN...It does not really do anything better than git. I feel people who prefer it just really love revision #s
<enebo[m]> I have never been a fan of git's command-line but once you make it past tolerating it then it pretty much beats svn in all ways other than the logical simplicity of revision #s
<enebo[m]> I have seen arguments about how it is more recoverable but I have never had an unrecoverable git repo ever
<headius[m]> hah I found a bug
<headius[m]> JIT for a symbol proc is still using java.lang.String
<headius[m]> nobody must be using symbol procs with weird encodings
<enebo[m]> anyways to each their own...I guess we should find 4.0.4 tag
<enebo[m]> well that is probably almost entirely true
<headius[m]> yeah I have been seeing a lot of openjdk folks moaning about git lately
<enebo[m]> very very few method names end up as non 7bit in my experience
<headius[m]> mostly because it does a lot of things just different enough from hg to be confusing
<enebo[m]> yeah I have mixed feelings on hg. I used to prefer it but once you get used to git it changes how you think about repos
<headius[m]> but then they talk about having to fiddle with refs directly and I'm wondering what the hell they're doing
<enebo[m]> but in my mind it is a big step up from svn and mostly just different than git
<enebo[m]> they decided to have some epic subtree system
<headius[m]> I got a bunch of replies to one thread showing how hg and git use different words for similar concepts
<headius[m]> that was mostly my problem trying to use both
<headius[m]> apparently hg does have branches very similar to git but they're not called branches
<headius[m]> so I never found that
<enebo[m]> they = openjdk
<headius[m]> yeah mistakes were made
<headius[m]> hg forests for one
<headius[m]> hg patch queues for another
<headius[m]> both deprecated features now
<enebo[m]> My main reservation of hg which may have changed is that it seemed a lot slower than git
<enebo[m]> yeah
<enebo[m]> My question was wondering how important it was to break up the source in the first place
<enebo[m]> At the time they did it bandwidth was more painful
<headius[m]> I think that has improved but it's still largely in python where git is mostly C
<headius[m]> so they're fighting an uphill battle there
<enebo[m]> but most people were building the whole thing anyways
<headius[m]> I guess the perf issues with openjdk repos were mostly fixable too
<headius[m]> like they weren't caching or compressing or something
<headius[m]> but I am glad they're moving to git
<enebo[m]> yeah and having the mirror already makes it much easier
<enebo[m]> ok 4.0-stable is a branch which has the tag I think
<enebo[m]> err no but this is latest code
<enebo[m]> 4.0-stable should be pretty reasonable even from HEAD assuming they only merge over stable changes
<headius[m]> ok cool
<headius[m]> so they are doing things the right way and having release work on branches
<headius[m]> "right way"
<enebo[m]> It looks like they branch on release
<headius[m]> I guess my problem with that is why have master at all then
<headius[m]> nobody pushes to master in the "right way" so what is master
<enebo[m]> ""right way"" I confirm this bug
<headius[m]> hah
<enebo[m]> master is 5 maybe
<enebo[m]> 5.0 ends up being 5.0-stable would be my guess
<enebo[m]> I do not feel there is a truly great way of doing this stuff
<enebo[m]> diverging code and focus is tough if you immediately branch
<enebo[m]> if you keep it too long then you are probably messing with progress at the cost of speeding up stability
<enebo[m]> I think we mostly have this right. If we take too long we make a next-version branch until we cut a stable branch and then convert master back to the next release
<enebo[m]> but it is a "feeling" thing which I am sure bugs many people who follow how we work
<headius[m]> if we had like five active branches it would make sense to do it that other way I guess
<headius[m]> but we only have two
<headius[m]> and usually one
<enebo[m]> yeah by design largely
<enebo[m]> if we had a team of 10 full time people I think we would do this differently
<headius[m]> enebo: so do you have something you want to try to run against this indy_literal branch?
<headius[m]> I mean to put your mind at ease about startup and maybe warmup
<enebo[m]> well no I don't
<enebo[m]> This is why redmine came up
<headius[m]> right ok
<enebo[m]> I think our main problem is the stuff we run to see if indy is running well tends to be quicker and smaller
<headius[m]> we can test this with the usual round of gem and rails commands for the moment
<headius[m]> it's a weird case because this only affects non-dev mode
<enebo[m]> The people who complain about warmup (which is not only on indy) end up being much larger applications
<headius[m]> which is what we still recommend for startup time
<enebo[m]> I mean if we land something now and realize it hurts warmup when we get this running then we have options too
<enebo[m]> I just cannot reasonably land my experiment without something more substantial because it is less clear on possible impacts
<headius[m]> this is going to make bytecode output way smaller
<enebo[m]> but complaints on warmup are exclusively non-dev too so we just need some gauntlet of use cases to see how changes are affected
<headius[m]> it could be better than half
<enebo[m]> have you tried parser gem yet?
<headius[m]> every literal and every call site was getting a method with a dozen bytecodes; now it's down to load context + invokedynamic
<enebo[m]> Will something fit now? That would spice things up :)
<headius[m]> parser gem?
<headius[m]> definitely
<enebo[m]> whitequark
<headius[m]> the current jit max size is based on total .class size
<enebo[m]> but to play devils advocate why not replace the entire method with indy
<headius[m]> and this logic will be no bigger within the actual jitted body but not have these limbs dangling all over
<enebo[m]> backwards branching perhaps excluded
<headius[m]> that's what this does
<headius[m]> the methods themselves are gone
<enebo[m]> sorry I think I missed something...you are still emitting some other java bytecode
<headius[m]> before, every literal value emitted a separate method to lazily init and cache it in a static field
<headius[m]> that method continued to be called every time, loading field and checking == null
<headius[m]> now, every literal value just emits indy
<headius[m]> and folds
<headius[m]> no extra field or method needed
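The old scheme being replaced compiles down to roughly this shape per literal — a decompiled-style sketch with hypothetical names (the `fixnum0` field naming follows the example that comes up later in the discussion):

```java
// One static field plus one synthetic method per literal, as the old
// JIT emitted it. (Illustrative sketch, not actual JRuby output.)
public class OldStyleLiteralCache {
    private static Object fixnum0; // cache field, one per literal

    // Called on EVERY access: static load + null check that the JVM
    // cannot elide, which is the cost the indy version removes.
    static Object cacheFixnum0() {
        Object value = fixnum0;
        if (value == null) {
            value = Long.valueOf(1L); // stand-in for runtime.newFixnum(1)
            fixnum0 = value;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(cacheFixnum0()); // 1
        System.out.println(cacheFixnum0() == cacheFixnum0()); // true, cached after first call
    }
}
```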
<enebo[m]> oh sorry you just meant for making literals
<headius[m]> yeah
<headius[m]> I will also do call sites but those still go through a separate synthetic method to wrap all the arg wrangling
<headius[m]> here look at this
<headius[m]> there's only two literals there, 1 and 2, but you can see they emit a whole extra method each
<headius[m]> oh and the two fields fixnum0 fixnum1
<headius[m]> we should be able to see a measurable reduction in total emitted JIT size
<enebo[m]> An interesting outcome of this is handle can be shared
<headius[m]> yeah some of these handles are cached in o.j.Ruby already
<headius[m]> like all nil/false/true will use the same constant handle
<headius[m]> I think fixnums in cache range will too
<headius[m]> symbols could
<enebo[m]> so that is a second level of reduction which can be huge for stuff like that
<headius[m]> for sure
<enebo[m]> remember too I changed operands so Fixnums are cached in the operand...not sure if this can be applied to use the Operand for the MH
<enebo[m]> I mean I can imagine it could
<enebo[m]> If you are going to make MH for all fixnums compiled then making it on the Fixnum operand will end up allowing more savings than Ruby fixnum cache range
<headius[m]> I can't pass through most objects into jit
<headius[m]> they would have to be able to find it on the "other side"
<headius[m]> I can only pass things that can go in constant pool
<headius[m]> numbers, string version of bytelist, etc
<enebo[m]> ah you would need the operand reference which is possible to get but it would be a journey
<headius[m]> right
<enebo[m]> (since we do have reference to IRScope in methods)
<headius[m]> but we can improve this anyway
<headius[m]> like have Symbol aggregate a MethodHandle field
<headius[m]> then all caches of that symbol will use same handle
<headius[m]> could actually all use same call site probably
<enebo[m]> MH[] handles
<enebo[m]> int or long generic handles JIT would emit long/int
<headius[m]> we're going to find a happy medium here
<headius[m]> may make it harder to do AOT
<enebo[m]> yeah I was just thinking about that
<headius[m]> SVM has only limited support for indy
<enebo[m]> AOT would need an initializer but then that is not even enough since you could not guarantee slot would be open
<headius[m]> but there's much bigger things that have to change for us to AOT ruby code anyway
<enebo[m]> UUID!
<headius[m]> SVM does support some limited indy for lambdas...there may be a path
<headius[m]> hah yeah that won't impact startup
<enebo[m]> anyways. I am curious about memory improvements on this
<enebo[m]> warmup is tied for second
<headius[m]> I'm going to also make this change for caching a CallSite object and then push a branch
<enebo[m]> combining this with my heuristics if they work will stack too
<headius[m]> calls will still go through a synthetic method but use indy to initialize a CachingCallSite rather than the static field dance
<enebo[m]> I would love to see us halve memory from metaspace not being so epicly large
<headius[m]> so bytecode + field reduction
<enebo[m]> I still feel that is more from one class per space though
<headius[m]> yeah hopefully we can measure that
<headius[m]> so that's 9.2.8.0 versus this for my 13 gems
<headius[m]> my gem home got corrupted in some weird way
<headius[m]> I had to wipe it out
<headius[m]> but you can test this with your 10k gems
<headius[m]> this is a better view of actual init costs though
<headius[m]> basically noise
<headius[m]> oh I need /shared for actual gem location
<headius[m]> retrying this
<headius[m]> ok I still have lots of gems actually
<headius[m]> 412
<headius[m]> ish
<headius[m]> that's reversed, branch vs 9.2.8.0
<headius[m]> so again probably noise
<enebo[m]> yeah appears to just be noise for this
<headius[m]> I should really be comparing with master
<headius[m]> rdubya: would you be able to try startup on your app with this branch?
<rdubya[m]> yeah, I should be able to do that, at this point it will probably be this afternoon until I can get to it though
<headius[m]> I added rails -h times for master versus branch: https://gist.github.com/headius/6b46e4deddc3a9c558af3d4bc29c5c1c
<headius[m]> it's super noisy but seems pretty close 🤷‍♂️
<headius[m]> I'll look away and then run it again and it will vary by as much as 1s on either side
<headius[m]> something that takes 30s to start up would be a better test, and then obviously this is not telling us much about warmup
<headius[m]> this is encouraging enough to proceed though, especially with the JIT size reduction
xardion has quit [Remote host closed the connection]
rusk has quit [Remote host closed the connection]
<headius[m]> I'll have to optimize refined calls some day
<headius[m]> some day
xardion has joined #jruby
<headius[m]> ok I have call sites using indy to cache too
<headius[m]> I'll push this
<headius[m]> comparison of the synthetic method emitted for non-indy calls
<headius[m]> about half as much bytecode?
<headius[m]> enebo: rdubya https://github.com/jruby/jruby/pull/5874
throstur is now known as throstur_
<headius[m]> "Indy Logic by Headius" sounds like a fragrance
<rdubya[m]> lol
<rdubya[m]> cool, i'll take a look at it when i get a chance
<headius[m]> 👍️
<headius[m]> enebo: if you want to throw anything at this go for it...I'm going to change gears for a bit and follow up any recent issues
<enebo[m]> headius: try running something which traditionally does not fit bytecode wise
<headius[m]> oh that's what you meant
<enebo[m]> yeah
<headius[m]> I'm not sure what to run...parser gem methods are so gigantic they won't even be JIT candidates due to IR size
<headius[m]> actually I forgot we're not limiting JIT by bytecode size, we're limiting it by IR size
<headius[m]> so this won't change candidates but it will make better use of the IR size we do allow
<headius[m]> we don't have a good metric to translate IR size to bytecode size right now
<headius[m]> this is what kares was talking about when he said we might want to reinstate bytecode limit instead of IR limit
<headius[m]> parse gem's biggest methods are also gigantic case/when which maybe needs better IR to begin with
<headius[m]> bloody shame that Ruby doesn't have a constant-time switch
<headius[m]> enebo: your giant metaspace was from RG?
<enebo[m]> headius: but we can adjust the limits of IR to be larger if we know we are generating less actual bytecode
<enebo[m]> headius: yeah from running rg.org profile benchmark for about 2 hours
<enebo[m]> it topped out at 1.5G with a pretty tiny heap
<enebo[m]> I believe our slides had the heap size on it somewhere
<enebo[m]> We may actually run in a much smaller heap now too due to 9.2.8 changes
<enebo[m]> hello world rails app dropped by nearly 45M or so
<enebo[m]> 9.2.8 probably ends up as 1.4G
<headius[m]> oh yeah true
<headius[m]> we did a ton of memory reduction
<enebo[m]> our heap situation is really pretty good
<kares[m]> yep heap is good
<kares[m]> havent put snapshot out yet
<kares[m]> too much other stuff to handle + want to put in enebo's branch work and more experiments of mine - want to try sharing CL based on .rb file
<kares[m]> to reduce meta fragmentation
jrafanie has joined #jruby
lucasb has joined #jruby
<headius[m]> that's a pretty good idea
<headius[m]> too many projects
<headius[m]> branch is green in any case
<headius[m]> kares: dunno if you've been following but moving all literal values to indy seems promising
<headius[m]> if you run with indy this will be no gain really
<headius[m]> but running without indy this should be much less bytecode + class/method structure to stick in meatspace
<headius[m]> it will be important for us moving forward with a tiered indy approach too...always do literals with indy, and do calls with simple indy that promotes to more specialized when hot
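The "simple indy that promotes when hot" idea can be sketched with a `MutableCallSite` that starts on a generic slow path and rebinds itself once the site proves hot — a pure-JDK illustration with hypothetical names and thresholds, not JRuby's actual tiering:

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class PromotingCallSite extends MutableCallSite {
    private int count;
    private final String name;

    PromotingCallSite(MethodType type, String name) {
        super(type);
        this.name = name;
    }

    // Hypothetical bootstrap: the site begins bound to the fallback.
    public static CallSite bootstrap(MethodHandles.Lookup lookup,
                                     String name, MethodType type) {
        PromotingCallSite site = new PromotingCallSite(type, name);
        try {
            MethodHandle fallback = MethodHandles.lookup()
                .findVirtual(PromotingCallSite.class, "fallback",
                             MethodType.methodType(Object.class))
                .bindTo(site);
            site.setTarget(fallback);
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
        return site;
    }

    public Object fallback() {
        if (++count >= 2) {
            // promote: from here on the JVM sees a constant target
            setTarget(MethodHandles.constant(Object.class, "fast:" + name));
        }
        return "slow:" + name;
    }

    public static void main(String[] args) throws Throwable {
        CallSite site = bootstrap(MethodHandles.lookup(), "greet",
                                  MethodType.methodType(Object.class));
        System.out.println(site.dynamicInvoker().invoke()); // slow:greet
        System.out.println(site.dynamicInvoker().invoke()); // slow:greet (promotes)
        System.out.println(site.dynamicInvoker().invoke()); // fast:greet
    }
}
```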
<headius[m]> hard to infer anything from travis build times but branch does not appear to be any slower, and some jobs faster
<headius[m]> compiler specs are notably slower...but again could just be different VM
<headius[m]> total time 3h12 vs 3h51 seems pretty good if that smooths out VM differences
<headius[m]> woah why's all this stuff in allow failures now
<headius[m]> not sure why those got moved to allow but I moved them back
<headius[m]> for some reason that complete jar extended test started failing but we didn't see it because of this
<headius[m]> also weird that mri:stdlib did not hang on my branch 🤯
<kares[m]> need to read back all the way
<headius[m]> you can get the gist of it from https://github.com/jruby/jruby/pull/5874
<kares[m]> but there's definitely a lot of lambda forms already in the class list ...
<kares[m]> probably too tired - just getting to my destination after a whole day ...
<headius[m]> yeah that will be the question
<headius[m]> these are super trivial indy sites
<headius[m]> call site + constant method handle + cached value
<headius[m]> like enebo mentioned in many cases we might even be able to reuse the same MH and possibly even the call site for the same value, like having all Symbol objects aggregate a constant MH or ConstantCallSite
<headius[m]> no rush on this
<headius[m]> we just want some bigger apps to throw at it
jrafanie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
throstur_ has quit [Ping timeout: 265 seconds]
<rdubya[m]> the indy stuff made the startup up to 10 seconds faster
<rdubya[m]> hopefully I did the comparison right and there isn't something else at play there
<rdubya[m]> seems to hold true when running `rails s` as well, worst case they do about the same, best case indy wins by a couple seconds
<rdubya[m]> and considering it typically takes 80-90 seconds for our app to start, trimming off 2-3 seconds is a couple of percent performance boost 🙂
<headius[m]> rdubya: wow that's unexpected
<headius[m]> we could be seeing bytecode verification costs here
<headius[m]> I mean if it's 50% more bytecode for that extra stuff that's 50% more verification time on boot
throstur_ has joined #jruby
<headius[m]> and we know verification cost is a huge part of startup normally
<enebo[m]> rdubya: that is pretty interesting...
<enebo[m]> rdubya: how long is startup normally if it saves 10s?
<enebo[m]> oh 80-90
<enebo[m]> err I am confused by the 10s faster
<rdubya[m]> that was when i ran `rails -h`
<rdubya[m]> my gist above has the numbers
<enebo[m]> it looked 1s faster
<rdubya[m]> the fastest time 9.2.8 ran -h in was about 37 seconds of user time
<rdubya[m]> indy's fastest got down to 29
<enebo[m]> oh user cpu
<enebo[m]> I thought you were looking at wall clock time
<enebo[m]> that is a big reduction in CPU though. Maybe it is verification
<rdubya[m]> real and sys stay pretty much the same
<enebo[m]> I guess it is building a smaller class too
<enebo[m]> real drops noticeably doesn't it? 1-1.5s
<rdubya[m]> that's true it does
<enebo[m]> I like that user went down but this is one stat we have never exactly tried to improve
<rdubya[m]> i hadn't noticed that
<enebo[m]> so less bytecode generation and less verification plus how literals get loaded happens differently obviously
subbu is now known as subbu|away
<rdubya[m]> yeah, reran my other tests too, rails s looks to be a second or so faster real time and user time drops by a few seconds
<rdubya[m]> that's just to get the server started and to the point of listening to requests, on my machine that takes ~25 seconds
<rdubya[m]> in our containers it can take over a minute
<enebo[m]> if you want to cp my commit from nuke_counter onto it and use JRUBY_OPTS=-Xjit.time.delta=800000
<enebo[m]> I am curious if you see time go up or down
<enebo[m]> I actually expect a lot less compilation so user and memory should drop
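[editor's note: the time-delta idea described above can be sketched roughly as follows. This is a hypothetical illustration of a time-windowed JIT threshold, not JRuby's actual implementation; the class name, threshold, and window logic are all assumptions. The point is that a method only queues for compilation if it accumulates enough calls within the delta window, so methods that are merely called often over a long run never compile:]

```ruby
# Hypothetical sketch (not JRuby code) of a time-windowed JIT threshold:
# a method is queued for compilation only if it reaches `threshold`
# calls within `delta_ns` nanoseconds of the window start.
class TimeWindowedProfile
  def initialize(threshold: 50, delta_ns: 100_000)
    @threshold = threshold
    @delta_ns = delta_ns
    @count = 0
    @window_start = nil
  end

  # Returns true when the method is "hot enough" to JIT.
  def record_call(now_ns)
    if @window_start.nil? || now_ns - @window_start > @delta_ns
      # Window expired (or first call): restart counting from here.
      @window_start = now_ns
      @count = 0
    end
    @count += 1
    @count >= @threshold
  end
end
```

With this shape, raising the delta (as with `-Xjit.time.delta=800000` above) widens the window and lets more methods qualify; shrinking it prunes compilation to only the most densely-called methods.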
<rdubya[m]> i'll give it a shot
<enebo[m]> locally rails -h with a single controller drops 8-9s of user but also is almost 1s faster to start
<enebo[m]> but with that said I think I am eliminating some methods which should get compiled (work in progress)
<rdubya[m]> I just merged in that branch
<rdubya[m]> and now the real time is around 7.4 seconds and user is around 25 seconds
<rdubya[m]> whether I include the delta time or not
<rdubya[m]> 9.2.8 has real -> 10s and user -> 38 s
<rdubya[m]> so it looks like the two changes together trim about 2 and a half seconds off of real time, whether I specify a delta or not
<enebo[m]> default delta is 1/8 of what I asked
<enebo[m]> I have not seen compiling less affecting startup very much but it will affect peak
<enebo[m]> if you enable JIT logging you will see much much less compilation going on so memory and metaspace should drop
<enebo[m]> I think one missing thing is adding loop detection on back edges into the counts so bigger methods with loops called less frequently but still hot will compile
<enebo[m]> something with a loop probably will call enough crap that it won't be seen as happening as frequently
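[editor's note: the back-edge idea above could look roughly like this. Hypothetical sketch, not JRuby code; names and the weighting are assumptions. Loop back-edges count toward hotness alongside method entries, so a rarely-called method with a hot loop still crosses the threshold:]

```ruby
# Hypothetical sketch: loop back-edges contribute to the hotness count,
# so a method entered only once but looping heavily still gets queued
# for JIT compilation.
class HotnessCounter
  def initialize(threshold: 1_000)
    @threshold = threshold
    @count = 0
  end

  # Called once per method entry.
  def record_call
    bump
  end

  # Called once per loop back-edge taken inside the method.
  def record_back_edge
    bump
  end

  private

  def bump
    @count += 1
    @count >= @threshold  # true => hot enough to compile
  end
end
```

With entry counting alone, a method entered once with a 10,000-iteration loop would never compile; counting back-edges, it crosses the threshold mid-loop.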
<enebo[m]> anyways I need to make some better test cases to figure this out
<enebo[m]> 2-2.5s off on rails -h is pretty impressive though
<rdubya[m]> if i drop the delta to 1000 it has worse performance, but still better than existing
<rdubya[m]> just realized you had 800_000 up there not 80_000
<rdubya[m]> no wonder there isn't much difference lol
<enebo[m]> I think 100000 is the default
<enebo[m]> yeah much higher (it is ns)
<rdubya[m]> 100k is slightly better than 80k
<rdubya[m]> 800k is worse than 100k
<rdubya[m]> and 1k is worse than 100k
<enebo[m]> Having a compiler on top of a compiler, where the lower compiler can do extra magic if enough native stuff is smooshed next to each other, is our confounding factor here
<enebo[m]> yeah when you increase it enough everything compiles and much of that is a hindrance to startup
<enebo[m]> I think much of it is just not helpful for peak either but it is tough to logically justify what is being pruned
<enebo[m]> HotSpot will look down the call stack when deciding to compile and may compile up several levels right away
<enebo[m]> That is another consideration I have been thinking about (not in the same way but just the notion that n call levels all native end up being something C2 will be able to do opts with)
throstur_ has quit [Ping timeout: 246 seconds]
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
throstur_ has joined #jruby
subbu|away is now known as subbu
<headius[m]> I want to see metaspace with all our changes plus some classloader batching
<headius[m]> I think we are on to some good stuff
<headius[m]> Reducing JIT to hottest stuff is going to put less load on both code cache and CPU cache too I suspect
<headius[m]> Oh we need to test this stuff in Java 11 too...there's many incremental improvements to Indy and method handles that could sweeten the deal
lucasb has quit [Quit: Connection closed for inactivity]
<lopex> wow lots of indy stuff talk
<lopex> so 8 and 11 changes are big deal ?
throstur_ has quit [Ping timeout: 276 seconds]
drbobbeaty has joined #jruby
<headius[m]> Maybe, but more interesting is how we are trying to refine the JIT
<headius[m]> My branch uses Indy for more stuff by default and that seems to be a win over generating bytecode
<headius[m]> enebo has a branch that changes jit threshold to be based on time, like have to hit threshold within some time range to jit
<headius[m]> kares is playing with better use of classloader to reduce load on metaspace
throstur_ has joined #jruby
<lopex> cool
<lopex> but different thresholds might be affected differently by vendors even
<lopex> thumbs up btw
<headius[m]> yes certainly, and enebo and I discussed ergonomics based on the host system too since slower systems might not trigger jit in the same time range as faster ones