lopex has quit [Quit: Connection closed for inactivity]
_whitelogger has joined #jruby
neck has joined #jruby
throsturt has joined #jruby
<throsturt> I'm running marid integration server for opsgenie which supports ruby scripting via jruby, but I'm having this problem that the /tmp directory keeps getting filled up with jruby*.jar files, many of which are identical, and fuser shows that 1622 of 1904 are still being held. Is this a known bug with jruby?
drbobbeaty has joined #jruby
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
neck has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
<headius[m]> Hello there
<headius[m]> throsturt: What's in those files?
<headius[m]> I know we unpack at least one small binary for calling native code but I don't recall any jruby*.jar files for which we do that
<throsturt> headius[m]: a bunch of classes, what specifically should I be looking for?
<headius[m]> throsturt: It sounds like it's the JRuby jar itself...perhaps this marid/opsgenie thing is doing it? I don't believe this would be anything we're doing
<headius[m]> gist a listing of files for me to confirm
<headius[m]> if it's the JRuby jar contents then it's definitely not something we're doing
<throsturt> there's thousands of jars in here, any particular one you want to see the contents of?
<throsturt> well not thousands anymore since we restarted but here's an ls: https://gist.github.com/ThrosturX/63299d46e926ef20474936d4c2f4e1ae
<headius[m]> any one
<headius[m]> hmmm
<headius[m]> ok this might be something else then
<headius[m]> do you know what version of JRuby it's running?
<throsturt> 2.5.3 IIRC
<headius[m]> this looks like how we deal with nested jars; they get unpacked to a temp location so that we can add them to the JVM's classloaders correctly
<throsturt> but the file handles are never released
<headius[m]> that's the compatible Ruby version, but at least that tells me it's 9.2.x
<throsturt> leads to an out of memory error eventually
<headius[m]> yeah I can imagine
<throsturt> how do I check the jruby version?
<headius[m]> JRUBY_VERSION inside a Ruby script should show it
<headius[m]> I'm not familiar with this thing
<throsturt> 9.2.8.0
<throsturt> I ran the jar with -v
<throsturt> jruby 9.2.8.0 (2.5.3) 2019-08-12 a1ac7ff OpenJDK 64-Bit Server VM 11.0.4+11-LTS on 11.0.4+11-LTS +jit [linux-x86_64]
throsturt is now known as throstur
<headius[m]> in order to add the jar to our classloader it needs to have an accessible filesystem URL, so we unpack it to temp and then add that location
<headius[m]> But we do clean up those URLs when the JRuby instance gets shut down...IF it gets shut down
<headius[m]> the close() method later in that file triggers a cleanup of all those URLs and delete of the files
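The unpack-to-temp mechanism described above can be sketched with plain JDK APIs — a hypothetical illustration, not JRuby's actual internals: a jar nested inside another jar has no filesystem URL, so its bytes are copied to a temp file whose URL a classloader can use. Cleanup depends on that file later being deleted, which is exactly what fails if shutdown never runs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class NestedJarLoader {
    // Copy a nested jar's bytes out to a temp file so a URLClassLoader
    // can reference it by filesystem URL. (Hypothetical sketch.)
    public static URL unpackToTemp(InputStream nestedJar) throws IOException {
        Path temp = Files.createTempFile("jruby", ".jar");
        temp.toFile().deleteOnExit(); // only helps if the JVM actually exits
        Files.copy(nestedJar, temp, StandardCopyOption.REPLACE_EXISTING);
        return temp.toUri().toURL();
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeJar = {0x50, 0x4b}; // stand-in bytes, not a real jar
        URL url = unpackToTemp(new ByteArrayInputStream(fakeJar));
        System.out.println(url.getProtocol()); // file
    }
}
```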
<throstur> ah... IF it gets shut down
<headius[m]> if that's not firing then the JRuby instances aren't being shut down after use, so you accumulate more and more
<throstur> but we have a long-lived process that doesn't seem to kill jruby
<throstur> so do I report that as a bug to OpsGenie then/
<throstur> s/\//?/
<headius[m]> if it keeps making new JRuby instances and never calling org.jruby.Ruby.tearDown or ScriptingContainer.tearDown, this would happen
<headius[m]> yes I think so
<headius[m]> a bit later in that file we also set up a finalize method that will clean up non-singleton (non-global, basically) instances when they GC
<headius[m]> so that's another possible way to explain the bug...they are creating singleton JRuby instances and walking away from them without terminate
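The fix being suggested looks roughly like this on the embedding side — a hedged sketch, since the actual OpsGenie/Marid code isn't shown here. `MaridScriptRunner` and the script are hypothetical, but `ScriptingContainer`, `LocalContextScope`, and `terminate()` are the real JRuby embed API (JRuby must be on the classpath):

```java
import org.jruby.embed.LocalContextScope;
import org.jruby.embed.ScriptingContainer;

public class MaridScriptRunner {
    public static Object runScript(String script) {
        // Non-global context so the runtime can be GC'd, plus an explicit
        // terminate() so org.jruby.Ruby.tearDown runs and the unpacked
        // /tmp/jruby*.jar files are released and deleted.
        ScriptingContainer container =
            new ScriptingContainer(LocalContextScope.SINGLETHREAD);
        try {
            return container.runScriptlet(script);
        } finally {
            container.terminate();
        }
    }

    public static void main(String[] args) {
        System.out.println(runScript("1 + 1")); // prints 2
    }
}
```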
<headius[m]> Hopefully this is enough for them to fix the issue...feel free to tag @headius on your issue
<throstur> thanks
<throstur> I actually already created the issue assuming this was the problem but I'll go ahead and forward these details to them
<headius[m]> I can't think of an easy workaround for you now...you could try to shut this down from within Ruby code but that would be unpredictable since obviously you're still running code
<headius[m]> good luck!
<throstur> thanks headius! you've been incredibly helpful
<headius[m]> I hope they are able to tidy this up quickly! Unfortunately this is the only way to ship jars-in-jars and be able to access them
<throstur> I'm sure they'll do their best, I created the ticket at level 1 priority so I'm sure they'll get on it soon, obviously this is something that shouldn't fly in an enterprise solution
<throstur> as I understand it they are selling their product for copious amounts
<headius[m]> well I think we can expect some help then 🙂
<headius[m]> if they need any assistance they can stop by here or file an issue on jruby/jruby
lopex has joined #jruby
drbobbeaty has joined #jruby
sagax has quit [Quit: Konversation terminated!]
jrafanie has joined #jruby
sagax has joined #jruby
<headius[m]> Yeah I think I'm just going to change the way the JIT lazily initializes these cached values to always use indy
<headius[m]> the worst outcome will be the overhead of bootstrap+linking of those sites but every access of symbols, bytelists, fixnums, block bodies, and a dozen other things will optimize like constants then
<headius[m]> we're paying untold costs in unelidable static load + null check on these cache fields
<headius[m]> old JIT paid a different cost, initializing them all up front, but still static load on every access and likely other barriers to optimizing them as constants
<headius[m]> Not to mention every one of these caches emit a whole method body + caching bytecode
<headius[m]> re: meatspace
<headius[m]> method entries are definitely clogging up meatspace
<enebo[m]> can you measure the change in memory?
<headius[m]> I haven't gotten to that point yet for anything but call sites
<headius[m]> not that we have a great way to measure meatspace anyway
<enebo[m]> I am working on this from a different angle but in all cases we need something which we can measure from startup/warmup/eventual/memory perspective
<enebo[m]> And I am thinking more and more it is like n things
<enebo[m]> gem list, rails app, tighter loop bench
<enebo[m]> for metaspace I think the rg.org running 100ks of requests maybe will show something
<enebo[m]> but I am not in love with that idea
<enebo[m]> and my main issue with the rg.org bench from this spring was that I think it is too simple (we talked about this yesterday)
<enebo[m]> it is much better than a single controller app though
<headius[m]> well the other problem with rg.org is setting it up
<headius[m]> I mean once I have it working here it's working, I guess, but it's a lot of crap
<enebo[m]> oh yeah that is definitely true
jrafanie has quit [Quit: Textual IRC Client: www.textualapp.com]
<headius[m]> you know maybe we should look at redmine again
<headius[m]> because I know that worked well at some point
<headius[m]> and it's fairly self-contained
<enebo[m]> I have some notes and plan on setting it up for the nuking branch to see if memory radically decreases but it is a pain
<enebo[m]> yeah redmine would be a good one
<headius[m]> it may be very close to working right now
<headius[m]> it just wasn't maintained for JRuby
<headius[m]> I knew a guy selling packaged Redmine + JRuby + OpenSolaris servers for a while
jrafanie has joined #jruby
<enebo[m]> so I always wonder if people ever make their own bench suites for their own apps
<enebo[m]> like I am sure basecamp does from tweets but I don't know how common that is
<headius[m]> I would if I did any real work
<headius[m]> but that probably falls somewhere below testing if you're paid to build an app
<enebo[m]> if redmine had their own bench it would be perfect since it would not be made by us, so it would be more impartial, and of course we could compare and not have to make it
<enebo[m]> OSS apps maybe have less resources who care to do it
<headius[m]> so most of these things I want to cache already have indy paths that are pretty tight...like symbol just throws an invokedynamic + string into bytecode and from then on you get the symbol as a constant
<headius[m]> I can just remove the non-indy logic and they're done and won't emit all this extra bytecode
<headius[m]> oh and the way I'm caching these is that each literal is a small method
<headius[m]> so...how many literals in a typical method
<headius[m]> N * method * bytecode size, plus it's slow and non-constant
<enebo[m]> literals? you mean symbols or identifiers
<headius[m]> symbols, fixnums, bytelist for strings, all that
<headius[m]> frozen strings
<headius[m]> block bodies
<enebo[m]> ah in operand sense
<headius[m]> yeah operand
<enebo[m]> well it seems quite a few then
<enebo[m]> not most of a method but not uncommon
<headius[m]> so they'd all boil away to indy + call site + constant method handle and fold like any constant
<headius[m]> emitted native code should end up just being a MOV
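What "boil away to indy + call site + constant method handle" means can be shown with plain JDK invoke APIs — a sketch under assumed names (`symbolBootstrap` and `intern` are illustrative stand-ins, not JRuby's actual code):

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class LiteralBootstrap {
    // Hypothetical bootstrap for a literal: resolve the value once, then
    // bind the call site permanently to a constant MethodHandle so the
    // JVM JIT can fold every later access to a plain constant load.
    public static CallSite symbolBootstrap(MethodHandles.Lookup lookup,
                                           String name, MethodType type,
                                           String symbolName) {
        Object cached = intern(symbolName); // stand-in for runtime.newSymbol(...)
        MethodHandle constant =
            MethodHandles.constant(type.returnType(), cached);
        return new ConstantCallSite(constant);
    }

    static String intern(String s) { return (":" + s).intern(); }

    public static void main(String[] args) throws Throwable {
        // Simulate what the invokedynamic instruction would do:
        MethodType type = MethodType.methodType(Object.class);
        CallSite site = symbolBootstrap(MethodHandles.lookup(), "symbol", type, "foo");
        Object a = site.dynamicInvoker().invoke();
        Object b = site.dynamicInvoker().invoke();
        System.out.println(a + " " + (a == b)); // same cached object every time
    }
}
```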
<enebo[m]> yeah my only concern then would be warmup but my above sentences points out a pretty big need for this anyways
<enebo[m]> I want to prune how many things JIT and that requires knowing warmup and eventual speed
<headius[m]> I think it will be mostly a wash
<headius[m]> these lazily init via get field + null check + init + set right now anyway
<enebo[m]> but I definitely want something larger to use for that since the people who have traditionally complained about warmup has been larger apps
<headius[m]> and it's not like call sites which generate adapted handle chain...it's just a single method handle for the value
<headius[m]> I can POC making all literals indy and we can try something
<headius[m]> it's mostly just deleting the non-indy logic
<enebo[m]> well I am not trying to be discouraging on this either as much as I want to stop reasoning about it and try and measure it a bit more
<headius[m]> sure
<enebo[m]> and for my stuff it is impossible to reason I think
<enebo[m]> I do know much of our JITting is not needed for us to be fast although being two stacked runtimes complicates proving how true that is
<headius[m]> heh
<enebo[m]> but I can see things like pruning infrequent calls can slow down a runtime but I am "reasoning" that back edges of loops will get most or all of that back
<headius[m]> ok this is trivial
<enebo[m]> with that said I still have no reasonable benchmarks to know how good or bad the situation is
<headius[m]> they just restored a hard dep on rmagick and redcarpet
<headius[m]> so it may be like 95% working
<enebo[m]> so some platform: stuff
<enebo[m]> wow rmagick is still a thing isn't it
<headius[m]> yeah ffs
<enebo[m]> I wonder how well rmagick4j works now. It should be fine unless they added new features
<headius[m]> yeah it may just need some cleanup
<enebo[m]> Feels like that library is so old it must not change massively
<enebo[m]> if that
<enebo[m]> It was not 100% of rmagick either since it is huge
<headius[m]> also a question of what redmine actually needs from it
<headius[m]> like what, thumbnails?
<headius[m]> it's not instagram
<enebo[m]> I don't recall how much was done by the end of that gsoc but it was enough for most common things
<enebo[m]> yeah good question...image_voodoo + image_science for MRI side may be a nice contribution if it is just thumbnails
<headius[m]> it would make a way better app for profiling
<enebo[m]> and I can adapt image_voodoo to do anything but I don't want to implement a graphics library in it so it has to mostly overlap already
<headius[m]> sure
<enebo[m]> rmagick is pretty crazy as a dep
<headius[m]> yeah I am always amazed at the amount of crap people have to install to support all these exts
<enebo[m]> it generates some big string like you would generate for a command-line and then on back side gets an image back
<enebo[m]> Nov 10, 2017 is last time my redmine dir was mucked with
<enebo[m]> + when /jdbcmysql/
<enebo[m]> + gem "activerecord-jdbcmysql-adapter", "51.0"
<enebo[m]> ok so we have been here before :)
<enebo[m]> headius: we looking at 3.4.3 of redmine or are you looking back to see when we last worked
<enebo[m]> ./lib/redmine/thumbnail.rb:require 'mimemagic'
<enebo[m]> looks like only thumbnail but I see there is a font path
<enebo[m]> so image_science/voodoo do not add fonts
<enebo[m]> ok font is for some gantt chart
jrafanie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<headius[m]> ugh
<headius[m]> I forgot about dregexp /o
<headius[m]> I use an atomic reference that's lazily initialized under class synchronization
<headius[m]> talk about gross
<headius[m]> this should clearly just use indy because that's disgusting
<headius[m]> the whole /o atomicity thing is a real pain
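The /o atomicity pattern mentioned here (a once-only interpolated regexp must be built at most once, even under races) can be sketched with an `AtomicReference` — names are hypothetical, not JRuby's:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.regex.Pattern;

public class OnceRegexp {
    // Sketch of /o semantics: the first computed pattern wins and every
    // later evaluation reuses it, ignoring re-interpolated input.
    private final AtomicReference<Pattern> cached = new AtomicReference<>();

    Pattern get(String interpolated) {
        Pattern p = cached.get();
        if (p == null) {
            // compareAndSet keeps the first winner so all threads agree
            cached.compareAndSet(null, Pattern.compile(interpolated));
            p = cached.get();
        }
        return p;
    }

    public static void main(String[] args) {
        OnceRegexp r = new OnceRegexp();
        Pattern first = r.get("a+");
        Pattern second = r.get("b+"); // ignored: /o evaluates only once
        System.out.println(first == second); // true
    }
}
```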
<enebo[m]> I doubt we even have more than a dozen dregexps in a typical app
<headius[m]> yeah that's true I guess
<headius[m]> it's just a lot of code to emit for what should basically just be a constant after init
<headius[m]> ick
<headius[m]> booleans always go through context.runtime.getTrue etc
<headius[m]> I never updated this to use context.tru and fls
<headius[m]> fals
<headius[m]> so that's two field loads plus a virtual dispatch
<enebo[m]> my statement is not arguing for one impl or another. I think indy is totally valid in something not used much since it is not going to add much to warmup
<headius[m]> I guess this is a good exercise in light of our meatspace problems
<enebo[m]> oh and I am confused about redmine
<enebo[m]> I must have a non-normative redmine
<enebo[m]> 3.0.4 is latest on a pull but 4.0.4 is out
<headius[m]> hmm did they move repo?
<headius[m]> to gitlab or something
<enebo[m]> "You will need to download a copy of the current development-code. The official code repository is located in Subversion and can be downloaded by following the Download instructions.
<enebo[m]> "
<enebo[m]> I think their mirroring is busted maybe
<enebo[m]> I am cloning from github
<headius[m]> subversion ffs
<headius[m]> what decade is this
<enebo[m]> Author: Go MAEDA <maeda@farend.jp>
<enebo[m]> Date: Thu Sep 12 12:51:23 2019 +0000
<enebo[m]> wtf... maybe they changed tagging style
<headius[m]> ok weird
<enebo[m]> yeah 3.4.3 is last tag I see in my repo
<enebo[m]> I really don't want to bag on people's choices but SVN...It does not really do anything better than git. I feel people who prefer it just really love revision #s
<enebo[m]> I have never been a fan of git's command-line but once you make it past tolerating it then it pretty much beats svn in all ways other than the logical simplicity of revision #s
<enebo[m]> I have seen arguments about how it is more recoverable but I have never had an unrecoverable git repo ever
<headius[m]> hah I found a bug
<headius[m]> JIT for a symbol proc is still using java.lang.String
<headius[m]> nobody must be using symbol procs with weird encodings
<enebo[m]> anyways to each their own...I guess we should find 4.0.4 tag
<enebo[m]> well that is probably almost entirely true
<headius[m]> yeah I have been seeing a lot of openjdk folks moaning about git lately
<enebo[m]> very very few method names end up as non 7bit in my experience
<headius[m]> mostly because it does a lot of things just different enough from hg to be confusing
<enebo[m]> yeah I have mixed feelings on hg. I used to prefer it but once you get used to git it changes how you think about repos
<headius[m]> but then they talk about having to fiddle with refs directly and I'm wondering what the hell they're doing
<enebo[m]> but in my mind it is a big step up from svn and mostly just different than git
<enebo[m]> they decided to have some epic subtree system
<headius[m]> I got a bunch of replies to one thread showing how hg and git use different words for similar concepts
<headius[m]> that was mostly my problem trying to use both
<headius[m]> apparently hg does have branches very similar to git but they're not called branches
<headius[m]> so I never found that
<enebo[m]> they = openjdk
<headius[m]> yeah mistakes were made
<headius[m]> hg forests for one
<headius[m]> hg patch queues for another
<headius[m]> both deprecated features now
<enebo[m]> My main reservation of hg which may have changed is that it seemed a lot slower than git
<enebo[m]> yeah
<enebo[m]> My question was wondering how important it was to break up the source in the first place
<enebo[m]> At the time they did it bandwidth was more painful
<headius[m]> I think that has improved but it's still largely in python where git is mostly C
<headius[m]> so they're fighting an uphill battle there
<enebo[m]> but most people were building the whole thing anyways
<headius[m]> I guess the perf issues with openjdk repos were mostly fixable too
<headius[m]> like they weren't caching or compressing or something
<headius[m]> but I am glad they're moving to git
<enebo[m]> yeah and having the mirror already makes it much easier
<enebo[m]> ok 4.0-stable is a branch which has the tag I think
<enebo[m]> err no but this is latest code
<enebo[m]> 4.0-stable should be pretty reasonable even from HEAD assuming they only merge over stable changes
<headius[m]> ok cool
<headius[m]> so they are doing things the right way and having release work on branches
<headius[m]> "right way"
<enebo[m]> It looks like they branch on release
<headius[m]> I guess my problem with that is why have master at all then
<headius[m]> nobody pushes to master in the "right way" so what is master
<enebo[m]> ""right way"" I confirm this bug
<headius[m]> hah
<enebo[m]> master is 5 maybe
<enebo[m]> 5.0 ends up being 5.0-stable would be my guess
<enebo[m]> I do not feel there is a truly great way of doing this stuff
<enebo[m]> diverging code and focus is tough if you immediately branch
<enebo[m]> if you keep it too long then you are probably messing with progress at the cost of speeding up stability
<enebo[m]> I think we mostly have this right. If we take too long we make a next-version branch until we cut a stable branch and then convert master back to the next release
<enebo[m]> but it is a "feeling" thing which I am sure bugs many people who follow how we work
<headius[m]> if we had like five active branches it would make sense to do it that other way I guess
<headius[m]> but we only have two
<headius[m]> and usually one
<enebo[m]> yeah by design largely
<enebo[m]> if we had a team of 10 full time people I think we would do this differently
<headius[m]> enebo: so do you have something you want to try to run against this indy_literal branch?
<headius[m]> I mean to put your mind at ease about startup and maybe warmup
<enebo[m]> well no I don't
<enebo[m]> This is why redmine came up
<headius[m]> right ok
<enebo[m]> I think our main problem is the stuff we run to see if indy is running well tends to be quicker and smaller
<headius[m]> we can test this with the usual round of gem and rails commands for the moment
<headius[m]> it's a weird case because this only affects non-dev mode
<enebo[m]> The people who complain about warmup (which is not only on indy) end up being much larger applications
<headius[m]> which is what we still recommend for startup time
<enebo[m]> I mean if we land something now and realize it hurts warmup when we get this running then we have options too
<enebo[m]> I just cannot reasonably land my experiment without something more substantial because it is less clear on possible impacts
<headius[m]> this is going to make bytecode output way smaller
<enebo[m]> but complaints on warmup are exclusively non-dev too so we just need some gauntlet of use cases to see how changes are affected
<headius[m]> it could be better than half
<enebo[m]> have you tried parser gem yet?
<headius[m]> every literal and every call site was getting a method with a dozen bytecodes; now it's down to load context + invokedynamic
<enebo[m]> Will something fit now? That would spice things up :)
<headius[m]> parser gem?
<headius[m]> definitely
<enebo[m]> whitequark
<headius[m]> the current jit max size is based on total .class size
<enebo[m]> but to play devils advocate why not replace the entire method with indy
<headius[m]> and this logic will be no bigger within the actual jitted body but not have these limbs dangling all over
<enebo[m]> backwards branching perhaps excluded
<headius[m]> that's what this does
<headius[m]> the methods themselves are gone
<enebo[m]> sorry I think I missed something...you are still emitting some other java bytecode
<headius[m]> before, every literal value emitted a separate method to lazily init and cache it in a static field
<headius[m]> that method continued to be called every time, loading field and checking == null
<headius[m]> now, every literal value just emits indy
<headius[m]> and folds
<headius[m]> no extra field or method needed
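The old scheme being replaced compiles down to roughly this shape per literal — a decompiled-style sketch with hypothetical names (the `fixnum0` field naming follows the example that comes up later in the discussion):

```java
// One static field plus one synthetic method per literal, as the old
// JIT emitted it. (Illustrative sketch, not actual JRuby output.)
public class OldStyleLiteralCache {
    private static Object fixnum0; // cache field, one per literal

    // Called on EVERY access: static load + null check that the JVM
    // cannot elide, which is the cost the indy version removes.
    static Object cacheFixnum0() {
        Object value = fixnum0;
        if (value == null) {
            value = Long.valueOf(1L); // stand-in for runtime.newFixnum(1)
            fixnum0 = value;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(cacheFixnum0()); // 1
        System.out.println(cacheFixnum0() == cacheFixnum0()); // true, cached after first call
    }
}
```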
<enebo[m]> oh sorry you just meant for making literals
<headius[m]> yeah
<headius[m]> I will also do call sites but those still go through a separate synthetic method to wrap all the arg wrangling
<headius[m]> here look at this
<headius[m]> there's only two literals there, 1 and 2, but you can see they emit a whole extra method each
<headius[m]> oh and the two fields fixnum0 fixnum1
<headius[m]> we should be able to see a measurable reduction in total emitted JIT size
<enebo[m]> An interesting outcome of this is handle can be shared
<headius[m]> yeah some of these handles are cached in o.j.Ruby already
<headius[m]> like all nil/false/true will use the same constant handle
<headius[m]> I think fixnums in cache range will too
<headius[m]> symbols could
<enebo[m]> so that is a second level of reduction which can be huge for stuff like that
<headius[m]> for sure
<enebo[m]> remember too I changed operands so Fixnums are cached in the operand...not sure if this can be applied to use the Operand for the MH
<enebo[m]> I mean I can imagine it could
<enebo[m]> If you are going to make MH for all fixnums compiled then making it on the Fixnum operand will end up allowing more savings than Ruby fixnum cache range
<headius[m]> I can't pass through most objects into jit
<headius[m]> they would have to be able to find it on the "other side"
<headius[m]> I can only pass things that can go in constant pool
<headius[m]> numbers, string version of bytelist, etc
<enebo[m]> ah you would need the operand reference which is possible to get but it would be a journey
<headius[m]> right
<enebo[m]> (since we do have reference to IRScope in methods)
<headius[m]> but we can improve this anyway
<headius[m]> like have Symbol aggregate a MethodHandle field
<headius[m]> then all caches of that symbol will use same handle
<headius[m]> could actually all use same call site probably
<enebo[m]> MH[] handles
<enebo[m]> int or long generic handles JIT would emit long/int
<headius[m]> we're going to find a happy medium here
<headius[m]> may make it harder to do AOT
<enebo[m]> yeah I was just thinking about that
<headius[m]> SVM has only limited support for indy
<enebo[m]> AOT would need an initializer but then that is not even enough since you could not guarantee slot would be open
<headius[m]> but there's much bigger things that have to change for us to AOT ruby code anyway
<enebo[m]> UUID!
<headius[m]> SVM does support some limited indy for lambdas...there may be a path
<headius[m]> hah yeah that won't impact startup
<enebo[m]> anyways. I am curious about memory improvements on this
<enebo[m]> warmup is tied for second
<headius[m]> I'm going to also make this change for caching a CallSite object and then push a branch
<enebo[m]> combining this with my heuristics if they work will stack too
<headius[m]> calls will still go through a synthetic method but use indy to initialize a CachingCallSite rather than the static field dance
<enebo[m]> I would love to see us halve memory from metaspace not being so epicly large
<headius[m]> so bytecode + field reduction
<enebo[m]> I still feel that is more from one class per space though
<headius[m]> yeah hopefully we can measure that
<headius[m]> so that's 9.2.8.0 versus this for my 13 gems
<headius[m]> my gem home got corrupted in some weird way
<headius[m]> I had to wipe it out
<headius[m]> but you can test this with your 10k gems
<headius[m]> this is a better view of actual init costs though
<headius[m]> basically noise
<headius[m]> oh I need /shared for actual gem location
<headius[m]> retrying this
<headius[m]> ok I still have lots of gems actually
<headius[m]> 412
<headius[m]> ish
<headius[m]> that's reversed, branch vs 9.2.8.0
<headius[m]> so again probably noise
<enebo[m]> yeah appears to just be noise for this
<headius[m]> I should really be comparing with master
<headius[m]> rdubya: would you be able to try startup on your app with this branch?
<rdubya[m]> yeah, I should be able to do that, at this point it will probably be this afternoon until I can get to it though
<headius[m]> I added rails -h times for master versus branch: https://gist.github.com/headius/6b46e4deddc3a9c558af3d4bc29c5c1c
<headius[m]> it's super noisy but seems pretty close 🤷‍♂️
<headius[m]> I'll look away and then run it again and it will vary by as much as 1s on either side
<headius[m]> something that takes 30s to start up would be a better test, and then obviously this is not telling us much about warmup
<headius[m]> this is encouraging enough to proceed though, especially with the JIT size reduction
xardion has quit [Remote host closed the connection]
rusk has quit [Remote host closed the connection]
<headius[m]> I'll have to optimize refined calls some day
<headius[m]> some day
xardion has joined #jruby
<headius[m]> ok I have call sites using indy to cache too
<headius[m]> I'll push this
<headius[m]> comparison of the synthetic method emitted for non-indy calls
<headius[m]> about half as much bytecode?
<headius[m]> enebo: rdubya https://github.com/jruby/jruby/pull/5874
throstur is now known as throstur_
<headius[m]> "Indy Logic by Headius" sounds like a fragrance
<rdubya[m]> lol
<rdubya[m]> cool, i'll take a look at it when i get a chance
<headius[m]> 👍️
<headius[m]> enebo: if you want to throw anything at this go for it...I'm going to change gears for a bit and follow up any recent issues
<enebo[m]> headius: try running something which traditionally does not fit bytecode wise
<headius[m]> oh that's what you meant
<enebo[m]> yeah
<headius[m]> I'm not sure what to run...parser gem methods are so gigantic they won't even be JIT candidates due to IR size
<headius[m]> actually I forgot we're not limiting JIT by bytecode size, we're limiting it by IR size
<headius[m]> so this won't change candidates but it will make better use of the IR size we do allow
<headius[m]> we don't have a good metric to translate IR size to bytecode size right now
<headius[m]> this is what kares was talking about when he said we might want to reinstate bytecode limit instead of IR limit
<headius[m]> parse gem's biggest methods are also gigantic case/when which maybe needs better IR to begin with
<headius[m]> bloody shame that Ruby doesn't have a constant-time switch
<headius[m]> enebo: your giant metaspace was from RG?
<enebo[m]> headius: but we can adjust the limits of IR to be larger if we know we are generating less actual bytecode
<enebo[m]> headius: yeah from running rg.org profile benchmark for about 2 hours
<enebo[m]> it topped out at 1.5G with a pretty tiny heap
<enebo[m]> I believe our slides had the heap size on it somewhere
<enebo[m]> We may actually run in a much smaller heap now too due to 9.2.8 changes
<enebo[m]> hello world rails app dropped by nearly 45M or so
<enebo[m]> 9.2.8 probably ends up as 1.4G
<headius[m]> oh yeah true
<headius[m]> we did a ton of memory reduction
<enebo[m]> our heap situation is really pretty good
<kares[m]> yep heap is good
<kares[m]> havent put snapshot out yet
<kares[m]> too much other stuff to handle + want to put in enebo's branch work and more experiments of mine - want to try sharing CL based on .rb file
<kares[m]> to reduce meta fragmentation
jrafanie has joined #jruby
lucasb has joined #jruby
<headius[m]> that's a pretty good idea
<headius[m]> too many projects
<headius[m]> branch is green in any case
<headius[m]> kares: dunno if you've been following but moving all literal values to indy seems promising
<headius[m]> if you run with indy this will be no gain really
<headius[m]> but running without indy this should be much less bytecode + class/method structure to stick in meatspace
<headius[m]> it will be important for us moving forward with a tiered indy approach too...always do literals with indy, and do calls with simple indy that promotes to more specialized when hot
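The "simple indy that promotes when hot" idea can be sketched with a `MutableCallSite` that starts on a generic slow path and rebinds itself once the site proves hot — a pure-JDK illustration with hypothetical names and thresholds, not JRuby's actual tiering:

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class PromotingCallSite extends MutableCallSite {
    private int count;
    private final String name;

    PromotingCallSite(MethodType type, String name) {
        super(type);
        this.name = name;
    }

    // Hypothetical bootstrap: the site begins bound to the fallback.
    public static CallSite bootstrap(MethodHandles.Lookup lookup,
                                     String name, MethodType type) {
        PromotingCallSite site = new PromotingCallSite(type, name);
        try {
            MethodHandle fallback = MethodHandles.lookup()
                .findVirtual(PromotingCallSite.class, "fallback",
                             MethodType.methodType(Object.class))
                .bindTo(site);
            site.setTarget(fallback);
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
        return site;
    }

    public Object fallback() {
        if (++count >= 2) {
            // promote: from here on the JVM sees a constant target
            setTarget(MethodHandles.constant(Object.class, "fast:" + name));
        }
        return "slow:" + name;
    }

    public static void main(String[] args) throws Throwable {
        CallSite site = bootstrap(MethodHandles.lookup(), "greet",
                                  MethodType.methodType(Object.class));
        System.out.println(site.dynamicInvoker().invoke()); // slow:greet
        System.out.println(site.dynamicInvoker().invoke()); // slow:greet (promotes)
        System.out.println(site.dynamicInvoker().invoke()); // fast:greet
    }
}
```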
<headius[m]> hard to infer anything from travis build times but branch does not appear to be any slower, and some jobs faster
<headius[m]> compiler specs are notably slower...but again could just be different VM
<headius[m]> total time 3h12 vs 3h51 seems pretty good if that smooths out VM differences
<headius[m]> woah why's all this stuff in allow failures now
<headius[m]> not sure why those got moved to allow but I moved them back
<headius[m]> for some reason that complete jar extended test started failing but we didn't see it because of this
<headius[m]> also weird that mri:stdlib did not hang on my branch 🤯
<kares[m]> need to read back all the way
<headius[m]> you can get the gist of it from https://github.com/jruby/jruby/pull/5874
<kares[m]> but there's definitely a lot of lambda forms already in the class list ...
<kares[m]> probably too tired - just getting to my destination after a whole day ...
<headius[m]> yeah that will be the question
<headius[m]> these are super trivial indy sites
<headius[m]> call site + constant method handle + cached value
<headius[m]> like enebo mentioned in many cases we might even be able to reuse the same MH and possibly even the call site for the same value, like having all Symbol objects aggregate a constant MH or ConstantCallSite
<headius[m]> no rush on this
<headius[m]> we just want some bigger apps to throw at it
jrafanie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
throstur_ has quit [Ping timeout: 265 seconds]
<rdubya[m]> the indy stuff made the startup up to 10 seconds faster
<rdubya[m]> hopefully I did the comparison right and there isn't something else at play there
<rdubya[m]> seems to hold true when running `rails s` as well, worst case they do about the same, best case indy wins by a couple seconds
<rdubya[m]> and considering it typically takes 80-90 seconds for our app to start, trimming off 2-3 seconds is a couple of percent performance boost 🙂
<headius[m]> rdubya: wow that's unexpected
<headius[m]> we could be seeing bytecode verification costs here
<headius[m]> I mean if it's 50% more bytecode for that extra stuff that's 50% more verification time on boot
throstur_ has joined #jruby
<headius[m]> and we know verification cost is a huge part of startup normally
<enebo[m]> rdubya: that is pretty interesting...
<enebo[m]> rdubya: how long is startup normally if it saves 10s?
<enebo[m]> oh 80-90
<enebo[m]> err I am confused by the 10s faster
<rdubya[m]> that was when i ran `rails -h`
<rdubya[m]> my gist above has the numbers
<enebo[m]> it looked 1s faster
<rdubya[m]> the fastest time 9.2.8 ran -h in was about 37 seconds of user time
<rdubya[m]> indy's fastest got down to 29
<enebo[m]> oh user cpu
<enebo[m]> I thought you were looking at wall clock time
<enebo[m]> that is a big reduction in CPU though. Maybe it is verification
<rdubya[m]> real and sys stay pretty much the same
<enebo[m]> I guess it is building a smaller class too
<enebo[m]> real drops noticeably doesn't it? 1-1.5s
<rdubya[m]> that's true it does
<enebo[m]> I like that user went down but this is one stat we have never exactly tried to improve
<rdubya[m]> i hadn't noticed that
<enebo[m]> so less bytecode generation and less verification plus how literals get loaded happens differently obviously
subbu is now known as subbu|away
<rdubya[m]> yeah, reran my other tests too, rails s looks to be a second or so faster real time and user time drops by a few seconds
<rdubya[m]> that's just to get the server started and to the point of listening to requests, on my machine that takes ~25 seconds
<rdubya[m]> in our containers it can take over a minute
<enebo[m]> if you want to cp my commit from nuke_counter onto it and use JRUBY_OPTS=-Xjit.time.delta=800000
<enebo[m]> I am curious if you see time go up or down
<enebo[m]> I actually expect a lot less compilation so user and memory should drop
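[editor's note: the time-delta idea described above can be sketched roughly as follows. This is a hypothetical illustration of a time-windowed JIT threshold, not JRuby's actual implementation; the class name, threshold, and window logic are all assumptions. The point is that a method only queues for compilation if it accumulates enough calls within the delta window, so methods that are merely called often over a long run never compile:]

```ruby
# Hypothetical sketch (not JRuby code) of a time-windowed JIT threshold:
# a method is queued for compilation only if it reaches `threshold`
# calls within `delta_ns` nanoseconds of the window start.
class TimeWindowedProfile
  def initialize(threshold: 50, delta_ns: 100_000)
    @threshold = threshold
    @delta_ns = delta_ns
    @count = 0
    @window_start = nil
  end

  # Returns true when the method is "hot enough" to JIT.
  def record_call(now_ns)
    if @window_start.nil? || now_ns - @window_start > @delta_ns
      # Window expired (or first call): restart counting from here.
      @window_start = now_ns
      @count = 0
    end
    @count += 1
    @count >= @threshold
  end
end
```

With this shape, raising the delta (as with `-Xjit.time.delta=800000` above) widens the window and lets more methods qualify; shrinking it prunes compilation to only the most densely-called methods.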
<rdubya[m]> i'll give it a shot
<enebo[m]> locally rails -h with a single controller drops 8-9s of user but also is almost 1s faster to start
<enebo[m]> but with that said I think I am eliminating some methods which should get compiled (work in progress)
<rdubya[m]> I just merged in that branch
<rdubya[m]> and now the real time is around 7.4 seconds and user is around 25 seconds
<rdubya[m]> whether I include the delta time or not
<rdubya[m]> 9.2.8 has real -> 10s and user -> 38 s
<rdubya[m]> so it looks like the two changes together trim about 2 and a half seconds off of real time, whether I specify a delta or not
<enebo[m]> default delta is 1/8 of what I asked
<enebo[m]> I have not seen compiling less affecting startup very much but it will affect peak
<enebo[m]> if you enable JIT logging you will see much much less compilation going on so memory and metaspace should drop
<enebo[m]> I think one missing thing is adding loop detection on back edges into the counts so bigger methods with loops called less frequently but still hot will compile
<enebo[m]> something with a loop probably will call enough crap that it won't be seen as happening as frequently
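[editor's note: the back-edge idea above could look roughly like this. Hypothetical sketch, not JRuby code; names and the weighting are assumptions. Loop back-edges count toward hotness alongside method entries, so a rarely-called method with a hot loop still crosses the threshold:]

```ruby
# Hypothetical sketch: loop back-edges contribute to the hotness count,
# so a method entered only once but looping heavily still gets queued
# for JIT compilation.
class HotnessCounter
  def initialize(threshold: 1_000)
    @threshold = threshold
    @count = 0
  end

  # Called once per method entry.
  def record_call
    bump
  end

  # Called once per loop back-edge taken inside the method.
  def record_back_edge
    bump
  end

  private

  def bump
    @count += 1
    @count >= @threshold  # true => hot enough to compile
  end
end
```

With entry counting alone, a method entered once with a 10,000-iteration loop would never compile; counting back-edges, it crosses the threshold mid-loop.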
<enebo[m]> anyways I need to make some better test cases to figure this out
<enebo[m]> 2-2.5s off on rails -h is pretty impressive though
<rdubya[m]> if i drop the delta to 1000 it has worse performance, but still better than existing
<rdubya[m]> just realized you had 800_000 up there not 80_000
<rdubya[m]> no wonder there isn't much difference lol
<enebo[m]> I think 100000 is the default
<enebo[m]> yeah much higher (it is ns)
<rdubya[m]> 100k is slightly better than 80k
<rdubya[m]> 800k is worse than 100k
<rdubya[m]> and 1k is worse than 100k
<enebo[m]> Having a compiler on top of a compiler, where the lower compiler can do extra magic if enough native stuff is smooshed next to each other, is our confounding factor here
<enebo[m]> yeah when you increase it enough everything compiles and much of that is a hindrance to startup
<enebo[m]> I think much of it is just not helpful for peak either but it is tough to logically justify what is being pruned
<enebo[m]> HotSpot will look down the call stack when deciding to compile and may compile up several levels right away
<enebo[m]> That is another consideration I have been thinking about (not in the same way but just the notion that n call levels all native end up being something C2 will be able to do opts with)
throstur_ has quit [Ping timeout: 246 seconds]
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
throstur_ has joined #jruby
subbu|away is now known as subbu
<headius[m]> I want to see metaspace with all our changes plus some classloader batching
<headius[m]> I think we are on to some good stuff
<headius[m]> Reducing JIT to hottest stuff is going to put less load on both code cache and CPU cache too I suspect
<headius[m]> Oh we need to test this stuff in Java 11 too...there's many incremental improvements to Indy and method handles that could sweeten the deal
lucasb has quit [Quit: Connection closed for inactivity]
<lopex> wow lots of indy stuff talk
<lopex> so 8 and 11 changes are big deal ?
throstur_ has quit [Ping timeout: 276 seconds]
drbobbeaty has joined #jruby
<headius[m]> Maybe, but more interesting is how we are trying to refine the JIT
<headius[m]> My branch uses Indy for more stuff by default and that seems to be a win over generating bytecode
<headius[m]> enebo has a branch that changes jit threshold to be based on time, like have to hit threshold within some time range to jit
<headius[m]> kares is playing with better use of classloader to reduce load on metaspace
throstur_ has joined #jruby
<lopex> cool
<lopex> but different thresholds might be affected differently by vendors even
<lopex> thumbs up btw
<headius[m]> yes certainly, and enebo and I discussed ergonomics based on the host system too since slower systems might not trigger jit in the same time range as faster ones