_whitelogger has joined #jruby
rusk has joined #jruby
drbobbeaty has joined #jruby
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
shellac has joined #jruby
drbobbeaty has joined #jruby
shellac has quit [Ping timeout: 250 seconds]
lucasb has joined #jruby
oblutak18 has joined #jruby
xardion has quit [Remote host closed the connection]
xardion has joined #jruby
rusk has quit [Remote host closed the connection]
subbu is now known as subbu|lunch
oblutak18 has quit [Remote host closed the connection]
subbu|lunch is now known as subbu
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
lucasb has quit [Quit: Connection closed for inactivity]
<lopex> headius[m]: so even the slightest load on the G1 side causes the issue ? according to Aleksey's answer ?
<headius[m]> I think he just means that even though the write barriers are cheap, if you're doing billions of assignments you will see an impact
<headius[m]> A microsecond here and a microsecond there
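headius's point can be illustrated with a toy card-marking write barrier: each barrier is nearly free, but it runs on every reference store, so billions of stores still add up. (This is a made-up flat-address-space model for illustration only; G1's real barrier also does SATB enqueueing and dirty-card filtering.)

```ruby
# Toy model of a card-marking GC write barrier. Every reference store
# dirties the 512-byte "card" covering the written field, so the cost
# is tiny per store but proportional to the total number of stores.
CARD_SIZE = 512  # bytes of heap covered by one card

class CardTable
  attr_reader :marks

  def initialize(heap_bytes)
    @cards = Array.new(heap_bytes / CARD_SIZE, false)
    @marks = 0
  end

  # Invoked on every reference store; field_addr is the (made-up)
  # address of the field being written.
  def write_barrier(field_addr)
    @cards[field_addr / CARD_SIZE] = true  # dirty the covering card
    @marks += 1                            # the per-store cost we pay
  end

  def dirty_cards
    @cards.count(true)
  end
end

table = CardTable.new(1 << 20)  # 1 MiB toy heap
10_000.times { |i| table.write_barrier((i * 64) % (1 << 20)) }
```

The barrier fires once per store regardless of how few distinct cards end up dirty, which is exactly the "microsecond here, microsecond there" effect.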
oblutak18 has joined #jruby
<lopex> hm
<lopex> so what if one could do only newgen ?
<lopex> since there are no leaks anyway ?
<lopex> I guess, I'm missing some intuition here
<headius[m]> Well I'm curious about this passive mode for Shenandoah
<headius[m]> I assume the trade-off is GC pauses get longer
<lopex> tradeoffs
<lopex> icms is still a thing ?
<lopex> but intuition would say, change the parameters to put most of the burden on the young parallel gc
<headius[m]> With a bit of allocation profiling, we could probably improve performance a lot
<lopex> wrt the joni ext thing, I guess I'll ask the reporter to check search against match
<lopex> since we could indeed waste a lot of time in those skip routines
<lopex> those methods are big and might not be compiled
<lopex> there have been some improvements in onigmo this year too
<lopex> especially fixes for Sunday search
<headius[m]> Yeah I have not had a chance to try running what he provided
<lopex> but he does use search, and we would know more if only match was used
<lopex> but given the timings I guess it's not too shabby having range checks everywhere
<lopex> we could even ship the ext for some platforms
<lopex> anyways, I'll happily assist with profile analysis
<lopex> also, the stack is more fragmented
<lopex> headius[m]: originally joni used single int[] for stack, and a bunch of offsets for a frame
whitingjr has quit [Ping timeout: 272 seconds]
<enebo[m]> Does anyone know if bouncy castle loads every possible fucking thing in the library or if the ruby openssl library is that eager?
<enebo[m]> like 60% of a rails app's String data appears to just be BC strings
<lopex> lol
<lopex> or maybe ruby code triggers that ?
<enebo[m]> invokedynamic really likes making new var1-n strings too :P
<lopex> like initializing constants
<enebo[m]> lopex: yeah could be
<lopex> enebo[m]: strings on heap ?
<lopex> via specific query ?
<enebo[m]> yeah
<lopex> or just by clicking ?
<enebo[m]> inspecting heap dump with visualvm
<enebo[m]> String + char[] is like 10M of 120M rails process
<enebo[m]> ByteList + byte[] is also ~10M
<lopex> actually I couldn't say if it's high or low
<enebo[m]> although bytelists are much more complicated to look at since I see a lot of COW action
<lopex> yeah
<lopex> most probably are not
<lopex> cow is so tempting and so cursed
<enebo[m]> lopex: yeah I more and more feel it is cursed
<lopex> I think it's cursed too
<lopex> well, easy to remove though
<enebo[m]> select count(filter(heap.objects('org.jruby.util.ByteList'), 'it.bytes.length == 15'))
<enebo[m]> 19464.0
<lopex> what does it tell though ?
<enebo[m]> select count(filter(heap.objects('org.jruby.util.ByteList'), 'it.bytes.length == 15 && it.realSize != 15'))
<enebo[m]> 16491.0
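Plugging the two query results above into quick arithmetic (numbers copied from the session): of the ByteLists allocated with the default 15-byte backing array, roughly 85% never filled it, which lines up with enebo's "80% of the time" estimate.

```ruby
# Of the ByteLists whose backing array is exactly the 15-byte default,
# how many have realSize != 15, i.e. overcommitted capacity?
total_15cap   = 19_464  # bytes.length == 15
overcommitted = 16_491  # bytes.length == 15 && realSize != 15
overcommit_pct = (overcommitted * 100.0 / total_15cap).round(1)
```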
<lopex> can you compare by ref ?
<lopex> er
<lopex> group by ref
<lopex> and then size
<enebo[m]> I don't recall how 15 as the starting length in createByteList in the lexer came about
<lopex> yeah, but it might be many small strings too
<enebo[m]> I am really just wondering how much we overcommit here and it looks like 80% of the time
<enebo[m]> but bytelist has a cost if you start to small and the string is larger
<enebo[m]> lopex: I increased this from 6 to 15 at the same time I stopped intern some ident strings
<enebo[m]> I feel this would only reduce the number of grows by 1 for strings > 15
<enebo[m]> which is 20% of the strings
<enebo[m]> going to use science! well, be a little hacky and consider strings created during a rails app start
<enebo[m]> OMGZ I forgot bytelist does not use a scaling factor
<lopex> enebo[m]: change it to some larger arbitrary value
<lopex> so the chance of misinterpreting gets lower
<enebo[m]> buffer.append(c); // O_o
<enebo[m]> this might be significant
<enebo[m]> StringTerm and no doubt HeredocTerm call append(byte) n times for an n-length string, which will call grow(1) n times
<enebo[m]> so my default of 15 hid the cost by making 80% of all things fit by default
<enebo[m]> OTOH this never shows up in profiling I guess we will see
<enebo[m]> This also could explain why oj dump speed is so poor using ByteList as the backing store
<enebo[m]> I thought it was just because of constant bounds checking but if it has no scaling factor I am dumping a 15k json file with 15k System.arraycopys
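The per-byte append cost being described can be sketched with a toy simulation (hypothetical names; ByteList's real grow logic lives in org.jruby.util.ByteList): count how many reallocate-and-copy operations appending n bytes one at a time triggers under two growth policies.

```ruby
# Simulate appending n bytes one at a time, counting how many times the
# backing array must be reallocated and copied.
def copies_when_appending(n, initial_cap)
  cap, copies = initial_cap, 0
  n.times do |len|
    if len == cap
      cap = yield(cap)  # the growth policy picks the new capacity
      copies += 1
    end
  end
  copies
end

# grow-by-exactly-one: O(n) reallocations, so O(n^2) bytes copied
exact = copies_when_appending(15_000, 15) { |cap| cap + 1 }

# 1.5x geometric growth (cap + cap/2): O(log n) reallocations
geo = copies_when_appending(15_000, 15) { |cap| cap + (cap >> 1) }
```

With geometric growth the 15k-byte dump needs only a handful of arraycopies instead of one per byte past the initial capacity.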
<lopex> I'm only getting more stupid as I age wrt those things
<enebo[m]> I dumped all str lengths made and it is fascinating how small most strings are
<enebo[m]> I guess Ruby encourages interpolation enough where each fragment is generally small
<lopex> otoh, did we see any perf bump once java moved to compact strings ?
<enebo[m]> which java version added them?
<lopex> 9
<headius[m]> Is that a histogram?
<enebo[m]> yeah
oblutak18 has left #jruby [#jruby]
<enebo[m]> gem list is unaffected by just making the array a lot larger but I don't like this and rails will definitely make a lot more strings
<enebo[m]> anyways I will play with this tomorrow
<enebo[m]> add in a scaling factor and maybe also do inlined CR calc instead of walking the string a second time
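The "inlined CR calc" idea can be sketched like this (toy code with invented names; JRuby's real coderange scan handles multi-byte encodings and a three-state coderange, not just the 7-bit check shown here): keep a running flag while appending instead of walking the buffer a second time.

```ruby
# Track "all bytes seen so far are 7-bit" during append, so the
# coderange answer is already known when the string is finished.
class Appender
  attr_reader :bytes

  def initialize
    @bytes = []
    @seven_bit = true
  end

  def append(b)
    @seven_bit &&= b < 0x80  # update the coderange flag as we go
    @bytes << b
  end

  # no second walk over the buffer needed
  def ascii_only?
    @seven_bit
  end
end

a = Appender.new
"héllo".bytes.each { |b| a.append(b) }
```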
<headius[m]> Is that a heap histogram or an allocation profile?
<enebo[m]> the latter I have thought about in the past but I will talk to Kevin before I attempt it
<enebo[m]> it is stringterm bytelist sizes
<enebo[m]> I made the histogram looking at only that
<headius[m]> But this is from a heap snapshot, yes? live objects?
<enebo[m]> only other thing which is doing this is heredoc itself which will typically be larger strings
<enebo[m]> this is from rails s and killing it
<enebo[m]> I made this from printlns
<headius[m]> Okay so it is all allocations
<enebo[m]> all allocations of normal strings in stringterm
<enebo[m]> so very specific thing
<headius[m]> I believe RubyString imposes a growth factor when growing the ByteList
<headius[m]> So that end of things may be better
<headius[m]> 1.5x or something
<enebo[m]> yeah and this has nothing to do with RubyString and happens before it ever is actually a string
<enebo[m]> it is the parser reading strings in the lexer and not even the only path just the most common one
<enebo[m]> dinner though...this is unneeded churn even if I cannot measure much
<headius[m]> Yeah, I was just wondering if we should be looking harder at ByteList in other allocation profiles too
<enebo[m]> possibly
<enebo[m]> I am definitely going to look around
<enebo[m]> anything which does an append directly is a suspect
<enebo[m]> what the hell
<enebo[m]> there is a growth factor in here
<lopex> I wonder, since realloc in java is new-and-copy it should be quite easily localized by tools by this pattern
<enebo[m]> I am seriously confused now
<enebo[m]> newSize >> 1
<lopex> this way we could learn about reallocs
<lopex> is it not ?
<enebo[m]> lopex: I was super confused. I did not see the newSize + (newSize >> 1), so everything I said above is not actually a growth factor issue
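For reference, the expression enebo spotted, newSize + (newSize >> 1), is the standard integer idiom for a 1.5x growth factor (the JDK's ArrayList grows the same way): the right shift is a truncating divide by two.

```ruby
# The 1.5x growth idiom: n + (n >> 1) == n + n/2 for non-negative n.
def grow_1_5x(new_size)
  new_size + (new_size >> 1)
end
```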
jrafanie has joined #jruby
jrafanie has quit [Client Quit]
<enebo[m]> The choice of 15 may be a bad default though for StringTerm strings
<headius[m]> Yeah
<lopex> enebo[m]: where would you see that ?
<lopex> I'm confused
<enebo[m]> in grow()
<headius[m]> StringBuffer defaults to 16
<enebo[m]> it could be where the number came from
<headius[m]> But it's also only used if someone plans to mutate
<headius[m]> Could be
<enebo[m]> I bet I just did the histogram before but for gem list
<enebo[m]> 80% requires no grow() so that is pretty nice
<headius[m]> Right-sizing some of these BLs could give us free memory reduction
<enebo[m]> well that was why I was looking
<lopex> headius[m]: also aggressive cow could trigger more barriers
<enebo[m]> but I don't want to trade off any perceived startup for that
<lopex> since things like empty strings for different encodings might be accessed
<enebo[m]> especially since memory problems are not actually in the heap at all right now
<lopex> potentially changed
<enebo[m]> so far most memory improvements have no effect on startup so I just want to continue the trend
<enebo[m]> when I say most I express doubt in that measuring wall clock is a bit noisy
<lopex> and all that bit flipping in flags
<enebo[m]> but after all the changes if anything we may be a tiny bit faster
<lopex> it's a mess
<enebo[m]> anyways dinner for reals now
<lopex> headius[m]: and all those potential leaks for arrays we talked about years ago
<headius[m]> Yeah
<lopex> I think it's the first COW we should get rid of
<lopex> it could help gc
<headius[m]> It wouldn't be too difficult to remove copy on write and try some things out
* rtyler waves
<lopex> what things ?
<headius[m]> Yeah possibly
<lopex> just remove the cow
<headius[m]> I'm sure we're screwing up some object age metrics by keeping these backing arrays around
<headius[m]> I mean things likely to hit arrays hard, like any typical Ruby application. Just see how bad the allocation curve looks, if it looks bad at all
<lopex> but it's hard to measure
<lopex> well, impossible
<lopex> like hmm
<lopex> imagine a pathological benchmark
<headius[m]> Why impossible? If the heap is significantly bigger, that tells us something. If applications run slower or faster, that tells us something too
<lopex> fill an array with some distinct objects
<lopex> make a slice
<lopex> operate on them
<headius[m]> Primary concern for me is always real applications versus synthetic benchmarks. Obviously we can show a performance hit for things like heavy array slicing
<lopex> yeah I know
<headius[m]> I guess the corollary to this is that I have no idea how common it is to heavily slice up an array
<headius[m]> That has always been the case we bring up when discussing removing copy on write, but do we really know it's a problem?
<headius[m]> Only we and MRI do COW for Array
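A minimal sketch of what Array copy-on-write buys and costs (toy code, not JRuby's RubyArray): the slice is O(1) because it shares the parent's backing store, which is exactly what keeps the whole backing array alive, and the deferred copy is paid on the first write.

```ruby
# A slice that shares its parent's backing array until first write.
class CowSlice
  def initialize(backing, offset, length)
    @backing, @offset, @length = backing, offset, length
    @shared = true
  end

  def [](i)
    @backing[@offset + i]
  end

  def []=(i, v)
    if @shared
      # the copy we deferred at slice time happens here
      @backing = @backing[@offset, @length]
      @offset = 0
      @shared = false
    end
    @backing[i] = v
  end

  def shared?
    @shared
  end
end

big = (0...1_000).to_a
slice = CowSlice.new(big, 100, 10)  # no copy yet: keeps `big` alive
```

Removing COW means every slice pays its copy up front, in exchange for no shared backing arrays skewing retention and object ages.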
<lopex> no standard metrics
<lopex> er, no data I mean
<lopex> and mri does more now
<headius[m]> Yeah
<lopex> since it packs small strings in unions
<lopex> so it's like 4x improvements just on allocations
<lopex> that extra 1x was on java meta data :P
<lopex> but mri can indeed do a whole string in a single alloc
<lopex> I forget what state their gc is in
<lopex> but that surely helps
<lopex> headius[m]: btw I'm running a semi-important production jruby app in docker now
<lopex> so I'd be interested in the state of those images
<headius[m]> We could pack very small strings into the header
<lopex> like longs ?
<lopex> or/and unsafe ?
<headius[m]> Yeah
<lopex> we have lots of ints though without unsafe
<lopex> er
<lopex> hash only I guess
<lopex> and flags
<lopex> would we know how RubyString fields are aligned on common platforms ?
<lopex> bleh, it's too late now here
<headius[m]> I'm not sure
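headius's "pack very small strings into the header" idea, in toy form: up to 7 bytes plus a length byte fit in one 64-bit word, so a short string needs no separate byte[] at all. (Hypothetical layout; in Java this would be a long field in the object header region, and Ruby integers here are arbitrary precision, so this only models the bit layout.)

```ruby
# Pack a string of up to 7 bytes into a single 64-bit word:
# the low byte holds the length, bytes 1..7 hold the content.
def pack_small(str)
  bytes = str.bytes
  return nil if bytes.size > 7  # doesn't fit embedded
  word = bytes.size
  bytes.each_with_index { |b, i| word |= b << (8 * (i + 1)) }
  word
end

def unpack_small(word)
  len = word & 0xFF
  (1..len).map { |i| (word >> (8 * i)) & 0xFF }.pack("C*")
end
```

Like MRI's embedded strings, this trades a size limit for eliminating a second allocation (and its GC metadata) per small string.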
<lopex> headius[m]: btw have you seen http://localhost.run/ ?
<headius[m]> <lopex "headius: btw have you seen http:"> Isn't that sending all traffic through them?
<lopex> lol that tunnels ?
<headius[m]> Interesting but it's basically just ssh port forwarding eh?
<headius[m]> Yeah -R is for mapping remote port to local one
<lopex> and I'm afraid lots of ppl use https://anydesk.com/
<lopex> look at that
<lopex> yeah
<lopex> and you get a name with that forward
<lopex> just for testing
<lopex> but anydesk is overused
<lopex> a lot
<lopex> headius[m]: but exposing some testing env to the world is not necessarily dangerous
<lopex> well, you do it via third party yes
<lopex> headius[m]: that's why I'm keeping my rpi at home for serious stuff
<headius[m]> Yeah, I'd hope my web developer knows how to open a port on their own, or put the site somewhere I can reach it
<lopex> yeah, sure
<lopex> but sometimes it's troublesome
<lopex> headius[m]: same for teamviewer btw
<lopex> headius[m]: we have a lot of cases where clients use some weird puls software to connect
<lopex> and it's almost unusable
<lopex> so sure, ssh is always the best
<lopex> and there's lots of windows there so what can you do
<lopex> wsl2 when it comes might be worthwhile
<lopex> since it's a true linux kernel under the hood
<lopex> headius[m]: we have like 2% windows servers
<lopex> but that's what bites you
<headius[m]> Yeah I am keen to try it out
<lopex> wsl2 ?
<lopex> they promise a lot