nirvdrum has joined #jruby
drbobbeaty has joined #jruby
nirvdrum has quit [Ping timeout: 265 seconds]
nirvdrum has joined #jruby
nirvdrum has quit [Ping timeout: 255 seconds]
ur5us has quit [Ping timeout: 240 seconds]
nirvdrum has joined #jruby
nirvdrum has quit [Ping timeout: 265 seconds]
nirvdrum has joined #jruby
nirvdrum has quit [Ping timeout: 260 seconds]
nirvdrum has joined #jruby
nirvdrum has quit [Ping timeout: 258 seconds]
nirvdrum has joined #jruby
nirvdrum has quit [Ping timeout: 240 seconds]
rusk has joined #jruby
ksawery[m] has quit [Quit: Idle for 30+ days]
lopex[m] has left #jruby ["Kicked by @appservice-irc:matrix.org : Idle for 30+ days"]
pedran[m] has left #jruby ["Kicked by @appservice-irc:matrix.org : Idle for 30+ days"]
anubhav8421[m] has quit [Quit: Idle for 30+ days]
mischa[m] has quit [Quit: Idle for 30+ days]
louis[m] has quit [Quit: Idle for 30+ days]
nieve[m] has quit [Quit: Idle for 30+ days]
ilikeorangutans[ has quit [Quit: Idle for 30+ days]
amiracam[m] has quit [Quit: Idle for 30+ days]
ezzeddine[m] has quit [Quit: Idle for 30+ days]
harald[m]3 has quit [Quit: Idle for 30+ days]
aaronkelton[m] has quit [Quit: Idle for 30+ days]
lucasb has joined #jruby
nirvdrum has joined #jruby
shellac has joined #jruby
shellac has quit [Read error: Connection reset by peer]
nirvdrum has quit [Ping timeout: 252 seconds]
shellac has joined #jruby
nirvdrum has joined #jruby
rusk has quit [Remote host closed the connection]
<headius[m]> Good morning
<headius[m]> lopex: hey last chance to take over docker... we need to get that done
<headius[m]> klobuczek: any updates on your end?
<headius[m]> I just ran a quick benchmark measuring java_method and did not see a substantial slowdown yet
<headius[m]> oddly enough it's a bit faster
shellac has quit [Ping timeout: 258 seconds]
joast has quit [Quit: Leaving.]
<lopex> headius[m]: yep
<headius[m]> I have a patch that splits up java_method searching by name, so for non-overloaded cases it will just be the cost of a stream-based filter and for overloaded cases it will be that plus a linear search of all overloads
<headius[m]> remaining overhead at this point appears to be mostly the cost of creating the Ruby Method and the internal invoker for the method
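A minimal sketch of the name-first lookup described above, using plain java.lang.reflect rather than JRuby's internal types (class and method names here are hypothetical, for illustration only):

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of splitting java_method lookup by name: a stream-based
// filter narrows candidates to one name, and only overloaded names pay for a linear
// scan of parameter lists on top of that.
class JavaMethodLookupSketch {
    static Method find(Class<?> target, String name, Class<?>... argTypes) {
        List<Method> byName = Arrays.stream(target.getMethods())
                .filter(m -> m.getName().equals(name))
                .collect(Collectors.toList());
        if (byName.size() == 1) return byName.get(0);            // non-overloaded: filter cost only
        for (Method m : byName) {                                 // overloaded: linear search of overloads
            if (Arrays.equals(m.getParameterTypes(), argTypes)) return m;
        }
        return null;
    }
}
```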
fzakaria[m] has quit [Ping timeout: 240 seconds]
harpritsinghkohl has quit [Ping timeout: 240 seconds]
sandio[m] has quit [Ping timeout: 240 seconds]
claudiuinberlin[ has quit [Ping timeout: 240 seconds]
slackfan[m] has quit [Ping timeout: 240 seconds]
JasonRogers[m] has quit [Ping timeout: 240 seconds]
byteit101[m] has quit [Ping timeout: 240 seconds]
BlaneDabneyGitte has quit [Ping timeout: 256 seconds]
rebelwarrior[m] has quit [Ping timeout: 256 seconds]
FlorianDoubletGi has quit [Ping timeout: 256 seconds]
harlemsquirre4 has quit [Ping timeout: 240 seconds]
kares[m] has quit [Ping timeout: 240 seconds]
ChrisSeatonGitte has quit [Ping timeout: 240 seconds]
<headius[m]> enebo: you around today?
fzakaria[m] has joined #jruby
BlaneDabneyGitte has joined #jruby
FlorianDoubletGi has joined #jruby
rebelwarrior[m] has joined #jruby
harpritsinghkohl has joined #jruby
slackfan[m] has joined #jruby
byteit101[m] has joined #jruby
claudiuinberlin[ has joined #jruby
sandio[m] has joined #jruby
JasonRogers[m] has joined #jruby
kares[m] has joined #jruby
harlemsquirre4 has joined #jruby
ChrisSeatonGitte has joined #jruby
<headius[m]> klobuczek: so good news and bad news
<headius[m]> finally back at work and got around to benchmarking java_method
subbu is now known as subbu|lunch
<headius[m]> the good news is that it seems faster in 9.2.10... which is sort of bad news for us finding your perf regression
<headius[m]> I've made additional changes to make it even faster but this is not really helping our current conundrum
<headius[m]> I could go further by more aggressively caching the resulting method object, making subsequent lookups cost only as much as building a string key based on name + arg types
<headius[m]> but again that is just icing on the cake...java_method does not appear to be slower in 9.2.10
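A rough sketch of the more aggressive caching mentioned here, assuming a hypothetical cache keyed by name plus argument types (not JRuby's actual data structures):

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration: once a method is resolved, subsequent java_method lookups
// would only pay for building the string key from the name and argument types.
class MethodCacheSketch {
    private final Map<String, Method> cache = new ConcurrentHashMap<>();

    Method lookup(Class<?> target, String name, Class<?>... argTypes) {
        StringBuilder key = new StringBuilder(name);
        for (Class<?> t : argTypes) key.append(';').append(t.getName());
        return cache.computeIfAbsent(key.toString(), k -> {
            try {
                return target.getMethod(name, argTypes);  // stand-in for the real overload search
            } catch (NoSuchMethodException e) {
                return null;
            }
        });
    }
}
```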
<klobuczek[m]> @headius I didn't get a chance to successfully get through the git bisect
<headius[m]> what can I do to help you
joast has joined #jruby
<klobuczek[m]> what's the simplest thing we can do to narrow down the source of the problem?
<klobuczek[m]> in limited time
<headius[m]> well kares' suggestion to get timings of your specs is a good one, because that would reduce how much code you have to run to reproduce it
<headius[m]> obviously it's not going to scale well trying to bisect with an 80-minute test suite
<klobuczek[m]> I'm doing that as we speak
<headius[m]> perfect I will stand by then
<klobuczek[m]> so I got down to 1 spec (albeit with a lot of setup) with 3.2 sec vs 5.88 sec
<klobuczek[m]> what do you suggest next?
<klobuczek[m]> I could retry the bisect
<headius[m]> well that sounds like a good candidate
<headius[m]> bisect is probably best since that's pretty short
<headius[m]> so that's the main source of slowdown? You mentioned some extreme increase in test time
<headius[m]> like 15 min to 80 min or something
<klobuczek[m]> yes, for the entire suite of ~2K specs
<klobuczek[m]> 5.88 sec is the slowest one out of the 2K
<klobuczek[m]> and it exposes only a factor of ~2, but might be good enough
<headius[m]> kares: question for you
<headius[m]> I ran into this today while optimizing java_method: https://github.com/jruby/jruby/commit/250976c35aed2194d78f3b4e2505602ca9a5e65d
<headius[m]> ah wait nevermind, need to bisect further back
<headius[m]> that prettyParams has an unusual amount of overhead because it bottoms out in some native code to get the canonical name of the arg types
<headius[m]> I'm trying to track whether we added it as a nice to have or what
<headius[m]> wow
<headius[m]> ok really nevermind...this goes all the way back to when I added java_method
<headius[m]> in 2009 😳
<headius[m]> klobuczek: ok I think bisecting with that spec is a good path for now then
<klobuczek[m]> yes, it shows the difference very clearly, 2 steps remaining
<headius[m]> fingers crossed
<headius[m]> hmmm
<headius[m]> that is surprising
<headius[m]> previous commit returns that spec to original speed?
<klobuczek[m]> yes
<headius[m]> this is only used when constructing backtraces for `caller`
<klobuczek[m]> which previous commit do you want me to confirm?
<enebo[m]> my browser crashed and I lost tabs including this one
<headius[m]> klobuczek: whatever commit is before that one you bisected to
<headius[m]> I am looking at whether `count` passed in there might be unreasonably large in some places
<enebo[m]> a raw ArrayList starts with a 0-length backing array vs actually alloc'ing one
<enebo[m]> but we will never end up populating 0 elements into caller
<enebo[m]> and this is to make a backtrace which is already expensive
<headius[m]> yeah
<headius[m]> so any overhead added here should be dwarfed by the backtrace cost in the first place
<enebo[m]> I believe 15ef947ac423baf5c68c5b2a4f5a58e668eb7db7 is commit before it
<headius[m]> klobuczek is on Java 11 though so it will be using StackWalker, which is an interesting wrinkle
<klobuczek[m]> ok, I just wanted to ask, because it's not easy to find out from bisect what's after or before
<enebo[m]> I did 'git co ed72b695584909e3db384732c85643dcafbe337d^'
<enebo[m]> If I know my git ^ magic, which it is possible I don't :)
<headius[m]> don't see why it would have a big impact unless someone's passing a huge length to `caller`
<klobuczek[m]> 15ef947ac423baf5c68c5b2a4f5a58e668eb7db7 is good
<headius[m]> and yeah `git log <hash>^` should show you parent
<enebo[m]> 15ef947ac423baf5c68c5b2a4f5a58e668eb7db7 seems like it is more interesting for a change... it passes true/false for the backtrace, and in paths which supposedly have one it will then call a helper to populate it, vs potentially doing nothing if there is none
<headius[m]> klobuczek: wow
<enebo[m]> yeah I am surprised :)
<headius[m]> klobuczek: ok on either commit run that spec passing this to JRuby: `-Xlog.callers`
<headius[m]> and gist the output
<headius[m]> `JRUBY_OPTS=-Xlog.callers`
<headius[m]> I hope it's not slower because of --profile
<enebo[m]> will that spec run without --profile?
<headius[m]> I assume --profile is for timing specs
<headius[m]> that's not jruby --profile
<klobuczek[m]> yes, I think not much difference in time
<headius[m]> ok
<enebo[m]> klobuczek: if you are sure that is good to know
<klobuczek[m]> running without now
<headius[m]> I'm just wondering if --profile is making `caller` calls for anything... but if it's one spec I wouldn't expect to see much overhead from that
<klobuczek[m]> with the -Xlog flag I now got into the bad time range on the good commit
<headius[m]> the log flag is just for our information, not timing
<headius[m]> I just want to see where those backtraces are coming from
<enebo[m]> I don't see a mistake in that commit but I really wonder if somehow this is constructing a MAX_VALUE sized backtrace
<enebo[m]> That would totally explain it
<enebo[m]> :)
<headius[m]> enebo: yeah it filters MAX_VALUE but does not filter unreasonably large sizes smaller than that
<enebo[m]> it is just that it passes it around
<headius[m]> of course if it were constructing a MAX_VALUE array it would consume something like 2B * 8 bytes
<headius[m]> seems unlikely it's that large
<enebo[m]> we will laugh when we find that JVM 11 bug which improperly passes Integer.MAX_VALUE as a param
<enebo[m]> return constructBacktrace(boundMethods, Integer.MAX_VALUE);
<enebo[m]> That line just looks scary
<headius[m]> not sure what you mean though... this won't construct a trace that large given this code
<headius[m]> that line was there before... it's just a way to say "all frames"
<enebo[m]> I did not see a trace so I don't know what code is being called
<headius[m]> -1 might be safer but it also was used as an upper bound before, not used for size calc
<enebo[m]> yeah I am just saying if the count == Integer.MAX_VALUE check fails for some reason it would alloc MAX_VALUE
<enebo[m]> It shouldn't
<headius[m]> I don't think it would ever fail to detect if count is MAX_VALUE unless the JVM is broken
<headius[m]> but MAX_VALUE - 1 would get through
<enebo[m]> Just thinking about how changing alloc size on something making a backtrace could possibly overshadow even the backtrace much less anything else
<headius[m]> it would have to be pretty big to overshadow, yes
<enebo[m]> yeah which is how I got there
<headius[m]> klobuczek: other thing you could do would be to get a timing of the FULL suite with that previous commit
<headius[m]> if this is really the problem then you should see times go back to normal
<headius[m]> or git revert <that hash> on jruby-9.2 branch and run from that
<enebo[m]> while (stackIter.hasNext() && trace.size() < count) {
<enebo[m]> No problem there unless somehow stackIter.hasNext is not working as expected
* klobuczek[m] posted a file: 15ef947ac423baf5c68c5b2a4f5a58e668eb7db7.log (1087KB) < https://matrix.org/_matrix/media/r0/download/matrix.org/rgrsTDwcQaGfJgVXoDSdcJyf >
<headius[m]> interesting
<headius[m]> lots of active_support
* klobuczek[m] posted a file: ed72b695584909e3db384732c85643dcafbe337d.log (1087KB) < https://matrix.org/_matrix/media/r0/download/matrix.org/HAkYErgvXZtDSQSgoXJqmgeH >
<klobuczek[m]> I will be out for half an hour
<headius[m]> ok
lucasb has quit [Quit: Connection closed for inactivity]
<headius[m]> default ArrayList capacity is 10, allocated on first add
<headius[m]> I'm not sure this change really saves us much... the growth factor is 1.5 it looks like
<headius[m]> so default 10, 15, 22, 33, 50, 75, 112 or something
<headius[m]> I think a cap at something like Math.max(count, 100) would be a good enough initial size if a large size is requested
<headius[m]> I mean min
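A sketch of the capped pre-sizing idea being floated here; the helper and class names are hypothetical, standing in for the list allocation inside JRuby's backtrace construction:

```java
import java.util.ArrayList;
import java.util.List;

class TraceListSketch {
    // Pre-size from the requested count, but cap it so an unbounded caller/caller_locations
    // request (a very large count) does not allocate a huge backing array up front.
    // A default ArrayList would otherwise start at 10 and grow by ~1.5x per resize
    // (newCapacity = oldCapacity + (oldCapacity >> 1)).
    static <T> List<T> newTraceList(int count) {
        return new ArrayList<>(Math.min(count, 100)); // the 100 cap is the negotiable number discussed here
    }
}
```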
<enebo[m]> I just don't see how the 10 part could cause an entire suite to drop that much
<enebo[m]> I really would expect it to be a bit faster
<enebo[m]> unless memory is super super tight
<enebo[m]> even then this stuff is eden garbage
<headius[m]> yeah
<enebo[m]> or are we lazy and retaining this for a long time
subbu|lunch is now known as subbu
<headius[m]> I don't even see a caller trace in this fanout.rb file
<headius[m]> hmm
<headius[m]> neo4j is doing these calls
<enebo[m]> hmmm
<enebo[m]> maybe with lots of 0 element stacks?
<enebo[m]> heh
<enebo[m]> I mean I still don't quite get this...perhaps with huge ass count much larger than the actual trace
<enebo[m]> you could see if they were doing 20000000 or something how this opt would be bad
<enebo[m]> but why
<headius[m]> yeah it seems like it would have to be requesting a very large count from `caller` or `caller_locations`
<headius[m]> I can't find this querable.rb file in any neo4j
<headius[m]> I think this is the line in neo4j: source_line, line_number = Logging.first_external_path_and_line(caller_locations)
<headius[m]> there's an option to turn off this instrumentation
<headius[m]> klobuczek: another thing to try, which I think would get rid of all these caller calls, would be this item from neo4j-core readme:
<headius[m]> ```All adaptors will also take the skip_instrumentation option to skip logging of queries.```
<headius[m]> I can't match your call stack exactly but I think that's what's causing all those caller calls at runtime
<headius[m]> which would certainly affect the entire suite if something broke in there to make it more expensive
<enebo[m]> logging 'count' in that method would certainly be interesting to see
<headius[m]> enebo: if it's this line of code I found then count will come in as MAX_VALUE
<headius[m]> because it's just a bare `caller_locations` with no size
<headius[m]> klobuczek: I can't find active_graph/querable.rb in latest neo4j
<headius[m]> that shows in your log as being the source of almost all the runtime `caller` backtraces
<headius[m]> hmm
<enebo[m]> I don't really see the value in this commit but with that said I find it super weird this could possibly be the cause of a problem
<enebo[m]> Unless stackwalker ends up returning 0 elements a lot for neo4j, it will always ensureCapacity to 10 right away on the first add
<enebo[m]> but even so...a 10 element array...how expensive could that be between quite a bit of ruby code
<headius[m]> oh no
<headius[m]> caller and caller_locations without size specified don't use MAX_VALUE, they use this
<headius[m]> well I guess we found the problem
<enebo[m]> yeah in my mind an excessively large arraylist was the only thing which would make sense
<enebo[m]> it is processing a backtrace after all...the thing we do the slowest in our impl
<headius[m]> yeah
<headius[m]> probably could have had klobuczek run an alloc profile
<headius[m]> rather than revert that commit I think we should choose something suitable as a max initial size
<headius[m]> that will avoid having many small arraycopy for an unbounded backtrace but not allocate something huge
<enebo[m]> This opt really should have almost no value based on how ArrayList is implemented
<enebo[m]> the 16 part of it at least
<enebo[m]> err 10
<headius[m]> maybe simply Math.min(count, 100) as the initial size
<enebo[m]> whatever
<headius[m]> 100 is negotiable
<enebo[m]> for a large pre-known size it can be, but you may not end up with that many so I guess this is the tradeoff
<headius[m]> right
<enebo[m]> a bad caller kills stuff dead
<headius[m]> 100 may be too much since this is Ruby calls
<headius[m]> I dunno maybe I'm coming back around to just reverting it
<enebo[m]> I wonder what the shortest stack even is
<enebo[m]> it may be >100 :)
<headius[m]> ruby calls
<headius[m]> so shortest stack will be 1
<enebo[m]> 2 for main perhaps yeah
<headius[m]> that's one reason this is a weird unbounded thing
<headius[m]> I don't know how many frames out of the Java trace will produce a Ruby frame
<headius[m]> so it just consumes Java frames until meeting the count of Ruby frames requested
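A rough sketch of that walk under Java 11's StackWalker: consume Java frames, keep only the ones that correspond to Ruby frames, and stop once the requested count is reached. The isRubyFrame predicate is a hypothetical stand-in for JRuby's real frame filtering.

```java
import java.util.ArrayList;
import java.util.List;

class CallerWalkSketch {
    static List<StackWalker.StackFrame> rubyCaller(int count) {
        return StackWalker.getInstance().walk(frames -> {
            List<StackWalker.StackFrame> trace = new ArrayList<>(Math.min(count, 100));
            frames.filter(CallerWalkSketch::isRubyFrame) // hypothetical: does this Java frame map to a Ruby frame?
                  .limit(count)                          // stop walking once enough Ruby frames are gathered
                  .forEach(trace::add);
            return trace;
        });
    }

    private static boolean isRubyFrame(StackWalker.StackFrame f) {
        // Placeholder heuristic; the real check involves JRuby's compiled/interpreted frame markers.
        return f.getClassName().startsWith("org.jruby");
    }
}
```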
<enebo[m]> I have no opinion on this fix
<headius[m]> 100 elements will mean unbounded calls always allocate 800 bytes for the array
<enebo[m]> Large stacks are improved by passing in a big value
<enebo[m]> At some smaller levels we probably cannot see any difference doubling our way to 10 ruby calls or what not
<headius[m]> 800ish, maybe 400 with compressed oops but I don't know how that works
<headius[m]> 800 bytes versus pretty much any full caller trace seems like less than noise
<enebo[m]> for something big we can, but I just don't know...
<enebo[m]> yeah this is heuristic land and I don't think we can just make a guess. I think we should revert for .11 and then maybe kares might want to re-heuristic this
<enebo[m]> Especially since this is time in generating a backtrace which is already pretty slow
<headius[m]> lol
<enebo[m]> HAHAHA
<enebo[m]> hey comment out back to new ArrayList() and time it
<headius[m]> I agree with revert and reexamine
<enebo[m]> I guess it will be the same speed as there is not much trace there
<headius[m]> yeah should be 9.2.9 because that's what it did
<enebo[m]> not much == < 10
<headius[m]> same as 9.2.9
<headius[m]> this is a worst case but clearly allocating 2^23 * 8 bytes for every caller trace is double plus ungood
<headius[m]> heh that's 64MB
<headius[m]> sweet
<headius[m]> I will revert on jruby-9.2
<enebo[m]> ok
<headius[m]> did klobuczek open an issue about this?
<headius[m]> I pushed the revert
<headius[m]> and merged to master
<headius[m]> ah I did not tag the issue but will mention the commit in that bug and ping kares
<klobuczek[m]> so if we do `rvm install jruby-head` will we get master?
<enebo[m]> klobuczek: check on jruby-9.2 branch HEAD
<enebo[m]> master already has many other changes
<headius[m]> klobuczek: as I mention there, this instrumentation is still very expensive so hopefully it's not used in production
<headius[m]> I don't know if this is the answer to your spec suite regression but it seems likely
<headius[m]> so thank you for your extra footwork on this one
<klobuczek[m]> by instrumentation you mean the `caller` and `caller_locations` calls?
<headius[m]> yes
<lopex> so it means it's there for real?
<headius[m]> I was not quite sure what version of neo4j to look at, since I saw it has been renamed to active_graph, but the caller traces in your output show that `querable.rb` is initiating some ActiveSupport instrumentation that must eventually be using `caller_locations`
<headius[m]> that instrumentation is guarded by the `skip_instrumentation` option
<headius[m]> lopex: I think it's still experimental in this release
<lopex> I gather it will be quite a hassle to implement efficiently
<klobuczek[m]> yes, the neo4j is partially renamed (gem name so far). I'll reconcile this soon.
<headius[m]> lopex: do you know how pattern matching is implemented in other runtimes? Is it just a cascade of if's?
<lopex> headius[m]: no, but I remember it was a big pita for scala
<headius[m]> I assumed that statically-typed languages were able to statically dispatch to the right case in O(1) for many cases
<headius[m]> at least for things like case classes
<lopex> no, in scala it was still a pita since it leads to exponential checks pretty quickly
<headius[m]> I assume any languages with arbitrary expressions for the cases would have to be O(n)
<lopex> headius[m]: like the distinction between values and types
<headius[m]> mm yeah
<lopex> like case (_:String) | (_:Integer)
<headius[m]> maybe a very limited set of forms could be O(1)
<headius[m]> in any case Ruby will be no better than case/when
<lopex> vs ADTs
<headius[m]> I think the only thing you get is more explicit destructuring
<lopex> but I don't know how deeply nested it can be wrt types/values
<lopex> like casing a hash, like, {foo: an integer, bar: a value}
<headius[m]> yeah I have not followed the spec
<lopex> the doc doesn't mention that combination though
<headius[m]> hah
<headius[m]> better than bleach I guess
<lopex> yeah I heard about it
<lopex> then got lost on wiki and found this https://en.wikipedia.org/wiki/Megavirus#Structure
<lopex> "Surprisingly, the Megavirus is larger than some bacteria, particularly the ones it preys on."
<klobuczek[m]> @headius the instrumentation in Querable is simply for the purpose of logging queries and it is used in production. We inherited that code. Are you saying that using active_support/notifications is not appropriate for that purpose?
<headius[m]> I'm saying it's expensive
<headius[m]> a single caller backtrace can take a few milliseconds on JRuby, so if you have many instrumentation calls happening per request it will reduce throughput
<headius[m]> with that bad commit reverted you should be back to previous performance though
<headius[m]> it's just something to consider... it would be worth testing throughput with `skip_instrumentation` on
<klobuczek[m]> yes just confirmed 9.2.9.0 and jruby-9.2 identical times
<headius[m]> that's great news
<headius[m]> we can look at a 9.2.11 release early next week then
<klobuczek[m]> thank you for your support and patience
_whitelogger has joined #jruby
<headius[m]> klobuczek: can you add a comment saying that https://github.com/jruby/jruby/issues/6089 is fixed for you now?
<headius[m]> enebo: idea for reducing overhead of unbounded caller....
<headius[m]> Ruby should add a form that takes a block to process each element
<headius[m]> that will allow users to have unbounded caller but bail out early, which in JRuby would mean we can use StackWalker to reduce how many frames get generated
<headius[m]> so caller_locations with no size would just grab full trace right away... with size or with block would do it iteratively
<headius[m]> the basic problem with cases like this instrumentation is when only one specific frame is desired that should be close to the caller location
<headius[m]> users may not know how many levels to ask for but know they want to bail out early
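A sketch of how such a block-taking form could map onto StackWalker's short-circuiting inside JRuby, for the "find one nearby frame" case described above; both predicates are hypothetical stand-ins.

```java
import java.util.Optional;
import java.util.function.Predicate;

class FirstExternalFrameSketch {
    // Hypothetical illustration: hand each Ruby-level frame to the caller's block and
    // stop walking as soon as the block is satisfied, instead of materializing an
    // unbounded trace up front.
    static Optional<StackWalker.StackFrame> firstMatching(Predicate<StackWalker.StackFrame> isRubyFrame,
                                                          Predicate<StackWalker.StackFrame> wanted) {
        return StackWalker.getInstance().walk(frames ->
                frames.filter(isRubyFrame)   // only frames that correspond to Ruby frames
                      .filter(wanted)        // the user's block, e.g. "first frame outside the library"
                      .findFirst());         // short-circuits: the walk ends as soon as a match is found
    }
}
```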
<headius[m]> wat
<headius[m]> enebo: nothing else is open for 9.2.11
nirvdrum has quit [Ping timeout: 258 seconds]
<headius[m]> enebo: are you able to run tests for jruby-openssl on Linux?
<headius[m]> I have pushed this but it will need testing and release and Darwin still can't run the specs green for me locally: https://github.com/jruby/jruby-openssl/pull/194
<headius[m]> kares: you may be interested in that one too
drbobbeaty has quit [Quit: Textual IRC Client: www.textualapp.com]