#jruby on 2018-08-28 — irc logs at freenode.irclog.whitequark.org

2018-05-24 16:34 ChanServ changed the topic of #jruby to: Get 9.2.0.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

02:20 rdubya has quit [Ping timeout: 264 seconds]

03:15 rdubya has joined #jruby

03:19 rdubya has quit [Ping timeout: 252 seconds]

04:15 rdubya has joined #jruby

04:21 rdubya has quit [Ping timeout: 264 seconds]

04:44 Puffball_ has quit [Read error: Connection reset by peer]

07:52 KeyJoo has joined #jruby

07:54 drbobbeaty has quit [Ping timeout: 252 seconds]

07:56 drbobbeaty has joined #jruby

09:14 drbobbeaty has quit [Ping timeout: 252 seconds]

10:57 drbobbeaty has joined #jruby

11:00 rdubya has joined #jruby

11:05 rdubya has quit [Ping timeout: 252 seconds]

11:11 rdubya has joined #jruby

11:37 isavin has joined #jruby

11:39 <isavin> older downloads seem to be unavailable (e.g. https://s3.amazonaws.com/jruby.org/downloads/1.6.8/jruby-bin-1.6.8.tar.gz). can someone have a look? or does anyone know an alternate source for this official release?

11:53 eregon has joined #jruby

12:17 isavin has quit [Remote host closed the connection]

14:02 Crocket has joined #jruby

14:10 <headius> isavin: this is unfortunately true...the S3 bucket we kept them in was wiped out mistakenly by the company hosting us and we have only restored builds going back a certain amount of time

14:10 <headius> do you need the full tarball or could you make do with just the "complete" jar file?

14:10 <headius> enebo: ^^

14:11 <headius> since JRuby 1.7.6 (I think) we started pushing the tarball to Maven Central so newer releases are all available that way

14:11 <headius> prior to that the tarballs only existed on our bucket, which was not properly set up and then got nuked

14:17 <headius> in other news...good morning all!

14:17 <enebo> I do have 1.6.8 on the nas

14:18 <enebo> so I will restore that one

14:21 <headius> 👍

14:21 <headius> don't want to just restore everything you have on the NAS?

14:21 <headius> at least we have a guarantee now they won't wipe it out

14:35 <enebo> headius: yeah probably should now...having it wiped out a second time made me not want to go through it all again

15:33 rtyler has quit [Remote host closed the connection]

16:03 xardion has quit [Remote host closed the connection]

16:08 xardion has joined #jruby

16:57 <headius> enebo: I don't blame ya

17:15 <headius> grrr

17:16 <headius> I can't get this interrupt test to fail locally and don't see why it would fail on travis

17:38 Puffball has joined #jruby

17:49 <subbu> 1.6.8 seems like ... what ... a decade old?

17:59 <headius> prehistoric

17:59 <headius> wow, 2012

17:59 <headius> it's over 6 years

18:07 subbu is now known as subbu|lunch

18:33 kares has quit [*.net *.split]

18:36 kares has joined #jruby

18:44 subbu|lunch is now known as subbu

18:45 Puffball_ has joined #jruby

18:49 Puffball has quit [Ping timeout: 272 seconds]

18:50 Puffball has joined #jruby

18:54 Puffball_ has quit [Ping timeout: 272 seconds]

19:00 <headius> ok, master is green again with some timeout tweaks

19:00 <headius> if anything goes red spuriously we'll deal with it immediately...these flaky results need to end

19:35 <xardion> Man, I feel your pain.

19:35 <xardion> I've been trying to run the same test suite for a couple hours now.

19:36 <xardion> Our chef repo is set up to run mandatory Travis CI builds of the chef spec

19:36 <xardion> and one of the specs keeps failing because yum repository mirrors are being flaky today

19:37 <xardion> I got it to pass once, but somebody had merged while my specs were running so I had to re-run them *rage*

19:37 <xardion> To be fair, this repository is too goddamn big and needs to be broken up, but nobody wants to put in the time to deal with the cross-cookbook dependency headache that breaking up this repo would require.

19:50 <headius> xardion: yeah I just had one job completely fail to start on travis and I regularly see our maven builds stall for no obvious reason

19:51 <headius> I mean, you get what you pay for, but these failures are confusing and infuriating

19:51 <headius> I've been trying to break up our longer suites also

19:55 <ChrisBr> headius: ever thought about switching to CircleCI?

19:56 <ChrisBr> we recently did and I don't look back ... although everything needs to run in a container there!

20:22 <headius> is there a free option?

20:23 <headius> we are not exactly a rich project

20:29 <ChrisBr> yup

20:30 <headius> well that's definitely worth investigating then

20:30 <ChrisBr> there is a free option and if you're an open source project you get some more containers even

20:30 <headius> travis has served us well but these spurious unexplainable failures are really frustrating

20:30 <ChrisBr> yup

20:30 <ChrisBr> and Circle has live debugging

20:30 <ChrisBr> so you can log in via SSH to the container

20:31 <ChrisBr> I didn't use it very often but it is a nice option if you can't reproduce it locally

20:31 <ChrisBr> and they have uploading assets included (e.g. logs, screenshots ...)

20:32 <ChrisBr> so you're running Travis on the free plan then?

20:32 <ChrisBr> btw: I think e.g. rails runs Travis & Circle in parallel

20:35 <headius> yeah we are just free plan

20:35 <headius> Travis guys might have bumped up our job count but I dunno

20:35 <headius> they did run a bunch of stuff on JRuby for a long time

20:37 <ChrisBr> headius: I would think so, IIRC with the basic plan you can only run two jobs in parallel

20:38 <headius> ah sure

20:38 <headius> I think we have 5

20:39 <ChrisBr> yeah, otherwise that would take forever ...

20:41 <ChrisBr> "We offer a total of four free linux containers ($2400 annual value) for open-source projects. Simply keeping your project public will enable this for you!" https://circleci.com/pricing/

20:41 <ChrisBr> but the pricing model is quite untransparent to be honest ...

20:44 <headius> lopex: we could have a different ByteCodeMachine for regionless matching that's a little smaller, right?

20:44 <headius> I'm not sure alloc is what's keeping us from matching MRI but making a smaller object would help if so

20:45 <headius> "This application will be able to read and write all public and private repository data." 😬

20:46 <headius> I mean I get that it needs access but I'd love to see this a bit finer-grained :-)

20:48 <lopex> headius: you mean the "match?"

20:48 <headius> yeah I was looking at it in passing

20:49 <lopex> headius: there's one more alloc for repeat stack

20:49 <headius> I suspect maybe this is back to big switches optimizing poorly

20:49 <headius> but if it's alloc we can make some stuff smaller

20:49 <lopex> headius: but then you'd have to account for (.)\1 and the like

20:50 <lopex> headius: whole "region" range update is tiny little thing there

20:50 <lopex> headius: did you profile it?

20:51 <headius> I did an alloc profile

20:51 <headius> it's nothing we don't know...all ByteCodeMachine and int[]

20:51 <lopex> so the int comes from that repeat stack alloc

20:52 <headius> ah, I see

20:52 <headius> hmm

20:52 <headius> lazy?

20:52 <lopex> headius: https://github.com/jruby/joni/blob/master/src/org/joni/StackMachine.java#L56

20:52 <lopex> no

20:52 <headius> ah

20:53 <lopex> mri does that too

20:53 <headius> yeah but MRI probably alloca's it

20:54 <lopex> they also dont pool stacks

20:54 <headius> I suppose lazy wouldn't help here if it's usually going to be needed when it's present

20:54 <lopex> headius: and sampling profile tells anything ?

20:55 <headius> I'll do a run

20:55 <headius> I'm guessing it's not going to be much new info either

20:55 <lopex> headius: I know joni has some trouble with char class perf

20:56 <lopex> headius: what about match? against a single char ?

20:56 <headius> well enebo said we suffer most for small strings

20:56 <headius> longer strings I think we make up the difference or beat MRI

20:56 <headius> that makes it seem like the alloc

20:57 <lopex> headius: so in this case the match method in matcher is kind of big

20:57 <headius> yes

20:57 <lopex> headius: forwardSearchRange

20:58 <headius> hmm yeah

20:58 <headius> I should check jit logs and see if some of these are getting kicked out

20:58 <lopex> er searchCommon

20:58 <headius> I don't see any significant number of ticks in interpreted code though

20:59 <headius> 66.3% 10507 + 1 org.joni.ByteCodeMachine.executeSb

20:59 <headius> 27.6% 4372 + 0 org.joni.Matcher.searchCommon

20:59 <headius> that's sampled

21:00 <lopex> is that the regexp from the issue ?

21:00 <headius> yeah

21:00 <headius> they both jit

21:01 <lopex> I'll show you the execution

21:01 <headius> ok

21:03 <lopex> headius: https://gist.github.com/lopex/32f898523bf3670488298d22cdb7d792

21:04 <headius> ok

21:05 <lopex> headius: I'd check how it compares against empty regexp and empty string

21:06 <headius> with both empty it's all executeSb

21:07 <lopex> it executes only end opcode

21:07 <lopex> and mri ?

21:07 <headius> perf? I'll check

21:10 <headius> 10M match? of empty/empty is 0.60s in JRuby and 0.78s in MRI

21:10 <headius> some of that is the loop itself

21:11 <headius> Graal JIT in 11 lowers JRuby to 0.48
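
A timing like the one quoted above could be reproduced with something along these lines. This is only a sketch: the helper name `time_match` is made up for illustration, and the exact script headius ran is not shown in the log, so loop shape and warm-up are assumptions.

```ruby
require 'benchmark'

# Hypothetical helper: time `iterations` calls of Regexp#match? and
# return the elapsed wall-clock seconds. A plain while loop keeps block
# dispatch out of the measurement, though the loop itself still costs time.
def time_match(regexp, string, iterations)
  Benchmark.realtime do
    i = 0
    while i < iterations
      regexp.match?(string)
      i += 1
    end
  end
end
```

Calling `time_match(//, "", 10_000_000)` a few times and keeping the later runs gives the JIT a chance to warm up before the number is compared between JRuby and MRI.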

21:11 <lopex> afaik we are slower for char class execution

21:11 <lopex> thats a lot of array hopping

21:12 <headius> maybe we should be using Unsafe to skip the bounds checks

21:12 <headius> "should"

21:12 <lopex> both the bitset and CodeRange

21:13 drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

21:13 <lopex> er

21:14 <lopex> hmm I dont believe return ((code[ip + (c >>> BitSet.ROOM_SHIFT)] & (1 << c)) != 0); is the fault
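
For readers unfamiliar with the expression lopex quotes, here is a rough Ruby transliteration of that bit-set membership test. It assumes 32-bit words, so `ROOM_SHIFT` is taken to be 5; that value and the helper names are assumptions for illustration, not copied from joni. Java's `1 << c` on an int uses only the low five bits of the shift count, so the mask has to be written out explicitly in Ruby.

```ruby
ROOM_SHIFT = 5   # assumed: log2(32), one "room" is a 32-bit word
ROOM_MASK  = 31  # low five bits select the bit inside a room

# Hypothetical helper: mark byte value c as a member of the bit set
# that starts at offset ip inside the code array.
def bitset_set!(code, ip, c)
  code[ip + (c >> ROOM_SHIFT)] |= 1 << (c & ROOM_MASK)
end

# Membership test mirroring the quoted Java expression:
# one array read, one shift, one mask, one compare.
def bitset_at?(code, ip, c)
  (code[ip + (c >> ROOM_SHIFT)] & (1 << (c & ROOM_MASK))) != 0
end
```

The check is a single indexed load plus bit arithmetic, which fits lopex's doubt that this line by itself explains the slowdown.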

21:19 <lopex> headius: I dunno, the switch has dense values, the order shouldnt matter if there's breaks in all cases right ?

21:20 <headius> oh god I figured out why this autoload test fails

21:20 <headius> it's a bad test

21:20 <headius> it spins up ten threads and uses a global to choose one to sleep and the others to execute

21:20 <headius> but we run it all at once and so more than one sleeps
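
The failure mode headius describes is a check-then-act race on a shared flag. The sketch below is a hypothetical reconstruction of that pattern, not the actual autoload test: a shared flag picks the one thread that should sleep, and without a lock several threads can see the flag unset before any of them sets it.

```ruby
# Hypothetical reconstruction: returns how many threads decided to sleep.
# Pass a Mutex as `lock` to make the check-and-set atomic.
def count_sleepers(thread_count, lock: nil)
  chosen = false            # stands in for the global used by the test
  sleepers = 0
  counter_lock = Mutex.new

  claim = lambda do
    next false if chosen
    Thread.pass             # widen the window between check and set
    chosen = true
    true
  end

  threads = Array.new(thread_count) do
    Thread.new do
      i_sleep = lock ? lock.synchronize { claim.call } : claim.call
      if i_sleep
        counter_lock.synchronize { sleepers += 1 }
        sleep 0.01
      end
    end
  end
  threads.each(&:join)
  sleepers
end
```

With `lock: Mutex.new` exactly one thread sleeps. Without a lock the count is at least one and may be higher, especially on a runtime with truly parallel threads such as JRuby.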

21:21 <headius> 🙄

21:21 <headius> lopex: it shouldn't but C2 is notoriously bad at optimizing big switches

21:21 <headius> that's why the parser uses a command pattern

21:22 <lopex> yeah, I remember you mentioning that wrt joni

21:22 <headius> I think all those branches muck up the profiling

21:23 <lopex> it shouldnt be hard to convert at first thought

21:23 <headius> it might be worth a try...it had a large impact on parser perf
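
To show the shape of the conversion being discussed, here is a toy stack machine written both ways. The opcodes and method names are invented for illustration; the real change would be in joni's Java interpreter, and nothing here says anything about how C2 would treat it. The sketch only contrasts one large switch with one small callable per opcode.

```ruby
# Hypothetical opcodes for a toy stack machine.
OP_PUSH = 0
OP_ADD  = 1
OP_MUL  = 2

# Style 1: a single dispatch switch, as in a classic bytecode loop.
def run_switch(program)
  stack = []
  program.each do |op, arg|
    case op
    when OP_PUSH then stack.push(arg)
    when OP_ADD  then stack.push(stack.pop + stack.pop)
    when OP_MUL  then stack.push(stack.pop * stack.pop)
    else raise ArgumentError, "unknown opcode #{op}"
    end
  end
  stack.pop
end

# Style 2: command pattern, one small callable per opcode.
COMMANDS = {
  OP_PUSH => ->(stack, arg)  { stack.push(arg) },
  OP_ADD  => ->(stack, _arg) { stack.push(stack.pop + stack.pop) },
  OP_MUL  => ->(stack, _arg) { stack.push(stack.pop * stack.pop) }
}.freeze

def run_commands(program)
  stack = []
  program.each do |op, arg|
    command = COMMANDS.fetch(op) { raise ArgumentError, "unknown opcode #{op}" }
    command.call(stack, arg)
  end
  stack.pop
end
```

Both runners evaluate `[[OP_PUSH, 2], [OP_PUSH, 3], [OP_ADD], [OP_PUSH, 4], [OP_MUL]]` to 20; the difference is only in how the per-opcode work is packaged.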

21:26 <enebo> lopex: headius: https://gist.github.com/enebo/e328bcd1161dca0583aba7a3f9829d61

21:26 <enebo> My memory was bad we do not make up much with larger strings

21:26 <enebo> but with that said we are executing a single match in a block which did the results no favors

21:26 <headius> ok

21:26 <enebo> If this is in a method we no doubt look a bit better

21:27 <enebo> but I still expect/hope joni can beat oni even with the block overhead

21:27 <headius> this test is gross

21:28 <lopex> enebo: then EXACT_BM might be the fault

21:29 <lopex> enebo: headius: quite a bit of array hopping https://github.com/jruby/joni/blob/master/src/org/joni/SearchAlgorithm.java#L323

21:29 <lopex> and almost nothing in the interpreter for those longer cases

21:30 <enebo> lopex: what is regex.intMap

21:30 <lopex> enebo: skip map for map search

21:31 <lopex> enebo: https://github.com/jruby/joni/blob/master/src/org/joni/SearchAlgorithm.java#L357

21:31 <enebo> lopex: I am partially surprised if this is an if/else with that check vs something calling two methods which don't have this check (e.g. make a smaller method)

21:31 <lopex> it forwards faster

21:31 <headius> I have no idea what this test is supposed to do

21:31 <headius> hmm

21:31 <lopex> what test ?

21:31 <lopex> ah

21:31 <headius> autoload thing, pay no attention

21:32 <headius> multitasking

21:32 <enebo> doh

21:32 <lopex> enebo: there's regexp.map and intMap

21:33 <enebo> lopex: map and intMap is the same type right?

21:33 <lopex> enebo: but those are tight inner loops

21:33 <lopex> both should be int[]

21:33 <enebo> this could be a single set of whiles if you save proper int[] to a local variable

21:33 <lopex> enebo: no, map is byte[]

21:34 <lopex> enebo: yeah this is very old code, I havent paid attention for a long time

21:34 <enebo> lopex: I don't know if size matters but I was thinking about notion of two search methods where you call one depending on which one you work with

21:35 <lopex> aaah

21:35 <enebo> I guess this method calls no methods and even with half the whiles it maybe would not inline

21:35 <lopex> enebo: oni has a new variation called sunday search

21:36 <headius> screw it, I'm going to quarantine this test and open an issue

21:36 <enebo> when we call match? which path are we going down?

21:37 <lopex> depends on the regexp

21:37 <enebo> lopex: ok but what feature triggers intMap vs map

21:38 <enebo> this method is pretty damn irreducible other than the if/else making the method larger

21:38 <lopex> enebo: fixed parts in the regexp

21:38 <lopex> like /foo..bar/

21:39 <lopex> foo will trigger that since the fast skip wants to find the interesting part using that

21:39 <enebo> so foo uses which one?

21:39 <enebo> intMap or map

21:40 <lopex> enebo: https://github.com/jruby/joni/blob/master/src/org/joni/Regex.java#L315

21:40 <lopex> enebo: and https://github.com/jruby/joni/blob/master/src/org/joni/Regex.java#L288

21:40 <lopex> two answers

21:40 <enebo> oh I see

21:41 <lopex> so if >= 3 the map

21:41 <enebo> so map is used if it is small enough and intMap for codepoint matching

21:41 <lopex> er intMap

21:42 <lopex> if fixed part exceeds 256 then map

21:42 <headius> ok sorry, I'm back to joni stuff

21:42 <enebo> 256 distinct characters it can match?

21:43 <lopex> enebo: it's the length of exact part

21:43 <lopex> aka fixed

21:43 <enebo> oh so foo is 3

21:43 <enebo> which would use map

21:43 <lopex> yes

21:44 <lopex> so I'm almost sure we're losing here

21:44 <lopex> since the execution of this is:

21:44 <enebo> lopex: so we could make BM_EXACT_SMALL and BM_EXACT_LARGE but that would only eliminate a single if and reduce the algo size

21:44 <lopex> enebo: https://gist.github.com/lopex/99bd7403bede55a26a19761ba54b324f

21:45 <enebo> fwiw those benchmarks are a bit stupid and highly contrived

21:45 <lopex> enebo: we should port sunday search from them
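
For context, the textbook form of Sunday's quick search looks roughly like this. It is a sketch of the general algorithm over bytes, not a port of Onigmo's implementation, and the method name is made up. On a mismatch it looks at the byte just past the current window and shifts so that the last occurrence of that byte in the pattern lines up with it, or jumps past it entirely if the byte is not in the pattern.

```ruby
# Hypothetical helper: return the byte offset of the first occurrence of
# `pattern` in `text`, or nil if there is none.
def sunday_search(text, pattern)
  t = text.bytes
  pat = pattern.bytes
  m = pat.length
  n = t.length
  return 0 if m.zero?

  # shift[b] = distance from the last occurrence of b in the pattern to
  # the position just past the pattern; bytes not in the pattern get m + 1.
  shift = Array.new(256, m + 1)
  pat.each_with_index { |b, i| shift[b] = m - i }

  pos = 0
  while pos + m <= n
    return pos if t[pos, m] == pat
    break if pos + m >= n       # no byte left past the window
    pos += shift[t[pos + m]]
  end
  nil
end
```

With a long run of non-matching bytes, as in enebo's longer test strings, most steps advance by the full `m + 1`.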

21:46 <enebo> no doubt I just exposed BM_EXACT being slower and not overall why match? is slower

21:46 <enebo> lopex: sounds good to me

21:46 <enebo> lopex: how big is it?

21:46 <lopex> enebo: because almost nothing is done in the interpreter in this case

21:46 <lopex> enebo: so that search seems to dominate

21:47 <enebo> did the 9:30 string match? we were looking at earlier also do exact_bm?

21:47 <lopex> enebo: but it still doesnt explain our perf for the example from that issue

21:47 <enebo> seems like a lot of char ranges

21:48 <enebo> lopex: ok so my made up longer strings would get helped and would help regexp with static text in them converting to sunday search

21:48 <enebo> lopex: but the bench in the issue is unrelated

21:48 <headius> I have to leave cafe, bbiab

21:48 <lopex> enebo: agreed

21:49 <enebo> lopex: ok the issue reported one I think happens as part of common Rails activity

21:49 <enebo> I think kares came up with that regexp as being hit

21:49 <enebo> lopex: the ones I added were just to see how overhead of regexp changed as the match string got longer

21:50 <enebo> lopex: so not important although knowing a better algo exists is probably important

21:50 <enebo> lopex: since I think some of the date processing regexps do matches like (sun|mon|tue...)

21:50 <enebo> lopex: ^ is this alternation still a collection of exact_bm?

21:53 <lopex> enebo: almost, different skip map algo

21:53 <lopex> for sun|mon|tue

21:54 <lopex> enebo: it will have a skip map [m, s, t]
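
The idea of a skip map over the first bytes of the alternatives can be sketched as follows. This is an illustration of the concept only: the helper names are invented, alternatives are assumed to be non-empty literal strings, and joni's actual map search and bytecode differ in detail.

```ruby
# Hypothetical helper: a 256-entry table that is true for every byte
# that can start one of the alternatives, e.g. m, s and t.
def build_first_byte_map(alternatives)
  map = Array.new(256, false)
  alternatives.each { |alt| map[alt.getbyte(0)] = true }
  map
end

# Scan forward, skipping every position whose byte cannot start a match.
# Returns [offset, matched_alternative] or nil.
def map_search(text, alternatives)
  map = build_first_byte_map(alternatives)
  pos = 0
  n = text.bytesize
  while pos < n
    if map[text.getbyte(pos)]
      # Candidate position: the full comparison below looks at the
      # first byte again, as a real matcher would after the skip.
      found = alternatives.find { |alt| text.byteslice(pos, alt.bytesize) == alt }
      return [pos, found] if found
    end
    pos += 1
  end
  nil
end
```

For a string that is mostly `a` and `b`, the scan only stops to do real work at the few positions holding `m`, `s` or `t`.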

21:56 <lopex> so for /sun|mon|tue/ =~ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaasunbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbfoobar"

21:56 <lopex> enebo: https://gist.github.com/lopex/38b969c70ab900a69e52ced99fcd69ad

21:58 <lopex> enebo: I forgot how it worked though

21:58 <lopex> ah

21:58 <enebo> lopex: so it is like switch(first) case 'm': match_exact 'mon' ...

21:58 <lopex> yes

21:59 <enebo> lopex: but it starts on the 'o' in match_exact right?

21:59 <enebo> since it knows from first instr that m matched?

21:59 <lopex> what 'o' ?

22:00 <enebo> 'mon' 'm' was determined by the skip map

22:00 <enebo> so the second character of mon

22:00 <lopex> enebo: hmm, the order might not matter

22:00 <lopex> since it's an alternation

22:01 <lopex> enebo: oh, it's sorted

22:01 <enebo> lopex: but regardless of order if it determines first character in skip map then second char in where it jumps next should be next thing looked at

22:02 <enebo> lopex: I guess I am saying after skip map determines it might be 'mon' because it saw 'm' does it actually look at 'm' again in exact?

22:02 <lopex> yes

22:02 <lopex> ah

22:02 <lopex> yeah

22:02 <lopex> enebo: it always reassures in the interpreter if that's what you mean

22:03 <enebo> lopex: well I am just wondering if one less compare would make a difference

22:04 <enebo> seems like exact3:on would be better bytecode if you know m will match from skip map

22:04 <lopex> enebo: but it needs to build a backtrack first

22:05 <lopex> cant have it both ways

22:05 <lopex> enebo: that's why you have push first

22:05 <enebo> lopex: well I clearly don't understand this

22:05 <enebo> lopex: and unfortunately I need to go now :P

22:06 <lopex> enebo: or we're misunderstanding each other

22:06 <enebo> lopex: but I should maybe hang with you and try and learn basics of oni/joni bytecode one of these days

22:06 <enebo> I think if I did I would not ask so many long questions

22:06 <enebo> lopex: but I will talk to you later buddy

22:07 <lopex> enebo: yeah, I'm off the steam now too

22:07 <lopex> and also forgot lots of those parts of the code

22:08 rtyler has joined #jruby

22:27 drbobbeaty has joined #jruby

22:35 drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]