<kares>
will get it green by reverting or trying newer jruby for Maven - its just that its really hard to get the maven jruby plugin right for the whole matrix ;(
<GitHub47>
[jruby-openssl] kares created ci-green (+1 new commit): https://git.io/fN4r0
<GitHub47>
jruby-openssl/ci-green 33475eb kares: [build] try using later 1.7 JRuby for maven plugin
sgeorge has quit [Remote host closed the connection]
sgeorge has joined #jruby
shellac has joined #jruby
jrafanie has joined #jruby
<enebo>
ChrisBr: yeah those test:mri ones look odd
<enebo>
ChrisBr: we do sometimes have timeouts but all three of those seem to be in the same spot
<enebo>
ChrisBr: can you run test:mri and see if you see that
<enebo>
I will try and eliminate some of this red this morning...looks like we added a param to CompiledIRMethod and missed it failing our spec:compiler run
<ChrisBr>
enebo: hm ok
<enebo>
ChrisBr: it is weird for all three to hang in the same place
<ChrisBr>
oh is it the same place?
<enebo>
we have spurious tests we should remove but I have never seen 3 runs of test:mri hang in same place in a single run
<enebo>
well I think so. The last text all seem to be around some cgi escaping tests
<ChrisBr>
locally I can not even start the test :/
<enebo>
oh yeah? what happens?
<ChrisBr>
nothing
<ChrisBr>
is stall after starting mri test suite
<ChrisBr>
anyway, already have sth suspicious, maybe some infinite loop in the base iterator
<enebo>
lopex: it dramatically sped up strftime since it was already compiled
<enebo>
lopex: yeah even without viewing this I believe I know what will be said
<enebo>
lopex: in Rails we things like date/times end up using the same regexp/date formats per application and not many of them
<lopex>
enebo: runtime cache for regexp ?
shellac has quit [Quit: Computer has gone to sleep.]
<enebo>
lopex: pre-compile regexps we see more than n times?
<lopex>
enebo: or did you reuse the existing cache ?
<enebo>
err cache
<enebo>
no
<lopex>
ok
<enebo>
lopex: we have regexp cache then?
<lopex>
yes
<enebo>
ok good
<lopex>
weakref one
<kares>
there's already a cache being reused
<lopex>
soft rather
<lopex>
getRgexpFromCache afaik
<enebo>
lopex: If I knew this I forgot about it
<kares>
except not for any regexps having modifiers
<enebo>
oh
<lopex>
enebo: there's three of them
<kares>
thus a bunch of in date parsing doesn't get reused
<lopex>
for quoted and processed regexps too
<kares>
all having //i
<enebo>
haha well that sucks for some of those date ones
<enebo>
i seems like a reasonable thing to cache though
<kares>
haven't checked joni if theres aditional checking - just to caching I have seen in RubyRegexp
<kares>
but for this we could pbly just write them in native and re-use
<enebo>
so iso8601 is not cached because /ix?
<enebo>
That is a big regexp
<enebo>
kares: perhaps caching that will get out perf better than MRI
<enebo>
kares: I also thought about native benefit of value object in Java vs Ruby Hash
<enebo>
for your recent change
<enebo>
m[:hour]
<enebo>
which made me wonder if we are destined short-term just to go full native on this
<lopex>
kares: it's options aware
<lopex>
and encoding
<enebo>
lopex: so //ix regexp caches?
<lopex>
I hope I dont lie to you
<enebo>
lopex: hey every day is a new day to me...I won't blame you if you are wrong
<lopex>
enebo: it should
<enebo>
lopex: ok well that would have been a big find if it isn't
<lopex>
enebo: getRegexpFromCache
<enebo>
yeah this should be fine unless our impl is broken and joni.options is not returning same stuff
<enebo>
pretty trivial for me to test I guess
<kares>
so its being cached then? cool!
<kares>
oh you're still not sure :)
<lopex>
well, if it goes through cache of course
<enebo>
well if all regexps go through this method this should be doing it
<enebo>
unless it alternates the same regexp with difference encodings or options
<enebo>
or we have a bug in the impl
<enebo>
A better impl of this would be to encode the options in front or back of bytelist
<enebo>
then you can have multiple options of same regexp pattern in the cache
<enebo>
I guess you pay the price of adding those bytes to the bytelist during lookup
<enebo>
or two param hash
<enebo>
anyways also not relevant for this problem
Caerus has quit [Read error: Connection reset by peer]
<kares>
which getRegexpFromCache path were you guys looking at?
<kares>
I followed a factory from native and it did not have the cache there for anything having options
<kares>
although I should revisit closely my assumptions ...
<enebo>
kares: ah where does it start calling into regexp?
<enebo>
kares: I am so hoping you are right :)
<kares>
well I only looked at moving some small ones into native
shellac has joined #jruby
<enebo>
I only see RubyRegexp constructor and regexpInitialize call through it
<kares>
yep that one
<enebo>
they both ask the cache
<enebo>
It is interesting to see how much happens before the cache lookup in that method
<kares>
ah right I see it now
<kares>
missed it - sorry for the confusion
<enebo>
kares: np
<kares>
yeah - that is pnly why I missed it
<kares>
since for the other case the cache lookup was in ctor
<enebo>
we need regexpcallsite
<enebo>
then it can bootstrap all this shit and just keep the pattern at the site
sgeorge_ has joined #jruby
sgeorge has quit [Read error: Connection reset by peer]
<enebo>
lol RegexpObjectSite
<enebo>
we seem to have it with indy on but I don't think it elides any of this logic
<enebo>
but it saves the regexp so I guess it works
<kares>
heh
<kares>
RubyRegexp seems to be thread-safe so I might try to keep the same instance around in an internal var
<enebo>
kares: you mean native or Ruby?
<enebo>
kares: using Joni directly may shave some more overhead off too
<lopex>
enebo: but you can ask the cache youreselfe
<kares>
native - I do have some smaller ones rewriten
<kares>
but for some reason it got way slower - I think there's smt wrong cause it should be around ~ same
<enebo>
I see the sun/mon/tue section of your PR
<enebo>
kares: HAHAHA I think I have a weird way of doing day_num faster
<enebo>
((sun)|(mon)|(tue)...)
<enebo>
if $1 is set then if $4 is set then that is tue
<enebo>
not sure if joni will be better than extracting the text then doing caseinsensitivecmp
<kares>
enebo: already have that in native :) based on C
<enebo>
yeah I mean faster than what you have in native
<kares>
yeah MRI is doing exactly that - caseinsensitivecmp
<enebo>
or maybe faster
<kares>
ah okay - let's have it :)
<enebo>
since you will be average 3.5 nil checks
<enebo>
/((sun)|(mon)|(tue)....)/i will match $1 for day match then you look from $2-$8 for which day is set
<enebo>
only unknown is region setup slower or faster
<enebo>
It very well may be way slower
<enebo>
I just wondered since it would be not searching again and doing case insensitive compare
Caerus has joined #jruby
<enebo>
a hashmap may also work out better since bytelist caches its hash calc
sgeorge_ has quit [Remote host closed the connection]
<enebo>
kares: with regards to being slower if you are using compile.invokedynamic at all the bytecode is pretty much storing the completely processed regexp into a field so some processing would go away
<enebo>
kares: although if you do as you said above and stick it into a field you should see if that overhead matters
Caerus has quit [Read error: Connection reset by peer]
<kares>
ok will do some testing on that
<kares>
wonder what are those bytes.getClass()/str.getClass() lines around RubyRegexp for
<enebo>
lopex: if I stop using matchInteruptible for match then we go from 4.9M i/s to about 5.5M
<lopex>
enebo: because there might be no region is there's no captures
<lopex>
enebo: and you still want to be able to return match begin/end
<enebo>
but how would that be returned?
<lopex>
enebo: via on demand region instance
<lopex>
enebo: getEagerRegion
<enebo>
but getEagerRegion is 'return msaRegion != null ? msaRegion : new Region(msaBegin, msaEnd);'
<lopex>
yes
<enebo>
haha I reversed the logic
<enebo>
ok so it is possible to make something on demand then I see
<enebo>
do we do that?
<lopex>
under jruby I guess not
<enebo>
So I am realizing we have two engines for sb and nonsb
<lopex>
yes
<enebo>
we could probably have a third for fast matching
<enebo>
then we may have a few more methods on the class but those could nuke all this setting if we don't actually need it
<lopex>
well in that case we could do much more in separate opcode that relate to captures
<lopex>
opcodes
<enebo>
yeah I somewhat mean that
<enebo>
although in some cases existing opcodes would be used and any region logic would just be removed from that version
<lopex>
but you cant get rid of them
<lopex>
oh, hmm
<lopex>
referred
<lopex>
enebo: !!
<enebo>
lopex: Not sure if you are happy or sad
<lopex>
we know upfront I guess if there something like (.)\1
<enebo>
yeah
<lopex>
then $1 is referred in terms of oni
<enebo>
we know lots of stuff before it starts
<lopex>
but what worried me is all that special casing just for match?
<lopex>
er "match?"
<enebo>
well we don't know how much benefit
<lopex>
enebo: so interruptible slows so much ?
<enebo>
68 consumers in Rails 5.1 core
<enebo>
seems to unless matchInterruptible does more than match beyond that
<enebo>
This is highly synthetic bench too
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<enebo>
no doubt in a more multi core use-case ala Rails we would not see the same penalty
<enebo>
but who knows then since we are executing tons of other code at same time
<enebo>
lopex: but going back to Rails 68 calls to match? in core code means people are using it because it is fast
<enebo>
lopex: so it likely is important for us to do it quickly too
<enebo>
lopex: since people will leverage it for the SPEEDZ
<enebo>
lopex: but I don't think we should make some huge unmaintainable mess either
<lopex>
yeah, that's why I was postponing it
<lopex>
but since we have much cleared picture now..
<lopex>
I'll look again at mri too
<enebo>
lopex: tbh the stuff I did today was not very invasive to joni
<enebo>
probably more changes to JRuby itself to not call through some massive set of common methods
<enebo>
but I also think we could start with the same general strategy I did today in making a matcherNoRegions to factory and then later consider more invasive changes to joni interp for specialized or simpler matcher
<lopex>
enebo: though but that getEagerRegion isnt use in joni, we still benefit from it since lots of core uses just getBegin / getEnd directly
<enebo>
This is the patch I ended up with for joni but I did revert your other change of that method to scanEnv
<lopex>
enebo: I'd even pass that region from factory methods
<lopex>
but that's nitpicking
<enebo>
lopex: yeah I went for minimal amount of work while trying to achieve your factory-based API
<enebo>
lopex: getEager is fine to keep but if we decide to try and improve match perf I think we can make a match engine and probably reduce the complexity of what we call for each op code
<enebo>
that will just be another code path from execute
<enebo>
but I have no idea if that will help perf or not either
<enebo>
that msaBegin+ logic is pretty simple math
<enebo>
I would think best case for specialize execute for match would be stripping logic down to the point some more stuff inlines
<lopex>
enebo: since asm stuff isnt loader the factory method sohuld be considered ply
<lopex>
polu
<lopex>
er, poly
<lopex>
er
<lopex>
mono
<lopex>
doh
<enebo>
haha
<enebo>
yeah
<enebo>
you reflective call should eliminate that being a problem
<lopex>
reflective ?
<enebo>
newInstance
<lopex>
ah, right
<enebo>
match? 7.255M (± 9.8%) i/s - 35.918M in 5.007063s