travis-ci has joined #jruby
<travis-ci> jruby/jruby (master:a5fa46c by Charles Oliver Nutter): The build was broken. https://travis-ci.com/jruby/jruby/builds/221853151 [210 min 56 sec]
travis-ci has left #jruby [#jruby]
ur5us has joined #jruby
ur5us has quit [Ping timeout: 258 seconds]
vext01 has quit [Ping timeout: 240 seconds]
jswenson[m] has quit [Quit: Idle for 30+ days]
drbobbeaty has quit [Read error: No route to host]
drbobbeaty has joined #jruby
c355e3b has joined #jruby
<headius[m]> enebo: I found one or two other cases where String functions are using backref just because that is how they get a match out of the regexp call
<enebo[m]> nice
<headius[m]> I am on the fence about making them not clear backref at this point... it is not a specified behavior but it happens accidentally (e.g. CRuby String#split just calls search until it returns nothing, which has the side effect of clearing backref)
<enebo[m]> For 9.2 I think we should for 9.3 I think we shouldn't
<headius[m]> eh?
<enebo[m]> clearing backref
<headius[m]> should clear, got it
<enebo[m]> It matches 9.2 just in case something could possibly depend on it
<headius[m]> yeah we can give it a shot... I will alter the other one or two that do this in the split backref regexp and if it passes we will give it a god
<headius[m]> a golden god
<headius[m]> or a go
<enebo[m]> That feels weird but it is a simpler path for 9.2 as far as risk
<enebo[m]> As you said yesterday it does completely remove threading issues when using the same calls within many threads which is likely the more common case of finding this issue
<enebo[m]> So in that sense I wonder how likely other scenarios are where we are doing different $~ calls across threads
<headius[m]> ah yeah so I should look at doing the threading fix for 9.2 but leave the clearing
<headius[m]> and then remove the clearing in 9.3 PR
<headius[m]> (for these other methods)
<enebo[m]> yeah
<enebo[m]> Maybe even cajole someone to do similar thing in MRI for these
<headius[m]> FWIW the ones that do actually read backref seem to do so only to reuse the object
<headius[m]> so it is a backwards perf enhancement for CRuby and our ports
<headius[m]> they are not using the data within it for anything
<enebo[m]> Part of me wondered about captures with split
<headius[m]> yeah nothing like that
<enebo[m]> yeah we use pattern already
<enebo[m]> just not from $~ since there is not block syntax
<enebo[m]> mri26 -e 'p "one two three".split(/(\S+)/)'
<enebo[m]> I was thinking about the oddity of this but we still get match data it is just where it is stored
<enebo[m]> Any method which accepts a block may expect access to $~ at that point
<headius[m]> yeah that is true
<headius[m]> some of these obviously can't eliminate it like sub or gsub or =~
travis-ci has joined #jruby
<travis-ci> jruby/jruby (master:a5fa46c by Charles Oliver Nutter): The build was canceled. https://travis-ci.com/jruby/jruby/builds/221853151 [211 min 2 sec]
travis-ci has left #jruby [#jruby]
<headius[m]> another thought occurs to me
<headius[m]> we could use a truly thread-local store for backref during regexp operations and just get/set it at the edges
<headius[m]> that would not fix those other races but it would mean the race is just in $~ and not internal to any of these methods
ebarrett has joined #jruby
<headius[m]> right now since they mostly are just trying to reuse the MatchData instance they could see it cleared in the middle of some logic
<headius[m]> so let them reuse a purely thread-local one
<headius[m]> I thought this was where the race was but now I realize these methods read and use the match object during their logic
<enebo[m]> yeah at any dispatch edges like yield or call you need to set but a bunch do not have that
<enebo[m]> Even so it is just reducing the potential for seeing the wrong value
<headius[m]> right
<headius[m]> so the methods themselves will all work, you just can't rely on $~ for the results
<headius[m]> under concurrent use
<headius[m]> oh yeah this is way cleaner
<headius[m]> so all of these backref methods will read and write solely from the truly thread-local (ThreadContext field) match data until they finish, and then if backref update was requested it will be set
<headius[m]> this could be a path toward eliminating backref framing too... for cases where we know backref needs to be used across threads we just add pre/post logic around those calls that moves the thread local one in and out of a JVM field
<headius[m]> as long as all internal methods know to look at the thread-local field, we can update frame-local stuff lazily or not at all
<headius[m]> basically pushes the frame requirement to the very edge of the call
<headius[m]> enebo: this is the start: https://github.com/jruby/jruby/pull/6647
<headius[m]> I will audit all writers of backref now to see if they are using it as out-of-band data
<headius[m]> this also adds back the reuse of the MatchData object, since we can be sure it is truly local to the thread (until exposed via the frame backref, at which point is flagged to not reuse)
<headius[m]> enebo: probably should do this for lastline methods too
<enebo[m]> sure
<c355e3b> Has anyone ever seen a constant "escape" the containing class/module?
<c355e3b> e.g. `class Test; Const = :test; end; defined?(Const) # ==> true`
<headius[m]> well that would be weird
<headius[m]> c355e3b: I guess the answer is "no"
<c355e3b> Ok, cool
<headius[m]> but if you have a case that seems wrong please open an issue
<headius[m]> enebo: first volley against unnecessary backref use: https://bugs.ruby-lang.org/issues/17771
<headius[m]> this case seems like a slam dunk since they added other boolean query methods like `match?` specifically to avoid constructing the MatchData
<enebo[m]> makes sense but I wonder who will say people use the $~ result
<headius[m]> that is the main counterargument
<enebo[m]> I was going to say we can deprecate but how :)
<headius[m]> I would argue they are using the wrong method then
<headius[m]> they are asking if it starts with X but then doing more than just checking starts with
<enebo[m]> yeah definitely since you can just data =~ /^doo/
<headius[m]> hell, we could just stop doing it and see what happens
<headius[m]> but I wanted to bring it up here
<headius[m]> I also needled tenderlove about it
<enebo[m]> It is interesting that starts with is largely just being able to omit ^ from the regexp
<enebo[m]> Like why is regexp even accepted at all?
<enebo[m]> for the duality of being able to pass either in?
<headius[m]> added 11 years ago
<headius[m]> well yeah I guess it is for the flexibilyt of regexp
<headius[m]> yeah you could just =~ /^ .../ but this basically shortcuts that for you
<headius[m]> but you can still use regexp stuff in ...
<headius[m]> I believe I have eliminated ALL frame-local backref usage within our core methods
<headius[m]> none of them used it as it... they were all just trying to re-use it, but the re-use logic was disabled
<headius[m]> oh that other one was not closed
<enebo[m]> This was linked to from that one
<headius[m]> ugh so they explicitly wanted it
<headius[m]> well they are wrong, but whatever
<enebo[m]> The duality argument I think is it
<enebo[m]> passing a string or regexp as some variable makes it just one line of code
<enebo[m]> or just one call
<enebo[m]> Ruby has never really been a language which has cared about overhead if they feel it makes code look nicer
<headius[m]> except they added match? with no MatchData
<headius[m]> so wtf
<headius[m]> around the same time as they added this one which does use MatchData
<enebo[m]> they could eliminate the $~ part of this I thikn
<enebo[m]> If someone depends on it it would break something but I feel this is rare enough they probably could just say the behavior changed
<headius[m]> in any case it doesn't affect my PR for now
<headius[m]> I have modified all consumers of getBackRef to use the context-local value instead
<headius[m]> anything that claimed it read BACKREF frame field before no longer does... they only write it if appropriate
<headius[m]> Oh Ruby
<headius[m]> enebo: maybe you can back me up on that ruby bug since you agree
<enebo[m]> Added a comment
<enebo[m]> It states your first comment in a different way as the main problem
<enebo[m]> The performance overhead is an issue but one I doubt will hold a lot of sway
<headius[m]> maybe not, but the did add match? for perf
<headius[m]> they
<enebo[m]> they added
<enebo[m]> I think that will be the distinction there
<enebo[m]> My comment mostly shows if you plan on using the match data then you already know the type will be a regexp
<headius[m]> I suggested doing the same thing as `match?` and adding a `start_with` that is not a boolean query method
<enebo[m]> That pretty much means you either mess up and accidentally pass a string and then get random match data or you should just be using a regexp
<headius[m]> that would be consistent... boolean query methods that take a regexp do not set $~
<enebo[m]> I find this behavior inconsistent enough to be a bug
<enebo[m]> The behavior of starts_with? should be pretty universal
<headius[m]> now figure out how to get them to make str[] not set $~
<enebo[m]> like it should guarantee nil for $~ if they keep it
<enebo[m]> Or whatever empty match is
<enebo[m]> but I would argue that is the wrong direction
<headius[m]> yeah agree
<enebo[m]> The more I look at something like second = $1 if "foo".starts_with?("fo(.)")
<headius[m]> haha
<headius[m]> yeah
<enebo[m]> well ignoring my obvious typo this feels like it is obfuscating
<enebo[m]> second = $1 if foo =~ /^fo(.)/
<enebo[m]> I mean which version would you want?
<enebo[m]> In this case it may even just be /^fo(second=.)/
<enebo[m]> =~ foo
<enebo[m]> or whatever direction the local var names feature is
<enebo[m]> but point being starts_with? accepting regexp is meant to be dualed with mixing string args
<enebo[m]> if you really plan on it being a known regexp them a regexp looks more correct
<headius[m]> second = $1 if foo.start_with(/fo(.)/)
<enebo[m]> I don't see the point of making a side-effect version at all
<enebo[m]> At least it seems more reasonable to not set $~ for starts_with? just to be consistent. I guess I have my thoughts on responses but we shall see
<headius[m]> PR is ready for a once-over, assuming it stays green
<headius[m]> there are no reads of backref in any core methods now
<headius[m]> only in methods explicitly reading from $~ like last_match
<headius[m]> enebo: can I merge the block framing PR?
<enebo[m]> on 9.2?
<headius[m]> 9.3
<enebo[m]> yeah it is fine...
<enebo[m]> #6643
<headius[m]> yes
<headius[m]> it is in
<headius[m]> github refresh of PR checks has gotten flaky lately... sometimes have to reload a few times to get it to update after a push
<headius[m]> it will show "passed" but it is showing previous push's result
<headius[m]> looks like this backref PR is still green, I am flipping it to ready state
<headius[m]> enebo: have a look over it
<enebo[m]> ok
<enebo[m]> headius: I feel this should just land on 9.3. The specifically reported issue was addressed already for split
<enebo[m]> It is the same pattern and in that sense it is a repeat of the same thing but it is a lot of code
<headius[m]> enebo: ah I thought you were suggesting doing this for 9.2 above
<headius[m]> I can certainly rebase on 9.3 but it will be a sticky merge
<enebo[m]> yeah sorry. in theory this should reduce future reported problems but since we have not gotten them I am not sure I want to take the risk
<headius[m]> I'm fine with that
<headius[m]> it was more invasive than expected but also cleaner than expected
<enebo[m]> Another nice 9.3 release notes entry then
<travis-ci> jruby/jruby (master:1339904 by Charles Oliver Nutter): The build has errored. https://travis-ci.com/jruby/jruby/builds/221945131 [213 min 34 sec]
travis-ci has left #jruby [#jruby]
travis-ci has joined #jruby
<headius[m]> hmm
<headius[m]> restarting those to see
<headius[m]> haha
<headius[m]> enebo: we added our own startsWith on master that does not use MatchData
<enebo[m]> heh sounds familar
travis-ci has joined #jruby
travis-ci has left #jruby [#jruby]
<travis-ci> jruby/jruby (master:1339904 by Charles Oliver Nutter): The build was canceled. https://travis-ci.com/jruby/jruby/builds/221945131 [205 min 54 sec]
<headius[m]> enebo: PR has been rebased... hopefully I didn't botch it because there were a lot of conflicts
<headius[m]> looking forward to walking away from that branch finally
<headius[m]> enebo: surge_g so I did not notice the original issue also reported a ClassCastException. I have marked that one fixed in 9.2.17.0 as well.
ur5us has joined #jruby
ur5us has quit [Ping timeout: 258 seconds]
<headius[m]> matz agreed to make the dual return target thing an error
<headius[m]> ko1 has a PR to raise when the inner proc gets orphaned from the lambda
<headius[m]> I believe we already do it right and were just +1 in
<headius[m]> on eregon's report
<headius[m]> enebo: after some merge fixes the rebased backref PR is going green
<enebo[m]> ok
travis-ci has joined #jruby
travis-ci has left #jruby [#jruby]
<travis-ci> jruby/jruby (master:e170362 by Charles Oliver Nutter): The build failed. https://travis-ci.com/jruby/jruby/builds/221972826 [203 min 35 sec]