ur5us has quit [Ping timeout: 260 seconds]
ur5us has joined #jruby
ur5us has quit [Ping timeout: 260 seconds]
ur5us has joined #jruby
ur5us has quit [Ping timeout: 244 seconds]
_whitelogger has joined #jruby
ur5us has joined #jruby
ur5us has quit [Ping timeout: 244 seconds]
sagax has quit [Remote host closed the connection]
sagax has joined #jruby
nirvdrum has joined #jruby
nirvdrum has quit [Remote host closed the connection]
nirvdrum has joined #jruby
Antiarc_ has quit [Remote host closed the connection]
Antiarc has joined #jruby
<headius[m]> enebo: yo
<headius[m]> so where do we stand and what can I do for .13
<enebo[m]> you can review the PR but I also pinged johnathon to test again
<enebo[m]> is he on matrix?
<headius[m]> yeah johnphillips31416
<headius[m]> I will start having a look at the PR
<headius[m]> frustratingly nothing I threw at it would break, regardless of how I loaded/evaluated methods, how many threads I use, structure of the methods and nested closures
<enebo[m]> I verified dynscope removal on all of test:mri:core:jit is the same
<headius[m]> there's some complexity to the failing cases that I have not been able to reproduce synthetically (and I don't know what the failing cases actually look like)
<enebo[m]> I think another thing I will try this afternoon is endlessly making new methods and then forcing them to jit
<enebo[m]> across threads
<headius[m]> I tried normal jit mode, -X-C, different thresholds... all that
<enebo[m]> if it is a timing issue eventually a race should happen
<headius[m]> logged jit output to see that things were compiling
<headius[m]> saw occasional double jitting but no failures
<enebo[m]> but it might take hundreds to thousands of methods JITing at the same time for it to fail
<headius[m]> hmmm
<enebo[m]> based on how long it seems to take to hit it I am thinking thousands of possible races before one hits it
<headius[m]> perhaps multiple independent stacks of closures in the same method
<headius[m]> so you have one stack triggering jit at method level while another stack is still happy to interpret
<enebo[m]> that was what I was trying to do originally
<headius[m]> the DynamicScope error is a closure problem for sure
<headius[m]> the stack error likely is as well
<enebo[m]> force one sibling to JIT then second sibling would not
<enebo[m]> part of the problem is an old closure activation has to be alive before a new one JITs
<enebo[m]> or happen to get used at same time the new one JITs and is used
<headius[m]> yeah that makes sense
<enebo[m]> I guess perhaps a Thread which lives with a closure which reaches out of it but then a parent closure jits outside the thread?
<headius[m]> oh that's a thought too
<headius[m]> ok I still can't make it fail
<headius[m]> I'm going to review
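The stress approach discussed above (keep defining new methods with sibling closures, then call them from several threads until they cross the JIT threshold) could be sketched roughly as follows. This is only an illustrative harness, not the test either participant ran: the constants, the method names, and the closure shape are all assumptions, and under JRuby it would presumably be run with a lowered JIT threshold so compilation actually triggers.

```ruby
# Hypothetical stress harness; every name and count here is illustrative.
THREADS = 4
METHODS = 50    # new methods defined per thread
CALLS   = 200   # calls per method, meant to exceed a low JIT threshold

holder = Class.new

results = Array.new(THREADS) do |t|
  Thread.new do
    sum = 0
    METHODS.times do |m|
      name = :"stress_#{t}_#{m}"
      # Each generated method has two sibling closures sharing the
      # method-level variable `acc`, plus a block that calls both.
      holder.class_eval <<~RUBY
        def #{name}(n)
          acc = 0
          add = proc { |x| acc += x }
          dbl = proc { |x| add.call(x * 2) }
          n.times { |i| i.even? ? add.call(1) : dbl.call(1) }
          acc
        end
      RUBY
      obj = holder.new
      CALLS.times { sum += obj.send(name, 10) }
    end
    sum
  end
end.map(&:value)

# Every call with n = 10 should return 15 (five +1 steps and five +2 steps),
# so any other total would point at a corrupted closure scope.
expected = METHODS * CALLS * 15
puts results.all? { |r| r == expected } ? "ok" : "mismatch: #{results.inspect}"
```

A wrong total or an exception from one thread is the signal to look for; a clean run proves nothing about the race, which matches the experience described above.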
<headius[m]> enebo: I wanted to close the loop on this symbol experiment: https://github.com/jruby/jruby/pull/6341#issuecomment-664588352
<headius[m]> basically the same as my messages to you on Friday but with one possible enhancement we could make: force all 7-bit clean identifiers to always use US-ASCII, explicitly acknowledging that it's not possible to have differently-encoded 7-bit identifiers
<headius[m]> I was not quite sure if we're doing that now
<headius[m]> I don't see this limitation as particularly damning, since it's pretty edgy to be running a system with lots of mixed-encoding identifiers in the first place, and even edgier to be doing so with ASCII-compatible 7-bit identifiers and expecting them to all be different logical identifiers separated only by encoding
<enebo[m]> Well 7bit == USASCII was the intent and it should work that way coming in through the parser
<headius[m]> ok, then it may be this only needs a "fix" in String#intern to always force such symbols to US-ASCII
<headius[m]> that is the root case from #1348
<enebo[m]> it has to be ascii compatible encoding as a source
<enebo[m]> yeah this is probably just that code path itself
<headius[m]> it will be an explicitly behavioral difference from CRuby, but it will be clearly spelled out: if you have 7-bit ASCII bytes it will be a US-ASCII symbol, always
<enebo[m]> It is more inconsistent on their parts if so
<headius[m]> that will avoid the cases where the first symbol intern that "wins" has some other encoding and others weirdly inherit it
<enebo[m]> their parser will also make clean 7bit in ascii compat encodings US-ASCII
<enebo[m]> if they allow it from String that is an odd man out behavior
<headius[m]> yeah, and the case in 1348 was not even CR_VALID anymore, so it's stretching the boundaries pretty far
<enebo[m]> ultimately though 7bit means ascii 7 bit so it being a different encoding does not really work in anyone's favor
<headius[m]> it was US-ASCII bytes pretending to be a valid UTF-16 string
<enebo[m]> yeah so it really makes you wonder why they removed the error
<enebo[m]> I would not bet money but I suspect that error was us following them
<headius[m]> that is clearly contrived, but even the valid cases where you might have something with ASCII bytes encoded as 8859-13 still would work fine if we just force them back to US-ASCII in all cases
<enebo[m]> yeah since the 7bit would be the same chars/bytes
<headius[m]> wanting to differentiate valid 7-bit identifier collisions is a bridge too far
<enebo[m]> It should not have any effect other than a reflective one
<headius[m]> and has very little value
<headius[m]> right
<enebo[m]> :foo and :foo should hash different
<headius[m]> plus there's a ton of logic throughout Ruby to allow US-ASCII strings to interact non-destructively with any ASCII-compat encoding
<enebo[m]> I mean it is a very weird real world issue...when would you want the same string with the same chars to be a different key
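A small sketch of the rule being discussed: the same 7-bit bytes, tagged with different ASCII-compatible encodings, interned through String#to_sym. The results described in the comments are what CRuby is expected to do; whether JRuby's String#intern path currently behaves the same way is exactly the open question in this conversation, so treat this as a check to run rather than a statement about JRuby.

```ruby
# Same 7-bit bytes, tagged with three ASCII-compatible encodings.
encodings = [Encoding::US_ASCII, Encoding::UTF_8, Encoding::ISO_8859_13]
variants  = encodings.map { |enc| "foo".dup.force_encoding(enc) }

symbols = variants.map(&:to_sym)

# Under the "7-bit means US-ASCII" rule, all three are the very same Symbol.
same_symbol   = symbols.all? { |s| s.equal?(:foo) }
sym_encodings = symbols.map(&:encoding).uniq

# So they are also one and the same Hash key.
table   = { foo: 1 }
lookups = symbols.map { |s| table[s] }

puts same_symbol
puts sym_encodings.inspect
puts lookups.inspect
```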
<enebo[m]> yeah tons of logic which says...oh all ascii make 7bit
<enebo[m]> And this is not even about correctness
<headius[m]> yeah basically that's it
<enebo[m]> They do it so they can know how to walk it quickly and get length easily
<headius[m]> 7bit is treated as US-ASCII almost everywhere, until it needs to be something else... and if that something else is non-7bit, it graduates to the proper encoding
<headius[m]> I will look into this as post .13 work after I review IR patches
<enebo[m]> as far as m17n rules go this is one of the easiest
<headius[m]> you're right, it is actually more internally consistent
<enebo[m]> anything which is 7bit will become whatever non-7bit value it is added to (so long as encoding is ascii compat.)
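The rule just stated can be observed directly with String#+ and Encoding.compatible?. A minimal sketch, with variable names chosen only for illustration:

```ruby
seven_bit = "abc".dup.force_encoding(Encoding::US_ASCII)
latin     = "caf\u00E9".encode(Encoding::ISO_8859_1) # has one non-7-bit byte

# A 7-bit string takes on the encoding of the non-7-bit string it joins,
# as long as that encoding is ASCII-compatible.
combined      = seven_bit + latin
combined_enc  = combined.encoding
negotiated    = Encoding.compatible?(seven_bit, latin)

# With an ASCII-incompatible encoding there is nothing to negotiate.
utf16         = "abc".encode(Encoding::UTF_16LE)
incompatible  = Encoding.compatible?(seven_bit, utf16)

puts combined_enc
puts negotiated.inspect
puts incompatible.inspect
```

Here `combined_enc` and `negotiated` should both be ISO-8859-1, and `incompatible` should be nil.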
<headius[m]> I suppose there's an additional case that would represent an actual flaw in JRuby: ASCII-incompatible encoding that happens to only use 7-bit characters
<enebo[m]> The only weird bit of this is that Symbols will go US-ASCII but Strings will keep their encoding
<headius[m]> I don't know such an encoding in general use offhand
<headius[m]> EBCDIC would be one
<enebo[m]> is ebcdic ascii?
<headius[m]> it is not
<headius[m]> so if you had an EBCDIC identifier we would incorrectly mark it as US-ASCII and it would never be retrievable as EBCDIC
<headius[m]> I mean if that identifier only used 7-bit range, which would mean like half the alphabet and all numbers are off the table
<enebo[m]> but I think the real problem here is not that the parser or the string functions would make that symbol wrong
<enebo[m]> It is that if somehow the same chars end up coming in from a non-EBCDIC source they will collide
<headius[m]> yes, just that if you wanted to view it as EBCDIC it would be nonsense
<headius[m]> yeah
<enebo[m]> and I suppose if you use a gem on the internet those files will not be EBCDIC
<headius[m]> there's a table here: https://en.wikipedia.org/wiki/EBCDIC
<headius[m]> and I was wrong, it takes the entire alphabet off the table
<enebo[m]> I do not want to be glib but if people want EBCDIC symbol encodings JRuby is not for them
<headius[m]> so it would seem unlikely you'd have a 7-bit ebcdic identifier that would be very useful
<enebo[m]> I did actually know it was not ascii-compat
<headius[m]> the only characters in 7-bit range are symbols
<headius[m]> so like... the ! method would encode improperly
<headius[m]> and ==, ===, !=, `, etc
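To make the point concrete, here is a small sketch using a handful of byte values hard-coded from the code page 037 column of the EBCDIC table linked above. The hash is a hand-picked sample, not a full mapping, and the constant name is an assumption made for this example.

```ruby
# A few EBCDIC (code page 037) byte values, hard-coded for illustration.
EBCDIC_037 = {
  "a" => 0x81, "z" => 0xA9,
  "A" => 0xC1, "Z" => 0xE9,
  "0" => 0xF0, "9" => 0xF9,
  "!" => 0x5A, "=" => 0x7E, "`" => 0x79
}.freeze

# Split the sample into characters whose byte fits in 7 bits and the rest.
low_pairs, high_pairs = EBCDIC_037.partition { |_char, byte| byte < 0x80 }
seven_bit = low_pairs.map(&:first)
high      = high_pairs.map(&:first)

# What those 7-bit EBCDIC bytes would read as if relabeled US-ASCII.
misread = seven_bit.map { |char| EBCDIC_037[char].chr }

puts seven_bit.inspect
puts high.inspect
puts misread.inspect
```

In this sample only the punctuation lands in the 7-bit range, and relabeling those bytes as US-ASCII turns `!` into `Z`, which is the kind of nonsense described above.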
<enebo[m]> for (c = 'A'; c <= 'Z'; ++c) putchar(c);
<enebo[m]> lol
<enebo[m]> we inherited a lot of weird shit in Java as a result of C
<headius[m]> yeah I used ebcdic as an example because it's so broken in other ways
<enebo[m]> yeah for sure
<headius[m]> but it's the only one I could think of that's an 8-bit encoding completely incompatible with ASCII
<enebo[m]> but ultimately all of these discussions evolve into weird combinations
<headius[m]> there might be some CJK encodings that are problems
<headius[m]> EUC-JP is weird
<headius[m]> most encodings in wide use have accepted they have to be ASCII-compat though
<headius[m]> ASCII won
<headius[m]> G0 is almost always an ISO-646 compliant coded character set such as US-ASCII, ISO 646:KR (KS X 1003) or ISO 646:JP (the lower half of JIS X 0201) that is invoked on GL (i.e. with the most significant bit cleared). An exception from US-ASCII is that 0x5C (backslash in US-ASCII) is often used to represent a Yen sign in EUC-JP (see below) and a won sign in EUC-KR.
<headius[m]> so this is why you see the ¥ symbol in older Windows paths on JP machines
<headius[m]> but otherwise it's basically compat
<enebo[m]> yeah I do remember knowing about why there was a yen symbol at one point
<enebo[m]> but windows codepages
<enebo[m]> gives me enough to realize anything probably is anything somewhere in the world
<headius[m]> yeah it's an intractable problem for locales that have not accepted unicode
<headius[m]> but even Ruby has gone to default UTF-8 internally
<headius[m]> if only the byte had been represented as 2^4 instead of 2^3 we might never have had this problem (and chinese would still have to use 2^5 anyway)
<enebo[m]> I guess it all just came down to squeezing
<headius[m]> enebo: I have opened https://github.com/jruby/jruby/issues/6344 for the additional work we discussed
ur5us has joined #jruby
<headius[m]> hmmm
<headius[m]> I may have thought of an interesting idea
<headius[m]> to blunt the cost of booting JRuby when there's no gems to load, perhaps we should disable RubyGems if we don't see a path or environment variable that would indicate where gems live
<headius[m]> the use case is mostly for embedding JRuby, where you are already packaging all libraries you need at a root level of some jar file
<headius[m]> if there's no gem home, rubygems serves no purpose other than to slow down boot time
<lopex[m]> numbers!
ur5us has quit [Ping timeout: 260 seconds]
ur5us has joined #jruby
<headius[m]> could be
<headius[m]> we start up significantly faster without RG loaded
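The boot-time idea above might look something like the following check. This is a hypothetical sketch of the heuristic only: the method name, the choice of `GEM_HOME` and `GEM_PATH` as the signals, and the optional directory list are all assumptions, and nothing here reflects how JRuby actually decides whether to load RubyGems.

```ruby
# Hypothetical heuristic; names and signals are assumptions for illustration.
GEM_ENV_VARS = %w[GEM_HOME GEM_PATH].freeze

# Returns true when something suggests gems may live somewhere on this system:
# either a non-empty gem-related environment variable or an existing gem dir.
def rubygems_needed?(env, candidate_gem_dirs = [])
  return true if GEM_ENV_VARS.any? { |key| !env[key].to_s.empty? }

  candidate_gem_dirs.any? { |dir| File.directory?(dir) }
end

puts rubygems_needed?({})                                  # no hints at all
puts rubygems_needed?({ "GEM_HOME" => "/opt/app/gems" })   # env var present
puts rubygems_needed?({}, [Dir.pwd])                       # a directory exists
```

For comparing startup numbers by hand, the existing `--disable-gems` command-line switch already skips loading RubyGems at boot.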