#jruby on 2020-07-23 — irc logs at freenode.irclog.whitequark.org

2020-07-01 18:55 ChanServ changed the topic of #jruby to: Get 9.2.12.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

00:18 ur5us has quit [Ping timeout: 244 seconds]

00:44 ur5us has joined #jruby

05:37 ur5us has quit [Ping timeout: 260 seconds]

13:18 nirvdrum has joined #jruby

14:44 sagax has quit [Read error: Connection reset by peer]

15:05 <enebo[m]> kares: JRUBY_OPTS="--dev -Xjit.logging -Xjit.logging.verbose -Xjit.threshold=0 -Xdebug.parser" jruby -r optparse -e 1

15:06 <enebo[m]> If you try this on master you will see a Java proxy class get set up via compilation where it passes a null superClass to newClass.

15:06 <enebo[m]> Not sure if you would care to look at it or not but I figured you know that code

15:06 <enebo[m]> kares: if that gets fixed interestingly we will see a second error raising NPE when compiling optparse

15:07 <enebo[m]> kares: which goes back to 9.2...that is something else though :P

15:10 <kares[m]> heh, reproduced. is this in an issue on the tracker?

15:10 <kares[m]> would be interested to look into but my work-queue is too full ... maybe tomorrow or over the weekend

15:10 <enebo[m]> kares: ah yeah no rush at all...it is only on master atm

15:10 <enebo[m]> but no issue

15:11 <enebo[m]> I was trying to debug IRFlags changes and there are lots of errors in our logging code now and then I realized multiple errors have nothing to do with my changes

15:17 <headius[m]> I fixed a bug in jit logging on ir_concurrency

15:17 <headius[m]> NPE in InterpretedIRMethod

15:18 <headius[m]> er maybe it was InterpretedIRBlockBody... it was returning null for implementationClass

15:19 <enebo[m]> well I have another now since fic never gets set on failed compile we cannot see what it tried to do

15:19 <enebo[m]> My changes work for startup interp but I hit some bugs when I switch to Full/JIT

15:20 <enebo[m]> now that I am debugging I am hitting things

15:20 <enebo[m]> I did see that problem so I probably had not pulled since then

15:25 <kares[m]> yet ... another attempt to run CI on Win: https://github.com/jruby/jruby/pull/6340

15:27 <kares[m]> wonder if we should finish that with pends - most seems to be io/posix issues

15:29 <enebo[m]> kares: pend with windows platform and that is just fine

15:29 <enebo[m]> kares: I think seeing what new things we break is better than not running it at all

15:33 <kares[m]> okay I will look into that and add a test there for the regression

15:34 <kares[m]> will re-target for 9.2; it is still merged to master, right?

15:34 <headius[m]> kares: not surprising

15:34 <headius[m]> yes we are merging periodically

15:34 <kares[m]> yy not at all I expected worse ;)

15:35 <kares[m]> since that suite is 'crazy'

15:35 <headius[m]> I modified spec tags on master to add RbConfig::CONFIG['host_os'] to excluded tags so if you copy that over to 9.2 branch it should be possible to tag as "mswin" or something

15:35 <headius[m]> I can't remember what we use for host_os on Windows but it's always the same

15:35 <kares[m]> this is only test:jruby

15:35 <headius[m]> might be mswin32 😟

15:36 <headius[m]> oh I see

15:36 <headius[m]> still no posix IO for most stuff on Windows

15:36 <kares[m]> tagged some in the code already

15:36 <headius[m]> ok

15:36 <kares[m]> will do the rest with TODOs

15:36 <headius[m]> we used to run specs on windows but that was years ago... would be a project for a day or two

15:37 <headius[m]> kares: yeah maybe we should file a bug with that tagging commit so we don't just forget about them

15:37 <headius[m]> wish we had a better mechanism for tagging test/unit

15:37 <kares[m]> ok

16:36 joast has joined #jruby

16:44 <headius[m]> enebo: apropos of nothing: https://github.com/jruby/jruby/issues/1348#issuecomment-663112293

16:44 <headius[m]> both we and MRI do consider encoding in String hash calculation, but we don't use the same logic for symbol bytelist hashing

16:45 <headius[m]> so that's probably the only remaining fix to get identical symbols with different encodings working

16:46 <enebo[m]> I am not sure

16:46 <enebo[m]> we already will make any 7bit sequence into US-ASCII before it enters the table

16:47 <headius[m]> yeah I think that's wrong

16:47 <headius[m]> what we should do instead is make it US-ASCII only if the original encoding is 7bit-compatible and the content is all 7-bit

16:47 <headius[m]> I think

16:48 <headius[m]> otherwise we leave it as is and use the encoding index as part of the hash

16:48 <enebo[m]> I have never seen a 7bit clean symbol not be US-ASCII but I guess it is not us-ascii compatible string but happens to be us-ascii bytes maybe?

16:48 <headius[m]> yes

16:48 <headius[m]> in the example case it's an ascii string forced into utf-16

16:48 <enebo[m]> The parser does do this check though. Although it could be wrong

16:49 <headius[m]> it may be unlikely that we see bugs with this anymore in practice, but that particular report is still broken

16:49 <enebo[m]> I actually don't know of a scenario where that could happen either but I can believe it could

16:49 <headius[m]> also we scan for code range in #intern which rejects that forced UTF-16 string

16:49 <headius[m]> MRI does not reject it

16:49 <headius[m]> they appear to happily make symbols that are CR_BROKEN

16:50 <headius[m]> which is... weird

16:50 <enebo[m]> which intern?

16:50 <headius[m]> String#intern

16:50 <headius[m]> I added another comment after the one linked above

16:51 sagax has joined #jruby

16:51 <enebo[m]> I admit I have not looked in a while but this surprises me

16:52 <headius[m]> I think if we review their logic again we'll be able to sort out which pieces go where

16:52 <enebo[m]> symbol in RubySymbol should be an id into the symbol table as a string which is always ISO-8859_1 bytes which will never fail to intern

16:52 <headius[m]> but they do not appear to check for CR_BROKEN on the way into symbol

16:52 <headius[m]> we do

16:52 <enebo[m]> I think we do have String entry points which maybe is not honoring that but I believe that is all coming from Java and should then all be valid UTF-16LE

16:53 <enebo[m]> Admittedly there are likely loose ends but in theory that string should not fail to intern

16:53 <headius[m]> this is just RubyString#intern for a bad UTF-16 string

16:53 <headius[m]> as in the linked issue's original repro

16:53 <enebo[m]> which I did not read :P

16:53 <headius[m]> if I remove that check, though, we still have the old behavior and are not distinguishing between the "ab" string forced to UTF-16 and the "ab" string that's just US-ASCII

16:54 <enebo[m]> ah I see so RubySymbol itself could be fine here with Broken

16:54 <headius[m]> yeah seems to be

16:54 <enebo[m]> it would just be a really messed up symbol that only this RubyString would be a path to

16:54 <headius[m]> and even if string contents are 7-bit, it's still keeping the UTF-16 and US-ASCII versions of symbol separate

16:54 <headius[m]> right

16:54 <enebo[m]> if you wanted to get it by that again (or itself obviously)

16:54 <headius[m]> obviously any other UTF-16 string should not collide with a US-ASCII string ever

16:55 <headius[m]> unless there's embedded nulls, which would get rejected other ways usually

16:55 <enebo[m]> so now having properly read what you said with it being RubyString.intern() it looks like we could almost just remove the coderangescan

16:55 <headius[m]> it seems so

16:55 <headius[m]> RubySymbol uses RubyString hashing?

16:55 <headius[m]> I mean the symbol table

16:56 <enebo[m]> oh yeah ok that is the other part of that issue right?

16:56 <headius[m]> yeah

16:56 <headius[m]> from what I can see all String#intern symbols just go through getSymbol(byteList) which does not consider encoding for hash

16:57 <headius[m]> CRuby explicitly uses rb_str_hash for it

16:58 <headius[m]> (which incidentally also does the hash DOS stuff, which we are not doing in the symbol table)

16:59 <headius[m]> our symbol hash appears to just be the iso-8859-1 bytes, in javaStringHashCode

16:59 <enebo[m]> tbh I am not sure I have much more invested in this conversation today. It seems like you are close to having figured this out already.

16:59 <enebo[m]> it is as far as I remember

16:59 <enebo[m]> which is why a RubyString will probably just find the broken CR symbol from a string

17:00 <headius[m]> yeah I can look into it but wanted to get your thoughts

17:00 <headius[m]> it shouldn't be hard to make logic match MRI and this issue has been dangling so I thought I'd knock it down

17:00 <headius[m]> on master

17:00 <enebo[m]> I think removing just that check will address the case of broken but I still believe two weird ISO compatible strings containing the 8th bit will not work

17:00 <enebo[m]> which is where that last comment of mine came from

17:01 <headius[m]> I'll try to align logic for 9.3 and see if I can kill this issue today

17:01 <enebo[m]> but 8bit 8859-11 and 8859-13 (made these up) will still not work

17:02 <enebo[m]> I don't think this is a very significant incompatibility but since we do not tuple encoding with those bytes I think that will still be broken

17:02 <enebo[m]> broken strings probably are actually fairly common

17:03 <headius[m]> yeah unsure about that case

17:03 <headius[m]> I think e.g. 11 and 13 may be compatible under 7bit so they wouldn't show up as 7-bit only if there were differing characters?

17:03 <enebo[m]> the issue as I see it is they will be identical 8859_1 hashes

17:04 <enebo[m]> but the 8bit values will mean different things

17:04 <headius[m]> but if both had same chars and were all under 7bit they would collide in MRI too I think

17:04 <enebo[m]> no but they are not only 7bit

17:04 <headius[m]> if they're not 7bit MRI will include encoding index in hash

17:04 <enebo[m]> they contain something 8bit which is the same byte but a different character

17:04 <enebo[m]> yeah MRI will but we won't

17:04 <headius[m]> right, which is what I will try to fix

17:05 <enebo[m]> ok

17:05 <headius[m]> basically just has to propagate some string hash logic into symbol table
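
The rule discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not JRuby's or CRuby's actual code: `Enc` and `SymbolKey` are hypothetical names, the enum ordinal stands in for an "encoding index", and `Arrays.hashCode` stands in for the real string hash. The key collapses to US-ASCII only when the encoding is ASCII-compatible and every byte is 7-bit; otherwise the encoding stays part of both equality and the hash.

```java
import java.util.Arrays;

// Hypothetical encodings; ordinal() plays the role of an "encoding index".
enum Enc {
    US_ASCII(true), UTF_8(true), ISO_8859_11(true), ISO_8859_13(true), UTF_16LE(false);

    final boolean asciiCompatible;

    Enc(boolean asciiCompatible) { this.asciiCompatible = asciiCompatible; }
}

// Hypothetical symbol-table key: the bytes plus the encoding they are read in.
final class SymbolKey {
    private final byte[] bytes;
    private final Enc enc;

    SymbolKey(byte[] bytes, Enc enc) {
        this.bytes = bytes.clone();
        // Collapse to US-ASCII only if the encoding is ASCII-compatible and
        // every byte is 7-bit; otherwise keep the encoding as given.
        this.enc = (enc.asciiCompatible && isSevenBit(bytes)) ? Enc.US_ASCII : enc;
    }

    static boolean isSevenBit(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) return false; // high bit set
        }
        return true;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SymbolKey)) return false;
        SymbolKey other = (SymbolKey) o;
        return enc == other.enc && Arrays.equals(bytes, other.bytes);
    }

    @Override
    public int hashCode() {
        int h = Arrays.hashCode(bytes);
        // Mix the encoding index in for anything not collapsed to US-ASCII.
        return enc == Enc.US_ASCII ? h : 31 * h + enc.ordinal();
    }
}
```

With a key like this, "ab" in UTF-8 and "ab" in US-ASCII share one entry, "ab" forced to UTF-16LE gets its own entry, and the same 8-bit byte under two different ISO-8859 encodings no longer collides.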

17:05 <enebo[m]> that will be great to fix it

17:05 <headius[m]> might need to store CR

17:05 <headius[m]> I'll see

17:05 <enebo[m]> CR can be calculated but it forces a scan so it is probably nice to prop it

17:06 <enebo[m]> I still sort of wish it was in bytelist

17:06 <enebo[m]> going up to start some lunch

17:07 <headius[m]> yeah

17:45 <headius[m]> patch got bigger than expected but I have it propagating CR into Symbol now and using String hashing for the table

17:50 <headius[m]> it seems like most places that could pass a stored CR through already do... and parser already saves RubySymbol in SymbolNode

17:50 <headius[m]> there are a few places, mostly coming from Java that only have a String or ByteList in hand

18:07 <headius[m]> gotta get used to this new icon for riot/element

18:07 * headius[m] sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/VNhhdDLxGVdMuOurtQkIDGiZ >

18:21 <headius[m]> enebo: https://github.com/jruby/jruby/pull/6341

18:21 <headius[m]> bbiab

18:21 <enebo[m]> ok

18:22 <headius[m]> if we ever move CR into ByteList it can just become CodeRangeable and work with this change

18:32 <kares[m]> https://github.com/alibaba/dragonwell8_jdk/pull/18 coroutines used in production ;)

19:16 <headius[m]> cool

19:16 <headius[m]> need to spend a little time making a loom backend for fibers

19:17 <headius[m]> shouldn't be too difficult

19:18 ruurd has quit [Quit: ZZZzzz…]

19:37 <headius[m]> hmm ok symbol patch isn't quite right

19:58 <headius[m]> aha, unmarshaling a symbol is not properly encoding-aware

19:58 <headius[m]> MARSHAL

20:00 ruurd has joined #jruby

20:11 travis-ci has joined #jruby

20:11 travis-ci has left #jruby [#jruby]

20:11 <travis-ci> jruby/jruby (ir_concurrency:9a2d272 by Thomas E. Enebo): The build was broken. https://travis-ci.org/jruby/jruby/builds/711222898 [159 min 39 sec]

20:12 <enebo[m]> wot

20:12 <headius[m]> shame

20:13 <enebo[m]> lol if this is my last commit what am I watching in that other tab :)

20:13 <headius[m]> oh concurrent-ruby

20:13 <headius[m]> that suite has been marked allow-failures on master because there's a bad test

20:14 <headius[m]> yeah it's the same one

20:14 <headius[m]> https://github.com/ruby-concurrency/concurrent-ruby/issues/862

20:15 <headius[m]> bad thread accounting

20:15 <enebo[m]> well I am not even concerned about that although it is good to know it was from that

20:15 <enebo[m]> I had some other tab open with about 40% of the tests green and the rest yellow

20:15 <enebo[m]> If I click on the commit it appears to be all my last changes

20:15 <enebo[m]> What am I watching?

20:16 <enebo[m]> OH I SEE

20:16 <headius[m]> PRs done against origin test twice

20:16 <enebo[m]> It does one build for the commit and one for the PR

20:16 <headius[m]> not sure how to prevent that but still allow branches to CI

20:16 <enebo[m]> ok makes sense too but I thought I was a bad luck time traveller for a moment

20:16 <headius[m]> yeah it's annoying

20:17 <headius[m]> the push (branch) CI always runs first, fwiw

20:17 <enebo[m]> so I may be as done as I can be without rewriting it to be the right design

20:17 <headius[m]> hot crackers

20:17 <enebo[m]> but I do have one more thing to do (well two)

20:17 <enebo[m]> 1. Make sure nothing has stopped compiling and is falling back to interp

20:17 <enebo[m]> 2. Sort of related but make sure the same opts are happening before/after

20:18 <headius[m]> ugh this symbol unmarshal logic blows

20:18 <enebo[m]> #2 proves #1 already but it is not super simple...I

20:19 <headius[m]> stopped compiling as in couldn't JIT to bytecode?

20:21 <enebo[m]> headius: stopped compiling as it hit an error and bailed out

20:22 <enebo[m]> I think I can trivially just change test:mri:core:jit probably to do verbose logging on jit and see what shows up

20:22 <enebo[m]> but I am not super concerned about this as much as determining whether the passes actually did stuff like eliminate dyn scopes and stuff like that

20:23 <enebo[m]> even examining the flag does not tell me anything more than it might have happened

20:25 <headius[m]> oh like you aren't sure jit is still doing what it should after your changes

20:25 <headius[m]> compiler specs are pretty unforgiving so that's one assurance

20:25 <headius[m]> if something fails in there now it's broken in full or jit logic for other stuff too

20:26 <headius[m]> it doesn't have any fallback

20:26 <enebo[m]> headius: do spec:compiler actually depend on all passes running a particular way?

20:27 <headius[m]> they don't check anything related to passes but they run jit like jit would run

20:27 <headius[m]> they are just testing that jit works and new code behaves as expected

20:27 <enebo[m]> oh sure so it probably is a better indicator of #1 than #2

20:27 <headius[m]> yeah

20:27 <enebo[m]> ok

20:28 <enebo[m]> I was freaked this morning when I uncovered the getImplementationClass NPE

20:28 <headius[m]> deeper testing could be done against full to make sure the code structure is as expected

20:28 <enebo[m]> I was thinking oh crap we have some stuff not compiling that precedes all this

20:28 <enebo[m]> funny it was literally just the log statement that was broken (as far as we know)

20:28 <headius[m]> I guess neither full nor jit would be "easy" to verify because we're basically looking at the "assembly" after and making sure it's what's expected

20:29 <enebo[m]> well if you know JIT is eliminating a scope then that would be cool

20:30 <enebo[m]> I think largely that and scopes replaced with temps are the two main opts I want to make sure have not changed

20:30 <enebo[m]> I don't want any of them to stop but those are bigger ones from a changed perf perspective

20:31 <enebo[m]> I cannot do before/after on all flag state because I made all scopes get flags whereas we used to not do it for several scopes

20:34 <headius[m]> yeah it would be easier to do against full

20:34 <headius[m]> once it's in bytecode all I can really inspect is the bytecode

20:35 <enebo[m]> yeah I think my plan is to run same thing with same printlns before/after and those prints will say I removed dyn scope for X

20:35 <headius[m]> we have some pass listener framework I've never tried to use

20:35 <headius[m]> could possibly hook into that

20:35 <enebo[m]> I think if there is no interleaved output I can just sort and compare

20:36 <enebo[m]> yeah it exists largely to report what was run but not how it was run

20:37 <enebo[m]> So I can verify the same passes are run but I already know they are

20:38 <enebo[m]> I maybe will think a little about this though. It would be nice to report what happened in a pass. compilerpasslistener could be instrumental in making something we could regularly test

20:38 <enebo[m]> maybe some void actionPerformed(pass, fix, data...) or something

20:39 <headius[m]> I think marshal should be registering ByteList for symbols

20:39 <headius[m]> MRI appears to do something similar before they actually create a symbol object

20:39 <enebo[m]> Although it is not perhaps a great idea for next 24 hours in that I need to apply it to older code too ... same could be said of printlns but new code uses fic and old uses IRScope so println probably is simpler

20:39 <headius[m]> that's the only way I can untangle this lifecycle

20:40 <headius[m]> yeah actionPerformed kind of callback would be good

20:40 <headius[m]> maybe throw together an enum of actions they take and just feed those out if there's a listener

20:40 <headius[m]> we can improve it later

20:40 <enebo[m]> yeah I think it could also just be a log as well but as a listener testing could register that and pull it into Ruby

20:41 <headius[m]> right

20:41 <enebo[m]> oh I plan on it being string data largely

20:41 <enebo[m]> based on my 3 minutes of thinking about it

20:41 <enebo[m]> but enum has disadvantage of being a pain to extend and if it will end up as Ruby we do not gain a lot from it

20:42 <headius[m]> just a more structured way to log them really

20:43 <headius[m]> we have no consumers of it so we could reevaluate the enum later too

20:43 <enebo[m]> yeah I think at this point I also need to evaluate what an action really is for each thing
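
The callback idea floated above, combined with the earlier "sort and compare" plan, might look something like the sketch below. Everything in it is hypothetical: `PassActionListener`, `RecordingListener`, and the string-based action format are invented for illustration and are not part of JRuby's existing pass listener code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical callback, loosely following "actionPerformed(pass, fix, data...)".
interface PassActionListener {
    void actionPerformed(String pass, String action, Object... data);
}

// Records each reported action as one line of text so a test can inspect it.
final class RecordingListener implements PassActionListener {
    private final List<String> lines = new ArrayList<>();

    @Override
    public void actionPerformed(String pass, String action, Object... data) {
        StringBuilder sb = new StringBuilder(pass).append(": ").append(action);
        for (Object d : data) {
            sb.append(' ').append(d);
        }
        lines.add(sb.toString());
    }

    // Sorted copy, so two runs compare equal even if actions were
    // reported in a different order.
    List<String> sortedLines() {
        List<String> copy = new ArrayList<>(lines);
        Collections.sort(copy);
        return copy;
    }
}
```

A before/after check would register one listener per run and compare the two `sortedLines()` results; keeping the payload as strings matches the point that an enum would be harder to extend.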

20:51 ur5us has joined #jruby

20:57 <headius[m]> well this is wacky but it's working

21:22 <headius[m]> enebo: I've finished additional fixes for the symbol encoding thing at https://github.com/jruby/jruby/pull/6341

21:22 <headius[m]> the marshaling rework puts us more in line with CRuby as far as the workflow of getting a symbol out of the marshal stream

21:23 <headius[m]> additionally I had to remove all "fast" paths from a java.lang.String to Symbol because they were using iso-8859-1 hash... the only way to get a symbol now is via a ByteList, so the only "fast" way is to keep the ByteList (and ideally a CR) on hand

21:24 <headius[m]> we will want to audit all newSymbol/getSymbol(java.lang.String) paths and try to find ways to propagate a ByteList through

21:24 <headius[m]> some like method_missing and const_missing currently use a Java String and will now be creating intermediate ByteList along the way

21:25 <headius[m]> it might be possible to restore the "fast" paths by doing all the hash calculation from the String bytes as though they have been decoded to UTF-8... I will take a quick look to see if that's possible

21:25 <headius[m]> I don't know if it would be more efficient than just creating the UTF-8 ByteList and using that, but it may be possible to eliminate the alloc

21:31 <headius[m]> I've updated the issue with a description of the two extra pieces and will look into fast pathing some of this now

22:19 <headius[m]> heh

22:20 <headius[m]> fun with optimization... I have a method that can take a Java String and calculate both Ruby hash and code range from it without any allocation, using our existing ByteBuffer cache
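
One way to get a hash and a 7-bit flag from a Java String without allocating a byte array is to UTF-8 encode each code point on the fly. The sketch below takes that route as an assumption. It is not the method described above (which uses a ByteBuffer cache), `Utf8StringHash` is a hypothetical name, and the hash is the simple polynomial used by `Arrays.hashCode(byte[])` rather than Ruby's string hash. It also assumes well-formed strings with no unpaired surrogates.

```java
// Hypothetical helper: hashes the UTF-8 form of a String without allocating it.
final class Utf8StringHash {
    // Returns the hash in the low 32 bits and a 7-bit flag in bit 32.
    static long hashAndSevenBit(String s) {
        int h = 1; // same seed as Arrays.hashCode(byte[])
        boolean sevenBit = true;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 0x80) {
                h = mix(h, cp); // single-byte ASCII
            } else {
                sevenBit = false;
                if (cp < 0x800) {
                    h = mix(h, 0xC0 | (cp >> 6));
                } else if (cp < 0x10000) {
                    h = mix(h, 0xE0 | (cp >> 12));
                    h = mix(h, 0x80 | ((cp >> 6) & 0x3F));
                } else {
                    h = mix(h, 0xF0 | (cp >> 18));
                    h = mix(h, 0x80 | ((cp >> 12) & 0x3F));
                    h = mix(h, 0x80 | ((cp >> 6) & 0x3F));
                }
                h = mix(h, 0x80 | (cp & 0x3F)); // final continuation byte
            }
        }
        return ((sevenBit ? 1L : 0L) << 32) | (h & 0xFFFFFFFFL);
    }

    // Same step as Arrays.hashCode(byte[]): the byte is sign-extended.
    private static int mix(int h, int b) {
        return 31 * h + (byte) b;
    }
}
```

For well-formed input, the low 32 bits equal `Arrays.hashCode(s.getBytes(StandardCharsets.UTF_8))`, so a String-based lookup and a byte-based lookup would land in the same bucket.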

22:46 nirvdrum has quit [Ping timeout: 246 seconds]

23:12 <headius[m]> yay

23:12 <headius[m]> I have fast String to Symbol logic working with minimum allocation

23:12 <headius[m]> no reason not to go forward with this now