#jruby on 2020-05-29 — irc logs at freenode.irclog.whitequark.org

2019-08-12 18:53 ChanServ changed the topic of #jruby to: Get 9.2.8.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

00:08 bga57 has joined #jruby

00:15 bga57 has quit [Ping timeout: 264 seconds]

00:16 bga57 has joined #jruby

00:39 ur5us has quit [Ping timeout: 260 seconds]

00:58 ur5us has joined #jruby

03:07 sagax has quit [Remote host closed the connection]

03:14 sagax has joined #jruby

04:49 _whitelogger has joined #jruby

05:01 ur5us has quit [Ping timeout: 260 seconds]

05:19 _whitelogger has joined #jruby

06:00 nirvdrum has quit [Ping timeout: 272 seconds]

08:28 bga57 has quit [Ping timeout: 256 seconds]

08:30 bga57 has joined #jruby

08:31 ur5us has joined #jruby

09:57 ur5us has quit [Ping timeout: 260 seconds]

10:03 drbobbeaty has joined #jruby

10:16 cyberarm has quit [Quit: Idle for 30+ days]

10:19 voloyev[m] has quit [Quit: Idle for 30+ days]

10:22 simi[m] has quit [Quit: Idle for 30+ days]

10:22 rwilliams[m] has quit [Quit: Idle for 30+ days]

10:32 vitae[m] has left #jruby ["Kicked by @appservice-irc:matrix.org : Idle for 30+ days"]

12:41 travis-ci has joined #jruby

12:41 travis-ci has left #jruby [#jruby]

12:41 <travis-ci> jruby/jruby (master:72146dd by Charles Oliver Nutter): The build is still failing. https://travis-ci.org/jruby/jruby/builds/692534911 [136 min 6 sec]

13:17 nirvdrum has joined #jruby

16:03 xardion has quit [Remote host closed the connection]

16:03 xardion has joined #jruby

19:42 <headius[m]> enebo: hey

19:43 <headius[m]> I'm going to look into that master failure but I pushed this last night and it shockingly passes all tests: https://github.com/jruby/jruby/pull/6249

19:43 <headius[m]> wanted to get your opinion... look at both commits separately

19:46 <headius[m]> it makes multibyte expand_path work properly and miraculously does not break any tests

20:07 <headius[m]> master failures make no sense

20:12 travis-ci has joined #jruby

20:12 travis-ci has left #jruby [#jruby]

20:12 <travis-ci> jruby/jruby (master:72146dd by Charles Oliver Nutter): The build was canceled. https://travis-ci.org/jruby/jruby/builds/692534911 [140 min 20 sec]

20:20 <enebo[m]> looking

20:20 <headius[m]> it's part of fixing a bug in webrick that uses temp files and multibyte paths forced to ASCII-8BIT

20:21 <headius[m]> our expand_path mangles the MBC along the way because it tries to treat the individual bytes as characters in Java and then encodes them separately on the way out again

20:21 <headius[m]> with this it leaves the bytes as is and they go back out without mangling
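
A rough Ruby sketch of the mangling described above, not JRuby's actual Java code. It assumes the failure amounts to reading each byte as its own ISO-8859-1 character and then writing those characters back out as UTF-8, and that the fix writes them back out one byte per character. The path and variable names are made up for illustration.

```ruby
# A multibyte path forced to binary, similar to what webrick does internally.
raw = "/tmp/caf\u00e9".b                 # "é" is the two bytes C3 A9 in UTF-8

# Stand-in for a Java String built by treating every byte as one character.
chars = raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

# Mangled: each high byte is re-encoded separately and grows to two bytes.
mangled = chars.b

# Raw round trip: one character back to one byte, so the bytes are unchanged.
restored = chars.encode(Encoding::ISO_8859_1).b

puts raw.bytes.inspect
puts mangled.bytes.inspect
puts restored.bytes.inspect
```

The point of the sketch is only that an ISO-8859-1 decode followed by an ISO-8859-1 encode is lossless for arbitrary bytes, while mixing charsets on the way in and out is not.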

20:21 <headius[m]> the other half I'm attempting is to do this in fnmatch as well

20:22 <headius[m]> hmm though fnmatch mostly works with bytes already

20:23 <enebo[m]> Does ISOCoder mean iso8859_1?

20:23 <headius[m]> yeah

20:23 <headius[m]> ISOCoder? you mean decodeISO?

20:24 <enebo[m]> decodeISO calls ISOCoder in thread local

20:24 <headius[m]> oh, yeah ok

20:24 <enebo[m]> ISO was just a guess it was for 8859_1

20:24 <headius[m]> yeah to avoid constructing a new charset decoder every time, there's something similar for UTF-8

20:25 <enebo[m]> yeah I was just hung up on the name...I guess I realized it had to be 8859_1 based on the byte + sign chopper

20:25 <enebo[m]> we have no convention on short hand but I see a second obvious reference to this being that as well

20:27 <enebo[m]> So first commit will just always make any path built internally to be 8859_1 since Java String will just barf those out as-is as char * to native calls

20:27 <headius[m]> yeah exactly

20:27 <headius[m]> I expected it would break something in URLs or properly encoded strings or whatever but it just works

20:28 <headius[m]> of course if those encodings weren't ASCII-compat they'd break, it's just not tested and super uncommon anyway

20:28 <headius[m]> so the second commit limits it only to cases where the incoming paths are already "binary" and we have no encoding to use

20:28 <headius[m]> both pass the case in question

20:30 <enebo[m]> For urls must be transcodable to UTF-something so ASCII-only is fine but I think any random garbage will work for files and dirs

20:32 <enebo[m]> ok so first commit is totally golden and second one is safe I think

20:33 <enebo[m]> second one could probably be better for cases where it is an encoding which is valid for the filesystem BUT java has no Charset for it. but that is probably very rare

20:33 <headius[m]> I had considered a couple other triggers for this binary logic

20:34 <headius[m]> yeah I don't know how often we run into that now

20:34 <headius[m]> I did an audit years ago to try to make sure all encodings would at least use the right name for the charsets they go with

20:34 <enebo[m]> if Encoding has a check in jcodings then it could just be a second check before the toString else

20:34 <headius[m]> and I think there's a hard error in getCharset now that we have not seen happen

20:35 <enebo[m]> but I really think this case seems very unusual so it may not exist in reality :)

20:35 <enebo[m]> or it exists in reality but that single user has not made a bug for it :)

20:35 <headius[m]> yeah I guess I'm coming down on the side of making this as narrow as possible

20:35 <headius[m]> so if it's ASCII-8BIT clearly do this

20:36 <enebo[m]> This came about from invalid encoded strings translating to a file path right?

20:36 <headius[m]> but maybe also CR_UNKNOWN or CR_BROKEN when ascii-compat = true?
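
The two triggers under discussion can be sketched in Ruby terms as follows. This is an illustration only: both method names are hypothetical, and Ruby code cannot observe an internal CR_UNKNOWN code range, so `valid_encoding?` stands in for the CR_BROKEN check.

```ruby
# Narrow trigger: the path is explicitly tagged as binary (ASCII-8BIT).
def binary_path?(path)
  path.encoding == Encoding::ASCII_8BIT
end

# Wider trigger floated in the chat: an ASCII-compatible encoding whose
# bytes are not valid for that encoding.
def broken_ascii_compatible?(path)
  path.encoding.ascii_compatible? && !path.valid_encoding?
end

puts binary_path?("/tmp/caf\u00e9".b)        # binary copy of a UTF-8 path
puts binary_path?("/tmp/caf\u00e9")          # properly tagged UTF-8 path
puts broken_ascii_compatible?("/tmp/caf\xE9") # lone E9 is not valid UTF-8
```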

20:36 <headius[m]> yeah internally webrick forces many path strings to binary

20:36 <enebo[m]> yeah I definitely have no problems with how this was constrained. I can just see one more unlikely corner

20:36 <headius[m]> I assume to allow %AB like characters to go through without getting mangled

20:36 <enebo[m]> and fwiw that corner was already broken

20:37 <headius[m]> the case in question encodes those weird question marks using URL escaping... we end up trying to decode it as valid UTF-8 bytes and it breaks

20:38 <enebo[m]> heh

20:38 <enebo[m]> yeah so long as it ends up as binary your fix definitely fixes it somewhat cleverly

20:38 <enebo[m]> 8859_1 definitely is a way of cheating the Java charset

20:39 <enebo[m]> but honestly if it is 8bit it has to be treated no differently than a char * (e.g. garbage)

20:39 <headius[m]> yeah

20:40 <headius[m]> that's the general theme I'm seeing in these webrick failures.. including the ENV thing where we need to be able to just use raw char*

20:40 <headius[m]> https://github.com/jruby/jruby/issues/6248

20:40 <enebo[m]> Having been studying how Rust handles Strings and Paths it is amusing how much code just uses char * and hopes for the best

20:40 <headius[m]> that one I can't fix without an overhaul of env vars that doesn't use System.getenv anymore

20:40 <headius[m]> yeah it is like that in CRuby and I didn't really get why until now

20:40 <enebo[m]> It definitely also explains why they are making tests of inconsistent bytes

20:41 <headius[m]> you can put pretty much any char* into an env var and it needs to be treated as bytes and not characters

20:41 <enebo[m]> this is not all on MRI either...they do use char * but so does everything on the filesystem too

20:41 <headius[m]> or put another way it's up to the consumers of the env var to know/decide what encoding the bytes are in

20:41 <headius[m]> but JDK getenv uses String so they have to decode
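
A small Ruby illustration of "the consumer decides the encoding". It does not touch ENV itself; the byte value is just an example, and the variable names are invented. The same two bytes are one character under UTF-8 and two under ISO-8859-1, and neither reading changes the bytes.

```ruby
# Two raw bytes, as they might arrive from an environment variable.
value = "\xC3\xA9".b

# Each consumer tags the same bytes with the encoding it expects.
as_utf8   = value.dup.force_encoding(Encoding::UTF_8)
as_latin1 = value.dup.force_encoding(Encoding::ISO_8859_1)

puts as_utf8.length     # character count under UTF-8
puts as_latin1.length   # character count under ISO-8859-1
puts as_utf8.bytes == as_latin1.bytes
```

An API that hands back an already decoded string, as described above for the JDK's getenv, has to pick one of these readings before the consumer gets a say.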

20:42 <enebo[m]> yeah to the env stmt

20:42 <headius[m]> so it's basically the same problem as paths

20:42 <enebo[m]> it is same basic garbage data problem

20:42 <headius[m]> yeah

20:42 <headius[m]> MRI handles it better because they don't handle it at all

20:42 <enebo[m]> I suppose anything that passes char * or gets char * will do this

20:42 <enebo[m]> We just have not fixed all cases because MRI has not written tests for it yet

20:43 <headius[m]> this may fix a few things, I haven't checked yet

20:43 <headius[m]> I know we've had some multibyte expand_path issues in the past but this is a pretty specific weird case

20:43 <enebo[m]> yeah it will be interesting to see if you only experience positive behavior after this change or not :)

20:43 <headius[m]> 99% of the time we're dealing with properly encoded strings there

20:43 <enebo[m]> It seems reasonable to me

20:43 <headius[m]> ok

20:43 <headius[m]> I will sort out if there's any additional narrowing needed and squash these for merge

20:44 <enebo[m]> well in most modern languages you do work with properly encoded strings

20:44 <enebo[m]> Ruby decided to combine binary data with their string class

20:44 <enebo[m]> but it has the benefit of working seamlessly with any C apis

20:44 <enebo[m]> in that C apis have no constraints on string contents other than termination

20:45 <headius[m]> this may advise fixes in other places where we have to use java.lang.String too

20:45 <enebo[m]> My proof that this was a design mistake is the number of bug fixes/CVEs around finding APIs which accept strings with a \0 only part way through it

20:46 <headius[m]> could be helpful in any cases that are ascii-compat and just need to juggle around substrings on known boundaries, like / characters

20:46 <headius[m]> yeah that is a big problem for them in general too because they have to zero terminate any string content they pass to system functions

20:46 <headius[m]> so they've got \0 leaking into everything all the time
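
A short Ruby sketch of the embedded \0 problem mentioned above. The file name is invented. A Ruby string can hold a NUL in the middle, while a NUL-terminated C API would stop reading at it; CRuby guards its file APIs by rejecting such paths with an ArgumentError.

```ruby
name = "safe.txt\0.rb"

# What a NUL-terminated C function would see: everything before the first \0.
c_view = name[0, name.index("\0")]

# Ruby's file APIs refuse the path instead of silently truncating it.
error = begin
  File.open(name)
  nil
rescue ArgumentError => e
  e
end

puts name.bytesize
puts c_view
puts error.class
```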

20:46 <enebo[m]> Maybe we should rename things referring to ISO as Binary

20:46 <headius[m]> hmm

20:47 <headius[m]> I'm not thrilled with that naming but it's better than ISO

20:47 <enebo[m]> ISO describes how we make a binary string

20:47 <enebo[m]> but I was just thinking about intent of making a pointer of bytes that we have no idea what they might be

20:48 <headius[m]> decodeASCII would be ok but MRI muddied the waters with US-ASCII and ASCII-8BIT

20:48 <enebo[m]> In fact (and I am not suggesting this) it has always been a problem I have had with ASCII vs USASCII as ASCII is really just BINARY

20:48 <headius[m]> though I suppose technically ASCII is only 0..127

20:48 <enebo[m]> decode8Bit

20:48 <headius[m]> yeah

20:48 <headius[m]> decodeOctets

20:48 <enebo[m]> yeah I am ok with 8bit or binary

20:49 <headius[m]> I think I prefer binary to 8bit

20:49 <enebo[m]> yeah my top preference so far is binary

20:49 <enebo[m]> it is more or less not known and ascii-8bit is commonly referred to as binary

20:50 <headius[m]> at least in Ruby world

20:50 <enebo[m]> well yeah but this is in our ruby impl so I think that fits

20:50 <enebo[m]> I am mostly thinking about how I will read the code in 6 months or longer

20:50 <headius[m]> it's fairly clear

20:50 <headius[m]> blob of bytes

20:50 <enebo[m]> decodeAsBinary or decodeAsRaw perhaps?

20:51 <enebo[m]> Raw is not a bad name either

20:51 <headius[m]> decodeRawBytes

20:51 <enebo[m]> yeah seems nice

20:51 <enebo[m]> I do not think I could misinterpret what that means in the future

20:51 <headius[m]> me neither

21:22 <headius[m]> enebo: I went with decodeRaw because the methods in question already take byte[] as a parameter.. decodeRawBytes seemed redundant

21:23 <enebo[m]> ah ok

21:25 <i8her8oat[m]> What would you do to clone a property? just <property.clone()> ?

21:26 <headius[m]> enebo: do you know? this is in the context of jrubyfx

21:27 <enebo[m]> a property in fxml?

21:28 <i8her8oat[m]> hm I think it has to be homemade, since JavaFX doesn't provide anything related to cloning SimpleDoubleProperties

21:29 <enebo[m]> https://docs.oracle.com/javafx/2/api/javafx/beans/property/SimpleDoubleProperty.html

21:29 <i8her8oat[m]> prob something like <clone = double_property(self, "clone", old.value)>

21:30 <enebo[m]> that will just make a property for clone with a value right?

21:30 <i8her8oat[m]> yes

21:30 <enebo[m]> I have not used properties in years

21:30 <i8her8oat[m]> : /

21:31 <enebo[m]> That should just make an instance of that and be a short-hand for writing it out long hand

21:31 <enebo[m]> If for some reason the DSL syntax lets you down you can always write it out long hand: javafx.beans.property.SimpleDoubleProperty.new(self, "clone", old.value)

21:32 <enebo[m]> err probably Java:: at the front of that depending on whether we load javafx as a method name to Kernel/BasicObject

21:35 <i8her8oat[m]> DSL is consistent when loaded from a Java program. It doesn't work on block building: timeline do ... animate ... end doesn't work. I suppose vbox() do ... label() ... end doesn't either, so I have to go with the <getChildren().add()> way

21:36 <enebo[m]> i8her8oat: ah yeah that is an issue where our auto adding is not correct

21:36 <enebo[m]> in some cases it is just missing but it is a massive API

21:36 <i8her8oat[m]> yeah you can work around pretty easily

21:37 <enebo[m]> if you decipher our code there is a place where for each type we specify what is auto added

21:37 <enebo[m]> I have not looked in quite a while but it is a big list of classes and it will specify how to add child elements for each time

21:37 <enebo[m]> err s/time/type/

21:38 <enebo[m]> I'm sorry it has just been a long time since I looked and byteit101 spent a lot of time after me expanding things out

21:39 <i8her8oat[m]> maybe the precompiled.rb file?

21:39 <i8her8oat[m]> for <Java::JavafxSceneControl::TableView>, there is

21:40 * i8her8oat[m] sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/SQzqjLvqXWcosHIVSoMHZuTN >

21:40 <i8her8oat[m]> then add is

21:40 <i8her8oat[m]> def add(value)

21:40 <i8her8oat[m]> get_columns() << value

21:40 <i8her8oat[m]> end

21:41 <headius[m]> enebo: the next case I ran into is when a binary string gets into File.directory?, where we ultimately use another java String and java file APIs

21:41 <enebo[m]> precompiled is generated if I remember right

21:41 <headius[m]> in this case it seems like assuming the chars are at least encoded like system paths is the best we can do

21:41 <headius[m]> so if it's binary I'm using default system charset, rather than decoding it as ISO-8859-1 characters

21:41 <headius[m]> it seems to work ok for this case
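
A hedged Ruby analogue of the File.directory? approach just described. The method name is hypothetical, and Ruby's "filesystem" encoding is used here as a stand-in for the default system charset on the Java side. Only binary paths are retagged; the bytes are never changed.

```ruby
# Retag a binary path with the filesystem encoding; leave other paths alone.
def decode_for_filesystem(path)
  return path unless path.encoding == Encoding::ASCII_8BIT
  path.dup.force_encoding(Encoding.find("filesystem"))
end

binary  = "/tmp/caf\u00e9".b
decoded = decode_for_filesystem(binary)

puts decoded.encoding
puts decoded.bytes == binary.bytes
```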

21:42 <i8her8oat[m]> enebo: yes, it is

21:42 <enebo[m]> i8her8oat: perhaps I am thinking of updating exts.yml and then regenerating

21:42 <enebo[m]> originally I just hard-coded this so this has changed since I worked on it

21:43 <i8her8oat[m]> if you ever feel like regenerating the file, could you link it? I don't generate it every time I run my app

21:44 <enebo[m]> This has to be documented somewhere...hmm

21:45 <enebo[m]> i8her8oat: I know this is not a great answer but I am going to start dinner and hide in my basement after that :) https://github.com/jruby/jrubyfx/wiki/Developer-Overview

21:46 <enebo[m]> It does explain each file and it might explain how regeneration works

21:46 <i8her8oat[m]> ok great

21:47 <enebo[m]> i8her8oat: since we have not released and fx changes like every 18 months in some weird way this may be useful for you to 'rake reflect' anyways

21:47 <enebo[m]> not released in a long time

21:47 <enebo[m]> okies...good luck...going quiet

23:46 ur5us has joined #jruby