#jruby on 2020-04-03 — irc logs at freenode.irclog.whitequark.org

2019-08-12 18:53 ChanServ changed the topic of #jruby to: Get 9.2.8.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

00:18 ur5us has quit [Ping timeout: 240 seconds]

00:30 ur5us has joined #jruby

01:08 <headius[m]> it's merged, welcome to Ruby 2.6 support on master

01:31 dhoc has quit [Quit: dhoc]

01:41 dhoc has joined #jruby

02:29 dhoc has quit [Quit: dhoc]

03:29 <byteit101[m]> Should I use `java_annotation 'annotation'; java_signature 'void myMethod()'` or `java_signature '@annotation void myMethod()'`? Both seem to be partially supported...

03:46 <headius[m]> hmm I think enebo wrote the parser for that so I'm not sure

03:46 <headius[m]> I didn't know both of those work :-)

03:53 <byteit101[m]> Yea, just doing a git blame on NoMethodErrors and I see enebo is to blame :-)

03:54 <byteit101[m]> The former works only for jrubyc, the latter works only for become_java! on my branch

03:55 <byteit101[m]> The former saves nothing for become_java, the latter throws up with ` @annotationpublic void myMethod() {` for jrubyc (note the lack of space. will file a bug)

04:06 <headius[m]> hmm I wonder what kind of tests we have for this stuff

04:27 <byteit101[m]> minimal. annotation parameters are an explosion of exceptions

04:27 <byteit101[m]> Though `@javax.annotation.Resource(description=@java.lang.Deprecated("Testing"))` parses, which I didn't know was possible

04:28 <byteit101[m]> javac stuff and non-parameterized annotations seem well enough tested though

07:46 ur5us has quit [Ping timeout: 252 seconds]

07:52 ur5us has joined #jruby

08:42 shellac has joined #jruby

09:13 ur5us has quit [Ping timeout: 256 seconds]

12:19 shellac has quit [Ping timeout: 256 seconds]

16:03 xardion has quit [Remote host closed the connection]

16:05 shellac has joined #jruby

16:08 xardion has joined #jruby

16:31 bga57 has quit [*.net *.split]

16:31 lanceball has quit [*.net *.split]

16:31 bga57 has joined #jruby

16:32 lanceball has joined #jruby

16:37 shellac has quit [Ping timeout: 240 seconds]

16:50 shellac has joined #jruby

17:11 shellac has quit [Quit: Computer has gone to sleep.]

17:56 <lopex> headius[m]: looking

18:02 <lopex> ONIG_ENCODING_UTF8->is_code_ctype(498, ONIGENC_CTYPE_UPPER, ONIG_ENCODING_UTF8)

18:02 <lopex> zero

18:02 <lopex> so it's not a jcodings issue

18:03 <lopex> enebo[m]: you here ?

18:04 <headius[m]> lopex: hmm

18:05 <lopex> headius[m]: we're missing quite a bit of logic for is_special_global_name

18:05 <lopex> and rb_sym_constant_char_p

18:05 <lopex> there's a lot of logic

18:06 <headius[m]> I don't doubt it

18:06 <headius[m]> but does something in there magically treat this as upper-case?

18:06 <lopex> rb_sym_constant_char_p - the name suggests that someone made it public for debug purposes at some point

18:07 <lopex> if (ISASCII(*name)) return ISUPPER(*name);

18:07 <lopex> but then

18:07 <lopex> yea!

18:07 <lopex> if (rb_enc_isctype(c, ctype_titlecase, enc)) return TRUE;

18:07 <lopex> from static const UChar cname[] = "titlecaseletter"; core range

18:07 <lopex> headius[m]: ^^

18:08 <headius[m]> titlecase

18:08 <lopex> headius[m]: but we're missing almost all unicode / encoding support for symbol name walking

18:08 <headius[m]> that's a new term to me!

18:09 <headius[m]> JDK provides isTitleCase

18:10 <lopex> but it's equally easy to do in jcodings

18:10 shellac has joined #jruby

18:10 <headius[m]> sure

18:10 <headius[m]> let's do that

18:12 <lopex> mri also caches ctype in static function variable there

18:15 <headius[m]> $ jruby -e 'Object.const_set("ǲ", 1); p Object.const_get("ǲ")'

18:15 <headius[m]> 1

18:15 <headius[m]> woot

18:15 <headius[m]> titlecase does work

18:15 <lopex> jdk one ?

18:16 <headius[m]> yeah

18:16 <lopex> what unicode version do jdks provide ?

18:17 <headius[m]> heh well this page says 6.2: https://docs.oracle.com/javase/8/docs/technotes/guides/intl/enhancements.8.html

18:17 <headius[m]> for Java 8

18:19 <headius[m]> looking for a way to query that

18:20 <lopex> in any case

18:21 <lopex> byte[] titleCase = "titlecaseletter".getBytes();

18:21 <lopex> int ctype = enc.propertyNameToCType(titleCase, 0, titleCase.length);

18:21 <lopex> enc.isCodeCType(498, ctype);

18:22 <lopex> headius[m]: also https://github.com/ruby/ruby/blob/master/symbol.c#L216

18:22 <lopex> they check both rb_enc_isupper and rb_enc_islower

18:22 <lopex> so that explains three variants

18:22 <lopex> that Dz i neither

18:22 <lopex> *is neither

18:24 <headius[m]> well it's elaborate but not too much code

18:24 <lopex> headius[m]: that's a fraction of the whole thing

18:25 <lopex> well like 1/3

18:25 <headius[m]> we have logic for these other things around

18:25 <lopex> I can't even see where it's encoding aware

18:26 <headius[m]> are you looking at IdUtil?

18:27 <lopex> no, at RubySymbol

18:27 <lopex> well, then there's some duplication

18:27 <headius[m]> RubySymbol.validConstantName has the checks for leading caps

18:28 <headius[m]> IdUtil is using Java strings but we don't keep unicode constant names as normal Java strings

18:31 <lopex> https://github.com/ruby/ruby/commit/f852af0e59899157ef695edccbe86d51fc04d23b

18:31 <lopex> indeed

18:32 <headius[m]> so this is the bulk of the logic

18:32 <lopex> it falls back to case folding

18:32 <lopex> for other encodings too

18:33 <headius[m]> lopex: I filed this about IdUtil and cleaning up this stuff: https://github.com/jruby/jruby/issues/6144

18:38 <lopex> headius[m]: what's the equivalent of https://github.com/ruby/ruby/blob/master/template/id.h.tmpl#L23 ?

18:59 <headius[m]> I don't think we have anything

18:59 <headius[m]> we could add it but we don't track what type of symbol a symbol is anywhere right now

18:59 <headius[m]> I have seen this code before too... it's not used much but it does avoid having to re-scan the symbol

19:00 <headius[m]> so it would be worth it for that

19:04 subbu is now known as subbu|lunch

19:09 <headius[m]> lopex: I think this will pass if I just add a titlecase check for now, using JDK isTitleCase

19:09 <headius[m]> but it would be very nice to get the symbol type logic in here

19:10 <headius[m]> are all isUpper characters also isTitleCase I wonder

19:11 <headius[m]> doesn't get the test passing without the case folding part it looks like

19:11 <headius[m]> but it gets past those first few odd chars

19:17 <lopex> headius[m]: and with jcodings ?

19:17 <lopex> it should be ok using jcodings

19:20 <lopex> ǲ is not upper

19:22 <headius[m]> well I was just going to add the Character.isTitleCase to this check

19:23 <headius[m]> but I'm trying to port nobu's case folding version now

19:23 <headius[m]> I don't get this static titlecase stuff in the middle

19:23 <lopex> but do you also check for islower and isupper ?

19:23 <lopex> for unicode ?

19:23 <lopex> you need to

19:23 <headius[m]> can it be titlecase and lower

19:23 <headius[m]> ?

19:24 <headius[m]> I assumed isupper || istitle would cover it

19:24 <lopex> it can be title and not upper

19:24 <lopex> just mimic what mri does

19:24 shellac has quit [Quit: Computer has gone to sleep.]

19:25 <headius[m]> ok this logic in the middle is just because they don't have a titlecase type?

19:25 <lopex> as in https://github.com/ruby/ruby/blob/master/symbol.c#L216

19:25 <lopex> title case is separate from lower and upper

19:25 <headius[m]> so they have to look it up from onigmo

19:25 <headius[m]> right ok they look up the ctype for "titlecaseletter" characters

19:25 <headius[m]> and save that in a static for future calls

19:25 <lopex> and islower and isupper as well

19:25 <headius[m]> so meh I don't need this

19:26 <lopex> you need all three
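The three checks discussed here (upper, then lower, then title) can be sketched with the JDK's own `Character` methods. This is only an illustration using JDK tables, not the jcodings or MRI implementation, and the class and method names are made up for the example. Code point 498 from the earlier messages is U+01F2.

```java
// Illustration only: classifies a code point using the JDK's Unicode tables,
// checking upper, then lower, then title, in the order discussed above.
// TitleCaseDemo and classify are hypothetical names, not JRuby or jcodings API.
class TitleCaseDemo {
    static String classify(int codePoint) {
        if (Character.isUpperCase(codePoint)) return "upper";
        if (Character.isLowerCase(codePoint)) return "lower";
        if (Character.isTitleCase(codePoint)) return "title";
        return "other";
    }

    public static void main(String[] args) {
        // The three DZ digraph variants: U+01F1, U+01F2 (decimal 498), U+01F3.
        int[] variants = {0x01F1, 0x01F2, 0x01F3};
        for (int cp : variants) {
            System.out.printf("U+%04X %s%n", cp, classify(cp));
        }
    }
}
```

The middle variant is neither upper nor lower, which is why an isupper || islower test alone misses it.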

19:27 <headius[m]> oh this case folding stuff goes at the tables directly

19:27 <headius[m]> boo

19:27 <lopex> the one for non unicode ?

19:27 <headius[m]> well the last part of nobu's logic

19:27 <headius[m]> I have the first part

19:28 <lopex> first check is for unicode

19:28 <lopex> upper then lower then title right ?

19:28 <headius[m]> yes

19:28 <lopex> the non unicode is folding

19:29 * headius[m] sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/kwYsIxQJBPMoBeqFPiAggqXx >

19:29 <lopex> why not title case from jcodings ?

19:29 <headius[m]> we don't have a method for title case

19:29 <lopex> jdk might have obsolete tables

19:29 <lopex> I posted an example above

19:30 <headius[m]> oh

19:30 <lopex> :

19:30 <lopex> byte[] titleCase = "titlecaseletter".getBytes();

19:30 <lopex> int ctype = enc.propertyNameToCType(titleCase, 0, titleCase.length);

19:30 <lopex> enc.isCodeCType(498, ctype);

19:30 <lopex> the other one is pretty straight too

19:30 shellac has joined #jruby

19:31 <headius[m]> I see

19:31 <headius[m]> I think we should add isTitleCase :-)

19:32 <lopex> and gazillion of others :P

19:32 <lopex> but yeah

19:32 <lopex> I think you can hardcode ctype, it shouldn't change

19:32 <headius[m]> we can add the gazillion later

19:32 <lopex> or we can add one to CharacterType in jcodings

19:33 <headius[m]> yeah it's just leaky abstraction

19:33 shellac has quit [Client Quit]

19:33 <lopex> yeah, the ones in CharacterType should match those looked up by name

19:35 <headius[m]> ah yeah

19:35 <headius[m]> https://github.com/jruby/jcodings/blob/f9020f63c12e9a0abbd5f146d307dea1d45f3583/src/org/jcodings/Config.java#L47

19:35 <headius[m]> so I guess 15 it is

19:36 lucasb has joined #jruby

19:37 <headius[m]> where are these CharacterType used?

19:37 <lopex> in isUpper for example

19:37 <headius[m]> oh yeah I see now

19:38 <headius[m]> I guess I'm done

19:38 <lopex> CASE_TITLECASE is for case folding, not for char type

19:39 <lopex> hmm int r = enc->mbc_case_fold(ONIGENC_CASE_FOLD

19:39 <lopex> it's wrong

19:39 <lopex> case_fold and case_map are different mechanisms

19:39 <headius[m]> ok

19:40 <lopex> it's not ok

19:40 <lopex> wait

19:40 <headius[m]> hmm

19:40 <headius[m]> assertTrue(UTF8Encoding.INSTANCE.isTitle("ǲ".codePointAt(0)));

19:40 <headius[m]> fails

19:40 <headius[m]> public final boolean isTitle(int code) {

19:40 <headius[m]> return isCodeCType(code, CharacterType.TITLE);

19:40 <headius[m]> }

19:41 <lopex> you need to shift that

19:42 <lopex> er

19:42 <headius[m]> I'm doing what isUpper does

19:42 <lopex> yeah

19:42 <headius[m]> I'll push WIP PR

19:44 <headius[m]> https://github.com/jruby/jcodings/pull/30

19:44 shellac has joined #jruby

19:45 <lopex> headius[m]: ctype for title is 41

19:46 <lopex> it's not an arbitrary number

19:46 <headius[m]> so what's up with CharacterType then

19:46 <lopex> it's what you get from propertyNameToCType

19:47 <lopex> you get name -> ctype

19:47 <lopex> and lookup using ctype

19:47 <lopex> those in CharacterType are for convenience

19:47 <headius[m]> ah because they are the same

19:47 <lopex> so they can be used in isUpper etc

19:47 <headius[m]> but this one may differ across encodings?

19:47 shellac has quit [Client Quit]

19:47 <lopex> no

19:50 <headius[m]> ok, think I have it then

19:50 <lopex> headius[m]: for example https://github.com/jruby/jcodings/blob/master/src/org/jcodings/unicode/UnicodeEncoding.java#L63

19:50 <headius[m]> passing now

19:51 <lopex> unicode uses UnicodeProperties.CodeRangeTable

19:51 <lopex> and those indexes match up for other encodings

19:51 <headius[m]> check out PR now

19:52 <lopex> for other there are char type maps (where in int array the bits describe the chars)

19:52 <headius[m]> basically what nobu's code was doing with the static ctype int

19:53 <lopex> why not hardcode the 15 ?

19:53 <headius[m]> what is the significance of the 15

19:53 <lopex> it's the same like other constants in CharacterType

19:53 <headius[m]> versus ctype of 41

19:54 <lopex> every code range has its own ctype

19:54 <lopex> and it avoids hash lookup

19:55 <headius[m]> but what do I do with the 15

19:55 <headius[m]> this is using the lookup logic and caching the ctype that results in the encoding object

19:55 <lopex> yeah, but would you add a field for any other ctype ?

19:56 <lopex> I agree it's not perfect but it's how it is in onigmo

19:56 <headius[m]> no, but this is special

19:56 <headius[m]> I haven't had a need for any other ctype

19:57 <lopex> and it pollutes Encoding.java with unicode bits

19:57 <headius[m]> ok so that clarifies something for me I guess... other encodings may or may not have the concept of a title case

19:58 <lopex> yeah, that too

19:58 <headius[m]> I could simply move this down to UnicodeEncoding

19:58 <headius[m]> but at that point is the ctype always just 41 then?

19:58 <headius[m]> I want to do this right

19:58 <lopex> I think I'd do that mri way here

19:58 <headius[m]> nobu's logic already confirms we're dealing with unicode so moving this to UnicodeEncoding seems to make sense to me

19:58 <lopex> the title case is used only once in the entire code base

19:59 <headius[m]> yeah but then I have to cache the ctype in JRuby

19:59 <headius[m]> which seems wrong

19:59 <lopex> you can cache it in private static

19:59 <lopex> it will never change across runtimes

20:00 <lopex> yeah, I get it

20:00 <headius[m]> so you don't think I should add isTitle to jcodings

20:00 <headius[m]> because it doesn't exist in onigmo

20:01 <lopex> headius[m]: what else https://github.com/jruby/jcodings/blob/master/src/org/jcodings/unicode/UnicodeProperties.java ?

20:01 <lopex> what is special there ?

20:01 <lopex> you'll do as you wish but I think we can come up with something better

20:02 <headius[m]> 🤷‍♂️

20:02 <lopex> like cacheable hash for those

20:02 <lopex> for unicode ?

20:02 <headius[m]> well my justification is that nobody outside jcodings should have to know about ctype lookup just to check if a codepoint is titlecase

20:02 <lopex> so one can happily use isType("titlecase")

20:02 <lopex> yeah

20:02 <lopex> ^

20:02 <headius[m]> maybe it's not as special as isUpper but it's more special than currency symbol

20:04 <headius[m]> I understand what you're saying... why add this when it's just one of many ctypes

20:04 <headius[m]> my answer is that it's the first non-standard ctype I've needed

20:04 <headius[m]> so 🤷‍♂️

20:04 <lopex> I think CodeRangeTable could have yet another hash int -> code range

20:05 <lopex> er string => int

20:05 <lopex> then we could provide isType(someName)

20:05 <headius[m]> and do a hash lookup each time?

20:06 <lopex> well,

20:06 <lopex> or I could generate named constants

20:06 <lopex> isType(TITLE_CASE)

20:06 <lopex> ?

20:06 <headius[m]> hmm

20:07 <headius[m]> how about enums

20:07 <lopex> yeah, they would provide name lookup too

20:07 <headius[m]> could CodeRangeEntry be an enum then?

20:08 <headius[m]> I mean lots of things in jcodings could/should be enums but this is a good place to start

20:09 <headius[m]> and CodeRangeEntry is not public right now so we can do whatever we want

20:09 <lopex> enum indexes follow definition order ?

20:10 <headius[m]> yes

20:10 <headius[m]> they always have a sequential ordinal

20:10 <lopex> but we can't provide byte[] lookup ?

20:12 <headius[m]> I guess we can but I believe EnumSet is a perfect hash using ordinals

20:13 <headius[m]> EnumMap or whatever I mean
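The enum idea floated here can be sketched as follows. The enum constants are hypothetical stand-ins, not the actual jcodings property list. The sketch only shows that ordinals follow declaration order, that `valueOf` gives name lookup, and that `EnumMap` is keyed by ordinal rather than by hashing a string.

```java
import java.util.EnumMap;

// Illustration only: CType and its constants are hypothetical names,
// not the real contents of jcodings' CodeRangeTable.
class CTypeEnumDemo {
    enum CType { UPPER, LOWER, TITLECASE_LETTER }

    // EnumMap stores values in an array indexed by ordinal.
    static EnumMap<CType, String> descriptions() {
        EnumMap<CType, String> map = new EnumMap<>(CType.class);
        map.put(CType.UPPER, "uppercase letters");
        map.put(CType.LOWER, "lowercase letters");
        map.put(CType.TITLECASE_LETTER, "titlecase letters");
        return map;
    }

    public static void main(String[] args) {
        for (CType t : CType.values()) {
            System.out.println(t.ordinal() + " " + t + ": " + descriptions().get(t));
        }
        // Name lookup from a String comes with every enum.
        System.out.println(CType.valueOf("TITLECASE_LETTER").ordinal());
    }
}
```

A byte[] name would still have to be decoded to a String before `valueOf` could be used, so this does not by itself answer the byte[] lookup question.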

20:14 <headius[m]> I'm fine with not adding isTitle if we add something that can use constants or enums and avoid a string or byte[] hash search

20:14 subbu|lunch is now known as subbu

20:22 <headius[m]> hmm

20:22 <headius[m]> RuntimeError: class not found for encoding "CESU-8"

20:22 <headius[m]> generate_encoding_list at generate.rb:86

20:22 <headius[m]> they've added some encodings

20:23 <lopex> yep

20:23 <lopex> that one

20:24 <headius[m]> yeah just that one I guess

20:24 <headius[m]> https://en.wikipedia.org/wiki/CESU-8

20:25 <lopex> committed bits to ignore it for now

20:25 <headius[m]> yeah I made it skip

20:25 <lopex> it's a utf-8 copy

20:25 <headius[m]> yeah looks like it

20:26 <lopex> I meant committed changes to the script

20:26 <headius[m]> yup I'll pick them up

20:26 <lopex> weird define for "GB2312" vanished

20:27 <lopex> it's replicated now

20:27 <headius[m]> hmm objdump logic is failing now

20:28 * headius[m] sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/XLUuBfVnctDtsSPrKzJYSIER >

20:28 <lopex> haven't tested it on osx

20:29 <lopex> but works for me here on linux

20:30 <lopex> gobj_dump might have different output ?

20:30 <headius[m]> that could be

20:31 <headius[m]> haha

20:32 * headius[m] sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/sOtzvxncPfkruzdchDruDuFJ >

20:32 <headius[m]> header line

20:32 <headius[m]> but no data?

20:33 <lopex> stripped ?

20:34 <headius[m]> dunno

20:37 <headius[m]> well basically I'm trying to do this

20:37 <lopex> no changes in tables though

20:37 * headius[m] sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/fCeIDgXgaFnXPEXaVUaCkHxq >

20:43 <byteit101[m]> How do I convert unsigned to signed across the ruby-java boundary? (0xffff_0000_aaaa_5555.to_java Java::long) => RangeError (bignum too big to convert into `long')

20:43 <headius[m]> hmm

20:45 <headius[m]> that's a good question

20:45 <headius[m]> that does yield a bignum

20:46 <byteit101[m]> (Same thing for all max values of all types too, though those can be marshalled through longs at least)

20:46 <headius[m]> yeah

20:47 <headius[m]> I'm guessing this is a color?

20:47 <byteit101[m]> No, I'm implementing java_signature parsing&codegen for java proxies

20:47 <byteit101[m]> */java_annotation

20:48 <byteit101[m]> @Annotation(longvalue=0xffff_0000_aaaa_5555)

20:49 <headius[m]> ah ok

20:49 <byteit101[m]> I added hex to the lexer/parser, and realized this issue. Or should we not support unsigned hex :-)

20:50 <headius[m]> I suppose we should support any literal format Java itself supports

20:50 <headius[m]> there's nothing to indicate that this is intended to be an unsigned long though so it wouldn't parse in Java I think

20:51 <byteit101[m]> though annotations in java_signature already is different from java in that you must use MyClass.java_class. Not sure if that is a bug or not?

20:51 <headius[m]> yeah Java says "Integer number too large"

20:51 <headius[m]> shouldn't have to do that

20:51 <headius[m]> we should see it's a Java proxy class and do the right thing

20:52 <headius[m]> I'm trying to find a way to get raw long bits from a bignum

20:52 <byteit101[m]> Ok, I'll fix that java_class bug while I'm mucking around here

20:53 <headius[m]> aha

20:53 <headius[m]> there's Long.toUnsignedString

20:53 <headius[m]> tell me there's a reverse

20:54 <headius[m]> aha, parseUnsignedLong

20:54 <headius[m]> so at least we have that way... java.lang.Long.parseUnsignedLong

20:55 <byteit101[m]> How did I miss that? Works for me

20:55 <headius[m]> $ jruby -e "p java.lang.Long.parseUnsignedLong('ffff0000aaaa5555', 16)"

20:55 <headius[m]> -281472113420971

20:56 <headius[m]> it doesn't like 0x or underscores
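Since `parseUnsignedLong` rejects the `0x` prefix and underscores, a literal has to be normalized first. A minimal sketch, assuming a hypothetical helper name `parseUnsignedHex` (this is not the code in the PR):

```java
// Illustration only: parseUnsignedHex is a hypothetical helper, not JRuby code.
class UnsignedHexDemo {
    // Strips '_' separators and an optional 0x/0X prefix, then parses the
    // remaining hex digits as an unsigned 64-bit value stored in a signed long.
    static long parseUnsignedHex(String literal) {
        String digits = literal.replace("_", "");
        if (digits.startsWith("0x") || digits.startsWith("0X")) {
            digits = digits.substring(2);
        }
        return Long.parseUnsignedLong(digits, 16);
    }

    public static void main(String[] args) {
        long bits = parseUnsignedHex("0xffff_0000_aaaa_5555");
        System.out.println(bits);                            // -281472113420971
        System.out.println(Long.toUnsignedString(bits, 16)); // ffff0000aaaa5555
    }
}
```

The bit pattern is preserved; only the signed interpretation of the top bit differs.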

20:56 <byteit101[m]> pushed more changes to the concrete_java branch/pr, more to come for this

20:56 <headius[m]> cool

20:56 <byteit101[m]> aww... though I already strip out the latter for Integer() right now. Can change that

21:00 <byteit101[m]> Any way to avoid the extra to_java? (java.lang.Long.parseUnsignedLong("fe", 16).to_java :long).byteValue()

21:17 shellac has joined #jruby

21:19 <byteit101[m]> Yay, this now parses: @com.test.EverythingAnnotation(astr="foo", abyte=0xe9, ashort=65500, anint=-495000, along=0xdead_beef_00_ff00ff, afloat=12.633, adouble=-0.009995, abool=true, anbool=false, achar='q', Darray={@javax.annotation.Resource(description=":-("), @javax.annotation.Resource(description=":-)")}, anenum=java.lang.annotation.RetentionPolicy.RUNTIME, aClass=java.lang.String.java_class)

22:00 shellac has quit [Quit: Computer has gone to sleep.]

22:11 shellac has joined #jruby

22:15 <headius[m]> is the extra to_java needed?

22:15 <headius[m]> the return type from parseUnsignedLong should be a Fixnum, which would naturally convert to long

22:18 <byteit101[m]> Yea, haven't figured out a way to avoid all the back and forth. without to_java it's a fixnum, but byteValue is on Long: (undefined method `byteValue' for 254:Integer)

22:19 <headius[m]> oh

22:20 <headius[m]> to_java(:byte) should be the same, but that's going to truncate the long in either case

22:20 <headius[m]> ah but this isn't the long string

22:20 <headius[m]> yeah to_java(:byte)

22:20 <byteit101[m]> (too big for byte: 254)

22:20 <headius[m]> damn

22:21 <headius[m]> because it's signed

22:21 <byteit101[m]> https://github.com/jruby/jruby/pull/6141/files#diff-5e0ff85f03a7d2f61e15467ff45aef29R147

22:21 <byteit101[m]> yea :-)

22:21 <headius[m]> damn you java

22:21 <byteit101[m]> What I implemented there looks so gross, but I haven't figured out a better way

22:24 shellac has quit [Quit: Computer has gone to sleep.]

22:38 ur5us has joined #jruby

22:40 <headius[m]> well we can do it with bit math I guess

22:41 <byteit101[m]> I tried that, but was getting weird results

22:42 <byteit101[m]> (again due to signed vs unsigned)

22:43 <byteit101[m]> Going signed -> unsigned worked fine though, just not the reverse

22:44 ur5us has quit [Quit: Leaving]

22:46 <headius[m]> my brain isn't working today

22:47 <byteit101[m]> At least what I have works

22:47 <headius[m]> so we have unsigned bits and need to interpret them as signed without changing them

22:47 <headius[m]> yeah

22:47 <byteit101[m]> correct
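On the Java side, reinterpreting unsigned bits as signed without changing them is a sign extension. The sketch below is an assumption about one way to do the bit math, with a hypothetical helper name `toSigned`. It is not what the PR implements and does not address doing the same from Ruby.

```java
// Illustration only: toSigned is a hypothetical helper, not JRuby code.
class ReinterpretDemo {
    // Treats the low `bits` bits of `unsigned` as a two's-complement value.
    // Shifting left discards higher bits; the arithmetic right shift then
    // copies the sign bit back down. `bits` must be between 1 and 64.
    static long toSigned(long unsigned, int bits) {
        int shift = 64 - bits;
        return (unsigned << shift) >> shift;
    }

    public static void main(String[] args) {
        System.out.println(toSigned(254, 8));               // -2
        System.out.println(toSigned(65500, 16));            // -36
        System.out.println((byte) 254);                     // -2, a narrowing cast does the same
        System.out.println(Byte.toUnsignedInt((byte) -2));  // 254, the reverse direction
    }
}
```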

22:48 <byteit101[m]> I was thinking pack/unpack and marshalling through a string might be another way, though it still seems icky

22:49 NightMonkey has quit [Ping timeout: 265 seconds]

22:50 <byteit101[m]> slightly different thing: is it possible to call a java method but to not wrap it into a ruby object?

22:51 <headius[m]> not really

22:51 <headius[m]> not for primitives anyway

22:53 <byteit101[m]> hmm... drat

22:54 NightMonkey has joined #jruby

22:59 <byteit101[m]> Workaround works! current PR now supports all annotation types on methods except char

23:45 lucasb has quit [Quit: Connection closed for inactivity]