00:18
ur5us has quit [Ping timeout: 240 seconds]
00:30
ur5us has joined #jruby
01:08
<
headius[m] >
it's merged, welcome to Ruby 2.6 support on master
01:31
dhoc has quit [Quit: dhoc]
01:41
dhoc has joined #jruby
02:29
dhoc has quit [Quit: dhoc]
03:29
<
byteit101[m] >
Should I use `java_annotation 'annotation'; java_signature 'void myMethod()'` or `java_signature '@annotation void myMethod()'`? Both seem to be partially supported...
03:46
<
headius[m] >
hmm I think enebo wrote the parser for that so I'm not sure
03:46
<
headius[m] >
I didn't know both of those work :-)
03:53
<
byteit101[m] >
Yea, just doing a git blame on NoMethodErrors and I see enebo is to blame :-)
03:54
<
byteit101[m] >
The former works only for jrubyc, the latter works only for become_java! on my branch
03:55
<
byteit101[m] >
The former saves nothing for become_java, the latter throws up with ` @annotationpublic void myMethod() {` for jrubyc (note the lack of space. will file a bug)
04:06
<
headius[m] >
hmm I wonder what kind of tests we have for this stuff
04:27
<
byteit101[m] >
minimal. annotation parameters are an explosion of exceptions
04:27
<
byteit101[m] >
Though `@javax.annotation.Resource(description=@java.lang.Deprecated("Testing"))` parses, which I didn't know was possible
04:28
<
byteit101[m] >
javac stuff and non-parameterized annotations seem well enough tested though
07:46
ur5us has quit [Ping timeout: 252 seconds]
07:52
ur5us has joined #jruby
08:42
shellac has joined #jruby
09:13
ur5us has quit [Ping timeout: 256 seconds]
12:19
shellac has quit [Ping timeout: 256 seconds]
16:03
xardion has quit [Remote host closed the connection]
16:05
shellac has joined #jruby
16:08
xardion has joined #jruby
16:31
bga57 has quit [*.net *.split]
16:31
lanceball has quit [*.net *.split]
16:31
bga57 has joined #jruby
16:32
lanceball has joined #jruby
16:37
shellac has quit [Ping timeout: 240 seconds]
16:50
shellac has joined #jruby
17:11
shellac has quit [Quit: Computer has gone to sleep.]
17:56
<
lopex >
headius[m]: looking
18:02
<
lopex >
ONIG_ENCODING_UTF8->is_code_ctype(498, ONIGENC_CTYPE_UPPER, ONIG_ENCODING_UTF8)
18:02
<
lopex >
so it's not a jcodings issue
18:03
<
lopex >
enebo[m]: you here ?
18:04
<
headius[m] >
lopex: hmm
18:05
<
lopex >
headius[m]: we're missing quite a bit of logic for is_special_global_name
18:05
<
lopex >
and rb_sym_constant_char_p
18:05
<
lopex >
there's a lot of logic
18:06
<
headius[m] >
I don't doubt it
18:06
<
headius[m] >
but does something in there magically treat this as upper-case?
18:06
<
lopex >
rb_sym_constant_char_p - the name suggests that someone made it public for debug purposes at some point
18:07
<
lopex >
if (ISASCII(*name)) return ISUPPER(*name);
18:07
<
lopex >
if (rb_enc_isctype(c, ctype_titlecase, enc)) return TRUE;
18:07
<
lopex >
from static const UChar cname[] = "titlecaseletter"; core range
18:07
<
lopex >
headius[m]: ^^
18:08
<
headius[m] >
titlecase
18:08
<
lopex >
headius[m]: but we're missing almost all unicode / encoding support for symbol name walking
18:08
<
headius[m] >
that's a new term to me!
18:09
<
headius[m] >
JDK provides isTitleCase
18:10
<
lopex >
but it's equally easy to do in jcodings
18:10
shellac has joined #jruby
18:10
<
headius[m] >
let's do that
18:12
<
lopex >
mri also caches ctype in static function variable there
18:15
<
headius[m] >
$ jruby -e 'Object.const_set("ǲ", 1); p Object.const_get("ǲ")'
18:15
<
headius[m] >
titlecase does work
18:16
<
lopex >
what unicode version do jdks provide ?
18:17
<
headius[m] >
for Java 8
18:19
<
headius[m] >
looking for a way to query that
18:20
<
lopex >
in any case
18:21
<
lopex >
byte[] titleCase = "titlecaseletter".getBytes();
18:21
<
lopex >
enc.isCodeCType(498, ctype);
18:21
<
lopex >
int ctype = enc.propertyNameToCType(titleCase, 0, titleCase.length);
18:22
<
lopex >
they check both rb_enc_isupper and rb_enc_islower
18:22
<
lopex >
so that explains three variants
18:22
<
lopex >
that Dz i neither
18:22
<
lopex >
*is neither
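The upper / lower / title ordering described here can be sketched in plain Ruby. This is only an illustration under assumptions: `constant_initial?` is a hypothetical helper (not MRI or JRuby code), the general categories `Lu`, `Ll` and `Lt` stand in for `rb_enc_isupper`, `rb_enc_islower` and the "titlecaseletter" ctype, and the case-folding fallback for non-Unicode encodings is left out.

```ruby
# Hypothetical helper, not part of MRI or JRuby.
# Decides whether a single character could start a constant name,
# following the order discussed above: upper, then lower, then title.
def constant_initial?(char)
  # ASCII fast path: only A-Z qualifies.
  return char.match?(/\A[A-Z]\z/) if char.ascii_only?

  return true  if char.match?(/\A\p{Lu}\z/) # uppercase letter
  return false if char.match?(/\A\p{Ll}\z/) # lowercase letter
  char.match?(/\A\p{Lt}\z/)                 # titlecase letter
end

dz = [498].pack("U") # code point 498 is U+01F2

p dz.match?(/\p{Lu}/)   # is it an uppercase letter?
p dz.match?(/\p{Ll}/)   # is it a lowercase letter?
p dz.match?(/\p{Lt}/)   # is it a titlecase letter?
p constant_initial?(dz)
```

Code point 498 is in neither the uppercase nor the lowercase category, so only the third check accepts it, which is why all three checks are needed.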
18:24
<
headius[m] >
well it's elaborate but not too much code
18:24
<
lopex >
headius[m]: that's a fraction of the whole thing
18:25
<
lopex >
well like 1/3
18:25
<
headius[m] >
we have logic for these other things around
18:25
<
lopex >
I can't even see where it's encoding aware
18:26
<
headius[m] >
are you looking at IdUtil?
18:27
<
lopex >
no, at RubySymbol
18:27
<
lopex >
well, then there's some duplication
18:27
<
headius[m] >
RubySymbol.validConstantName has the checks for leading caps
18:28
<
headius[m] >
IdUtil is using Java strings but we don't keep unicode constant names as normal Java strings
18:32
<
headius[m] >
so this is the bulk of the logic
18:32
<
lopex >
it falls back to case folding
18:32
<
lopex >
for other encodings too
18:59
<
headius[m] >
I don't think we have anything
18:59
<
headius[m] >
we could add it but we don't track what type of symbol a symbol is anywhere right now
18:59
<
headius[m] >
I have seen this code before too... it's not used much but it does avoid having to re-scan the symbol
19:00
<
headius[m] >
so it would be worth it for that
19:04
subbu is now known as subbu|lunch
19:09
<
headius[m] >
lopex: I think this will pass if I just add a titlecase check for now, using JDK isTitleCase
19:09
<
headius[m] >
but it would be very nice to get the symbol type logic in here
19:10
<
headius[m] >
are all isUpper characters also isTitleCase I wonder
19:11
<
headius[m] >
doesn't get the test passing without the case folding part it looks like
19:11
<
headius[m] >
but it gets past those first few odd chars
19:17
<
lopex >
headius[m]: and with jcodings ?
19:17
<
lopex >
it should be ok using jcodings
19:20
<
lopex >
Dz is not upper
19:22
<
headius[m] >
well I was just going to add the Character.isTitleCase to this check
19:23
<
headius[m] >
but I'm trying to port nobu's case folding version now
19:23
<
headius[m] >
I don't get this static titlecase stuff in the middle
19:23
<
lopex >
but do you also check for islower and isupper ?
19:23
<
lopex >
for unicode ?
19:23
<
lopex >
you need to
19:23
<
headius[m] >
can it be titlecase and lower
19:24
<
headius[m] >
I assumed isupper || istitle would cover it
19:24
<
lopex >
it can be title and not upper
19:24
<
lopex >
just mimic what mri does
19:24
shellac has quit [Quit: Computer has gone to sleep.]
19:25
<
headius[m] >
ok this logic in the middle is just because they don't have a titlecase type?
19:25
<
lopex >
title case is separate from lower and upper
19:25
<
headius[m] >
so they have to look it up from onigmo
19:25
<
headius[m] >
right ok they look up the ctype for "titlecaseletter" characters
19:25
<
headius[m] >
and save that in a static for future calls
19:25
<
lopex >
and islower and isupper as well
19:25
<
headius[m] >
so meh I don't need this
19:26
<
lopex >
you need all three
19:27
<
headius[m] >
oh this case folding stuff goes at the tables directly
19:27
<
lopex >
the one for non unicode ?
19:27
<
headius[m] >
well the last part of nobu's logic
19:27
<
headius[m] >
I have the first part
19:28
<
lopex >
first check is for unicode
19:28
<
lopex >
upper then lower then title right ?
19:28
<
lopex >
the non unicode is folding
19:29
<
lopex >
why not title case from jcodings ?
19:29
<
headius[m] >
we don't have a method for title case
19:29
<
lopex >
jdk might have obsolete tables
19:29
<
lopex >
I posted an example above
19:30
<
lopex >
byte[] titleCase = "titlecaseletter".getBytes();
19:30
<
lopex >
int ctype = enc.propertyNameToCType(titleCase, 0, titleCase.length);
19:30
<
lopex >
enc.isCodeCType(498, ctype);
19:30
<
lopex >
the other one is pretty straight too
19:30
shellac has joined #jruby
19:31
<
headius[m] >
I think we should add isTitleCase :-)
19:32
<
lopex >
and gazillion of others :P
19:32
<
lopex >
I think you can hardcode ctype, it shouldn't change
19:32
<
headius[m] >
we can add the gazillion later
19:32
<
lopex >
or we can add one to CharacterType in jcodings
19:33
<
headius[m] >
yeah it's just leaky abstraction
19:33
shellac has quit [Client Quit]
19:33
<
lopex >
yeah, the ones in CharacterType should match those looked up by name
19:35
<
headius[m] >
ah yeah
19:35
<
headius[m] >
so I guess 15 it is
19:36
lucasb has joined #jruby
19:37
<
headius[m] >
where are these CharacterType used?
19:37
<
lopex >
in isUpper for example
19:37
<
headius[m] >
oh yeah I see now
19:38
<
headius[m] >
I guess I'm done
19:38
<
lopex >
CASE_TITLECASE is for case folding, not for char type
19:39
<
lopex >
hmm int r = enc->mbc_case_fold(ONIGENC_CASE_FOLD
19:39
<
lopex >
case_fold and case_map are different mechanisms
19:40
<
lopex >
it's not ok
19:40
<
headius[m] >
assertTrue(UTF8Encoding.INSTANCE.isTitle("ǲ".codePointAt(0)));
19:40
<
headius[m] >
public final boolean isTitle(int code) {
19:40
<
headius[m] >
return isCodeCType(code, CharacterType.TITLE);
19:41
<
lopex >
you need to shift that
19:42
<
headius[m] >
I'm doing what isUpper does
19:42
<
headius[m] >
I'll push WIP PR
19:44
shellac has joined #jruby
19:45
<
lopex >
headius[m]: ctype for title is 41
19:46
<
lopex >
it's not an arbitrary number
19:46
<
headius[m] >
so what's up with CharacterType then
19:46
<
lopex >
it's what you get from propertyNameToCType
19:47
<
lopex >
you get name -> ctype
19:47
<
lopex >
and lookup using ctype
19:47
<
lopex >
those in CharacterType are for convenience
19:47
<
headius[m] >
ah because they are the same
19:47
<
lopex >
so they can be used in isUpper etc
19:47
<
headius[m] >
but this one may differ across encodings?
19:47
shellac has quit [Client Quit]
19:50
<
headius[m] >
ok, think I have it then
19:50
<
headius[m] >
passing now
19:51
<
lopex >
unicode uses UnicodeProperties.CodeRangeTable
19:51
<
lopex >
and those indexes match up for other encodings
19:51
<
headius[m] >
check out PR now
19:52
<
lopex >
for other there are char type maps (where in int array the bits describe the chars)
19:52
<
headius[m] >
basically what nobu's code was doing with the static ctype int
19:53
<
lopex >
why not hardcode the 15 ?
19:53
<
headius[m] >
what is the significance of the 15
19:53
<
lopex >
it's the same as the other constants in CharacterType
19:53
<
headius[m] >
versus ctype of 41
19:54
<
lopex >
every code range has its own ctype
19:54
<
lopex >
and it avoids hash lookup
19:55
<
headius[m] >
but what do I do with the 15
19:55
<
headius[m] >
this is using the lookup logic and caching the ctype that results in the encoding object
19:55
<
lopex >
yeah, but would you add a field for any other ctype ?
19:56
<
lopex >
I agree it's not perfect but it's how it is in onigmo
19:56
<
headius[m] >
no, but this is special
19:56
<
headius[m] >
I haven't had a need for any other ctype
19:57
<
lopex >
and it pollutes Encoding.java with unicode bits
19:57
<
headius[m] >
ok so that clarifies something for me I guess... other encodings may or may not have the concept of a title case
19:58
<
lopex >
yeah, that too
19:58
<
headius[m] >
I could simply move this down to UnicodeEncoding
19:58
<
headius[m] >
but at that point is the ctype always just 41 then?
19:58
<
headius[m] >
I want to do this right
19:58
<
lopex >
I think I'd do that mri way here
19:58
<
headius[m] >
nobu's logic already confirms we're dealing with unicode so moving this to UnicodeEncoding seems to make sense to me
19:58
<
lopex >
the title case is used only once in the entire code base
19:59
<
headius[m] >
yeah but then I have to cache the ctype in JRuby
19:59
<
headius[m] >
which seems wrong
19:59
<
lopex >
you can cache it in private static
19:59
<
lopex >
it will never change across runtimes
20:00
<
lopex >
yeah, I get it
20:00
<
headius[m] >
so you don't think I should add isTitle to jcodings
20:00
<
headius[m] >
because it doesn't exist in onigmo
20:01
<
lopex >
what is special there ?
20:01
<
lopex >
you'll do as you wish but I think we can come up with something better
20:02
<
lopex >
like cacheable hash for those
20:02
<
lopex >
for unicode ?
20:02
<
headius[m] >
well my justification is that nobody outside jcodings should have to know about ctype lookup just to check if a codepoint is titlecase
20:02
<
lopex >
so one can happily use isType("titlecase")
20:02
<
headius[m] >
maybe it's not as special as isUpper but it's more special than currency symbol
20:04
<
headius[m] >
I understand what you're saying... why add this when it's just one of many ctypes
20:04
<
headius[m] >
my answer is that it's the first non-standard ctype I've needed
20:04
<
headius[m] >
so 🤷♂️
20:04
<
lopex >
I think CodeRangeTable could have yet another hash int -> code range
20:05
<
lopex >
er string => int
20:05
<
lopex >
then we could provide isType(someName)
20:05
<
headius[m] >
and do a hash lookup each time?
20:06
<
lopex >
or I could generate named constants
20:06
<
lopex >
isType(TITLE_CASE)
20:07
<
headius[m] >
how about enums
20:07
<
lopex >
yeah, they would provide name lookup too
20:07
<
headius[m] >
could CodeRangeEntry be an enum then?
20:08
<
headius[m] >
I mean lots of things in jcodings could/should be enums but this is a good place to start
20:09
<
headius[m] >
and CodeRangeEntry is not public right now so we can do whatever we want
20:09
<
lopex >
enum indexes follow definition order ?
20:10
<
headius[m] >
they always have a sequential ordinal
20:10
<
lopex >
but we can't provide byte[] lookup ?
20:12
<
headius[m] >
I guess we can but I believe EnumSet is a perfect hash using ordinals
20:13
<
headius[m] >
EnumMap or whatever I mean
20:14
<
headius[m] >
I'm fine with not adding isTitle if we add something that can use constants or enums and avoid a string or byte[] hash search
20:14
subbu|lunch is now known as subbu
20:22
<
headius[m] >
RuntimeError: class not found for encoding "CESU-8"
20:22
<
headius[m] >
generate_encoding_list at generate.rb:86
20:22
<
headius[m] >
they've added some encodings
20:24
<
headius[m] >
yeah just that one I guess
20:25
<
lopex >
committed bits to ignore it for now
20:25
<
headius[m] >
yeah I made it skip
20:25
<
lopex >
it's a utf-8 copy
20:25
<
headius[m] >
yeah looks like it
20:26
<
lopex >
I meant committed changes to the script
20:26
<
headius[m] >
yup I'll pick them up
20:26
<
lopex >
weird define for "GB2312" vanished
20:27
<
lopex >
it's replicated now
20:27
<
headius[m] >
hmm objdump logic is failing now
20:28
<
lopex >
haven't tested it on osx
20:29
<
lopex >
but works for me here on linux
20:30
<
lopex >
gobj_dump might have different output ?
20:30
<
headius[m] >
that could be
20:32
<
headius[m] >
header line
20:32
<
headius[m] >
but no data?
20:37
<
headius[m] >
well basically I'm trying to do this
20:37
<
lopex >
no changes in tables though
20:43
<
byteit101[m] >
How do I convert unsigned to signed across the ruby-java boundary? (0xffff_0000_aaaa_5555.to_java Java::long) => RangeError (bignum too big to convert into `long')
20:45
<
headius[m] >
that's a good question
20:45
<
headius[m] >
that does yield a bignum
20:46
<
byteit101[m] >
(Same thing for all max values of all types too, though those can be marshalled through longs at least)
20:47
<
headius[m] >
I'm guessing this is a color?
20:47
<
byteit101[m] >
No, I'm implementing java_signature parsing&codegen for java proxies
20:47
<
byteit101[m] >
*/java_annotation
20:48
<
byteit101[m] >
@Annotation(longvalue=0xffff_0000_aaaa_5555)
20:49
<
byteit101[m] >
I added hex to the lexer/parser, and realized this issue. Or should we not support unsigned hex :-)
20:50
<
headius[m] >
I suppose we should support any literal format Java itself supports
20:50
<
headius[m] >
there's nothing to indicate that this is intended to be an unsigned long though so it wouldn't parse in Java I think
20:51
<
byteit101[m] >
though annotations in java_signature already is different from java in that you must use MyClass.java_class. Not sure if that is a bug or not?
20:51
<
headius[m] >
yeah Java says "Integer number too large"
20:51
<
headius[m] >
shouldn't have to do that
20:51
<
headius[m] >
we should see it's a Java proxy class and do the right thing
20:52
<
headius[m] >
I'm trying to find a way to get raw long bits from a bignum
20:52
<
byteit101[m] >
Ok, I'll fix that java_class bug while I'm mucking around here
20:53
<
headius[m] >
there's Long.toUnsignedString
20:53
<
headius[m] >
tell me there's a reverse
20:54
<
headius[m] >
aha, parseUnsignedLong
20:54
<
headius[m] >
so at least we have that way... java.lang.Long.parseUnsignedLong
20:55
<
byteit101[m] >
How did I miss that? Works for me
20:55
<
headius[m] >
$ jruby -e "p java.lang.Long.parseUnsignedLong('ffff0000aaaa5555', 16)"
20:55
<
headius[m] >
-281472113420971
20:56
<
headius[m] >
it doesn't like 0x or underscores
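Since `parseUnsignedLong` rejects the `0x` prefix and underscores, the literal has to be reduced to bare hex digits first. A minimal plain-Ruby sketch of that step follows; `bare_hex_digits` is a hypothetical helper name, not something from the PR.

```ruby
# Hypothetical helper: reduce a Java-style hex literal to bare hex digits
# by dropping digit-separator underscores and a leading 0x / 0X prefix.
def bare_hex_digits(literal)
  literal.delete("_").sub(/\A0x/i, "")
end

digits = bare_hex_digits("0xffff_0000_aaaa_5555")
p digits          # the cleaned digit string
p digits.to_i(16) # the unsigned magnitude as a Ruby Integer
```

On JRuby the cleaned string could then be handed to `java.lang.Long.parseUnsignedLong(digits, 16)` as shown above.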
20:56
<
byteit101[m] >
pushed more changes to the concrete_java branch/pr, more to come for this
20:56
<
byteit101[m] >
aww... though I already strip out the latter for Integer() right now. Can change that
21:00
<
byteit101[m] >
Any way to avoid the extra to_java? (java.lang.Long.parseUnsignedLong("fe", 16).to_java :long).byteValue()
21:17
shellac has joined #jruby
21:19
<
byteit101[m] >
Yay, this now parses: @com.test.EverythingAnnotation(astr="foo", abyte=0xe9, ashort=65500, anint=-495000, along=0xdead_beef_00_ff00ff, afloat=12.633, adouble=-0.009995, abool=true, anbool=false, achar='q', Darray={@javax.annotation.Resource(description=":-("), @javax.annotation.Resource(description=":-)")}, anenum=java.lang.annotation.RetentionPolicy.RUNTIME, aClass=java.lang.String.java_class)
22:00
shellac has quit [Quit: Computer has gone to sleep.]
22:11
shellac has joined #jruby
22:15
<
headius[m] >
is the extra to_java needed?
22:15
<
headius[m] >
the return type from parseUnsignedLong should be a Fixnum, which would naturally convert to long
22:18
<
byteit101[m] >
Yea, haven't figured out a way to avoid all the back and forth. without to_java it's a fixnum, but byteValue is on Long: (undefined method `byteValue' for 254:Integer)
22:20
<
headius[m] >
to_java(:byte) should be the same, but that's going to truncate the long in either case
22:20
<
headius[m] >
ah but this isn't the long string
22:20
<
headius[m] >
yeah to_java(:byte)
22:20
<
byteit101[m] >
(too big for byte: 254)
22:21
<
headius[m] >
because it's signed
22:21
<
byteit101[m] >
yea :-)
22:21
<
headius[m] >
damn you java
22:21
<
byteit101[m] >
What I implemented there looks so gross, but I haven't figured out a better way
22:24
shellac has quit [Quit: Computer has gone to sleep.]
22:38
ur5us has joined #jruby
22:40
<
headius[m] >
well we can do it with bit math I guess
22:41
<
byteit101[m] >
I tried that, but was getting weird results
22:42
<
byteit101[m] >
(again due to signed vs unsigned)
22:43
<
byteit101[m] >
Going signed -> unsigned worked fine though, just not the reverse
22:44
ur5us has quit [Quit: Leaving]
22:46
<
headius[m] >
my brain isn't working today
22:47
<
byteit101[m] >
At least what I have works
22:47
<
headius[m] >
so we have unsigned bits and need to interpret them as signed without changing them
22:47
<
byteit101[m] >
correct
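Reinterpreting unsigned bits as signed without changing them is a two's-complement conversion, which can be sketched with plain Integer math in Ruby. `to_signed` is a hypothetical helper, not the workaround in the PR; the assumption is that the resulting in-range value could then go through `to_java(:byte)` or `to_java(:long)` on JRuby.

```ruby
# Hypothetical helper: interpret the low `bits` bits of a non-negative
# Integer as a two's-complement signed value of that width.
def to_signed(value, bits)
  value &= (1 << bits) - 1      # keep only the low `bits` bits
  sign_bit = 1 << (bits - 1)
  # If the sign bit is set, the signed value is the unsigned value minus 2**bits.
  value >= sign_bit ? value - (1 << bits) : value
end

p to_signed(0xfe, 8)                   # => -2
p to_signed(0x7f, 8)                   # => 127
p to_signed(0xffff_0000_aaaa_5555, 64) # => -281472113420971
```

The 64-bit result matches what `parseUnsignedLong` printed earlier for the same bits.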
22:48
<
byteit101[m] >
I was thinking pack/unpack and marshalling through a string might be another way, though it still seems icky
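The pack/unpack idea can be sketched in plain Ruby: pack the value with an unsigned directive, then unpack the same bytes with the matching signed directive. This is only an illustration of the approach mentioned here, not what the branch does.

```ruby
# Round-trip through a binary string: "Q"/"q" are unsigned/signed 64-bit,
# "C"/"c" are unsigned/signed 8-bit, all in native byte order.
signed_long = [0xffff_0000_aaaa_5555].pack("Q").unpack1("q")
signed_byte = [0xfe].pack("C").unpack1("c")

p signed_long # => -281472113420971
p signed_byte # => -2
```

Because both directives use the same width and byte order, the bits are untouched and only their interpretation changes.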
22:49
NightMonkey has quit [Ping timeout: 265 seconds]
22:50
<
byteit101[m] >
slightly different thing: is it possible to call a java method but to not wrap it into a ruby object?
22:51
<
headius[m] >
not really
22:51
<
headius[m] >
not for primitives anyway
22:53
<
byteit101[m] >
hmm... drat
22:54
NightMonkey has joined #jruby
22:59
<
byteit101[m] >
Workaround works! current PR now supports all annotation types on methods except char
23:45
lucasb has quit [Quit: Connection closed for inactivity]