<bbrowning>
org.jruby.exceptions.RaiseException: (NameError) undefined method 'set_native_database_types' for class 'ActiveRecord::ConnectionAdapters::PostgresJdbcConnection'
<enebo>
kares: ^
<enebo>
bbrowning: this is a gem locked set of deps?
<headius>
weird
<bbrowning>
enebo: yeah it's a rails 2 app using an old version of ar-jdbc
<headius>
I'd suspect JI changes
<headius>
but it's a weird method to go missing
<enebo>
yeah I would say that would be my guess as well but I do not see much in way of changes there
<kares>
bbrowning: very weird ... but AR-JDBC 1.3 still handles Rails 2.3
<bbrowning>
I'll dig a bit more - the version of AR-JDBC didn't change at all from what I can tell
mrfilip has left #jruby [#jruby]
<bbrowning>
it's just upgrading from jruby 1.7.23 to 1.7.24-SNAPSHOT
<kares>
bbrowning: ah in that case - that sounds like a blocked
<kares>
* blocker
<enebo>
I do not really see any JI changes
<enebo>
setting constant name earlier? maybe it is putting original onto a module and not a class (I do not really think this I am just brainstorming)
<enebo>
the new hashmap stuff?
<headius>
that wouldn't affect alias
<enebo>
well there very little in here
<headius>
that alias doesn't exist in current ar-jdbc
<enebo>
headius: it was why I asked if he was using the same lock
<headius>
ah, and the method is defined in Java in that release
<bbrowning>
anything change with BasicLibraryService usage?
<bbrowning>
yeah in that release the method is defined in java so I wonder if perhaps the java extension isn't getting loaded
<headius>
that's a good guess
<headius>
I'd expect a lot more to be missing but maybe this is just the first thing
<bbrowning>
once the other integs finish I can run this one in isolation to see if a cause jumps out
<bbrowning>
that one rails 2.3 app was the only integ failure so from my point of view it's not a big deal :)
<bbrowning>
all the rails 3 apps seemed fine
<bbrowning>
and we have 2 other rails2.3 apps that get tested and they were both fine
<nirvdrum>
enebo: lopex: Why in the world is "String.new" required to have an ASCII-8BIT encoding?
<lopex>
nirvdrum: well, mri does so ?
<kares>
bbrowning: still weird ... hopefully it won't happen on second/isolated run at all :)
<lopex>
nirvdrum: I guess those are the questions I asked myself too before
<bbrowning>
kares: it did, but I now see there are actually 2 errors in my server logs from that test - perhaps the ar-jdbc one is non-fatal and has been there for a while
<nirvdrum>
lopex: I know it does. I'm just curious why it ignores the default encoding.
<nirvdrum>
lopex: So, "" and String.new() are different things. As are String.new() and String.new("").
<nirvdrum>
But, they're all equivalent.
<lopex>
nirvdrum: it's among similar features like the cr issue I guess
<nirvdrum>
It's lunacy.
<lopex>
nirvdrum: you'll become immune to that soon :)
<bbrowning>
enebo: headius: you said something about new hashmap stuff? :D
<bbrowning>
the 2nd error I found from this same failing test is around hashes - org.jruby.exceptions.RaiseException: (ArgumentError) A key is required to write a cookie containing the session data. Use config.action_controller.session = { :key => "_myapp_session", :secret => "some secret phrase" } in config/environment.rb
<lopex>
nirvdrum: I can trace that in mri history if you want
<bbrowning>
error itself isn't very helpful other than it looks like something's not finding a hash key where it expects one
<headius>
I think enebo was referring to a change to use IntHashMap for Java overload caching
<bbrowning>
ahh
<headius>
er, to use a different class for that
<headius>
but same functionality
<nirvdrum>
lopex: No worries. I just didn't know if you knew off-hand.
<headius>
that's also a weird error
shellac has joined #jruby
<headius>
and the others are all still fine eh?
<headius>
other envs
<bbrowning>
yep all the other apps are fine
<lopex>
nirvdrum: I recall how many things work, but not the actual rationale, I'd be a mad man otherwise
<nirvdrum>
Heh.
<nirvdrum>
Fair enough.
<lopex>
nirvdrum: but I believe we'd agree that lots of mri features are actually bugs
<lopex>
nirvdrum: but mri's kcode was worse than that on many fronts
<lopex>
nirvdrum: so we never actually fully supported it
<lopex>
nirvdrum: btw, what's mri index for ascii 8 bit ?
<lopex>
nirvdrum: if zero, then you'd have (probable) answer
<nirvdrum>
I'm not familiar with their encoding indices. I'd have to take a look.
<headius>
lopex, nirvdrum: that's the answer
<bbrowning>
headius: yeah we definitely don't have that in our sample app environment.rb - no idea why this is cropping up now, but should be easy for me to change the app and fix
<headius>
encoding index 0 is ASCII-8BIT and if you don't initialize string with anything it doesn't change encoding of allocated RString
<lopex>
yeah, that's what I hoped for
<headius>
if you think of it as BINARY it doesn't seem as strange
<lopex>
well, it's a bug
<headius>
maybe
<lopex>
well, at least a feature
<headius>
ASCII-8BIT can hold any bytes...other encodings have restrictions
<headius>
I don't know if it is an intentional thing though
<headius>
might be worth filing a bug just to get clarification on that
<lopex>
headius: but otoh one would wonder why String.new and String "" are different
<lopex>
er, String.new ""
<Papierkorb>
headius: "wanted coincidence"? Looking how many scripting languages dont have ByteArray classes where everyone uses strings instead ..
<headius>
mmm true, but of course we know it's because that form gains a default encoding from the script body
<lopex>
headius: design by coincidence
<headius>
"" is not a string with no content, it's a blank string with a default encoding
<headius>
String.new is a completely empty/default string
<Papierkorb>
headius: but then, iirc, string encoding 'binary' is aliased to ASCII-8Bit, so looks like intentional, expected behaviour
<headius>
but yeah, I'm trying to justify it a bit :-)
<headius>
Papierkorb: yeah, BINARY and ASCII-8BIT are synonymous
<lopex>
headius: so same as String.allocate
<headius>
exactly the same
<headius>
String#initialize does nothing with the allocated string if you don't pass it anything
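The difference described above can be checked with a minimal sketch, assuming MRI semantics as discussed here. The method name `empty_string_encodings` is made up for illustration.

```ruby
# Hypothetical helper: collects the encodings of several "empty" strings.
def empty_string_encodings
  {
    source:    __ENCODING__,             # encoding of this source file
    literal:   "".encoding,              # a literal takes the source encoding
    bare:      String.new.encoding,      # no argument: the allocated string is left alone
    copied:    String.new("").encoding,  # an argument brings its own encoding along
    allocated: String.allocate.encoding  # same state as String.new with no argument
  }
end

empty_string_encodings.each { |label, encoding| puts "#{label}: #{encoding}" }
```

The `bare` and `allocated` entries should both report ASCII-8BIT (BINARY), while `literal` and `copied` follow the source encoding.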
<lopex>
nirvdrum: you wouldnt like how they store encodings
<lopex>
in mri
<lopex>
indices in RObject flags and then an IVAR if doesnt fit there
<lopex>
headius: is that still the case ?
<lopex>
design by pure overhead
<headius>
I think it's always an index but I'm not sure
<nirvdrum>
headius: I find it strange because every other constructed string is subject to the default encoding, either for the runtime or the file.
<headius>
nirvdrum: yeah, I agree with you
<nirvdrum>
It also means whenever you dup a String, you're almost certainly guaranteed to change the new string's encoding.
<headius>
I think it's worth asking about, especially given the push toward frozen literal strings
<headius>
String.new to get a mutable string will be surprising if it's ASCII-8BIT
<headius>
and I think the use cases for getting a BINARY empty string are fewer than getting an encoded one
<nirvdrum>
I just wasn't sure if there was a backwards-compatibility aspect to this or what.
<nirvdrum>
lopex: Weird. How are Symbol encodings handled?
skade has joined #jruby
<lopex>
nirvdrum: impl wise or storage wise ?
<lopex>
headius: there's still discrepancy wrt Symbol and String encodings in jruby ?
donV has quit [Read error: Connection reset by peer]
skade has quit [Client Quit]
<nirvdrum>
lopex: I mean in that switch statement.
donV has joined #jruby
<nirvdrum>
lopex: While I suppose it's possible, are there any encodings supported in Ruby that support byte-wide characters that aren't ASCII compatible?
<lopex>
nirvdrum: ebcdic ?
<lopex>
let me check
<nirvdrum>
I think it's theoretically possible for someone to write their own encoding implementation and add it in, so I can't make any assumptions around this anyway.
<nirvdrum>
It just feels silly to hobble fast paths for encodings that no one really ever uses.
<lopex>
hmm IBM037 is a dummy
<lopex>
and ebcdic-cp-us is an alias
subbu is now known as subbu|afk
<lopex>
nirvdrum: so I guess a dummy guard ?
<lopex>
nirvdrum: otherwise asciiCompatible is just minLength == 1
<headius>
lopex: I don't think so
<headius>
Symbols are just considered raw bytes for ID purposes, I believe
<headius>
since differently-encoded strings rarely will have the same bytes, it works ok
<headius>
I think we have some differences when it's the same string with all ascii bytes and different encodings though
<lopex>
nirvdrum: hmm, symbols appears to be handled by default case there
lance|afk is now known as lanceball
<nirvdrum>
lopex: I'd be happy to accept ASCII compatible being a codepoint <= 127.
<nirvdrum>
headius: Gotcha. But Symbols in 1.9+ can have encodings.
<nirvdrum>
And JRuby gets this wrong currently. I think it's a known issue, but if not, I'll file it.
<nirvdrum>
"abc".to_sym.encoding is US-ASCII on MRI and UTF-8 on JRuby.
<headius>
I think that's mostly a matter of defaulting to us-ascii in strings
<nirvdrum>
I was looking at it last week. IIRC, I think the problem is ByteList is used as the symbol table key and ByteList#equals ignores encoding.
<lopex>
nirvdrum: wow, yeah, it's that default case there
<lopex>
so default: under mri is also a symbol
<headius>
[] ~/projects/ruby $ ruby23 -e "p 'foo'.force_encoding('ASCII-8BIT').to_sym.encoding; p 'foo'.force_encoding('UTF-8').to_sym.encoding"
<headius>
#<Encoding:US-ASCII>
<headius>
#<Encoding:US-ASCII>
<lopex>
but T_DATA is more special wrt encodings
<lopex>
haha
<nirvdrum>
But it really seems to just be this default case.
<lopex>
nirvdrum: welcome to madness
<lopex>
nirvdrum: just confirmed
<nirvdrum>
"abc".encode('UTF-16BE').to_sym.encoding returns UTF-16BE on both implementations.
<headius>
because the bytes distinguish it from ascii "abc"
<lopex>
oh you mean default the us-ascii ?
<nirvdrum>
headius: Sure. My point is in one case MRI uses the String's encoding for the Symbol's encoding and in the other it ignores it.
<lopex>
headius: symbol is covered as default case in that switch in mri
<lopex>
isnt that fun ?
<nirvdrum>
Byte compatibility aside, it's just unexpected to me.
<headius>
yeah the handling of symbols is a bit weird
<headius>
and I don't really know how it distinguishes these cases:
<headius>
$ ruby23 -e "p 'abc'.encode('UTF-16BE').force_encoding('ASCII-8BIT').to_sym.encoding; p 'abc'.encode('UTF-16BE').to_sym.encoding"
<headius>
#<Encoding:US-ASCII>
<headius>
#<Encoding:UTF-16BE>
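The one-liners above can be gathered into a small sketch, assuming MRI behaviour as shown in this log. The method name `symbol_encodings` is hypothetical. One plausible reading of the last two cases, not confirmed in this conversation, is that the UTF-16BE bytes of 'abc' are all below 128, so once relabelled as ASCII-8BIT the string looks like plain 7-bit ASCII.

```ruby
# Hypothetical helper: reports the Symbol encoding produced by several strings.
def symbol_encodings
  utf16 = 'abc'.encode('UTF-16BE')
  {
    binary_ascii:    'foo'.dup.force_encoding('ASCII-8BIT').to_sym.encoding,
    utf8_ascii:      'foo'.dup.force_encoding('UTF-8').to_sym.encoding,
    utf16:           utf16.to_sym.encoding,
    utf16_as_binary: utf16.dup.force_encoding('ASCII-8BIT').to_sym.encoding,
    # Every byte of UTF-16BE 'abc' is either 0 or an ASCII letter.
    utf16_bytes_all_7bit: utf16.bytes.all? { |byte| byte < 128 }
  }
end

symbol_encodings.each { |label, value| puts "#{label}: #{value}" }
```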
<nirvdrum>
headius: A more illustrative example of JRuby's lookup table issue.
<nirvdrum>
:xyz.encoding # US-ASCII
skade has joined #jruby
<lopex>
headius: maybe there should be a random harness mri api exhauster that would gather all those cases
<nirvdrum>
"xyz".to_sym.encoding # US-ASCII if in the same session after seeing :xyz
<lopex>
headius: you could generate mri from that matrix then
<headius>
nirvdrum: yeah
<headius>
lopex: hah
<headius>
sounds good, do it
<nirvdrum>
headius: That's the part where I think ByteList#equals is messing us up a bit.
<lopex>
headius: hey you liked the idea for api diffing
<lopex>
it worked well
<lopex>
and that's a step
<lopex>
nirvdrum: you use jruby's bytelist ?
<nirvdrum>
lopex: Yeah.
<nirvdrum>
We share a lot of the same string code.
<headius>
lopex: indeed, I'd love to see more exhaustive tools here
shellac has quit [Quit: Computer has gone to sleep.]
<lopex>
headius: something like integrated circuit testers
<lopex>
so you can test as much state as possible
<lopex>
pure one is too simple
skade has quit [Quit: Computer has gone to sleep.]
camlow325 has quit [Read error: Connection reset by peer]
camlow325 has joined #jruby
thedarkone2 has joined #jruby
subbu|afk is now known as subbu
Guest has joined #jruby
Guest has quit [Client Quit]
Guest69790 has quit [Quit: leaving]
dinfuehr has joined #jruby
e_dub has quit [Ping timeout: 260 seconds]
blaxter has quit [Ping timeout: 256 seconds]
kith has quit [Quit: kith]
norc has joined #jruby
<headius>
enebo: how are we looking for 1.7.24...need anything?
<enebo>
headius: I don’t know…I did not see any resolution above with bbrowning's reported issue and I cannot repro the crasher I am working on
<enebo>
headius: I am thinking it might be a 32 bit issue for the windows issue I am looking at so I still have one avenue to try
<headius>
ok
<headius>
bbrowning: any new information?
<enebo>
I wish I had a 32 bit win7 image
<bbrowning>
headius: bisecting now - down to 5 possible commits left
<bbrowning>
but turnaround time isn't too quick
<headius>
bbrowning: oh, nice
dlbirch has joined #jruby
<dlbirch>
hi, um - dumb newbie question here ... would like to reference an ENUM value from my JRuby code? The java enum looks like this:
<dlbirch>
package com.lmax.disruptor.dsl;
<dlbirch>
public enum ProducerType {
<dlbirch>
SINGLE,
<dlbirch>
MULTI;
<dlbirch>
private ProducerType() {
<dlbirch>
}
<dlbirch>
}
<dlbirch>
Not sure how to reference the 'SINGLE' value in the above ENUM?
<headius>
like a constant in Ruby... ProducerType::SINGLE
<headius>
Java enum values are just static fields
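A minimal sketch of that constant-style access, assuming it runs on JRuby. It uses the JDK's `java.util.concurrent.TimeUnit` as a stand-in because the Disruptor jar may not be on the classpath; the helper name is hypothetical, and on other Ruby implementations it simply returns nil.

```ruby
# Hypothetical helper: reads a Java enum constant when running on JRuby,
# and returns nil everywhere else.
def seconds_constant_name
  return nil unless RUBY_PLATFORM == 'java'

  require 'java'
  # Enum values are public static final fields, so they read like Ruby constants.
  java.util.concurrent.TimeUnit::SECONDS.to_s
end

p seconds_constant_name
```

With the Disruptor jar loaded, the same pattern would presumably be `java_import 'com.lmax.disruptor.dsl.ProducerType'` followed by `ProducerType::SINGLE`.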
zacts has joined #jruby
<dlbirch>
ok, cool.
<dlbirch>
thank you
<nirvdrum>
lopex: Fantastic. each_byte and each_char have different semantics if the underlying string is updated.
<bbrowning>
headius: still another bisect or two to go, but it's down to a short list
<headius>
right, but this only affects loading ext jars
<headius>
there aren't many of those
<bbrowning>
sure
<bbrowning>
but any of those in a rubygem will likely have a dash
<bbrowning>
ie ar-jdbc's adapter_driver.jar
<headius>
I think I know what mkristian__ meant for this to do
<bbrowning>
err adapter_java.jar
<headius>
right, which would explain your issue...but doesn't explain why it worked in the other apps
<bbrowning>
well the other apps just may not be triggering a load of ar-jdbc
<bbrowning>
we purposely don't actually touch a database in our tests
<bbrowning>
so we don't exercise much of ar-jdbc
<headius>
ahh
yfeldblum has joined #jruby
robbyoconnor has joined #jruby
robbyoconnor has quit [Client Quit]
<headius>
I will have a fix
robbyoconnor has joined #jruby
<nirvdrum>
"git revert"
<headius>
that's one way
<nirvdrum>
I wish there were a polite way of doing that. I've done it a few times and it's not received well. But it's never a criticism of the original patch, but rather the obvious lack of a regression test.
<nirvdrum>
Basically just saying lets roll back, better understand the problem, and then try again.
<headius>
you can always add a justification to the revert commit
<bbrowning>
if this were ruby code I'd say it's just missing a File.basename() before that "-" check
<norc>
Alright, so I gave a small presentation to an IT dept. today showing them git. They got a basic understanding of how git works (pointers, branches, remotes, commits, merges, reset)
<norc>
They asked me to do some exercises with them tomorrow.
<headius>
I'm modifying it to search backward through path elements for non-java-identifier strings and not try to use them for the ext class/package
<bbrowning>
k
<headius>
that will reduce the searching it does in the general case and fix the specific case mkristian fixed
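The backward search described above might look roughly like this in Ruby. This is a hypothetical sketch, not the actual JRuby loader code: the method name, the identifier regexp, and the example path are all invented for illustration.

```ruby
# Simplified approximation of a Java identifier (ASCII only), for illustration.
JAVA_IDENTIFIER = /\A[A-Za-z_$][A-Za-z0-9_$]*\z/

# Walks the path elements from the end and keeps them only while each one could
# be part of a Java package or class name. The first element that is not an
# identifier (for example a gem directory containing a dash) stops the search.
def ext_name_candidates(path)
  elements = path.sub(/\.jar\z/, '').split('/').reject(&:empty?)
  usable = []
  elements.reverse_each do |element|
    break unless element =~ JAVA_IDENTIFIER
    usable.unshift(element)
  end
  usable
end

p ext_name_candidates('gems/some-gem-1.0/lib/arjdbc/jdbc/adapter_java.jar')
```

Stopping at the first non-identifier element means fewer candidate class names get tried for jars that contain no extension at all.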
<norc>
Cannot think of sensible test things to do with them. :S
<headius>
norc: ?
<norc>
Oh absolutely wrong channel.
<norc>
Sorry. :-)
<nirvdrum>
Heh.
pawnbox has quit [Remote host closed the connection]
bb010g has quit [Quit: Connection closed for inactivity]
<headius>
norc: I was quite confused :-)
<norc>
headius: Somehow I didnt really look at the channel name, saw talk about source code and blindly assumed "well this must be #git alright"
Osho has quit [Ping timeout: 272 seconds]
eam has quit [Ping timeout: 250 seconds]
kith has joined #jruby
pawnbox has joined #jruby
Osho has joined #jruby
eam has joined #jruby
pawnbox has quit [Ping timeout: 240 seconds]
tomjoro has joined #jruby
rsim has joined #jruby
<headius>
y'all wanna eyeball my fix and check my logic?
<headius>
bbrowning: you could try a build with that too
<headius>
this should fix the original issue plus any other non-identifier characters in path, and it will stop trying class names sooner, speeding up loading jars that don't have exts in them at all
<headius>
and I added a test for the logic I added, at least :-)
<bbrowning>
headius: looks reasonable to me - testing now
<lopex>
enebo: all that is now embedded in the bytecode
<lopex>
oniguruma had shared cclasses long ago though
<lopex>
enebo: now imagine a sum of several such unicode ranges
<enebo>
lopex: heh yeah big
<lopex>
it needs to go somewhere else
<lopex>
enebo: that bitset is for fast 7bit lookups
<lopex>
the ranges are binary searched
<lopex>
so that's why the "mix"
<lopex>
cclass-mix
<lopex>
enebo: and there's three opcodes there only
<enebo>
hmmm I can see why this is a generic instr of ranges but can oni ever reduce this if another instr is an intersection?
e_dub has joined #jruby
<enebo>
or union reducing two ranges to just one
<lopex>
enebo: oni will maintain hash of whole cclass ast nodes when shared char class is enabled
<enebo>
like what is [f\p{Word}]
<enebo>
it should just become [f]
<enebo>
oh sorry no
<enebo>
ignore that
<enebo>
but the f should go away since it is subset of Word
<lopex>
so another regexp will reuse that
<lopex>
enebo: well, that's another class then
<lopex>
enebo: you wont reuse it
<enebo>
lopex: but perhaps thinking about reuse at the same time of this is too complicated
<lopex>
enebo: the idea is that lots of char classes are defined as A-Z for example
<enebo>
lopex: or it is forcing a particular design
<lopex>
enebo: it's not too complicated
<lopex>
enebo: the question is as always, economics
<enebo>
is it?
<lopex>
does the real code use it
<enebo>
I mean economics of what though?
<enebo>
memory and time are different dimensions
<enebo>
we frequently have to make decisions to trade one for the other
<lopex>
enebo: for unicode char class alternatives this might be big
<lopex>
bitsets are cheap
<lopex>
but you need to union those ranges
<lopex>
or sum
<enebo>
yeah I guess I was wondering if you have to interp one instr for ‘f’ and one for ‘Word’ in my example above
<enebo>
and then I was wondering if the sharing is eliminating the possibility of eliminating ‘f’ instr
<lopex>
enebo: well, no strong claims here, just thought about improving on that
<enebo>
lopex: yeah I guess you are thinking about an incremental change
<lopex>
enebo: f will be looked up from bitset
<enebo>
lopex: perhaps my suggestion about not worrying about sharing is a big change though
<lopex>
enebo: since it's an ascii
<enebo>
lopex: but I was just giving an abstract example
<enebo>
lopex: instead of ‘f’ pretend it is a more complicated subset of Word
<enebo>
lopex: one which is not 7bit
<lopex>
ok
<enebo>
lopex: and I feel I derailed your topic a bit too
Liothen has joined #jruby
Liothen has quit [Changing host]
<enebo>
lopex: I was curious about whether oni could possibly eliminate work and merging sets was the first thing I thought of
<lopex>
enebo: you mean to lookup two ranges or lookup one summed ranges ?
<lopex>
*range
<enebo>
lopex: summed range as one lookup instr instead of having to perform two and probably removing a jump as well
<lopex>
enebo: yeah, it's merging
<enebo>
lopex: ah cool
<lopex>
enebo: always single range list
<lopex>
always single opcode for cclass
<lopex>
if that's what you're asking
<enebo>
yeah
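The "mix" of a bitset for 7-bit lookups, binary-searched ranges, and merging into a single range list can be sketched as follows. This is an illustration only, under the assumption that ranges are plain `[low, high]` code point pairs; the class name `MixedCharClass` is invented and the real Joni/Oniguruma data layout differs.

```ruby
# Illustrative character class: merged ranges plus a bitset for code points < 128.
class MixedCharClass
  attr_reader :ranges

  def initialize(ranges)
    @ranges = merge(ranges)
    @ascii_bits = 0
    @ranges.each do |low, high|
      # Only the 7-bit part of each range goes into the bitset.
      low.upto([high, 127].min) { |codepoint| @ascii_bits |= (1 << codepoint) }
    end
  end

  def include?(codepoint)
    # Fast path: a single bit test for 7-bit code points.
    return @ascii_bits[codepoint] == 1 if codepoint < 128

    # Slow path: binary search over the sorted, non-overlapping ranges.
    found = @ranges.bsearch do |low, high|
      if codepoint < low then -1
      elsif codepoint > high then 1
      else 0
      end
    end
    !found.nil?
  end

  private

  # Sorts the ranges and folds overlapping or adjacent ones into one entry,
  # so a union of classes still ends up as a single range list.
  def merge(ranges)
    ranges.sort.each_with_object([]) do |(low, high), merged|
      if merged.any? && low <= merged.last[1] + 1
        merged.last[1] = [merged.last[1], high].max
      else
        merged << [low, high]
      end
    end
  end
end

letters = MixedCharClass.new([[0x61, 0x7a], [0x41, 0x5a], [0x58, 0x66], [0x400, 0x4ff]])
p letters.ranges
p letters.include?('f'.ord)
```

In this sketch a subset such as 'f' disappears into the merged range instead of costing a separate lookup.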
<enebo>
lopex: another unrelated thought….and maybe nirvdrum would care if he was here….if bytelist knew widest char if it walks then that would be very useful for oni optimization
<enebo>
so if you walk a mbc string and the most bytes is 2 then Word would only need the ranges which are at most 2 bytes long
<enebo>
most bytes == widest char is 2 bytes
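What "widest char" would mean can be sketched in plain Ruby. This is a hypothetical helper that walks the string each time; the idea above is that ByteList could record the value during a walk it already performs.

```ruby
# Hypothetical helper: the largest number of bytes any single character occupies.
# Returns 0 for an empty string.
def widest_char_bytes(string)
  string.each_char.map(&:bytesize).max || 0
end

p widest_char_bytes("h\u00e9llo")  # "é" takes two bytes in UTF-8
p widest_char_bytes("a\u20ac")     # the euro sign takes three bytes in UTF-8
```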
tcrawley is now known as tcrawley-away
<lopex>
enebo: matching is usually called from ruby String methods
<lopex>
hmm yeh
<enebo>
lopex: I am unsure if any Ruby string methods could make use of that info
<lopex>
enebo: well, we could also use that utf-16 is really ucs-2
<enebo>
lopex: actually I am unsure how useful this is for regexp processing
<lopex>
enebo: so fixed width
<enebo>
lopex: well another flag
<lopex>
enebo: so many opportunities
<enebo>
lopex: if fixed width is noticed while calc’ing cr then we can use that in many string methods
<enebo>
lopex: that one is probably much more useful than what I said although I don’t know how often people work with UCS2
<lopex>
enebo: doh, you dont have that in cr
<lopex>
for utf-16
<lopex>
just valid
<enebo>
lopex: for CR? I don’t think so yeah valid
<enebo>
valid, broken unknown 7bit
<lopex>
well, has_surrogates
<lopex>
that would be it
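A `has_surrogates`-style flag could be computed along these lines. This is a sketch with an invented method name: a string needs surrogate pairs in UTF-16 exactly when it contains a code point above U+FFFF, and without any such code point every character is two bytes wide.

```ruby
# Hypothetical helper: true when the string needs surrogate pairs in UTF-16,
# i.e. when it contains a code point outside the Basic Multilingual Plane.
def needs_surrogates?(string)
  string.each_codepoint.any? { |codepoint| codepoint > 0xFFFF }
end

bmp_only = "abc\u20ac"
astral   = "a\u{1F600}"

p needs_surrogates?(bmp_only)
p needs_surrogates?(astral)
# Without surrogates, the UTF-16 byte size is exactly two bytes per character.
p bmp_only.encode('UTF-16BE').bytesize == bmp_only.length * 2
```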
<lopex>
enebo: so ? the easiest is utf-32
<enebo>
lopex: when you say utf16 you mean utf16-le?
<lopex>
enebo: doesnt matter for regexp
<lopex>
enebo: the treatment is in encoding
<enebo>
lopex: oh both have fixed byte subset just backwards ordering
<enebo>
lopex: gotcha
<enebo>
lopex: so I was thinking not about JRuby but about Nashorn and Java in general
<enebo>
lopex: joni could specialize that subset and improve its performance in a language where we always store data as utf16-le internally
<lopex>
enebo: well, String.charAt is still wrong then
<lopex>
enebo: I guess that's just a decision of semantics
<enebo>
lopex: but yeah maybe it requires more info in String itself to make that work better
<lopex>
enebo: java still has it wrong
<enebo>
lopex: so just think though a little extra info in java.lang.String and they could simplify regexp processing
<lopex>
enebo: I thing they already ignore that, but not sure
<lopex>
*think
<enebo>
ignore what?
<lopex>
enebo: surrogate presence
<enebo>
ah
<lopex>
enebo: so only 16 bit
<lopex>
enebo: but not sure
<enebo>
ok
<enebo>
not our problem anyways
<lopex>
enebo: you can use codePointAt (is that how it's called?)
<lopex>
enebo: but traversal is your responsibility then
<enebo>
lopex: yuck
<lopex>
or separate CharSequence impl
<enebo>
lopex: character data is rough :)
<enebo>
lopex: we NEED ROPEZ
<lopex>
enebo: optimized ropez
jeremyevans has quit [Read error: Connection reset by peer]
<lopex>
enebo: remember that utf-8 fast walking mri has ?
<enebo>
lopex: oh I remember something weird they did with mbc and strLen or something?
<lopex>
enebo: I cant recall what the issues was
<lopex>
*issue
<enebo>
lopex: I don’t totally remember past that it only worked for some cases
<lopex>
enebo: and how it would differ from mri (the unsafe impl in jruby)
<lopex>
*Unsafe
<lopex>
enebo: an idea how to make a "pluggable" code range tables ?
<lopex>
enebo: big5 is a big offender here
<lopex>
probably never used
<lopex>
it's just a binary file in jcodings jar
jeremyevans has joined #jruby
<lopex>
enebo: ruboto suffers from jar sizes
<lopex>
that's the major issue I guess
<lopex>
all those are lazy loaded, but not sure how zip format works
<enebo>
lopex: well it is definitely possible to make them some SPI-like jars perhaps
<lopex>
does it have to skip the stream ?
<enebo>
lopex: yeah I have no idea
<lopex>
does it have any indexing ?
<lopex>
or midjumps
<enebo>
lopex: I don’t remember…for some reason I want to say it doesn't
<lopex>
:)
<enebo>
lopex: pretty amazing to realize how some of these things never got improved after they were initially made
<lopex>
enebo: but if you wanted to optimize zip/jar so most frequently used entries are easiest to access
<lopex>
and those are easy to predict
<enebo>
lopex: yeah true if there is no indexing and something is really big or really slow at io