<headius>
it looks like there's a few options not handled in the case mapping stuff, I'm wondering if that was newer bits or just something you didn't get to
<headius>
specifically there's a check in their case map options function for arg0 == :ascii and then it does things differently
<lopex>
headius: I think it's post 2.5
<headius>
ultimately I believe the problem is that when we do non-7bit case mapping we are not recalculating CR, so I can just add that, but the logic doesn't match MRI trunk
<lopex>
headius: we should match 2.5
<lopex>
or there is a bug
<lopex>
headius: yes that's the case
<headius>
ok that's what I figured...unsure if these specs are intended for 2.5 or after
<lopex>
yeah
<headius>
nirvdrum: apparently you wrote these specs...are they based on 2.5 behavior or 2.6?
<headius>
upcase, downcase, CR checking
<headius>
if it's 2.6 we would not want to fix this in 9.2
<headius>
I mean we should not have invalid CR but we don't have to match MRI
<headius>
line for line
<headius>
if ((flags & Config.CASE_ASCII_ONLY) != 0) {
<headius>
I do see that in our caseMap function
<lopex>
headius: yeah, I guess we don't update the CR when downcasing sharp s to ss for example
<lopex>
so cr can change
<lopex>
from valid to 7bit
<headius>
right
<lopex>
but that shouldn't be visible actually
<lopex>
headius: I'll look at it
<headius>
ok sure
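
A minimal sketch of the CR issue discussed above, assuming well-formed UTF-8 input. The class, constants, and `scan` method are hypothetical names for illustration, not JRuby's actual API. It uses Java's own `String.toUpperCase`, which expands sharp s to "SS", to show a case mapping that turns a string with multibyte characters into a pure 7-bit one, so a cached code range has to be rescanned afterwards.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical sketch; these names are not JRuby's actual API.
class CodeRangeSketch {
    static final int CR_7BIT = 1;  // every byte is below 0x80
    static final int CR_VALID = 2; // well-formed, but contains multibyte characters

    // Rescan the bytes to recompute the code range. Assumes the input is
    // already well-formed UTF-8, so only 7BIT vs VALID has to be decided.
    static int scan(byte[] bytes) {
        for (byte b : bytes) {
            if ((b & 0x80) != 0) return CR_VALID;
        }
        return CR_7BIT;
    }

    static int scan(String s) {
        return scan(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String before = "stra\u00dfe";                  // contains sharp s
        String after = before.toUpperCase(Locale.ROOT); // sharp s expands to "SS"
        System.out.println(before + " is 7bit? " + (scan(before) == CR_7BIT));
        System.out.println(after + " is 7bit? " + (scan(after) == CR_7BIT));
    }
}
```

If the code range computed before the mapping is kept as-is, the result would still be flagged as valid-but-not-7bit even though every byte is now ASCII.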
jrafanie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<lopex>
headius: we talked with ChrisBr about open addressing and one idea came up
<lopex>
headius: like caching the hash in bucket array
<lopex>
but it would have to be boxed Integer unfortunately
<lopex>
headius: and for linear hash we could use triples
<headius>
a three-elt object would be smaller than an array for that
<lopex>
headius: but having boxed integers there somewhat defeats the purpose
<lopex>
headius: you have any idea ?
<headius>
I have not followed the changes so far
<headius>
so the best option we have now does not use RubyHashEntry except as an intermediate holder for certain logic
<lopex>
well, it shouldn't use it anyway
<lopex>
just two arrays
<lopex>
we don't want those entries
<lopex>
at all
<headius>
so the issue is that we have one Object[] array that's supposed to encode all the appropriate data at offset N, N+1, N+2
nelsnelson has joined #jruby
<lopex>
yes
<headius>
we'd like to cache hash there but Java is dum
<lopex>
headius: and triples for linear
<lopex>
[key, value, hash, key, value, hash, ..]
<headius>
ok right
nels has quit [Ping timeout: 240 seconds]
<headius>
yeah I don't know :-\
<lopex>
well, but that boxed int bothers me
<headius>
oh yeah
<lopex>
it will introduce indirection
<headius>
it bothers me too
<headius>
we can't work around that with anything standard array-wise
<headius>
so I guess two thoughts come to mind
<headius>
* small hashes are far more common than big ones and more important to be compact and fast...so they could use a specialized (possibly generated) object to store N entries
<lopex>
right
<lopex>
like with ivars
<headius>
* large hashes already have different algorithms because linear search etc are not going to work, so then it's a question of whether the cost of saving those Integers is worse than recalculating
<headius>
right like ivars
<lopex>
but then we want to minimize the cost of checking the storage mode
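
A rough sketch of the "specialized object for small hashes" idea, under loud assumptions: every type name here is hypothetical, the two-entry size is arbitrary, null keys and deletion are not handled, and `java.util.HashMap` merely stands in for whatever general-purpose storage would be used. The storage mode is never checked explicitly; it is resolved by dispatching on the storage object, which is one possible way to keep that check cheap.

```java
import java.util.HashMap;

// Hypothetical sketch; none of these types exist in JRuby under these names.
interface HashStorage {
    Object get(Object key);
    HashStorage put(Object key, Object value); // may return a larger storage
}

// Two entries held directly in fields: no arrays and no boxed hash codes.
final class Storage2 implements HashStorage {
    private Object k0, v0, k1, v1;

    public Object get(Object key) {
        if (k0 != null && k0.equals(key)) return v0;
        if (k1 != null && k1.equals(key)) return v1;
        return null;
    }

    public HashStorage put(Object key, Object value) {
        if (k0 == null || k0.equals(key)) { k0 = key; v0 = value; return this; }
        if (k1 == null || k1.equals(key)) { k1 = key; v1 = value; return this; }
        // Out of fields: evolve into the general-purpose storage.
        HashStorage general = new GeneralStorage();
        general.put(k0, v0);
        general.put(k1, v1);
        return general.put(key, value);
    }
}

// Stand-in for the large-hash algorithm.
final class GeneralStorage implements HashStorage {
    private final HashMap<Object, Object> map = new HashMap<>();

    public Object get(Object key) { return map.get(key); }

    public HashStorage put(Object key, Object value) {
        map.put(key, value);
        return this;
    }
}

class SmallHashDemo {
    public static void main(String[] args) {
        HashStorage s = new Storage2();
        s = s.put("a", 1);
        s = s.put("b", 2);
        s = s.put("c", 3); // third key forces the switch to GeneralStorage
        System.out.println(s.getClass().getSimpleName() + " " + s.get("c"));
    }
}
```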
<headius>
I have a growing need for a general object specialization library
<headius>
working right now on finally reifying all ivars, as well as evolving objects as new ivars come in
<headius>
it's hard
<headius>
I wonder
<headius>
if we know all objects will right-size eventually maybe the early ones that don't know how many ivars they have should just allocate the var-holding array right away and avoid all the volatility and lazy construction
<headius>
anyway that's separate
<headius>
lopex: tag-along array?
<headius>
it's another indirection but still could be cheaper than storing the Integer
<headius>
so we have both Object[] and int[]
<lopex>
headius: oh you mean keeping key index in that int[]
<headius>
yeah
<lopex>
?
<headius>
well I mean storing whatever you're worried about being Integer
<headius>
hash I thought
<headius>
so you'd get key at obj_ary[N] and value at obj_ary[N+1] and hash at hash_ary[N]
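
The tag-along layout could look roughly like the sketch below. It is an illustration under assumptions, not JRuby's RubyHash: the class name is made up, the table has a fixed power-of-two capacity with no resizing or deletion, and the indexing is one possible choice (slot `i` keeps its key at `entries[2 * i]`, its value at `entries[2 * i + 1]`, and its hash at `hashes[i]`). The point is that the cached hash stays an unboxed `int` while keys and values share one `Object[]`.

```java
// Hypothetical sketch, not JRuby's RubyHash implementation.
class TagAlongHash {
    private final Object[] entries; // key at 2*i, value at 2*i + 1
    private final int[] hashes;     // cached hash for slot i, kept unboxed
    private final int mask;

    // Fixed capacity and no resizing, to keep the sketch short:
    // callers must always leave at least one slot free.
    TagAlongHash(int capacityPowerOfTwo) {
        entries = new Object[capacityPowerOfTwo * 2];
        hashes = new int[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    void put(Object key, Object value) {
        int h = key.hashCode();
        int i = h & mask;
        while (entries[2 * i] != null) {
            // Compare the cached int first; equals() runs only on a hash match.
            if (hashes[i] == h && entries[2 * i].equals(key)) break;
            i = (i + 1) & mask; // linear probing
        }
        entries[2 * i] = key;
        entries[2 * i + 1] = value;
        hashes[i] = h;
    }

    Object get(Object key) {
        int h = key.hashCode();
        int i = h & mask;
        while (entries[2 * i] != null) {
            if (hashes[i] == h && entries[2 * i].equals(key)) return entries[2 * i + 1];
            i = (i + 1) & mask;
        }
        return null;
    }

    public static void main(String[] args) {
        TagAlongHash hash = new TagAlongHash(8);
        hash.put("a", 1);
        hash.put("b", 2);
        hash.put("a", 3); // overwrites the earlier value for "a"
        System.out.println(hash.get("a") + " " + hash.get("b") + " " + hash.get("zz"));
    }
}
```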
<lopex>
I forgot if MRI caches that hash
<lopex>
headius: well, there's another thing
<lopex>
headius: now we have a very dense Hash
<headius>
anyway those are the two options I can think of since we can't do heterogeneous arrays
<lopex>
headius: if we decrease the load factor
<lopex>
then we worry less about collisions
<lopex>
and then we don't need to cache the hash
<lopex>
so that's another thought
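
To get a feel for the load-factor argument, here is a small simulation sketch. It is purely illustrative: the class and method names are made up, the keys are random ints, and it measures only linear probing on successful lookups. A lower load factor means fewer slots inspected per lookup, and therefore fewer key comparisons that a cached hash could have short-circuited.

```java
import java.util.Random;

// Hypothetical sketch for illustration only.
class LoadFactorSketch {
    // Fill a linear-probing table of ints to the given load factor (expected
    // to be above 0 and below 1), then return the average number of slots
    // inspected per successful lookup.
    static double averageProbes(int capacity, double loadFactor, long seed) {
        int[] table = new int[capacity]; // 0 marks an empty slot
        int n = (int) (capacity * loadFactor);
        int[] keys = new int[n];
        Random random = new Random(seed);
        for (int k = 0; k < n; k++) {
            int key = random.nextInt(Integer.MAX_VALUE - 1) + 1; // never 0
            keys[k] = key;
            int i = key % capacity;
            while (table[i] != 0) i = (i + 1) % capacity;
            table[i] = key;
        }
        long probes = 0;
        for (int key : keys) {
            int i = key % capacity;
            probes++;
            while (table[i] != key) {
                i = (i + 1) % capacity;
                probes++;
            }
        }
        return (double) probes / n;
    }

    public static void main(String[] args) {
        for (double lf : new double[] {0.25, 0.5, 0.75, 0.9}) {
            System.out.println(lf + " -> " + averageProbes(1 << 12, lf, 42L));
        }
    }
}
```

The trade-off is memory: a sparser table spends more empty slots to save those comparisons.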
<lopex>
headius: I don't see MRI keeping the hash there
<lopex>
oh it does in st_table_entry
<headius>
do we know what straight-line perf impact MRI had from this move?
<headius>
I mean...they almost never accept something that slows down, but on average they may have accepted slower fetch...maybe?
<headius>
ChrisBr: you around?
<headius>
we should chat so I am sure I'm on the same page
<lopex>
headius: it does cache the hash
<headius>
ok
nelsnelson has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<headius>
so then the int[] seems like it's the lowest-cost for arbitrary-sized hashes with ivar-like generated data structs for specific smaller sizes
<lopex>
yeah, sounds like the best option
slyphon has quit [Remote host closed the connection]