apeiros changed the topic of #ruby-lang to: Nick registration required to talk || Ruby 2.0.0-p247: http://ruby-lang.org (Ruby 1.9.3-p448) || Paste >3 lines of text on http://gist.github.com
<captain_chen>
Each element is an array right?
<drbrain>
for scan, if you don't capture, each element will be a string
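A minimal sketch of the point above, using made-up sample text in the ".I" layout discussed in this log: String#scan returns plain strings when the pattern has no capture groups, and arrays of captures when it has at least one.

```ruby
# Hypothetical sample text; only the ".I <number>" lines matter here.
text = ".I 1\n.T\nFirst title\n.I 2\n.T\nSecond title\n"

# No capture group: each element is the matched String.
without_groups = text.scan(/^\.I \d+/)

# With a capture group: each element is an Array holding the captures.
with_groups = text.scan(/^\.I (\d+)/)

p without_groups
p with_groups
```

So whether each element is an array depends only on whether the regex captures anything.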
apeiros has quit [Remote host closed the connection]
<drbrain>
can you show what you have so far?
tkuchiki has joined #ruby-lang
apeiros has joined #ruby-lang
<captain_chen>
I haven't done much yet
hramrach has joined #ruby-lang
<captain_chen>
All I've done is change the regex to grab the I since that might be useful for getting the ID
<drbrain>
I would grab ID and document separately
<drbrain>
at least to start
<captain_chen>
I'm doing that whole
<drbrain>
it is possible to grab them together
<captain_chen>
Grab the whole document and make an array of documents
<drbrain>
still, what do you have?
CoreData has quit [Ping timeout: 246 seconds]
<drbrain>
yes, use scan to grab the whole document and make an array of documents
apeiros has quit [Read error: Operation timed out]
nertzy has quit [Quit: This computer has gone to sleep]
<captain_chen>
Using that array of documents, iterate through them to grab the ID and also the text.
<drbrain>
yes
iliketurtles has joined #ruby-lang
<captain_chen>
It'd make sense for me to do say, id = documents.each { |e| <something here to see if it's an ID> }?
jewing has quit [Remote host closed the connection]
<drbrain>
I think:
<drbrain>
documents.each do |d| id = get_id_from d; text = get_text_from d; tokens = tokenize text; update_index index, id, tokens; end
<drbrain>
so each document (d) would look like ".I 1\n.T\nBlah Blah title …"
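One way the loop above could be fleshed out, as a sketch only. The helper names (get_id_from, get_text_from, tokenize, update_index) come from drbrain's line; their bodies, the sample data, and the document-splitting regex are assumptions, including the ".W" abstract marker that captain_chen describes later in the log.

```ruby
# Hypothetical sample in the ".I / .T / .W" layout described in the chat.
content = ".I 1\n.T\nCat planet\n.W\nA cat visits a planet.\n" \
          ".I 2\n.T\nDog planet\n.W\nA dog visits the same planet.\n"

# Grab each whole document: from one ".I n" line up to the next one (or the end).
documents = content.scan(/^\.I \d+\n.*?(?=^\.I \d+\n|\z)/m)

def get_id_from(document)
  document[/^\.I (\d+)/, 1].to_i
end

def get_text_from(document)
  # Drop the ".I n" header and the ".T" / ".W" marker lines, keep the rest.
  document.sub(/\A\.I \d+\n/, '').gsub(/^\.[TW]\n/, '')
end

def tokenize(text)
  text.downcase.scan(/\w+/)
end

def update_index(index, id, tokens)
  tokens.uniq.each { |token| index[token] << id }
end

# term => list of document ids containing it
index = Hash.new { |hash, key| hash[key] = [] }

documents.each do |d|
  id     = get_id_from d
  text   = get_text_from d
  tokens = tokenize text
  update_index index, id, tokens
end

p index['planet']
```

Keeping each step in its own small method makes it easy to try them one at a time in irb.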
<captain_chen>
Yeah so
<captain_chen>
Would I use another regexp to see if the element contains the .I [0-9]
<drbrain>
yes
<captain_chen>
so in the block, it'd be |e| e.match(/^(.i [0-9])/) or something?
<drbrain>
almost
<drbrain>
the regular expression should be /^.I ([0-9]+)/
<drbrain>
you can shorten it to /^.I (\d+)/
<drbrain>
oh, and you want to escape the period because it matches anything
<drbrain>
so: /^\.I (\d+)/
<captain_chen>
I probably should learn regexps after this lol
<drbrain>
and you can use String#[] with a regular expression:
imperator has left #ruby-lang ["Leaving"]
<drbrain>
id = document[/^\.I (\d+)/, 1]
<drbrain>
it will be a String instead of an Integer, so you'll need to convert it
grough has quit []
<drbrain>
well, may need to convert it
<captain_chen>
hm
digs has joined #ruby-lang
digs is now known as Guest65918
<captain_chen>
So would this not work then? id = documents.each { |e| e.match(/^\.I (\d+)/)}
<captain_chen>
Oh wait.
<drbrain>
captain_chen: it will, but you'll need to get the id out of the match data object #match returns
<drbrain>
String#[] lets you match and extract the id in one operation instead of two
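A short comparison of the two approaches, as a sketch with a made-up document string. One detail worth knowing: #each returns its receiver, so `id = documents.each { ... }` would assign the documents array itself; #map is what collects one result per document.

```ruby
document = ".I 1\n.T\nBlah Blah title\n"

# Two steps: #match returns a MatchData (or nil), then index into it.
match = document.match(/^\.I (\d+)/)
id_from_match = match && match[1]

# One step: String#[] with a capture index returns the captured String (or nil).
id_from_brackets = document[/^\.I (\d+)/, 1]

# Both results are Strings; convert when an Integer is needed.
id = id_from_brackets.to_i

# Collect one id per document with #map rather than #each.
documents = [document, ".I 2\n.T\nOther title\n"]
ids = documents.map { |d| d[/^\.I (\d+)/, 1].to_i }

p id_from_match, id_from_brackets, id, ids
```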
grough has joined #ruby-lang
tylersmi_ has quit [Remote host closed the connection]
tylersmith has joined #ruby-lang
jsullivandigs has quit [Ping timeout: 264 seconds]
ssb123 has joined #ruby-lang
<captain_chen>
Okay so this is what I'm trying to do.
<drbrain>
captain_chen: you can try in irb too
<captain_chen>
Maybe I should just repaste what I have.
<captain_chen>
But I load in the content, which is that test.txt file.
<captain_chen>
Yeah I should learn Regular Expressions after I get this working.
<captain_chen>
They're pretty useful if used properly.
schaerli has quit [Ping timeout: 240 seconds]
<captain_chen>
So I'm assuming that before that documents.each, I make the hash then inside I start putting in the keys and values
bzalasky has joined #ruby-lang
<drbrain>
yes
<drbrain>
you should get tokenizing working first
<captain_chen>
Well since text is just a string, you can just split it right?
<captain_chen>
And this is where I can do gsubs on removing question marks etc.
<drbrain>
yes
<drbrain>
you can also use text.scan(/\w+/)
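A quick sketch of the three tokenizing options mentioned here (split, gsub then split, and scan), on a made-up sentence. Note that \w+ treats an apostrophe as a separator, so "cat's" becomes two tokens.

```ruby
text = "What is a cat? A cat's planet, of course!"

# Plain split on whitespace keeps punctuation attached to the words.
split_tokens = text.split

# gsub can strip chosen punctuation first, then split.
cleaned_tokens = text.gsub(/[?!,.]/, '').split

# scan(/\w+/) collects runs of word characters directly.
scanned_tokens = text.scan(/\w+/)

p split_tokens
p cleaned_tokens
p scanned_tokens
```

Which one fits depends on how the queries will be normalized; whatever is done to the documents should also be done to the query term.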
richardburton has joined #ruby-lang
grough has quit []
<captain_chen>
Forgot to mention, thank you for your help thus far.
<drbrain>
np
<captain_chen>
I'll make a 'choose your own adventure' in HTML5 eventually and you can maybe play it
jarm has quit [Ping timeout: 240 seconds]
<drbrain>
:)
<captain_chen>
If you want, that is lol.
<captain_chen>
I'm more comfortable with hypertext markup, css and graphics honestly.
<captain_chen>
Now that I think about it
<captain_chen>
I should have thought of extracting the documentID, title and abstract like the box model
<captain_chen>
versus words in a bag
<drbrain>
yes, it's much like that
<drbrain>
but unlike the DOM, you have to do your own parsing
robbyoconnor has joined #ruby-lang
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<captain_chen>
I think I like html & css because it's instant gratification of your markup displayed in a browser versus coding but I shouldn't really complain.
grough has joined #ruby-lang
grough has quit [Client Quit]
bzalasky has quit [Remote host closed the connection]
bzalasky has joined #ruby-lang
lsegal has joined #ruby-lang
iliketurtles has quit [Quit: zzzzz…..]
bzalasky_ has joined #ruby-lang
bzalasky has quit [Read error: Connection reset by peer]
ssb123 has joined #ruby-lang
<captain_chen>
What was that thing about hashes again? You could have as many keys point to the same value?
eponymi has joined #ruby-lang
lsegal has quit [Client Quit]
nisstyre has quit [Quit: Leaving]
bzalasky_ has quit [Remote host closed the connection]
bzalasky has joined #ruby-lang
lsegal has joined #ruby-lang
richardburton has left #ruby-lang [#ruby-lang]
bzalasky has quit [Ping timeout: 256 seconds]
ssb123 has quit [Ping timeout: 245 seconds]
hashkey has joined #ruby-lang
grough has joined #ruby-lang
bzalasky has joined #ruby-lang
hogeo has quit [Quit: Leaving...]
mistym has quit [Remote host closed the connection]
mbj_ has joined #ruby-lang
bzalasky has quit [Remote host closed the connection]
lsegal has quit [Quit: Quit: Quit: Quit: Stack Overflow.]
ikrima has quit [Quit: Computer has gone to sleep.]
rickhull has joined #ruby-lang
<rickhull>
slightly offtopic, is there a good way in bash to refer to directory that the executing script lives in?
<rickhull>
the bash equivalent of File.expand_path('..', __FILE__) I suppose
<rickhull>
e.g. so i can cd into a subdir relative to the executing script, rather than cd into a subdir relative to $PWD, which is where the script was called from
nertzy has quit [Quit: This computer has gone to sleep]
<rickhull>
my goal is to be able to drop this "entry" script anywhere on the filesystem, and then someone from the outside, if they can find their way to the entry script, the entry script knows how to get to everything else, project-wise, relative to that entry script
<rickhull>
i.e. sets up some project globals that other scripts/components can use
<rickhull>
my concern is that bash is simply completely ill-suited for something like this, but part of me really doesn't believe that's true
<rickhull>
taking a step back, this is a repository of test scripts, most of which are implemented in bash. and everything lives in one dir and shares lots of promiscuous state
<rickhull>
i'm adding python and ruby capabilities to this repo. but bash is the underlying lingua franca
<rickhull>
this repo is primarily consumed by jenkins, which has a bash interface for the job def
<rickhull>
so one of the first things jenkins needs to do, with the new ruby stuff, is find the Gemfile, and then execute bundler properly
<rickhull>
i was hoping to abstract this a bit, with a helper script inside the repo
<rickhull>
so that jenkins doesn't need to know promiscuous details about the repo structure
headius has joined #ruby-lang
<rickhull>
i.e. jenkins would just call ./ruby/setup.sh
<rickhull>
er, ./ruby/setup.bash (w/e)
<rickhull>
but setup.bash needs to know how to find the Gemfile, relative to where it lives
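For comparison, the Ruby idiom rickhull mentions can be sketched as below. The helper names and the /opt/project layout are hypothetical, and Unix-style paths are assumed; in a real script the argument would be __FILE__ (Ruby 2.0 also provides __dir__ for the same purpose).

```ruby
# Hypothetical helper: the directory that contains the given script path.
def script_dir(script_path)
  File.expand_path('..', script_path)
end

# Hypothetical layout: the Gemfile sits next to the setup script.
def gemfile_for(script_path)
  File.join(script_dir(script_path), 'Gemfile')
end

puts gemfile_for('/opt/project/ruby/setup.rb')
```

The result depends only on where the script lives, not on the directory it was called from.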
<rickhull>
am I just Doing It Wrong (™)?
<rickhull>
and setup.bash has all of the bundler install parameters to do the right thing
<rickhull>
part of me thinks we should just implement all of our test scripts in python or ruby, and completely eschew bash
<rickhull>
but bash seems to be the underlying layer. it's useful for piping and redirection and blocking on `nc` etc.
<rickhull>
it's much more natural for bash to call a python script than for python to call a bash script
nathanstitt has quit [Quit: I growing sleepy]
schaerli has joined #ruby-lang
<rickhull>
the other problem i'm having, is that i potentially want every bash script to be able to find things relative to the dir it lives in
<rickhull>
but i don't want to cargocult copypasta that horrible DIR line everywhere
<rickhull>
so ideally i could define it once, and then reference it. but you have the self-referential problem that the place that defines the containing-dir process will just execute according to where the definition lives, not who called it
<rickhull>
is it that unreasonable for X to want to find Y, within a project, relative to where X lives?
<rickhull>
shouldn't we, or don't we, have better tools for this?
<rickhull>
/rant
<rickhull>
i'd really like to get some feedback, but i gotta run. i'll plan to log back in from home
ikrima has joined #ruby-lang
schaerli has quit [Ping timeout: 245 seconds]
nathanstitt has joined #ruby-lang
shinnya has quit [Read error: Operation timed out]
ikrima has quit [Ping timeout: 245 seconds]
ikrima has joined #ruby-lang
ikrima has quit [Remote host closed the connection]
Coincidental has quit [Remote host closed the connection]
Coincidental has joined #ruby-lang
Coincidental has quit [Ping timeout: 260 seconds]
ykk` has joined #ruby-lang
nvg has quit [Quit: Changing servers]
Bosox20051 has joined #ruby-lang
RickHull1 has joined #ruby-lang
<RickHull1>
did i miss any good feedback?
headius has quit [Quit: headius]
ssb123 has joined #ruby-lang
ssb123 has quit [Remote host closed the connection]
ssb123 has joined #ruby-lang
AOrtenzi has joined #ruby-lang
nertzy has quit [Quit: This computer has gone to sleep]
ssb123 has quit [Ping timeout: 245 seconds]
Guest65918 has quit [Remote host closed the connection]
mdedetrich has quit [Quit: Computer has gone to sleep.]
jarm has joined #ruby-lang
AOrtenzi has quit []
lun___ has joined #ruby-lang
lun__ has quit [Ping timeout: 264 seconds]
kgrz has joined #ruby-lang
kgrz has quit [Remote host closed the connection]
Tearan has joined #ruby-lang
kgrz has joined #ruby-lang
ssb123 has joined #ruby-lang
ssb123 has quit [Remote host closed the connection]
ssb123 has joined #ruby-lang
cyndis_ has joined #ruby-lang
kgrz has quit [Read error: Connection reset by peer]
cyndis has quit [Remote host closed the connection]
kgrz has joined #ruby-lang
kgrz has quit [Read error: Connection reset by peer]
kgrz has joined #ruby-lang
kgrz_ has joined #ruby-lang
kgrz_ has quit [Client Quit]
ssb123 has quit [Ping timeout: 245 seconds]
dhruvasagar has joined #ruby-lang
charliesome has joined #ruby-lang
mistym has joined #ruby-lang
Tearan has quit [Quit: Sleepy Badger....]
Tearan has joined #ruby-lang
headius has joined #ruby-lang
apeiros has quit [Remote host closed the connection]
apeiros has joined #ruby-lang
Tearan has quit [Quit: Sleepy Badger....]
retro|cz has joined #ruby-lang
dhruvasagar has quit [Ping timeout: 260 seconds]
richardburton has joined #ruby-lang
robbyoconnor has joined #ruby-lang
Tearan has joined #ruby-lang
Guest66192 has joined #ruby-lang
Kabaka has quit [Ping timeout: 240 seconds]
Coincidental has joined #ruby-lang
Tearan has quit [Client Quit]
kenta_ has joined #ruby-lang
richardburton has quit [Ping timeout: 240 seconds]
_jpb_ has quit [Remote host closed the connection]
jithu has joined #ruby-lang
lsegal has quit [Quit: Quit: Quit: Quit: Stack Overflow.]
robbyoconnor has quit [Read error: Connection reset by peer]
Bosox20051 has quit [Read error: Connection reset by peer]
ykk` has quit [Quit: ykk`]
Bosox20051 has joined #ruby-lang
captain_chen has quit [Quit: Page closed]
richardburton has joined #ruby-lang
ssb123 has joined #ruby-lang
kgrz has quit [Remote host closed the connection]
ssb123 has quit [Ping timeout: 245 seconds]
kgrz has joined #ruby-lang
lsegal has joined #ruby-lang
lsegal has quit [Client Quit]
kgrz has quit [Ping timeout: 245 seconds]
<xybre>
rickhull: That tldr line is the best way to do that in bash, yes.
<xybre>
rickhull: Bash is really powerful and can do a hell of a lot; it's fast and it's worth learning.
ged has quit [Read error: Connection reset by peer]
chinno998 has joined #ruby-lang
chinno998 has left #ruby-lang [#ruby-lang]
ged has joined #ruby-lang
xrq has quit [Remote host closed the connection]
xrq has joined #ruby-lang
lun___ has quit [Remote host closed the connection]
lun__ has joined #ruby-lang
jithu has quit [Quit: Mother, did it need to be so high?]
lun__ has quit [Read error: Connection reset by peer]
kgrz has quit [Read error: Connection reset by peer]
<tbuehlmann>
do you have to use such a bad data structure? :\
<captain_chen>
I'm not too great with data structures, sorry.
<captain_chen>
Do you have a better idea?
<tbuehlmann>
I don't understand what you're doing at all. so you have a text file, what are you doing with it?
<captain_chen>
I'm extracting the documentID number, it's next to the .I
<captain_chen>
I'm also extracting the text under the .T (title) and .W (abstract)
<captain_chen>
I get the docID, and the text from the tittle and abstract to tokenize.
<captain_chen>
*title
<captain_chen>
Then I pass these values into the hash.
<captain_chen>
The hash should then have a list of terms that point to the documentID that has the term in it.
tonni has joined #ruby-lang
CoreData has joined #ruby-lang
hramrach has quit [Remote host closed the connection]
hramrach has joined #ruby-lang
<tbuehlmann>
um, what if only "algebraic" would be in document 3? would you then have to remove it from the other key?
<captain_chen>
If the term appears in other documents, it should point to that document too.
<captain_chen>
Maybe my data structure doesn't work well after all.
<tbuehlmann>
do you need to have it that way? I mean, do you need to group title words that way?
jacktrick has joined #ruby-lang
<captain_chen>
term to its documentID, yes.
<captain_chen>
If you're talking about the file where I parse in the information, it's preformatted in that way.
<tbuehlmann>
naw
<tbuehlmann>
you want to group as many document words as possible per documents?
<captain_chen>
Would that work properly with my current data structure though?
<tbuehlmann>
I still don't get the gist of it..
<captain_chen>
Sorry, I'm making an information retrieval system.
<captain_chen>
I take in a collections file, that test.txt is a small-scale model of the much larger collection.
<captain_chen>
And parse in information such as the document's ID, its title and abstract.
<captain_chen>
So in that test.txt, there are two "documents"
<captain_chen>
With document IDs, 1 and 2
<tbuehlmann>
right
<captain_chen>
Both have a title which I parse in through the program then tokenize them to be put into the hash.
<tbuehlmann>
why tokenize?
charliesome has joined #ruby-lang
<captain_chen>
Going to be matched up with a query
<captain_chen>
Of a single term
<captain_chen>
So say the hash currently has: {["cat", "planet"] => ["1, 2"], "dog" => ["2"]}
schaerli has joined #ruby-lang
<captain_chen>
Wait, oh
<captain_chen>
I blame the tiredness, I just realized that makes no sense.
<tbuehlmann>
I hope so, because I don't understand either :)
<captain_chen>
The gist is, the program receives a single term
CoreData has quit [Quit: CoreData]
<captain_chen>
It checks it against the data structure and sees which document contains the term.
CoreData has joined #ruby-lang
<tbuehlmann>
and you want a clever data structure to minimize the time running?
<captain_chen>
That or easy to understand.
<captain_chen>
Or both.
<tbuehlmann>
then, why not something like this: {'1' => ['Preliminary', 'Report-International', 'Algebraic', 'Language'], ...}
<captain_chen>
I just need a data structure that can do what I'm trying to achieve. That sounds about right.
apeiros has quit [Remote host closed the connection]
<tbuehlmann>
then, loop through the values and search for terms for a query
apeiros has joined #ruby-lang
<captain_chen>
So just flip the key and values?
<captain_chen>
I'm still stuck with duplicate terms appearing and not pointing to the documents that also have them
<captain_chen>
Like doc1: cat planet, doc2: dog planet, if I search 'planet' it should point to both doc1 & doc2
adambeynon has joined #ruby-lang
schaerli has quit [Remote host closed the connection]
<tbuehlmann>
let's pretend you have it like {'1' => ['Preliminary', 'Report-International', 'Algebraic', 'Language'], ...}
<tbuehlmann>
the document pointing to the terms
schaerli has joined #ruby-lang
<captain_chen>
Oka.
<captain_chen>
-y
chinno998 has joined #ruby-lang
chinno998 has left #ruby-lang [#ruby-lang]
<tbuehlmann>
and you search for "Algebraic", you would simply loop through all values and get the key, if the term appears. eventually you get all document ids where the term Algebraic appears
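The lookup described here can be sketched as follows. The data and the method name documents_containing are made up for illustration, using captain_chen's cat/dog/planet example.

```ruby
# Hypothetical data in the document id => terms shape suggested above.
docs = {
  '1' => %w[cat planet],
  '2' => %w[dog planet]
}

# Loop over every entry and keep the ids whose term list includes the query.
def documents_containing(docs, term)
  docs.select { |_id, terms| terms.include?(term) }.keys
end

p documents_containing(docs, 'planet')
p documents_containing(docs, 'cat')
```

This scans every document on each query, which is simple but gets slower as the collection grows.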
schaerli has quit [Ping timeout: 245 seconds]
<captain_chen>
Oh, does it still work out?
<captain_chen>
My current data structure.
<captain_chen>
Oh, I see what you mean.
<captain_chen>
It's just not grouped together
<tbuehlmann>
right
<captain_chen>
i.e. ['Algebraic'] => ["2","3"]
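The grouped shape (term => document ids) can be built from the document => terms hash in one pass. This is a sketch with made-up data, not necessarily the approach either participant had in mind.

```ruby
# Hypothetical document id => terms data.
docs = { '1' => %w[cat planet], '2' => %w[dog planet] }

# Each missing term starts with its own empty list of ids.
index = Hash.new { |hash, term| hash[term] = [] }

docs.each do |id, terms|
  terms.uniq.each { |term| index[term] << id }
end

p index['planet']
```

With this shape, a single-term query is one hash lookup, and a term that appears in several documents points to all of them.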
<captain_chen>
There's a command to group, no?
<captain_chen>
But anyway, thanks for clarifying what should have been clear the first time.
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<captain_chen>
If I see you around, I will let you try a "choose your own adventure" game in HTML5 once I get started on it.
<captain_chen>
It's 5 AM so I'll be going now, thanks.
captain_chen has quit [Quit: Page closed]
<tbuehlmann>
just a sec
<tbuehlmann>
ar!
mistym has quit [Remote host closed the connection]
banisterfiend has joined #ruby-lang
CoreData has quit [Ping timeout: 256 seconds]
rippa has quit [Ping timeout: 248 seconds]
ndrst has joined #ruby-lang
rippa has joined #ruby-lang
CoreData has joined #ruby-lang
sevvie has joined #ruby-lang
workmad3 has joined #ruby-lang
rippa has quit [Quit: {#`%${%&`+'${`%&NO CARRIER]
CoreData has quit [Ping timeout: 256 seconds]
tharindu has joined #ruby-lang
kenta_ has quit [Remote host closed the connection]
yxhuvud has quit [Remote host closed the connection]
nertzy has quit [Quit: This computer has gone to sleep]
Forgetful_Lion has quit [Ping timeout: 246 seconds]
lfox has joined #ruby-lang
yxhuvud has joined #ruby-lang
Forgetful_Lion has joined #ruby-lang
gjaldon has quit [Remote host closed the connection]
richardburton has quit [Quit: Leaving.]
richardburton has joined #ruby-lang
yxhuvud has quit [Quit: Leaving]
schaerli has quit [Remote host closed the connection]
schaerli has joined #ruby-lang
schaerli has quit [Read error: Connection reset by peer]
schaerli has joined #ruby-lang
Forgetful_Lion has quit [Ping timeout: 246 seconds]
Forgetful_Lion has joined #ruby-lang
jithu has joined #ruby-lang
nisstyre has quit [Quit: Leaving]
yxhuvud has joined #ruby-lang
iliketurtles has joined #ruby-lang
Elico has quit [Read error: Connection reset by peer]
Forgetful_Lion has quit [Ping timeout: 246 seconds]
Forgetful_Lion has joined #ruby-lang
bzalasky has joined #ruby-lang
elia has quit [Quit: Computer has gone to sleep.]
benanne has quit [Quit: kbai]
Elico has joined #ruby-lang
nofxx has quit [Ping timeout: 252 seconds]
schaerli has quit [Remote host closed the connection]
schaerli has joined #ruby-lang
richardburton has quit [Quit: Leaving.]
schaerli has quit [Ping timeout: 252 seconds]
bzalasky has quit [Remote host closed the connection]
nhmood_ has quit [Quit: leaving]
bzalasky has joined #ruby-lang
Forgetful_Lion has quit [Ping timeout: 246 seconds]
bzalasky has quit [Read error: Connection reset by peer]
Elico has quit [Ping timeout: 252 seconds]
nhmood has joined #ruby-lang
bzalasky has joined #ruby-lang
Forgetful_Lion has joined #ruby-lang
malev has joined #ruby-lang
Elico has joined #ruby-lang
bzalasky has quit [Remote host closed the connection]
bzalasky has joined #ruby-lang
enebo has joined #ruby-lang
mistym has joined #ruby-lang
enebo has quit [Client Quit]
Forgetful_Lion has quit [Ping timeout: 246 seconds]
bzalasky has quit [Ping timeout: 246 seconds]
Forgetful_Lion has joined #ruby-lang
Naeblis has joined #ruby-lang
Forgetful_Lion has quit [Ping timeout: 246 seconds]
<Naeblis>
How can I include other files in Ruby? load "filename.rb" seems to work but require "filename" gives me a LoadError
<Naeblis>
(ruby 2.0.0)
Forgetful_Lion has joined #ruby-lang
deweichen has left #ruby-lang [#ruby-lang]
gazzik has joined #ruby-lang
kek has joined #ruby-lang
Forgetful_Lion has quit [Ping timeout: 246 seconds]
Forgetful_Lion has joined #ruby-lang
workmad3 has quit [Ping timeout: 240 seconds]
<yorickpeterse>
. is no longer in the load path
<yorickpeterse>
so you need `require './filename'`
<yorickpeterse>
or use require_relative
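A self-contained sketch of the behaviour described here. It writes a throwaway file (the name zz_load_path_demo is made up) into a temporary directory, shows that a bare name is only searched for in $LOAD_PATH, and that an explicit path works. Inside a real source file, require_relative 'filename' resolves relative to that file instead.

```ruby
require 'tmpdir'

Dir.mktmpdir do |dir|
  # Hypothetical file name, chosen so it does not clash with a real library.
  File.write(File.join(dir, 'zz_load_path_demo.rb'), "LOAD_PATH_DEMO = :loaded\n")

  begin
    require 'zz_load_path_demo' # bare name: searched for in $LOAD_PATH only
  rescue LoadError => e
    puts "bare require failed: #{e.class}"
  end

  # An explicit path (absolute here; ./ and ../ also count) skips $LOAD_PATH.
  require File.join(dir, 'zz_load_path_demo')
end

p LOAD_PATH_DEMO
```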
<Naeblis>
yorickpeterse: got it
kalesage has joined #ruby-lang
brianpWins has joined #ruby-lang
mbj has joined #ruby-lang
Forgetful_Lion has quit [Ping timeout: 246 seconds]
Forgetful_Lion has joined #ruby-lang
jbsan has joined #ruby-lang
Forgetful_Lion has quit [Ping timeout: 246 seconds]
<apeiros>
yorickpeterse: uh? I don't think require './foo' works
Forgetful_Lion has joined #ruby-lang
<Tearan>
Gah! I hate it when I can't get my tests working.
malev has quit [Remote host closed the connection]
<yorickpeterse>
apeiros: it does
<apeiros>
waaah!
<Tearan>
*nods nods* it does
<apeiros>
indeed it does. I'm appalled.
<Tearan>
why? It's a feature
kek has quit [Remote host closed the connection]
<yorickpeterse>
apeiros: remember that require uses LOAD_PATH and/or the full file path
<yorickpeterse>
so require '../derp' also works
Forgetful_Lion has quit [Ping timeout: 246 seconds]
bfleischer has joined #ruby-lang
<apeiros>
"the full file path" is equivalent to "relative to .", which is what was removed from $LOAD_PATH in the first place. it may now be a bit more explicit but I find it rather inconsistent.
jithu has quit [Quit: Mother, did it need to be so high?]
<apeiros>
(except for absolute paths, which neither './foo' nor '../foo' are)
<yxhuvud>
hmm. Demon trident of poison at d9. is it worth switching from war axe ?
<yxhuvud>
nm, wrong channel :)
mac___ has joined #ruby-lang
Squarepy has quit [Quit: Leaving]
<captain_chen>
lol
<captain_chen>
There's a d&d irc on here?
eponymi_ has joined #ruby-lang
<omninonsense>
captain_chen: I can't stop giggling at his slip
<captain_chen>
omninonsense: at least it wasn't erp
<captain_chen>
that would have been more awkward
sevvie has quit [Ping timeout: 256 seconds]
Gaelan is now known as GaelanAintAround
Forgetful_Lion has quit [Ping timeout: 246 seconds]
<omninonsense>
What's erp? Does it stand for what I think it stands???!!
Forgetful_Lion has joined #ruby-lang
fijimunkii has joined #ruby-lang
<captain_chen>
aka cybering
<omninonsense>
captain_chen: Yep... Just as I thought. erp is short for "erotic roleplay," I guess?
mistym has quit [Ping timeout: 240 seconds]
metus_violarium has joined #ruby-lang
rippa has quit [Quit: {#`%${%&`+'${`%&NO CARRIER]
<captain_chen>
D&D huh? Maybe I'll check it out later.
lsegal has quit [Read error: Connection reset by peer]
lsegal has joined #ruby-lang
<captain_chen>
Quick question: index = Hash.new { |term, id| index[id] = [] } is a hash that is pointing to an array correct?
grough has joined #ruby-lang
<omninonsense>
I'm not sure what that code does... It should probably look like: Hash.new { |term, id| term[id] = [] }
Forgetful_Lion has quit [Ping timeout: 246 seconds]
<omninonsense>
It will invoke the block each time you try to access a hash entry that doesn't exist (it uses the block to generate the default value)
<captain_chen>
So the default is an empty array?
eponymi_ is now known as eponymi
<omninonsense>
Yes. Also "term" is in fact the hash object itself inside the block.
Forgetful_Lion has joined #ruby-lang
<omninonsense>
But, not just *an* empty array. It's a new empty array each time. If you used Hash.new([]), all missing entries would share the same array (they would all have the same object_id)
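The difference can be seen in a few lines. This is a small sketch; the variable names are arbitrary.

```ruby
# One default array shared by every missing key.
shared = Hash.new([])
shared[:a] << 1 # mutates the shared default; no key is stored in the hash

# The block runs per missing key and stores a fresh array under it.
fresh = Hash.new { |hash, key| hash[key] = [] }
fresh[:a] << 1

p shared[:b]   # the same array that :a appended to
p shared.keys  # still empty
p fresh[:b]    # a separate, empty array
p fresh.keys
```

With the shared default, appending through one key appears to affect every other missing key, which is rarely what an index wants.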
<captain_chen>
Right, right. Thanks.
<captain_chen>
I'm fairly close to my goal now.
<captain_chen>
I'm just not sure how I would shove in these values
Forgetful_Lion has quit [Ping timeout: 246 seconds]
Forgetful_Lion has joined #ruby-lang
<captain_chen>
Yeah that's what I have currently, I just need to have more things like term frequency, the title of the document in which the term appears and the abstract.
schaerli has quit [Remote host closed the connection]
adambeynon has joined #ruby-lang
sevvie has joined #ruby-lang
schaerli has joined #ruby-lang
schaerli has quit [Read error: Operation timed out]
mistym has joined #ruby-lang
sevvie has quit [Ping timeout: 245 seconds]
<omninonsense>
Hmm, I have just quickly scanned most of it, so I might be wrong, but I think you can do: index[token] += [ [doc_id.to_i, title, abstract] ]
<omninonsense>
instead of index[token] << doc_id.to_i
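A sketch of what that suggestion could look like in context. The sample values are made up, and the index is assumed to use the block default discussed earlier. Both `+=` (builds a new array and reassigns) and `<<` (appends in place) end up storing the entry.

```ruby
index = Hash.new { |hash, key| hash[key] = [] }

# Hypothetical values for one parsed document.
doc_id   = '1'
title    = 'Cat planet'
abstract = 'A cat visits a planet.'

tokens = "#{title} #{abstract}".downcase.scan(/\w+/)

# Store one [id, title, abstract] entry per distinct token.
tokens.uniq.each do |token|
  index[token] += [[doc_id.to_i, title, abstract]]
end

# Appending in place gives the same shape of entry.
index['dog'] << [2, 'Dog planet', 'A dog visits the same planet.']

p index['cat']
p index['dog']
```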