josh-k_ has quit [Remote host closed the connection]
<maasha>
yorickpeterse: (I am doing a Ruby mochup of bash pipes: cmd1 | cmd2 | cmd3 ... | cmdn
<maasha>
)
<maasha>
mockup*
<yorickpeterse>
If it's Bash pipes you'd actually _want_ to run those in sequence
<yorickpeterse>
that's the whole point of them
josh-k has joined #rubinius
<maasha>
yeah, so the deal is I have a bunch of records that I want to run through a number of commands each manipulating a record in what may be heavy computational steps.
<maasha>
yorickpeterse: I suppose like this: https://gist.github.com/maasha/0ab9c18880e2a84dacaa if you imagine that (e + 1) is a heavy duty operation - but you tell me that this is not going to do what I want?
<yorickpeterse>
two things there
<yorickpeterse>
1) you're only using the last enum, discarding the first 9
<yorickpeterse>
thus only 1 thread is actually started
josh-k_ has joined #rubinius
<yorickpeterse>
2) the threading setup this way guarantees no order
<yorickpeterse>
so your values can be yielded in any random order
<yorickpeterse>
There's a much easier way of doing this, gimme a sec
<maasha>
yorickpeterse: I am not discarding the first 9 enums - the output is as expected a list from 10 .. 20
<yorickpeterse>
you're doing 10.times do ....., then push into enums
<yorickpeterse>
then you do enums.last
<yorickpeterse>
that discards the first 9
<yorickpeterse>
well, it ignores them
<maasha>
yorickpeterse: the enums << is outside the thread scope - I thought order would be preserved.
<yorickpeterse>
That is correct
<yorickpeterse>
But the moment you call the enum each value is processed in a thread
<yorickpeterse>
thus each `yield` is done separately
<yorickpeterse>
meaning there's no order anymore
<yorickpeterse>
maybe I'm misreading it, but the code is rather cryptic
josh-k has quit [Ping timeout: 258 seconds]
<maasha>
yorickpeterse: Hm, are you sure? I am for each thread creating a new enum in order not to mix up anything.
<maasha>
yorickpeterse: output is on line 7 only in your gist?
<yorickpeterse>
I'm not following
<maasha>
yorickpeterse: I better see if I understand your code properly.
benlovell has quit [Ping timeout: 272 seconds]
<maasha>
My code doesn't work with Rubinius, only MRI
<maasha>
yorickpeterse: so what is your output variable supposed to be? a yielder?
<yorickpeterse>
eh no idea, I just tried to copy it based on your code
<yorickpeterse>
also how does it not work on rbx?
<maasha>
I htae trheds
benlovell has joined #rubinius
<maasha>
yorickpeterse: so this correctly outputs 10, 11, 12 ... 20 in MRI, but in rbx it outputs nothing, or it outputs only 10, or rarely it completes as expected.
<maasha>
yorickpeterse: Anyways, it looks like the speedup I see is from rbx being good at calculating fib numbers and not because it does so in parallel. monitoring htop shows only a single thread at work :o(
<maasha>
yorickpeterse: aha! t.value
<yorickpeterse>
correct about the 1 thread part
<yorickpeterse>
You're joining 1 thread at a time
<yorickpeterse>
Thus only 1 runs
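A minimal sketch of the difference being described here, assuming a hypothetical `fib` helper as the heavy step (the gist itself is not reproduced in this log). Calling `#value` (or `#join`) right after `Thread.new` waits for that thread before the next one is created, so only one ever runs at a time; starting all threads first and collecting afterwards lets them overlap. Whether they then occupy several cores depends on the implementation (MRI serialises Ruby code behind its global lock).

```ruby
# Hypothetical stand-in for the heavy computation discussed above.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

inputs = [20, 21, 22, 23]

# Serial in disguise: each thread is waited on (via #value) before the next starts.
serial = inputs.map { |n| Thread.new { fib(n) }.value }

# Concurrent: start every thread first, then collect the results in input order.
threads = inputs.map { |n| Thread.new { fib(n) } }
parallel = threads.map(&:value)
```

Both variants return results in input order, because the results are read from the threads in the order the threads were created.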
<maasha>
yorickpeterse: well, it's complicated :o(
<yorickpeterse>
remove line #20, move line #22 into the Thread.new block
<yorickpeterse>
You'll need to lock around `enums` though
<yorickpeterse>
You can't safely modify arrays from multiple threads
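A small sketch of locking around a shared array, assuming illustrative names (`results`, `lock`) that are not from the original gist. Every append goes through `Mutex#synchronize`, so only one thread touches the array at a time.

```ruby
results = []
lock = Mutex.new

threads = 10.times.map do |i|
  Thread.new do
    value = i + 10                         # stand-in for the heavy computation
    lock.synchronize { results << value }  # only one thread appends at a time
  end
end
threads.each(&:join)

# Append order depends on thread scheduling, so sort before relying on it.
sorted = results.sort
```

The lock protects the array from concurrent modification, but it does not restore ordering; that still has to be handled separately.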
<maasha>
yorickpeterse: what line 22 version are you seeing?
<maasha>
I tried with Motex.synchronize
<maasha>
*Mutex.synchronize
<yorickpeterse>
also, in that code the actual work will still be done in 1 thread
<yorickpeterse>
That will still do the actual fib work in the main thread
<yorickpeterse>
You're only _creating_ the enum in a new thread
<yorickpeterse>
Also, why are you creating 10 enums and only using the last one?
<yorickpeterse>
hmpf, this code is too confusing
<maasha>
yorickpeterse: sorry, let me try to explain. I have a number of commands that each need to iterate over a bunch of records. This bunch is rather large so they are supplied as an Enumerable rather than an Array. Make sense so far?
<maasha>
Now, the output of each command is the input of the next command.
<yorickpeterse>
Right, if the next command depends on the previous one you can't run it in parallel
<maasha>
yorickpeterse: well, you can. think of bash: cat | grep | sort. each command uses a core. I can do the same in Ruby with threads and IPC using IO.pipe. Now I want to do it with plain thread variables somehow.
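For reference, a rough sketch of the thread-plus-`IO.pipe` arrangement mentioned here, using made-up record strings. The producer writes records into the pipe and closes its end; the consumer reads lines until it sees end-of-file.

```ruby
reader, writer = IO.pipe

producer = Thread.new do
  3.times { |i| writer.puts("record #{i}") }
  writer.close                  # EOF tells the consumer there is no more input
end

# Reads line by line as data arrives, stopping at EOF.
lines = reader.each_line.map(&:chomp)

producer.join
reader.close
```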
<yorickpeterse>
You can't run grep if the output of cat hasn't been produced yet
<yorickpeterse>
nor can you sort if grep hasn't done its thing yet
<maasha>
yorickpeterse: no, but with a lazy enumerator I was thinking that records would trickle through the pipeline.
<yorickpeterse>
pipes don't change that, command 2 would still have to wait for command 1 to complete
<maasha>
yorickpeterse: I disagree, UNIX pipes pass records down the pipeline on the fly (with IO buffering).
<maasha>
hence you can do cat <large file> | head without waiting for all of <large file> to be catted.
<yorickpeterse>
Ah, yeah streaming would be possible I guess
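A sketch of the `cat <large file> | head` idea with `Enumerator::Lazy`, assuming an infinite range as the "large file" and a doubling step as a stand-in command. Records are pulled through one at a time, and the source stops being consumed once `first` has what it needs. Note that this gives streaming only; everything still runs in a single thread.

```ruby
pulled = []   # records which source items were actually consumed

source = (1..Float::INFINITY).lazy.map do |n|
  pulled << n
  n * 2       # stand-in for a per-record command
end

# Like `| grep | head -2`: stops pulling from the source after two matches.
head = source.select { |n| n % 3 == 0 }.first(2)
```

Here `head` is `[6, 12]` and only the first six source items are ever touched.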
postmodern has quit [Quit: Leaving]
<maasha>
yorickpeterse: so, I guess my question is if you can fake streaming with Enumerators?
<yorickpeterse>
In that case you'd have to adjust the code though, the threads would have to wait until they're signalled to terminate (= no more input)
<yorickpeterse>
So the first command gets its input from whatever it is
<yorickpeterse>
It then writes, say, a line to some pipe/queue/whatever
<yorickpeterse>
the next command takes that, does the same
<yorickpeterse>
The moment the first command has no more input it signals the other threads to stop
<yorickpeterse>
better: a command signals the _next_ command to stop
<yorickpeterse>
each command would be something like
<yorickpeterse>
say `input` is something like :stop_processing, the thread would just return
<yorickpeterse>
if that makes any sense
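The scheme described above might look roughly like this. It is a sketch only: the `stage` helper, the queue names and the two toy commands are assumptions for illustration, with `:stop_processing` as the sentinel each command forwards to the next one before returning.

```ruby
require 'thread'  # Queue; needed on older Rubies, harmless on newer ones

STOP = :stop_processing

# Hypothetical helper: one pipeline stage that reads from `input`,
# applies the given block to each record, and writes to `output`.
def stage(input, output)
  Thread.new do
    loop do
      record = input.pop          # blocks until a record is available
      if record == STOP
        output << STOP            # signal the _next_ command to stop
        break
      end
      output << yield(record)
    end
  end
end

q1, q2, q3 = Queue.new, Queue.new, Queue.new

workers = [
  stage(q1, q2) { |r| r + 1 },    # stand-in for command 1
  stage(q2, q3) { |r| r * 2 }     # stand-in for command 2
]

(1..5).each { |r| q1 << r }
q1 << STOP                        # no more input

workers.each(&:join)

results = []
while (r = q3.pop) != STOP
  results << r
end
```

Because each stage is a single thread reading from a FIFO queue, record order is preserved, and the second command can start on a record as soon as the first has finished with it.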
havenwood has joined #rubinius
<maasha>
yorickpeterse: isn't what you are describing the underlying flow of Enumerators and fibers?
<maasha>
One could also go async with Celluloid.
<maasha>
But I was wondering how far I could get with Rubinius on its own.
<yorickpeterse>
Those have nothing to do with it
<yorickpeterse>
fibers are just coroutines
|jemc| has quit [Ping timeout: 272 seconds]
<yorickpeterse>
If you're using Rbx, you might also want to look into channels
<yorickpeterse>
(Rubinius::Channel)
<yorickpeterse>
it would be easier if you could just send data directly to a thread
DanielVartanov_ has quit [Ping timeout: 258 seconds]
<maasha>
yorickpeterse: yeah, but this thing of mine was designed for MRI and fork. Now it turns out that my fork solution suffers from IO lag during IPC.
<yorickpeterse>
welp, I have a bad headache and I'm out of painkillers
<yorickpeterse>
here's to hoping this instant miso soup helps
<yorickpeterse>
brixen: considering the patch won't land until 2.5 we can upgrade to stable and apply said patches manually
<yorickpeterse>
that way people can at least use bundler 1.6.5 on rbx
<brixen>
yorickpeterse: that's my plan
<brixen>
yorickpeterse: just finished up some JIT changes and am looking at the console hang now
<yorickpeterse>
<3
<brixen>
I get it on OS X now that the JIT crash isn't happening anymore
<yorickpeterse>
if you need any fill ins just holler, I still remember most of my debugging
<brixen>
yorickpeterse: awesome, thanks
<yorickpeterse>
tl;dr seems to be that pthread_join() in console.cpp waits forever for a thread hanging in a read() call
<yorickpeterse>
in Console::stop_threads() if I remember
<yorickpeterse>
* correctly
josh-k has joined #rubinius
JimmySieben has joined #rubinius
josh-k has quit [Remote host closed the connection]
josh-k has joined #rubinius
<yopp>
huh
<yopp>
yeah, that was because of this suicide.txt
<yopp>
they removed github :)
<yopp>
censorship is so fucked up
<yorickpeterse>
what was it about?
josh-k has quit [Ping timeout: 240 seconds]
<yxhuvud>
hrrm. I still get random type errors when running tests :/
<yopp>
yorickpeterse, russian gov blocked github because of text with funny ways to kill yourself
<yopp>
In Russia, there is a special registry of forbidden websites and materials. You can't write about suicide, drugs, and a whole bunch of other stuff. If they find something like this on your website, it gets added to the registry
<yopp>
And will be blocked by all ISPs
<|jemc|>
coming soon to a government near you
<yorickpeterse>
yopp: damn
<|jemc|>
it looks like the guy who posted it might have been a ukrainian that was trolling them
postmodern has joined #rubinius
<yopp>
yeah. only one way to get out of the registry: remove the url or block it when accessed from russia
<|jemc|>
yopp: looks like it is already out of the registry
<yopp>
yeah
<yopp>
guy who posted this text removed it
<yopp>
and tweets of this registry deputy are hilarious
<|jemc|>
thanks for the tip-off that one shouldn't be able to segv rbx from ruby code
<|jemc|>
I had kind of assumed that what I was doing was too invasive to expect that kind of safety, but now that you say it, it makes sense that the rbx vm should be checking for garbage input and failing gracefully
<jnh>
undefined method `process_str' on an instance of Rubinius::ToolSets::Segfault::Melbourne
* jnh cries
<|jemc|>
jnh: did you require 'rubinius/processor' in your new toolset?
<|jemc|>
that's where those process methods come from
<jnh>
|jemc|: that did it. thanks :)
DanielVartanov_ has joined #rubinius
<|jemc|>
no problem - happy to save you some of the head-scratching I endured
<jnh>
damn.
<jnh>
trying to duck punch attr_accessor :arity onto Rubinius::CompiledCode gives me marshalling errors
<brixen>
why are you not just using master?
<jnh>
because I want this code to work on 2.2.10 as well.
<jnh>
because otherwise no one else wants to try working with my lang if they have to build rbx from master.
<brixen>
people want to try your bleeding-edge language but not if they have to use bleeding-edge rbx? :P