lekernel changed the topic of #m-labs to: Mixxeo, Migen, MiSoC & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
sb0 has joined #m-labs
<sb0> rjo, hi
<sb0> http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html "Lots of people struggle with the complexities of getting big data systems up and running, when they possibly shouldn’t be using the systems in the first place."
<hozer> well, I'd go read that, but I just had to 'kill 666' my firefox process :P
<hozer> but links seems to work quite nice
<hozer> hrrm, I had a theory back when I did high-performance computing that FPGAs with open/programmable/reconfigurable memory controllers with *narrower* data paths would rock on graph searches
<hozer> but this requires you have some real stable FPGA toolchain you could actually *use*
<hozer> so whatever happened to github.com/wolfgang-spraul/fpgatools
<hozer> looks like opencircuitdesign.com/qflow/welcome.html might be an answer...
<sb0> there's no bitstream backend
<hozer> dammit
<hozer> well fpgatools tried to do that, right? But it looks dead
<sb0> yes
<hozer> frack
<hozer> I guess if I want a bitstream backend I'm going to have to design my own damned fpga :P
<sb0> just continue the S6 RE
<sb0> or move on to 7 series
<sb0> it's easier than spinning your own
<sb0> try this: xdl -report -pips 6slx4
<sb0> it tells you everything :)
<sb0> (well, not everything, but enough to get started)
<hozer> so my theory on graph algorithms is if you had an FPGA with a bunch of separate narrow-width (but high speed) memory busses that are all independent, you'd get a lot better graph search performance
mumptai has quit [Ping timeout: 246 seconds]
<hozer> because everything I'm aware of is going to 128 or 256 (or wider) bit cache lines
<sb0> yes, but that's because of DRAM, whose bandwidth has improved a lot more than its latency
<hozer> yep. But if you are doing a graph/sparse matrix, you might only need a 32 bit pointer out of an entire 256 or 512 bit cache line
<hozer> so you just wasted all that bandwidth
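The waste described above is easy to put numbers on. A minimal sketch, assuming a traversal step needs exactly one 32-bit pointer per fetched line (the helper name `useful_fraction` is made up for illustration):

```python
# Fraction of a fetched cache line that is actually used when a graph
# traversal step only needs one 32-bit pointer out of the whole line.
POINTER_BITS = 32


def useful_fraction(line_bits, used_bits=POINTER_BITS):
    """Return the share of the fetched bits that the traversal consumes."""
    return used_bits / line_bits


for line_bits in (128, 256, 512):
    frac = useful_fraction(line_bits)
    print(f"{line_bits}-bit line: {frac:.2%} used, {1 - frac:.2%} wasted")
```

This only counts bandwidth; it says nothing about latency, which is the separate point raised in the reply that follows.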
<sb0> it's cheap. and having a narrower DRAM won't make the latency (which is what is slowing you down) better
<hozer> but it will sure make it cooler
<sb0> you mean less power? maybe
<hozer> and if you had a cpu with, say, 512 thread contexts instead of 512 bit cache lines, anytime you hit a memory load you switch context
<sb0> parallel programming is hard
<hozer> it would still be easier than some big-nonsense cloud data analytics whizbang :P
<sb0> and a ddr3 chip has only 8 banks
<sb0> if your thread contexts cause DRAM precharge cycles, and they probably will, you can't pipeline
<hozer> so your 'big data' FPGA search appliance has a couple fpgas, and 256 DRAM chips :P
<hozer> and the whole thing draws < 250 watts
<sb0> you can't put many DRAM chips on a FPGA. especially without shared address/command lines. and on 7-series, only certain IO banks can run ddr3 at max speed.
<hozer> This is why if I had a million dollars I'd spend half on farmland and half on rolling my own fpga :P
<hozer> well, okay, I'd probably need 100 million to actually do the fpga right
mumptai has joined #m-labs
<sb0> if you become the dumpster diving god, you can probably get your own fab and do it for much less
<sb0> which is what hackerspaces should be doing if they weren't busy with their cnc plastic extrusion stuff
<hozer> if I can dumpsterize a fab it would probably be worth about half a million to me
<hozer> cause at some point there will be some expensive tractor or combine that you can't get some critical silicon for anymore
<hozer> and then I can go around buying up equipment for $5,000 that I could sell for $50,000 if it ran
<hozer> no infiniband, bah
<hozer> oh, and Cavium pisses me off cause they made my MontaVista stock worthless :P
<hozer> ... they came in when the market was down and acquired MontaVista and only Cavium and the company execs got stock. All the employees who had stock options (or left and exercised them) got zilch
<hozer> The only thing I know of better than Infiniband's cut-through routing latency is Cray's interconnect, and there's a good argument you might be able to get lower latency doing a remote RDMA read over infiniband than doing a local memory read if you are doing a lot of graph search stuff
<sb0> no
<sb0> the serdes themselves have more latency than a typical closely coupled dram controller
<hozer> well assumption here is the data set doesn't all fit in your local dram
<hozer> so you can either go for 'ultra-scalable' systems that are 1000x slower than a single laptop
<hozer> or actually get some decent low-latency interconnect
<hozer> but, you know, parallel programming is hard, and it's easier to say how scalable your system is on a crappy cloud infrastructure :P
<uhhimhere> head in the cloud
<uhhimhere> meow
<hozer> sometime I want to do a single-threaded graph search on a terabyte data set with a bunch of commodity hard drives ;)
<uhhimhere> the cavium supports low latency multi-socket Soc
<hozer> I tried to figure out how to synthesize LEON/Grlib a while ago but never really finished.
uhhimhere has quit [Ping timeout: 245 seconds]
<rjo> hey sb0.
<rjo> sb0: i like sync_struct.
<rjo> but from a cursory look it seems a big fat warning about mutable members is necessary.
<sb0> rjo, you can put mutable list/dicts (and compatible objects) into sync_struct, and then mutate them - that will be handled correctly
<sb0> sync_struct will wrap them
<sb0> the only thing you should not mutate is the suitably named "read" property, which gives you access to the data
<rjo> but there are many more mutable objects.
<sb0> yeah, user classes are not supported
<rjo> are they prevented?
<sb0> nested lists/dicts are
<sb0> no, they're not
<sb0> I think the proper way to "prevent" them is to add a note to the documentation
<rjo> the big fat warning i suggested. yes.
<sb0> otherwise we'd have to scan any object (potentially with several levels of nesting) passed by the user for something that isn't a list or dict
<sb0> or immutable
<rjo> BTAFTP
<rjo> i agree.
<rjo> but the problematic use case i see is adding a numpy array and starting to call all its wonderful methods.
<sb0> hmm, we can actually make them work
<sb0> sync_struct can proxy them
<rjo> oh. that is not an acronym yet. let's make it one. (better to ask forgiveness than permission)
<sb0> only, the receiving side has to handle them
<rjo> rpc-like.
<sb0> yes
<rjo> too generic imho.
<sb0> sync_struct is pretty much a RPC already
<sb0> but with pubsub, index syntax support, and structure initialization on connect
<sb0> supporting numpy methods is a matter of replacing the hardcoded list methods (append, insert, pop, etc.) with a generic RPC
<sb0> ..of course, those methods must be called from the Notifier, and never via the read property
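The rule about going through the notifier and never through the read property can be illustrated with a toy model. This is only a sketch of the idea, not the actual sync_struct code: `ListNotifier`, `apply_mod` and the dictionary message format are hypothetical names chosen for the example.

```python
class ListNotifier:
    """Hypothetical wrapper: every mutation made through it is published."""

    def __init__(self, backing, publish):
        self._backing = backing
        self._publish = publish  # callback that receives one modification

    @property
    def read(self):
        # Read access only, by convention. Mutating the returned object
        # changes the publisher's copy without telling any subscriber.
        return self._backing

    def append(self, x):
        self._backing.append(x)
        self._publish({"action": "append", "x": x})

    def __setitem__(self, key, value):
        self._backing[key] = value
        self._publish({"action": "setitem", "key": key, "value": value})


def apply_mod(target, mod):
    """Subscriber side: replay one published modification on a local copy."""
    if mod["action"] == "append":
        target.append(mod["x"])
    elif mod["action"] == "setitem":
        target[mod["key"]] = mod["value"]
    else:
        raise ValueError("unknown action: " + mod["action"])


publisher_data = []
subscriber_data = []
notifier = ListNotifier(publisher_data,
                        lambda mod: apply_mod(subscriber_data, mod))

notifier.append(1)
notifier.append(2)
notifier[0] = 10            # published, so both sides agree
notifier.read.append(99)    # bypasses the notifier: the copies now diverge

print(publisher_data, subscriber_data)
```

Replacing the hardcoded `append`/`__setitem__` with a generic "method name plus arguments" message would be the generic-RPC variant mentioned above; the sketch keeps the hardcoded form to stay short.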
<rjo> i gravitate toward actually dumbing it down.
<sb0> all functionality is necessary for the GUI/master atm
<rjo> because then you will have mutable arguments to methods of numpy arrays where the method return value after execution on the subscribers might matter.
<rjo> and conflict when returned to the publisher..
<rjo> looking at the master queueing: why did the idea of letting the experiments reschedule themselves not work?
<sb0> (entry_points) ah, good call
<sb0> I'd have to think about it, and also how to support "pausable" experiments
<GitHub187> [artiq] sbourdeauducq pushed 1 new commit to master: http://git.io/yFi1kA
<GitHub187> artiq/master 6cc3a9d Robert Jordens: frontend/*: move to artiq.frontend, make entry_points...
<rjo> all in all there will be a lot of scheduling experiments by other experiments.
<rjo> so at least there will be functionality duplication where the experiments need to manage the master queue.
<sb0> so how exactly should they manage the queue?
<sb0> the simplest way to do #1 is expose a "queue_append" method to the experiments
<rjo> a stack of calibration experiments queued as a batch by another.
<rjo> if this parent experiment gets unqueued or canceled, it needs to remove its children.
<rjo> (or may want to remove them) from the queue
<sb0> can that be done by having the parent schedule its calibration experiments just before finishing?
<sb0> if it gets unqueued or canceled, then it can't schedule the children
<rjo> or something akin to an "operating mode" where a parent experiment sets up a bunch of periodic experiments and then runs a few others, unqueueing the calibrations when done.
<rjo> the problem is "ownership". if an experiment can queue other experiments, it must be able to take full responsibility of them.
<rjo> handling their failures, unqueueing them...
<rjo> yes. if the parent goes it should take its children too.
<rjo> -- or not. depending on how much responsibility the parent is able/willing to accept.
<rjo> but if you have such an "operating condition" style experiment, you need full queue management from within experiments.
<sb0> do the "operating condition experiments" do anything more than schedule a pack of periodic experiments + a batch of sequential ones?
<rjo> i would like it to be able to handle experiment failures.
<rjo> the workflow if you lose an ion (or a laser becomes unlocked) is a whole other batch of experiments (load, move, calibrate...) as a massive error handler.
<rjo> and how can i get an experiment that is periodically scheduled every hour but just on weekends?
<rjo> imho the Scheduler api should be more like an event loop. there periodic execution is also an emergent feature.
<sb0> how do you represent it in the GUI, though?
<rjo> isn't there a list of the coroutines and Tasks in the asyncio event loop?
<sb0> yes, but the tasks are parallel - there is no queue
<sb0> and I guess the GUI needs a queue
<rjo> isn't that parallelism the same as the "pausable" thingy?
<rjo> really? how does it decide which task to continue with if one yields?
<sb0> the one that has a completed IO operation...
<rjo> there is a nice heapq in the eventloop.
<sb0> your error handling scenario means: assume that the lost ion exception didn't occur, speculatively proceed to add the next steps into the queue so that the GUI displays them, and if the exception does occur, undo it
<rjo> if there are no fds, it's just a bunch of TimerHandles.
<sb0> you don't fundamentally need scheduling access for that - you can just import the other "error handling" experiments and run them in the exception handler
<rjo> yes. that would work.
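A minimal sketch of the pattern just described, where the recovery experiments are plain callables run from the exception handler instead of being scheduled. Everything here is hypothetical (`IonLost`, `run_with_recovery`, the stand-in experiments); it only shows the control flow.

```python
class IonLost(Exception):
    """Hypothetical exception raised by an experiment when the ion is lost."""


def run_with_recovery(experiment, recovery_steps, max_attempts=3):
    """Run experiment(); on IonLost, run the recovery steps and retry."""
    for _ in range(max_attempts):
        try:
            return experiment()
        except IonLost:
            # The "massive error handler": load, move, calibrate, ...
            for step in recovery_steps:
                step()
    raise IonLost("still no ion after recovery")


trace = []
attempts = {"n": 0}


def measurement():
    # Stand-in experiment that loses the ion on its first attempt only.
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise IonLost("ion lost")
    trace.append("measure")
    return "done"


def load():
    trace.append("load")


def calibrate():
    trace.append("calibrate")


result = run_with_recovery(measurement, [load, calibrate])
print(result, trace)
```

Nothing in this flow passes through a scheduler, which is exactly the user-feedback concern raised next: a GUI watching only the queue would not see `load` or `calibrate` running.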
<sb0> the only problem I see with #2 is less user feedback
<rjo> but then it is not apparent in the gui which experiment is actually running if they do not pass through the scheduler.
<rjo> yes
<sb0> we can also simply add a notification of the current class name in which the execution is going on
<sb0> and for the "weekend schedule"... replace periodic execution with timed execution, and let experiments re-schedule them
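With timed execution plus self-rescheduling, the "every hour, weekends only" case reduces to the experiment computing its own next run time. A sketch under that assumption; `next_weekend_run` is a made-up helper, and how the resulting time would be handed back to the scheduler is left open.

```python
from datetime import datetime, timedelta


def next_weekend_run(now, period=timedelta(hours=1)):
    """Return the next run time: one period from now, or, if that falls
    on a weekday, the first midnight that starts a Saturday or Sunday."""
    candidate = now + period
    while candidate.weekday() < 5:  # Monday is 0, Friday is 4
        # Jump to midnight at the start of the following day.
        candidate = (candidate + timedelta(days=1)).replace(
            hour=0, minute=0, second=0, microsecond=0)
    return candidate


# 2015-01-17 was a Saturday, 2015-01-18 a Sunday.
saturday_morning = datetime(2015, 1, 17, 10, 0)
sunday_night = datetime(2015, 1, 18, 23, 30)

print(next_weekend_run(saturday_morning))  # still within the weekend
print(next_weekend_run(sunday_night))      # skips the whole work week
```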
<rjo> or give scheduler access.
<sb0> what was your collaborative code editor website again?
<sb0> something.io
<sb0> ah, kobra
<rjo> i am wondering now whether experiments could become real asyncio.Tasks
<rjo> and whether there could be something like a slave event loop to the big asyncio one that manages only the experiments.
<rjo> that would give a notion of "pause" for free.
<rjo> could just follow the same call_later(...) api etc.
<rjo> if you do scheduler.queue("") by name, you can return the actual experiment instance, right?
<rjo> why do you need an rid?
<sb0> hmm, creating the instance calls build(), which initializes drivers
<rjo> ah
<rjo> the error recovery pattern is ok. could maybe be streamlined a bit with contextmanager.
<sb0> I guess that we should not permit those "scheduler plugins" to access drivers, too
<sb0> otherwise, there can be conflicts if the scheduler plugin requests a driver with certain arguments, and then one of its scheduled experiments requests it again with other arguments
<sb0> only one experiment may access drivers at any given time
<sb0> and it should be permitted to have several scheduler plugins running concurrently, I guess
<sb0> so that several periodic experiments can be scheduled
<rjo> by "requesting a driver" you mean the rpc that in the end leads to the opening of the serial port?
<sb0> or even the core device driver opening the serial port
<sb0> contextlib.nested is deprecated
<rjo> i am almost convinced these backend drivers should be singletons.
<rjo> oh. even better.
<rjo> but i don't know how/whether the contextmanager jibes with the coroutine that would be required.
<rjo> probably not.
<sb0> you can only yield once in @contextlib.contextmanager, no?
<rjo> yes.
<rjo> but one could just write a real class with __enter__ and __exit__(), maybe.
<sb0> can one yield from a with statement?
<sb0> in other words: can a context manager cause its calling generator to yield?
<rjo> hmm. "scheduler plugins" aka "batches" or "operating conditions" vs drivers: if you "request" a driver and get it, you do expect exclusive access.
<sb0> afaik generators don't play well with context managers
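A small sketch of where the boundary lies: the body of a `with` statement inside a generator may yield, and the resource stays held while the generator is paused, but `__enter__` and `__exit__` are ordinary calls and cannot themselves suspend the generator that uses them. `DriverLock` and `experiment` are hypothetical names for the example.

```python
class DriverLock:
    """Hypothetical context manager written as a real class."""

    def __init__(self, log):
        self._log = log

    def __enter__(self):
        # Plain method call: it runs to completion and cannot pause the caller.
        self._log.append("acquire")
        return self

    def __exit__(self, exc_type, exc, tb):
        self._log.append("release")
        return False  # do not swallow exceptions


def experiment(log):
    with DriverLock(log):
        log.append("step 1")
        yield  # pause point inside the with body
        log.append("step 2")


log = []
gen = experiment(log)
next(gen)             # runs up to the yield; the lock is still held
log.append("paused")
for _ in gen:         # resume until the generator finishes
    pass

print(log)
```

The "paused" entry landing between "acquire" and "release" is the exclusive-access issue in a nutshell: a paused experiment keeps whatever it acquired unless it gives it up explicitly.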
<rjo> so the parent (or any other parallel running, paused experiment) would have to -- at least -- temporarily relinquish control.
<sb0> and same with generators and class initialization. e.g. you can't create an asyncio connection in __init__
<rjo> oh. why is that?
<sb0> __init__ can't yield
<rjo> ha.
<sb0> well you can, but you have to pass the asyncio loop as a parameter to __init__
<sb0> and __init__ calls loop.run_until_complete (and becomes blocking)
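A sketch of the blocking workaround just described, written with the later `async def`/`await` syntax rather than the generator-based coroutines of the time. `open_connection_stub` and `BlockingClient` are placeholders; a real client would open an actual connection instead.

```python
import asyncio


async def open_connection_stub(host):
    """Stand-in for a real asynchronous connection setup."""
    await asyncio.sleep(0)
    return "connected:" + host


class BlockingClient:
    def __init__(self, loop, host):
        # __init__ cannot suspend, so it drives the loop until the
        # connection coroutine has finished. This blocks the caller and
        # must not be used from code already running inside that loop.
        self.conn = loop.run_until_complete(open_connection_stub(host))


loop = asyncio.new_event_loop()
try:
    client = BlockingClient(loop, "example.invalid")
finally:
    loop.close()

print(client.conn)
```

The alternative mentioned next, creating a task in `__init__`, avoids blocking but leaves the object half-initialized until the task completes, which is where error handling gets awkward.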
<rjo> and then yield "downwards" into the loop instead of upwards?
<sb0> or create a task
<sb0> but then error handling becomes a pain
<rjo> is the scheduler already split from the experiment runner into different processes?
<rjo> is that thing the worker?
<sb0> if we have separate scheduler.queue and scheduler.run_at, with a run_at reaching the deadline taking priority over the queue, it's also easier to display in the GUI
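One way to read this proposal, as a sketch only: a FIFO queue plus a heap of timed entries, where a timed entry whose deadline has passed is picked before anything in the queue. The class and its `next` method are hypothetical; times are plain numbers to keep the example self-contained.

```python
import heapq
from collections import deque
from itertools import count


class Scheduler:
    """Hypothetical scheduler with a FIFO queue and timed entries."""

    def __init__(self):
        self._queue = deque()
        self._timed = []      # heap of (deadline, sequence, name)
        self._seq = count()   # tie-breaker for equal deadlines

    def queue(self, name):
        self._queue.append(name)

    def run_at(self, when, name):
        heapq.heappush(self._timed, (when, next(self._seq), name))

    def next(self, now):
        """Pick what to run at time `now`, or None if nothing is ready."""
        if self._timed and self._timed[0][0] <= now:
            return heapq.heappop(self._timed)[2]
        if self._queue:
            return self._queue.popleft()
        return None


s = Scheduler()
s.queue("scan_a")
s.queue("scan_b")
s.run_at(5, "calibration")

order = [s.next(0), s.next(5), s.next(6), s.next(7)]
print(order)
```

Both containers are easy to list for display: the queue in order, the timed entries sorted by deadline.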
<sb0> yes. the worker runs the user code. that way, if it goes into an infinite loop, leaks memory, imports a crashy library, etc. it can be killed
<rjo> call_later() call_soon() and call_at() like asyncio ;)
<rjo> so to streamline terminology a bit, could the "worker" be more or less the "controller" for the coredevice?
<rjo> or does that analogy not hold?
<sb0> it's not exactly a controller as it doesn't go over the network, and it doesn't even have to use the core device
<rjo> but there is a pair of filedescriptors between the scheduler and the (worker).
<rjo> pipe.
<sb0> yes. stdin/stdout, actually
<sb0> the main reason for running on the same machine is that the filesystem where the experiments and later the results are stored becomes the same
<rjo> ah. the rpcs to the controllers originate/are relayed at the worker.
<rjo> ok.
<sb0> rpcs to the controller are done directly by the worker, yes
<sb0> I'm also imagining that the worker would write HDF5 outputs itself
<rjo> and you fire a worker per experiment?
<sb0> and maybe do the git checkouts
<sb0> no, it takes instructions, runs, and reports
<sb0> and keeps going
<rjo> ok.
<sb0> we can have a collection of generator-based scheduler plugins running in the worker
<rjo> good. then i like the term worker.
<rjo> should also be ok to implement pauseability by "yield"ing from within an experiment.
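A minimal sketch of pausing by yielding, assuming experiments are written as generators and some runner decides which one to resume. The round-robin policy and the names `experiment` and `run_all` are illustrative only.

```python
from collections import deque


def experiment(name, steps, log):
    """Hypothetical experiment that offers a pause point after each step."""
    for i in range(steps):
        log.append((name, i))
        yield  # the runner may switch to another experiment here


def run_all(experiments):
    """Resume each experiment in turn until all of them have finished."""
    pending = deque(experiments)
    while pending:
        exp = pending.popleft()
        try:
            next(exp)
        except StopIteration:
            continue  # finished: do not requeue
        pending.append(exp)


log = []
run_all([experiment("scan", 2, log), experiment("calib", 1, log)])
print(log)
```

An experiment that never yields cannot be paused this way, so this gives cooperative pausing only; killing a runaway experiment still needs the separate worker process.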
<sb0> if scheduler plugins and experiments become the same thing, yes
<rjo> yep.
<sb0> hmm, we may have to move the scheduler into the worker...
<sb0> otherwise, propagating an exception worker -> scheduler -> worker will be a mess
<rjo> why does it have to go to the scheduler?
<rjo> (first)?
<sb0> see the error recovery example ...
<rjo> and who is talking to the gui, the worker or the scheduler?
<sb0> the scheduler
<sb0> if we move the scheduler into the worker, it could simply sync_struct the queue and periodic schedule from the parent process
<rjo> the results updates are proxied through the scheduler?
<sb0> the real-time results are produced by the worker, sent to the master (what you call scheduler), and subscribed to by clients
<rjo> then i should not call it scheduler.
<sb0> all results, including non-realtime ones, would then be written to the filesystem by the worker directly
<rjo> the scheduler is just a component of the master. another component is this data-hub.
<sb0> yes
<sb0> hmm. if we move the scheduler into the worker, then we cannot abort an experiment by killing the worker.
<rjo> hmm. that is nasty if an infinite loop in an experiment would block the scheduler.
<sb0> yes, that too
<rjo> OTOH there will be plenty of opportunity to DOS the scheduler if the API is accessible.
<rjo> i think i have to let that problem take a few rounds in my head.
<rjo> we didn't even get to discussing the RTData and Plot persistence stuff.
<rjo> but i am heading home now anyway.
<rjo> good night!
<sb0> I'd implement persistence by reloading from HDF5
<sb0> explicitly
<sb0> good night!
<rjo> ok. that sounds smart. if the gui can trigger that (preferably by just "selecting" the experiment) and then fiddle with the fit/plot and stuff gets saved, that would be really smooth.
<rjo> but that's for another day.
<rjo> see you.
<GitHub94> [artiq] sbourdeauducq pushed 2 new commits to master: http://git.io/uqhPcA
<GitHub94> artiq/master 3e22fe8 Sebastien Bourdeauducq: reorganize files as per discussion with Robert
<GitHub94> artiq/master 0c2e960 Sebastien Bourdeauducq: frontend: restore artiq_ prefix
sb0 has quit [Quit: Leaving]