<lekernel>
A modern VLSI chip has a zillion parts -- logic, control, memory, interconnect, etc. How do we design these complex chips? Answer: CAD software tools. Learn how to build these tools in this class.
<lekernel>
wolfspraul :)
<wpwrak>
thinking of how to overcome the LM32's slowness ... if we had several lm32 cores, complete with cache and tlb, and ignoring cache coherence for a moment, how many such cores would there be room for in M1 ?
<lekernel>
since most software is single-threaded, you won't overcome slowness this way
<lekernel>
and if you have to rewrite software to make it parallel, then you're better off designing proper hardware accelerators instead of introducing the CPU overhead
<wpwrak>
lekernel: think concurrent but loosely related programs. or different layers working concurrently. there, you can get a speedup.
<wpwrak>
so, how many cores would fit ? 2 ? 4 ? 10 ?
<lekernel>
maybe 8 or so
<wpwrak>
wow, great.
<lekernel>
perhaps even more, but I'm not sure about the block RAM for the caches
<wpwrak>
i think something around 4 may be interesting for a general-purpose workload. one for the kernel, one for the main application, one for background tasks, and one for whatever else comes along.
<lekernel>
doesn't sound too good... imo the real way out of the CPU slowness is ASIC
<wpwrak>
may need a bit of kernel tuning because the kernel tries to keep related things on the same cpu, assuming cycles are cheap but memory accesses (i.e., moving data accessed by one core to another) aren't. in our case, it's almost the opposite.
<wpwrak>
if you have the money ... ;-)
<lekernel>
well, aerospace institutes do. they're paying a lot of money for eg LEON chips.
<wpwrak>
not that i'd disagree with the technical merit of having the core in a dedicated asic ...
<wpwrak>
do you have any that would finance such work ?
<lekernel>
and maybe those parts that don't pass space qualification could still be used elsewhere
<wpwrak>
it's not only what they'd pay for the chip, but also what they'd pay to have it developed ...
<lekernel>
(or don't need, since a lot of the radiation hardening stuff is in the package, not the silicon)
<lekernel>
sounds much easier to me to get aerospace funding than anything else for this purpose
<wpwrak>
the numbers for makey makey look rather happy
<lekernel>
...which is my point
<wpwrak>
the monster cnc machine .. well, consider how many people would even have the room for such a monster :)
<lekernel>
yes, about the same number of people who'd buy a free CPU instead of a $35 raspberry pi or similar piece of crap
<wpwrak>
well, but with your aerospace contacts, you'd probably not go to kickstarter anyway
<wpwrak>
and the rpi will be a victim of its own success anyway. i wouldn't worry too much about them.
<kristianpaul>
lekernel: cparty, are you giving a talk about overclocking fpgas? :-)
<kristianpaul>
what additional hw besides adding the other lm32 cores to the SoC is required to get SMP?
<Fallenou>
adapt wishbone code maybe to have one more master
<kristianpaul>
ah well conbus said up to 8 of both masters and slaves...
<wpwrak>
you also need to consider cache coherency. that can be done in hw or in sw, though.
<wpwrak>
of course, doing it in sw can make things slow. and limits the type of tasks you can use it for.
<Fallenou>
and for now lm32 caches are not doing any kind of bus snooping :(
<lekernel>
be happy that since they are write-through, you only need bus snooping and not relatively complicated protocols like MSI or its variants
<wpwrak>
yeah :)
<Fallenou>
hehe sure
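The point about write-through caches can be illustrated with a toy model. This is only a sketch under stated assumptions, not the LM32 cache: the class `WriteThroughCache`, the helper `make_cores`, the one-word lines and the direct-mapped layout are all made up for illustration. Because every write goes straight to memory, memory never holds stale data, so a snooping cache only has to drop its own copy when it sees another master write that address.

```python
class WriteThroughCache:
    """Toy direct-mapped, write-through cache for one core (one word per line)."""

    def __init__(self, memory, num_lines=4):
        self.memory = memory          # shared main memory: dict address -> value
        self.num_lines = num_lines
        self.lines = {}               # line index -> (address, value)
        self.peers = []               # other caches snooping the same bus

    def read(self, addr):
        idx = addr % self.num_lines
        line = self.lines.get(idx)
        if line is not None and line[0] == addr:
            return line[1]                    # hit: served from the cache
        value = self.memory.get(addr, 0)      # miss: refill from main memory
        self.lines[idx] = (addr, value)
        return value

    def write(self, addr, value):
        # Write-through: main memory is updated on every write.
        self.memory[addr] = value
        idx = addr % self.num_lines
        line = self.lines.get(idx)
        if line is not None and line[0] == addr:
            self.lines[idx] = (addr, value)   # keep our own copy current
        # The write appears on the bus, so every other cache can snoop it.
        for peer in self.peers:
            peer.snoop_write(addr)

    def snoop_write(self, addr):
        # Another master wrote this address: invalidate our copy, nothing else.
        idx = addr % self.num_lines
        line = self.lines.get(idx)
        if line is not None and line[0] == addr:
            del self.lines[idx]


def make_cores(n, snooping=True):
    """Build n caches over one shared memory, optionally wired for snooping."""
    memory = {}
    caches = [WriteThroughCache(memory) for _ in range(n)]
    if snooping:
        for cache in caches:
            cache.peers = [p for p in caches if p is not cache]
    return memory, caches
```

With `snooping=False` a second core keeps returning the old value after the first core writes; with snooping enabled the next read misses and refetches from memory. No per-line state such as modified/shared/invalid is needed, which is the simplification relative to MSI-style protocols.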
<mwalle>
lekernel: i bought an rpi for me, so i'm banned now in this channel? ;)
<mwalle>
but actually, i didnt use it yet, just installed an xbmc distro, took ages and didnt work in the end..
<mwalle>
lm32/milkymist/qemu is better to hack on ;)
<kristianpaul>
better, and still a long way / a lot to go :-)
<larsc>
mwalle: I have some issues with the latest qemu. Whenever I send a multi-character key (like the arrow keys) all further keypresses get delayed by the number of extra characters in a multi-character key sequence
<larsc>
e.g. press left and then type "Hello" the H will appear when you press the e, the e will appear when you press the l and so on
<larsc>
if i press left twice, the H appears when I press the l and so on
<Fallenou>
mwalle: I bought one as well
<mwalle>
larsc: yeah i noticed that too
<Fallenou>
as I told wolfspra1l , I turned it on for 5 minutes, it booted on my TV, and then back in the box ^^
<Fallenou>
"cool it boots", boxed
<mwalle>
larsc: once the input buffer overflows, the ringbuffer will always be N characters 'behind'..
<mwalle>
larsc: bug fixed in my qemu repository
<Fallenou>
:)
<mwalle>
Fallenou: how hard is it to add a cache inhibit bit to the tlb?
<Fallenou>
a cache inhibit bit ?
<mwalle>
non-cacheable
<mwalle>
eg, set this bit to bypass the d/icache
<Fallenou>
maybe a simple way would be to trigger a cache miss when this is is set ?
<Fallenou>
so that it fetches from main memory anyway
<Fallenou>
when this bit is set*
<wpwrak>
that would also make sure the cache is kept in sync
<Fallenou>
yes
<Fallenou>
but if this happens too often the cache is like useless
<Fallenou>
very suboptimal
<mumptai>
and you don't need an ugly bypass
<mwalle>
mh but it will replace other entries, yes..
<Fallenou>
and it would just be a single ( && tlb_lookup_bypass) addition to the assign miss = line
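A rough model of the "force a miss" idea, as a sketch only: the class `DCache`, its one-word direct-mapped layout and the `inhibit` argument are hypothetical, and the actual `assign miss =` expression in the LM32 Verilog may be written differently. It shows both sides of the trade-off discussed above: the inhibited access always sees fresh memory, but it still refills a line and so evicts whatever was there.

```python
class DCache:
    """Toy direct-mapped data cache with a per-access cache-inhibit flag."""

    def __init__(self, memory, num_lines=4):
        self.memory = memory              # backing store: dict address -> value
        self.num_lines = num_lines
        self.tags = [None] * num_lines    # address currently held by each line
        self.data = [0] * num_lines
        self.refills = 0                  # number of fetches from main memory

    def load(self, addr, inhibit=False):
        idx = addr % self.num_lines
        hit = self.tags[idx] == addr
        # The inhibit bit (imagined as coming from the TLB entry) forces a miss.
        miss = (not hit) or inhibit
        if miss:
            # A forced miss still refills the line, evicting its old contents.
            self.refills += 1
            self.tags[idx] = addr
            self.data[idx] = self.memory[addr]
        return self.data[idx]
```

For example, after `memory[0]` changes behind the cache's back, `load(0)` keeps returning the stale value while `load(0, inhibit=True)` returns the new one. An inhibited load of an address that maps to the same line as a hot cached address pushes that address out, so the following normal load of it misses again.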
<mwalle>
Fallenou: there should be already some logic to bypass the cache, theres some define
<Fallenou>
wishbone is selected if dcache is not, if address < base or address > upper_limit
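The base/limit rule can be written down as a small sketch. The function names and the example addresses below are invented for illustration and are not taken from the LM32 sources or the Milkymist memory map; only the comparison itself follows the rule as stated above.

```python
def uses_dcache(addr, base, limit):
    """True if addr lies inside the cacheable window [base, limit]."""
    # Wishbone is selected when addr < base or addr > limit,
    # so the dcache is selected exactly when neither holds.
    return not (addr < base or addr > limit)


def route_load(addr, base, limit):
    """Name the path a data access would take under this rule."""
    return "dcache" if uses_dcache(addr, base, limit) else "wishbone"


# Hypothetical window, for illustration only.
EXAMPLE_BASE = 0x40000000
EXAMPLE_LIMIT = 0x47FFFFFF
```

Under this rule a memory-mapped device placed outside the window is always accessed directly over wishbone, without any extra per-page bit.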
<wpwrak>
so would we even need a special bit ?
<Fallenou>
I don't exactly know why mwalle asked that
<Fallenou>
what does he want to do ?
<Fallenou>
what did he have in mind ? :)
* Fallenou
cannot locate the similar trick for icache though, wonders if icache has limit/base stuff working
<Fallenou>
datasheet seems to say "yes"
<Fallenou>
but cannot locate it in the code
<wpwrak>
you don't need this for icache. i think he's after data access to memory-mapped devices
<Fallenou>
oh, right
<Fallenou>
then yes it's limit/base stuff to have memory mapped regions non-cacheable
<Fallenou>
"If an instruction cache is used, attempts to fetch instructions from outside of the range of cacheable addresses result in undefined behavior, so only one cached region is supported."
<Fallenou>
hum ok
<Fallenou>
so basically the BASE/LIMIT stuff does not work for icache :)
<Fallenou>
good night :)
<wpwrak>
you don't need that for icache anyway
<Fallenou>
well you could want to fetch from wishbone directly in case of DMA containing code :p
<Fallenou>
but then maybe it's best to just invalidate Icache