<SolraBizna> one that will live FOREVER
<SolraBizna> and by FOREVER I mean longer than 3 years
emeb has quit [Quit: Leaving.]
OmniMancer has joined ##openfpga
noobineer has joined ##openfpga
unixb0y has quit [Ping timeout: 250 seconds]
unixb0y has joined ##openfpga
cr1901_modern1 has quit [Quit: Leaving.]
cr1901_modern has joined ##openfpga
flea86 has joined ##openfpga
_whitelogger has joined ##openfpga
noobineer has quit [Ping timeout: 258 seconds]
gsi__ has joined ##openfpga
gsi_ has quit [Ping timeout: 252 seconds]
noobineer has joined ##openfpga
dj_pi has joined ##openfpga
mumptai_ has joined ##openfpga
tlwoerner has quit [Quit: Leaving]
mumptai has quit [Ping timeout: 258 seconds]
Bike has quit [Quit: Lost terminal]
dj_pi has quit [Ping timeout: 245 seconds]
tlwoerner has joined ##openfpga
X-Scale has quit [Ping timeout: 255 seconds]
jevinskie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
pointfree has quit [Ping timeout: 250 seconds]
ovf has quit [Ping timeout: 250 seconds]
nickjohnson has quit [Ping timeout: 250 seconds]
dj_pi has joined ##openfpga
jevinski_ has joined ##openfpga
ovf has joined ##openfpga
pointfree has joined ##openfpga
nickjohnson has joined ##openfpga
dj_pi has quit [Ping timeout: 255 seconds]
jevinskie has joined ##openfpga
jevinski_ has quit [Ping timeout: 255 seconds]
noobineer has quit [Remote host closed the connection]
gsi__ is now known as gsi_
ewen has joined ##openfpga
Richard_Simmons has joined ##openfpga
emeb_mac has joined ##openfpga
Bob_Dole has quit [Ping timeout: 276 seconds]
Richard_Simmons2 has joined ##openfpga
Richard_Simmons has quit [Ping timeout: 264 seconds]
Richard_Simmons has joined ##openfpga
GuzTech has joined ##openfpga
Richard_Simmons2 has quit [Ping timeout: 264 seconds]
emeb_mac has quit [Ping timeout: 255 seconds]
Dolu has joined ##openfpga
Asu has joined ##openfpga
mumptai_ has quit [Quit: Verlassend]
mumptai has joined ##openfpga
ewen has quit [Quit: leaving]
nickjohnson has quit []
nickjohnson has joined ##openfpga
<Sprite_tm> Not sure if this is the right channel, but does anyone know a good way to solve a situation where multiple bus-masters (CPUs, DMA, ...) have to be able to access multiple memories/peripherals?
<azonenberg> Sprite_tm: There's pretty much two options
<Sprite_tm> Mostly, I'm dithering between either going the simple route, putting all the slaves on one bus with a bus-select, and have an arbiter to talk to that bus, with the masters on the other side.
<Sprite_tm> ...which means instant-choke-point in the arbiter.
<azonenberg> first is a single level hierarchy where you just have a big crossbar with all the masters at one end and all the slaves at the other
<azonenberg> You can have that switch have only one path, or be a full crossbar depending on performance-area tradeoffs you want
<Sprite_tm> Yeah, the full crossbar was the other idea I had :)
<azonenberg> The other option is to go with packet switching on some kind of NoC
Asu` has joined ##openfpga
<azonenberg> Where each node connects to a router (which may be either dedicated to the node, or interspersed between them)
<Sprite_tm> Hm, that's like pcie, isn't it?
<azonenberg> then the routers push packets around until they get where they need to be
<azonenberg> I'm thinking on chip interconnect, not external
<tnt> Sprite_tm: what's the bus ? what's the target fpga ?
Asu has quit [Ping timeout: 268 seconds]
<azonenberg> and i dont actually know much about pcie beyond the physical layer
<azonenberg> i'll get back to you in a few months once i've written a protocol decoder for my scope :P but that might not happen any time soon as i think it's too fast for me to decode without a new scope or a custom FPGA-based digital capture board
<Sprite_tm> tnt: bus is not really much of a real bus atm, just address/data/read/write and a done line.
<Sprite_tm> Think what the PicoRV32 has natively.
<Sprite_tm> Target is an ECP5, the -45F variant.
<Sprite_tm> azonenberg: My intuition tells me that that interconnect needs to be pretty damn fast, otherwise it'll be the choke point, right?
<azonenberg> Yes
<azonenberg> A full crossbar is probably the easiest to implement, and will have the best performance
<azonenberg> as long as you dont have too many devices it shouldn't be tooooo big
<Sprite_tm> Sounds like it. Any idea if that's heavy on the resources?
<azonenberg> well depends on your data width
<azonenberg> I've built NoCs with bus widths from 16 to >512 bits
<azonenberg> as you can imagine resource requirements scale accordingly
<Sprite_tm> Data width is 32-bit (I'll bypass the crossbar if I need really fast access) and there will be, say, 5 masters and 5 slaves.
<tnt> that won't be small for sure ...
<azonenberg> Well, let's see
<azonenberg> The port selection/arbitration logic is likely to be small compared to the crossbar itself
<azonenberg> So you have ten ports each of which can take data from one of five endpoints?
Dolu has quit [Quit: Leaving]
<Sprite_tm> 10 ports?
<azonenberg> So ten ports * 32 bits of 5:1 mux
<azonenberg> data out to the 5 masters and the 5 slaves
<azonenberg> (i assume both can tx and rx?)
<tnt> you have the 32 bit address too
<Sprite_tm> Yes, slaves essentially act as a memory-mapped device, so tx and rx.
<azonenberg> tnt: ah yeah i forgot this isnt a 32 bit wide packetized bus
<azonenberg> it's got metadata along side
<azonenberg> (the antikernel noc was 32 bits but moved 128 bit frames in 4 consecutive cycles)
Dolu1990 has joined ##openfpga
<Sprite_tm> tnt: that may be limited... system has 16M of main memory, that's the largest address space... rest is, I dunno, 8 to 16 bits wide.
<azonenberg> ok so 64 bits for 32 data + 32 address
<azonenberg> Just for the sake of estimating
<azonenberg> And we'll say 5 ports have address, since the master probably doesnt get an address from the slave
<azonenberg> so 5*64 + 5*32 = 480 5:1 muxes
<Sprite_tm> Ah yes, master doesn't have memory.
<azonenberg> i dont know how big a 5:1 mux is in the ecp5 slice architecture
<Sprite_tm> So you'd need 4 of those.
<azonenberg> to a first order, assuming we only have lut4s available, you can do a 2:1 in a lut4
<azonenberg> Yeah, unless you have wide input muxes or something in the tiles in addition to the luts
<Sprite_tm> aka 1920 luts in total.
<azonenberg> (something akin to the xilinx F7MUX/F8MUX)
<Sprite_tm> I don't think it has anything special.
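The back-of-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the discussion, not a synthesis result: the function name and parameters are hypothetical, and it assumes one 2:1 mux per LUT4 and N-1 such muxes per N:1 mux, with no dedicated wide-mux resources. The 24-bit variant assumes the 16M of main memory is byte-addressed.

```python
def crossbar_lut4_estimate(n_sources=5, data_bits=32, addr_bits=32,
                           ports_with_addr=5, ports_data_only=5):
    """Rough LUT4 count for the crossbar muxes (hypothetical helper).

    Assumptions: one N:1 mux per output bit, one 2:1 mux per LUT4,
    and an N:1 mux built as a tree of N-1 two-input muxes.
    """
    # Ports facing slaves carry address + data; ports facing masters carry data only.
    mux_bits = ports_with_addr * (data_bits + addr_bits) + ports_data_only * data_bits
    luts_per_mux = n_sources - 1
    return mux_bits, mux_bits * luts_per_mux


# Numbers from the discussion: 5*64 + 5*32 = 480 muxes, 4 LUT4s each.
print(crossbar_lut4_estimate())              # (480, 1920)

# Trimming the address to 24 bits (enough for 16M bytes) shrinks it a little.
print(crossbar_lut4_estimate(addr_bits=24))  # (440, 1760)
```

Arbitration and address decode logic are not counted here; as noted above, they are likely to be small next to the muxes.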
<tnt> Sprite_tm: what are your 5 masters btw, that seems heavy ?
<Sprite_tm> tnt: Risc5 master, Risc5 GPU-processor, Risc5 audio processor, and for fun, Z80 and 6502 :P
<Sprite_tm> Serial PSRAM through a cache of block RAM.
<tnt> what's the main memory ?
<Sprite_tm> Tbh, I may be able to get away with putting some of those on the same crossbar endpoint, maybe with a small I-cache to alleviate their use of the main connection to memory.
<Sprite_tm> So theoretically, I could also give that multiple endpoints on the crossbar, so dev 2 can access cached memory while dev 1 is blocked on a cache miss...
<Sprite_tm> Well, I say 'I could', but I have to implement it myself, so that's the question :P
<Sprite_tm> azonenberg: Thanks a lot, I may be able to get away with a small crossbar switch indeed, and that would make the architecture nice and orthogonal. /me was thinking of attaching private memories to some processors before...
<tnt> tbh, I'd probably have dedicated path and cache for memory access for each of the 3 main CPUs, and possibly have the PSRAM controller on a separate clock to make sure to run it at its max speed. (not sure what fmax you get on a PicoRV32 on ecp5)
<Sprite_tm> tnt: 60'ish MHz atm, but that's for the entire SOC. Although the memory path seems to be the pain point atm.
<Sprite_tm> Also, dedicated cache is evil because then I need to think about cache consistency. And as this is my first SoC, I'm sure I will lose half a year of my time, half a decennium of my life and a lot of hairs to get that right.
<tnt> well a crossbar isn't going to improve things. You can register the transactions but that adds latency and each cycle will be a cycle where the cpu does nothing.
<Sprite_tm> Well, there's some specialized memory on the side (think FIFO) that I need to write in really fast. (GPU is effectively a RiscV 'racing the beam', Atari 2600 style.)
<Sprite_tm> But I still want all CPUs to be able to access stuff like that, because confusing otherwise.
_whitelogger has quit [Remote host closed the connection]
_whitelogger_ has joined ##openfpga
Bike has joined ##openfpga
<whitequark> azonenberg: re crossbar
<whitequark> what if instead of a mux you used logical OR between all the fan-in devices?
<whitequark> if they always output 0 when not addressed (which you can formally verify quite easily) it's more compact
<tnt> well with a full cross bar devices can be accessed in //
<tnt> but yeah, for a large N:1 mux and a high # of bits, decoding to a 1-hot and doing an AND then OR-ing is more efficient.
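The one-hot AND-then-OR scheme, and the bare-OR variant where unaddressed devices drive 0, can be checked against a plain multiplexer in a small Python model. This is a behavioural sketch only; the function names are made up for illustration and say nothing about how many LUTs a given toolchain would actually use.

```python
def mux_select(inputs, sel):
    # Reference behaviour: a plain N:1 multiplexer.
    return inputs[sel]


def mux_and_or(inputs, sel, width=32):
    # Decode sel to one-hot, AND each input with its enable, then OR the results.
    mask = (1 << width) - 1
    result = 0
    for i, value in enumerate(inputs):
        enable = mask if i == sel else 0  # one-hot bit replicated across the word
        result |= value & enable
    return result


def or_bus(outputs):
    # Variant: every device drives 0 unless addressed, so a bare OR is enough.
    result = 0
    for value in outputs:
        result |= value
    return result


inputs = [0x11111111, 0x22222222, 0xDEADBEEF, 0x00000000, 0xFFFFFFFF]
assert all(mux_and_or(inputs, s) == mux_select(inputs, s) for s in range(5))
print(hex(or_bus([0, 0, 0xDEADBEEF, 0, 0])))  # 0xdeadbeef
```

The bare-OR form is only correct if the "drives 0 when not addressed" property really holds for every device, which is the part that would need to be verified formally.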
<whitequark> in // ?
<tnt> Master 1 talks to Slave 1 and Master 2 talks to Slave 2 for instance.
<tnt> (we were in a multi-master scenario)
<whitequark> ahh full crossbar
<whitequark> ok
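What "accessed in parallel" means for a full crossbar can be shown with a tiny one-cycle arbitration model in Python. It is a sketch under assumptions: the `arbitrate` function is hypothetical, and fixed priority (lowest master index wins) is chosen only for brevity; a real design might well use round-robin instead.

```python
def arbitrate(requests):
    """Grant at most one master per slave for a single cycle.

    requests maps master index -> requested slave index.
    Fixed priority is an assumption made to keep the model short.
    """
    grants = {}   # slave -> master that owns it this cycle
    stalled = []  # masters that have to retry next cycle
    for master in sorted(requests):
        slave = requests[master]
        if slave in grants:
            stalled.append(master)
        else:
            grants[slave] = master
    return grants, stalled


# Different slaves: both transfers go through in the same cycle.
print(arbitrate({0: 0, 1: 1}))  # ({0: 0, 1: 1}, [])

# Same slave: only one master is granted, the other stalls.
print(arbitrate({0: 2, 1: 2}))  # ({2: 0}, [1])
```

With a single shared bus, the second case would apply to every pair of simultaneous requests, which is the choke point mentioned earlier.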
<Sprite_tm> Blerp, I feel so much a n00b in the implementation of these things... but there's only one way to get rid of inexperience.
<Sprite_tm> ...Back in the day, when you could just plunk everything on the same bus and use DTACK etc to do multi-master, everything was much simpler :P
<Sprite_tm> Not that I ever implemented that, but hey :P
<tnt> Sprite_tm: in some old FPGA you can do that I think :)
<Sprite_tm> Yeah, I've heard. Fun stuff, you can code explode-y verilog :P
flea86 has quit [Ping timeout: 246 seconds]
flea86 has joined ##openfpga
X-Scale has joined ##openfpga
<sorear> *mumble* this is a lost cause I know but “Risc5” (Arabic) is a Project Oberon thing
Jybz has joined ##openfpga
<flea86> sorear: I'm rather surprised that concept (Oberon) isn't more popular..
<Sprite_tm> Hm, there's no way for Yosys to accept arrays as inputs/outputs to Verilog modules, is there? :/
* Sprite_tm can go the very-wide-vector approach, but it feels so janky...
<tnt> Sprite_tm: yeah I know it's unfortunate :/ I feel the same but gotta live with it AFAIK.
<Sprite_tm> That's a shame. Ah well, preprocessor directives to the rescue :)
flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]
<Sprite_tm> Ergh, now Verilator is bleating that my for-loop variable isn't a constant so I can't index vectors with it. Will the hurting ever end?
<sorear> mm
<Sprite_tm> Aaaah, wait, nm, I ran into this before I think... I think it's complaining about the *width* of the slice I pull.
<Sprite_tm> Nm, think I got this :)
<Sprite_tm> For historical reasons: my issue is that I had a for loop and a vector statement with vector[i*4+3:i*4] in there.
<Sprite_tm> Verilator is too stupid to see that both sides of that slice have a fixed relation with i, and complains that the bitsize of the slice is variable.
<Sprite_tm> Solution is to rewrite it to vector[i*4+:4] instead.
<Sprite_tm> And now, when I run into this again and have forgotten what the issue is, I hopefully punch the issue in Google and get this chat log as the solution.
<Sprite_tm> So hey future Sprite_tm: Shame on you for being forgetful!
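For reference, the indexed part-select `vector[i*4+:4]` picks 4 bits starting at bit `i*4`, the same bits as `vector[i*4+3:i*4]`, but with a width that is visibly constant. A small Python model of that semantics follows; `part_select` is a hypothetical helper that treats the vector as a plain integer, and it is not Verilog.

```python
def part_select(vector, base, width):
    """Model of Verilog vector[base +: width]: width bits starting at bit base."""
    return (vector >> base) & ((1 << width) - 1)


vector = 0xBEEF
# Equivalent of looping i = 0..3 over vector[i*4 +: 4].
nibbles = [part_select(vector, i * 4, 4) for i in range(4)]
print([hex(n) for n in nibbles])  # ['0xf', '0xe', '0xe', '0xb']
```

Only the base moves with the loop variable; the width stays a literal, which is what lets a tool accept the slice without having to reason about the relation between the two bounds.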
wpwrak has quit [Ping timeout: 250 seconds]
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
wpwrak has joined ##openfpga
gnufan_home has joined ##openfpga
Jybz has quit [Ping timeout: 252 seconds]
Jybz has joined ##openfpga
Jybz has quit [Client Quit]
Bike has quit [Quit: incommunicado]
_whitelogger has joined ##openfpga
zem has quit [Ping timeout: 246 seconds]
zem has joined ##openfpga
nickjohnson has quit [Ping timeout: 264 seconds]
nickjohnson has joined ##openfpga
emeb_mac has joined ##openfpga
emeb_mac has quit [Ping timeout: 244 seconds]
OmniMancer has quit [Quit: Leaving.]
Dolu1942 has joined ##openfpga
Dolu1990 has quit [Ping timeout: 250 seconds]
ZombieChicken has quit [Ping timeout: 256 seconds]
ZombieChicken has joined ##openfpga
ZombieChicken has quit [Ping timeout: 256 seconds]
ZombieChicken has joined ##openfpga
s_frit has quit [Remote host closed the connection]
s_frit has joined ##openfpga
emeb_mac has joined ##openfpga
futarisIRCcloud has joined ##openfpga
GuzTech has quit [Ping timeout: 255 seconds]
Asu` has quit [Remote host closed the connection]
jevinskie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Dolu1990 has joined ##openfpga
Dolu1990 has quit [Client Quit]
jevinskie has joined ##openfpga
Dolu1942 has quit [Ping timeout: 252 seconds]
ZombieChicken has quit [Quit: Have a nice day]