<ZirconiumX>
No, Verilog isn't quite mad enough to let you do this shit on the left hand side of an assignment
<ZirconiumX>
However, isn't this a tautology?
SingularitySurf has quit [Remote host closed the connection]
Sarayan has joined #nmigen
<Vinalon>
Does anyone have a feel for what the overhead is like with AsyncFIFO objects? Like, if I want to use one to nest CPU contexts, would it be better to have a separate FIFO for each register or one very wide FIFO to store all of the values?
<Sarayan>
Is there a way to say "tick until that signal is 1" in a python sim?
<ZirconiumX>
Vinalon: The FIFOs are built out of Memory cells internally: small Memory cells can be turned into flops, but giant Memory cells will be built out of block RAMs
<Vinalon>
You can use 'yield <signal>' to get a value in a simulation; maybe something like: https://bpaste.net/H7VA
<Vinalon>
so, smaller widths are probably easier for the tools to optimize? Okay, thanks
<Sarayan>
Vinalon: Can you have that in a sub-function? I remember the sim not liking to yield in sub-functions, but I may be wrong
<Vinalon>
I'm far from an expert, but I think it should work if you call the function with 'yield from funct(...)'.
<Sarayan>
errr ok
<ZirconiumX>
Yeah, you need to `yield from` subfunctions
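A minimal sketch of why the helper has to be called with `yield from`. It uses a toy stand-in for the simulator, not nmigen itself. A sim process is a generator, so a helper that yields only reaches the driver when the caller delegates to it. The names `tick_until`, `process` and `run`, and the `("read", name)` / `"tick"` commands, are hypothetical and not part of nmigen's API.

```python
def tick_until(name, max_ticks=100):
    """Helper generator: advance the clock until signal `name` reads 1.

    Returns the number of ticks waited.
    """
    ticks = 0
    # Ask the driver for the current value; it is sent back into the generator.
    while (yield ("read", name)) != 1:
        if ticks >= max_ticks:
            raise TimeoutError(f"{name} never went high")
        yield "tick"  # ask the driver to advance one clock cycle
        ticks += 1
    return ticks


def process():
    # `yield from` forwards every command of the helper to the driver and
    # hands the helper's return value back to this process.
    waited = yield from tick_until("ready")
    return waited


def run(proc, waveform):
    """Toy driver: `waveform` maps a signal name to its value on each cycle."""
    cycle = 0
    gen = proc()
    try:
        cmd = next(gen)
        while True:
            if cmd == "tick":
                cycle += 1
                cmd = gen.send(None)
            else:
                _, name = cmd
                values = waveform[name]
                # Hold the last listed value once the waveform runs out.
                cmd = gen.send(values[min(cycle, len(values) - 1)])
    except StopIteration as stop:
        return stop.value


print(run(process, {"ready": [0, 0, 0, 1]}))  # 3
```

Calling `tick_until("ready")` without `yield from` only creates a generator object. None of its commands reach the driver, which would match the "sim not liking to yield in sub-functions" symptom.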
<Sarayan>
thanks. Bedtime, I guess I'll try tomorrow
<Vinalon>
Good luck!
<awygle>
i'd guess "lots of small FIFOs" ends up bigger than "one very wide FIFO", depending on a number of factors including whether the synthesis tool can merge flop RAM into dist RAM into block RAM intelligently
<Vinalon>
huh - so would a 'smarter' synthesis tool be more likely to do a good job of handling a bunch of smaller ones, or vice-versa? I usually like to lean towards trusting the compiler (or synthesizer) since low-level tools are always improving.
<daveshah>
In general combined is going to be better, as that way no information is lost
<Vinalon>
so it'd be okay to have a FIFO that's around 1024-2048 bits wide?
<ZirconiumX>
Widths like that scare me :P
<Vinalon>
that's what I was thinking, but it would be nice...
<daveshah>
I can't see any particular problem with that
<Vinalon>
Cool, I'll give it a try - thanks for the information!
<daveshah>
Report back any bugs!
<daveshah>
I'm presuming this isn't on the LP384 btw
<Vinalon>
No, I'm hoping to start with an UP5K but I might have to move up to an ECP5...
<daveshah>
I think this is going to need an ECP5
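A rough back-of-the-envelope sketch of why a FIFO that wide might not fit in block RAM on a small part. The port widths and block counts below are assumptions to check against the Lattice datasheets. The estimate only counts blocks placed side by side to reach the width, and ignores depth and any flop-based fallback.

```python
import math

# Assumed block RAM figures (verify against the datasheets):
# device name -> (widest data port in bits, number of blocks on the device)
ASSUMED_BLOCK_RAM = {
    "iCE40 UP5K": (16, 30),
    "ECP5 LFE5U-25F": (36, 56),
}


def blocks_for_width(width_bits, max_port_bits):
    """Blocks needed side by side to form one `width_bits`-wide memory."""
    return math.ceil(width_bits / max_port_bits)


def report(width_bits):
    """Map each assumed device to (blocks needed, whether that many exist)."""
    rows = {}
    for name, (port_bits, available) in ASSUMED_BLOCK_RAM.items():
        needed = blocks_for_width(width_bits, port_bits)
        rows[name] = (needed, needed <= available)
    return rows


for device, (needed, fits) in report(1024).items():
    print(f"{device}: {needed} blocks, fits={fits}")
```

Under these assumed figures, a 1024-bit-wide memory needs more blocks than the smaller part has, because each block contributes only a narrow slice of the width, however shallow the FIFO is.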
<Vinalon>
I'm trying to implement a simple RISC-V CPU and this is how I'm planning to store CPU registers for trap handlers.
<daveshah>
Is that something that RISC-V even needs?
<ZirconiumX>
Yeah, you can spill the registers to the stack if you need to
<Vinalon>
Not that I'm aware of, but I really like how ARM Cortex-M cores do all of that in hardware
<ZirconiumX>
Actually there's a privileged register for that
<daveshah>
That's going to be a very inefficient way to do it
<daveshah>
It would force the register file to be implemented using FFs rather than BRAM too
<ZirconiumX>
Actually a full spill presents a lot of issues
<daveshah>
As there will be spare space at the end of BRAM for a typical register file implementation, a shadow register approach would be much more efficient
<daveshah>
But I'm not sure if RV actually needs this at all
<ZirconiumX>
Like environment calls being completely unimplementable
<Vinalon>
Oh...yeah, I'm sure that the whole CPU design is very inefficient, considering how little I know about the tooling and FPGA resources.
<Vinalon>
I guess I'm barking up the wrong tree with using FIFOs like this, then, thanks.
<ZirconiumX>
FIFOs - async FIFOs especially - are for clock-domain crossing
<Vinalon>
yeah, my plan was to use different clock edges for reads and writes, but actually, this brings up another question I had about clock domains
<daveshah>
The main problem with this approach is that it would need to access every register bit at once
<daveshah>
Which forces a much less efficient implementation than using the dedicated RAM
<daveshah>
Unless you really know what you are doing, using different clock edges is going to cause more problems, and worse performance, than it solves
<Vinalon>
oh; it's better to always use the same edge and wait more cycles?
<daveshah>
In general, yes
<Vinalon>
and here I thought I was being clever...oh well, thanks.
<Vinalon>
Anyways, it sounds like if I want to store a couple of sets of 32 32-bit values, 'Memory' objects might be better than 'FIFO's?
<daveshah>
The best bet would be to use one deeper memory and just control the upper address bits
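A minimal software model of the "one deeper memory, control the upper address bits" idea, assuming 32 registers per context. The low 5 address bits select the register and the bits above them select the bank. `BankedRegisterFile` and `banked_address` are hypothetical names for illustration. This is plain Python rather than nmigen, so it shows only the addressing scheme, not the hardware.

```python
REG_ADDR_BITS = 5                   # 32 registers per context
REGS_PER_BANK = 1 << REG_ADDR_BITS


def banked_address(bank, reg):
    """Concatenate the bank number (upper bits) and register index (lower bits)."""
    if not 0 <= reg < REGS_PER_BANK:
        raise ValueError("register index out of range")
    return (bank << REG_ADDR_BITS) | reg


class BankedRegisterFile:
    """Single flat memory holding several register contexts back to back."""

    def __init__(self, banks=2, width=32):
        self.banks = banks
        self.mask = (1 << width) - 1
        self.mem = [0] * (banks * REGS_PER_BANK)

    def _addr(self, bank, reg):
        if not 0 <= bank < self.banks:
            raise ValueError("bank out of range")
        return banked_address(bank, reg)

    def write(self, bank, reg, value):
        self.mem[self._addr(bank, reg)] = value & self.mask

    def read(self, bank, reg):
        return self.mem[self._addr(bank, reg)]


regs = BankedRegisterFile(banks=2)
regs.write(0, 5, 0x1234)       # normal context, x5
regs.write(1, 5, 0xDEADBEEF)   # trap context, x5; bank 0 is untouched
```

Switching context then means changing only the bank value. One word is accessed per port per cycle, so the storage can stay in a single block RAM instead of flops.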
Asu has quit [Remote host closed the connection]
<Vinalon>
ohhhh, that would make a lot of sense - thanks! I'll have to stop using an Array of Signals for the main CPU registers, then...how's that for an inefficient design? :)
<ZirconiumX>
Oh, yeah, that's gonna be awful
<Vinalon>
but it still works better than a Python array of Signals with 'for' loops to generate long 'if/elif' blocks in the 'elaborate' method. Sometimes I'm a little slow on the uptake :P
<awygle>
:) we're all learning
<Vinalon>
The Python syntax makes these sorts of changes really easy, though. Since the '[]' operators work the same, I only needed to change how the objects were initialized.
<Vinalon>
("Array(Signal(x) for i in range(y))" -> "Memory(width=x, depth=y)")