<Evidlo>
ah, it seems like its just a newer version of gtkwave
pftbest has joined #nmigen
<Evidlo>
wish I could get sigrok to read those vcd files correctly though
<Evidlo>
or pulseview I mean
chipmuenk has joined #nmigen
jjeanthom has joined #nmigen
revolve has quit [Read error: Connection reset by peer]
revolve has joined #nmigen
jjeanthom has quit [Ping timeout: 260 seconds]
jjeanthom has joined #nmigen
Raito_Bezarius has quit [Ping timeout: 264 seconds]
rohitksingh_ has joined #nmigen
rohitksingh_ has quit [Client Quit]
Raito_Bezarius has joined #nmigen
bvernoux has joined #nmigen
Bertl_zZ is now known as Bertl
chipmuenk has quit [Quit: chipmuenk]
chipmuenk has joined #nmigen
jjeanthom has quit [Ping timeout: 264 seconds]
revolve has quit [Ping timeout: 244 seconds]
revolve has joined #nmigen
slan has joined #nmigen
jjeanthom has joined #nmigen
<_whitenotifier-4>
[nmigen] jfng opened issue #601: Slicing a Value with a bool generates invalid RTLIL - https://git.io/JmZlA
<slan>
Hobbyist HW engineer here, I have a RISC-V running from a single nmigen memory with 2 read ports and 1 write port, cpi=1 as expected. Trying to get it connected to multiple devices, I wrote an "interconnect" which works fine when I have only one device on the ibus/many on the dbus but fails with many ibus/many dbus devices (well, simulation is fine fine but Vivado synthesis is exploding with combinatorial
<slan>
loops). Maybe I missed an important theoretical limitation and what I'm trying to achieve is not possible, or I failed at implementing my idea, or there is a bug in Vavado... but after exploring all those options I'm back to square one. How would you go about it?
Bertl is now known as Bertl_oO
noknok has joined #nmigen
<modwizcode>
slan: It might help greatly to have an idea of what implemention you have now and what some of the specific failures are
<slan>
The "simple" one is working with one device on ibus
<slan>
But I can't generalize to n devices
<slan>
(see line 57, the for is limited to 1 entry)
<slan>
There are some design notes in the "docs" directorry... but since I'm a beginner the terminology might not be quite right
jjeanthom has quit [*.net *.split]
<slan>
What I'm trying to achive is 1 cpi for load/store with dbus!=ibus and 2 cpi for load/store with ibus=dbus
<slan>
I want to believe this is doable... is it?
<modwizcode>
Yeah I'm trying to look at it and figure out what exactly is going on
<modwizcode>
In the meantime have you looked at how litex does it's interconnects? (idk if there's a specific answer there but I'd assume there is)
<slan>
Thanks a lot. Vivado error didn't really help. Tried to visualize the loop showed basically my whole design. By trial and error it looks like it starts failing when I'm comparing ibus.addr with dbus.addr but...
<slan>
I had a look at minerva's wishbone interconnect (got the idea to use encoder from there). Will look into litex, thanks
<modwizcode>
Just a thought but one thing that might help a lot is to simplify the interconnect switching itself by trying to split off the implementation of the Mux from the rest
<modwizcode>
also I see something that does seem potentially relevant here which is a bit strange to me
<modwizcode>
When the address matches you assign synchronously the cache values
<slan>
Indeed, my idea is that's how I break the combinatorial loop
<modwizcode>
but when it's disabled you pass through
<modwizcode>
Yeah I don't think that's the answer there or at least it's not clear to me what's happening enough to piece it together
<modwizcode>
Let me read your design ideas here and then try to see if I can suggest something besides "don't do that"
<slan>
Much appreciated
<modwizcode>
note also that I'm just looking at interconnect itself here, I'm assuming it's decoupled enough from the core that it's not literally a feedback loop from core output to interconnect input or something.
<slan>
It shouldn't be
<slan>
The core is quite simple. Ther's only one register (pc) that's clocking the whole thing
<slan>
rest is combinatorial
<modwizcode>
okay that's what I'm reading here, I think that should be fine
<modwizcode>
What does ibus mean here? I'm a bit confused
<slan>
One thing I was worried about was my handling of traps, but I never read trap so should be fine too
<slan>
ibus is the instruction bus from where the core is reading instructions
<modwizcode>
That's what I was thinking
<modwizcode>
So the idea is that ibus takes priority over dbus, and if the data is cached at the interconnect (from previous read) use that otherwise go out and get it?
<slan>
the cache is only used when ibus and dbus need the same device
<slan>
so yes, ibus has priority, first read populates the cache so nect clock can use same bus for data request (feeding the core with cached ibus data so it can replay same instruction)
<modwizcode>
hmm
<slan>
But instructions that don't need dbus or with dbus device != ibus device just go through and are executed in one cycle
<modwizcode>
One cycle assuming memory is fast enough I guess
<modwizcode>
But if they need the smae device then there's a stall right?
<slan>
Yes, I want the ideal case to be one cycle. That works with nmigen memory on my Arty (distributed RAM)
<slan>
Yes, stall is handled with the cache
<modwizcode>
I guess I should just look at you're hart impl but I'm assuming if memory was slower than the clock for some reason you stall until memory is ready
<modwizcode>
Just covering all the bases here before I make suggestions
<slan>
Yes that's correct. I did some tests with SDRAM and it was behaving as expected.
<modwizcode>
okay cool
<slan>
(meaning stalling until data came back)
<slan>
(test code is in firmware/sdram.c FWIW)
<modwizcode>
So one problem is that I can see a feedback loop right now
<modwizcode>
At least I think so
<modwizcode>
When you compare the address in interconnect_simple in the else case you also assign the address
<slan>
self.dbus is connected to the core, dbus is the device bus inferred from self.dbus.addr
<slan>
(I need to afk for 10-15 mins, brb)
<modwizcode>
So a couple of things
<modwizcode>
first of all there's a lot of places that busses essentially unconditionally report ready
<modwizcode>
I think that generally is fine for things, except in the case of memory where I don't think you actually look at if the ports report ready?
<slan>
rdy is fine for the design in this branch
<modwizcode>
But inside the interconnect it looks like self.ibus.rdy.eq(1) is set when the cache is valid
<modwizcode>
and only would take a non 1 value if a device is matched
<modwizcode>
also why do you only connect the self.dbus.rdata and rdy to the bus when the cache is used if the bus is ready?
chipmuenk has quit [Quit: chipmuenk]
<slan>
because I'm feeding the core with the cached instruction and tell it's valid to decode/ex/etc...
<slan>
I'm also connecting the request on dbus to the device (variable name bus)
<modwizcode>
right but you're not changing the dbus signals until you get ready from the bus.
<modwizcode>
*ready from the device
<slan>
So in practice the core can execute same instruction with data bus connected
<slan>
hmm
<slan>
true, write requests should hang there...
<slan>
well actually no, the bus will be released anyways... itäs just Iäm feeding rdata which is not used for write requests
<slan>
So I think it's fine
<slan>
I can check the req type to avoid sending back rdata though
<slan>
thanks
<modwizcode>
This code is kind of difficult to follow because of how you intermix the sync and comb domains, in a lot of places I'm pretty sure you're creating latches because you're not fully elaborating the values of the comb signals.
<slan>
I guess that's one pitfall when coming from software. I tend to write procedurally and that's most likely why my design is failing
<slan>
How would you recommend to reorganise the code?
<modwizcode>
So in general you do a lot of "if this assign values" without an else. When you do that with comb definitions the result isn't generally what you want
<modwizcode>
That's not a hard rule but it makes it easier to reason about things
<slan>
Hmm, Iäll try to review my code from that angle
<modwizcode>
One other thing I'd do is probably write it out longform (not using for loops and the like to generate code) and try to get it to work when it's in a form that's fully elaborated like that
<modwizcode>
Once that's working you can figure out how to simplify it
<slan>
That's actually how I started. Maybe it's time to do it one more time...
<modwizcode>
The more incremental you work the easier it will be to figure out what went wrong here
<slan>
One thing that ot me is the fact that Vivado is really good at optimizing unused parts...
<modwizcode>
You have that pattern a lot like I mentioned, but you want to just connect those immediately in the comb block above and then clear cache_valid in that if.
<modwizcode>
I have a weird workflow for checking things where I throw the verilog into yosys and try to get a flattened optimized output and see what it came up with, but that's not necessarily effective.
<modwizcode>
One thing I'd do with this design is try to throw the verlilog into yosys and run the proc pass then get the visualized flowchart and see if that shows you combinatoral loops.
<slan>
I've tried to yosys check my design but no warning came out
<modwizcode>
Well combinatoral loops aren't necessarily illegal I think
<modwizcode>
If you throw it into the write_cxxrtl pass it should report on loops it finds
<slan>
My understanding is that yosys would warn about them
<modwizcode>
one other problem with loops is that nmigen isn't actually guranteed to work with them in the design
<slan>
oh that's cool, will try that
<modwizcode>
they have to be specified as external processes (i.e. seperately implemented in verilog) if you want to do designs with them in nmigen
<slan>
I don't think I want loops :)
<slan>
I want to understand wherethey are so I can remove them
<modwizcode>
Yeah I only mentioned it because if you do have loops that (apparently, I don't know the specific circumstances) can cause the output from nmigen to be not correct
<slan>
I also tried the simulation in Vivado and it worked as expected
<modwizcode>
Sorry that I haven't had better advice here, nothing obviously stood out enough to point it out specifically.
<modwizcode>
Does it simulate correct in nmigen?
<slan>
Hey that's great advice anyways. Thanks a lot
<slan>
Yes simulation is as expected
<slan>
(I just pushed a small fix in the firmware)
<modwizcode>
One thing you should definitely do imo is try to see if you can flatten out the design a bit
<modwizcode>
Like you have a lot of nesting of If's and such
<modwizcode>
and I tend to find it both makes things confusing to reason about and tends to be overcomplicating things
<slan>
Noted
<slan>
I assume you're talking specifically about the interconnect. Would those remarks be relevant to the cpu itself too?
<modwizcode>
In general yes
<slan>
k
<modwizcode>
If you can't flatten it out more I find it tends to mean you should move some of the code into submodules
<modwizcode>
definitely look at how litex does things though. It's written in migen not nmgien but the similarities are pretty clear
<slan>
litex was my initial inspiration to look into migen/nmigen but I thought I should learn by building :)
<slan>
Again, thank you modwizcode. I go write some code with these new ideas!
<modwizcode>
Good luck!
bvernoux has quit [Read error: Connection reset by peer]