<modwizcode>
It should be functionally the idea you want although the details aren't identical and I won't claim my coding style in it is by any means ideal or desirable
<slan>
modwizcode: wow, thanks for the detailed comments. Your implementation is much easier to read and reason about. However, plugging my cpu shows the same combinatorial loop. That probably means my problem is in the implementation of the cpu itself. I'm carefully reviewing the feedback I'm getting from yosys write_cxxrtl but the design is so big (and convoluted;) that I need to slash things here and there to understand better.
<d1b2>
<4o> how do i cast Signal value to an int?
<agg>
in simulation?
<d1b2>
<4o> during synthesis
<agg>
what are you trying to do?
<d1b2>
<4o> https://paste.debian.net/1189832/ can i has native python math.log for synthesis? it says: return math.log(a, 2) TypeError: must be real number, not Signal
<agg>
math.log would have to run at synthesis time but the value of `a` is only known at runtime, nmigen can't turn math.log into rtl because it's a python function that would just get called as normal at elaboration
<d1b2>
<4o> yep. true. time to add this as a feature
<agg>
for log2 you probably want something better described as clz?
<agg>
(not that nmigen has that built-in afaik, you'd have to implement it)
<agg>
or if not clz then a sort of 'position of most significant set bit' anyway
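A minimal sketch of the "position of most significant set bit" idea agg mentions, modeled in plain Python rather than as nmigen gateware. The names `msb_position` and `clz` are hypothetical helpers for illustration only; for a nonzero input the MSB position equals floor(log2(value)).

```python
def msb_position(value: int, width: int) -> int:
    """Index of the most significant set bit of a `width`-bit value.

    For value > 0 this equals floor(log2(value)).
    """
    if value <= 0 or value >= (1 << width):
        raise ValueError("value must be nonzero and fit in `width` bits")
    # Scan from the top bit down, the way a priority encoder would.
    for i in reversed(range(width)):
        if (value >> i) & 1:
            return i


def clz(value: int, width: int) -> int:
    """Count leading zeros of a `width`-bit value (returns width for 0)."""
    if value == 0:
        return width
    return width - 1 - msb_position(value, width)


print(msb_position(16, 8), clz(16, 8))
```

In gateware the same scan would presumably become a priority encoder over the bits of the Signal; that mapping is an assumption here, not something shown in the channel.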
<d1b2>
<4o> nah, i'll just generate a lut for such functions. time to add native python to nmigen
<agg>
I don't think math.log is implemented in native python...
<agg>
and it's implemented for floating-point numbers, not integers, too
<d1b2>
<4o> well i mean support arbitrary python functions for synthesis
<d1b2>
<4o> fixed point is a good float
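The LUT approach 4o describes can be sketched like this: `math.log2` runs on plain Python ints at elaboration time, and only the resulting table of fixed-point constants would end up in hardware. The function name `log2_lut` and the parameters `in_width` and `frac_bits` are assumptions made for this example.

```python
import math


def log2_lut(in_width: int, frac_bits: int) -> list:
    """Fixed-point log2 table for every `in_width`-bit input.

    Entry a holds round(log2(a) * 2**frac_bits). log2(0) is undefined,
    so entry 0 is just a placeholder value of 0.
    """
    scale = 1 << frac_bits
    table = [0]  # placeholder for a == 0
    for a in range(1, 1 << in_width):
        table.append(round(math.log2(a) * scale))
    return table


lut = log2_lut(in_width=4, frac_bits=8)
print(lut[2], lut[8])  # exact powers of two give exact multiples of 2**frac_bits
```

A list like this could then be used to initialise a memory or a constant array in the design. The table grows as 2**in_width, which is the main tradeoff of the approach.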
<agg>
you want it to turn your call of math.log(signal) into a ton of gates that implements log?
<d1b2>
<4o> yep
<mwk>
that... cannot possibly end well
<d1b2>
<Darius> implementing log2 in gateware sounds non trivial
<agg>
especially given as math.log is from libc
<d1b2>
<Darius> and probably something with a slew of tradeoffs
<agg>
you'd have to turn arbitrary x86_64 fpu instructions into gateware, even harder than regular HLS
<agg>
actually I guess it's easy
<d1b2>
<4o> everything is a lut in hw
<agg>
just instantiate an x86_64 CPU
<d1b2>
<dub_dub_11> lol
<d1b2>
<Darius> step 3: profit
<agg>
and load in libc's logf into rom or whatever
<agg>
you might be interested in http://www.myhdl.org/ instead then, which is more "python expressions -> gateware"
<d1b2>
<4o> nah, tried it. don't like it. i like nmigen
<agg>
well, good luck with your new project
<d1b2>
<4o> thanks
<modwizcode>
I mean maybe you could automagically try and setup linear interpolation for a range?
<modwizcode>
That might be more doable
<modwizcode>
If you reduce the function to a boundary and then define a desired precision level I think that could work
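A rough sketch of that suggestion: restrict the function to a bounded range, approximate it with equal-width linear segments, and keep doubling the segment count until a desired precision is met. All names here (`build_segments`, `interpolate`, `segments_for_precision`) are hypothetical, and the error check is a sampled estimate rather than a proof.

```python
import math


def build_segments(func, lo, hi, n_segments):
    """Return (x0, y0, slope) for n equal-width segments covering [lo, hi]."""
    step = (hi - lo) / n_segments
    segments = []
    for i in range(n_segments):
        x0 = lo + i * step
        y0, y1 = func(x0), func(x0 + step)
        segments.append((x0, y0, (y1 - y0) / step))
    return segments


def interpolate(segments, lo, hi, x):
    """Evaluate the piecewise-linear approximation at x."""
    step = (hi - lo) / len(segments)
    index = min(int((x - lo) / step), len(segments) - 1)
    x0, y0, slope = segments[index]
    return y0 + slope * (x - x0)


def segments_for_precision(func, lo, hi, max_error, samples=1000):
    """Double the segment count until the sampled error is within max_error."""
    n = 1
    while True:
        segments = build_segments(func, lo, hi, n)
        points = [lo + (hi - lo) * k / samples for k in range(samples + 1)]
        worst = max(abs(interpolate(segments, lo, hi, x) - func(x)) for x in points)
        if worst <= max_error:
            return n, segments
        n *= 2


# log2 on [1, 2] with a target absolute error of 0.001.
n, segs = segments_for_precision(math.log2, 1.0, 2.0, 0.001)
print(n, interpolate(segs, 1.0, 2.0, 1.5))
```

Only the per-segment `(y0, slope)` pairs would need to be stored in hardware, which is far smaller than a full lookup table at the cost of a multiplier and an adder.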
<modwizcode>
it might not be a bad idea for an nmigen "log2_int" primitive though?
<modwizcode>
slan: That module might not be perfect still, like I said I haven't tested it properly. In fact I think I see a trivial issue that I need to fix.
<modwizcode>
I pushed a fix, I doubt it will fix anything but you could try, I looked at the hart design and nothing specifically jumped out at me as an issue
<slan>
modwizcode: I'm in the process of writing a toysoc based on my full implementation to isolate potential loops. Using your interconnect as a starting point
<modwizcode>
Cool, make sure you pull in that latest commit (or make the little tweak) because I'm pretty sure that without it, it's definitely broken.
<slan>
I just checked the diff
<modwizcode>
I wonder where the loops are, I was thinking about writing a top level thing I could add to throw it at vivado and test for loops
<slan>
I just did that but Vivado identified 100+ loops and I really couldn't make sense of them
<modwizcode>
Interesting
<modwizcode>
I don't think my interconnect should have that many problems...
<slan>
Thus the small soc based on similar ideas to help with visualisation
<slan>
One thing that I'm concerned about is the idea of don't bother with dbus request if ibus is not ready (i.e. we don't have a valid instruction to decode)
<Evidlo>
does nmigen simulate combinatorial logic delay? the time resolution on the output .vcd files is insanely high (100 ns)
<slan>
That makes the whole combinatorial impl actually depend on sequence of comb signals
<slan>
Evidlo: AFAIK no. There have been discussions/advice to use the falling edge of the clock to send comb stimuli but I don't know more
<d1b2>
<thirtythreeforty> Question - what's nmigen's replacement for LiteX's SoC framework? I know something exists, I'm just not familiar with it.
<modwizcode>
There's no true simulation of combinatorial logic delay. There's some internal handling of (forgive me, I forgot the correct terminology) internal steps that all occur instantly as far as timesteps are concerned, but you won't see that in the VCD. In the VCD all changes happen at the clock.
<Evidlo>
LiteX is being replaced? I was planning on using that in a project soon
<modwizcode>
You can see that effect by changing combinatorial signals, doing a yield Settle() in the testbench, and then looking at the current logic values after the Settle. It's not entirely obvious
<modwizcode>
I'm not sure that's true regarding LiteX
<d1b2>
<thirtythreeforty> Not as such, it's very much active. But there's a bit of impedance mismatch between LiteX and nMigen - you can use one with the other, just takes a little more effort
<modwizcode>
There's been some setup for some soc primitives in nmigen/nmigen-soc but it's not even close to anything like LiteX nor do I think it's intended to be
<modwizcode>
the main holdup, I think, on any more SoC stuff specifically for nmigen is the development of the streams RFC
<d1b2>
<thirtythreeforty> I've seen some nMigen projects foregoing LiteX, but I definitely need to pull in LiteDRAM
<modwizcode>
I don't see any serious reason to forego LiteX if you desire what it does.
<modwizcode>
Yes that's the issue
<d1b2>
<thirtythreeforty> I see thanks. Pity Wishbone doesn't have a stream companion spec like AXI does
<modwizcode>
I kind of thought it essentially did
<modwizcode>
but it's not really normal wishbone then
<d1b2>
<thirtythreeforty> Yeah, if you finagled a stream onto the Wishbone wires you'd have stream-that-kinda-looks-like-Wishbone
<Evidlo>
on a scale of 1-10, how hard would it be to implement XGII in nmigen? I'm sort of an FPGA newbie. Could I use LiteEth as a starting point?
<Evidlo>
XGMII*
<agg>
what do you want to interface it to?
<Evidlo>
to a desktop PC or laptop. I could probably do without the IP layer stuff
<agg>
XGMII is 32 bits in each direction plus control, so it's quite rare for it to leave the FPGA
<modwizcode>
Wikipedia seems to disagree with you on that
<d1b2>
<thirtythreeforty> Would not recommend routing a 32 bit bus if you can avoid it 🙂
<modwizcode>
I believe you more than wikipedia here though
<Evidlo>
doesn't any device with 10GigE do this?
<Evidlo>
maybe I misunderstood you. I'm connecting it to a 10GigE PHY, which is connected to a desktop
<agg>
which 10GigE PHY? I think some exist with XGMII but it's much less common than XAUI or RXAUI
<modwizcode>
wouldn't you also be mostly working with hard IP at those speeds too?
<Evidlo>
I don't have any specifics yet. I'm just trying to get a feel for the feasibility of this with my skillset. I can use any PHY. My only constraint is that I'd like to use ECP5
<agg>
XGMII is only 150MHz DDR or so, so would be ok without hard IP I guess
<modwizcode>
oh
<agg>
but it needs like 72 pins to the phy
<modwizcode>
I was looking at AUI which is why I was talking about high speed
<modwizcode>
sorry XAUI
<modwizcode>
AUI is rather old
<agg>
yea, XAUI is 3.125Gbps or so, you'd want to use a serdes
<modwizcode>
Heh of course lattice sells the IP block rip
<Evidlo>
is this even worth attempting, or should I just buy the IP?
<agg>
Evidlo: so, implementing xgmii in nmigen should be pretty straightforward, I haven't tried it but have done rgmii which is not hugely dissimilar just much less wide
<modwizcode>
I mean I don't think it's terribly hard if it's XGMII I would assume
<modwizcode>
I have an RGMII block from work that's not in nmigen but I'm hoping to port it (supposedly it's an open implementation but I need to ask someone first)
<agg>
but you'll need to route 2x 32-bit buses between the ecp5 and the mac, with some attention to length matching, which will complicate the pcb a bit
<d1b2>
<thirtythreeforty> Depends on if you want to learn the guts of XGMII, or want to get a project done that uses high speed ethernet
<Evidlo>
I'm willing to spend a week or two specifically on this part
<d1b2>
<thirtythreeforty> This right here is ~1/3 of your I/O on a 256 pin device. You'll likely need 6 layers to get it routed; my 4 layer design is really pushing it with just the 32 pin DDR interface
<modwizcode>
Even routing RGMII was a huge pita initially
<agg>
rgmii is higher speed so the length matching requirements are a bit tighter
<agg>
ironically
<modwizcode>
Yeah I think I saw this rgmii design before, needs some more comments
<agg>
I wrote it in like a couple hours very early in the morning :P on my to-do list is actually boxing it up with my rmii interface and other mac components into a library, but... time
<modwizcode>
Yeah if I get permission from work I'll throw our data transfer system at it and push it up somewhere, our system is rather nice :)
<modwizcode>
I want to replace the VHDL/verilog combo we have now
<modwizcode>
Why is it called "cl"?
<agg>
it's in the repo where i was doing pinout detection on a colorlite board
<modwizcode>
oh
<agg>
which has dual gige, so handy for this too
<agg>
I don't have any 10GbE hardware sadly :p
<modwizcode>
I should have access to something with that at some point here
<modwizcode>
Although I'm pretty sure we're doing the high speed over hdmi cables so... there's that
<modwizcode>
I wonder if anyone knows that the uart in nmigen-stdio exists
<Evidlo>
agg: is RGMII really that simple? I guess I was expecting a module that was a couple hundred lines at least
<agg>
RGMII is a 4-bit DDR bus, so the interface in the fpga processes one byte per clock
<agg>
there's a few control signals which are also set on each clock edge, so unlike rmii you don't even have to track them
<agg>
the MAC sorts out the preamble, SFD, checksum, IPG (which I've also done in nmigen and is pretty easy, though not shown there)
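To illustrate the "4-bit DDR bus, one byte per clock" point, here is a small plain-Python model of how a byte stream maps onto the two clock edges. It assumes the usual gigabit RGMII convention of the low nibble on the rising edge and the high nibble on the falling edge; the function names are made up for this sketch, and it ignores the control signals entirely.

```python
def rgmii_ddr_encode(frame: bytes) -> list:
    """Split each byte into (rising_edge_nibble, falling_edge_nibble).

    Assumption: bits 3:0 go out on the rising edge, bits 7:4 on the falling edge.
    """
    return [(byte & 0xF, byte >> 4) for byte in frame]


def rgmii_ddr_decode(nibble_pairs) -> bytes:
    """Reassemble one byte per clock cycle from the two sampled nibbles."""
    return bytes(low | (high << 4) for low, high in nibble_pairs)


# Seven preamble bytes followed by the start-of-frame delimiter.
preamble_and_sfd = bytes([0x55] * 7 + [0xD5])
pairs = rgmii_ddr_encode(preamble_and_sfd)
print(pairs[-1])
print(rgmii_ddr_decode(pairs) == preamble_and_sfd)
```

In an FPGA the split and reassembly would normally be done by DDR I/O registers, so the logic behind them only ever sees whole bytes at the interface clock rate.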
<Evidlo>
why did you spin your own and not use LiteEth?
<agg>
seemed more fun
<agg>
same reason I wrote my own sdram and etc...
<agg>
I mostly don't need actual SoCs or CPUs so haven't really needed litex for anything, either
<d1b2>
<thirtythreeforty> It's actually somewhat tricky to make sure the CPU stuff in LiteX is disabled. It kinda assumes you want one
<agg>
I mostly don't have a big interconnect either, just lots of 1:1 streams between things
<agg>
so... maybe one day, but I'm sort of hoping to use the nascent nmigen-soc instead of litex/migen
<agg>
anyway it's sort of beside the point, writing rmii and the rest of the mac and ip stack was fun a few years ago, then it seemed like it would be fun to try the rgmii version once i got that colorlite board anyway
<agg>
I think xaui would be possible with an ecp5 that has a serdes, but using the serdes is a bit trickier (though i understand it is also supported by the open source tools, and even if it wasn't, nmigen can use diamond to build)
<modwizcode>
They have an IP they sell that does XAUI
<agg>
you'd spend more time faffing with the serdes control, but the pcb would be easier and the interface from your logic would probably be about the same
<agg>
sure but there's no fun in buying the IP and it definitely doesn't work with the open source tooling in that case :p
<modwizcode>
I meant that it's possible
<agg>
oh, yep, sure
<agg>
it's a shame you can't just run the serdes directly to the sfp+ module
<agg>
but i guess not quite fast enough :p
<modwizcode>
Can you not?
<agg>
I think on the faster fpgas that's totally doable and you can just put a phy in the gateware
<modwizcode>
oh
<modwizcode>
you meant that
<d1b2>
<thirtythreeforty> See also Luna's USB3 ambitions
<agg>
well the sfp+ would need a 10Gbps input I guess and the ecp5 goes up to 5Gbps
<modwizcode>
Right
<agg>
Evidlo: rmii is a bit harder than rgmii but not much
<modwizcode>
Yeah weirdly rgmii is actually the easiest iirc
<agg>
and 8 bits of control, each of which is data/control for the corresponding byte
<agg>
seems kind of nice actually, you just set the control bit to 0 for data and 1 for a control word, where a control word is like 'start'
<Evidlo>
so it seems worthwhile for me to start with RGMII, considering I didn't even know what DDR meant before this week
<agg>
I guess it will be a bunch easier to get working, especially if you already have a PCB with RGMII on it
<Evidlo>
and also considering that I only have the hardware for RGMII right now
<agg>
hopefully your PHYs also implement the clock to data phase shift so you don't need to do it in the fpga, too
<agg>
or the pcb implements it... either way hopefully you don't have to
<agg>
most phys do now, sometimes you'll need to enable it, sometimes it's a strapping option, phy datasheet should say
<agg>
this is where the fpga would ideally change the data at the same instant the clock changes, but the phy needs to sample it 2ns after the clock edge to get the valid data
<agg>
page 22, where it talks about transmit and receive path
<Evidlo>
by phase shift, you're talking about holding the data lines constant for some time before and after the clock edges?
<agg>
so you probably want to set the pad skew registers to add clk delay, and the default receive delay should be ok
<agg>
more like delaying the clock edge so the input registers capture at the right sampling instant
<Evidlo>
so when I initialize the PHY, I set the pad skew registers correctly and then I can change the clock, data, and controls lines simultaneously
<Evidlo>
why are there so many options for delay? why not just on/off? doesn't RGMII run at a fixed frequency?
<agg>
you might need to compensate for different track lengths on your pcb, or inside your ICs
<agg>
since the phy has the rx skew enabled by default, you should be able to test transmission from the fpga without having to get the registers set up, which involves using the MDIO interface and might be annoying (unless there's also a strapping option)
<agg>
but yes, the ideal is that the fpga can transmit with data and clock exactly in phase, and receive with data and clock about 90° out of phase, so that your fpga doesn't need to do any delay management at all, which simplifies things a bit
<d1b2>
<benzn> is there a convention in nmigen to specify (or derive) input and output signals?
<d1b2>
<benzn> for a given module
<d1b2>
<benzn> specifically, i have a helper function for python notebooks etc, where I can call make_callable([module]) and get an imperative function that I can pass inputs to and get outputs (as defined by helper methods at the moment)
<slan>
modwizcode: I now have a minimal repro for the loop... and I just designed my first CPU (yay!). I'm using your implementation of the interconnect. If you're still curious: https://github.com/slan/hartysoc/blob/toysoc/docs/toysoc.md
<slan>
...and I can't make sense out of the loop reported by Vivado (yet)
<alanvgreen>
generally, if I write m.d.comb += sum(list_of_signals) will fpga toolchains translate this into a tree of adds in gateware? Or will it do all of the adds serially?
<d1b2>
<DX-MON> it'll do all adds all the time in parallel.. you most probably want that operation on the m.d.sync domain so it only happens once a cycle
<alanvgreen>
To rephrase, I'm trying to determine whether the summing operation would be performed in O(len(list_of_signals)) time or O(log2(len(list_of_signals))) time.
<sorear>
I think it would be clearer to ask about latency
<tpw_rules>
i think he means propagation delays
<tpw_rules>
rather than time
<sorear>
that's my point
<tpw_rules>
and i had to do something similar with ORing signals but i wrote out a tree. but i never tested if it was any different to just oring them one at a time
<tpw_rules>
but it wouldn't hurt to describe it as a tree to point the synthesis tool in the right direction and it's really easy to do stuff like that in nmigen
<vup>
I mean I would hope yosys / the synthesis tool is able to do such optimizations on its own...
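The tree-versus-chain question can be sketched in plain Python. `sum()` builds a left-to-right chain of additions as written, while a pairwise reduction builds a balanced tree explicitly. The helper names below are hypothetical, and the "depth" counted is the number of adder levels in the expression as described, not what a synthesis tool ends up producing after its own optimisations.

```python
def tree_sum(terms):
    """Pairwise-reduce `terms` with +, returning (total, adder_levels)."""
    terms = list(terms)
    if not terms:
        return 0, 0
    depth = 0
    while len(terms) > 1:
        # Add neighbours in pairs; an odd leftover passes through unchanged.
        reduced = [terms[i] + terms[i + 1] for i in range(0, len(terms) - 1, 2)]
        if len(terms) % 2:
            reduced.append(terms[-1])
        terms = reduced
        depth += 1
    return terms[0], depth


def chain_depth(n_terms: int) -> int:
    """Adder levels for a left-to-right chain such as the one sum() builds."""
    return max(n_terms - 1, 0)


total, depth = tree_sum(range(1, 9))
print(total, depth, chain_depth(8))
```

Because `tree_sum` only relies on the `+` operator, the same function should in principle accept any objects that overload addition; whether writing the tree out by hand actually changes the synthesised result is tool-dependent, as vup notes.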
<_whitenotifier-4>
[nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±2] https://git.io/Jmwwv
<_whitenotifier-4>
[nmigen/nmigen] whitequark c30dcea - hdl.ast: handle int subclasses as slice start/stop values.
<_whitenotifier-4>
[nmigen] whitequark closed issue #601: Slicing a Value with a bool generates invalid RTLIL - https://git.io/JmZlA
<_whitenotifier-4>
[nmigen/nmigen] github-actions[bot] pushed 1 commit to gh-pages [+1/-1/±19] https://git.io/JmwwL
<_whitenotifier-4>
[nmigen/nmigen] whitequark 944ce25 - Deploying to gh-pages from @ c30dcea24de4dfa926345513a89f75eb4ed7c7d1 🚀