ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at · logs at
cr1901_modern has joined #nmigen
<awygle> yes, i like the idea of attaching it to the platform
<awygle> it seems the logical place
<awygle> but we may run into limitations of course, we'll see
<tpw_rules> is there a way to make the verilog export add some kind of prefix to all the module names (except the top)? for various reasons i need to generate verilog for inclusion in larger projects and there can be name conflicts since all the exported modules have the names used in the python code
<whitequark> connect_rpc already does that
<whitequark> but other than that, i don't know of an easy way to do it
<tpw_rules> i don't know what connect_rpc means
<whitequark> oh
<tpw_rules> i see. i don't think that would work for my application
<whitequark> wait, why not?
<whitequark> lots of toplevel python code?
<tpw_rules> no, it needs to spit out a verilog file that gets dumped into another fpga project
<whitequark> sure
<whitequark> you can just do write_verilog after yosys is done importing
<whitequark> like it already happens in nmigen anyways
<tpw_rules> i'm confused. i have a large fpga project which is not mine and to which i want to contribute a module. if any of my submodules are called the name of a module already in that project, it won't work. so i'd like to prefix all my submodule names with something that uniquifies them with respect to the rest of the project.
<whitequark> yes
<whitequark> hang on, i'll approach this a bit differently
<whitequark> so you know how nmigen currently outputs verilog? it emits rtlil, imports it via read_ilang, then exports via write_verilog
<whitequark> what i'm suggesting is that you could have nmigen emit rtlil, import it via connect_rpc, then export via write_verilog. as a side effect of how connect_rpc works, this will add prefixes to the modules, exactly like you want
<whitequark> without any changes to nmigen or yosys or anything else
<whitequark> it's just that i wrote this exact code for connect_rpc, but it's not accessible in any other way.
<whitequark> maybe it should become a separate pass
<whitequark> tpw_rules: does that help?
<tpw_rules> oh i see, i thought it had to communicate with the rest of the synthesis chain or something. i'll try that out if it becomes a significant problem
<tpw_rules> i was out walking the doggers
<whitequark> np
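A minimal sketch of what "uniquifying" the submodule names involves, assuming the design is modelled as a plain dict that maps each module name to its list of (instance name, instantiated module) pairs. This is not the connect_rpc code discussed above; `prefix_modules`, the `mylib_` prefix and the `VENDOR_RAM` cell are hypothetical names used only for illustration. The point it shows is that both the definitions and every reference to them must be renamed together, while the top module and external cells stay untouched.

```python
def prefix_modules(design, top, prefix):
    """Return a copy of `design` with every module except `top` renamed."""
    def rename(name):
        # Rename only modules defined in this design; the top module and
        # external cells (not keys of `design`) keep their names.
        if name == top or name not in design:
            return name
        return prefix + name

    return {
        rename(module): [(inst, rename(kind)) for inst, kind in instances]
        for module, instances in design.items()
    }


# Hypothetical three-module design; VENDOR_RAM stands in for an external cell.
design = {
    "top": [("u_fifo", "fifo"), ("u_uart", "uart")],
    "uart": [("u_fifo", "fifo")],
    "fifo": [("u_mem", "VENDOR_RAM")],
}
renamed = prefix_modules(design, top="top", prefix="mylib_")
```

After this, `renamed` still has a module called `top`, but `fifo` and `uart` appear only under their prefixed names, so they can no longer collide with same-named modules in the host project.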
winocm has joined #nmigen
winocm has quit [Client Quit]
winocm has joined #nmigen
guan has joined #nmigen
Degi has quit [Ping timeout: 265 seconds]
Degi has joined #nmigen
zkms has joined #nmigen
<zkms> hi
<whitequark> hi!
<awygle> hello
<bubble_buster> Fun to see people use Twitter to coordinate their irc activity :D
Guest30583 has joined #nmigen
_whitelogger has joined #nmigen
chipmuenk has joined #nmigen
peepsalot has joined #nmigen
thinknok has joined #nmigen
<MadHacker> o/
<whitequark> awygle: so about the paper i just linked (which zkms discovered)
<whitequark> do you need an ILA specifically? or do you want introspectability in general?
<daveshah> Eddie has done some work on introspectability/debug too
<whitequark> and if the latter, can it be invasive? because we can implement adding scan chains in yosys
<whitequark> the main thing that's missing there is mapping of registers back to original wires
<daveshah> The big problem with that is can it run fast enough and how much complexity does it add
<daveshah> e.g. if you want to capture raw data coming out of a serdes at full rate then an ILA with a buffer is needed
<whitequark> sure
<whitequark> but for that application you probably aren't going to use microscope-like ILA
<daveshah> If you are on Xilinx you don't even need a scan chain
<daveshah> You can just use readback
<whitequark> well
<whitequark> that ties you to vendor tools hard
<daveshah> Not if someone implements that separately
<whitequark> and prevents you from using generic nmigen code that can map readback back to signals
<daveshah> Yes
<daveshah> fwiw, I have used litescope to look at DDR3 transactions before (just before the gearboxes)
<daveshah> So there are definitely use cases where something faster than a scan chain is needed
<MadHacker> Don't you end up getting dangerously close to something like the openbench logic sniffer if you're trying to extract that much info? You're going to need to buffer like crazy.
<MadHacker> + may as well have triggering conditions too at that point.
<whitequark> MadHacker: depends on what you're doing
<daveshah> A buffer of 32 cycles was enough for this case
<daveshah> Then read it out at your leisure
<whitequark> if you trigger once per second it's pretty easy
<whitequark> if you trigger at kHz you probably need an ILA
<whitequark> there is much value in using a number of approaches
<whitequark> for example scan chains don't really work for non fully static designs
<whitequark> hm
<whitequark> wait, no, i'm wrong
<whitequark> scan chains don't work for them if you reuse the register bits for the chain (or use readback?)
<whitequark> (not sure about readback, does xilinx have shadow registers?)
<MadHacker> Unless what you're scanning is just a buffer. Snapshot state from read signals into shift register, or even a stack of shift registers?
<daveshah> mwk: ^
<daveshah> Questions about xilinx readback
<whitequark> MadHacker: yeah you can definitely buffer the scan chain
<whitequark> and really nice thinking on using a multilevel shift register
pdp7 has quit [Ping timeout: 252 seconds]
<daveshah> That works well on Xilinx with hard shift registers (using LUTs as them)
<MadHacker> So, trigger (or system clock) clocks snapshot of state into shift reg and push of stack of shift regs, read out at your leisure?
<whitequark> yep
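The snapshot-and-shift idea above can be sketched as a behavioural model in plain Python. This is only an illustration under stated assumptions, not a hardware implementation: `SnapshotScanChain` and `decode` are hypothetical names, the history depth stands in for the length of a hard shift register, and the bit order (oldest snapshot first, LSB first) is an arbitrary choice.

```python
from collections import deque


class SnapshotScanChain:
    """Keeps the last `depth` snapshots of a `width`-bit state word."""

    def __init__(self, width, depth=32):
        self.width = width
        # A bounded deque drops the oldest snapshot once it is full.
        self.history = deque(maxlen=depth)

    def trigger(self, state):
        # On a trigger (or every clock), capture the current state.
        self.history.append(state & ((1 << self.width) - 1))

    def shift_out(self):
        """Yield the captured bits serially: oldest snapshot first, LSB first."""
        while self.history:
            word = self.history.popleft()
            for i in range(self.width):
                yield (word >> i) & 1


def decode(bits, width):
    """Host-side helper: regroup a serial bit stream into `width`-bit words."""
    bits = list(bits)
    return [
        sum(bit << i for i, bit in enumerate(bits[n:n + width]))
        for n in range(0, len(bits), width)
    ]


chain = SnapshotScanChain(width=4, depth=3)
for state in [1, 2, 3, 4, 5]:
    chain.trigger(state)
captured = decode(chain.shift_out(), width=4)
```

With a depth of 3 and five triggers, only the last three states survive, which is the "last n states" behaviour mentioned in the discussion.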
<whitequark> the more i think about this design the more i like it
<whitequark> on xilinx you could actually capture "last n states"
guan has quit [Ping timeout: 252 seconds]
<whitequark> last... 32?
<daveshah> It's mostly the extra size and routing that worries me
<daveshah> Yes
<daveshah> Or 16 if you use the smaller SRLs
<whitequark> that's quite a lot
<daveshah> Indeed
<whitequark> for routing, hm, you could cascade the registers in hard routing, right?
pdp7 has joined #nmigen
<whitequark> so a smart enough pnr could probably lay out the scan chain adjacent to actual registers
<daveshah> You can't load them then
<whitequark> ah
<whitequark> right ok, so needs to be tested. still it seems very promising to me
<daveshah> The other problem is that you really want to be doing the scan chain ordering in PnR
<daveshah> When you know how everything is laid out
<daveshah> Yes, definitely for many cases it seems like a good approach
<MadHacker> OK, but the chain is always going to affect PNR anyway, since you're going to tie up routing resources at a minimum.
<MadHacker> There's no point pretending you can just add it in afterwards.
guan has joined #nmigen
<whitequark> so what i'm thinking here is we need some sort of solution for mapping bits of registers at yosys input to bits of registers at pnr output
<whitequark> cxxrtl needs this too
<whitequark> this will also allow yosys to merge or remove registers
<daveshah> MadHacker: yeah but if you connect it up after placement then at least you avoid it having to go all over the place, you can reorder it into a neat line
<daveshah> Anyway, that is a nice extra at some point
<daveshah> Yeah, mapping register bits would be useful for anything readback based too
<MadHacker> OK, but again that's still going to affect the original placement. If a chunk of logic is in a region that's tight on long-range routing then suddenly it'll shift when you add in the extra wires. I get your general point that it'd be better to allow it to reorder arbitrarily, but sometimes you've just got to accept that it's a little invasive.
<daveshah> yes definitely
<whitequark> MadHacker: there's going to be at least a small impact
<whitequark> and hopefully a small impact
<whitequark> it's not the right solution for designs that push the FPGA to limits, but many don't
<whitequark> and it only really impacts you routing-wise, not critical path wise
<daveshah> It might result in a more spaced out placement but shouldn't be too big a timing impact
<whitequark> yep
<MadHacker> It's like any scope probe, there's no getting away from the fact that a fast probe is going to dump a 20k load onto your signal. The equivalent applies.
<whitequark> yep
<daveshah> It would also be possible to do a combinational simulation to recover all combinational signals too
<whitequark> yep.
<whitequark> have you seen the pdf i linked earlier?
<daveshah> Not fully yet no
<whitequark> they're using some sort of C++ backend, which i hope can be cxxrtl in our case as it already exists
<whitequark> well, ok, not quite as it exists, since it needs mapping too
<daveshah> Nice
<whitequark> here's something fun i had in mind for cxxrtl
<whitequark> using high -O levels removes the calculations for internal signals, right? that's the point
<whitequark> but you still want them in VCD
<whitequark> so i thought i'd generate "debug info" inspired by DWARF that contains all the elided calculations, expressed over the state bits
<whitequark> meaning you can get 100% visibility *and* 100% performance with no recompilation of the model
<whitequark> then you could just use the same thing but initialize it with a scan chain
<daveshah> Yep
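A rough sketch of the "recover elided signals from state bits" idea, in plain Python. It is not cxxrtl's actual debug information format; `recover_signals`, the signal names and the 8-bit adder are all hypothetical. The assumption is that each internal signal can be described as a pure function of the registers (and of previously recovered signals), listed in dependency order, so the same table works whether the state comes from a simulation or from a scan chain dump.

```python
def recover_signals(state, debug_info):
    """Recompute combinational signals from captured register values.

    `state` maps register names to values; `debug_info` is a list of
    (signal name, function) pairs in dependency order.
    """
    signals = dict(state)
    for name, compute in debug_info:
        signals[name] = compute(signals)
    return signals


# Hypothetical 8-bit adder whose internal wires were optimized away.
debug_info = [
    ("sum", lambda s: (s["a"] + s["b"]) & 0xFF),
    ("carry", lambda s: ((s["a"] + s["b"]) >> 8) & 1),
    ("is_zero", lambda s: int(s["sum"] == 0)),
]

state = {"a": 0xF0, "b": 0x10}  # e.g. values read back over a scan chain
signals = recover_signals(state, debug_info)
```

Only the registers `a` and `b` need to be captured; `sum`, `carry` and `is_zero` are reconstructed afterwards on the host.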
<daveshah> FWIW, this is some of Eddie's observability work
<daveshah> This uses partial reconfiguration to connect a subset of signals to a deeper BRAM based buffer but partially preroutes the signals
<daveshah> Interesting but a lot more arch and tool dependent
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
<whitequark> interesting!
<MadHacker> The "route everything interesting into a region" and then separately "build an analyser in the region" steps are nice for that reordering point you made earlier.
<MadHacker> Does nextpnr already let us exclude a chunk from use?
<daveshah> Sort of but it's not exposed in the ideal way for something like this
<awygle> dang things got interesting right after I left to go to sleep
<awygle> I've had that paper in my 'to read' pile for weeks, shoulda read it oops
<MadHacker> Damn timezones, why can't people on the Wrong Side of the planet be awake when I am?
<MadHacker> UTC+0 is the only valid timezone, right?
<awygle> Scan chain would be fine in this specific case definitely. Eventually I'd like to look at DDR2 transactions, so a true ILA might be necessary
<awygle> Agree that there's no reason to limit ourselves to one approach, in fact I strongly believe we shouldn't
<whitequark> MadHacker: i just stay awake when i need to talk to someone on the other side of planet
<whitequark> my sleep schedule isn't sun-synchronized so it's ok
<whitequark> awygle: so we can build these in parallel, maybe
<Sarayan> wq: The "sun-" part feels unnecessary ;-)
<whitequark> Sarayan: well it can be synchronized to US time occasionally
<whitequark> which is distinct from being synchronized to local time
<whitequark> awygle: i'd be happy if you took care of ILA and i could take a look at register mapping and scan chains
<whitequark> unless you have a burning desire to dig into some C++ code
<awygle> Insert some anime gif subtitled "no no no no no"
<awygle> I feel much more comfortable with an ILA style approach in terms of implementation anyway, feels like less to fill in before I can be useful.
<Sarayan> Fuck, I'm reading a project proposal draft I'm supposed to work on, and I don't understand it
<Sarayan> I'm not sure whether it's genius or bullshit, but I tend towards the latter
<Sarayan> The scientific vision of the AÇAÍ project is that applying, in a new interdisciplinary synergy, principles from diverse fields of computer science and administrative law can lead to discover a canonical core of essential Artificial Intelligence (AI) methods that is simultaneously (a) maximally parsimoniously versatile, (b) meta-circularly autonomic and (c) cost-effectively certifiable for practical use in critical economic sectors
<Sarayan> Not sure if serious
<awygle> So far my personal record for "not sure if genius or talking nonsense" is like four years, held by someone most of you probably follow on Twitter
<awygle> My jury is still out
<awygle> My point being it can be hard to tell
<Sarayan> To refine the sensation, I'm not sure if it's "makes sense in his head" or "makes sense in his research domain"
<whitequark> awygle: cool!
<whitequark> i find the C++ parts pretty easy to do, actually
<Sarayan> C++ can be easy
<awygle> It's not the c++ that I find intimidating, it's the rest of it
<awygle> As usual it's all about knowing what code to write lol
<Sarayan> whitequark, you who knows magic and python and C++
<Sarayan> I'm building a python module in C++, interfacing with a big program/library we've made. I'd like the install to be single-file (e.g. a .so), but I'd also like to have part of the interfacing to be actually written in python. Do you know how much hybridization can be done? I have no issue with embedding python source code in the .so
<awygle> Somebody debug my endocrine(?) system and figure out why falling asleep is such a chore lately... sigh. gonna go give it another shot, goodnight
<Sarayan> awygle: I have tricks for that, but I have no idea whether they'd work on you
<Sarayan> I have a feeling that the limit of hybridization is that classes can either be full-C++ or full-python, but outside that you can actually mix stuff
<whitequark> Sarayan: take a look at cython, perhaps
<whitequark> but i haven't done much with it
Asu has joined #nmigen
<mwk> daveshah: what do you want to know about xilinx readback?
<daveshah> the main question was whether there is a shadow register
<mwk> depends
<mwk> for a plain FF, yes
<mwk> for SRL/RAMs, no
<daveshah> That makes sense
<mwk> also you cannot look into DSPs
<mwk> so if you have a pipelined multiplier, forget about introspecting its state
<mwk> I don't quite recall the blockram output register rules, I think if you're using the pipelined version you're likewise screwed
<daveshah> Hmm, that's an advantage for soft scan chain insertion too (which is what most of the discussion was about), probing the DSP/BRAM output should always be fine
<mwk> and, of course, the whole readback-from-ff thing is completely gone on ultrascale
<mwk> well the output is easy
<mwk> the problem is internal pipeline stages
<mwk> I suppose you'd have to replicate them in an introspectable way somehow
<daveshah> Other than the input registers, the area cost of that seems quite high
<mwk> yes
<mwk> one possibility would be to just record inputs in SRLs and recover state in sw
<mwk> ... or not, clock enables mean that DSP can hold state arbitrarily long, ugh
<daveshah> Using SRLs was the plan
<daveshah> in general
<daveshah> Oh yeah CE is a pain
futarisIRCcloud has joined #nmigen
<Sarayan> wq: ok
thinknok has quit [Quit: Leaving]
thinknok has joined #nmigen
* zignig has an output pin with an LED on it.
<zignig> I would like to attach N elaboratables to the pin with a muxy thing.
<zignig> what is the best way to do break before make ?
<ZirconiumX> You don't need to
<ZirconiumX> If you have "a muxy thing", then just use it
<zignig> no , I don't need , I want to.
<zignig> the muxy thing is the issue , I can do a 2X with a Mux(switch,a,b) , but I would like an N way.
<hell__> mux the muxes?
<whitequark> Array, perhaps?
<MadHacker> whitequark: +1 for portrait mode LCD.
<MadHacker> (sorry, reading tweets on phone and it's easier to type here)
chipmuenk has quit [Quit: chipmuenk]
Asuu has joined #nmigen
Asu has quit [Ping timeout: 260 seconds]
<ZirconiumX> zignig: That actually seems like a useful thing, hmm
<ZirconiumX> Though I guess the correct tool for the job is probably switch/case
<zignig> ZirconiumX: not sure, but I think it has a multitude of applications , hence my question
<ZirconiumX> Sure, but generally switch/case *is* a multiplexer
<zignig> it is also applicable if you have a spi interface and you request a new device, declare a new CS pin and "muxy thing" between devices.
* zignig continues to battle argparse.
_whitelogger has joined #nmigen
<hell__> zignig: right, so if you have three SPI devices, you need a mux that can choose one of them at a time?
<hell__> I'd just chain two muxes
Asuu has quit [Read error: Connection reset by peer]
Asuu has joined #nmigen
thinknok has quit [Quit: Leaving]
chipmuenk has joined #nmigen
<zignig> hell__: I'm thinking so, just stack the muxes, I also think that ZirconiumX is right that a switch/case will elaborate to a mux stack anyway.
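The "stack the muxes" idea can be checked with a small behavioural model in plain Python. This is a sketch, not nMigen code: `mux2` only mirrors the `Mux(switch, a, b)` argument order used above (first value when the select is true), and `mux_chain` is a hypothetical helper. Treating the last input as the default for out-of-range indices is a choice made for this sketch.

```python
def mux2(sel, a, b):
    """Two-way mux: `a` when `sel` is true, otherwise `b`."""
    return a if sel else b


def mux_chain(index, inputs):
    """N-way select built only from chained two-way muxes."""
    # Start from the last input as the default, then wrap it in one
    # two-way mux per remaining input.
    result = inputs[-1]
    for i in reversed(range(len(inputs) - 1)):
        result = mux2(index == i, inputs[i], result)
    return result


inputs = ["led_a", "led_b", "led_c", "led_d"]
selected = [mux_chain(i, inputs) for i in range(len(inputs))]
```

For every in-range index the chain returns the same element as direct indexing, which is the behaviour a switch/case or an indexed array would be expected to give.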
<zignig> Sarayan: me for now. mWHAHAHAHA !
<Sarayan> hmmm what?
* hell__ panics and hides under dozens of mainboards
<cr1901_modern> >(8:57:02 AM) sarayan: who wins?
thinknok has joined #nmigen
<Sarayan> Oh :-)
<Sarayan> Forgot by now
<awygle> Morning
<daveshah> afternoon awygle
<ZirconiumX> Evening awygle
<awygle> Whelp I guess the day is over, back to bed...
<ktemkin> mood
chipmuenk has quit [Quit: chipmuenk]
cr1901_modern1 has joined #nmigen
cr1901_modern has quit [Ping timeout: 256 seconds]
cr1901_modern1 has quit [Quit: Leaving.]
cr1901_modern has joined #nmigen
Guest30583 has quit [Quit: Nettalk6 -]
chipmuenk has joined #nmigen
thinknok has quit [Read error: Connection reset by peer]
<chipmuenk> Hi,
<chipmuenk> I've started a little project on DSP using (n)migen at To keep me from doing real work, I updated the migen logo a little for the nmigen logo at
<ZirconiumX> I'm not a huge fan of the Migen logo to begin with >.>
chipmuenk has quit [Quit: chipmuenk]
* hell__ gets scared and runs away
Asuu has quit [Ping timeout: 260 seconds]
<_whitenotifier-c> [nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±1]
<_whitenotifier-c> [nmigen/nmigen] whitequark fbf9e1f - back.rtlil: handle signed and large Instance parameters correctly.
<_whitenotifier-c> [nmigen] whitequark closed issue #388: Integer parameters over 32 bits -
<_whitenotifier-c> [nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±2]
<_whitenotifier-c> [nmigen/nmigen] whitequark 404b2e0 - hdl.dsl: check for unique domain name.
<_whitenotifier-c> [nmigen] whitequark closed issue #385: Bad error message for duplicate ClockDomain -