ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at · logs at
<agg> is it possible to get an ice40 SB_IO in "PIN_OUTPUT_REGISTERED_ENABLE" mode (i.e., enable is *not* registered) without just instantiating the SB_IO myself and putting dir="-"?
<agg> as far as I can see from vendor/lattice_ice40 it's not an option, all dir=io xdr=1 will have registered enable
<zignig> I think I have worked it out: using main generate, rather than platform build, seems to work.
<zignig> at least it makes some .cpp now.
<zignig> thanks anyway agg. ;)
<agg> sorry, i was asking my own question :X
<zignig> agg: erk, sorry.
<zignig> agg: so you want a bidirectional pin that you can select input / output ?
<agg> my bus is contending because my OE is deasserted in good time but the extra flop means a half period of sadness
<agg> I have dir="io" xdr=1, which generates an SB_IO with output mode PIN_OUTPUT_REGISTERED_ENABLE_REGISTERED
<agg> registering the output is fine, but i don't want the enable registered
<agg> the hardware supports it, but I don't think nmigen has a way to emit it
<agg> maybe I shouldn't use xdr=1 at all for this bus... hmm....
<zignig> try without it. how are you swapping your output mode, sync or comb?
<agg> oe is driven comb, which is why it toggles so soon after the clock rising edge
<agg> but there's a flop inside the SB_IO instance which is enabled by the platform code when xdr=1, so the output isn't disabled for an extra clock period
<agg> I wanted xdr=1 because this bus is externally clocked and that way you can directly feed the input clock to the flop inside the SB_IO, ensuring it gets sampled right on the clock edge
<agg> (I guess ideally I want to specify input and output xdr differently :P)
<zignig> agg: is that even possible ?
<zignig> can you use xdr=2 and navigate the doubling in code ?
<agg> xdr=2 also registers the enable
<agg> it's possible in that if you set a parameter bit differently in the SB_IO instance it will do what I want; it's a supported mode in the datasheet; just I don't think nmigen has a way to do it with the built in platform
<_whitenotifier-f> [nmigen] shawnanastasio opened pull request #394: hdl.rec: don't save casted shapes in Layout constructor -
<zignig> agg: you could use an Instance and hard declare _that_ pin with the correct settings
<zignig> but that takes you back to your original question ....
<agg> yea, i'm happy to do that if need be, might just be an unusual use case
<agg> going to stare at my timing diagrams harder to see if i can tell that OE needs to die one cycle earlier than currently, which would also fix it
<zignig> agg: staring at timing diagrams to find ZEN huh ? ;P
<agg> in theory digital logic seems so simple: everything happens on the clock, time is discrete, everyone's happy
<agg> but it all breaks down so quickly
<zignig> and then physics gets in the way ! so rude.
<agg> so rude!
<agg> even just the logic of a d flop sampling on the clock and holding that sampled value for the rest of the clock period... but a second series d-flop would sample its old value, i.e. the clock-to-out time is always larger than the input hold time, so all the transitions on my diagram are really little deltas away from the clocks
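agg's observation above can be sketched with a toy, zero-delay discrete-time model (plain Python, all names hypothetical): because clock-to-out is always longer than input hold time, a second flop in series always samples the first flop's *pre-edge* value, so q2 lags q1 by exactly one cycle.

```python
# Toy model of two cascaded D flip-flops sampling on the same edge.
# Because clock-to-out always exceeds input hold time, flop 2 captures
# the value flop 1 was holding *before* this edge, never the new one.
def clock_edge(state, d):
    q1, q2 = state
    # Both flops sample simultaneously: flop 2 sees the old q1.
    return (d, q1)

state = (0, 0)
outs = []
for d in [1, 0, 1, 1]:
    state = clock_edge(state, d)
    outs.append(state)
# q2 trails q1 by one full cycle: [(1, 0), (0, 1), (1, 0), (1, 1)]
```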
<zignig> just set your clock speed to zero Hz, problem solved.
<_whitenotifier-f> [nmigen] codecov[bot] commented on pull request #394: hdl.rec: don't save casted shapes in Layout constructor -
<_whitenotifier-f> [nmigen] codecov[bot] edited a comment on pull request #394: hdl.rec: don't save casted shapes in Layout constructor -
<awygle> "PIN_OUTPUT_REGISTERED_ENABLE_REGISTERED" wow, best constant ever
<agg> they do it with "_INVERTED" at the end too
<agg> (though I don't think nmigen ever uses the output inverter)
<_whitenotifier-f> [nmigen] shawnanastasio commented on pull request #394: hdl.rec: don't save casted shapes in Layout constructor -
<agg> hmm, I have to be explicit about using a global input, right? just realised I've not set my bus clock to global in, nextpnr says the net is being promoted to use a gbuf but presumably it's not using the pin to drive it then
<_whitenotifier-f> [nmigen] shawnanastasio synchronize pull request #394: hdl.rec: don't save casted shapes in Layout constructor -
<_whitenotifier-f> [nmigen] codecov[bot] edited a comment on pull request #394: hdl.rec: don't save casted shapes in Layout constructor -
_whitelogger has joined #nmigen
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
Degi has quit [Ping timeout: 246 seconds]
Degi has joined #nmigen
_whitelogger has joined #nmigen
<_whitenotifier-f> [nmigen] whitequark reviewed pull request #394 commit -
<_whitenotifier-f> [nmigen] whitequark reviewed pull request #394 commit -
<whitequark> Degi: try `extref1.attrs["LOC"] = "EXTREF1"`
<whitequark> awygle: a value stashed in the platform yes
<whitequark> awygle: a probe is a value stashed in the platform but not only that; at least it should have an associated clock domain, possibly other things too
<whitequark> agg: no, doesn't provide registered output + unregistered enable, the same way it doesn't let you make e.g. output at xdr=1 and input at xdr=2 or something like that
<whitequark> and i think it's going to stay that way absent a very good reason why it shouldn't. it's geared towards source synchronous buses with dir="-" provided as a fallback for anything weird you might want to do
<whitequark> i'm open to considering reasons why it should be changed though
<whitequark> the problem with registered output, unregistered enable is that clock-to-out timings are well defined but clock-to-oe are just whatever
<whitequark> so i don't feel like nmigen should provide a shortcut for this problematic mode of operation
<whitequark> since it doesn't prevent you from doing it yourself
<whitequark> i also think not all of our platforms support it, though i might be wrong about that
<whitequark> but regardless, platform support is a secondary consideration here
<whitequark> agg: regarding global inputs... nmigen provides a fake "GLOBAL" attribute in the platform abstraction that causes it to use SB_GB_IO, since SB_GB_IO works very unlike SB_IO+SB_GB
<whitequark> other than that it doesn't do anything for you
<whitequark> awygle: regarding UserValue, that's a good point, yeah. can you file an issue on that so we can consider it?
<awygle> whitequark: sure
<awygle> It came up in the context of considering what Values didn't have DUIDs
<whitequark> awygle: i think i'm going to get the glasgow ELA done today to free you up for ILA work
<whitequark> ugh, DUIDs are a hack
<whitequark> please don't think too much about them
<awygle> Oh thanks, I already moved back to it but the ELA would still be hugely useful
<whitequark> they're just there to get consistent ordering and hashing between python runs
<whitequark> it's ... taking you a lot of time and i feel like i could make your life easier by just doing it
<awygle> lol
<awygle> delicately put :P
<awygle> i have spent exactly 2.5 hours on it
<awygle> but yeah i decided to move back to the ILA stuff already
<awygle> got a fair bit of playing around the design space done today all things considered, hope to have something to show tomorrow
<awygle> been trying to capture design decision points as i come across them, lay them out, and record why i picked what i did
<whitequark> cool!
<whitequark> my approach for this would be to bring in the smallest possible chunks of functionality into core but make sure it works well
<whitequark> not unlike what rust does
<whitequark> we can't do exactly like rust because rust has cargo and python has... sadness
<awygle> i am fully prepared for you to hate everything about what i've done but it should at least serve to center discussion :p
<awygle> lmao yes
<whitequark> but there's still a lot of value in cribbing their approach because nmigen tries to provide actual backwards compatibility
<awygle> yup understood. minimum core disruption.
<whitequark> so i think *eventually* it would be extremely valuable to provide a turnkey ILA shipped with nmigen but i think it's more important to have a common API in core that we won't regret
<whitequark> just my general approach
<awygle> totally agree
<awygle> that said i'm still prototyping the whole thing because i think it's important to see how the API would be used
<awygle> and also i uh... want an ILA i can use >_>
<awygle> so i'll have the core changes and then also an example of using them
<whitequark> yeah
<whitequark> that's a good approach
<awygle> mk well if i'm gonna have results tomorrow i better get to sleep
<awygle> night wq
<whitequark> night!
_whitelogger has joined #nmigen
thinknok has joined #nmigen
Guest30583 has joined #nmigen
Asu has joined #nmigen
_whitelogger has joined #nmigen
<zignig> whitequark: is there an easy(ish) way to get nmigen -> cxxrtl name mapping ?
<whitequark> not yet, i'm working on it right now
<zignig> I'm considering grepping and munging ::commit() for now.
<zignig> thanks anyway.
<zignig> oh, and carry on !
<whitequark> you can get the mapping from back.rtlil and join the hierarchy with dots
<whitequark> and then you can map it to C++ names by looking at the algorithm in the pass
<whitequark> but it's not ideal
Asu has quit [Remote host closed the connection]
Asu has joined #nmigen
Guest30583 has quit [Quit: Nettalk6 -]
<whitequark> agg: re nmigen not using the output inverter, the reason for that is it's only available in one single mode
<_whitenotifier-f> [nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±1]
<_whitenotifier-f> [nmigen/nmigen] whitequark afa4345 - vendor.lattice_ice40: reword confusing comment. NFC.
<agg> whitequark: thanks, that all makes sense. for my case I've swapped to xdr=0 and put input registers into my logic and left the output unregistered which works better for this anyway
<agg> Hadn't realised all this time I've had clocks coming into GBIN but not using the GB io and just having nextpnr promote it to a gbuf in logic, heh
<agg> Worked fine...
<whitequark> input registers in logic don't give you a defined phase relationship
<whitequark> this has bitten me badly, took months to figure out (and help of tnt)
<agg> whitequark: as in, each bit in the signal will be sampled at a different instant?
<agg> But presumably all will be strictly after the clock rising edge enters the device? But possibly any time before the subsequent rising edge?
<agg> The input registers in logic are still clocked from a gbuf that's driven by the external clock signal, is there also a concern that the internal clock signal might have significant delay compared to the external clock?
<agg> I guess if there's no defined phase relationship between the external clock and the internal flops clocked from a gbuf driven by it then my output registers are in trouble too, hmm
<whitequark> yes and yes
<whitequark> and yes re trouble
<whitequark> it will change with pnr seed
<whitequark> you might be able to use explicit LOC constraints to work around that
<agg> At the moment my output data comes from a bram and is then directly connected to output signals, so the bram registers drive it
<agg> Enabling the io output registers adds an annoying extra cycle of latency
<agg> How come the GBIO in/out regs are well defined but logic registers are not from the same gbuf? Is it because the signal routing is variable delay even though the clock is on the gbuf?
<whitequark> agg: routing latency
<whitequark> yes
<whitequark> each routing point in ice40 is a buffer
<agg> Right, makes sense
<agg> So the phase between external clock rising edge and my data output changing is the clock gbuf latency plus the bram clock to out plus the variable bram out to io routing?
<agg> In principle that total latency is constrained to less than the clock period plus the io-to-gbuf or I'd fail timing, right? So a sufficiently strict timing constraint would ensure the outputs made it in time?
<agg> Oh well, sounds like in practice I had better tell the stm32 to add one more clock of data latency between address and data phases, put xdr=1, and live with half a cycle of contention
<agg> At higher clocks the stm32 adds more cycles between transfers anyway so I think the contention will go away due to that
<whitequark> yes
<agg> Thanks for saving me loads of head scratching months down the line when the design suddenly starts failing :p
<whitequark> glad to help
thinknok has quit [Remote host closed the connection]
thinknok has joined #nmigen
<kbeckmann> Is there an elegant way in nMigen when doing formal assertions with Past() to ensure they only are done when the past data is valid? Currently I have a counter and do something like 'with m.If(count > 5): m.d.comb += Assert(signal1 == Past(signal2, clocks=5))'
<whitequark> define valid?
<ZirconiumX> As in "has data", presumably
<whitequark> right I guessed that, I'd like to have kbeckmann's confirmation
<kbeckmann> yes that's right
<whitequark> what does SVA do here?
<kbeckmann> if i skip the if case, we're trying to look into the past that hasn't happened yet so it will be incorrect.
<kbeckmann> oh i don't know..
<whitequark> so to discuss the way to go here we first should take a look at the way Past() gets lowered
<whitequark> which is to say, it becomes a wide shift register essentially
<kbeckmann> ah, i see
<kbeckmann> and it gets initialized with zeroes?
<whitequark> yes
<whitequark> for BMC this is trivially ok, for induction not quite as simple
<whitequark> i think we can do better here though
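The lowering whitequark describes can be modelled in plain Python (a hypothetical sketch, not nmigen's actual implementation): `Past(sig, clocks=n)` behaves like an n-deep shift register initialized to zeros, which is why the first n cycles read back 0 rather than valid past data — exactly the window kbeckmann's counter guard excludes.

```python
from collections import deque

# Toy model of how Past(sig, clocks=n) lowers: an n-deep shift
# register, zero-initialized, so the first n reads return 0.
class PastModel:
    def __init__(self, clocks):
        self.sr = deque([0] * clocks, maxlen=clocks)

    def tick(self, current):
        past = self.sr[0]        # value from `clocks` cycles ago
        self.sr.append(current)  # shift the current sample in
        return past

p = PastModel(clocks=2)
reads = [p.tick(v) for v in [3, 1, 4, 1, 5]]
# First two reads are just the zero initialization: [0, 0, 3, 1, 4]
```

This is why the `with m.If(count > 5)` guard matters: for BMC the zero initialization is trivially fine, but for induction the engine may start mid-trace, where those zeros are unconstrained.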
Marc93 has quit [Ping timeout: 245 seconds]
<lkcl_> kbeckmann: use Initial()
<lkcl_> if you do not use Initial(), random data is inserted into the signals
<whitequark> yeah, that's the part where we could do better
<whitequark> nmigen can generate formal statements, but it's not really nicely integrated with the engine
<whitequark> and it should be
<lkcl_> whitequark: we seem to be getting on fine :)
<lkcl_> we do however critically rely on this: from nmigen.test.utils import FHDLTestCase
<whitequark> nmigen.test is a private module
<whitequark> you should copy FHDLTestCase into your own code
<lkcl_> whitequark: then we will need to take a copy, which is... redundant and... yeah.
<whitequark> it's not redundant
<lkcl_> we seem to be taking copies of quite a lot of nmigen functionality
<whitequark> FHDLTestCase isn't nmigen functionality
<whitequark> it's a helper class I wrote in an hour without much thought as to its design or future evolution
<whitequark> it's only part of the package because of an oversight, actually
<lkcl_> and it's extremely useful and valuable
<whitequark> it has several major issues, such as "me not bothering to read unittest docs"
<lkcl_> i don't think i've read them either :)
<lkcl_> i learned by a process of cut/paste and osmosis
<whitequark> and "hardcoding a specific SMT engine"
<whitequark> well that seems like a problem to me too
<whitequark> it doesn't help that nmigen doesn't yet have much in the way of docs (the WIP branch notwithstanding)
<lkcl_> after 20 years of python-unit-test osmosis you kinda learn the ropes :0
<whitequark> that doesn't really work in my experience
<lkcl_> fortunately our team is mostly autodidacts, used to examining the code
<whitequark> yeah, that also doesn't work
<lkcl_> everyone's different
<whitequark> if you look at the code, you see one particular implementation, but that doesn't tell you about the API contract
<lkcl_> without so many auto-learners on the team, there'd be absolutely no way we could write 75,000 lines of nmigen code
<whitequark> so when the implementation changes? if you rely on details that weren't a part of the contract, your code breaks
<whitequark> so, sure, read the code to understand how it works, and then read the docs to understand what you can actually rely upon
<lkcl_> yyeah that was quite irritating to have the shift behaviour change without warning. 10,000 lines of extremely complex IEEE754 FPU code that took 6 months to write suddenly "stopped working"
<whitequark> it's like when people "learn" C by trying different things and seeing what happens. they don't end up knowing how C works, they end up knowing how their specific version of a specific C compiler happens to work
<lkcl_> i found the discussion retrospectively
<lkcl_> i did that :)
<whitequark> then they upgrade their compiler and complain about it "breaking" their code, when in fact it was broken in first place and the upgrade just made that visible
<lkcl_> but i was lucky that my first major c project was samba, made extensive use of autoconf, and worked on pretty much absolutely every broken variant of any broken c compiler.
<lkcl_> except sunos 4.1.3's cc which was just... far too brain-dead.
<whitequark> now C isn't a very good example because the language makes it really hard to write robust code
<lkcl_> it's evolved considerably
<whitequark> regarding the shift change, I believe that was a bug
<whitequark> in both nmigen and yosys simultaneously
<lkcl_> interesting.
<whitequark> were you referring to ?
<lkcl_> ok i'm now not annoyed, i'm grateful.
<whitequark> I don't recall any other shift change, at least
<lkcl_> yes that was the one. we got away with it for some reason
<whitequark> right so this was always prohibited in RTLIL
<whitequark> but... not checked or documented
<whitequark> it was just silently broken if you gave it as input to yosys
<whitequark> same for nmigen
<lkcl_> lovely. and appreciated that it was spotted.
<whitequark> i had to fix yosys' validator, yosys' manual, then nmigen itself to make sure you can't end up with undefined behavior
<whitequark> there will likely be more instances of this
<lkcl_> painstaking and necessary.
<whitequark> as with any other massively complex software project, really
<lkcl_> hmmm.... would formal correctness proofs actually have caught that?
<lkcl_> argh.
<whitequark> i try to systematically eliminate these issues when i see them, and i did in a few other cases, but it's very time-consuming
<lkcl_> yes, not surprising
<lkcl_> i wonder...
<whitequark> you can look at my commits to yosys manual if you want to see the other cases
<whitequark> there were *quite* a few ways where RTLIL's contract was totally different from what you'd expect by investigating yosys' output
<lkcl_> we were looking to use coriolis2 gate-level simulation capabilities.
<lkcl_> appreciated the heads-up
<whitequark> so... yeah, i learned how RTLIL works by examining yosys' implementation, and i was often very wrong in the end
<whitequark> this is a serious hazard with self-learning in that way
<whitequark> you don't know the *intent*
<whitequark> but the maintainers of the project follow it whether you know it or not
<whitequark> regarding formal proofs: sometimes, sometimes not
<lkcl_> i wonder if we could do a comparison of the gate-level simulation against the RTL simulations
<whitequark> for example, if you used a shift by negative amount in your proof (because you used nmigen for that), then it'd be undefined at the same time
<lkcl_> actually use the exact same unit tests, pre- and post- RTL
<whitequark> depending on the proof approach, it might be that the proof engine will throw out the cases where the spec is doing something weird, assuming they're unreachable
<whitequark> i don't know the exact details
<lkcl_> yehyeh, we try to rewrite the formal proofs using a different algorithm
<whitequark> in my view, formal verification really shines when you apply the same spec to many different implementations
<lkcl_> appreciated. so it's not a perfect approach. noted
<whitequark> if you co-evolve a spec and an implementation in lockstep you risk introducing identical bugs
<whitequark> it's still quite useful, just needs to be handled with care
<lkcl_> yehhh, we're trying to encourage other teams to engage with the POWER9 formal proofs we're doing
<whitequark> sounds good
<lkcl_> but, like symbioticeda did, a common interface (RVF) is really needed
<whitequark> yeah
<lkcl_> it's really cool work, btw :)
<lkcl_> we're up to 75,000 lines of nmigen code (!)
<lkcl_> which is pretty mental.
<whitequark> cool
<lkcl_> anyway: kbeckmann: use Initial() :)
<lkcl_> with m.If(Initial()):
<whitequark> this gets complicated if you use induction
<whitequark> well, more complicated than with BMC
<lkcl_> comb += Assume(ResetSignal() == 1)
<lkcl_> i did formal induction... errrrr.... 30 years ago. i barely remember it
<whitequark> i'm talking about symbiyosys induction specifically
* lkcl_ just re-familiarising myself with the concept of induction (in general)
<lkcl_> this is the "prove" mode, right?
<whitequark> yeah
<lkcl_> ok. going to take a while to read that
<_whitenotifier-f> [nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±1]
<_whitenotifier-f> [nmigen/nmigen] whitequark 9c80c32 - setup: exclude tests.
<whitequark> lkcl_: btw, which other parts of nmigen were you duplicating?
<whitequark> I recall Record-related stuff (there's a plan to address that in a nice way), anything else?
<lkcl_> whitequark: yes, Record. we created RecordObject (and use it extensively)
<lkcl_> Minerva also created something equivalent to RecordObject
<lkcl_> ah. flatten. i know it's silly: a 2-line function.
<whitequark> in master, Record is a subclass of UserValue, which means it's not really special
<lkcl_> ahh ok.
<whitequark> the problem is that the existing design for UserValue (and by extension Record, both in master and in 0.2) is seriously flawed in some aspects, so there would have to be a few changes
<whitequark> but after that, it should be possible to safely define e.g. Record-like objects in downstream code that will be robust against changes in upstream nmigen
<lkcl_> btw we're now also critically dependent on Record.connect
<whitequark> yeah, so
<whitequark> Record.connect in its current form will be gone and replaced with something fairly similar but of a much more sensible design, which I'm borrowing from FIRRTL
<lkcl_> will it have the "direction" capability? (in some form)
<whitequark> which I think will address many complaints, including some of yours
<whitequark> yes
<lkcl_> ah whew.
<whitequark> that's really the primary problem it will be solving
<whitequark> FIRRTL does it in a very elegant way, which I could not come up with myself
<lkcl_> Chisel3 has direction built-in to absolutely everything. it's fundamental.
<whitequark> sort of, you have passive values in it
<whitequark> in FIRRTL at least
<whitequark> but for the most part that's correct
<ZirconiumX> But Yosys infers direction from usage, right?
<lkcl_> nice. i am all in favour of not doing design work that someone else already came up with :)
<whitequark> it's what oMigen's Record was trying (and failing) to be, and what the people were complaining about in nMigen
<lkcl_> yosys infers from usage? i honestly don't know
<whitequark> I didn't recognize the complaints though until I read the FIRRTL spec and then it clearly was a superior design
<whitequark> ZirconiumX: direction for connecting
<ZirconiumX> Ah, okay
<lkcl_> Chisel3 has a special function (pair of functions) which allow you to "turn round" a signal
<whitequark> ZirconiumX: the major reason we need it is actually related to your proposal
<whitequark> for typed ports in nmigen
<ZirconiumX> Excellent
<whitequark> imagine a module with a wishbone input and output
<whitequark> unless there are interior directions in records, you can't express that without "exploding" the wishbone buses
<lkcl_> [for when current conversation thread is done] returning to Record / UserValue: the one thing we need to be able to do is to iterate through all the *values* in it (recursively) - not the bits
<whitequark> but you'd want to do something like `i : Input[wishbone.Interface]; o : Output[wishbone.Interface]` or whatever
<lkcl_> yeah we have some massive records that we need to assign, and some of the fields are different directions
<whitequark> lkcl_: once the design work for robust UserValue is done, you should just grab the current Record impl as-is and rely on it
<whitequark> since you already have a custom record
<whitequark> then, once the design work for better connect() is done, you can migrate to that
<lkcl_> it would be an utter chaotic mess if we had to walk through them
<lkcl_> ok. appreciated.
<lkcl_> it's... well... quite a lot of work to do back-porting.
<whitequark> yes, which is actually one reason I focus on UserValue first
<lkcl_> we created a mirror-image of the python "operator" module, called "nmoperator".
<lkcl_> it contains nmoperator.eq, and a few others
<Sarayan> never mind operator? ;-)
<whitequark> this means you can upgrade to newer nmigen, even a version that changes the way Record works, without having to migrate all your code to the new design
<whitequark> so that you can e.g. benefit from cxxsim
<lkcl_> and it assumes that each argument is iterable by its *values* (recursively)
<lkcl_> Sarayan, :)
<whitequark> the existing design where Record inherits from Value cannot allow iterating by values
<lkcl_> whitequark, okaaay. appreciated about cxxsim
<whitequark> but the good news is that I'm dropping that
<whitequark> UserValue won't inherit from Value anymore and so it will in fact be possible to iterate a record's fields
<lkcl_> whitequark: that's why we over-rode __iter__ in RecordObject.
<whitequark> yeah, you violate LSP there
<whitequark> but the bigger insight than "don't do that" is that UserValue/Record should have never been subject to LSP in first place
<Sarayan> Least Surprise Principle?
<whitequark> Liskov substitution principle
<lkcl_> we can then do python-style OO inheritance, to any depth.
<lkcl_> that's why i wrote nmoperator.
<whitequark> lkcl_: the problem was that you've indirectly pointed out a flaw in nmigen, which is that UserValue/Record unnecessarily derived from Value
<lkcl_> interesting
<whitequark> but the way you did it is by insisting on violating LSP, which is a pretty bad idea
<whitequark> the good idea was to change the design so that violating LSP is no longer necessary to achieve the same result
<Sarayan> Ok, didn't know the name, but yeah, that's rather fundamental in OO
<whitequark> this is why I insist on examining use cases for every proposed language change
<lkcl_> whitequark: amazingly, overriding __iter__ doesn't break anything. i checked.
<ZirconiumX> I suspect this is "doesn't *presently* break anything"
<whitequark> more often than not it turns out that a proposal for change correctly indicates the presence of a flaw, but not the specific place where the flaw is
<whitequark> such as here
<whitequark> lkcl_: remember our conversation from 20 min ago?
<whitequark> it doesn't break the *code*, but it breaks the *contract*
<lkcl_> what happens is that by returning the sequence of values, the values themselves are iterable, and consequently at the *next* level they get converted to bits.
<lkcl_> whitequark, yehyeh :)
<whitequark> so suppose nmigen starts to assume somewhere that it can safely iterate anything that is isinstance(x, Value)
<whitequark> (there is no reason I don't do this other than trying to avoid unreadable code)
<whitequark> then your implementation will break
<whitequark> if a RecordObject is-a Value, then anything that Value does, it should also do, including the way a Value iterates
<lkcl_> indeed.
<whitequark> now it's not like I'm going to intentionally break your LSP-violating code, I'm just explaining why violating contracts is bad
<whitequark> just like in real life
<whitequark> you can do it, sure! but then you don't get to complain when the other party doesn't uphold its end either
<Sarayan> Oh wq, since you're in a design mood, I have an interesting syntax(?) problem
<whitequark> sure
<whitequark> (i'm not but let's hear you out)
<whitequark> (i'm fighting with yosys)
<Sarayan> I'm reimplementing a fully synchronous chip in nmigen, with a phase signal in input
<Sarayan> which means it's almost entirely in a with m.If(self.i_phase): m.d.sync += lots of stuff
<Sarayan> but I have a number of comb to keep things readable, or because some combs are shared between multiple sync
<whitequark> are there multiple phases?
<Sarayan> if I put the comb in the m.If(), the signals are going to be zeroed when the phase signal is 0
<Sarayan> no, only one
<Sarayan> it's going to work, but zero outside is probably more expensive, slower in sim, and annoying in debug
<whitequark> ditch m.If and use EnableInserter(i_phase) in toplevel
<whitequark> that will do exactly what you want, I think
<whitequark> i.e. comb always runs, sync only changes when i_phase is high
<whitequark> always changes*
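The semantics whitequark describes can be sketched in plain Python (a toy single-register model, names hypothetical, not nmigen code): combinational outputs are always recomputed, while the sync register only updates on edges where the enable (i_phase) is high.

```python
# Toy model of EnableInserter semantics for one register:
# comb logic always runs; sync state only changes when enabled.
def step(q, d, phase):
    comb_out = d ^ 1             # example comb logic: always recomputed
    q_next = d if phase else q   # sync: hold unless enable is high
    return q_next, comb_out

q = 0
trace = []
for d, phase in [(1, 0), (1, 1), (0, 0), (0, 1)]:
    q, comb = step(q, d, phase)
    trace.append(q)
# q only tracks d on enabled cycles: [0, 1, 1, 0]
```

Contrast with the `with m.If(self.i_phase): m.d.comb += ...` version, where the comb signals would be zeroed whenever the phase signal is low.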
<Sarayan> ok, I was wrong, there are kinda multiple phases
<Sarayan> specifically, config register write is async w.r.t the phase signal
<Sarayan> so yeah you could say there are two phases, i_phase and i_cs, sorry
<lkcl_> Sarayan: are you trying to write a pipeline?
<whitequark> okay, then use something like
<lkcl_> where combinatorial blocks are basically passed along, based on some "signal"?
<lkcl_> oo diagraaams :)
<lkcl_> ooo gaate level diagraaaams, i love those
<Sarayan> extracted from die "shot" by furrtek
<whitequark> DomainRenamer({"phase":"sync", "cs":"sync"})(EnableInserter({"phase":i_phase, "cs":i_cs})(your_stuff))
<whitequark> in toplevel
<whitequark> and then put your logic into two domains m.d.phase and m.d.cs
<whitequark> (adjust names as wanted)
<Sarayan> that's sounds interesting
<ZirconiumX> That's...really elegant, wow
<whitequark> (could do e.g. DomainRenamer({"cs":"sync"})(EnableInserter({"sync":i_phase, "cs":i_cs})(your_stuff)) )
<Sarayan> where is toplevel?
<ZirconiumX> As in, your top-level module
<whitequark> do you have a module with no sync logic that includes all other modules and connects them?
<whitequark> that's toplevel
<whitequark> if not, add one
<Sarayan> hmmmm
<whitequark> actually
<whitequark> you could even do something like
<whitequark> m = Module()
<whitequark> return DomainRenamer({"cs":"sync"})(EnableInserter({"sync":i_phase, "cs":i_cs})(m))
<whitequark> ...
<whitequark> at the root of your module hierarchy
<whitequark> though it's a bit awkward
<whitequark> ZirconiumX: yep! there's no EnableSignal() yet, which would complete the picture
<Sarayan> ok..... I'm trying to understand all that correctly
<lkcl_> whitequark: this sounds extremely powerful and compact. are there examples anywhere?
<whitequark> lkcl_: not yet but it will be definitely featured in the docs
<Sarayan> why can't the magic be inside the 53251 and keep the interface outside self-contained?
<lkcl_> whitequark: ok.
<lkcl_> btw, suggestion (code-style)
<whitequark> Sarayan: i need more context to answer that
<Sarayan> the 53251 is but one chip in the arcade boards
<lkcl_> if not isinstance(newfragment) or not new_fragment.type in ...:
<whitequark> ohh
<lkcl_> return new_fragment
<Sarayan> there's going to be others, plus cpus, plus other stuff
<lkcl_> then the same trick at the next if-statement
<Sarayan> they work on varioua phase signals
<whitequark> Sarayan: yeah i wasn't clear, sure
<whitequark> *sorry
<lkcl_> for multi-nested if statements it dramatically reduces indentation.
<Sarayan> and the top level pretty much links them together and implements the TTL that's between them
<whitequark> it doesn't actually have to be the toplevel module of the entire design. in fact there's no actual requirement for it to relate to the hierarchy in any particular way
<Sarayan> so subchip (module) has one or more clock signals that are phase signals
<ZirconiumX> It's possible for it to be "the top level of your 53251" instead of "the top level of your board"
<lkcl_> (wq: will email you a diff-patch)
<Sarayan> ok, that sounds much better
<whitequark> Sarayan: what ZirconiumX said
<whitequark> lkcl_: code style is subjective and i'm not going to change it to suit anyone's particular preference
<whitequark> i'm not telling you how to format your code either
<lkcl_> whitequark: hmmm that seems like an overly harsh response, based on a misunderstanding
<whitequark> sorry, wasn't intended to be that
<Sarayan> so either I put the magic on a submodule that's the real working part of the chip, or I massage the module with it just before returning it?
<lkcl_> no problem, no offense taken
<ZirconiumX> Pretty much, Sarayan
<whitequark> yup
<Sarayan> okay, very nice
<ZirconiumX> But I mean, you'd want a single chip module to instantiate anyway, so you can put the magic in there
<Sarayan> So DomainRenamer copies the sync domain into two other, and EnableInserter plugs a phase signal into a domain
<whitequark> in the inverse order, yes
<whitequark> so first you add one enable per domain
<whitequark> and *then* you turn the two domains into one
<Sarayan> ohhhhhh
<whitequark> this might make it more clear
<whitequark> m = EnableInserter({"sync":i_phase, "cs":i_cs})(m)
<whitequark> m = DomainRenamer({"cs":"sync"})(m)
<whitequark> return m
<Sarayan> so you have to declare the domains in the constructor, right?
<whitequark> can you elaborate?
<Sarayan> If you go and use m.d.cs += xxx at the beginning of the module, is it going to work, or do you have to say that cs exists first?
<whitequark> former
<whitequark> the "cs" domain here is a sort of fake
<whitequark> it just exists to group logic
<Sarayan> automagically added?
<whitequark> no, late bound
<whitequark> like all others
<whitequark> think of it as an argument
<whitequark> so your k53251 is like a "function" which takes cs and sync as "arguments"
* lkcl_ celebrates. found and fixed a bug in a ridiculously-complex FSM
<whitequark> and until you "call" it (add as submodule) the actual domains that will be referred to by m.d.cs and m.d.sync aren't known
<whitequark> but once you do, they can just be the same thing
<whitequark> DomainRenamer takes a "function" that takes cs and sync as "arguments" and returns a "function" that takes just sync as an "argument" and uses it for cs too
<whitequark> does this make sense?
<Sarayan> Oh, it was more about the declaration-before-use aspect with the +=
<whitequark> yes
<whitequark> you can use any domain you want with m.d.<domain> +=
<Sarayan> You don't have to say first that you're going to then
<whitequark> yep
<Sarayan> that works, you can always check at the end if they're bound, so typos are found
<whitequark> yep
<Sarayan> That looks very, very elegant
<lkcl_> so... does DomainRenamer "remap" anything that you do "m.d.sync" to "m.d.{somethingelse}"?
<whitequark> that's what it does, yeah
<whitequark> the exact details at the moment are a little bit odd around edge cases, in particular because we don't have EnableSignal
<whitequark> that's going to improve soon-ish
<Sarayan> What will EnableSignal do?
<lkcl_> and... EnableInserter is... the industry-standard term is, i believe, a "clock gater"?
<whitequark> Sarayan: EnableSignal : EnableInserter :: ResetSignal : ResetInserter
<whitequark> lkcl_: not exactly
<Sarayan> except I don't grok ResetSignal yet either
<lkcl_> i.e. when the "EnableSignal" is true, everything... ok
<whitequark> EnableInserter turns DFFs into DFFEs
<lkcl_> ok, so when EnableSignal is... oooo :)
<whitequark> this is only a logical view
<lkcl_> DFFE Primitive
<lkcl_> The DFFE primitive allows you to specify a D-type flipflop with clock enable.
<whitequark> e.g. you can use it many times and there are no DFFs with more than one E input
<whitequark> but if you think about your design in terms of idealized flip-flops, then EnableInserter adds a fresh CE input and wires it to whatever you specified
<lkcl_> iiinteresting. and the output is frozen, even combinatorially?
<whitequark> EnableInserter and ResetInserter only affect sync logic
<whitequark> since they operate on clock domains
<lkcl_> whitequark: ahh ok. we have a situation where we need a "latch bypass"
<lkcl_> it is a hybrid of a DFF and a combinatorial bypass.
<whitequark> nmigen has no direct support for latches, very deliberately so
<whitequark> you can instantiate a vendor primitive, of course, but they're not a part of HDL
<Sarayan> wq: annoyingly, NMOS is latches everywhere. At least cell-based CMOS is FFs
<whitequark> Sarayan: doesn't apply to you
<lkcl_> yeh we wrote a function which creates one.
<whitequark> since you can transform latches into clocked logic, and in fact you will have to do it
<whitequark> if you want stuff to run fast on cxxsim or fpgas
<whitequark> but lkcl_ is working on an ASIC which does have latches as actual primitives
<Sarayan> cute
<whitequark> a DFF is two latches back to back, usually
<Sarayan> but when I convert something like the via6522, it's latches everywhere, with the effect that multiple latches in a chain on the same clock signal all change on the same clock
<Sarayan> which I can't convert to multiple sync, that would add a delay that doesn't exist
<lkcl_> Sarayan: don't ask :) we have a number of 2D matrices which in the full multi-issue version will result in over a quarter of a million gates if we use DFFs. blech :)
<whitequark> well, let's look at it this way
<whitequark> if you don't convert them, then you won't ever simulate that on an FPGA
<ZirconiumX> I don't think Yosys handles latches very well either, since they're logic loops, right?
<whitequark> if you're fine with that... you can do Instance("$dlatch") and cxxrtl will simulate it fine
<whitequark> ZirconiumX: yosys allows certain logic loops
<whitequark> (namely, ones that it can transform into not-logic-loops, breaking the loop with a $dlatch cell)
<whitequark> (which happens as a part of the proc_dlatch pass)
<lkcl_> ZirconiumX: no "modern" proprietary tools handle them. if you try to use them, you're into $100 million custom ASIC territory.
<Sarayan> lkcl: how many gates with latches instead of dff?
<whitequark> interestingly, latches are better for timing... if your timing analyzer can handle them
<lkcl_> Sarayan: an SR NOR (or SR NAND) is 2 gates.
<lkcl_> that's it.
<lkcl_> a DFF is around 10.
<whitequark> just to be clear, above i have used "latch" as a shorthand for "D-latch"
<whitequark> SR-latches are a totally different beast and are way worse for timing analysis
<lkcl_> we're doing a multi-issue Out-of-Order execution engine, which requires Dependency Matrices.
<lkcl_> whitequark: ah, appreciated
<lkcl_> the DMs are N(registers) x N(ReservationStations)
* ZirconiumX is not really up-to-date on GPU architecture
<ZirconiumX> The GPU I'm most familiar with is the PlayStation 2's Graphics Synthesiser. Which is fixed-function.
<lkcl_> so in the GPU version of our processor, that's a whopping 128 x 30 matrix.
<lkcl_> ZirconiumX: we're kinda cheating, by using a multi-issue out-of-order execution engine, and a hardware "for-loop" which pauses the Program Counter and shoves a shed-load of "element" operations into the multi-issue engine
<lkcl_> and lets the Dependency Matrices sort it out
<lkcl_> we are *not* doing a SIMD ISA... but it's SIMD at the back-end though.
<Sarayan> when you're at the hardware level, you better cheat as much as you can, else what's the point :-)
<lkcl_> things have moved on in the 3D industry (from fixed-function), although if you study GPLGPU closely (it implemented Plan9), as Jeff Bush did, you find that, architecturally, fixed-function and shader GPUs have a lot in common
<lkcl_> Sarayan: not too much though :)
<Sarayan> ooo and predictions are cheating, in a way
<ZirconiumX> So, are you relying on aggressive IPC to make up for die area that might be used for a couple of simpler in-order cores?
<Sarayan> Don't ever have a quartus.ini file with "PDB_ASCII_DUMP=on" in the current directory when running quartus commands btw, I'm sure it's Bad
<ZirconiumX> Oh, yeah, terrible
<Sarayan> just so you don't do that by mistake
<sorear> so you're basically doing what Itanium did (no SIMD, hardware designed for wide issue with all2all bypassing, SIMD instructions in (x86 compat mode) input are broken up into operations)
<sorear> this didn't work so well last time, but I'm interested to see a fair benchmark comparison when it's ready and when a real RVV impl is ready
<ZirconiumX> sorear: or more aptly what Terascale did, right?
<kbeckmann> lkcl_: thanks for the Initial() tip, completely missed that. Your method works great when only one cycle history is required, but I need 5 so I created a signal that i set using Fell(Initial(), clocks=5). Seems to work fine.
<whitequark> kbeckmann: ah, sorry i wasn't clear enough, i assumed you know about Initial() and were suggesting that nmigen do this Fell() trick
<kbeckmann> ah, yeah that would be neat indeed. still learning all of this :). to be honest, moving from a manual counter to Fell() feels a lot less like a hack, so I'm quite happy with this solution.
<whitequark> are you using induction yet?
<kbeckmann> i have tried it with symbiyosys some time ago, but haven't with nMigen. it seems quite powerful especially when you want to test something that normally would require a lot of iterations to get to a bad state.
<whitequark> right, so, i'm not sure if the Fell trick would work with induction
<kbeckmann> that's right. just tried it and it fails during step 0
<whitequark> i think you'll have to add assumptions that aren't entirely intuitive
<daveshah> step 0?
<daveshah> Failures in the induction part are invariably at step K, aiui
<kbeckmann> step 0: immediately, not even one clock period is shown in the trace
<daveshah> hmm
<kbeckmann> oh right!
<kbeckmann> i have to adjust for that
<kbeckmann> i take it back! it works. just tried it: i added a bug that appears after 32k iterations in my code, and running with "mode prove" found it.
<kbeckmann> forgot how powerful this was.
<whitequark> awygle: ohhh I remember now why you can't use arbitrary Values in Past() etc
<whitequark> because of the reset values
<lkcl_> kbeckmann, cooool