ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen · IRC meetings each Monday at 1800 UTC · next meeting August 17th
Degi has quit [Ping timeout: 256 seconds]
Degi has joined #nmigen
emeb_mac has quit [Ping timeout: 240 seconds]
emeb has quit [Ping timeout: 256 seconds]
emeb has joined #nmigen
emeb_mac has joined #nmigen
_whitelogger has joined #nmigen
levi has quit [Ping timeout: 246 seconds]
mithro has quit [Ping timeout: 244 seconds]
gravejac has quit [Ping timeout: 244 seconds]
guan has quit [Ping timeout: 272 seconds]
jaseg has quit [Ping timeout: 246 seconds]
mithro has joined #nmigen
jaseg has joined #nmigen
mithro has quit [Excess Flood]
rohitksingh has quit [Ping timeout: 244 seconds]
mithro has joined #nmigen
bubble_buster has quit [Ping timeout: 256 seconds]
rohitksingh has joined #nmigen
mithro has quit [Excess Flood]
bubble_buster has joined #nmigen
electronic_eel_ has joined #nmigen
electronic_eel has quit [Ping timeout: 256 seconds]
rohitksingh has quit [Ping timeout: 244 seconds]
bubble_buster has quit [Ping timeout: 244 seconds]
rohitksingh has joined #nmigen
bubble_buster has joined #nmigen
mithro has joined #nmigen
guan has joined #nmigen
levi has joined #nmigen
gravejac has joined #nmigen
emeb has quit [Quit: Leaving.]
PyroPeter_ has joined #nmigen
PyroPeter has quit [Ping timeout: 264 seconds]
PyroPeter_ is now known as PyroPeter
proteus-guy has joined #nmigen
_whitelogger has joined #nmigen
proteusguy has joined #nmigen
<proteusguy> How does nmigen differ from myhdl and is there a reason why I would use one over the other?
<d1b2> <TiltMeSenpai> hmm I only know what I literally just looked up in myhdl, but it looks like myhdl is more verilog-like
<d1b2> <edbordin> afaik myhdl tries to reuse python syntax as a HDL (presumably via some "magic" behind the scenes). it sounds like it can be hard to reason about it, especially if you want to mix things like "an if statement that represents combinatorial logic" vs "an if statement that decides whether to construct some combinatorial logic"
<d1b2> <TiltMeSenpai> I do like nmigen's "the HDL is a data structure" approach
<d1b2> <edbordin> off the top of my head, nmigen also offers a build system integrated with various targets and cxxrtl for fast simulation. probably a bunch of other nice things I'm not aware of/forgot to mention
<d1b2> <TiltMeSenpai> nmigen_boards is also super nice, makes it easy to get up and running super fast with some of the boards common in our community
<d1b2> <edbordin> coming from a data science background I do find it interesting that tensorflow went the other way... it started out as "the computation is a dataflow graph of stateful ops" and then in v2 added a bunch of magic so you could write things like loops as python code
<d1b2> <TiltMeSenpai> can you still poke the stateful graph in tensorflow?
<d1b2> <edbordin> yeah, actually they also added "eager execution" as the default (mainly because debugging is so much easier) and the autograph magic means your eager code can hopefully get translated into a graph for higher performance later
<d1b2> <edbordin> tbh most of the time you just stack some layers together and don't have to care what's happening under the hood
<d1b2> <TiltMeSenpai> oh cool
<d1b2> <TiltMeSenpai> lol that was an interesting distraction, but yeah, the data structure thing means that it's probably pretty trivial to serialize nmigen into json or something, I was thinking of building an icestudio-style graph editor at some point
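A toy illustration of the "HDL is a data structure" point (plain Python with made-up class names, not the real nmigen API): when statements are ordinary objects rather than executed code, serializing a design is just a tree walk.

```python
import json

# Hypothetical stand-ins for HDL AST nodes -- not the real nmigen classes.
class Signal:
    def __init__(self, name, width=1):
        self.name, self.width = name, width
    def eq(self, other):
        return Assign(self, other)   # builds a description; nothing "runs"

class Assign:
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

def to_dict(node):
    """Walk the statement tree into JSON-friendly dicts."""
    if isinstance(node, Signal):
        return {"kind": "signal", "name": node.name, "width": node.width}
    if isinstance(node, Assign):
        return {"kind": "assign", "lhs": to_dict(node.lhs), "rhs": to_dict(node.rhs)}
    return {"kind": "const", "value": node}

a, b = Signal("a", 8), Signal("b", 8)
stmt = a.eq(b)                    # a plain data structure describing "a = b"
blob = json.dumps(to_dict(stmt))  # ...so serializing it is one tree walk
```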
<d1b2> <edbordin> yeah that was probably a bit of a tangent lol
<d1b2> <TiltMeSenpai> idk I think it's a matter of paradigm, I don't know if there's a wrong answer between myhdl and nmigen, just use what "clicks" with you more I guess
<d1b2> <edbordin> I vaguely remember a similar discussion ages ago and it seemed a few seasoned people were of the opinion that if you want to do heavy amount of metaprogramming/abstraction then nmigen is more appropriate for that sort of thing
<d1b2> <edbordin> I admit at one point I had a look at the tensorflow autograph internals to see if I could hack something together that outputs nmigen objects 😛
<d1b2> <TiltMeSenpai> it might be worth noting that the vendor platforms sometimes are fixing little issues without you needing to worry about it, I don't know if you would need to manually apply fixes if something just spat out the verilog/vhdl corresponding to your design
<d1b2> <TiltMeSenpai> like a couple of the vendor platforms fix things with funny reset behavior
<d1b2> <TiltMeSenpai> idk, nmigen feels generically more end-to-end, like the end output artifact is a bitstream you can flash onto a board
<d1b2> <edbordin> yeah, I think if you're targeting a platform that's supported by nmigen (or similar enough that you just need to write a board definition) then the experience is likely to be nicer
<proteusguy> edbordin & TiltMeSenpai - thanks for the considered responses. Good to get an understanding of the architectural drivers for a toolset.
<d1b2> <edbordin> I guess the other thing is myhdl might be a bit more mature
emeb_mac has quit [Quit: Leaving.]
electronic_eel_ is now known as electronic_eel
Asu has joined #nmigen
<_whitenotifier-3> [nmigen] whitequark commented on pull request #463: Add initial support for Symbiflow toolchain for Xilinx 7-series - https://git.io/JJ5ML
<_whitenotifier-3> [nmigen-boards] whitequark closed pull request #105: Use correct IO attribute for ECP5 FPGAs - https://git.io/JJ5OP
<_whitenotifier-3> [nmigen/nmigen-boards] whitequark pushed 1 commit to master [+0/-0/±3] https://git.io/JJ5Ms
<_whitenotifier-3> [nmigen/nmigen-boards] GuzTech 2bdf05f - Use correct IO attribute for ECP5 FPGAs
<_whitenotifier-3> [nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±2] https://git.io/JJ5Ml
<_whitenotifier-3> [nmigen/nmigen] awygle 73f672f - lib.fifo: add `r_level` and `w_level` to all FIFOs
<_whitenotifier-3> [nmigen] whitequark closed pull request #472: Add r_level and w_level to all FIFOs - https://git.io/JJ7n2
<_whitenotifier-3> [nmigen/nmigen] github-actions[bot] pushed 1 commit to gh-pages [+0/-0/±13] https://git.io/JJ5MB
<_whitenotifier-3> [nmigen/nmigen] whitequark ee2ffbf - Deploying to gh-pages from @ 73f672f57c606ac29087ffb09c43a7e4d9a9dfc6 🚀
<whitequark> proteusguy: edbordin and TiltMeSenpai are correct; I would emphasize it a bit more that myhdl is explicitly an evented simulation language (something nmigen avoids) and its chosen mechanism for metaprogramming is quite limited
<_whitenotifier-3> [nmigen] whitequark commented on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ5Dc
<_whitenotifier-3> [nmigen] pepijndevos commented on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ5D7
<_whitenotifier-3> [nmigen] pepijndevos edited a comment on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ5D7
<_whitenotifier-3> [nmigen] whitequark commented on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ5yY
<whitequark> there should be a github feature that physically prevents you from commenting while on vacation imo
<_whitenotifier-3> [nmigen] pepijndevos commented on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ59U
<lkcl> whitequark: lol.
<lkcl> proteusguy: our team evaluated a whole stack of HDLs, including pyrtl (which i'd never heard of before), chisel3, bluespec and so on.
<lkcl> we tried doing object-orientated programming in myhdl and found that, because it is a direct-to-python translator, you couldn't use classes without some serious limitations.
<lkcl> to do what we wanted with myhdl, we would actually have to have some sort of python-to-python language translator, or an auto-generator that outputted python code
<lkcl> multiple inheritance would be out of the question.
<lkcl> so if you have a simple straightforward design, there's no problem at all: myhdl is a great choice. the Algol RV32IMA RISC-V processor was successfully written in it and is very good
<lkcl> you can see there, it's really clear: python's hugely readable syntax and whitespace requirements make for much-increased readability
<lkcl> the problem is: it's still basically verilog programming paradigms (from the 1980s) under the hood.
<proteusguy> lkcl, thanks for the info & reference.
<lkcl> with both nmigen and pyrtl, on the other hand, as people mention, you "construct" the HDL using python-based data structures, and hand the results over to another tool (both nmigen and pyrtl use yosys).
<proteusguy> lkcl, presumably your team is using nmigen now?
<lkcl> here's a really good analysis https://ucsbarchlab.github.io/PyRTL/
<lkcl> the IEEE754FP library is entirely object-orientated and is forty five THOUSAND lines of code
<lkcl> it covers FP16, FP32, FP64, is entirely parameterised (FP8 and FP128 could be added with one line of code each in a python dictionary)
<lkcl> to give some perspective on that: minerva, a full RV32IMA core, is only 4,000 lines.
* proteusguy has yet another reason to hate floating point. ;-)
<lkcl> lol
<lkcl> you'll loove posits, then
<_whitenotifier-3> [nmigen] whitequark commented on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ59M
<lkcl> that's interesting. according to that pyrtl page (nmigen is missing! waa!) it says that pymtl is being developed by cornell.
<lkcl> _another_ python-based hdl i'd not heard of
<proteusguy> was just reading that.
<electronic_eel> lkcl: how would you describe the programming paradigm you use in your fp lib?
<whitequark> i'm never certain what to think when someone uses the large size of their codebase as an advantage
<lkcl> electronic_eel: it makes prevalent use of OO abstraction
<lkcl> whitequark: well, it could easily have been 2 to 3x that if i had done separate FP16, separate FP32 and separate FP64 classes
<lkcl> the problem is that the abstraction, far from making things easier, actually starts to interfere with readability.
<lkcl> i created a "normalisation" class for example.
<lkcl> now instead of being able to see what's going on in a single file, you now have to have 2, 3, 4, or 5 files open simultaneously to follow the code-path.
<whitequark> you can put more than one class in a file y'know
<lkcl> whitequark: IEEE754, if you want "everything", it's just... damn big.
<lkcl> true - i put a stack of commonly-used classes in fpcommon.py
<lkcl> they're very short
<whitequark> anyway, what i'm saying is that big LOC count might say things but not necessarily positive ones
<whitequark> (remember the old joke about programmers being paid by the line?)
<lkcl> electronic_eel: you know the rule "always plan to do 3 versions of a codebase because if you don't that's what you'll end up doing anyway?"
<whitequark> solvespace is 40kLOC and that's an entire CAD and a geometric kernel
<whitequark> so to me, "IEEE754FP in nmigen is 45kLOC" makes me wonder if perhaps something is wrong. is it nmigen? is it libresoc? is it something else?
<lkcl> whitequark: yeah, it was an 8 month time-pressured learning experience of nmigen *and* IEEE754
<whitequark> (or maybe nothing did, it's just that complex)
<proteusguy> lkcl, so what's this resolve down to in terms of LUTs when you get it into a device?
<lkcl> whitequark: around half of it is unit test infrastructure, running over half a million unit tests
<whitequark> oh, that's... pretty important
<lkcl> proteusguy: there's two versions - one is FSM-based, one is pipeline-based
<whitequark> i wouldn't normally count code LOC and test LOC in one bag
<whitequark> (especially because tests are often not treated with the same care as code--which is unfortunate but true)
<lkcl> for FPMUL, the integer multiply block completely dominates, so it's on "par"
<lkcl> yeah i had to, here, because without comprehensive unit tests there was no way i was going to get this "correct".
<lkcl> also, proteusguy, the div unit isn't just a div unit: jacob wrote it so that it covers sqrt *and* rsqrt in the same pipeline
<lkcl> consequently, whilst it's 1.5x bigger than a normal radix FPDIV pipeline, you end up with more than 50% less code because you don't need a separate sqrt pipeline and separate rsqrt pipeline.
<lkcl> oh: one of the other reasons it's a big codebase is because we're planning to do dynamic partitioned SIMD as well. that's also part of the codebase
<lkcl> whitequark: at some point (not now, with the oct 2020 deadline coming up) i'd love to talk about that, with a view to commissioning your help in adding a nmigen-like version of "m.Case()" etc on top of it.
<lkcl> proteusguy: the dynamic SIMD-partitioned class, PartitionedSignal, is something that would be absolutely insane to try in myhdl, verilog or vhdl.
<lkcl> fascinatingly, IBM's POWER9 core actually does dynamic partitioning of FP32 / FP64.
<lkcl> as in: you can decide at runtime whether you want the hardware resources to "join together" to make FP64 faster, or whether you want the underlying hardware to split and make parallel FP32 results quicker
<lkcl> i'm scared of finding out the details of how they do that :)
<_whitenotifier-3> [nmigen] pepijndevos commented on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ5HY
<_whitenotifier-3> [nmigen] pepijndevos edited a comment on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ5HY
<_whitenotifier-3> [nmigen] whitequark commented on issue #473: Signed math on Cat gives incorrect results - https://git.io/JJ5HE
<whitequark> pepijndevos: so, i don
<whitequark> *i don't really want to argue about this issue
<whitequark> either you can wait until my vacation ends (which will be on monday), or you can fix the bug yourself
<proteusguy> lkcl, so this 3D open source GPU is your project?
<lkcl> cesar[m], thank you for this! i put the post you did into that link, and added a screenshot
<lkcl> proteusguy: yes. hybrid CPU-VPU-GPU using and extending the POWER9 ISA
<lkcl> cesar[m]: i've often had to add colour "by hand" to traces, but the lack of debug lines dividing things still makes it difficult.
<proteusguy> lkcl, really interesting stuff... I'm learning a lot already. :-) the updates are especially educational.
<lkcl> this is really good!
<pepijndevos> whitequark, oh... only read this after posting. Noted. Enjoy the holiday.
<whitequark> thanks
<proteusguy> lkcl, how far out do you suspect this is from being ready to tape out?
<lkcl> proteusguy: it's... a seriously ambitious project. luckily, the team behind NLNet likes it, enough to fund it
<lkcl> well we're *going* to be doing a test 180nm ASIC, GDS-II files to be submitted end of october 2020
<lkcl> however that's for something that is only POWER9 compliant, no vectorisation etc.
<lkcl> realistically it's going to be another... 12-15 months work before we have anything that could "take on" any current ARM / x86 embedded offering
<pepijndevos> Just to clarify, when you have a nested design, you have a toplevel Elaboratable, and then submodules of submodules, right? Or do you nest the elaboratables?
<lkcl> classes == elaboratables, and there is one declaration of a Module per class
<lkcl> in that module, you drop submodules. those submodules are also classes that, again, have one and only one Module
<lkcl> there, class Adder inherits from Elaboratable, has an elaborate function, and declares a Module and returns it.
<lkcl> likewise for Subtractor
<lkcl> it's in the ALU class that you declare the instances of an Adder and Subtractor
<pepijndevos> alright
<whitequark> the Module class basically implements the builder pattern and holds all local state related to the elaborate() function
<whitequark> whereas the Fragment returned by elaborate() only contains a netlist
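A hypothetical miniature of the split whitequark describes (plain Python, invented names, not nmigen's actual classes): the builder holds mutable state while statements are accumulated, and the built result is frozen, data-only.

```python
# Toy builder-pattern sketch: FragmentBuilder plays the role of a mutable
# construction helper; build() hands back an immutable "fragment" of pure data.
class FragmentBuilder:
    def __init__(self):
        self._stmts = []
    def add(self, stmt):
        self._stmts.append(stmt)   # convenient, mutable construction state
        return self
    def build(self):
        return tuple(self._stmts)  # the "fragment": immutable netlist-like data

frag = FragmentBuilder().add("a = b").add("c = a & 1").build()
```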
<lkcl> and, in the *ALU*'s elaborate function, not only do you do exactly the same thing (declare a Module and return it), you add them as submodules
<lkcl> the reason why Elaboratable was added is because people (including me) frequently forgot to add a sub-module to submodules.
<whitequark> it was a pervasive problem in Migen
<whitequark> you forget to add a submodule, now your design doesn't work in incomprehensible ways, and you waste hours or days figuring out why
<pepijndevos> ah I was thinking... I understand *how*, but not *why*, because it seems you could just have modules and do elaboration on __init__ basically.
<lkcl> mmm i wondered about that, too. but, it's... something to do with inheritance... there's an OO reason for doing "setup" in constructors followed by "do stuff" following an API / pattern
<lkcl> oh. a trick for you, pepijndevos, 1 sec let me find it
<whitequark> it's not related to inheritance per se
<pepijndevos> Still not 100% sure why having elaboratable helps. does it actually warn you if you forget? and why does elaboratable enable that?
<lkcl> pepijndevos: yes it warns you.
<lkcl> python is dynamic, not statically compiled. tracking down "missing objects that you should have added" is... well, not really possible
<lkcl> unless
<lkcl> you have a tracking mechanism
<lkcl> so whitequark added one
<whitequark> pepijndevos: having Elaboratable helps because you inherit a GC hook from it
<whitequark> so when your module is GC'd (which in CPython always happens when the interpreter quits) you get a warning
<whitequark> and if you add as a submodule something that doesn't inherit from Elaboratable, you also get a warning
<whitequark> there's no way to do this other than via GC hooks (that I can imagine), so you have to add the hook *somehow*
<whitequark> it could've been a decorator
<whitequark> but class decorators are more obscure, plus you do need to implement an abstract method
<whitequark> so inheritance it is
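The GC-hook idea above can be sketched in plain Python. This is a hypothetical toy (nmigen's real machinery is more involved), but it shows the mechanism: the base class carries a finalizer that warns if the object was constructed and then dropped without ever being consumed.

```python
import gc
import warnings

# Hypothetical sketch of the tracking mechanism described above -- not
# nmigen's actual implementation. Inheriting from this base class is what
# gives every module the GC hook.
class Elaboratable:
    def __init__(self):
        self._elaborated = False
    def __del__(self):
        if not getattr(self, "_elaborated", True):
            warnings.warn(f"{type(self).__name__} created but never elaborated")

def elaborate(obj):
    obj._elaborated = True   # the build process marks the module as used
    return obj

class Forgotten(Elaboratable):
    pass

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    Forgotten()      # dropped on the floor: never added as a submodule
    gc.collect()     # CPython's refcounting fires __del__ promptly anyway
```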
<pepijndevos> brilliant
<lkcl> there is another terrible way: over-ride __import__. i've had two legitimate reasons to do that in the past :)
<lkcl> pepijndevos: https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/fu/mul/pipeline.py;hb=HEAD
<lkcl> lines 37 and 44
<lkcl> the "connect()" function returns a list of eq's that would *normally* be done in the elaborate function
<lkcl> as you might guess, that function walks the pipeline i and o objects (inputs and outputs), "connecting" them together in a chain
<lkcl> the reason i mention this is because the Q class you are doing, it *might* be possible to "avoid" passing around a Module(), using this trick
<lkcl> if you remember our discussion yesterday.
<pepijndevos> I am not so far that I see the problem yet, so in an hour or so I'll probably be like "aaaaah"
<lkcl> basically the fact that elaborate is where the fragments ultimately have to be collated and returned, you do *not* have to limit yourself to only having those fragments *created* in the elaborate function
<lkcl> pepijndevos: :)
<lkcl> cesar[m], that gtkwave "trace" demo is *really* nice. being able to collapse groups of signals is incredibly useful
<lkcl> hmmm, that would be amazing to have nmigen auto-generate a hierarchy.
<whitequark> a hierarchy?
<whitequark> pepijndevos: regarding your q about splitting __init__ and elaborate, there's a different reason for it
<whitequark> well, a few reasons
<whitequark> first: the idea is that nmigen modules are soft-immutable once created. that is, you would not normally change the module itself inside of elaborate()
<whitequark> this means you can create a design, and then synthesize it for a range of platforms, for example
<lkcl> as in, take the nmigen hierarchy and auto-generate some coloured waves that include hierarchical indentation / expansion in the *Waves* window
<whitequark> or reuse modules, or something like that
<pepijndevos> uhu
<whitequark> second: it's nice to have clearly separate interface and implementation
<whitequark> __init__ is for the interface and that alone
<whitequark> elaborate is for the implementation
<whitequark> migen had these separated by "###" in __init__ by convention
<whitequark> i codified that
<lkcl> ah yeah that's the clue. if you try a multiple inheritance design, and want to do merging / re-use of some code-fragments, if everything's done in __init__ that's almost impossible to achieve
<whitequark> i never intended to enable that specifically
<whitequark> but .. i'm happy it works out?
<lkcl> it does :)
<lkcl> the nmutil pipeline API creates a base class that is further inherited and provides different *types* of pipeline behaviour (buffering, non-buffering, etc.)
<lkcl> m = super().elaborate(platform)
<whitequark> oh yea
<whitequark> you could probably assign to self.m if these two methods were merged
jeanthom has joined #nmigen
<_whitenotifier-3> [nmigen] DaKnig opened issue #474: wrong documentation for word_select - https://git.io/JJ57b
emeb has joined #nmigen
<pepijndevos> huhhhh, my testbench is just a single yield Tick() and that throws it in an infinite loop
<_whitenotifier-3> [nmigen] whitequark commented on issue #474: wrong documentation for word_select - https://git.io/JJ559
<_whitenotifier-3> [nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/JJ55F
<_whitenotifier-3> [nmigen/nmigen] whitequark e46118d - docs/lang: use less confusing placeholder variable names.
<_whitenotifier-3> [nmigen] whitequark closed issue #474: wrong documentation for word_select - https://git.io/JJ57b
<_whitenotifier-3> [nmigen/nmigen] github-actions[bot] pushed 1 commit to gh-pages [+0/-0/±15] https://git.io/JJ55x
<_whitenotifier-3> [nmigen/nmigen] whitequark 4a2c595 - Deploying to gh-pages from @ e46118dac0df315694b0fc6b9367d285a8fc12dd 🚀
<pepijndevos> huhhhh, if I use comb and Delay it works
<DaKnig> whitequark: then shouldnt it be (b+1)*i ?
<DaKnig> or is this supposed to be one bit wide?
<whitequark> DaKnig: think of it this way: the result is always `w` (in the new docs, `i` in the old docs) wide
<DaKnig> no...
<whitequark> so it can't be anything that isn't algebraically equivalent to a[x:x+w]
<DaKnig> not the way you wrote it
<whitequark> oh?
<whitequark> b*w+w is the same as (b+1)*w
<pepijndevos> Why is tick an infinite loop...
<whitequark> pepijndevos: got an MCVE?
<DaKnig> ok I thought it was +1 before, not +i
<DaKnig> nvm then, now it looks reasonable
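The slice arithmetic being debated can be checked with a plain-Python model of `word_select` on an integer (a toy function, not the nmigen implementation): word `b` of width `w` in value `a` is `a[b*w : b*w + w]`, and `b*w + w` is the same bound as `(b+1)*w`.

```python
# Plain-Python model of the indexing under discussion.
def word_select(a, b, w):
    """Return the b-th w-bit word of integer a (counting from the LSB)."""
    return (a >> (b * w)) & ((1 << w) - 1)

x = 0xCAFEBABE
assert word_select(x, 0, 8) == 0xBE   # lowest byte
assert word_select(x, 3, 8) == 0xCA   # highest byte
# b*w + w and (b+1)*w denote the same upper bound:
assert all(b * 8 + 8 == (b + 1) * 8 for b in range(4))
```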
<pepijndevos> ooof, not yet. But basically... is there a normal reason why this might happen?
<whitequark> what do you mean by "infinite loop" here?
<whitequark> the yield never returns?
<pepijndevos> it seems so yea
<DaKnig> did you try running it line by line in, say, pdb?
<whitequark> is the clock driven?
<whitequark> forgotten add_clock perhaps?
<pepijndevos> ...
<whitequark> was that it?
<pepijndevos> probably... I'm looking at the examples folder and they all use main and don't seem to add a clock
<_whitenotifier-3> [nmigen] whitequark closed issue #402: Consider slice as signal in Record initialization - https://git.io/JfyX5
<whitequark> oh
<whitequark> so main() automatically drives the `sync` clock domain
<whitequark> it's kinda opaque and needs rework in the future
<pepijndevos> I see... yea add_clock fixed it haha
<pepijndevos> and with that, I can compile a spreadsheet with just simple expressions to nMigen :D
<whitequark> there was a bug tracking this issue with Tick() somewhere...
<whitequark> can't find it
<whitequark> feel free to file it, this really should be at least a warning
jeanthom has quit [Ping timeout: 256 seconds]
<pepijndevos> alright
<_whitenotifier-3> [nmigen] pepijndevos opened issue #475: Emit a warning when no clock is set - https://git.io/JJ5db
jeanthom has joined #nmigen
jeanthom has quit [Ping timeout: 260 seconds]
<DaKnig> where can I find an explanation about how to work with `Record`s?
<lkcl> DaKnig: we kinda pick it up by a process of osmosis :)
<lkcl> what do you need to know?
<DaKnig> oh nvm. I was just not sure about the examples that use that
<DaKnig> they arent super clear and docstrings didnt help much either
<DaKnig> I think I got this
<DaKnig> what would I use for FSMs?
<DaKnig> lkcl:
<lkcl> so this one, the "pins" Signal is 8 bits
<lkcl> and the bus Record is 8 bits
<lkcl> (add up the total bits of each element in the Record, 5+1+1+1)
<DaKnig> yeah I got it by trying this in the python interactive shell
<lkcl> the really nice thing is that the Record instance, once created, you can refer to the Layout items as object.bus, object.we
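The width arithmetic above (5+1+1+1 = 8) can be written out as a toy sketch. Only "bus" and "we" appear in the chat; the other field names here are invented for illustration, and this is not nmigen's real Layout API.

```python
# Toy model of a Record layout: (name, width) pairs; total width is the sum.
layout = [("data", 5), ("we", 1), ("stb", 1), ("ack", 1)]  # field names partly invented

def record_width(layout):
    """Total Record width = sum of the field widths."""
    return sum(width for _name, width in layout)
```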
<proteusguy> lkcl, did you ever consider posits? https://www.youtube.com/watch?v=JJgT-YphE1Y
<lkcl> what about FSMs?
<lkcl> proteusguy: yyeees :) unfortunately, vulkan - and all pre-compiled shader applications written by proprietary games - don't use / support posits.
<proteusguy> lkcl, you could just convert in/out but use posits internally
<lkcl> if _vulkan_ added posits, then if NVidia, AMD, ARM (MALI) etc. all used them...
<proteusguy> (but I understand your concern)
<proteusguy> lkcl, I understand that there's some GPU that is using them internally already.
<lkcl> we went over this question 18 months ago. the problem - if you recall the history of the IBM 370 and the Amdahl clone
<lkcl> is that if we make even one mistake in the full interoperability (a fraction of a unit in last place - ULP) - we're hosed
* proteusguy feels your pain. :-)
<lkcl> in other words we would have to provide *full* IEEE754 FP interoperability... through yet another hardware layer (posits)
<lkcl> with IEEE754 QNaNs, signalling NaNs, +/- INFs, everything
<lkcl> Amdahl's "mistake" - hilariously - was that they correctly implemented IEEE754, where the IBM 370 did not
<lkcl> user applications *failed* to run on the Amdahl 370 clone because of this!
<lkcl> and they had to write a software OS patch that provided exact binary-interoperability with the IBM 370's faulty IEEE754 hardware implementation!
* lkcl wark-wark
<lkcl> DaKnig: what would you like to know about FSMs? how - or why - to use them?
<lkcl> you _can_ use a big Switch/Case statement, with an object (best to do a Record) where outside of the switch statement you sync between a "future" and "current" copy of that object/Record
<lkcl> VHDL does that
<lkcl> nmigen FSMs basically take the "pain" away of having to explicitly do that type of trick
<anuejn> awygle: are you sure that the r_level and w_level outputs work for AsyncFifoBuffered?
<anuejn> somehow i am getting r_rdy == 1 but r_level is always 0
<awygle> I'm not sure of anything at 8am lol
<vup> lol
<awygle> I can look at it but it'll have to wait a bit. An example would be nice but I can make one myself if that's difficult for whatever reason
<anuejn> i see :joy:
<anuejn> no that would be the next thing i try anyways
<awygle> It's quite mysterious that that code ended up merged... It can't have been wq since she's on vacation 😛
<awygle> The condition guarding the r level update is
<awygle> with m.If(self.r_en | ~self.r_rdy):
<awygle> So you might just have to tick r_en
<awygle> But that really shouldn't happen, so maybe there's an extra delay on r_level somewhere
<awygle> Does the internal FIFO get its level set?
<pepijndevos> How do I (not) update a value from a testbench process? I'm just doing yield sig.eq(val) and... it has no effect it appears.
<vup> pepijndevos: this is how its supposed to work, for comb updates to happen you have to `yield Settle()` (or proceed to the next clock cycle using `yield`)
<pepijndevos> hmmmmm
proteusguy has quit [Ping timeout: 240 seconds]
proteus-guy has quit [Ping timeout: 240 seconds]
<pepijndevos> The value was previously set with m.d.sync += cell.value.eq(sig.value), will that prevent it from being updated? (essentially reset next clock tick?)
<anuejn> awygle: internal fifo?
<anuejn> awygle: why is there a condition guarding the r_level update?
<pepijndevos> Basically, will the yield eq force the signal to the new value, or just temporarily set it until the next sync resets it?
<anuejn> pepijndevos: to my understanding you are not supposed to drive signals in simulation that are also driven by your design
<pepijndevos> I am so good at breaking this thing...
proteus-guy has joined #nmigen
<pepijndevos> okay, testing this hypothesis
<anuejn> if i remember correctly there was a discussion about emitting a warning but that is not implemented (yet) (afaik)
* awygle is officially out of bed now
<pepijndevos> Maybe I should file another issue for more warnings for... people like me
<awygle> yeah you're not supposed to drive things from multiple modules, and the simulation is ~ a module
<awygle> feel free, if that doesn't exist yet
<awygle> anuejn: it guards all the updates to the "buffer" elements. i think the problem is that i'm assigning to r_level in the sync domain in AsyncFIFO and that's adding a delay. what i don't know is why it passed the formal tests, but they're not great for AsyncFIFO tbh...
<_whitenotifier-3> [nmigen] pepijndevos commented on issue #475: Emit a warning when no clock is set - https://git.io/JJ5A5
<pepijndevos> I think https://github.com/nmigen/nmigen/issues/318 covers this already
<pepijndevos> I'm not the only idiot apparently ;)
<pepijndevos> (no offense to the reporter)
<_whitenotifier-3> [nmigen] whitequark commented on issue #475: Emit a warning when no clock is set - https://git.io/JJ5Ah
<_whitenotifier-3> [nmigen] whitequark closed issue #475: Emit a warning when no clock is set - https://git.io/JJ5db
<anuejn> awygle: soo... it works with AsyncFifo but is broken with AsyncFifoBuffered
<anuejn> should i file an issue?
<awygle> anuejn: probably. can you link your example? i think i know what the problem is, i'm just still setting up a testbench to confirm
<anuejn> k one sec...
<pepijndevos> yay, works... if I don't put a number in the excel sheet I can change it from the testbench
proteusguy has joined #nmigen
<anuejn> this works (with AsyncFIFO)
<awygle> oh, it didn't fail the formal tests because we don't run model equivalence for async fifos, right.
<anuejn> if you instead use AsyncFIFOBuffered, it breaks
<anuejn> yup thats what i found out too :joy:
<awygle> ok thanks, lemme take a look real quick to confirm my theory
<awygle> fun fact - if your clock period is 100,000,000 seconds, write_vcd doesn't work :p
<DaKnig> lkcl: how is this different from what VHDL has?
<DaKnig> in VHDL you'd have some enum type (whatever that's called) and one switch statement, this looks exactly like what you do in nmigen. not that its bad... and why do you need 2 versions of the record?
<anuejn> awygle: oh ouch sorry for that
<anuejn> i always mess that up because i have a helper that takes frequency
* anuejn hides
zignig has quit [Remote host closed the connection]
<pepijndevos> What determines the initial state of a FSM?
<awygle> it's the first state
<awygle> top to bottom
<pepijndevos> ok
<awygle> you can also pass it explicitly to m.FSM i believe
<lkcl> DaKnig: one of the records/things is combinatorial, the other is the "next" which is assigned to inside a rising_edge(clk) block
<awygle> yeah, with `reset="STATE"`
<lkcl> equivalent to nmigen "m.d.sync += next_fsm_state_record.eq(the_current_fsm_state_record)"
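The two-process style lkcl describes can be modelled in a few lines of plain Python (a toy, with invented state and input names, not nmigen or VHDL): one pure function plays the combinational process that computes the "future" state, and the clock edge is just the copy current <- future.

```python
# Toy two-process FSM model.
def next_state(state, start, done):
    """Combinational process: pure function of current state and inputs."""
    if state == "IDLE" and start:
        return "RUN"
    if state == "RUN" and done:
        return "IDLE"
    return state

def tick(state, start, done):
    """Sequential process: commit the computed next state on the clock edge."""
    return next_state(state, start, done)

s = "IDLE"
s = tick(s, start=True, done=False)   # IDLE -> RUN
s = tick(s, start=False, done=True)   # RUN -> IDLE
```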
<awygle> anuejn: ok so fun fact it was _also_ broken in AsyncFIFO just less badly
<DaKnig> lkcl: you can do case inside processes tho , no/
<DaKnig> ?
<DaKnig> ah right, this lang is too crappy to have consistent behavior between comb and sync logic
<awygle> DaKnig: there are three "styles" of implementing FSMs in verilog/vhdl, called one-process, two-process, and three-process state machines
<awygle> lkcl is describing a two-process state machine
<lkcl> awygle: i am? nice to know its name :)
<awygle> specifically section 6
<lkcl> line 203 is where the "copy-of-the-state" is first done
<awygle> i don't think i've ever seen anybody use a four-process state machine but i can see why it would be nice if you were going beyond 1 anyway
<lkcl> all occurrences v.state = 'xxxx' in that file can be translated to nmigen "m.next"
<anuejn> awygle: in which way?
<awygle> anuejn: two extra clocks of latency on r_level as opposed to r_rdy
<anuejn> oh yeah that was bad but i thought it was by design ;)
<lkcl> Abstract: there are at least *seven* different FSM design techniques taught. wow
<anuejn> so now they are in sync?
<whitequark> awygle: "four-process"?
<awygle> whitequark: see the linked apper
<awygle> ... paper
<whitequark> oh
<lkcl> oo, binary vs one-hot FSM state-encoding
<lkcl> ah. page 9 of that paper. whitequark: that would be an interesting efficiency consideration / option to add to nmigen
<lkcl> to have the FSM states be generated as unary rather than binary
<daveshah> that's a transform that Yosys should be able to do
<daveshah> although its FSM detection is a bit fickle at the moment
<lkcl> daveshah: ah really?
<daveshah> have a look at the fsm meta-pass
<awygle> arright, fixed the bug
<awygle> gonna add test case
<DaKnig> wait , nmigen doesnt use one-hot?
<awygle> yes
<DaKnig> that's the standard in FPGA tools
<lkcl> i was just thinking, the "ASIC vs FPGA" synthesis section, "silicon is free, flip-flops cost" but for FPGAs "flip-flops are free, and comb logic costs", tends to suggest, to me, having some form of recognition of the concept of an FSM *in* ilang, itself
<daveshah> most FPGA tools will convert to one-hot if it is more efficient automatically
<lkcl> DaKnig: from what i've seen of nmigen ilang output, the fsm_state "tracker" is binary-encoded
<lkcl> but that's in the simulations
<awygle> we leave that encoding decision up to yosys, i believe
<awygle> (or whatever the synthesizer is)
<DaKnig> lkcl: usually the sim does use the binary representation since its easier to look at, and iirc I used a sim that even writes the name of the state. that has nothing to do with how its synth'd
<anuejn> awygle: interesting...
<anuejn> thanks for the fix :)
<awygle> yup, sorry for the thrash :p
<whitequark> yosys does recognize FSMs in ilang
<awygle> we really need to fix the formal for asyncfifo...
<whitequark> nmigen doesn't use those because write_verilog doesn't understand them
<whitequark> and the fsm_* passes don't understand processes
<DaKnig> that sucks...
<DaKnig> so it means vivado (in my case) wont be able to use one-hot..?
<daveshah> I expect Vivado may well recode it if it considers one-hot beneficial
<anuejn> at least diamond does that
<whitequark> in fact it's yosys which does not recognize nmigen FSMs
<whitequark> for some stupid reason i keep meaning to fix
<lkcl> whitequark: would it make a significant difference to performance if it was? (decreased logic utilisation, decreased latency)?
<whitequark> well, right now yosys simply doesn't run any FSM optimizations (unless you force it, in which case it does them unsoundly IIRC)
<lkcl> oh
<whitequark> whether the optimizations improve performance depends on the design
<DaKnig> lkcl: one hot reduces lut count in some cases
<DaKnig> for example, testing `if state = 1` takes only routing latency
<DaKnig> you do have a bit more luts to calc the next state i guess...
<DaKnig> if you have some serious logic that happens when state has some value, checking (basically XORing the bits) takes a bit more time; and then you have more delay on the longest path instead of moving this delay to the path for calc'ing the next state
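[editor's note: the encoding trade-off DaKnig describes can be illustrated with a small sketch. In a binary encoding, testing for a state compares several bits; in one-hot, each state owns one flip-flop, so the test reads a single bit. The helper names are invented for illustration:]

```python
# Illustration of binary vs one-hot FSM state encoding as discussed above.
def binary_encode(index, n_states):
    """Binary: the state register simply holds the state index."""
    width = max(1, (n_states - 1).bit_length())
    return index, width

def onehot_encode(index, n_states):
    """One-hot: one flip-flop per state, exactly one bit set."""
    return 1 << index, n_states

# binary: "are we in state 5?" compares all 3 bits of the register
code, width = binary_encode(5, 8)
assert (code, width) == (5, 3)

# one-hot: the same test reads exactly one bit of the register
code, width = onehot_encode(5, 8)
assert (code, width) == (0b100000, 8)
assert code & (code - 1) == 0     # a valid one-hot code has a single set bit
```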
cr1901_modern has quit [Quit: Leaving.]
cr1901_modern has joined #nmigen
<awygle> whitequark: do you want a non-formal test case against this r_level bug in AsyncFIFO or do you want to say "eventually we'll fix the formal stuff for AsyncFIFO"? we don't have any behavioral non-formal tests for FIFOs currently
<lkcl> DaKnig: the out-of-order dependency matrices for libresoc use one-hot vector encoding for register numbers.
<lkcl> this means that we can raise multiple bits in a single cycle, and thus do multi-issue.
<whitequark> awygle: former
<awygle> gotcha
<DaKnig> lkcl: makes total sense
<DaKnig> but is it really one-hot when you have more than one non-zero things?
<lkcl> it's one of the crucial differences between the Tomasulo algorithm (which uses binary-encoding for RS#s and therefore requires a CAM) and the 6600 style scoreboards (which only need a single AND gate)
<DaKnig> I think no
<lkcl> DaKnig: welll... it is inasmuch as each register only has one "bit" representing it, but strictly it's no longer "one-hot", yes.
<lkcl> the trade-off is that the matrices get VERY large, very quickly.
<DaKnig> more performance= more silicon, why act so surprised :)
<lkcl> :)
<DaKnig> I am using this now for multiple-bit shift-register: https://paste.centos.org/view/5f82418a
<DaKnig> I really couldnt find a better way that's shorter while still being clear
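[editor's note: the pasted snippet is not reproduced here. As a behavioural reference for the "multiple-bit shift-register" DaKnig mentions, here is a plain-Python model (depth/width values are arbitrary): each "clock tick" shifts a whole word in at one end and out the other:]

```python
# Plain-Python model of a shift register that is several bits wide.
from collections import deque

class WideShiftReg:
    def __init__(self, depth, width):
        self.width = width
        self.stages = deque([0] * depth, maxlen=depth)

    def tick(self, word_in):
        """Shift one word in; return the word that falls out the far end."""
        word_out = self.stages[-1]
        self.stages.appendleft(word_in & ((1 << self.width) - 1))
        return word_out

sr = WideShiftReg(depth=3, width=4)
outs = [sr.tick(w) for w in (0x1, 0x2, 0x3, 0x4)]
assert outs == [0, 0, 0, 0x1]     # first word emerges after `depth` ticks
```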
<pepijndevos> arg... I found a problem with my number class
<pepijndevos> Sticking it in an array gives trouble
<pepijndevos> It actually works surprisingly well, but I need the behavior of ArrayProxy.shape for the type of my number class.
<whitequark> ... that's a great question actually
<whitequark> i'm pretty sure it's a bug
<whitequark> please open an issue at least
<pepijndevos> ^.^
<whitequark> it's possible no one used signed values in ArrayProxy before
<_whitenotifier-3> [nmigen] pepijndevos opened issue #476: Possible bug in ArrayProxy for signed values - https://git.io/JJ5j1
<pepijndevos> You'll have like a dozen bug reports from me by the time holiday is over. Most of them about signed values it seems haha
<pepijndevos> Anyway... I think it's time for me to invent the ArrayProxyProxy to expose the type of my integers.
<pepijndevos> But uh, is it valid at all to store a non-Value in an Array?
<pepijndevos> Seems like yes.
<anuejn> awygle: I have found another bug in the r_level/w_level
<anuejn> when the fifo is full, {r,w}_level are 0
<anuejn> it is 1 for AsyncFIFOBuffered
<awygle> Sounds like a rollover bug. Thanks for the report!
<_whitenotifier-3> [nmigen] whitequark commented on issue #476: Possible bug in ArrayProxy for signed values - https://git.io/JJde6
Yehowshua has joined #nmigen
<Yehowshua> awygle, can I copy the implementation of `Record`? Will maintaining a local copy in a codebase work in future releases of nMigen?
<Yehowshua> I think `Record` subclasses `UserValue` and `Value`, so perhaps what I'm really asking is if those classes will change
<pepijndevos> Does an Array support inhomogenous elements?
<pepijndevos> Yehowshua, as far as I know subclassing Value derived classes other than UserValue is not supported
<_whitenotifier-3> [nmigen] pepijndevos opened pull request #477: Remove spurious element_signed from ArrayProxy.shape - https://git.io/JJdvN
<pepijndevos> So there is nothing that prohibits you from stuffing wildly different things in an array, but ArrayProxy.shape returns the maximum of the shape?? I don't think Verilog supports inhomogeneous arrays, so I wonder how this works in practice.
<pepijndevos> If inhomogeneous arrays are not *really* supported, I'd like to propose changing getattr on ArrayProxy so that if it can prove all values would "statically" return the same value, it can just return that value instead of a proxy.
<pepijndevos> This way shape can just be removed, and it also works for my number type.
jeanthom has joined #nmigen
<_whitenotifier-3> [nmigen] pepijndevos commented on issue #476: Possible bug in ArrayProxy for signed values - https://git.io/JJdJt
<DaKnig> can I do something like `r=Record([("a", 2),("b",2)]); m.d.sync+=r.eq(a,b)`?
<DaKnig> or do I have to write that in 2 lines?
<pepijndevos> huhhh, array indexing gets compiled to a switch
<pepijndevos> so it does indeed seem to support inhomogeneous elements
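[editor's note: a rough model of what pepijndevos observed — indexing an nMigen Array lowers to a switch over the index, one case per element. Because each case stands alone, the elements need not share a shape. The function name and widths are illustrative:]

```python
# Model of array indexing compiled to a switch: one case per element.
def mux_read(elements, index):
    """Behave like the generated switch statement."""
    for i, value in enumerate(elements):   # case i: result = elements[i]
        if index == i:
            return value
    return 0                               # default case

# "inhomogeneous" elements of different widths still work, since each
# case simply forwards its own element
elements = [0b1, 0b1010, 0b110011]
assert mux_read(elements, 2) == 0b110011
```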
<lkcl> DaKnig: there's a syntax error in what you wrote
<lkcl> generally, the convention in python is never to use semi-colons, despite it being syntactically permitted
<lkcl> r = Record([("a", 2), ("b",2)])
<DaKnig> whats the syntax error?
<lkcl> there does not exist a variable a or a variable b
<DaKnig> I used `;` to keep it in one line
<lkcl> and eq only takes one parameter. you have passed in two
<DaKnig> yes of course I wont write all the code lol, imagine they do exist and are the right size
<lkcl> ok then you will need them to be separate
<DaKnig> I tried to write in pseudo code
<lkcl> m.d.sync += r.eq(a)
<DaKnig> you mean r.a.eq(a)
<lkcl> ah because you assigned r on the LHS (r.eq) i assumed that you knew that this would involve assigning something of equal bit-width to r
<lkcl> the most logical choice for that would be "a Record that has exactly the same layout as r"
<lkcl> but yes, you could do r.a.eq(a)
<lkcl> where a (a separate signal of some type) was 2 bits in size (because r.a is declared in the Record Layout to be 2 bits wide)
<DaKnig> how is r stored, is `a` the two least significant bits?
<lkcl> if a (the separate signal of some type) is *not* 2 bits wide, then only the first two bits *of* that signal get put in
<lkcl> yes.
<lkcl> the bits are in order of declaration of the Layout
<DaKnig> so the solution would be to do `r.eq(Cat(a,b))`
<lkcl> ahh nooow you're getting it
<lkcl> where a had better be 2 bits, but b does not matter so much
<DaKnig> I dont like using Cat for everything :(
<lkcl> if a is only 1 bit, you *must* do this:
<lkcl> r.eq(Cat(a, Const(0, 1), b))
<lkcl> DaKnig: now you know why i created RecordObject.
<DaKnig> nah Im assuming `isinstance(a,Value)`
<DaKnig> or something like that
<lkcl> RecordObject will walk the fields of the RHS (using python getattr) and assign them to the same fields with the same name on the LHS.
<lkcl> you can perfectly well do this instead
<lkcl> r.a.eq(a)
<lkcl> r.b.eq(b)
<lkcl> where a is a non-length-matching Signal-or-isinstance(a, Value)
<lkcl> and likewise b is not 2 bits long either
<lkcl> generally however the whole idea of Records is that you... well, you only do this type of "messy" assignment at the receiving and sending end of where the Record instance is actually used
<lkcl> everywhere else, then just like a VHDL record/struct, you do
<lkcl> x= Record(....)
<lkcl> y = Record.like(x)
<lkcl> sync += x.eq(y)
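[editor's note: the packing behaviour lkcl walks through above — Record fields occupy bits in declaration order, first field in the least significant bits, which is also how `Cat()` concatenates — can be modelled in plain Python. The helper name is invented:]

```python
# Plain-Python model of Record/Cat bit packing, LSB-first.
def pack_fields(fields):
    """fields: list of (value, width) pairs; first pair lands in the LSBs."""
    word, offset = 0, 0
    for value, width in fields:
        word |= (value & ((1 << width) - 1)) << offset
        offset += width
    return word

# r = Record([("a", 2), ("b", 2)]): a is bits [1:0], b is bits [3:2]
a, b = 0b01, 0b11
assert pack_fields([(a, 2), (b, 2)]) == 0b1101
# a value wider than its declared field is truncated to the field width
assert pack_fields([(0b111, 2)]) == 0b11
```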
<awygle> Yehowshua: UserValue is going to be replaced by ValueCastable (https://github.com/nmigen/nmigen/pull/449) for internal nmigen classes soon. UserValue itself will go through a deprecation cycle, same as Record
<awygle> so you can either wait until the ValueCastable change goes in (should be in the next couple weeks) or rely on deprecation for either Record or UserValue
<Yehowshua> OK - thanks
<awygle> (i owe wq a bunch of stuff one of which is landing ValueCastable, oops)
jeanthom has quit [Ping timeout: 256 seconds]
Yehowshua has quit [Ping timeout: 245 seconds]
<pepijndevos> How do I get stuff to show up in the vcd? Not all my signals are there and I'm not sure what the discriminating factor is
Asuu has joined #nmigen
Asu has quit [Ping timeout: 246 seconds]
<d1b2> <286Tech> `with sim.write_vcd("test.vcd", "test.gtkw", traces=[yourmodule.signal_a, yourmodule.signal_b, ...]): sim.run()`
<_whitenotifier-3> [nmigen] whitequark commented on issue #476: Possible bug in ArrayProxy for signed values - https://git.io/JJdT5
<_whitenotifier-3> [nmigen] whitequark commented on issue #476: Possible bug in ArrayProxy for signed values - https://git.io/JJdkI
<d1b2> <TiltMeSenpai> do people have sort of nmigen standard libraries that they use? I'm mostly starting to attempt to wrap my head around FIR filters and FFT's
<d1b2> <TiltMeSenpai> well I understand FIR filters well enough but I'm wondering if there's a way to avoid a shitton of multiplies
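[editor's note: one common answer to TiltMeSenpai's question — for a fixed-coefficient FIR, each multiply by a constant can be unrolled into shifts and adds, so no hardware multipliers are needed. This is a behavioural sketch with invented names and non-negative coefficients only, not a hardware implementation:]

```python
# Multiplierless constant multiply: decompose the coefficient into set
# bits and accumulate shifted copies of the input.
def shift_add_mul(x, coeff):
    """Multiply x by a non-negative constant using only shifts and adds."""
    acc = 0
    bit = 0
    while coeff:
        if coeff & 1:
            acc += x << bit
        coeff >>= 1
        bit += 1
    return acc

def fir(samples, coeffs):
    """Direct-form FIR using shift-add multiplies only."""
    out = []
    hist = [0] * len(coeffs)
    for s in samples:
        hist = [s] + hist[:-1]
        out.append(sum(shift_add_mul(h, c) for h, c in zip(hist, coeffs)))
    return out

assert shift_add_mul(7, 5) == 35                 # 7<<2 + 7
assert fir([1, 0, 0], [3, 2, 1]) == [3, 2, 1]    # impulse response = coeffs
```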
emeb_mac has joined #nmigen
emeb has quit [Ping timeout: 240 seconds]
emeb has joined #nmigen
emeb_mac has quit [Ping timeout: 240 seconds]
emeb_mac has joined #nmigen
<lkcl> pepijndevos: ok, long story: we discovered that for some unexplained reason, some signals do not appear at the top level unless you create a signal of exactly the same size as the signals you want to see at the top level
<lkcl> then do m.d.comb += toplevelsignal.eq(dut.sometoplevelsignal)
<lkcl> and *then* those appear in the vcd file *and* the ones from the top level module *also* appear
<lkcl> if you don't do that, they will just... not... be... present
<lkcl> there is no rhyme or reason as to why.
<lkcl> this started happening about... 3 months ago?
<whitequark> probably related to the pysim refactor
<whitequark> you should file a bug
<lkcl> i _think_ we did, at the time.
<whitequark> hmm
Asuu has quit [Quit: Konversation terminated!]
zignig has joined #nmigen