ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen · IRC meetings each Monday at 1800 UTC · next meeting TBD
nfbraun has quit [Quit: leaving]
emeb has quit [Quit: Leaving.]
lf has quit [Ping timeout: 258 seconds]
lf has joined #nmigen
<_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLSVW
<_whitenotifier> [YoWASP/yosys] whitequark 5e121bd - Update dependencies.
<d1b2> <AlterNet> hey, is there nmigen gate-level sim?
<whitequark> what do you mean by "gate-level" in this context?
<d1b2> <AlterNet> like, LUT-level
<d1b2> <AlterNet> esp. if I instantiate LUTs directly
<whitequark> as in, you'd like to simulate nmigen netlists together with instances?
<Yatekii> hmm question: if I have some logic that automatically gets translated into blockram macros by nmigen, and that violates timing constraints in the implementation, can I tell nmigen to do it in actual logic and not BRAMs? (just an example. I do not have this problem, I am curious :))
<whitequark> yes
<whitequark> memory.attrs["logic_block"] = 1
<d1b2> <AlterNet> yes
<whitequark> AlterNet: not yet, but this is something that a work in progress will enable
<Yatekii> no I mean the other way round :D let's say I describe a delay element that delays N clock cycles. that could be described as a ringbuffer and could go into BRAM (no idea if nmigen does this). but what if this is actually "slower" as in critical path than a LUT impl. could I tell it to use plain logic?
<whitequark> nmigen never does this, the toolchain will generally not do this either
<Yatekii> ok :)
<whitequark> you might see shift registers inferred
<Yatekii> how do you mean? sorry I am quite noob here ^^
<whitequark> some xilinx devices can reuse LUTs as shift registers
<Yatekii> ah
<Yatekii> I see
<Yatekii> thanks a lot :)
falteckz has joined #nmigen
<falteckz> Hi all - what's the right way to override a platform? I assume I can just create a class that extends TinyFPGABXPlatform. If I want to add more resources do I override the __init__ function? Is there an API for something that is already called that will serve this purpose?
<d1b2> <dub_dub_11> Is this for a board?
<d1b2> <dub_dub_11> Generally it's code duplication
<d1b2> <dub_dub_11> So copy + edit the tinyfpgabx.py platform file
<falteckz> It's for a custom board that I wish to build eventually - but it's the same FPGA as the BX
<d1b2> <dub_dub_11> Yeah if it's a different board you make a different platform file, inheriting from Ice40Platform or whatever it is
<falteckz> LatticeICE40Platform fwiw
<falteckz> Thanks I'll do that!
<d1b2> <dub_dub_11> If you are basing the design off it, you may find it helpful to copy chunks of code, but you make a separate file yeah
<d1b2> <dub_dub_11> Ah that checks out
<falteckz> Yep, going with the duplicate code approach. Or "explicit" as Python calls it - I think it makes sense here
<whitequark> code duplication often makes sense because it can decouple unrelated components and allow them to change separately
<falteckz> You're right - changes to TinyFPGA do not apply to me if I'm making something custom. So it's wrong to extend, since they aren't coupled in that regard
<whitequark> yep!
<whitequark> sometimes, programmers dislike code duplication to a far greater extent than is really warranted. the inverse is occasionally true as well.
<whitequark> but it is less about duplication itself and more about coupling.
<whitequark> so, you get it :)
<falteckz> Haha I'm glad :) - say, while here, would any of you happen to know of a WS2812b driver? The protocol is simple and I'd be happy to implement it myself, but I get an early lunch if I implement it quicker haha
<whitequark> not offhand but others here might know
<whitequark> agg maybe?
<whitequark> actually wait
<whitequark> but it has a bunch of unrelated stuff involved so i'm not sure how much it simplifies your task
<falteckz> It looks like VideoWS2812OutputSubtarget handles a single strip of WS2812s, so I should be able to leverage it. Thank you for the reference :-)
<falteckz> It only just occurred to me that you had already highlighted that class -.-'
<whitequark> hehe
<falteckz> Are you familiar with the FIFO interface that VideoWS2812OutputApplet is using?
<whitequark> i designed it
<falteckz> Looks like the video binds directly to a crossbar - a nice implementation but perhaps far too complicated for my needs (needs yet undefined)
<whitequark> yeah, that is why i said "i'm not sure how useful it will be to you"
<falteckz> It's cool that we have essentially a DMA controller here
<falteckz> But yeah - maybe I just mimic the FIFO interface but clock the data in myself
<whitequark> hm, glasgow doesn't have memory
<falteckz> Perhaps my terminology is incorrect - it's a round-robin FIFO pump
<falteckz> I only mention it to express my location in the source. I'm aware you designed it :-)
<whitequark> ah, yeah :)
electronic_eel has quit [Ping timeout: 246 seconds]
electronic_eel has joined #nmigen
sakirious has quit [Quit: The Lounge - https://thelounge.chat]
sakirious has joined #nmigen
Bertl_oO is now known as Bertl_zZ
PyroPeter_ has joined #nmigen
PyroPeter has quit [Ping timeout: 256 seconds]
PyroPeter_ is now known as PyroPeter
<_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLSXJ
<_whitenotifier> [YoWASP/yosys] whitequark 0aba9d4 - Update wasi-sdk.
Degi_ has joined #nmigen
<lsneff> whitequark: you may find this interesting https://usercontent.irccloud-cdn.com/file/1s3qUoi7/ligeia.png
Degi has quit [Ping timeout: 256 seconds]
Degi_ is now known as Degi
<whitequark> lsneff: sweet
<whitequark> did i give you a large VCD as you requested?
<lsneff> I don't think so, but that was a 448M vcd
<whitequark> lemme upload you some
<lsneff> Looks like a significant amount of overhead is from the vcd crate allocating a vector for each vector value change, so I'll try to change that and see how much it speeds up.
<lsneff> Yep, that nearly cut the time in half
<awygle> that looks interesting
<awygle> what are you working on lsneff?
<whitequark> lsneff: pretty nice
<whitequark> do you have a format spec yet? i might be able to emit it directly from cxxrtl
<lsneff> awygle: I'm working on a gtkwave alternative
<awygle> oh _excellent_
<lsneff> whitequark: Not a spec yet, but it's pretty clean. Three mmapped files, one contains a bit of data about each variable and is written to for each command on that variable, the second contains a list of varint timestamp deltas, the third contains linked lists for each variable interspersed with each other.
<lsneff> Will try to come up with a spec soon
<whitequark> lsneff: oh, so that's an internal thing
<whitequark> (it would be quite impractical to generate three files from cxxrtl, so i don't think i will)
<whitequark> (for example that means i cannot stream the waveform dump over the network)
<whitequark> (not easily anyway)
<lsneff> Ah, I see what you mean, makes sense
<lsneff> Yeah, that's an internal thing.
<whitequark> right, i don't really need a spec for that :)
<whitequark> as long as it's fast
* awygle replays the FST discussion in his head rather than cluttering the channel
<lsneff> Hmm, I'll come up with something that could replace vcd.
<whitequark> awygle: you have to link to gtkwave in practice
<awygle> yeah i know
<whitequark> e.g. verilator can dump fst, using gtkwave's lib
<whitequark> right ok
<awygle> until/unless someone reimplements it with better docs and stuff
<awygle> at which point they'll probably uncover Horrors
<tpw_rules> so i was thinking earlier today how on earth to manipulate this 170GB bus trace which could be easily cast to VCD... lsneff is your stuff easily programmatically accessible? could i do fast searches with context? can i compress the files?
<lsneff> Not yet, I haven't finished implementing the query format yet.
<whitequark> tpw_rules: what's the bus?
<tpw_rules> it's from an snes emulator. all in ASCII. sample line: 0083ff sta $4200 [004200] A:0001 X:0000 Y:0000 S:01ff D:0000 DB:00 nv1BdIzc V: 0 H: 262
<whitequark> o!
<tpw_rules> i want to answer questions like "what are the 10 instructions following a store to memory location X? or execution of instruction Y? or on line 35?". i would be grateful if one of you knew something to manipulate it. i'm fine translating it to another format. but it's 170GB uncompressed and very painful to work with
<tpw_rules> (answer programmatically)
<lsneff> Support for things like that would be useful for myself as well.
<whitequark> galaxy brain: this is what eBPF was invented for
<lsneff> local supercluster brain: this is what wasm was invented for
<whitequark> you will probably have underwhelming performance with wasm
<lsneff> oh certainly lmao
<whitequark> it's fast, but this is also a use case where even 1% faster can be noticeable
<lsneff> Yeah, the issue with it is no outside memory access
<whitequark> now that wasm64 is a thing you could probably map all 170 GB
<whitequark> tpw_rules: have you considered converting those lines to basically a struct, and writing a small c or c++ program to search?
<whitequark> that seems like a good starting point, which you could improve by working on the parts that are painful
<tpw_rules> yeah that's what i was going to do
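The "convert those lines to basically a struct" idea can be sketched as follows. This is a minimal illustration in Python rather than C or C++, and it assumes every trace line has exactly the shape of the sample line quoted above; the names `TraceRecord`, `RECORD`, `parse_line`, `pack` and `unpack` are hypothetical, and the textual operand (`$4200`) is dropped because the effective address already carries it.

```python
import re
import struct
from typing import NamedTuple

# Assumption: every line looks like the sample from the log, e.g.
# 0083ff sta $4200 [004200] A:0001 X:0000 Y:0000 S:01ff D:0000 DB:00 nv1BdIzc V: 0 H: 262
LINE_RE = re.compile(
    r"(?P<pc>[0-9a-f]{6})\s+(?P<op>[a-z]{3})\s+\S+\s+\[(?P<ea>[0-9a-f]{6})\]"
    r"\s+A:(?P<a>[0-9a-f]{4})\s+X:(?P<x>[0-9a-f]{4})\s+Y:(?P<y>[0-9a-f]{4})"
    r"\s+S:(?P<s>[0-9a-f]{4})\s+D:(?P<d>[0-9a-f]{4})\s+DB:(?P<db>[0-9a-f]{2})"
    r"\s+(?P<flags>\S{8})\s+V:\s*(?P<v>\d+)\s+H:\s*(?P<h>\d+)"
)


class TraceRecord(NamedTuple):
    pc: int       # program counter (24 bit)
    ea: int       # effective address in brackets (24 bit)
    a: int
    x: int
    y: int
    s: int
    d: int
    db: int
    op: bytes     # three-letter mnemonic
    flags: bytes  # flag string kept verbatim, e.g. b"nv1BdIzc"
    v: int        # vertical counter
    h: int        # horizontal counter


# Fixed-size little-endian layout; field order matches TraceRecord.
RECORD = struct.Struct("<II5HB3s8sHH")


def parse_line(line: str) -> TraceRecord:
    """Turn one ASCII trace line into a TraceRecord."""
    m = LINE_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unrecognised trace line: {line!r}")
    return TraceRecord(
        pc=int(m["pc"], 16), ea=int(m["ea"], 16),
        a=int(m["a"], 16), x=int(m["x"], 16), y=int(m["y"], 16),
        s=int(m["s"], 16), d=int(m["d"], 16), db=int(m["db"], 16),
        op=m["op"].encode("ascii"), flags=m["flags"].encode("ascii"),
        v=int(m["v"]), h=int(m["h"]),
    )


def pack(rec: TraceRecord) -> bytes:
    """Serialise a record to RECORD.size bytes."""
    return RECORD.pack(*rec)


def unpack(buf: bytes) -> TraceRecord:
    """Inverse of pack()."""
    return TraceRecord(*RECORD.unpack(buf))
```

With fixed-size records, record N sits at byte offset `N * RECORD.size`, so "the 10 instructions following line 35" becomes a seek and a short read instead of a scan.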
<tpw_rules> but i think some sort of index format would be a huge benefit. avoid iterating over every single thing every time
<whitequark> ... stuff it into postgres?
<whitequark> i guess postgres is not good with time series data
<whitequark> and neither is influx, not with *this* kind of time series data
<whitequark> hmm
<whitequark> maybe numba?
<tpw_rules> i thought about that too but the sticking point is i don't want to work with it uncompressed
<lsneff> Is querying of this sort a feature this program should have?
<whitequark> lsneff: i would say it is not a part of MVP
<tpw_rules> no it's not
<lsneff> 👍
<whitequark> i would really, really, really like to have something that is less excruciating to use than gtkwave, first
<tpw_rules> my current idea is to divide it into chunks (e.g. 60 emulated frames) and sort by low 8 bits of address or something. then i can run a program over several thousand files in parallel that's just a binary search. i guess linear is fine if i plan on decompressing it every time now that i think about it
<tpw_rules> maybe compression is unrealistic
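A rough sketch of the per-chunk idea described above: sort by the low 8 bits of the address, then binary-search. It assumes one chunk already fits in memory as a list of addresses in trace order; `build_index`, `lookup` and `context` are hypothetical names, not part of any existing tool.

```python
from bisect import bisect_left, bisect_right


def build_index(addresses):
    """Return (low byte of address, position) pairs for one chunk, sorted."""
    return sorted((addr & 0xFF, pos) for pos, addr in enumerate(addresses))


def lookup(index, addresses, target):
    """Positions in the chunk whose address equals target, in trace order."""
    key = target & 0xFF
    # Every real position is >= 0 and < len(addresses), so these two
    # sentinels bracket all entries that share the low byte.
    lo = bisect_left(index, (key, -1))
    hi = bisect_right(index, (key, len(addresses)))
    # The low byte only narrows the range; confirm the full address.
    return [pos for _, pos in index[lo:hi] if addresses[pos] == target]


def context(records, pos, n=10):
    """The n records that follow position pos in the original trace order."""
    return records[pos + 1 : pos + 1 + n]
```

Because the index stores positions rather than the records themselves, the chunk stays in trace order and the "next 10 instructions" question is a plain slice.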
<lsneff> whitequark: you can write out the metadata for each variable to the file/network-stream before running the simulation, yes?
<lsneff> You're alright if it contains variably sized data?
emeb_mac has quit [Quit: Leaving.]
Sid___ has joined #nmigen
<whitequark> lsneff: re metadata: yes, and i can easily write it out sorted in some way (e.g. tree preorder with children in lexical order or sth)
<whitequark> i have a few times wanted to *not* have that restriction, but it might well not be worth the hassle, so let's ignore that
<whitequark> re variably sized data: assuming it's something like uleb128, totally on board
<lsneff> Yep, varint/leb128 and the variably-sized value changes
<whitequark> perfect
<whitequark> oh yeah, make sure it understands signedness (which you probably already do to pick the ?leb128 variant) and supports strings (arbitrary width *per sample*) and enums
<whitequark> for enums, both the symbolic and the integral value
<whitequark> also, please make it possible to have multiple variables comprising a single source-level signal
<whitequark> e.g. one variable for a[15:0], another for a[23:16]
<whitequark> i think that's the bare minimum i need for decent user experience
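For reference, the unsigned and signed LEB128 variants mentioned above can be sketched like this. The function names are illustrative only and say nothing about how lsneff's format actually lays out its bytes.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if value < 0:
        raise ValueError("uleb128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(data: bytes, offset: int = 0):
    """Return (value, offset just past the encoded value)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, offset


def encode_sleb128(value: int) -> bytes:
    """Encode a signed integer; bit 6 of the last byte carries the sign."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # Python's >> is an arithmetic shift for negative ints
        sign_bit = byte & 0x40
        if (value == 0 and not sign_bit) or (value == -1 and sign_bit):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def decode_sleb128(data: bytes, offset: int = 0):
    """Return (value, offset just past the encoded value)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:          # sign-extend from the last payload bit
                result -= 1 << shift
            return result, offset
```

Returning the new offset from the decoders makes it straightforward to walk a buffer of back-to-back values, such as a list of timestamp deltas.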
<falteckz> Sorry to interject with a different topic, but is it best practice to create a Signal in a Submodule and then wire that up at top level, or instead pass the signals to be wired into the Submodule constructor?
<falteckz> i.e. The concerns of wiring up the signals (in this case, likely combinatorial domain), is this the responsibility of Top/Parent Module or Submodule?
<whitequark> falteckz: there is no strict best practice here; if you do the former, nmigen will create such intermediate signals in your output netlist
<whitequark> in the future, this might change (without breaking your code), but for now, do whatever you feel is more clear
<falteckz> Presumably this has no effect on PnR, is there a benefit to simulations to have the extra net?
<whitequark> probably makes simulations slightly slower
<whitequark> well, definitely makes them slower, but probably not to the extent that you should care about it a lot
<falteckz> Makes things slower in the general "premature optimization concerns" sense.
<whitequark> yes
<whitequark> pysim achieves a fragile balance of fast startup and fast (to the extent cpython can be) simulation
<whitequark> trying to make pysim more clever for the latter tends to severely penalize the former
<whitequark> right now it's in a sort of local optimum
<lsneff> Hmm, didn't think about having multiple variables connected to a single signal
<whitequark> and that's why teaching it about signals that just alias other signals might not be very useful
<whitequark> lsneff: oh yeah that's absolutely critical
<whitequark> in sram_soc.vcd you have 5 times as many aliases as actual variables
<lsneff> So, that would be a single value change for multiple variables
<whitequark> it should actually be more like 6 times but i had a bug
<whitequark> yeah
<lsneff> Alright,
<lsneff> Does vcd support that?
<whitequark> yes
<whitequark> and the file i sent you makes intense use of it
<lsneff> How are aliased variables indicated in a vcd?
<whitequark> they use the same identifier
<whitequark> the !@#$% thing
<lsneff> Ah, I see, but the range within that variable isn't specified?
<whitequark> VCD can only represent exact aliases
<whitequark> well
<whitequark> you can do it the other way around though
<whitequark> you could have both `a` and `b[15:0]` use identifier !, and then `b[23:16]` use identifier "
<whitequark> this also maps nicely to how cxxrtl works. names are secondary, variable data is primary
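The aliasing scheme just described (`a` and `b[15:0]` sharing identifier `!`, with `b[23:16]` on identifier `"`) can be illustrated with a tiny VCD writer. This is a sketch of the standard `$var` declaration syntax only; `vcd_header` and `vcd_vector_change` are hypothetical helpers, not part of cxxrtl or nMigen.

```python
def vcd_header(variables, timescale="1 ns", scope="top"):
    """Build a VCD header from (reference, width, identifier) tuples.

    Variables that share an identifier code are aliases of the same data.
    """
    lines = [f"$timescale {timescale} $end", f"$scope module {scope} $end"]
    for reference, width, ident in variables:
        lines.append(f"$var wire {width} {ident} {reference} $end")
    lines += ["$upscope $end", "$enddefinitions $end"]
    return "\n".join(lines) + "\n"


def vcd_vector_change(value, width, ident):
    """One vector value change; it applies to every alias of ident."""
    return f"b{value:0{width}b} {ident}"


header = vcd_header([
    ("a", 16, "!"),           # a and b[15:0] share identifier !
    ("b [15:0]", 16, "!"),
    ("b [23:16]", 8, '"'),    # the upper byte of b gets its own identifier
])
dump = header + "#0\n" + vcd_vector_change(0xBEEF, 16, "!") + "\n"
```

A single `b... !` line in the dump updates both `a` and the low 16 bits of `b`, which is why a file with many aliases does not grow with the number of names.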
<lsneff> Okay
<whitequark> cxxrtl is pretty happy splitting variables to have multiple data chunks if necessary, this can speed up netlists in some cases quite a bit
<lsneff> Would having metadata for signals, and then the tree that has variables that can map to a range within a signal work for you?
<lsneff> So, true aliases basically
<falteckz> Does nmigen have a standard library with common stuff? I'm slowly working through my WS2812 driver, I wonder if there is an existing object that offers a synchronized interface like the ready/ack I'm trying to put together? https://i.imgur.com/lRXwbbK.png
<whitequark> lsneff: sure, a rangelist-to-rangelist relationship between source-level names and dump-level variables would work great for me
<whitequark> falteckz: it does (nmigen.lib), but the specific thing you want will be introduced in the next release
<whitequark> for now, roll your own
<falteckz> Thank you, will do that.
jeanthom has joined #nmigen
bvernoux has joined #nmigen
<lkcl> falteckz: this was the first thing that our team did, it's called nmutil
<lkcl> i developed a library similar to chisel3's iolib
<falteckz> Oh very nice!
<lkcl> there's a ton of examples:
<lkcl> https://git.libre-soc.org/?p=nmutil.git;a=blob;f=src/nmutil/test/test_buf_pipe.py;hb=HEAD
<lkcl> the "simple" ones have no buffering, the more complex ones use Synchronous FIFOs
<lkcl> with combinatorial bypass so there's no delay (but you have to watch out, obviously)
<lkcl> it is... with a bit of messing about... possible to use the same "StageAPI" for FSMs.
<lkcl> as in: the same ready/valid signalling protocol on both input and output can be used - without that protocol "bleeding into" the actual "Stage" if you know what i mean - can be used to create an FSM-version *and* a pipeline version of the same functionality
<lkcl> so, here's a FSM example:
<lkcl> https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/experiment/alu_fsm.py;hb=HEAD
<lkcl> and here's a pipelined example:
<lkcl> https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/experiment/alu_hier.py;hb=HEAD
<falteckz> Ah sure - so an FSM that holds on bus signalling
<falteckz> That's essentially something I wanted to do too, an FSM for the WS2812 that starts on one signal, and asserts a done signal at the final FSM state - which looks similar to your ALU
<lkcl> via the same ready/valid signalling that's used to create a pipeline from the same combinatorial blocks, yes
<falteckz> A most unfortunate thing with nmigen is that I seem to have a cognitive inability to properly visualize how my code will be synthesized. It's just not something I've mentally grasped yet. So I find that I get "writer's block" a lot in nmigen
<lkcl> that's easily solved by using yosys "show {insert module}"
<falteckz> Something that can't be fixed with an iolib obviously, but the less hurdles the better
<whitequark> now that is one thing lkcl can help you with :D
<lkcl> :)
<lkcl> for about 8 months i ran yosys "read_ilang {insert filename}" after a build, followed by "show top" or "show {insert modulename}" after literally every single build
<falteckz> ilang takes *.il ?
<whitequark> personally i'm unconvinced it's essential, but if you do have that problem, it is very easily solved
<lkcl> for exactly the same reason: i am used to gate-level design
<whitequark> yeah, ilang takes *.il
<whitequark> the command is called read_rtlil now
<whitequark> (yosys no longer has the nonsensical rtlil/ilang distinction, it is always "rtlil" now)
<whitequark> (read_ilang still works though)
<falteckz> Beyond theoretical which was read about, my applied introduction to "gate-ware" started, without shame, in Minecraft as Redstone - I then moved to logic simulators like Logisim and Digital - and then a small amount of Verilog
<lkcl> yeah 18 months later i have sort-of developed the ability to visualise the 1D lines of source code as a 2D graph however it remains tenuous
<falteckz> It was very dirty and low level from the beginning
<whitequark> perhaps it is my background in compilers that makes it a lot easier to figure out what my code translates to, or something else, but i've never cared much for `show`. however for those people who do need it, it certainly can be helpful
<whitequark> (`show` is also pretty hard to use on large netlists)
<whitequark> for the docs, i am going to use inline netlistsvg diagrams to teach some of this intuition
<whitequark> (which are a lot easier to read than the graphs made by `show`)
<lkcl> whitequark: i "fixed" that by always splitting code down into smaller modules, and, where the expressions end up being repeated, made sure to place them explicitly into temporary signals
<lkcl> nice!
<falteckz> `ERROR: No such command: read_rtlil` darn
<whitequark> yosys too old
<falteckz> Looks like I'm outdated -.-
<lkcl> ah using the javascript plugin that mithro found?
<whitequark> do you have 0.9 or something?
<falteckz> Yosys 0.9+2406 (open-tool-forge build) (git sha1 09ecb9b2, x86_64-w64-mingw32-g++ 10.1.0 -Os)
<lkcl> falteckz: it'll be called read_ilang in the version you're using
<whitequark> i recommend updating anyways
<lkcl> you also need to make sure you install graphviz
<lkcl> falteckz: what whitequark said :)
<falteckz> The downside with updating is that my currently working system might stop working Hahaha
<falteckz> I'm running Windows
<whitequark> open-tool-forge builds should be pretty robust
<whitequark> you can also use yowasp if you want
<falteckz> Righto, will hit the update
<whitequark> it's a bit of a different approach, packaging yosys as a python package
<falteckz> There are size limits for that, yes?
<falteckz> If I recall, yowasp was reduced down, some features stripped?
<whitequark> yowasp did not have any features stripped other than those which would not build
<whitequark> ... ah, right, `show` is one of those
<whitequark> ignore me and proceed with open-tool-forge builds
<falteckz> Affirm, Willco
<whitequark> i might actually make `show` work in yowasp soon, but it requires work
<falteckz> I assume there is no update and I just trample open forge with a newer one
<lkcl> whitequark: the other reason i followed an early strict rule (for about a year) of using graphviz show was because if it took more than about... 10 seconds for a given page to render, i took that as a sign that the design wasn't modular enough
<lkcl> split it...
<whitequark> it's easy to end up with reasonable netlists that have pathological renders in yosys
<lkcl> and readability of the source code improved greatly as a result
<whitequark> it can work for some, but like i mentioned, i am very hesitant to call it essential in general
<lkcl> yehyeh no absolutely. everyone's different
<falteckz> As someone that can mentally visualize monoliths of software, I gotta say it's very frustrating to not be able to conceptualize a single module without trying to get GraphViz in Windows haha
<lkcl> :)
<whitequark> falteckz: i think you'll get the hang of it eventually if you can visualize software
<falteckz> Recursion all the way down
<whitequark> in some ways, nmigen-to-rtl correspondence tends to be a lot more straightforward than something like c++-to-assembly correspondence
<lkcl> falteckz: you're a terry pratchett fan? :)
<whitequark> in most ways really.
<falteckz> lkcl: My partner is, I don't have the attention span for a book without a lot of meds
<lkcl> whitequark: yeah i'm really relieved that the translation to ilang is readable
<whitequark> that too but i mean post-synthesis rtl, or at least post-coarse-synthesis
<lkcl> falteckz: lol i'm also losing the ability to focus on 2-hour-long films, but really good books i'm happy to read straight
<falteckz> lkcl: People are most curious :D
<falteckz> Whose lazy idea was it to paste open-forge into Git!
<whitequark> oh dear
<whitequark> that is outstandingly cursed
<whitequark> i don't have the slightest idea how you would end up with that lmao
* lkcl likes mingw32. but for aaaall the wrong reasons :)
<whitequark> see, this is the kind of stuff that drove me to make yowasp
<falteckz> I was probably desperate to get an LED blinking in 30 minutes, and cut corners with a sledge hammer
<whitequark> at least you already have to deal with python package management whether you like it or not (let's face it, you probably don't like it)
<falteckz> :O
<falteckz> How did you know that I do not not not like Python package management
<whitequark> i've met probably one person total who likes it
<falteckz> They wrote it and were very proud?
<lkcl> whitequark: yowser
<whitequark> to PSF's credit, they have improved it greatly this year and continue improving
<whitequark> but... it really does have a reputation
<whitequark> no, i think you will find that the people who wrote it are acutely aware of its issues
<whitequark> well, maintain it
<falteckz> At workplace <unnamed> we are still using Python2 for the foreseeable future, so I won't be seeing any refreshing revisions to package management soon
<lkcl> the "gold standard" is debian packaging. unfortunately, it's such an extremely high standard that many people go "surely it doesn't have to be that complicated" and you end up with pip3, npm, etc. etc. as a result
<whitequark> i think the new pip resolver might actually be available on 2.7?
<falteckz> I like npm for 3 reasons
<whitequark> the debian packaging of python in particular is so atrocious that PSF is seeking feedback to really drive the point in
<whitequark> of how broken it is so that maybe finally something will change
<Sarayan> gold standard my ass, making debian packages is a fucking pain in the ass
<falteckz> 1. It's local by default, 2. It does tree shaking, 3. global works how you expect
<whitequark> the amount of time i personally wasted because of debian packaging issues is beyond any reason
<falteckz> Isn't debian package management awfully slow?
<Sarayan> I'm made debian, rpm, gentoo and arch packages, and the debian ones are the most annoying by a fucking lot
<Sarayan> I've
<falteckz> You are what you make, Sarayan
<lkcl> Sarayan: yes, it's such a high standard that it's viewed as "such a pain" that the problems it solves are disregarded
<lkcl> it's very unfortunate
<falteckz> Honestly - if Debian package management is the gold standard, and let's say I believe that. How hopeless we all are. But Debian packages have burnt me just as often as going well
<whitequark> falteckz: if you want something vaguely npm-like, you could use virtualenv or poetry
<Sarayan> I'm not convinced the package building is more expressive in the debian case
<whitequark> personally i would recommend sticking with pip and virtualenv, though some people swear by poetry
<falteckz> I don't love Python enough to learn another way to fix packages
<falteckz> Wow - that sounded so close minded.
<falteckz> I'll look into Poetry
<whitequark> if you want to spend the absolute minimum of time, i recommend virtualenv
<falteckz> VSCode makes venv not so bad too
<whitequark> `python3 -m venv env; source env/activate`
* lkcl wants to know what poetry is, too... looking it up..
<whitequark> that's pretty much all you need
<whitequark> er, it's env/bin/activate i think
<whitequark> ah sorry, windows. `env/bin/activate.bat` then
<d1b2> <tizilogic> it's actually env/Scripts/activate.bat on windows for some reason...
<whitequark> oh oops
<whitequark> the linux version doesn't write the bat file and i didn't have a windows vm
<whitequark> well, not one that was running
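To summarise the path confusion above: a virtualenv created with `python3 -m venv env` keeps its activation script under `bin/` on POSIX systems and under `Scripts/` on Windows. A small sketch, where `activate_script` is a hypothetical helper name:

```python
import os
from pathlib import PurePosixPath, PureWindowsPath


def activate_script(env_dir, os_name=os.name):
    """Path of the activation script inside a venv created at env_dir."""
    if os_name == "nt":
        # Windows: scripts live in Scripts\, cmd.exe uses activate.bat
        return PureWindowsPath(env_dir) / "Scripts" / "activate.bat"
    # POSIX shells source bin/activate
    return PurePosixPath(env_dir) / "bin" / "activate"
```

On Linux that means `source env/bin/activate`; in a Windows command prompt, `env\Scripts\activate.bat`.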
<falteckz> I don't write code on Windows
<falteckz> I code in Linux - I nmigen in Windows cause Work Laptop v. Gaming Desktop at home that also does FPGA Synth
<lkcl> yeah virtualenv is nice for isolating development cleanly. multiple projects, or even just if you have a local (stock, distro) installation of python and don't want it "messed up"
<d1b2> <tizilogic> I try to avoid it too.. but sometimes one must
<whitequark> eh, my job is to tell you about virtualenv, it's up to you whether to bother with it or not
<whitequark> personally i live without virtualenvs most of the time
<whitequark> mostly because my python needs are quite simple
<lkcl> falteckz: ah interesting setup. then you might want to consider installing yosys on the linux side, not for doing FPGA compilation but just to do the yosys "show"
<d1b2> <tizilogic> I use them all the time for exactly that reason..
<falteckz> Apparently I'm already using a .venv in Windows :shrug:
<falteckz> lkcl, yosys show is working in Windows - but I'm not sure how to render it yet
<lkcl> whitequark: yeah i get away with not using virtualenv. for the coriolis2 layout, Jean-Paul and Staf taught me that you need the *entire* P&R virtualised (repeatable builds) for ASIC development
<whitequark> sometimes i feel like it would be easier to stuff nmigen and the entire toolchain into a browser
<whitequark> graphviz too
<lkcl> ooo god please don't put nmigen in a web browser! i've been down that rabbit-hole :)
<whitequark> if webassembly is not solving your problems, compile more things to webassembly
<lkcl> graphviz - good call. nmigen? urrrr :)
<falteckz> I mean a browser has everything you need, USB access, WASM, a very good accelerated renderer, full multi-platform support, and lastly it has already reserved all of your RAM.
<whitequark> lol
<whitequark> yowasp already does run in the browser
<whitequark> ... sort of
* lkcl is enjoying the conversation but needs to get up, walk around and get something to drink
<lkcl> apologies - later folks
<falteckz> peace
<Sarayan> Perhaps I should run fpga sims in the browser ;-)
<whitequark> well, if we port not only yowasp but also clang to wasm, you could build cxxrtl simulations to wasm locally in the browser and then load them
<Sarayan> intriguing
<whitequark> this sounds like a bad joke but it is entirely technically feasible and people do in fact use something like this in production already
<whitequark> it is also cursed beyond belief but it does work
<Sarayan> I was thinking from bitstream, but yeah, there's that too
<whitequark> let's put it that way, once it leaves nextpnr, it is not my responsibility :P
lkcl has quit [Ping timeout: 240 seconds]
<Sarayan> looks like for cyclone v it's mine :-)
<Sarayan> dunno if you noticed, we've started releasing stuff publicly
<whitequark> i've seen a bit
<Sarayan> I need to finish mapping the logic blocks connections, then I'll go on timings
<Sarayan> I suspect there's some very interesting work to do in synthesis. Did you know that the clock line drivers have an enable input? So if you have a clock+enable combo used a lot you can turn it into a simple clock and avoid routing the enable to the individual ffs
<whitequark> huh
<whitequark> are FFs enable-over-reset or the other way round?
<Sarayan> I'm not understanding the question
<whitequark> suppose you have a DFF with enable and reset inputs
<whitequark> if enable is 0 and reset is 1, is the FF reset?
<whitequark> (sync reset)
<Sarayan> there's a sync reset and an async reset
<whitequark> sync reset specifically yes
<Sarayan> I see the question, not sure of the answer though
<Sarayan> maybe it's in the docs, or otherwise another thing to test
<whitequark> that's kinda fundamental to the logic block heh
<Sarayan> They represent the sync reset as replacing the input though, so pretty sure no reset when enable=0
<whitequark> ok, that makes sense then re: enable inputs to clock line drivers
<whitequark> neat
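Why the enable/reset priority matters for moving the enable onto the clock driver can be shown with a small behavioural model. This is a plain-Python sketch of a single flip-flop's next-state function, not a description of the actual Cyclone V logic block; all function names are made up for illustration.

```python
from itertools import product


def next_q_enable_over_reset(q, d, enable, reset, reset_value=0):
    """Sync reset replaces the D input, so it needs enable=1 to take effect."""
    if not enable:
        return q
    return reset_value if reset else d


def next_q_reset_over_enable(q, d, enable, reset, reset_value=0):
    """Sync reset wins even when enable=0."""
    if reset:
        return reset_value
    return d if enable else q


def next_q_gated_clock(q, d, enable, reset, reset_value=0):
    """FF without an enable pin whose clock is gated by enable."""
    if not enable:
        return q  # no clock edge arrives, so nothing can change
    return reset_value if reset else d


def equivalent(f, g):
    """Exhaustively compare two next-state functions on 1-bit inputs."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=4))
```

In this model, gating the clock matches the enable-over-reset flip-flop for every input combination, while the reset-over-enable one differs when enable is 0 and reset is 1: a gated clock can never deliver that reset.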
lkcl has joined #nmigen
nfbraun has joined #nmigen
<Sarayan> there are 82 to 104 clock lines fwiw, so it's a real resource
<whitequark> oh yeah i believe that
<whitequark> fpga vendors do weird stuff with clock buffers
<daveshah> For some reason despite almost every FPGA having resources like that, vendor tools rarely if ever push enables onto them
<daveshah> UltraScale+ had enables at leaf level (group of about 30 tiles) and it still never seems to be used
<Sarayan> nmigen really pushes enables on clocks though :-)
<falteckz> In the generated graphs from yosys, I see blocks with naming `PROC $group_1 $group_1`
<falteckz> any idea what these are to represent?
<whitequark> processes (decision trees)
<whitequark> run `proc` first
<whitequark> daveshah: how realistic do you think it is to add (e.g. for me, or someone i'll pay) db compression to ice40 in nextpnr?
<daveshah> It depends how you do it
<whitequark> yowasp nextpnr-ice40 databases are so massive i will have to make some unpleasant decisions in glasgow
<whitequark> i mean... i'd like to do it in a way that is realistic :D
<daveshah> If you don't mind a bit of startup cost, just using a generic compression algorithm would be easiest
<daveshah> Very long term, nextpnr will probably move to a standardised deduplicated database for all arches but I can't put a date on that
<whitequark> can i hack something for ice40 specifically the way you did for ecp5?
<whitequark> going from 20MB for 8k to 40MB for 8k+5k is not very friendly
<daveshah> Yeah, although that would be quite a bit more work
<whitequark> and that's already compressed
<whitequark> (in a wheel, so it's just deflate)
<Sarayan> wa: Lofty gets rather good results with lzma for cyclonev
<Lofty> It's very regular text to be fair
<whitequark> nextpnr databases are bbas though
<whitequark> iirc i tried 7z and got little additional benefit
<mithro> whitequark: We have a lot of interest in better db representation
<whitequark> not justifying the overhead of actually using lzma in the toolchain
<whitequark> mithro: i believe that, but for once, i just want an imperfect solution now
<whitequark> i am sure i could sink half a year into improved nextpnr database format, but i also don't have that time
<Sarayan> Lofty: the ones you're compressing and not text anymore, they're binary
<whitequark> daveshah: can you explain at a high level how it works in ecp5? just so i can see roughly how much work it is
<Sarayan> are
<daveshah> whitequark: for ECP5 basically it looks for grid locations that are identical based on relative coordinates and stores them only once, with a horrible workaround for globals (basically means globals are not actually connected properly in the routing graph)
<daveshah> something similar but with better handling of globals would get some improvement, but for iCE40 you have a problem that a lot of the device is in the 'edge case' category
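A minimal sketch of the relative-coordinate deduplication daveshah describes, on a made-up toy fabric where every tile connects only to its four neighbours. The functions `tile_signature` and `deduplicate` are hypothetical names for illustration, not nextpnr code. It also shows the 'edge case' effect: border and corner tiles each need their own entry, so a small device with a proportionally large border deduplicates less well.

```python
def tile_signature(x, y, width, height):
    """Describe a tile's outgoing wires as relative (dx, dy) offsets.

    Hypothetical toy fabric: each tile connects to its four neighbours,
    except where the neighbour would fall outside the grid.
    """
    offsets = []
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < width and 0 <= y + dy < height:
            offsets.append((dx, dy))
    return tuple(offsets)


def deduplicate(width, height):
    """Store each distinct signature once and map every location to it."""
    shapes = []             # unique tile descriptions, stored only once
    index_of = {}           # signature -> position in `shapes`
    location_to_shape = {}  # (x, y) -> index into `shapes`
    for y in range(height):
        for x in range(width):
            sig = tile_signature(x, y, width, height)
            if sig not in index_of:
                index_of[sig] = len(shapes)
                shapes.append(sig)
            location_to_shape[(x, y)] = index_of[sig]
    return shapes, location_to_shape


shapes, location_to_shape = deduplicate(16, 16)
interior = [loc for loc in location_to_shape
            if 0 < loc[0] < 15 and 0 < loc[1] < 15]
print(len(location_to_shape), "locations,", len(shapes), "unique tile shapes")
print(len(interior), "interior locations share a single shape")
```

Because the signature uses offsets rather than absolute positions, all interior tiles collapse to one entry; only the border tiles stay distinct.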
<mithro> whitequark: FYI - a group at BYU who have very compact representations for Xilinx parts is working with us to help explore the space that we hope to be useable by nextpnr in the future. Probably doesn't help you short term.
<daveshah> Yeah sadly I don't think there is a good short term solution here
<daveshah> but there is low hanging fruit more generally, like not storing wire names as complete strings but building them programmatically
<whitequark> i'd expect that to be handled very well by even deflate
<whitequark> daveshah: do you think there is any gain in reformulating the database to store deltas rather than absolute indexes?
<whitequark> i think that might expose more redundancy that compressors can exploit, but i'm not sure
<daveshah> yeah maybe
<daveshah> tbh half the problem here is iCE40 doesn't just have the routing graph, it also has the bits associated with each pip in a non-deduplicated form
<daveshah> this makes the db bigger and moving to a deduplicated db more work
<whitequark> i see
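A rough way to probe the delta-versus-absolute question raised above, using only the Python standard library. The index layout here is synthetic and entirely an assumption (a fixed set of offsets repeated at a fixed stride per tile); a real nextpnr bba database is far less regular, so this only illustrates the mechanism and is not a measurement.

```python
import struct
import zlib
from itertools import accumulate


def synthetic_indexes(tiles=4000, stride=100, offsets=(3, 7, 12, 40)):
    """Made-up absolute indexes: the same offsets repeated for every tile."""
    return [t * stride + o for t in range(tiles) for o in offsets]


def to_deltas(values):
    """Replace each value with its difference from the previous one."""
    prev = 0
    deltas = []
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas


def from_deltas(deltas):
    """Inverse of to_deltas: a running sum restores the absolute values."""
    return list(accumulate(deltas))


def pack(values):
    """Serialise as little-endian signed 32-bit integers."""
    return struct.pack("<%di" % len(values), *values)


absolute = synthetic_indexes()
deltas = to_deltas(absolute)

raw_size = len(pack(absolute))
absolute_size = len(zlib.compress(pack(absolute), 9))
delta_size = len(zlib.compress(pack(deltas), 9))

print("raw bytes:          ", raw_size)
print("deflate, absolute:  ", absolute_size)
print("deflate, deltas:    ", delta_size)
```

On this regular input the deltas form a short repeating pattern that deflate can match, while the absolute values never repeat. Whether real databases behave similarly would have to be tested on the actual files.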
Bertl_zZ is now known as Bertl
<Sarayan> from a 640M source that's nice
<Sarayan> gah wrong channel
<_whitenotifier> [YoWASP/nextpnr] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLSjc
<_whitenotifier> [YoWASP/nextpnr] whitequark 40f39a5 - Update wasi-sdk.
<Sarayan> something that's going to be interesting with the cyclones is that some decisions are probably going to be solved by converting part of the logic blocks to routing graph
<Sarayan> +s
<Sarayan> like which pll counter to use, or which clock mux
Bertl is now known as Bertl_oO
yehowshua has joined #nmigen
<yehowshua> anybody interested or working on an alternative VCD viewer to GTKWave?
<whitequark> lsneff: ^
<yehowshua> oh cool! just read through scroll back
<yehowshua> @lsn
<yehowshua> lsneff, I'm assuming you haven't made the source public yet
<yehowshua> Here's a Rust VCD backend https://github.com/psurply/dwfv : not sure how fast it is
<yehowshua> for larger files
<yehowshua> https://github.com/hecrj/iced could also be useful for the GUI frontend. It's a WebGPU cross-platform UI builder that's quite fast, pretty, and tiny (3MB static, I think)
<yehowshua> It's also written in Rust
<whitequark> dwfv is not very fast
<whitequark> though it's not too bad either
FL4SHK has quit [Ping timeout: 246 seconds]
FL4SHK has joined #nmigen
emeb has joined #nmigen
<yehowshua> ah - hadn't really used it yet
yehowshua has quit [Remote host closed the connection]
FFY00 has quit [Remote host closed the connection]
FFY00 has joined #nmigen
<lsneff> What an interesting coincidence
FFY00 has quit [Remote host closed the connection]
FFY00 has joined #nmigen
chipmuenk has joined #nmigen
FFY00 has quit [Remote host closed the connection]
chipmuenk has quit [Quit: chipmuenk]
chipmuenk has joined #nmigen
korken89 has joined #nmigen
SpaceCoaster_ has joined #nmigen
SpaceCoaster has quit [Ping timeout: 272 seconds]
<korken89> Hey kbeckmann I quite like your cli and Applets approach to structuring a project! I'm thinking I'll use it in my project as well :D
<kbeckmann> korken89: ah glad you like it! i borrowed the ideas from the glasgow project :)
<korken89> Nice!
<korken89> I'm bringing up my FT600 right now and saw you started a driver for it :)
<kbeckmann> oh sweet
<kbeckmann> beware that it's probably not very good, the one i wrote
<kbeckmann> but there are a bunch of implementations floating around
<korken89> Cool, I'll search around!
<kbeckmann> oMigen but might be a good inspiration anyway https://github.com/enjoy-digital/pcie_screamer/blob/master/gateware/ft601.py
<korken89> Oh, interesting approach to the control signals
<korken89> Makes sense, does not need the extra clock of delay then it seems when controlling them
<korken89> Oh, that was a tiny implementation
<korken89> Ahh, it only sends, never receives
<anuejn> yup
<korken89> I'm also working on a camera, so could be a good start!
<korken89> anuejn: I think you are the author of that module right? It seems it does not have a cache for handling when the TXE indicates the FIFO is full and the written word is rejected, or am I maybe misunderstanding something about your implementation?
<korken89> Oh wait, it seems to be a part of the `Stream` module
<anuejn> I did not really understand your question but the semantics of stream are that a successful transaction only happens when valid and ready are 1
<korken89> Makes sense, I'm thinking about the case where one takes a word from the internal FIFO and places it on the bus when in the same clock the TXE goes high, so the FT60x does not accept the word. But it seems in your case the Stream FIFO handles this case
<korken89> I have to give it a deeper look, seems like a very clean approach!
<korken89> Thanks for linking!
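The valid/ready rule anuejn states can be sketched as a small cycle-by-cycle model in plain Python. This is not the code from the linked module; `simulate` is a hypothetical helper, and treating `ready` as "the FT60x can accept a word" is an assumption about how TXE would be mapped.

```python
def simulate(words, ready_pattern):
    """Model a valid/ready stream, one loop iteration per clock cycle.

    The source asserts valid whenever it has a word pending and keeps
    presenting the same word until a cycle where ready is also 1.
    """
    pending = list(words)
    accepted = []  # (cycle, word) for every completed transfer
    for cycle, ready in enumerate(ready_pattern):
        valid = 1 if pending else 0
        if valid and ready:
            # Transfer completes only when both sides agree.
            accepted.append((cycle, pending.pop(0)))
        # Otherwise the word stays pending and is offered again next cycle.
    return accepted, pending


accepted, leftover = simulate([0xA, 0xB, 0xC], [1, 0, 0, 1, 1, 0])
for cycle, word in accepted:
    print("cycle", cycle, "transferred", hex(word))
print("still pending:", leftover)
```

A word offered in a cycle where the sink is not ready is simply held, which is why no separate cache for rejected words is needed in this model.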
FFY00 has joined #nmigen
<korken89> I have run into a new warning, `Warning: Wire top.$verilog_initial_trigger has an unprocessed 'init' attribute.`, I get this when setting a set of IOs to a constant as `m.d.comb += pseudo_power.ft.eq(Const(-1))`. Is this an issue or just expected?
FFY00 has quit [Remote host closed the connection]
FFY00 has joined #nmigen
nfbraun has quit [Ping timeout: 256 seconds]
nfbraun has joined #nmigen
emeb_mac has joined #nmigen
<korken89> Seems to be working as it should, so the warning is probably fine :)
jeanthom has quit [Ping timeout: 240 seconds]
tannewt has quit [Read error: Connection reset by peer]
tannewt has joined #nmigen
mtk99 has joined #nmigen
mtk99 has quit [Remote host closed the connection]
DaKnig has quit [Ping timeout: 256 seconds]
DaKnig has joined #nmigen
DaKnig has joined #nmigen
jeanthom has joined #nmigen
<_whitenotifier> [nmigen] korken89 opened issue #569: ECP5 blockram silently trucates depth? - https://git.io/JLHeH
chipmuenk has quit [Quit: chipmuenk]
<_whitenotifier> [nmigen] daveshah1 commented on issue #569: ECP5 blockram silently trucates depth? - https://git.io/JLHvL
jeanthom has quit [Ping timeout: 256 seconds]
<_whitenotifier> [nmigen] korken89 commented on issue #569: ECP5 blockram silently trucates depth? - https://git.io/JLHvs
<_whitenotifier> [nmigen] korken89 closed issue #569: ECP5 blockram silently trucates depth? - https://git.io/JLHeH
korken89 has quit [Remote host closed the connection]
bvernoux has quit [Read error: Connection reset by peer]
jeanthom has joined #nmigen
jeanthom has quit [Ping timeout: 246 seconds]
<Chips4Makers> whitequark: Comment on the reset-over-enable for FFs. On ASICs you commonly don't have FFs with synchronous resets, they are typically async. So if you do have synchronous reset in your RTL the reset will just be part of the logic before the FF.
<falteckz> Are there docs for -soc and -boards ?
<falteckz> I know in nmigen there is some indexable Memory construct, but I cannot seem to find the docs for it. I'm starting to suspect it's not part from `nmigen` itself
<falteckz> Or is it that source is docs for now as docs are under development?
<agg> doesn't look like Memory is in the online docs yet, but you can find its source here: https://github.com/nmigen/nmigen/blob/master/nmigen/hdl/mem.py
<falteckz> Thank you agg, so it is in fact part of the `nmigen` library
<agg> from nmigen import Memory; mem = Memory(width=8, depth=1024, init=[1,2,3,4]); read_port = m.submodules.read_port = mem.read_port(transparent=False); write_port = m.submodules.write_port = mem.write_port(); m.d.sync += read_port.addr.eq(read_port.addr + 1); m.d.comb += output.o.eq(read_port.data); that sort of thing
<agg> yea, it's part of nmigen itself
<agg> you can add multiple read/write ports and specify a clock domain for each one, by default 'sync'
<falteckz> Are there any constraints as far as internal hardware goes? i.e. will Memory use FPGA RAM, and if so, is there a read and write port limit?
<agg> nmigen doesn't know, but synthesis results will depend on what you've done
<agg> by default very small memories will probably use LUTs and larger memories use block RAMs, but you can set attributes on the memories to force it one way or the other
<agg> if you have e.g. one write and two read ports, but your hardware only has one of each per BRAM, it can duplicate the BRAM, wire the write ports together, and you end up with two independent read ports
<falteckz> The attributes are platform specific?
<agg> memory.attrs["logic_block"] = 1 or memory.attrs["ram_block"] = 1, I think are quite widely accepted
<agg> (if you really need fine control, you can always instantiate the platform's memory primitive directly, too)
<falteckz> Not sure what I need yet, but good to know the options. I think I'll want to be reading from SPI Flash and buffering into RAM
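The port duplication agg mentions (one write port, two read ports, hardware with one of each per BRAM) can be modelled behaviourally in plain Python. This is only an illustration of the idea, not what any synthesis tool literally emits; `OneWriteOneReadRam` and `TwoReadPortMemory` are hypothetical names.

```python
class OneWriteOneReadRam:
    """Toy model of a block RAM with exactly one write and one read port."""

    def __init__(self, depth, init=()):
        # Cells not covered by `init` start at zero.
        self.cells = list(init) + [0] * (depth - len(init))

    def write(self, addr, data):
        self.cells[addr] = data

    def read(self, addr):
        return self.cells[addr]


class TwoReadPortMemory:
    """Two read ports built from two RAM copies with shared writes."""

    def __init__(self, depth, init=()):
        self.copies = [OneWriteOneReadRam(depth, init) for _ in range(2)]

    def write(self, addr, data):
        # The write ports are wired together, so both copies stay identical.
        for ram in self.copies:
            ram.write(addr, data)

    def read(self, port, addr):
        # Each read port uses its own copy and its own address.
        return self.copies[port].read(addr)


mem = TwoReadPortMemory(depth=8, init=[1, 2, 3, 4])
mem.write(5, 42)
print(mem.read(0, 5), mem.read(1, 1))
```

The cost of this arrangement is storage: every extra read port adds another full copy of the memory contents.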