#nmigen on 2020-12-29 — irc logs at freenode.irclog.whitequark.org

2020-12-07 01:53 ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen · IRC meetings each Monday at 1800 UTC · next meeting TBD

00:07 nfbraun has quit [Quit: leaving]

00:13 emeb has quit [Quit: Leaving.]

00:25 lf has quit [Ping timeout: 258 seconds]

00:25 lf has joined #nmigen

00:56 <_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLSVW

00:56 <_whitenotifier> [YoWASP/yosys] whitequark 5e121bd - Update dependencies.

01:19 <d1b2> <AlterNet> hey, is there nmigen gate-level sim?

01:24 <whitequark> what do you mean by "gate-level" in this context?

01:30 <d1b2> <AlterNet> like, LUT-level

01:30 <d1b2> <AlterNet> esp. if I instantiate LUTs directly

01:30 <whitequark> as in, you'd like to simulate nmigen netlists together with instances?

01:32 <Yatekii> hmm question: if I have some logic that automatically gets translated into blockram macros by nmigen, and that violates timing constraints in the implementation, can I tell nmigen to do it in actual logic and not BRAMs? (just an example, I do not have this problem, I am curious :))

01:33 <whitequark> yes

01:33 <whitequark> memory.attrs["logic_block"] = 1

01:33 <d1b2> <AlterNet> yes

01:34 <whitequark> AlterNet: not yet, but this is something that a work in progress will enable

01:35 <Yatekii> no I mean the other way round :D let's say I describe a delay element that delays N clock cycles. that could be described as a ringbuffer and could go into BRAM (no idea if nmigen does this). but what if this is actually "slower" as in critical path than a LUT impl. could I tell it to use plain logic?

01:35 <whitequark> nmigen never does this, the toolchain will generally not do this either

01:35 <Yatekii> ok :)

01:36 <whitequark> you might see shift registers inferred

01:38 <Yatekii> how do you mean? sorry I am quite noob here ^^

01:38 <whitequark> some xilinx devices can reuse LUTs as shift registers

01:38 <Yatekii> ah

01:38 <Yatekii> I see

01:40 <Yatekii> thanks a lot :)

01:43 falteckz has joined #nmigen

01:43 <falteckz> Hi all - what's the right way to override a platform? I assume I can just create a class that extends TinyFPGABXPlatform. If I want to add more resources do I override the __init__ function? Is there an API for something that is already called that will serve this purpose?

01:44 <d1b2> <dub_dub_11> Is this for a board?

01:45 <d1b2> <dub_dub_11> Generally it's code duplication

01:45 <d1b2> <dub_dub_11> So copy + edit the tinyfpgabx.py platform file

01:45 <falteckz> It's for a custom board that I wish to build eventually - but it's the same FPGA as the BX

01:46 <d1b2> <dub_dub_11> Yeah if it's a different board you make a different platform file, inheriting from Ice40Platform or whatever it is

01:48 <falteckz> LatticeICE40Platform fwiw

01:48 <falteckz> Thanks I'll do that!

01:48 <d1b2> <dub_dub_11> If you are basing the design off it, you may find it helpful to copy chunks of code, but you make a separate file yeah

01:49 <d1b2> <dub_dub_11> Ah that checks out

01:52 <falteckz> Yep, going with the duplicate code approach. Or "explicit" as Python calls it - I think it makes sense here

01:52 <whitequark> code duplication often makes sense because it can decouple unrelated components and allow them to change separately

01:53 <falteckz> You're right - changes to TinyFPGA do not apply to me if I'm making something custom. So it's wrong to extend, since they aren't coupled in that regard

01:54 <whitequark> yep!

01:54 <whitequark> sometimes, programmers dislike code duplication to a far greater extent than is really warranted. the inverse is occasionally true as well.

01:54 <whitequark> but it is less about duplication itself and more about coupling.

01:54 <whitequark> so, you get it :)

01:56 <falteckz> Haha I'm glad :) - say, while here, would any of you happen to know of a WS2812b driver? The protocol is simple and I'd be happy to implement it myself, but I get an early lunch if I implement it quicker haha

01:56 <whitequark> not offhand but others here might know

01:56 <whitequark> agg maybe?

01:56 <whitequark> actually wait

01:57 <whitequark> you *might* be able to draw inspiration from https://github.com/GlasgowEmbedded/glasgow/blob/9fff92ddfbfd309c62aaf3e4170a904e4a426382/software/glasgow/applet/video/ws2812_output/__init__.py#L11-L137

01:57 <whitequark> but it has a bunch of unrelated stuff involved so i'm not sure how much it simplifies your task

02:03 <falteckz> It looks like VideoWS2812OutputSubtarget handles a single strip of WS2812s, so I should be able to leverage it. Thank you for the reference :-)

02:04 <falteckz> It only just occurred to me that you had already highlighted that class -.-'

02:05 <whitequark> hehe
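
A minimal sketch of the WS2812 bit encoding that falteckz calls simple, independent of the Glasgow applet linked above. The timing constants are the nominal figures from the commonly circulated WS2812B datasheet and should be checked against the actual part; `encode_pixel`, `cycles`, and the 16 MHz clock in the usage note are illustrative assumptions, not anything from the log.

```python
# Nominal WS2812B timings in nanoseconds (assumption: taken from the
# commonly circulated datasheet, each with roughly +/-150 ns tolerance).
T0H, T0L = 400, 850   # a "0" bit: short high, long low
T1H, T1L = 800, 450   # a "1" bit: long high, short low
RESET_NS = 50_000     # line held low at least this long latches the data


def encode_pixel(red, green, blue):
    """Return 24 (high_ns, low_ns) pulses for one pixel.

    The wire order is green, red, blue, most significant bit first.
    """
    pulses = []
    for byte in (green, red, blue):
        for bit in range(7, -1, -1):
            if (byte >> bit) & 1:
                pulses.append((T1H, T1L))
            else:
                pulses.append((T0H, T0L))
    return pulses


def cycles(duration_ns, clk_hz):
    """Round a duration to the nearest whole number of clock cycles."""
    return round(duration_ns * clk_hz / 1e9)
```

For a hypothetical 16 MHz clock, `cycles(T0H, 16_000_000)` gives 6 cycles (375 ns), which is within the tolerance assumed above; a hardware FSM would count these cycles for the high and low phases of each bit.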

02:25 <falteckz> Are you familiar with the FIFO interface that VideoWS2812OutputApplet is using?

02:25 <whitequark> i designed it

02:29 <falteckz> Looks like the video binds directly to a crossbar - a nice implementation but perhaps far too complicated for my needs (needs yet undefined)

02:30 <whitequark> yeah, that is why i said "i'm not sure how useful it will be to you"

02:30 <falteckz> It's cool that we have essentially a DMA controller here

02:31 <falteckz> But yeah - maybe I just mimic the FIFO interface but clock the data in myself

02:31 <whitequark> hm, glasgow doesn't have memory

02:31 <falteckz> Perhaps my terminology is incorrect - it's a round-robin FIFO pump

02:32 <falteckz> I only mention it to express my location in the source. I'm aware you designed it :-)

02:34 <whitequark> ah, yeah :)

03:42 electronic_eel has quit [Ping timeout: 246 seconds]

03:42 electronic_eel has joined #nmigen

04:02 sakirious has quit [Quit: The Lounge - https://thelounge.chat]

04:04 sakirious has joined #nmigen

04:24 Bertl_oO is now known as Bertl_zZ

04:27 PyroPeter_ has joined #nmigen

04:30 PyroPeter has quit [Ping timeout: 256 seconds]

04:30 PyroPeter_ is now known as PyroPeter

04:32 <_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLSXJ

04:32 <_whitenotifier> [YoWASP/yosys] whitequark 0aba9d4 - Update wasi-sdk.

04:42 Degi_ has joined #nmigen

04:44 <lsneff> whitequark: you may find this interesting https://usercontent.irccloud-cdn.com/file/1s3qUoi7/ligeia.png

04:44 Degi has quit [Ping timeout: 256 seconds]

04:44 Degi_ is now known as Degi

04:45 <whitequark> lsneff: sweet

04:46 <whitequark> did i give you a large VCD as you requested?

04:47 <lsneff> I don't think so, but that was a 448M vcd

04:47 <whitequark> lemme upload you some

04:53 <lsneff> Looks like a significant amount of overhead is from the vcd crate allocating a vector for each vector value change, so I'll try to change that and see how much it speeds up.

05:24 <lsneff> Yep, that nearly cut the time in half

05:55 <whitequark> lsneff: try this https://mega.nz/file/8xxy0BID#bhMn4RLNtU-WGDb9aFHbbrwXJ7cqjG4Nrdh6kx7LOq8

05:55 <awygle> that looks interesting

05:56 <awygle> what are you working on lsneff?

06:10 <lsneff> whitequark: https://usercontent.irccloud-cdn.com/file/4Ssq4c2S/sram_soc.vcd.1.png

06:10 <whitequark> lsneff: pretty nice

06:10 <whitequark> do you have a format spec yet? i might be able to emit it directly from cxxrtl

06:12 <lsneff> awygle: I'm working on a gtkwave alternative

06:12 <awygle> oh _excellent_

06:15 <lsneff> whitequark: Not a spec yet, but it's pretty clean. Three mmapped files, one contains a bit of data about each variable and is written to for each command on that variable, the second contains a list of varint timestamp deltas, the third contains linked lists for each variable interspersed with each other.

06:15 <lsneff> Will try to come up with a spec soon

06:15 <whitequark> lsneff: oh, so that's an internal thing

06:15 <whitequark> (it would be quite impractical to generate three files from cxxrtl, so i don't think i will)

06:16 <whitequark> (for example that means i cannot stream the waveform dump over the network)

06:16 <whitequark> (not easily anyway)

06:16 <lsneff> Ah, I see what you mean, makes sense

06:16 <lsneff> Yeah, that's an internal thing.

06:16 <whitequark> right, i don't really need a spec for that :)

06:16 <whitequark> as long as it's fast

06:17 * awygle replays the FST discussion in his head rather than cluttering the channel

06:18 <lsneff> Hmm, I'll come up with something that could replace vcd.

06:18 <whitequark> awygle: you have to link to gtkwave in practice

06:18 <awygle> yeah i know

06:18 <whitequark> e.g. verilator can dump fst, using gtkwave's lib

06:18 <whitequark> right ok

06:18 <awygle> until/unless someone reimplements it with better docs and stuff

06:19 <awygle> at which point they'll probably uncover Horrors

06:21 <tpw_rules> so i was thinking earlier today how on earth to manipulate this 170GB bus trace which could be easily cast to VCD... lsneff is your stuff easily programmatically accessible? could i do fast searches with context? can i compress the files?

06:22 <lsneff> Not yet, I haven't finished implementing the query format yet.

06:22 <whitequark> tpw_rules: what's the bus?

06:23 <tpw_rules> it's from an snes emulator. all in ASCII. sample line: 0083ff sta $4200 [004200] A:0001 X:0000 Y:0000 S:01ff D:0000 DB:00 nv1BdIzc V: 0 H: 262

06:25 <whitequark> o!

06:25 <tpw_rules> i want to answer questions like "what are the 10 instructions following a store to memory location X? or execution of instruction Y? or on line 35?". i would be grateful if one of you knew something to manipulate it. i'm fine translating it to another format. but it's 170GB uncompressed and very painful to work with

06:25 <tpw_rules> (answer programmatically)

06:26 <lsneff> Support for things like that would be useful for myself as well.

06:26 <whitequark> galaxy brain: this is what eBPF was invented for

06:29 <lsneff> local supercluster brain: this is what wasm was invented for

06:29 <whitequark> you will probably have underwhelming performance with wasm

06:29 <lsneff> oh certainly lmao

06:29 <whitequark> it's fast, but this is also a use case where even 1% faster can be noticeable

06:29 <lsneff> Yeah, the issue with it is no outside memory access

06:30 <whitequark> now that wasm64 is a thing you could probably map all 170 GB

06:30 <whitequark> tpw_rules: have you considered converting those lines to basically a struct, and writing a small c or c++ program to search?

06:31 <whitequark> that seems like a good starting point, which you could improve by working on the parts that are painful
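
A rough sketch of the "convert lines to a struct and search" starting point, written in Python rather than C or C++ for brevity. It assumes every trace line looks like the single sample quoted above (program counter first, mnemonic second, an optional effective address in square brackets); `TraceLine`, `parse_line`, and `following` are hypothetical names.

```python
import re
from collections import deque
from typing import NamedTuple, Optional


class TraceLine(NamedTuple):
    pc: int                  # program counter, first field of the line
    mnemonic: str            # second field, e.g. "sta"
    address: Optional[int]   # effective address in [brackets], if present
    text: str                # the original line, kept for display


_EFFECTIVE_ADDRESS = re.compile(r"\[([0-9a-fA-F]{6})\]")


def parse_line(text):
    """Parse one trace line (format assumed from the one sample line)."""
    fields = text.split()
    match = _EFFECTIVE_ADDRESS.search(text)
    return TraceLine(
        pc=int(fields[0], 16),
        mnemonic=fields[1],
        address=int(match.group(1), 16) if match else None,
        text=text.rstrip("\n"),
    )


def following(lines, predicate, count=10):
    """Yield (hit, context) where context is the `count` lines after a hit.

    Works on any iterable, so the trace never has to fit in memory.
    """
    pending = deque()  # hits still collecting context, oldest first
    for line in lines:
        for _, context in pending:
            context.append(line)
        while pending and len(pending[0][1]) == count:
            yield pending.popleft()
        if predicate(line):
            pending.append((line, []))
    yield from pending  # hits near the end get a shorter context
```

A query such as "the 10 instructions following a store to $4200" then becomes `following(map(parse_line, open(path)), lambda l: l.mnemonic.startswith("st") and l.address == 0x4200)`, where `path` is the trace file. This still scans every line, which is the cost tpw_rules wants an index to avoid.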

06:31 <tpw_rules> yeah that's what i was going to do

06:31 <tpw_rules> but i think some sort of index format would be a huge benefit. avoid iterating over every single thing every time

06:31 <whitequark> ... stuff it into postgres?

06:32 <whitequark> i guess postgres is not good with time series data

06:32 <whitequark> and neither is influx, not with *this* kind of time series data

06:32 <whitequark> hmm

06:32 <whitequark> maybe numba?

06:32 <tpw_rules> i thought about that too but the sticking point is i don't want to work with it uncompressed

06:32 <lsneff> Is querying of this sort a feature this program should have?

06:33 <whitequark> lsneff: i would say it is not a part of MVP

06:33 <tpw_rules> no it's not

06:33 <lsneff> 👍

06:33 <whitequark> i would really, really, really like to have something that is less excruciating to use than gtkwave, first

06:34 <tpw_rules> my current idea is to divide it into chunks (e.g. 60 emulated frames) and sort by low 8 bits of address or something. then i can run a program over several thousand files in parallel that's just a binary search. i guess linear is fine if i plan on decompressing it every time now that i think about it
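
One way the per-chunk index could look, as a sketch only. It sorts by the full effective address instead of the low 8 bits tpw_rules suggests, purely to keep the example short; `build_chunk_index` and `lookup` are hypothetical names, and compression and the on-disk layout are left out.

```python
from bisect import bisect_left, bisect_right


def build_chunk_index(addresses):
    """Build a sorted index for one chunk of the trace.

    `addresses` holds the effective address of each record in trace order,
    or None for records that touch no address. The result is a sorted list
    of (address, position) pairs.
    """
    return sorted(
        (address, position)
        for position, address in enumerate(addresses)
        if address is not None
    )


def lookup(index, address):
    """Return the positions in the chunk that access `address`, in order."""
    lo = bisect_left(index, (address, -1))
    hi = bisect_right(index, (address, float("inf")))
    return [position for _, position in index[lo:hi]]
```

Each chunk can be indexed and queried independently, so the several thousand files mentioned above could be searched in parallel, with the returned positions used to pull the surrounding records out of the chunk.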

06:39 <tpw_rules> maybe compression is unrealistic

06:53 <lsneff> whitequark: you can write out the metadata for each variable to the file/network-stream before running the simulation, yes?

06:53 <lsneff> You're alright if it contains variably sized data?

06:58 emeb_mac has quit [Quit: Leaving.]

07:00 Sid___ has joined #nmigen

07:04 <whitequark> lsneff: re metadata: yes, and i can easily write it out sorted in some way (e.g. tree preorder with children in lexical order or sth)

07:04 <whitequark> i have a few times wanted to *not* have that restriction, but it might well not be worth the hassle, so let's ignore that

07:04 <whitequark> re variably sized data: assuming it's something like uleb128, totally on board

07:09 <lsneff> Yep, varint/leb128 and the variably-sized value changes

07:09 <whitequark> perfect
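
For reference, a small sketch of the LEB128 encodings being agreed on here, applied to the varint timestamp deltas lsneff described earlier. This is not lsneff's actual format, which has no spec yet; the function names are illustrative and the delta layout is an assumption.

```python
def encode_uleb128(value):
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if value < 0:
        raise ValueError("uleb128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(data, offset=0):
    """Decode one value; return (value, offset of the next byte)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, offset


def encode_sleb128(value):
    """Signed variant: stop once the remaining bits are pure sign extension."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # Python's shift is arithmetic, so the sign is kept
        done = (value == 0 and not byte & 0x40) or (value == -1 and byte & 0x40)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def encode_deltas(timestamps):
    """Store non-decreasing timestamps as uleb128 differences."""
    out = bytearray()
    previous = 0
    for timestamp in timestamps:
        out += encode_uleb128(timestamp - previous)
        previous = timestamp
    return bytes(out)


def decode_deltas(data):
    """Inverse of encode_deltas."""
    timestamps, offset, current = [], 0, 0
    while offset < len(data):
        delta, offset = decode_uleb128(data, offset)
        current += delta
        timestamps.append(current)
    return timestamps
```

Small deltas take a single byte each, which is why the scheme suits dense waveform dumps; the signed variant matters for the value side, where whitequark asks for signedness to be understood.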

07:10 <whitequark> oh yeah, make sure it understands signedness (which you probably already do to pick the ?leb128 variant) and supports strings (arbitrary width *per sample*) and enums

07:11 <whitequark> for enums, both the symbolic and the integral value

07:13 <whitequark> also, please make it possible to have multiple variables comprising a single source-level signal

07:14 <whitequark> e.g. one variable for a[15:0], another for a[23:16]

07:14 <whitequark> i think that's the bare minimum i need for decent user experience

07:17 <falteckz> Sorry to interject with a different topic, but is it best practice to create a Signal in a Submodule and then wire that up at top level, or instead pass the signals to be wired into the Submodule constructor?

07:18 <falteckz> i.e. The concerns of wiring up the signals (in this case, likely combinatorial domain), is this the responsibility of Top/Parent Module or Submodule?

07:18 <whitequark> falteckz: there is no strict best practice here; if you do the former, nmigen will create such intermediate signals in your output netlist

07:18 <whitequark> in the future, this might change (without breaking your code), but for now, do whatever you feel is more clear

07:19 <falteckz> Presumably this has no effect on PnR, is there a benefit to simulations to have the extra net?

07:19 <whitequark> probably makes simulations slightly slower

07:19 <whitequark> well, definitely makes them slower, but probably not to the extent that you should care about it a lot

07:20 <falteckz> Makes things slower in the general premature-optimization-concerns sense.

07:21 <whitequark> yes

07:21 <whitequark> pysim achieves a fragile balance of fast startup and fast (to the extent cpython can be) simulation

07:21 <whitequark> trying to make pysim more clever for the latter tends to severely penalize the former

07:22 <whitequark> right now it's in a sort of local optimum

07:23 <lsneff> Hmm, didn't think about having multiple variables connected to a single signal

07:23 <whitequark> and that's why teaching it about signals that just alias other signals might not be very useful

07:23 <whitequark> lsneff: oh yeah that's absolutely critical

07:23 <whitequark> in sram_soc.vcd you have 5 times as many aliases as actual variables

07:23 <lsneff> So, that would be a single value change for multiple variables

07:23 <whitequark> it should actually be more like 6 times but i had a bug

07:23 <whitequark> yeah

07:24 <lsneff> Alright,

07:24 <lsneff> Does vcd support that?

07:24 <whitequark> yes

07:24 <whitequark> and the file i sent you makes intense use of it

07:27 <lsneff> How are aliased variables indicated in a vcd?

07:28 <whitequark> they use the same identifier

07:28 <whitequark> the !@#$% thing

07:29 <lsneff> Ah, I see, but the range within that variable isn't specified?

07:30 <whitequark> VCD can only represent exact aliases

07:31 <whitequark> well

07:31 <whitequark> you can do it the other way around though

07:31 <whitequark> you could have both `a` and `b[15:0]` use identifier !, and then `b[23:16]` use identifier "

07:32 <whitequark> this also maps nicely to how cxxrtl works. names are secondary, variable data is primary
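
To make the aliasing rule concrete, here is a sketch that groups `$var` declarations in a VCD header by identifier code; any group with more than one name is a set of exact aliases. It ignores `$scope` nesting for brevity, and `aliases` is a hypothetical helper, not part of any existing VCD library.

```python
from collections import defaultdict


def aliases(vcd_header):
    """Map each VCD identifier code to the list of names declared with it."""
    groups = defaultdict(list)
    tokens = vcd_header.split()
    i = 0
    while i < len(tokens):
        if tokens[i] == "$var":
            end = tokens.index("$end", i)
            # $var <type> <size> <identifier> <reference...> $end
            _type, _size, identifier, *reference = tokens[i + 1:end]
            groups[identifier].append(" ".join(reference))
            i = end
        i += 1
    return dict(groups)


# whitequark's example: `a` and `b[15:0]` share one identifier code,
# while `b[23:16]` gets its own.
EXAMPLE_HEADER = """
$scope module top $end
$var wire 16 ! a $end
$var wire 16 ! b [15:0] $end
$var wire 8 " b [23:16] $end
$upscope $end
$enddefinitions $end
"""
```

Calling `aliases(EXAMPLE_HEADER)` groups `a` and `b [15:0]` under `!`, which is the "names are secondary, variable data is primary" view: one stream of value changes, several source-level names pointing at it.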

07:32 <lsneff> Okay

07:32 <whitequark> cxxrtl is pretty happy splitting variables to have multiple data chunks if necessary, this can speed up netlists in some cases quite a bit

07:32 <lsneff> Would having metadata for signals, and then the tree that has variables that can map to a range within a signal work for you?

07:33 <lsneff> So, true aliases basically

07:33 <falteckz> Does nmigen have a standard library with common stuff? I'm slowly working through my WS2812 driver, I wonder if there is an existing object that offers a synchronized interface like the ready/ack I'm trying to put together? https://i.imgur.com/lRXwbbK.png

07:33 <whitequark> lsneff: sure, a rangelist-to-rangelist relationship between source-level names and dump-level variables would work great for me

07:34 <whitequark> falteckz: it does (nmigen.lib), but the specific thing you want will be introduced in the next release

07:34 <whitequark> for now, roll your own

07:34 <falteckz> Thank you, will do that.

08:18 jeanthom has joined #nmigen

10:28 bvernoux has joined #nmigen

10:50 <lkcl> falteckz: this was the first thing that our team did, it's called nmutil

10:50 <lkcl> https://git.libre-soc.org/?p=nmutil.git;a=tree;f=src/nmutil;hb=HEAD

10:50 <lkcl> i developed a library similar to chisel3's iolib

10:51 <falteckz> Oh very nice!

10:51 <lkcl> there's a ton of examples:

10:51 <lkcl> https://git.libre-soc.org/?p=nmutil.git;a=blob;f=src/nmutil/test/test_buf_pipe.py;hb=HEAD

10:52 <lkcl> the "simple" ones have no buffering, the more complex ones use Synchronous FIFOs

10:53 <lkcl> with combinatorial bypass so there's no delay (but you have to watch out, obviously)

10:53 <lkcl> it is... with a bit of messing about... possible to use the same "StageAPI" for FSMs.

10:54 <lkcl> as in: the same ready/valid signalling protocol on both input and output - without that protocol "bleeding into" the actual "Stage" if you know what i mean - can be used to create an FSM-version *and* a pipeline version of the same functionality

10:56 <lkcl> so, here's a FSM example:

10:56 <lkcl> https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/experiment/alu_fsm.py;hb=HEAD

10:56 <lkcl> and here's a pipelined example:

10:56 <lkcl> https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/experiment/alu_hier.py;hb=HEAD

10:56 <falteckz> Ah sure - so an FSM that holds on bus signalling

10:57 <falteckz> That's essentially something I wanted to do too, an FSM for the WS2812 that starts on one signal, and asserts a done signal at the final FSM state - which looks similar to your ALU

10:57 <lkcl> via the same ready/valid signalling that's used to create a pipeline from the same combinatorial blocks, yes
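
A cycle-level sketch of the ready/valid handshake lkcl refers to, in plain Python rather than nmigen and unrelated to the nmutil code linked above. The only rule modelled is that a transfer happens in a cycle where valid and ready are both high, and that the producer keeps its data stable until then; `simulate` is a hypothetical helper.

```python
def simulate(items, ready_pattern):
    """Return (cycle, item) for every transfer.

    `items` is what the producer wants to send, in order.
    `ready_pattern` gives the consumer's ready signal for each cycle.
    """
    transfers = []
    queue = list(items)
    for cycle, ready in enumerate(ready_pattern):
        valid = bool(queue)          # producer asserts valid while it has data
        if valid and ready:          # handshake: both high in the same cycle
            transfers.append((cycle, queue.pop(0)))
        # otherwise the producer holds the same item for the next cycle
    return transfers
```

Because neither side needs to know why the other is stalling, the same interface can wrap a single-cycle combinatorial stage or a multi-cycle FSM such as a WS2812 driver that deasserts ready while it shifts bits out.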

10:58 <falteckz> A most unfortunate thing with nmigen is that I seem to have a cognitive inability to properly visualize how my code will be synthesized. It's just not something I've mentally grasped yet. So I find that I get "writers block" a lot in nmigen

10:59 <lkcl> that's easily solved by using yosys "show {insert module}"

10:59 <falteckz> Something that can't be fixed with an iolib obviously, but the less hurdles the better

10:59 <whitequark> now that is one thing lkcl can help you with :D

10:59 <lkcl> :)

11:00 <lkcl> for about 8 months i ran yosys "read_ilang {insert filename}" after a build, followed by "show top" or "show {insert modulename}" after literally every single build

11:00 <falteckz> ilang takes *.il ?

11:00 <whitequark> personally i'm unconvinced it's essential, but if you do have that problem, it is very easily solved

11:00 <lkcl> for exactly the same reason: i am used to gate-level design

11:00 <whitequark> yeah, ilang takes *.il

11:01 <whitequark> the command is called read_rtlil now

11:01 <whitequark> (yosys no longer has the nonsensical rtlil/ilang distinction, it is always "rtlil" now)

11:01 <whitequark> (read_ilang still works though)

11:01 <falteckz> Beyond theoretical which was read about, my applied introduction to "gate-ware" started, without shame, in Minecraft as Redstone - I then moved to logic simulators like Logisim and Digital - and then a small amount of Verilog

11:01 <lkcl> yeah 18 months later i have sort-of developed the ability to visualise the 1D lines of source code as a 2D graph however it remains tenuous

11:01 <falteckz> It was very dirty and low level from the beginning

11:02 <whitequark> perhaps it is my background in compilers that makes it a lot easier to figure out what my code translates to, or something else, but i've never cared much for `show`. however for those people who do need it, it certainly can be helpful

11:02 <whitequark> (`show` is also pretty hard to use on large netlists)

11:03 <whitequark> for the docs, i am going to use inline netlistsvg diagrams to teach some of this intuition

11:03 <whitequark> (which are a lot easier to read than the graphs made by `show`)

11:03 <lkcl> whitequark: i "fixed" that by always splitting code down into smaller modules, and, where the expressions end up being repeated, made sure to place them explicitly into temporary signals

11:03 <lkcl> nice!

11:04 <falteckz> `ERROR: No such command: read_rtlil` darn

11:04 <whitequark> yosys too old

11:04 <falteckz> Looks like I'm outdated -.-

11:04 <lkcl> ah using the javascript plugin that mithro found?

11:04 <whitequark> do you have 0.9 or something?

11:04 <falteckz> Yosys 0.9+2406 (open-tool-forge build) (git sha1 09ecb9b2, x86_64-w64-mingw32-g++ 10.1.0 -Os)

11:04 <lkcl> falteckz: it'll be called read_ilang in the version you're using

11:04 <whitequark> i recommend updating anyways

11:05 <lkcl> you also need to make sure you install graphviz

11:05 <lkcl> falteckz: what whitequark said :)

11:05 <falteckz> The downside with updating is that my currently working system might stop working Hahaha

11:05 <falteckz> I'm running Windows

11:05 <whitequark> open-tool-forge builds should be pretty robust

11:05 <whitequark> you can also use yowasp if you want

11:05 <falteckz> Righto, will hit the update

11:05 <whitequark> it's a bit of a different approach, packaging yosys as a python package

11:06 <falteckz> There are size limits for that, yes?

11:06 <falteckz> If I recall, yowasp was reduced down, some features stripped?

11:06 <whitequark> yowasp did not have any features stripped other than those which would not build

11:06 <whitequark> ... ah, right, `show` is one of those

11:06 <whitequark> ignore me and proceed with open-tool-forge builds

11:07 <falteckz> Affirm, Willco

11:07 <whitequark> i might actually make `show` work in yowasp soon, but it requires work

11:08 <falteckz> I assume there is no update and I just trample open forge with a newer one

11:10 <lkcl> whitequark: the other reason i followed an early strict rule (for about a year) of using graphviz show was because if it took more than about... 10 seconds for a given page to render, i took that as a sign that the design wasn't modular enough

11:10 <lkcl> split it...

11:10 <whitequark> it's easy to end up with reasonable netlists that have pathological renders in yosys

11:10 <lkcl> and readability of the source code improved greatly as a result

11:10 <whitequark> it can work for some, but like i mentioned, i am very hesitant to call it essential in general

11:11 <lkcl> yehyeh no absolutely. everyone's different

11:11 <falteckz> As someone that can mentally visualize monoliths of software, I gotta say it's very frustrating to not be able to conceptualize a single module without trying to get GraphViz in Windows haha

11:12 <lkcl> :)

11:12 <whitequark> falteckz: i think you'll get the hang of it eventually if you can visualize software

11:12 <falteckz> Recursion all the way down

11:12 <whitequark> in some ways, nmigen-to-rtl correspondence tends to be a lot more straightforward than something like c++-to-assembly correspondence

11:12 <lkcl> falteckz: you're a terry pratchett fan? :)

11:12 <whitequark> in most ways really.

11:13 <falteckz> lkcl: My partner is, I don't have the attention span for a book without a lot of meds

11:13 <lkcl> whitequark: yeah i'm really relieved that the translation to ilang is readable

11:14 <whitequark> that too but i mean post-synthesis rtl, or at least post-coarse-synthesis

11:14 <lkcl> falteckz: lol i'm also losing the ability to focus on 2-hour-long films, but really good books i'm happy to read straight

11:14 <falteckz> lkcl: People are most curious :D

11:15 <falteckz> Whose lazy idea was it to paste open-forge into Git!

11:15 <falteckz> context: https://i.imgur.com/37uDjjc.png

11:16 <whitequark> oh dear

11:16 <whitequark> that is outstandingly cursed

11:17 <whitequark> i don't have the slightest idea how you would end up with that lmao

11:17 * lkcl likes mingw32. but for aaaall the wrong reasons :)

11:17 <whitequark> see, this is the kind of stuff that drove me to make yowasp

11:17 <falteckz> I was probably desperate to get an LED blinking in 30 minutes, and cut corners with a sledge hammer

11:17 <whitequark> at least you already have to deal with python package management whether you like it or not (let's face it, you probably don't like it)

11:17 <falteckz> :O

11:18 <falteckz> How did you know that I do not not not like Python package management

11:18 <whitequark> i've met probably one person total who likes it

11:19 <falteckz> They wrote it and were very proud?

11:19 <lkcl> whitequark: yowser

11:19 <whitequark> to PSF's credit, they have improved it greatly this year and continue improving

11:19 <whitequark> but... it really does have a reputation

11:19 <whitequark> no, i think you will find that the people who wrote it are acutely aware of its issues

11:19 <whitequark> well, maintain it

11:20 <falteckz> At workplace <unnamed> we are still using Python2 for the foreseeable future, so I won't be seeing any refreshing revisions to package management soon

11:20 <lkcl> the "gold standard" is debian packaging. unfortunately, it's such an extremely high standard that many people go "surely it doesn't have to be that complicated" and you end up with pip3, npm, etc. etc. as a result

11:20 <whitequark> i think the new pip resolver might actually be available on 2.7?

11:20 <falteckz> I like npm for 3 reasons

11:20 <whitequark> the debian packaging of python in particular is so atrocious that PSF is seeking feedback to really drive the point in

11:20 <whitequark> of how broken it is so that maybe finally something will change

11:21 <Sarayan> gold standard my ass, making debian packages is a fucking pain in the ass

11:21 <falteckz> 1. It's local by default, 2. It does tree shaking, 3. global works how you expect

11:21 <whitequark> the amount of time i personally wasted because of debian packaging issues is beyond any reason

11:21 <falteckz> Isn't debian package management awfully slow?

11:21 <Sarayan> I'm made debian, rpm, gentoo and arch packages, and the debian ones are the most annoying by a fucking lot

11:21 <Sarayan> I've

11:21 <falteckz> You are what you make, Sarayan

11:21 <lkcl> Sarayan: yes, it's such a high standard that it's viewed as "such a pain" that the problems it solves are disregarded

11:22 <lkcl> it's very unfortunate

11:22 <falteckz> Honestly - if Debian package management is the gold standard, and let's say I believe that. How hopeless we all are. But Debian packages have burnt me just as often as going well

11:22 <whitequark> falteckz: if you want something vaguely npm-like, you could use virtualenv or poetry

11:23 <Sarayan> I'm not convinced the package building is more expressive in the debian case

11:23 <whitequark> personally i would recommend sticking with pip and virtualenv, though some people swear by poetry

11:23 <falteckz> I don't love Python enough to learn another way to fix packages

11:23 <falteckz> Wow - that sounded so close minded.

11:23 <falteckz> I'll look into Poetry

11:23 <whitequark> if you want to spend the absolute minimum of time, i recommend virtualenv

11:23 <falteckz> VSCode makes venv not so bad too

11:24 <whitequark> `python3 -m venv env; source env/activate`

11:24 * lkcl wants to know what poetry is, too... looking it up..

11:24 <whitequark> that's pretty much all you need

11:24 <whitequark> er, it's env/bin/activate i think

11:24 <whitequark> ah sorry, windows. `env/bin/activate.bat` then

11:25 <d1b2> <tizilogic> it's actually env/Scripts/activate.bat on windows for some reason...

11:25 <whitequark> oh oops

11:25 <whitequark> the linux version doesn't write the bat file and i didn't have a windows vm

11:26 <whitequark> well, not one that was running
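
The path confusion above comes down to a layout difference: a virtual environment created by `python3 -m venv env` keeps its scripts in `bin` on POSIX systems and in `Scripts` on Windows. A small sketch of that rule; `activate_script` is a hypothetical helper, and `ntpath`/`posixpath` are used only so both layouts can be shown on any host.

```python
import ntpath
import posixpath


def activate_script(env_dir, windows):
    """Return the activation script path for a venv created in `env_dir`."""
    if windows:
        # cmd.exe runs this file directly
        return ntpath.join(env_dir, "Scripts", "activate.bat")
    # POSIX shells must `source` this file rather than execute it
    return posixpath.join(env_dir, "bin", "activate")
```

So the working commands are `source env/bin/activate` on Linux and `env\Scripts\activate.bat` in a Windows command prompt, matching tizilogic's correction.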

11:26 <falteckz> I don't write code on Windows

11:26 <falteckz> I code in Linux - I nmigen in Windows cause Work Laptop v. Gaming Desktop at home that also does FPGA Synth

11:26 <lkcl> yeah virtualenv is nice for isolating development cleanly. multiple projects, or even just if you have a local (stock, distro) installation of python and don't want it "messed up"

11:26 <d1b2> <tizilogic> I try to avoid it too.. but sometimes one must

11:26 <whitequark> eh, my job is to tell you about virtualenv, it's up to you whether to bother with it or not

11:27 <whitequark> personally i live without virtualenvs most of the time

11:27 <whitequark> mostly because my python needs are quite simple

11:27 <lkcl> falteckz: ah interesting setup. then you might want to consider installing yosys on the linux side, not for doing FPGA compilation but just to do the yosys "show"

11:27 <d1b2> <tizilogic> I use them all the time for exactly that reason..

11:27 <falteckz> Apparently I'm already using a .venv in Windows :shrug:

11:27 <falteckz> lkcl, yosys show is working in Windows - but I'm not sure how to render it yet

11:28 <lkcl> whitequark: yeah i get away with not using virtualenv. for the coriolis2 layout, Jean-Paul and Staf taught me that you need the *entire* P&R virtualised (repeatable builds) for ASIC development

11:28 <whitequark> sometimes i feel like it would be easier to stuff nmigen and the entire toolchain into a browser

11:28 <whitequark> graphviz too

11:29 <lkcl> ooo god please don't put nmigen in a web browser! i've been down that rabbit-hole :)

11:29 <whitequark> if webassembly is not solving your problems, compile more things to webassembly

11:29 <lkcl> graphviz - good call. nmigen? urrrr :)

11:29 <falteckz> I mean a browser has everything you need, USB access, WASM, a very good accelerated renderer, full multi-platform support, and lastly it has already reserved all of your RAM.

11:29 <whitequark> lol

11:29 <whitequark> yowasp already does run in the browser

11:29 <whitequark> ... sort of

11:30 * lkcl is enjoying the conversation but needs to get up, walk around and get something to drink

11:30 <lkcl> apologies - later folks

11:30 <falteckz> peace

11:38 <Sarayan> Perhaps I should run fpga sims in the browser ;-)

11:39 <whitequark> well, if we port not only yowasp but also clang to wasm, you could build cxxrtl simulations to wasm locally in the browser and then load them

11:39 <Sarayan> intriguing

11:39 <whitequark> this sounds like a bad joke but it is entirely technically feasible and people do in fact use something like this in production already

11:39 <whitequark> it is also cursed beyond belief but it does work

11:39 <Sarayan> I was thinking from bitstream, but yeah, there's that too

11:40 <whitequark> let's put it that way, once it leaves nextpnr, it is not my responsibility :P

11:40 lkcl has quit [Ping timeout: 240 seconds]

11:40 <Sarayan> looks like for cyclone v it's mine :-)

11:40 <Sarayan> dunno if you noticed, we've started releasing stuff publicly

11:41 <whitequark> i've seen a bit

11:42 <Sarayan> I need to finish mapping the logic blocks connections, then I'll go on timings

11:45 <Sarayan> I suspect there's some very interesting work to do in synthesis. Did you know that the clock line drivers have an enable input? So if you have a clock+enable combo used a lot you can turn it into a simple clock and avoid routing the enable to the individual ffs

11:45 <whitequark> huh

11:45 <whitequark> are FFs enable-over-reset or the other way round?

11:46 <Sarayan> I'm not understanding the question

11:46 <whitequark> suppose you have a DFF with enable and reset inputs

11:46 <whitequark> if enable is 0 and reset is 1, is the FF reset?

11:46 <whitequark> (sync reset)

11:47 <Sarayan> there's a sync reset and an async reset

11:47 <whitequark> sync reset specifically yes

11:48 <Sarayan> I see the question, not sure of the answer though

11:48 <Sarayan> maybe it's in the docs, or otherwise another thing to test

11:49 <whitequark> that's kinda fundamental to the logic block heh

11:49 <Sarayan> They represent the sync reset as replacing the input though, so pretty sure no reset when enable=0

11:50 <whitequark> ok, that makes sense then re: enable inputs to clock line drivers

11:50 <whitequark> neat
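
The enable/reset priority question above can be sketched as a tiny behavioural model. This is only an illustration in plain Python, not a description of any particular FPGA's flip-flop; `dff_next` and its parameters are hypothetical names chosen for the example.

```python
def dff_next(q, d, enable, reset, reset_value=0, enable_over_reset=True):
    """Return the next state of a DFF with clock enable and synchronous reset."""
    if enable_over_reset:
        # The reset replaces the D input, so it only takes effect
        # on clock edges where the FF is enabled.
        if enable:
            return reset_value if reset else d
        return q
    # Reset-over-enable: the reset wins even when enable is low.
    if reset:
        return reset_value
    return d if enable else q


# The case asked about in the conversation: enable=0, reset=1.
held = dff_next(q=1, d=0, enable=0, reset=1, enable_over_reset=True)
cleared = dff_next(q=1, d=0, enable=0, reset=1, enable_over_reset=False)
```

With enable-over-reset, gating the clock at the clock line driver behaves the same as a per-FF enable, because nothing (including the sync reset) happens while the enable is low.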

11:53 lkcl has joined #nmigen

11:58 nfbraun has joined #nmigen

11:59 <Sarayan> there are 82 to 104 clock lines fwiw, so it's a real resource

12:00 <whitequark> oh yeah i believe that

12:00 <whitequark> fpga vendors do weird stuff with clock buffers

12:00 <daveshah> For some reason despite almost every FPGA having resources like that, vendor tools rarely if ever push enables onto them

12:00 <daveshah> UltraScale+ had enables at leaf level (group of about 30 tiles) and it still never seems to be used

12:00 <Sarayan> nmigen really pushes enables on clocks though :-)

12:05 <falteckz> In the generated graphs from yosys, I see blocks with naming `PROC $group_1 $group_1`

12:05 <falteckz> any idea what these are to represent?

12:06 <whitequark> processes (decision trees)

12:06 <whitequark> run `proc` first

12:06 <whitequark> daveshah: how realistic do you think it is to add (e.g. for me, or someone i'll pay) db compression to ice40 in nextpnr?

12:07 <daveshah> It depends how you do it

12:07 <whitequark> yowasp nextpnr-ice40 databases are so massive i will have to make some unpleasant decisions in glasgow

12:07 <whitequark> i mean... i'd like to do it in a way that is realistic :D

12:07 <daveshah> If you don't mind a bit of startup cost, just using a generic compression algorithm would be easiest

12:08 <daveshah> Very long term, nextpnr will probably move to a standardised deduplicated database for all arches but I can't put a date on that

12:08 <whitequark> can i hack something for ice40 specifically the way you did for ecp5?

12:08 <whitequark> going from 20MB for 8k to 40MB for 8k+5k is not very friendly

12:08 <daveshah> Yeah, although that would be quite a bit more work

12:08 <whitequark> and that's already compressed

12:09 <whitequark> (in a wheel, so it's just deflate)

12:09 <Sarayan> wa: Lofty gets rather good results with lzma for cyclonev

12:10 <Lofty> It's very regular text to be fair

12:10 <whitequark> nextpnr databases are bbas though

12:10 <whitequark> iirc i tried 7z and got little additional benefit

12:10 <mithro> whitequark: We have a lot of interest in better db representation

12:11 <whitequark> not justifying the overhead of actually using lzma in the toolchain

12:11 <whitequark> mithro: i believe that, but for once, i just want an imperfect solution now

12:11 <whitequark> i am sure i could sink half a year into improved nextpnr database format, but i also don't have that time

12:12 <Sarayan> Lofty: the ones you're compressing and not text anymore, they're binary

12:12 <whitequark> daveshah: can you explain at a high level how it works in ecp5? just so i can see roughly how much work it is

12:12 <Sarayan> are

12:13 <daveshah> whitequark: for ECP5 basically it looks for grid locations that are identical based on relative coordinates and stores them only once, with a horrible workaround for globals (basically means globals are not actually connected properly in the routing graph)

12:14 <daveshah> something similar but with better handling of globals would get some improvement, but for iCE40 you have a problem that a lot of the device is in the 'edge case' category
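
The deduplication idea described above can be sketched roughly as follows. This is a toy model, not nextpnr's actual database format: the tile description, the `span1` wire name, and the helper names are all hypothetical. Each location's connections are expressed relative to its own coordinates, so identical locations collapse into one stored shape.

```python
def tile_shape(x, y, width, height):
    """Describe a tile's pips as (dx, dy, wire) relative to the tile itself."""
    pips = []
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        # Neighbours that fall off the grid do not exist, which is what
        # makes edge and corner tiles different from interior ones.
        if 0 <= x + dx < width and 0 <= y + dy < height:
            pips.append((dx, dy, "span1"))
    return tuple(pips)


def build_dedup_db(width, height):
    shapes = []       # every unique shape is stored exactly once
    shape_index = {}  # shape -> position in `shapes`
    grid = {}         # (x, y) -> index into `shapes`
    for y in range(height):
        for x in range(width):
            shape = tile_shape(x, y, width, height)
            if shape not in shape_index:
                shape_index[shape] = len(shapes)
                shapes.append(shape)
            grid[(x, y)] = shape_index[shape]
    return shapes, grid


def resolve(shapes, grid, x, y):
    """Turn a location's relative pips back into absolute coordinates."""
    return [(x + dx, y + dy, wire) for dx, dy, wire in shapes[grid[(x, y)]]]


shapes, grid = build_dedup_db(width=32, height=32)
```

In this toy grid, 1024 locations reduce to 9 stored shapes (one interior, four edges, four corners). The gain shrinks as more of the device falls into the edge-case category, which is the iCE40 concern raised above.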

12:15 <mithro> whitequark: FYI - a group at BYU who have very compact representations for Xilinx parts is working with us to help explore the space that we hope to be useable by nextpnr in the future. Probably doesn't help you short term.

12:15 <daveshah> Yeah sadly I don't think there is a good short term solution here

12:16 <daveshah> but there is low hanging fruit more generally, like not storing wire names as complete strings but building them programmatically

12:17 <whitequark> i'd expect that to be handled very well by even deflate

12:18 <whitequark> daveshah: do you think there is any gain in reformulating the database to store deltas rather than absolute indexes?

12:18 <whitequark> i think that might expose more redundancy that compressors can exploit, but i'm not sure
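
One quick way to test the delta idea is to compress the same synthetic index table in both forms. This is a sketch under assumptions: the data is made up (a perfectly regular grid, which real databases are not), and `TILE_STRIDE`, `OFFSETS` and the helper names are hypothetical.

```python
import struct
import zlib

# Hypothetical data: every tile refers to wires at the same fixed offsets
# from its own base index, so the absolute indexes differ per tile.
TILE_STRIDE = 1000
OFFSETS = [3, 17, 42, 256, 511]
absolute = [tile * TILE_STRIDE + off for tile in range(2000) for off in OFFSETS]

# Delta form: keep the first index, then store differences between neighbours.
deltas = [absolute[0]] + [b - a for a, b in zip(absolute, absolute[1:])]


def pack(values):
    # Signed 32-bit little-endian words, standing in for a binary blob.
    return struct.pack(f"<{len(values)}i", *values)


def undo_deltas(values):
    # A running sum restores the absolute indexes.
    out, acc = [], 0
    for v in values:
        acc += v
        out.append(acc)
    return out


abs_size = len(zlib.compress(pack(absolute), 9))
delta_size = len(zlib.compress(pack(deltas), 9))
print(f"absolute: {abs_size} bytes, deltas: {delta_size} bytes")
```

On data this regular the delta form turns into a short repeating pattern that deflate handles far better than the ever-changing absolute values. Whether a real routing database behaves similarly would have to be measured.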

12:18 <daveshah> yeah maybe

12:19 <daveshah> tbh half the problem here is iCE40 doesn't just have the routing graph, it also has the bits associated with each pip in a non-deduplicated form

12:19 <daveshah> this makes the db bigger and moving to a deduplicated db more work

12:23 <whitequark> i see

12:25 Bertl_zZ is now known as Bertl

12:28 <Sarayan> from a 640M source that's nice

12:29 <Sarayan> gah wrong channel

12:29 <_whitenotifier> [YoWASP/nextpnr] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLSjc

12:29 <_whitenotifier> [YoWASP/nextpnr] whitequark 40f39a5 - Update wasi-sdk.

12:30 <Sarayan> something that's going to be interesting with the cyclones is that some decisions are probably going to be solved by converting part of the logic blocks to routing graph

12:30 <Sarayan> +s

12:31 <Sarayan> like which pll counter to use, or which clock mux

14:15 Bertl is now known as Bertl_oO

14:47 yehowshua has joined #nmigen

14:47 <yehowshua> anybody interested or working on an alternative VCD viewer to GTKWave?

14:50 <whitequark> lsneff: ^

14:54 <yehowshua> oh cool! just read through scroll back

14:54 <yehowshua> @lsn

14:55 <yehowshua> lsneff, I'm assuming you haven't made the source public yet

14:56 <whitequark> https://github.com/lachlansneff/ligeia

14:59 <yehowshua> Here's a Rust VCD backend https://github.com/psurply/dwfv : not sure how fast it is

14:59 <yehowshua> for larger files

15:00 <yehowshua> https://github.com/hecrj/iced, could also be useful for the GUI frontend. It's a WebGPU cross-platform UI builder that's quite fast, pretty, and tiny (3MB static, I think)

15:00 <yehowshua> It's also written in Rust

15:02 <whitequark> dwfv is not very fast

15:02 <whitequark> though it's not too bad either

15:13 FL4SHK has quit [Ping timeout: 246 seconds]

15:18 FL4SHK has joined #nmigen

15:19 emeb has joined #nmigen

15:24 <yehowshua> ah - hadn't really used it yet

15:25 yehowshua has quit [Remote host closed the connection]

15:36 FFY00 has quit [Remote host closed the connection]

15:46 FFY00 has joined #nmigen

16:04 <lsneff> What an interesting coincidence

16:05 FFY00 has quit [Remote host closed the connection]

16:06 FFY00 has joined #nmigen

16:09 chipmuenk has joined #nmigen

16:16 <lsneff> whitequark: start of spec: https://github.com/lachlansneff/ligeia/blob/main/src/bvcd.txt

16:20 FFY00 has quit [Remote host closed the connection]

16:30 chipmuenk has quit [Quit: chipmuenk]

16:31 chipmuenk has joined #nmigen

16:58 korken89 has joined #nmigen

17:21 SpaceCoaster_ has joined #nmigen

17:22 SpaceCoaster has quit [Ping timeout: 272 seconds]

18:12 <korken89> Hey kbeckmann I quite like your cli and Applets approach to structuring a project! I'm thinking I'll use it in my project as well :D

18:14 <kbeckmann> korken89: ah glad you like it! i borrowed the ideas from the glasgow project :)

18:14 <korken89> Nice!

18:15 <korken89> I'm bringing up my FT600 right now and saw you started a driver for it :)

18:15 <kbeckmann> oh sweet

18:15 <kbeckmann> beware that it's probably not very good, the one i wrote

18:16 <kbeckmann> https://github.com/kbeckmann/Kilsyth/blob/master/software/kilsyth/gateware/ft600/__init__.py

18:16 <kbeckmann> but there are a bunch of implementations floating around

18:16 <korken89> Cool, I'll search around!

18:17 <kbeckmann> oMigen but might be a good inspiration anyway https://github.com/enjoy-digital/pcie_screamer/blob/master/gateware/ft601.py

18:19 <korken89> Oh, interesting approach to the control signals

18:21 <korken89> Makes sense, does not need the extra clock of delay then it seems when controlling them

18:22 <anuejn> korken89: there is also one implementation here: https://github.com/apertus-open-source-cinema/nmigen-gateware/blob/2d30578/src/lib/io/ft601/ft601_stream_sink.py

18:23 <korken89> Oh, that was a tiny implementation

18:24 <korken89> Ahh, it only sends, never receives

18:24 <anuejn> yup

18:25 <korken89> I'm also working on a camera, so could be a good start!

18:30 <korken89> anuejn: I think you are the author of that module right? It seems it does not have a cache for handling when the TXE indicates the FIFO is full and the written word is rejected, or am I maybe misunderstanding something about your implementation?

18:31 <korken89> Oh wait, it seems to be a part of the `Stream` module

18:32 <anuejn> I did not really understand your question but the semantics of stream are that a successful transaction only happens when valid and ready are 1

18:38 <korken89> Makes sense, I'm thinking about the case where one takes a word from the internal FIFO and places it on the bus when in the same clock the TXE goes high, so the FT60x does not accept the word. But it seems in your case the Stream FIFO handles this case
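
The valid/ready rule described above can be sketched as a small cycle-by-cycle model. This is a generic handshake illustration in plain Python, not the code of the linked `Stream` module; `run_stream` and its arguments are hypothetical, and treating "TXE high" as "sink not ready" is an assumption taken from the scenario in the conversation.

```python
def run_stream(words, ready_pattern):
    """Simulate a source presenting words to a sink, one clock per entry.

    ready_pattern holds the sink's `ready` value for each cycle
    (0 would correspond to the FT60x refusing data).
    """
    pending = list(words)
    received = []
    for ready in ready_pattern:
        valid = bool(pending)  # the source asserts valid while it has data
        if valid and ready:
            # Transaction: only now is the word consumed from the source.
            received.append(pending.pop(0))
        # Otherwise the source keeps presenting the same word next cycle,
        # so a rejected word is never lost.
    return received, pending


received, pending = run_stream([1, 2, 3], ready_pattern=[1, 0, 0, 1, 1])
```

Because the word is only removed from the source on a cycle where both signals are high, no separate cache for the rejected word is needed in this model.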

18:39 <korken89> I have to give it a deeper look, seems like a very clean approach!

18:39 <korken89> Thanks for linking!

19:38 FFY00 has joined #nmigen

19:46 <korken89> I have run into a new warning, `Warning: Wire top.$verilog_initial_trigger has an unprocessed 'init' attribute.`, I get this when setting a set of IOs to a constant as `m.d.comb += pseudo_power.ft.eq(Const(-1))`. Is this an issue or just expected?

19:49 FFY00 has quit [Remote host closed the connection]

19:54 FFY00 has joined #nmigen

20:00 nfbraun has quit [Ping timeout: 256 seconds]

20:30 nfbraun has joined #nmigen

20:34 emeb_mac has joined #nmigen

20:41 <korken89> Seems to be working as it should, so the warning is probably fine :)

20:42 jeanthom has quit [Ping timeout: 240 seconds]

21:27 tannewt has quit [Read error: Connection reset by peer]

21:29 tannewt has joined #nmigen

21:31 mtk99 has joined #nmigen

21:33 mtk99 has quit [Remote host closed the connection]

21:36 DaKnig has quit [Ping timeout: 256 seconds]

21:38 DaKnig has joined #nmigen

21:44 jeanthom has joined #nmigen

21:55 <_whitenotifier> [nmigen] korken89 opened issue #569: ECP5 blockram silently trucates depth? - https://git.io/JLHeH

21:57 chipmuenk has quit [Quit: chipmuenk]

21:59 <_whitenotifier> [nmigen] daveshah1 commented on issue #569: ECP5 blockram silently trucates depth? - https://git.io/JLHvL

22:00 jeanthom has quit [Ping timeout: 256 seconds]

22:02 <_whitenotifier> [nmigen] korken89 commented on issue #569: ECP5 blockram silently trucates depth? - https://git.io/JLHvs

22:02 <_whitenotifier> [nmigen] korken89 closed issue #569: ECP5 blockram silently trucates depth? - https://git.io/JLHeH

22:43 korken89 has quit [Remote host closed the connection]

22:56 bvernoux has quit [Read error: Connection reset by peer]

23:02 jeanthom has joined #nmigen

23:15 jeanthom has quit [Ping timeout: 246 seconds]

23:22 <Chips4Makers> whitequark: Comment on the reset-over-enable for FFs. On ASICs you commonly don't have FFs with synchronous resets, they are typically async. So if you do have a synchronous reset in your RTL, the reset will just be part of the logic before the FF.

23:37 <falteckz> Are there docs for -soc and -boards ?

23:37 <falteckz> I know in nmigen there is some indexable Memory construct, but I cannot seem to find the docs for it. I'm starting to suspect it's not part of `nmigen` itself

23:38 <falteckz> Or is it that source is docs for now as docs are under development?

23:38 <agg> doesn't look like Memory is in the online docs yet, but you can find its source here: https://github.com/nmigen/nmigen/blob/master/nmigen/hdl/mem.py

23:38 <falteckz> Thank you agg, so it is in fact part of the `nmigen` library

23:39 <agg> from nmigen import Memory; mem = Memory(width=8, depth=1024, init=[1,2,3,4]); read_port = m.submodules.read_port = mem.read_port(transparent=False); write_port = m.submodules.write_port = mem.write_port(); m.d.sync += read_port.addr.eq(read_port.addr + 1); m.d.comb += output.o.eq(read_port.data); that sort of thing

23:39 <agg> yea, it's part of nmigen itself

23:40 <agg> you can add multiple read/write ports and specify a clock domain for each one, by default 'sync'

23:41 <falteckz> Are there any constraints as far as internal hardware goes? i.e. will Memory use FPGA RAM, and if so, is there a read and write port limit?

23:41 <agg> nmigen doesn't know, but synthesis results will depend on what you've done

23:41 <agg> by default very small memories will probably use LUTs and larger memories use block RAMs, but you can set attributes on the memories to force it one way or the other

23:42 <agg> if you have e.g. one write and two read ports, but your hardware only has one of each per BRAM, it can duplicate the BRAM, wire the write ports together, and you end up with two independent read ports
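
The duplication trick described above can be sketched as a behavioural model. This is plain Python rather than nmigen or toolchain code, and the class names are hypothetical; it only shows why mirroring writes into two copies yields two independent read ports.

```python
class SimpleDualPortRam:
    """Model of a block RAM with exactly one write port and one read port."""

    def __init__(self, depth):
        self.cells = [0] * depth

    def write(self, addr, data):
        self.cells[addr] = data

    def read(self, addr):
        return self.cells[addr]


class OneWriteTwoReadMemory:
    """Two RAM copies whose write ports are wired together."""

    def __init__(self, depth):
        self.copy_a = SimpleDualPortRam(depth)
        self.copy_b = SimpleDualPortRam(depth)

    def write(self, addr, data):
        # Every write goes to both copies, so their contents stay identical.
        self.copy_a.write(addr, data)
        self.copy_b.write(addr, data)

    def read0(self, addr):
        return self.copy_a.read(addr)

    def read1(self, addr):
        return self.copy_b.read(addr)


mem = OneWriteTwoReadMemory(depth=1024)
mem.write(3, 0xAB)
mem.write(7, 0xCD)
values = (mem.read0(3), mem.read1(7))  # two different addresses at once
```

The cost is storage: each extra read port added this way consumes another full copy of the memory.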

23:42 <falteckz> The attributes are platform specific?

23:43 <agg> memory.attrs["logic_block"] = 1 or memory.attrs["ram_block"] = 1, I think are quite widely accepted

23:44 <agg> (if you really need fine control, you can always instantiate the platform's memory primitive directly, too)

23:45 <falteckz> Not sure what I need yet, but good to know the options. I think I'll want to be reading from SPI Flash and buffering into RAM