#nmigen on 2020-12-28 — irc logs at freenode.irclog.whitequark.org

2020-12-07 01:53 ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen · IRC meetings each Monday at 1800 UTC · next meeting TBD

00:15 <_whitenotifier> [YoWASP/nextpnr] whitequark pushed 1 commit to develop [+0/-0/±2] https://git.io/JLyCM

00:15 <_whitenotifier> [YoWASP/nextpnr] whitequark d9adbdb - Update dependencies.

00:26 lf has quit [Ping timeout: 260 seconds]

00:26 lf has joined #nmigen

00:56 <_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLy8g

00:56 <_whitenotifier> [YoWASP/yosys] whitequark a2c7f01 - Update dependencies.

02:10 <smkz> what is EnableInserter and is it something i should use or not?

02:11 <tpw_rules> it lets you put an enable pin on some logic

02:11 <tpw_rules> i've never needed it

02:14 <mwk> smkz: take a module with a clock, take a clock enable signal, get an equivalent module with the clock enable applied everywhere the clock is used

02:14 <tpw_rules> ^

02:14 <mwk> it's done by recursively descending the AST and inserting clock enable at every endpoint though; not by actually gating the clock

02:15 <mwk> so it has the slightly annoying property of not working when manual instantiation of sync primitives is involved

02:18 <whitequark> that's an implementation detail, there is an open issue for introducing EnableSignal

02:18 <whitequark> which would fix that problem

02:18 <whitequark> that said, it will never actually gate the clock

02:21 <mwk> heh, would be nice to actually have first-class support for clock gating in yosys

02:21 <mwk> but yea, not happening anytime soon

02:21 <whitequark> yeah

02:22 <mwk> ... is clock gating one of those things that could be easy and portable if verilog actually had support for it

02:23 <whitequark> verilog technically does

02:23 <mwk> if I see `assign gated_clk = clk & ce` one more time I'm going to flip a table

02:24 <whitequark> in cxxrtl, i think i can recognize that idiom and turn it into a clock gate

02:24 <whitequark> i'm not yet sure though

02:24 <whitequark> and maybe it should not be done in cxxrtl

02:25 <mwk> I mean it's an invalid idiom in the first place

02:25 <mwk> if we ever recognize it, it should be to shout THIS IS NOT SUPPOSED TO BE DONE LIKE THIS, FIX IT

02:25 <whitequark> oh, right

02:25 <whitequark> i forgot it should be a latch

02:26 <whitequark> ... should it

02:26 <mwk> latch and a gate, I think? I don't remember, something like it

02:26 <whitequark> sigh

02:27 <mwk> plus of course this doesn't really work either if you want it to be in sync with the original clock

02:27 <whitequark> right, that's what i was thinking about

02:28 <whitequark> there was some sort of aggravating idiom where you'd use blocking assignment in a clocked always block

02:28 <mwk> like, the latch-gate is correct, as long as you don't care about the result being synchronous

02:28 <whitequark> but i can't recall it

02:28 <mwk> well yeah, getting simulation behavior is one thing, getting it synthesized correctly (and understood by timing analyzer down the line) is another

02:31 <mwk> hence I think doing it right would require new first-class primitives in yosys, which would effectively map to BUFGCE or whatever
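
The difference between the `clk & ce` idiom and the latch-and-gate structure mentioned above can be sketched with a small behavioral model. This is plain Python, not yosys or cxxrtl code; the function names and the sample waveform are assumptions made up for illustration, and the latch is assumed to be transparent while the clock is low, as in the commonly used clock-gating cell.

```python
def naive_gate(samples):
    """gated_clk = clk & ce, recomputed at every sample."""
    return [clk & ce for clk, ce in samples]


def latch_gate(samples):
    """Capture ce in a latch that is transparent while clk is low,
    then AND the latched value with clk."""
    latched = 0
    out = []
    for clk, ce in samples:
        if clk == 0:
            latched = ce  # latch follows ce only during the low phase
        out.append(clk & latched)
    return out


# Each tuple is (clk, ce). ce changes twice while clk is high.
samples = [(0, 0), (1, 0), (1, 1), (0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
print(naive_gate(samples))
print(latch_gate(samples))
```

In this waveform the naive gate emits a runt pulse when `ce` rises mid-phase and truncates the next pulse when `ce` falls mid-phase, while the latched version emits a single full-width pulse. Neither model says anything about skew relative to the original clock, which is the remaining problem raised in the discussion.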

03:01 <d1b2> <dub_dub_11> BUFGCE is global buffer with clock enable right? Do the tools actually use that to gate the clock rather than all the FFs, if you use EnableInserter? (Presumably doing so would save a lot of power from driving the clock tree)

03:03 <d1b2> <dub_dub_11> Or equivalently if you had in Verilog always@(posedge clk) begin if(enable) begin Logic end end

03:05 <whitequark> no, the tools very much do *not* do that

03:05 <whitequark> EnableInserter essentially turns DFFs into DFFEs (if this is deemed useful)
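
As a rough behavioral illustration of "turning DFFs into DFFEs": the clock keeps toggling, and the enable only decides whether the register captures new data on a given edge. The sketch below is plain Python rather than nMigen; the class and function names are hypothetical and it does not use or reproduce the `EnableInserter` API.

```python
from dataclasses import dataclass


@dataclass
class DFFE:
    """D flip-flop with enable: holds q on clock edges where en is 0."""
    q: int = 0

    def tick(self, d: int, en: int) -> int:
        if en:
            self.q = d
        return self.q


def run_counter(enables):
    """4-bit counter built on a DFFE; one list entry per clock edge."""
    reg = DFFE()
    trace = []
    for en in enables:
        trace.append(reg.tick((reg.q + 1) & 0xF, en))
    return trace


print(run_counter([1, 1, 0, 0, 1]))
```

Every entry in `enables` corresponds to a real clock edge, so the clock tree is still driven on the edges where the counter holds its value.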

03:10 <d1b2> <dub_dub_11> Ah ok

03:11 <d1b2> <dub_dub_11> And the pnr won't recognise if every FF driven by a clock has the same enable signal, and gate the clock?

03:12 <whitequark> nope

03:12 <whitequark> and it should not, since that would change timings in ways you might not want

03:12 <whitequark> also, many FPGAs only have one BUFGCE in first place

03:13 <whitequark> (I'm not sure which Xilinx devices have only one but it's something I've noticed before)

03:13 <d1b2> <dub_dub_11> Oh okay

03:13 <d1b2> <dub_dub_11> I didn't realise it was actually a separate resource either

03:14 <d1b2> <dub_dub_11> I thought it was like an input buffer where there can be different versions using the same primitive

03:14 <whitequark> it is also like that

03:15 <whitequark> there's a bunch of different cells that map to BUFG*, and the exact BUFG* that is present in your device varies from chip to chip

03:16 <d1b2> <dub_dub_11> Uh huh

03:17 <d1b2> <dub_dub_11> And is there usually a few BUFGs (but not with the CE)?

03:21 <whitequark> ask mwk

03:34 <mwk> it's, as usual, complicated

03:35 <mwk> before ultrascale, there is really no such primitive as BUFG/BUFGCE; there's BUFGMUX [on spartan] or BUFGCTRL [on virtex], which have BUFG/BUFGCE as special case uses, and it's all the same resource

03:36 <mwk> the number of those varies, but expect 8-32

03:38 <mwk> there is definitely no xilinx FPGA with only a single BUFGCE though; perhaps there is such a xilinx CPLD, I don't know

03:39 <mwk> and ultrascale... complicates things in multiple ways, having the clock enable part done in leafs instead, so while you still describe BUFG/BUFGCE to the toolchain, it doesn't really correspond to physical resources anymore, the buffers are kinda generated on the fly in the P&R stage

03:40 <mwk> and also it's the one that has multiple global buffer types; some are actual hardware BUFGCEs, some are BUFGs without any clock gating, some are BUFGCTRL with lotsa features; the leaf clock-enables are in addition to that

03:40 <mwk> and then there's versal which complicates it even more by adding /2, /4 and /8 clock dividers on leafs, but eh

03:41 <whitequark> mwk: hang on, why does xilinx pnr tell me that it has BUFGCE 1/1 or something like that

03:41 <whitequark> it is just lying for some reason?

03:41 <mwk> huh

03:41 <mwk> what chip?

03:41 <whitequark> lemme try to reproduce

03:42 electronic_eel has quit [Ping timeout: 260 seconds]

03:42 electronic_eel has joined #nmigen

03:43 <_whitenotifier> [nmigen/nmigen-boards] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/JLyu8

03:43 <_whitenotifier> [nmigen/nmigen-boards] whitequark 5c8c8ca - arty_s7: fix blinky built-in test.

03:44 <mwk> spartan7, eh? 16 BUFGCTRLs on the smallest device, 32 BUFGCTRLs on the others

03:44 <whitequark> no, that was just a bug i found

03:45 <whitequark> i don't think i've ever built anything for s7, my vivado is too old

03:48 <whitequark> mwk: wtf, no idea where i even took that

03:48 <whitequark> you are obviously right

03:49 <whitequark> i must have mixed it up with something else. my bad. sorry for misinformation dub_dub_11

03:51 emeb has left #nmigen [#nmigen]

03:52 <mwk> there are definitely non-xilinx fpgas with 1 of whatever the equivalent of BUFGCE is

03:52 <whitequark> yes, but i even checked ecp5

03:52 <whitequark> and that has... more than 1

03:53 <mwk> so we'd need resource management in yosys to choose upgrade-worthy ones

03:53 <mwk> if any

03:53 <whitequark> tbh resource management in yosys would be neat

03:53 <whitequark> if costly to maintain, perhaps

03:53 <mwk> but also: changing FF enables to clock gating can only happen for FFs with enable priority over reset

03:53 <sorear> what would "resource management" look like?

03:53 <mwk> or async resets

03:54 <mwk> sorear: well first of all we'd need a bigass database of FPGAs and their resource counts

03:54 <whitequark> which FPGAs have enable over reset?

03:54 <mwk> then we'd need some framework for passes to know how much headroom they have for a given resource

03:55 <mwk> taking into account weird stuff like resources being splittable, hierarchical designs (if you use a resource in module instantiated 10 times in hierarchy, it costs 10 times more etc)

03:55 <mwk> and also pre-instantiated primitives

03:55 <mwk> and then we'd need some heuristics
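
The hierarchy point above (a resource used in a module instantiated 10 times costs 10 times more) can be sketched as a recursive count. This is only an illustration: the dictionary layout, the function names, and the budget numbers are all hypothetical and do not correspond to any existing yosys data structure or device database.

```python
from collections import Counter


def total_usage(design, top):
    """Sum primitive usage of `top`, multiplying through the instance hierarchy."""
    module = design[top]
    usage = Counter(module.get("cells", {}))
    for child, count in module.get("instances", {}).items():
        for resource, n in total_usage(design, child).items():
            usage[resource] += n * count
    return usage


def headroom(design, top, budget):
    """Remaining count of each budgeted resource (negative means over budget)."""
    used = total_usage(design, top)
    return {resource: budget[resource] - used.get(resource, 0) for resource in budget}


# Hypothetical design: `core` is instantiated 10 times under `top`.
design = {
    "top": {"cells": {"BUFG": 1}, "instances": {"core": 10}},
    "core": {"cells": {"BRAM": 2, "DSP": 1}},
}
print(headroom(design, "top", {"BRAM": 32, "BUFG": 16}))
```

A real framework would also have to deal with splittable resources, pre-instantiated primitives, and cross-resource rules like the shared input muxes mentioned below, none of which this sketch attempts.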

03:57 <mwk> ... and then there are annoying rules like "using a blockram with 32-bit width on spartan 3e actually also takes up a multiplier slot because input muxes are shared"

03:57 <whitequark> uhm

03:57 <whitequark> it what

03:57 <mwk> high 16 data bits of memory write port are also multiplier inputs for co-located MULT18X18*

03:58 <whitequark> sounds aggravating

03:58 <d1b2> <dub_dub_11> Ah np I had a scroll through the UG and found the bit about V5/7 series having one type of global clock resource and as always learnt lots along the way

03:58 <mwk> yep

03:58 <mwk> so anyway, yeah, resource management needs to happen

03:58 <mwk> ... at some point

03:59 <d1b2> <dub_dub_11> Also yes that sounds extremely aggravating

03:59 <mwk> but it's not trivial by any means

03:59 <whitequark> i feel like we should sort out the situation where i merge the most yosys PRs

03:59 <whitequark> first

04:00 <mwk> well it'd be easier if the, uh, event hadn't happened

04:00 <whitequark> that is indeed true

04:00 <mwk> I very much hope it's normalized soon

04:01 <whitequark> i am quite tempted to go to a certain country and break a few facial bones of a certain person

04:01 <mwk> same

04:02 <d1b2> <dub_dub_11> Oh no

04:02 <whitequark> anyway, at least we have a new maintainer for the verilog frontend

04:02 <mwk> I hoped to spend some effort on yosys this fall, but instead got myself trapped in a big screwup of uni work

04:03 <whitequark> for... a few months at least

04:03 <whitequark> probably not sufficient to replace the frontend with something sane, but maybe enough to make it less aggravating, and freeing me from the necessity to read that code for a while

04:03 <mwk> ... at this point either I get adhd meds and stabilize the situation or just quit that damn place

04:04 <mwk> I've had enough of being a fucking mess

04:04 <whitequark> from personal experience, i would recommend both, though it's up to you of course

04:05 <mwk> *shrug* the main problem with that is that I tend to actually enjoy it when I'm not trapped in an anxiety/procrastination combo of having to actually prepare the classes

04:06 <whitequark> i see

04:06 <mwk> but yeah, it's not sustainable in the current configuration, something has to give

04:07 <mwk> and yeah, the new frontend maintainer is... well, an unexpected boon in shitty times

04:08 <mwk> at least for a while

04:08 <whitequark> i'm really impressed that he actually manages to understand it

04:08 <whitequark> no matter how much effort i sunk into it, i can not make sense out of genrtlil

04:09 <whitequark> save for the simplest parts

04:09 <whitequark> what sorcery is that!

04:13 <mwk> I... kinda think I understood it enough to make some simple fixes to it some time ago?

04:13 <mwk> but given the small surface of stuff I touched I'm probably very wrong about it

04:14 <whitequark> i mean, i understood it enough to shoehorn the $print cell in

04:14 <whitequark> but Zachary is fixing things like constant function expansion and some cursed memory-related width inference... stuff

04:14 <whitequark> in both of these cases i could not even identify the root cause

04:15 <mwk> ... right, memory changes

04:15 <mwk> gods, I need to get that branch finished and merged

04:15 <mwk> I shudder to think what will happen if I rebase it now, I haven't touched it for a month or two

04:16 <whitequark> there's been barely any changes for a month or two i think

04:16 <mwk> there were verilog frontend changes and lots of cxxrtl changes; my branch touches both

04:16 <whitequark> oh

04:17 <mwk> CONFLICT (content): Merge conflict in backends/cxxrtl/cxxrtl_backend.cc

04:17 <mwk> how surprising

04:17 <whitequark> that one is almost certainly benign

04:17 <mwk> nope

04:17 <whitequark> oh? what did i change?

04:17 <whitequark> i mean, memory code, yes, what in particular (you can just post the diff)

04:17 <whitequark> (with conflicts)

04:18 <mwk> I changed how signal initialization works

04:18 <whitequark> oh...

04:18 <whitequark> uh you really ought to rebase once https://github.com/YosysHQ/yosys/pull/2495 is merged

04:18 <mwk> because adding initial memory read port values suddenly means that not just (*init*) on a wire can require an initialization

04:18 <whitequark> and i will do that soon, in that case

04:19 <whitequark> since that will move all of the signal init values from the class definition to the constructor or a function

04:19 <mwk> (it's a parameter on the $memrd cell)

04:19 <mwk> hm, alright

04:19 <mwk> okay, all the conflicts seem to be related to initial values, not bad

04:20 <mwk> oh wait, no

04:20 <mwk> fuck me

04:20 <mwk> it's just conflicts for the first patch on my branch, out of 11

04:20 <whitequark> :s

04:21 <mwk> *of course* it's only about initial values

04:21 <mwk> anyway

04:21 <mwk> the main problem with getting my branch merged is uhh getting any sort of review for it

04:21 <whitequark> can i help?

04:22 <whitequark> i'm guessing no, but

04:22 <mwk> I mean I can self-approve my bugfixes and minor xilinx improvement, but I'd rather have actual design review for something that throws away a big portion of memory passes and replaces them with all-new stuff

04:24 <mwk> whitequark: I'd appreciate reviewing the general design

04:24 <whitequark> mwk: ack. let me go through my current backlog (might take a day or two) and i'll see what i can do

04:25 <mwk> the outline is here: https://github.com/YosysHQ/yosys/issues/1959#issuecomment-694277431 ; there's more in the doc changes in https://github.com/YosysHQ/yosys/tree/mwk/mem-inference

04:25 <mwk> the current status of the branch is: the data model changes are done, the missing part is the actual match-rams-to-blockrams pass, and transparency inference pass

04:26 <mwk> and there's one thing left in the model design that I'm unsure of, which is how to handle transparency

04:27 <whitequark> hm i might have useful input on that, actually

04:27 <mwk> as implemented now, it is: transparency (and write priority) only applies between sync ports that happen to have the exact same clock (same signal and polarity), if you have two clock signals that just so happen to have edges at the same time, we make no guarantees

04:28 PyroPeter_ has joined #nmigen

04:28 <mwk> also priority and transparency is stored per-port-pair

04:29 <whitequark> it would be nice to have it explicit so that e.g. nmigen could be sure exactly when transparency will apply

04:29 <mwk> priority can effectively be A first, B first, or undefined; transparency is either "always read old value" or "always read new value"

04:30 <mwk> and I'm wondering if I should change it to be "always read old value", "always read new value", "read X"
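
The three collision behaviors being weighed here can be modeled for a single read port and a single write port on the same clock. This is a minimal sketch under assumptions: the names `Collision`, `SyncMemory`, and `X` are invented for illustration, and `None` stands in for an undefined value. It is not the yosys memory model.

```python
from enum import Enum


class Collision(Enum):
    READ_OLD = "old"
    READ_NEW = "new"
    DONT_CARE = "x"


X = None  # stand-in for an undefined read value


class SyncMemory:
    """One sync read port and one sync write port sharing a clock."""

    def __init__(self, depth, mode):
        self.data = [0] * depth
        self.mode = mode
        self.rdata = 0

    def tick(self, r_en, r_addr, w_en, w_addr, w_data):
        old = self.data[r_addr]  # value before this edge's write
        if w_en:
            self.data[w_addr] = w_data
        if r_en:
            if w_en and r_addr == w_addr:
                # Same-address collision: the mode decides what is read.
                if self.mode is Collision.READ_OLD:
                    self.rdata = old
                elif self.mode is Collision.READ_NEW:
                    self.rdata = w_data
                else:
                    self.rdata = X
            else:
                self.rdata = old
        return self.rdata


for mode in Collision:
    mem = SyncMemory(4, mode)
    mem.tick(0, 0, 1, 2, 7)          # write 7 to address 2
    print(mode.name, mem.tick(1, 2, 1, 2, 9))  # read and write address 2
```

The mode only matters on the edge where both ports are enabled and hit the same address; every other access behaves identically in all three modes.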

04:30 <whitequark> hm

04:31 PyroPeter has quit [Ping timeout: 256 seconds]

04:31 PyroPeter_ is now known as PyroPeter

04:31 <whitequark> "read X" mapping to "NO_CHANGE"?

04:31 <mwk> read X mapping to "do whatever the hardware supports with minimal effort"

04:31 <whitequark> what if i select "read X" and make sure that w_en xor r_en?

04:31 <whitequark> will that have the NO_CHANGE semantics?

04:31 <mwk> if w_en xor r_en, you will have NO_CHANGE semantics anyway

04:32 <mwk> transparency explicitly only matters if there is a possibility of collision

04:32 <whitequark> i mean, yosys might not know that w_en xor r_en

04:32 <mwk> that's what sat solvers are for

04:32 <whitequark> i'm trying to *avoid* inference

04:33 <whitequark> mhm

04:33 <mwk> I don't think it's possible for the body of extant Verilog code

04:33 <whitequark> oh certainly

04:33 <whitequark> i trust you that you'll consider what is necessary to handle extant verilog

04:33 <whitequark> my perspective is doing things for... better inputs

04:33 <mwk> the main memory inference pass will have a sat solver, and will use it in multiple scary ways

04:34 <whitequark> yes. i'm not trying to avoid it in general. i want to understand if nmigen can avoid it, or otherwise make sure the results have a hard guarantee

04:34 <whitequark> and see whether this desire can inform the design

04:34 <whitequark> (i do not have a hard requirement that nmigen never touches any sat parts)

04:35 <mwk> mhm

04:35 <whitequark> (it wouldn't make sense since it will often produce verilog anyways)

04:35 <mwk> so the main purposes of sat are as follows

04:36 <mwk> 1) you can have two write ports where one has priority over the other because it follows it in the same verilog block; sat can determine the enables are disjoint and drop the priority, allowing mapping to blockrams without defined write port priority (ie. most of them)

04:38 <mwk> 2) xilinx (and vendors that cribbed from them), allows you to have two configurations of r_en and w_en: either they are disjoint (NO_CHANGE), or w_en always implies r_en (READ_FIRST, WRITE_FIRST)

04:38 <mwk> before we merge a read port and a write port into a read/write port, we have to make sure one of the cases applies

04:39 <mwk> 2) is the main unavoidable part, it's simply not possible to infer a xilinx blockram with rw port without verifying this condition

04:40 <mwk> we could, in theory, require verilog code that makes the condition obvious, and somehow propagate that info from the frontend, but good luck with it
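
The check described in 2) can be illustrated with exhaustive enumeration as a stand-in for a SAT query. This is a sketch under assumptions: the function name and the way enables are passed as Python callables are hypothetical, and a real pass would hand the enable logic to a solver instead of enumerating inputs.

```python
from itertools import product


def classify_enables(r_en, w_en, n_inputs):
    """Decide which shared read/write port configuration the enables allow.

    r_en and w_en are boolean functions of the same n_inputs input bits.
    """
    disjoint = True
    write_implies_read = True
    for bits in product((0, 1), repeat=n_inputs):
        r, w = bool(r_en(*bits)), bool(w_en(*bits))
        if r and w:
            disjoint = False
        if w and not r:
            write_implies_read = False
    if disjoint:
        return "NO_CHANGE"
    if write_implies_read:
        return "READ_FIRST or WRITE_FIRST"
    return None  # neither condition holds: the ports cannot be merged


# Inputs are (valid, we) in the first two cases.
print(classify_enables(lambda v, we: v and not we, lambda v, we: v and we, 2))
print(classify_enables(lambda v, we: v, lambda v, we: v and we, 2))
print(classify_enables(lambda a, b: a, lambda a, b: b, 2))
```

Enumeration is exponential in the number of inputs, which is why a solver is the practical choice for real designs.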

04:40 <whitequark> yep

04:41 <whitequark> ok, so neither of those really applies to nmigen, since it emits $memwr/$memrw explicitly

04:41 <mwk> 2) does

04:41 <whitequark> oh, there's no parameter?

04:42 <mwk> how would you write nmigen code that gets synthd to a single rw port?

04:42 <whitequark> mem.read_write_port() # does not exist yet

04:42 <mwk> and how does r_en/w_en work on this port?

04:43 <mwk> are they two independent signals? if so, we're back to square one

04:43 <whitequark> if i was doing it, i would go with two independent signals but also a parameter that specifies behavior during a conflict

04:44 <whitequark> this might not be optimal, of course

04:44 <mwk> and what would be the description?

04:45 <mwk> if it's "transparent" or "read old value", you still need sat to verify that user cannot do w_en without r_en

04:45 <whitequark> can you explain why?

04:45 <whitequark> i thought you didn't

04:45 <mwk> because that's how xilinx blockram works

04:45 <whitequark> oh, right

04:46 Degi_ has joined #nmigen

04:46 <whitequark> ok, so it is not so much "sat to infer memory" as much as "sat to map memory to xilinx"

04:46 <mwk> en=0 means w_en=0,r_en=0; en=1,we=0 means w_en=0,r_en=1; en=1,we=1 means w_en=1,r_en=1

04:46 <mwk> yes

04:46 <whitequark> that's still kind of gross but it seems unavoidable

04:46 <mwk> and ecp5 and anlogic and gowin because of course nobody actually designs fpgas themselves

04:46 Degi has quit [Ping timeout: 240 seconds]

04:46 Degi_ is now known as Degi

04:47 <whitequark> naturally

04:47 <whitequark> ... what does ice40 spram do

04:47 <mwk> hmm

04:47 <mwk> according to my notes, no defined behavior on read/write conflict

04:47 <whitequark> amazing

04:48 <mwk> but independent enables at least

04:48 <mwk> and here's the other problem

04:48 <mwk> we *need* to consider three options

04:48 <mwk> read old value, transparent, X

04:48 <mwk> because that's in practice what vendors give you

04:48 <mwk> read old value is the annoying one, because you cannot implement it if the vendor cannot

04:49 <whitequark> you could probably duplicate the RAM?

04:49 <whitequark> i think?

04:49 <mwk> how?

04:49 <mwk> if all you have is SDP blockram like ice40?

04:50 <mwk> you could cheat and make one port use falling edge, but *it's a giant fucking can of worms*

04:50 <whitequark> would the same trick work as what people use to do multiple write ports out of SDP RAMs?

04:50 <mwk> (oh btw old intel parts do exactly that, latch address on rising edge, actually perform write on falling edge)

04:51 <mwk> (yes, in hardware)

04:51 <mwk> (always)

04:51 <mwk> (well, old altera parts really)

04:51 <whitequark> that reminds me of ultraram somehow

04:51 <mwk> ultraram is less insane because at least you cannot observe that

04:51 <mwk> given that it only has one clock

04:52 <mwk> but old altera parts do it on *dual clocked* memory

04:52 <mwk> but anyway

04:52 <mwk> maybe there's a way around it, maybe not

04:52 <mwk> but I'm pretty certain there's no non-obnoxious way around it

04:52 <whitequark> i'm in despair! EDA has left me in despair

04:53 <mwk> which is why we don't want to be marking RAMs as "read old value" unless we really mean it

04:53 <mwk> then, there's the "transparent" option

04:53 <whitequark> ok, makes sense

04:53 <mwk> which has the advantage of actually being somewhat sanely emulatable if it's not supported in hardware

04:54 <mwk> address comparator, mux, some registers, done

04:54 <mwk> so at least it's always feasible
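
The "address comparator, mux, some registers" emulation can be sketched as a wrapper around a memory whose read data is undefined on a same-address collision. This is a behavioral illustration only: the class names and the `POISON` marker are invented here, the read port is assumed to be always enabled, and write byte enables are not modeled.

```python
POISON = object()  # marks read data the underlying memory leaves undefined


class RawMemory:
    """Sync memory with undefined read data on a same-address collision."""

    def __init__(self, depth):
        self.data = [0] * depth
        self.rdata = 0

    def tick(self, r_addr, w_en, w_addr, w_data):
        collide = w_en and r_addr == w_addr
        self.rdata = POISON if collide else self.data[r_addr]
        if w_en:
            self.data[w_addr] = w_data


class TransparentWrapper:
    """Adds write-to-read bypass so that collisions read the new value."""

    def __init__(self, depth):
        self.mem = RawMemory(depth)
        self.bypass = False    # registered address comparator result
        self.bypass_data = 0   # registered copy of the write data

    def tick(self, r_addr, w_en, w_addr, w_data):
        self.mem.tick(r_addr, w_en, w_addr, w_data)
        self.bypass = bool(w_en and r_addr == w_addr)
        self.bypass_data = w_data

    @property
    def rdata(self):
        # Output mux: pick the registered write data after a collision.
        return self.bypass_data if self.bypass else self.mem.rdata


wrapped = TransparentWrapper(4)
wrapped.tick(r_addr=1, w_en=1, w_addr=1, w_data=6)
print(wrapped.rdata, wrapped.mem.rdata is POISON)
```

The extra registers and mux are the cost referred to above, which is why it is preferable to avoid requesting transparency when it is not actually needed.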

04:54 <mwk> but still — it'd be better to avoid it if not actually necessary

04:54 <whitequark> yeah

04:55 <mwk> which is why there *must* be the "don't care" option

04:55 <mwk> the only question is how to deal with it

04:55 <whitequark> what are the options?

04:55 <mwk> how to store it, how to infer it from Verilog, etc.

04:55 <mwk> well, first thing, we could have it inferred in the Big Pass

04:55 <mwk> using, of course, SAT

04:56 <mwk> r_en exclusive with w_en? done. always r_addr != w_addr? done. have an explicit mux that muxes 'x in case of conflict? done.

04:56 <mwk> of course, this is obnoxious, but honestly not that different from the SATing that already needs to happen

04:57 <mwk> second is to extend the model to have it as an explicit option along with the current "read old value" and "transparent" options

04:57 <mwk> needs two bitmasks instead of one, but eh

04:58 <mwk> and for Verilog, have a separate pass that recognizes "read value doesn't matter if that port is writing" patterns and marks it

04:59 <mwk> also when I said that there are three options ("read old", "transparent" aka "read new", "don't care")? I lied.

04:59 <mwk> hello intel memories again

05:00 <mwk> they have the wonderful "read new value, except the byte lanes which are not covered by byte enable are undefined" fourth option

05:00 <mwk> ... which I'm tempted to just forget about and pretend they're either option 2 or option 3 depending on whether the BEs are all identical

05:00 <whitequark> hang on

05:01 <whitequark> read byte enables?

05:01 <whitequark> or... write byte enables?

05:01 <mwk> write byte enables

05:01 <whitequark> right.

05:01 <whitequark> that sounds insane.

05:02 <mwk> oh right

05:02 <mwk> and this is the case where you *cannot* emulate transparency

05:03 <mwk> because transparency would presumably have the current valid value of the non-BEd bytes on the output, and there's no way to obtain it

05:03 <mwk> ... okay I'm now extra convinced that we *need* the "don't care" mode and also that it needs to be the default

05:03 <whitequark> wait, intel doesn't have a "read old" option?

05:03 <mwk> nope

05:04 <whitequark> uhh

05:04 <mwk> well let me check again in notes

05:04 <mwk> okay, it does

05:05 <mwk> when the read and write happen on *different* ports

05:05 <mwk> it's always "read old"

05:05 <mwk> when the read and write collide on *same* port, it's "read new except non-BEs, fuck those"

05:05 <whitequark> who thought this is a good idea

05:06 <mwk> .. do you see why I want to just beat this problem with a sat solver into submission

05:06 <whitequark> so... on intel, you would have to map a non-transparent rw port to two different physical ports, right?

05:06 <whitequark> can your architecture do that?

05:06 <mwk> a specifically read-old-value rw port, yes

05:07 <whitequark> ok, good

05:07 <whitequark> i agree re don't care being the default

05:07 <mwk> read-old-value is not the only possible option other than transparency, you know

05:07 <whitequark> yeah

05:07 <mwk> and yeah, my architecture can do that

05:07 <whitequark> (don't care) since with unrelated ports the collision behavior is inherently don't care

05:07 <mwk> I basically have a backtracking algorithm assigning ports to hw ports
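
A heavily simplified sketch of what backtracking assignment of logical ports to hardware ports might look like is given below. It is not the algorithm from the branch: the function name, the `compatible` callback, and the capability table are all hypothetical, and it assumes every logical port needs its own hardware port, ignoring merging of read and write ports.

```python
def assign_ports(ports, hw_ports, compatible):
    """Return {logical port: hw port}, or None if no assignment exists.

    Each hw port is used at most once; compatible(port, hw) says whether
    a given logical port may be placed on a given hw port.
    """
    assignment = {}

    def solve(i):
        if i == len(ports):
            return True
        for hw in hw_ports:
            if hw in assignment.values() or not compatible(ports[i], hw):
                continue
            assignment[ports[i]] = hw
            if solve(i + 1):
                return True
            del assignment[ports[i]]  # undo and try the next hw port
        return False

    return dict(assignment) if solve(0) else None


# Hypothetical capabilities: hw port A can read or write, B can only read.
caps = {"A": {"rd", "wr"}, "B": {"rd"}}
print(assign_ports(["rd", "wr"], ["A", "B"], lambda p, hw: p in caps[hw]))
```

In this example a greedy pass would place `rd` on `A` and then fail to place `wr`; backtracking moves `rd` to `B` so that `wr` can take `A`.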

05:07 <whitequark> it only makes sense that it would be the default for related ports

05:08 <mwk> this is also why I decided $memrdwr is a supremely bad idea

05:08 <whitequark> for nmigen i suppose i would not do anything like detecting the port domain, and instead use explicit groups

05:09 <mwk> with shenanigans like these, there's no way for the upstream to know when it's actually OK to merge a read port with a write port

05:09 <whitequark> yes

05:09 <whitequark> agreed

05:09 <whitequark> that's horrifying, but i agree

05:09 <whitequark> i'm glad i learned this now so i don't have to rip read_write_port out of nmigen later

05:11 <whitequark> so... does this answer your question re transparency?

05:11 <mwk> I think so

05:12 <mwk> as in, I'm now pretty much convinced the model needs to include the three transparency options

05:13 <mwk> and while inference of "don't care" will be necessary with "read first" being the naïve default for Verilog code, we can and should just emit "don't care" ports directly in nmigen

05:14 <mwk> are we on the same page here?

05:14 <whitequark> nmigen's contract is that it doesn't have x

05:15 <mwk> yes, that's the big problem here

05:16 <whitequark> i think it would be also "read first" by default (ie with default, absent, transparency parameter)

05:16 <whitequark> hm

05:16 <mwk> as long as you're fine with never inferring a blockram on ice40

05:17 <whitequark> i mean it infers blockrams now so not being able to do it after the PR is a regression

05:17 <mwk> now the default is transparent

05:17 <whitequark> oh, right

05:17 <whitequark> so "write first" by default

05:17 <mwk> yes

05:17 <mwk> which is... well, at least always implementable, if sometimes inefficient

05:18 <whitequark> yes, so that was a lucky choice for a default (i can't claim to have understood all of this intricacy back then)

05:19 <whitequark> wait, ice40 blockrams don't support read first?

05:19 <mwk> I've read the available documentation multiple times and they have absolutely no mention of what happens on a read/write conflict

05:20 <mwk> maybe it's read first, maybe it's write first, maybe it's nasal demons

05:20 <mwk> are you willing to check it and assume it'll always be right?

05:20 <whitequark> ok

05:20 <whitequark> i am not, and i'll bet everyone who knew left lattice a long time ago

05:20 <mwk> ... or siliconblue for that matter

05:20 <whitequark> ... though i do want to check it

05:20 <whitequark> what does the sim model do?

05:21 <mwk> ... same

05:21 <mwk> hm, no idea

05:21 <mwk> my notes don't mention anything and I don't remember checking

05:23 <whitequark> where does that live anyway

05:23 <whitequark> ... why does icecube LSE have references to quicklogic?!

05:23 <mwk> wut

05:23 <whitequark> LSE/data/nbexpand.acd

05:23 <mwk> ... they all just sleep in the same bed don't they

05:24 <mwk> anyway, some other data points

05:25 <mwk> ecp5 has (according to documentation) undefined behavior on inter-port conflicts (within a port, you get same three choices as xilinx)

05:25 <mwk> so the same problem here unless read and write port address happens to be the same

05:29 <mwk> anyway... I'm sorry

05:30 <mwk> I really like the "no X ever" stance in nmigen, but it... just doesn't appear to match the reality of FPGA blockrams

05:30 <whitequark> I mean, I'm willing to change it when toolchains stop conflating three completely unrelated things as X

05:30 <whitequark> but not before

05:31 <whitequark> I'm fine with inefficient transparency logic being emitted though

05:31 <mwk> if you really want to avoid UB, the closest thing is transparent port by default

05:31 <whitequark> yep, that's what I'll keep doing, then

05:31 <mwk> with the new approach at least it's no longer incompatible with a read enable

05:32 <whitequark> wtf is ice8

05:32 <mwk> and of course you can always hope the sat solver logic in yosys will be able to change it to a "don't care" port

05:33 <whitequark> yep

05:33 <whitequark> fine with that

05:34 <mwk> but IMO it'd be good to at least have a "don't care" option, even if it has to come with a big fat warning

05:34 <mwk> and isn't properly simulated

05:34 <mwk> I mean, it's either that or manually instantiating vendor primitives

05:35 <mwk> and of course, "don't care" remains the only viable option for different clock domains

05:35 <whitequark> yes, something like an unsafe opt-in is going to happen at one point

05:35 <whitequark> but it won't be the default, and it will be some time until we get that

05:35 <whitequark> same as for e.g. full case

05:35 <mwk> right

05:35 <whitequark> mwk: if(Time_Collision_Detected & Address_Collision_Detected)

05:35 <whitequark> RDATA <= 16'hXXXX;

05:35 <whitequark> from SB_RAM model

05:36 <mwk> mhm, sounds rather undefined

05:36 <whitequark> it also writes a message to a log

05:36 <whitequark> so, yeah, read X is the native ice40 mode

05:37 <mwk> thanks for confirmation

05:38 <whitequark> still curious wtf the hardware actually does

05:38 <whitequark> but it might be tricky to actually check that

05:38 <whitequark> if there's a setup/hold violation

05:38 <whitequark> can't think of a way to get good data

05:39 <whitequark> i need to somehow de-embed the RAM out of the rest of FPGA and my testbench

05:39 <whitequark> and see what kind of phase difference results in old/new value

05:40 <whitequark> which seems like more pain than i'm ready right now

05:40 <whitequark> *ready for

05:40 <mwk> I'll probably have a lot of fun with stuff like that for xilinx at some point

05:40 <mwk> fun fact: the bitstream actually contains trimming values for strobe delays in every blockram config

05:40 <whitequark> fun or "fun" ?

05:40 <whitequark> wow

05:41 <whitequark> how configurable are they?

05:41 <mwk> it actually differs between device grades sometimes

05:41 <whitequark> like what resolution?

05:41 <mwk> NFI

05:41 <mwk> 3 or 4 bits

05:41 <mwk> have no idea how it corresponds to time units

05:41 <whitequark> that seems like a massive pain to bin devices to

05:41 <mwk> oh it's not

05:41 <mwk> it depends on the operating *voltage*

05:41 <whitequark> ah

05:42 <whitequark> makes sense

05:42 <whitequark> similar to ecp5-5g?

05:42 <mwk> yeah

05:42 <mwk> I've specifically observed this for spartan 6, which I have almost completely fuzzed, and which has a hilariously broken low-power version

05:43 <mwk> like... just.... read the data sheets and look at the so-called "low power devices", then laugh at the completely shit timings or just complete feature brokenness

05:46 <mwk> they took the spartan 6 chips, applied voltage to it way below the original spec to create a low power version, then documented which shit broke

05:46 <whitequark> hm, i can't find "low power" in ds162 or ds160

05:46 <whitequark> oh

05:46 <whitequark> wtf

05:47 <mwk> https://0x04.net/~mwk/lowpower.png

05:47 <mwk> "yeah uh the DCM just kind of shits itself sometimes but we added auto-reset circuitry, it's fine, right?"

05:47 <whitequark> uhhhh

05:48 <d1b2> <Darius> fixed(*) in software!

06:32 <mwk> whitequark: also while we're on the topic of xilinx, have you looked at https://github.com/nmigen/nmigen/pull/563 ?

06:32 emeb_mac has quit [Quit: Leaving.]

06:35 <whitequark> mwk: just did. i will give it another look, but essentially LGTM

06:39 <whitequark> added to 0.3 milestone

06:40 <_whitenotifier> [nmigen] whitequark commented on pull request #547: vendor.lattice_machxo_2_3l: fix sdc generation - https://git.io/JLya5

06:52 Bertl_oO is now known as Bertl_zZ

06:57 <_whitenotifier> [YoWASP/nextpnr] whitequark pushed 1 commit to develop [+1/-1/±1] https://git.io/JLyVk

06:57 <_whitenotifier> [YoWASP/nextpnr] whitequark 9465899 - Update boost and eigen.

07:45 esden has quit [Ping timeout: 264 seconds]

07:45 tannewt has quit [Ping timeout: 260 seconds]

07:47 tannewt has joined #nmigen

07:47 esden has joined #nmigen

07:49 <_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+1/-0/±2] https://git.io/JLyrG

07:49 <_whitenotifier> [YoWASP/yosys] whitequark 1802f7b - Use getopt with argument permutation, like GNU getopt.

07:56 jeanthom has joined #nmigen

08:05 <_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLyoI

08:06 <_whitenotifier> [YoWASP/yosys] whitequark 1400cf5 - Fix caching so that it actually checks WASM binary digest.

08:06 <_whitenotifier> [YoWASP/nextpnr] whitequark pushed 1 commit to develop [+0/-0/±2] https://git.io/JLyoL

08:06 <_whitenotifier> [YoWASP/nextpnr] whitequark a69d7f3 - Fix caching so that it actually checks WASM binary digest.

08:06 <whitequark> well that's an embarrassing bug

08:48 jeanthom has quit [Ping timeout: 264 seconds]

09:17 jeanthom has joined #nmigen

09:39 <_whitenotifier> [YoWASP/yosys] whitequark pushed 7 commits to release [+1/-0/±8] https://git.io/JLyi2

09:39 <_whitenotifier> [YoWASP/yosys] whitequark cb7ac71 - Update dependencies.

09:39 <_whitenotifier> [YoWASP/yosys] whitequark 5fd917d - Update dependencies.

09:39 <_whitenotifier> [YoWASP/yosys] whitequark 3ef7a0d - Update dependencies.

09:39 <_whitenotifier> [YoWASP/yosys] ... and 4 more commits.

09:39 <_whitenotifier> [YoWASP/nextpnr] whitequark pushed 4 commits to release [+1/-1/±6] https://git.io/JLyiV

09:39 <_whitenotifier> [YoWASP/nextpnr] whitequark e36feaf - Update dependencies.

09:40 <_whitenotifier> [YoWASP/nextpnr] whitequark d9adbdb - Update dependencies.

09:40 <_whitenotifier> [YoWASP/nextpnr] whitequark 9465899 - Update boost and eigen.

09:40 <_whitenotifier> [YoWASP/nextpnr] whitequark a69d7f3 - Fix caching so that it actually checks WASM binary digest.

11:35 jeanthom has quit [Ping timeout: 246 seconds]

12:29 korken89 has joined #nmigen

13:19 feldim2425_ has joined #nmigen

13:19 feldim2425 has quit [Ping timeout: 260 seconds]

13:19 feldim2425_ is now known as feldim2425

14:18 nfbraun has joined #nmigen

14:49 Bertl_zZ is now known as Bertl

15:40 <korken89> Thanks for the help yesterday agg and kbeckmann ! Got it working today: https://twitter.com/korken89/status/1343582274307158018 :D

15:40 <daveshah> nice!

15:42 feldim2425 has quit [Ping timeout: 260 seconds]

15:45 <agg> :D

15:45 <agg> more fun awaits when you need to get the ddr3 clocks going I guess

15:45 <agg> are you generating the 25/50/100/150MHz clock domains all from the PLL?

15:46 <korken89> Yeah

15:46 <agg> (btw I am @adamgreig)

15:46 <korken89> Ahhhhhhh

15:46 <korken89> Hahaha hello adam xD

15:46 <agg> my sneaky(?) freenode disguise

15:46 <korken89> Worked wonders on me!

15:46 <korken89> I'm running an example from kbeckmann

15:47 <korken89> That I rewrote

15:50 feldim2425 has joined #nmigen

16:25 korken89 has quit [Remote host closed the connection]

16:30 <kbeckmann> korken89: nice to hear that it works!

17:09 FFY00 has quit [Remote host closed the connection]

17:09 FFY00 has joined #nmigen

17:13 FFY00 has quit [Remote host closed the connection]

17:14 FFY00 has joined #nmigen

17:16 FFY00 has quit [Remote host closed the connection]

17:16 FFY00 has joined #nmigen

17:19 emeb has joined #nmigen

17:23 feldim2425 has quit [Quit: ZNC 1.8.x-git-91-b00cc309 - https://znc.in]

17:23 feldim2425 has joined #nmigen

17:28 FFY00 has quit [Read error: Connection reset by peer]

17:28 FFY00 has joined #nmigen

17:48 FFY00 has quit [Remote host closed the connection]

17:48 FFY00 has joined #nmigen

17:56 <lkcl> agg: adamgreig as in "followingrobot with the STM32F102" adamgreig?

17:56 <lkcl> frickin loved that project.

17:57 <agg> guilty as charged

17:58 <lkcl> totally cool :)

17:59 <lkcl> i tried driving a 640x480 camera with the same STM32F103, it alllmost worked.

18:03 <agg> when did you come across that project? it was a school project from 2008 or so, so it's been a while...

18:03 FFY00 has quit [Ping timeout: 268 seconds]

18:04 FFY00 has joined #nmigen

18:08 jeanthom has joined #nmigen

18:11 Bertl is now known as Bertl_oO

18:11 FFY00 has quit [Remote host closed the connection]

18:11 FFY00 has joined #nmigen

18:13 FFY00 has quit [Read error: Connection reset by peer]

18:13 FFY00 has joined #nmigen

18:45 FFY00 has quit [Quit: dd if=/dev/urandom of=/dev/sda]

18:45 FFY00 has joined #nmigen

18:46 korken89 has joined #nmigen

19:21 lkcl has quit [Ping timeout: 268 seconds]

19:34 lkcl has joined #nmigen

20:26 <_whitenotifier> [nmigen] cestrauss commented on issue #565: cxxsim: random garbage in memory traces - https://git.io/JLSC1

20:34 emeb_mac has joined #nmigen

20:56 <tpw_rules> what sort of guarantees are there, if any, that verilog.generate() will output the same file given the same input?

20:56 <tpw_rules> it seems like it does, but i don't know if that's expected

20:56 <tpw_rules> (the same == identical byte wise)

21:15 <_whitenotifier> [nmigen] whitequark commented on issue #565: cxxsim: random garbage in memory traces - https://git.io/JLS8D

21:33 <whitequark> tpw_rules: it should be idempotent, but it can (and in practice, will) vary with environment, e.g. yosys version

21:34 <whitequark> so, if two consecutive calls, or two calls with different hash seed, don't return the same output, that's a bug

21:34 <whitequark> but the rest is out of my hands

21:38 <tpw_rules> okay, that's about what i expected. thanks for clarifying

21:38 <tpw_rules> turns out what i fed to it sometimes had an unlucky hash seed :)

21:43 <whitequark> yep, nmigen internally uses OrderedDict pervasively
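
[Editor's note] The hash-seed point above can be checked with a small stand-alone sketch. It does not use nmigen or call verilog.generate(); the signal names and the helper names (CHILD, run_with_seed, digest) are made up for illustration. It only shows the underlying Python behaviour, assuming CPython 3.7 or later: a set of strings iterates in an order that depends on PYTHONHASHSEED, while an insertion-ordered mapping (dict, or OrderedDict as mentioned above) gives byte-identical output under every seed.

```python
import hashlib
import os
import subprocess
import sys

# Child program, run in a fresh interpreter per hash seed.
# Line 1: names joined in set iteration order (depends on the string hash seed).
# Line 2: names joined in dict insertion order (independent of the seed).
CHILD = r"""
names = ["clk", "rst", "stb", "ack", "data", "addr", "we", "sel", "cyc", "err"]
print(",".join(set(names)))
print(",".join(dict.fromkeys(names)))
"""


def run_with_seed(seed):
    """Run CHILD with PYTHONHASHSEED=seed and return (set_line, dict_line)."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    set_line, dict_line = out.splitlines()
    return set_line, dict_line


def digest(text):
    """SHA-256 of the text, standing in for a byte-wise comparison of generated files."""
    return hashlib.sha256(text.encode()).hexdigest()


# Seeds 1..8; seed 0 is skipped because it disables hash randomization.
results = [run_with_seed(seed) for seed in range(1, 9)]
ordered_digests = {digest(dict_line) for _, dict_line in results}
unordered_digests = {digest(set_line) for set_line, _ in results}

print("insertion-ordered outputs:", len(ordered_digests), "distinct digest(s)")
print("set-ordered outputs:", len(unordered_digests), "distinct digest(s)")
```

The same approach, running a generator script under several PYTHONHASHSEED values and comparing digests of the result, is one way to look for the kind of "unlucky hash seed" nondeterminism discussed above in one's own code.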

22:48 Felkin has joined #nmigen

22:55 jeanthom has quit [Ping timeout: 256 seconds]

22:56 <Felkin> Yo yo, I've a question - as far as I can tell, nmigen doesn't currently have any sort of primitives for instantiating xilinx's DSP48E1s, right? Are there any plans to try and abstract them away a little? I have this super specific plan I have in mind, will start really trying it on my board mid-late jan, but wanted to get a sense of how much has to be done.

22:58 <Felkin> To be more specific, I have this rather novel idea of designing a CORDIC core that makes extremely heavy use of a DSP, but here I am actually going to focus heavily on what the final PnR result is. The amount of logic elements in a LUT combined with a DSP cell at the bottom can actually lead to some very cool design.

22:58 <d1b2> <dub_dub_11> Your best bet for now is probably using Instance to insert a Verilog instantiation

23:00 <mwk> Felkin: if you want to go target-specific, just instantiate the raw primitive; if you don't, use the usual arithmetic opcodes and hope for the best in dsp inference

23:01 <mwk> write a closer wrapper if you want

23:07 <Felkin> Okey, will likely take the Instance route then, forgot those existed. It's very important for me to keep the DSPs as actual objects at language-level. I'm thinking of two papers in mind right now. One is about CORDIC+DSP hybrid architecture and another is using these as sort of almost cuda core-like primitives to outline a methodology for designing accelerators. I had asked about CORDIC here a week or 2 back and did end up writing a python implementation for some image processing task. I quickly noticed that it could actually be very doable to make an HLS-like jump from the pure python program to an nmigen implementation. But it all banks on a hyper efficient implementation with the DSP so need to go rly low lvl.

23:09 SpaceCoaster has quit [Quit: ZNC 1.7.2+deb3 - https://znc.in]

23:10 SpaceCoaster has joined #nmigen

23:12 korken89 has quit [Remote host closed the connection]

23:13 <Felkin> Anyways thanks for the advice, will come back to bug you more once I start trying to actually implement the thing.

23:19 Felkin has quit [Remote host closed the connection]