#nmigen on 2021-01-23 — irc logs at freenode.irclog.whitequark.org

2020-12-07 01:53 ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen · IRC meetings each Monday at 1800 UTC · next meeting TBD

00:39 peeps[zen] has joined #nmigen

00:40 peepsalot has quit [Ping timeout: 264 seconds]

00:42 peeps[zen] is now known as peepsalot

00:47 lf_ has joined #nmigen

00:48 lf has quit [Ping timeout: 264 seconds]

01:09 jjeanthom has quit [Ping timeout: 256 seconds]

01:16 <modwizcode> hmm I didn't realize that about the clocking, something about that feels problematic to me. I think I don't like the idea that I can't actually divide a clock which I didn't actually realize.

01:16 <modwizcode> that clock balancing, would that be happening manually then?

01:17 <whitequark> which part feels problematic to you?

01:18 <modwizcode> not problematic as in wrong, problematic in the sense I don't like the truth.

01:18 <whitequark> oh. um.

01:18 <whitequark> that's an interesting way to feel about it :D

01:18 <whitequark> but yeah it's been bothering me too

01:18 <modwizcode> So there's no deterministic way in the HDL to divide down the clock? (This is the truth I don't like)

01:19 <modwizcode> Other than enables

01:19 <whitequark> it... depends

01:19 <whitequark> if you're using a strict Verilog simulator, you can use = in the divider

01:19 <whitequark> if you're using a strict VHDL simulator, you can balance clock trees

01:19 <whitequark> (you might ask: what about mixed VHDL/Verilog code? to which I will not provide an answer)

01:19 <whitequark> if you're using Verilator, I think that has some custom attributes

01:20 <whitequark> if you're using CXXRTL, you kinda just can't at the moment at all.

01:20 <modwizcode> I think all of my work projects are mixed... for reasons. But I don't trust my simulators down to the delta cycle either anyway.

01:21 <modwizcode> Fixing CXXRTL to have a deterministic solution seems important. But attributes feel like a bad route.

01:21 <whitequark> CXXRTL explicitly implements synthesis semantics, which in case of a clock divider are "like, whatever"

01:21 <modwizcode> Anyway in VHDL that's me doing the balancing, not the simulator right?

01:21 <whitequark> yes, in VHDL you do it manually

01:22 <modwizcode> When I read it originally I was assuming that balancing happens under the hood but then I realized that's like impossible and makes no sense.

01:22 <whitequark> yes. you have to think about it. it is not a happy set of thoughts, usually.

01:22 <whitequark> I don't know how one would debug this beyond "staring intensely" and that's even worse

01:22 <modwizcode> For it to be automatic it would have to understand what you want out of the logic in a way that I doubt is possible. Maybe attributes could coax it into doing it for you, but again yikes.

01:23 <whitequark> no no, it can be automatic.

01:23 <whitequark> consider this: the PNR tool must know which clocks are related, and PLLs can divide clocks

01:23 <modwizcode> Right but you have to tell it that you're using a PLL somehow or that something is a clock.

01:23 <whitequark> in order to correctly route a design that has a CDC between two related clocks without additional synchronization (ie no 2FFs everywhere) you need to be able to tell the PNR tool the relationships between clocks

01:23 <whitequark> yes. you can use a similar method to tell CXXRTL about this relationship/

01:24 <whitequark> *.

01:24 <whitequark> it could be attributes, or it could be a .sdc file, or it could be honestly whatever, the important part is actually scheduling according to that

01:24 <modwizcode> Yeah this is the part that my knowledge is pretty weak on.

01:25 <modwizcode> Oh wait.

01:25 <modwizcode> Okay I think what you said clicked a bit better there.

01:25 <whitequark> remember those edge detectors?

01:25 <modwizcode> unfortunately yes

01:25 <whitequark> you have to arrange the scheduling so that edge detectors for in_clk and out_clk trigger in the same delta cycle

01:25 <modwizcode> oh

01:25 <whitequark> (or so that the design behaves as-if they do)

01:26 <whitequark> that's it

01:26 <whitequark> very simple

01:26 <whitequark> "simple".
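
The delta-cycle problem discussed above can be illustrated with a small hand-written Python model. This is only a sketch, assuming Verilog-style scheduling where nonblocking updates land at the end of a delta cycle; the function name `rising_in_clk` and the three toy processes are hypothetical and do not reflect how any particular simulator is implemented. It shows why a register clocked by the divided clock samples a different value depending on whether the divider uses `=` or `<=`.

```python
def rising_in_clk(state, blocking_divider):
    """Model one rising edge of in_clk for three tiny processes.

    always @(posedge in_clk)  a <= a + 1;
    always @(posedge in_clk)  out_clk (= or <=) ~out_clk;
    always @(posedge out_clk) b <= a;
    """
    s = dict(state)
    nba = {"a": s["a"] + 1}      # nonblocking: applied at the end of the delta
    new_out = 1 - s["out_clk"]

    if blocking_divider:
        # Blocking '=': out_clk changes at once, so its rising edge is seen
        # in the same delta cycle, before a's nonblocking update lands.
        s["out_clk"] = new_out
        if new_out == 1:
            nba["b"] = s["a"]    # samples the old value of a
        s.update(nba)
    else:
        # Nonblocking '<=': out_clk changes together with a ...
        nba["out_clk"] = new_out
        s.update(nba)
        # ... and the out_clk edge is only seen one delta cycle later.
        if state["out_clk"] == 0 and new_out == 1:
            s["b"] = s["a"]      # samples the already-updated a
    return s


start = {"out_clk": 0, "a": 5, "b": 0}
same_delta = rising_in_clk(start, blocking_divider=True)
next_delta = rising_in_clk(start, blocking_divider=False)
print(same_delta["b"], next_delta["b"])
```

With the blocking divider both edge detectors fire in the same delta cycle and `b` captures the value `a` had before the edge; with the nonblocking divider `b` captures the value `a` takes after the edge.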

01:26 <modwizcode> there's a way to do this already then? (You'd think with me poking through the scheduling code properly I would know)

01:26 <modwizcode> but what I saw suggests not

01:27 <whitequark> no, I was saying "simple" sarcastically

01:27 <whitequark> it's one of the hardest problems I've encountered in EDA so far

01:27 <modwizcode> ohhh

01:27 <modwizcode> Yeah okay that's what I thought you were saying

01:28 <whitequark> I don't think it requires anything fundamentally new, it's a small open research problem

01:29 <modwizcode> the computed clock's edge detectors have to have a scheduling dependency on the statements that compute the divide clock right?

01:30 <modwizcode> And then anything that depends on that clock has to have a scheduling dependency placing it after that clock itself. Or something...

01:30 <modwizcode> *the edge detector itself

01:35 <whitequark> sssomething like that

01:35 <whitequark> but

01:36 <whitequark> note that when you have feedback arcs in a single-clock design, violating the scheduling dependencies isn't a big deal. you just want as few of them violated as possible because that improves performance.

01:36 <whitequark> however if you have feedback arcs with edge detector dependencies, you can't just do that

01:36 <whitequark> also, we can't just reject them either

01:36 <modwizcode> right that's basically what I was thinking about

01:37 <whitequark> because our approach to the feedback arc set problem is inexact

01:37 <modwizcode> If it were exact would it be solved?

01:38 <whitequark> if there was a way to say "never break these edges unless absolutely necessary" then yes

01:39 <modwizcode> okay I was only asking to validate that I was thinking about things the right way

01:40 slan has quit [Remote host closed the connection]

01:40 <modwizcode> but wait. if the edges ever break aren't you screwed? Because now that's not the same delta cycle?

01:40 <whitequark> yes, so you need to display a diagnostic

01:40 <whitequark> however, it might be possible to break a lot of *other* edges instead

01:41 <whitequark> since a FAS solution is not necessarily unique

01:41 <whitequark> the current algorithm heuristically computes an approximation to MFAS but with the edge detector dependencies you can no longer just use MFAS

01:42 <modwizcode> At some point I should read that paper I put it off bc it was fairly math syntaxy which I find difficult to parse.

01:42 <whitequark> it's kind of not all that readable yeah

01:42 <whitequark> like even if you're fine with mathy syntax

01:43 <modwizcode> I skimmed it and concluded that on its own I had no idea how I would know it's relevant at all to what CXXRTL does (i.e. that the FAS stuff is connected to anything meaningful)

01:45 <whitequark> well, a netlist is a graph that describes parallel computation, and a C++ function is serial, so you want to compute a sequence of nodes such that by the time you compute the outputs of a cell, all of the inputs have already been computed

01:45 <whitequark> a feedback arc is an input that is not computed by the time you are computing the outputs.

01:45 <whitequark> zero feedback arcs? one delta cycle. fewer feedback arcs? fewer delta cycles

01:46 <whitequark> note that FAS can only be used to reliably schedule netlists that have no logic loops

01:47 <whitequark> in other words, you can only use FAS to schedule a netlist if the consequence of computing the outputs when the inputs haven't been computed is benign, i.e. an extra delta cycle
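
The relationship between evaluation order, feedback arcs and delta cycles described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not CXXRTL's actual scheduler: the names `feedback_arcs`, `delta_cycles`, `CELLS` and `INITIAL` are hypothetical, the netlist has no logic loops, and a "delta cycle" is counted as one pass over all cells that changes at least one value.

```python
# Each cell: name -> (input names, function). "a" is a primary input.
CELLS = {
    "x": (("a",), lambda a: a + 1),
    "y": (("x",), lambda x: x * 2),
    "z": (("y", "x"), lambda y, x: y + x),
}
INITIAL = {"a": 1, "x": 0, "y": 0, "z": 0}


def feedback_arcs(order, cells):
    """List (driver, reader) pairs where the reader is evaluated first."""
    position = {name: i for i, name in enumerate(order)}
    arcs = []
    for reader in order:
        inputs, _ = cells[reader]
        for driver in inputs:
            # Primary inputs are not in `order`; they are always ready.
            if driver in position and position[driver] >= position[reader]:
                arcs.append((driver, reader))
    return arcs


def delta_cycles(order, cells, values, limit=100):
    """Evaluate cells in `order` until nothing changes; count changing passes."""
    values = dict(values)
    for passes in range(limit):
        changed = False
        for name in order:
            inputs, fn = cells[name]
            new = fn(*(values[i] for i in inputs))
            if new != values[name]:
                values[name] = new
                changed = True
        if not changed:
            return passes, values
    raise RuntimeError("did not settle; the netlist may contain a logic loop")


for order in (["x", "y", "z"], ["z", "y", "x"]):
    arcs = feedback_arcs(order, CELLS)
    passes, final = delta_cycles(order, CELLS, INITIAL)
    print(order, "feedback arcs:", len(arcs), "delta cycles:", passes, final)
```

Both orders settle to the same values; the order with no feedback arcs gets there in a single pass, while the reversed order needs extra passes. That is the benign "extra delta cycle" cost, and it only holds because this toy netlist has no logic loops.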

01:47 slan has joined #nmigen

01:47 <modwizcode> yeah that's basically my understanding of things so far.

01:48 <whitequark> netlists with logic loops (we are mostly interested in stable logic loops, i.e. latches) can behave unreliably, since we disregard the side effects of computing the outputs of them.

01:48 <whitequark> yeah

01:49 <modwizcode> this implies now that I realize it that there might be use for a false path type of indication in cxxrtl, since certain inputs connected to outputs might not care about the intrinsic race conditions. Although given the way cxxrtl works I'm not sure if that might horribly break things.

01:49 <whitequark> this already exists

01:49 <whitequark> check out the help text for black boxes

01:50 <modwizcode> I am not surprised to hear it

01:50 <modwizcode> I have ignored most of the black box stuff as I have not needed it yet

01:50 <whitequark> what it calls "sync outputs" is essentially equivalent to what you call "false path" here

01:50 <modwizcode> right

01:50 <whitequark> CXXRTL black boxes are kinda neat because you can have combinatorial feedback and still fit into one delta cycle

01:51 <modwizcode> Hmm

01:51 <whitequark> I'm not aware of any other simulator that can do it

01:51 <modwizcode> So you're saying if I put all the messy stuff into that it will work

01:51 <whitequark> I believe verilator would need at least two

01:51 <whitequark> well... not necessarily

01:51 <modwizcode> Uh correction then. it *might* work

01:51 <whitequark> black boxes are scheduled at the granularity of, well, black box

01:51 <whitequark> if you have some signal zipping back and forth between black box inputs and outputs, no dice

01:52 <whitequark> but having a single combinatorial "ack" is generally fine

01:53 <modwizcode> I know there's a comment in the code describing this (because I've read it) but the fact that "sync output" is combinational is definitely confusing.

01:53 <modwizcode> I understand *why* it's called that, but that doesn't make it any less confusing initially

01:54 <whitequark> er

01:54 <whitequark> it's not combinatorial

01:54 <whitequark> you mark the ack as a comb output. you mark (usually) all others as sync

01:54 <modwizcode> oh then I misunderstood

01:55 <whitequark> I mean it depends on the black box in question but if you need Wishbone with combinatorial ack, this is what you'd do

02:11 levi has joined #nmigen

03:28 <_whitenotifier> [nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/JtsTm

03:28 <_whitenotifier> [nmigen/nmigen] whitequark a2da34a - README: add ChipEleven as a sponsor.

03:29 <_whitenotifier> [nmigen/nmigen] github-actions[bot] pushed 1 commit to gh-pages [+0/-0/±13] https://git.io/JtsTO

03:29 <_whitenotifier> [nmigen/nmigen] whitequark 10ca62b - Deploying to gh-pages from @ a2da34a5bc5b5e1be8d0a94e579ab4ea5a876868 🚀

03:30 <cr1901_modern> >ChipEleven

03:30 <cr1901_modern> Huh, that's cool... we get a free tiny POWER CPU and free big POWER CPU

03:30 <modwizcode> neat

03:31 <cr1901_modern> (the tiny one being microwatt)

03:32 <tpw_rules> is that lkcl's work?

03:34 <modwizcode> I've never heard of them until now interesting

03:40 <sorear> no

03:41 electronic_eel has quit [Ping timeout: 240 seconds]

03:42 electronic_eel has joined #nmigen

03:53 Degi_ has joined #nmigen

03:54 Degi has quit [Ping timeout: 265 seconds]

03:54 Degi_ is now known as Degi

04:02 PyroPeter_ has joined #nmigen

04:05 PyroPeter has quit [Ping timeout: 246 seconds]

04:05 PyroPeter_ is now known as PyroPeter

05:48 emeb has quit [Quit: Leaving.]

06:17 emeb_mac has quit [Quit: Leaving.]

06:59 _whitelogger has joined #nmigen

07:00 Bertl_oO is now known as Bertl_zZ

07:14 futarisIRCcloud has quit [Quit: Connection closed for inactivity]

08:50 pftbest has quit [Remote host closed the connection]

09:04 JJJollyjim has quit [*.net *.split]

09:06 JJJollyjim has joined #nmigen

09:09 whitequark[m] has quit [Ping timeout: 244 seconds]

09:10 emily has quit [Ping timeout: 243 seconds]

09:10 blazra has quit [Ping timeout: 260 seconds]

09:10 JJJollyjim has quit [Ping timeout: 258 seconds]

09:10 vmedea[m] has quit [Ping timeout: 265 seconds]

09:10 jfng has quit [Ping timeout: 260 seconds]

09:10 cesar[m] has quit [Ping timeout: 268 seconds]

09:34 jjeanthom has joined #nmigen

09:36 pftbest has joined #nmigen

09:39 whitequark[m] has joined #nmigen

09:42 emily has joined #nmigen

09:46 cesar[m] has joined #nmigen

09:55 emily has quit [Ping timeout: 240 seconds]

09:55 cesar[m] has quit [Ping timeout: 240 seconds]

09:56 whitequark[m] has quit [Ping timeout: 260 seconds]

10:19 cesar[m] has joined #nmigen

10:39 vmedea[m] has joined #nmigen

10:39 emily has joined #nmigen

10:39 whitequark[m] has joined #nmigen

10:39 JJJollyjim has joined #nmigen

10:39 blazra has joined #nmigen

10:39 jfng has joined #nmigen

10:42 futarisIRCcloud has joined #nmigen

10:58 bvernoux has quit [Read error: Connection reset by peer]

11:01 jjeanthom has quit [Ping timeout: 244 seconds]

11:06 jjeanthom has joined #nmigen

11:23 jjeanthom has quit [Ping timeout: 258 seconds]

11:37 pftbest has quit [Ping timeout: 244 seconds]

11:48 jjeanthom has joined #nmigen

12:02 pftbest has joined #nmigen

12:30 <lkcl> tpw_rules: no, they're... a team that betrayed our trust by screwing us over when speaking to investors

12:33 <lkcl> they've caused an awful lot of damage by pathologically not listening to feedback, yet truly and firmly believe that they are "good".

12:33 <lkcl> it's a dangerous combination.

12:34 <lkcl> one that, sadly, is quite common

12:34 <d1b2> <dub_dub_11> :( what was the feedback concerning

12:35 <lkcl> 5 people advised them how to speak to investors, regarding the software license, to keep it libre rather than non-free (non-commercial)

12:36 <lkcl> it's a long list and much of it is confidential, i apologise.

12:36 <lkcl> bottom line they've pissed off a *lot* of very powerful people

12:37 <d1b2> <dub_dub_11> Ah right

12:37 <d1b2> <dub_dub_11> I think I vaguely remember this topic

12:37 <d1b2> <dub_dub_11> What is the difference between libre and non-free (non-commercial)

12:38 <lkcl> essentially, "we develop everything publicly and transparently, all code and all technical discussions are available in real-time"

12:38 <lkcl> vs

12:38 <lkcl> "we keep everything closed, but promise *eventually* to release source code, but when we do you can't use it for commercial purposes"

12:39 <d1b2> <dub_dub_11> Ah

12:39 <d1b2> <dub_dub_11> So libre is actually open source Vs ... Not

12:39 <lkcl> for a while they tried to take libre-soc LGPLv3-licensed source code and slap a non-commercial license on it without permission

12:39 <lkcl> aka "Fake Open Source", yes.

12:40 <lkcl> anyway - that's enough of that, if that's ok dub. this is nmigen.

12:40 <d1b2> <dub_dub_11> I'm not an expert on licensing but I remember enough to know that GPL states you can't change the licence terms

12:40 <d1b2> <dub_dub_11> Yeah you're right

12:46 Bertl_zZ is now known as Bertl

12:52 jjeanthom has quit [Ping timeout: 264 seconds]

13:00 <lkcl> correct: and, also, you cannot take someone else's copyrighted material (jointly or otherwise, majority or otherwise) and slap a different license on it.

13:07 chipmuenk has joined #nmigen

13:20 nfbraun has joined #nmigen

13:49 <nfbraun> Is there a way to use input delay elements (e.g. IDELAY on Xilinx 7series) from nMigen proper (without instantiating vendor-specific primitives)?

13:50 <nfbraun> Seems more or less required for DDR input...

13:51 <mwk> not at the moment

13:51 <mwk> they're in general too hardware-specific to be usefully specifiable in generic code

13:53 <nfbraun> For the runtime-configurable variant I'd agree, but the fixed variant (a fixed delay specified at bitstream generation time) should be doable?

13:54 <nfbraun> Then again, I never really used other FPGA families...

13:55 <mwk> like, just specify the delay in ns? I suppose that would be doable, except not all FPGAs can actually do that

13:55 <mwk> since it requires some calibration scheme in hw

13:58 <mwk> and even on 7series, getting it working is nontrivial

13:58 <mwk> since you have to set up IDELAYCTRL and need a 200MHz reference clock for that

13:58 <mwk> doing that automatically inside a synthesis tool? not going to happen

14:01 <nfbraun> I guess that is true, the user would probably need to take care of the reference clock manually. Which kind of negates the whole point of having this integrated.

14:02 <mwk> I'm thinking it could be integrated, just not in a generic way

14:03 <mwk> like, the platform code already instantiates vendor primitives for IOs, you could make it do more with vendor-specific parameters

14:03 <mwk> not sure if it's a good idea though

14:03 <d1b2> <dub_dub_11> the Libraries Guide says the primitives need to be instantiated manually, and can't be inferred so basically yeah it would have to be manually coded in specific to the platform

14:04 <d1b2> <dub_dub_11> and there's quite a long list of primitives that can only be instantiated 😅

14:04 <mwk> what xilinx tools can infer or not is beside the point here

14:04 <mwk> nmigen already has generic support for DDR, which likewise has to be manually instantiated in verilog

14:05 <mwk> the question is whether the relevant primitives are consistent enough between various FPGA types that you can actually feasibly write platform-independent code

14:06 <nfbraun> Yes. I was thinking something along the lines of platform.request("foo", xdr=2, delay=1200).

14:06 <mwk> eg. IMO it'd be nice if stuff like xilinx BUFGCE was instantiable by an nmigen primitive

14:06 <d1b2> <dub_dub_11> yes, but the generic support is implemented by manually inferring a primitive (just in the platform script not the user)

14:07 <d1b2> <dub_dub_11> yeah that's what I mean, there are other things and it's hard to decide what should be left to user instantiation and what nmigen could support

14:07 <d1b2> <dub_dub_11> like a BUFGCE, or DSP, or BRAM

14:08 <d1b2> <dub_dub_11> or PLLs which is sort of being worked on now I think

14:11 <nfbraun> Thanks for the answer. I have it running with Instance()s and will stick to that for the time being.
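
For the fixed-delay case discussed above, turning a requested delay into a tap setting is simple arithmetic. The sketch below is an illustration only: `idelay_tap` is a hypothetical helper, not part of nMigen, and it assumes the `delay=1200` in the proposed call means picoseconds. It uses the 7-series figures of 32 taps and an average tap delay of 1/(32 × 2 × F_REF), and it ignores the fixed insertion delay of the primitive, so treat the result as approximate.

```python
def idelay_tap(delay_ps, refclk_hz=200e6, taps=32):
    """Return (tap index, achieved delay in ps) closest to delay_ps.

    Assumes the 7-series average tap delay of 1 / (32 * 2 * F_REF),
    i.e. 78.125 ps with a 200 MHz IDELAYCTRL reference clock.
    """
    tap_ps = 1e12 / (32 * 2 * refclk_hz)
    tap = round(delay_ps / tap_ps)
    if not 0 <= tap < taps:
        raise ValueError(
            f"{delay_ps} ps is outside the 0..{(taps - 1) * tap_ps} ps range")
    return tap, tap * tap_ps


tap, achieved = idelay_tap(1200)
print(tap, achieved)
```

The returned tap index is the kind of value that would end up in a fixed-mode delay parameter of a manually instantiated primitive; the reference clock and IDELAYCTRL still have to be set up by the user, as noted in the discussion.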

14:12 <d1b2> <dub_dub_11> (which btw mwk if there are extra primitives for the Xilinx platform I could help with adding support for I'd be happy to)

14:12 <mwk> not that I know of

14:13 <mwk> it'd be nice to have a set of nice nmigen wrappers for various platforms, and for I/O stuff perhaps even integrate it with usual I/O instantiation, but we don't have anything like that yet

14:13 <d1b2> <dub_dub_11> that's kinda what I meant yeah

14:14 <nfbraun> mwk: FWIW, I tested your "unify Xilinx platforms" PR (#563) on Zynq and it seemed to work, at least until I ran into the issue that I needed delays.

14:15 <mwk> well it doesn't really change anything for Zynq

14:16 <mwk> ... I mean unless you want to synth with ISE for some strange reason

14:16 <nfbraun> No :)

14:18 <d1b2> <dub_dub_11> the two I've been playing around with recently are BRAM and DSP. You can get BRAM usage easily enough with generated code but when I tried to use the pipeline registers (which requires instantiation) and initialise the contents it was a real headache so I have a wrapper that might help with that usecase (maybe with some cleaning up).

14:19 <mwk> huh, pipeline registers require instantiation? that's news to me

14:20 <d1b2> <dub_dub_11> similar issues with DSP blocks you might remember me asking about, where I wanted a multiply-accumulate but the generated code meant I was getting the accumulator in fabric. So likewise I'd be happy to make a wrapper for instantiating DSP (which seems like a much bigger task than BRAM)

14:20 <mwk> mhm

14:21 <mwk> ... of course the interesting part with both of those two is not actually emitting the instance

14:21 <mwk> but also getting the damn thing simulatable

14:23 <nfbraun> BTW: how usable is the Symbiflow toolchain on Zynq? Is this worth trying out already?

14:29 <d1b2> <dub_dub_11> maybe the pipeline registers can be inferred in Vivado? I'm working with V5 and ISE, and I definitely remember somewhere saying you needed instantiation

14:29 emeb_mac has joined #nmigen

14:32 <d1b2> <dub_dub_11> ah here we go: UG190, BRAM Library Primitives: "Some block RAM attributes can only be configured using one of these primitives (e.g., pipeline register, cascade, etc.)"

14:32 <d1b2> <dub_dub_11> I see what you mean wrt simulation though, as I don't know to what extent if at all you can use the vendor sim model

14:45 zignig has quit [Quit: Lost terminal]

14:53 _whitelogger has joined #nmigen

14:55 <d1b2> <dub_dub_11> that doesn't necessarily mean it is accurate tbf

15:02 <modwizcode> I'm fairly certain you have to manually specify those

15:02 <modwizcode> All the code I've seen goes through that process.

15:05 <mwk> that's... interesting

15:06 <modwizcode> Is it?

15:06 <modwizcode> I'

15:06 <modwizcode> oops

15:07 <mwk> sounds like kind of an easy thing to implement and I'm surprised they didn't do that

15:07 <mwk> well, relatively easy

15:07 <d1b2> <dub_dub_11> yeah it doesn't seem hard to think of an inference template

15:08 <modwizcode> I think their reasoning was probably that other parameters are not so easy to infer, so rather lets not infer only some of them

15:08 <d1b2> <dub_dub_11> basically just any synchronous read lol

15:08 <modwizcode> It's definitely possible the docs are wrong though, maybe I should test it. Although I'm not 100% sure how I'd verify :p

15:10 <d1b2> <dub_dub_11> I have tried on ISE at least, when doing it by inference and using PlanAhead to look at the layout it put the output registers in fabric and the highest frequency I could hit was like 330MHz

15:10 <d1b2> <dub_dub_11> same design but instantiated BRAM with the pipeline register option, easily met timing at 450MHz which is the BRAM switching limit

15:10 <modwizcode> I can check in vivado

15:11 <modwizcode> where is the spec for the limit defined btw?

15:11 <d1b2> <dub_dub_11> uh probably ug473

15:11 <d1b2> <dub_dub_11> wait no

15:11 <d1b2> <dub_dub_11> it will be the dc and switching datasheet

15:12 <d1b2> <dub_dub_11> https://www.xilinx.com/support/documentation/data_sheets/ds187-XC7Z010-XC7Z020-Data-Sheet.pdf zynq-7000 right

15:13 <modwizcode> I don't remember how Xilinx licenses work. I wonder if I can make a zynq-7000 design

15:14 <d1b2> <dub_dub_11> well what part/board do you have

15:14 <modwizcode> I don't remember lol

15:14 <modwizcode> I'll have to check

15:14 <d1b2> <dub_dub_11> 😄

15:19 <d1b2> <dub_dub_11> https://github.com/H-S-S-11/numerically-controlled-oscillator you might be able to adapt nco_lut, nco_lut_pipelined, tone_synth to something you can use to compare the inferred vs pipelined lut. Sorry for the mess... I've promised myself I will organise it at some point

15:51 GenTooMan has quit [Quit: Leaving]

15:54 <modwizcode> Hmm

15:56 <d1b2> <dub_dub_11> (ofc if it doesn't make any sense which is highly possible then just ignore it 😅)

15:56 <modwizcode> I'm just trying to understand what's going on in this code XD

15:57 <d1b2> <dub_dub_11> yeah I should really comment everything 😐

15:57 <modwizcode> I mean it's not awful it's just a new codebase to me

15:58 <d1b2> <dub_dub_11> nco_lut is a sine wave generator so it uses a large BROM

15:58 <modwizcode> nco_lut doesn't instantiate the BROM? it relies on inference from the switch statement?

15:59 <d1b2> <dub_dub_11> correct

15:59 <modwizcode> we know it infers correctly as is?

16:00 <d1b2> <dub_dub_11> it infers BRAM, initialises it, all great... except, even though the output is assigned in the sync domain, it won't use the output registers

16:00 pftbest has quit [Remote host closed the connection]

16:01 <d1b2> <dub_dub_11> so when you look at the resource usage/layout viewer you will see that sin_o is in FFs in fabric

16:02 pftbest has joined #nmigen

16:02 <modwizcode> Okay

16:02 <d1b2> <dub_dub_11> and if you try stretch the timing constraints, I would expect you to see the path from BRAM inputs to the sin_o FFs become the critical path

16:03 <modwizcode> I will try to instantiate this in a design. I'm probably going to convert it to verilog and paste it into the IDE :p

16:03 <modwizcode> but I will do that in a few hours because I have other things to do at the present moment

16:03 <d1b2> <dub_dub_11> fair enough

16:03 <d1b2> <dub_dub_11> tone_synth feeds the output into a pwm module but it's not really important where it goes

16:04 <d1b2> <dub_dub_11> it's probably a better test without the pwm tbh

16:04 <modwizcode> Yeah I think I'll bypass that and just focus on nco_lut and nco_lut_pipelined directly

16:04 <modwizcode> why does one of them instantiate a PLL and the rest don't, out of curiosity

16:05 <d1b2> <dub_dub_11> uhh I was playing around 😛

16:05 <d1b2> <dub_dub_11> I've not actually run any of them at 450MHz, only put that as the timing constraint and done STA

16:07 pftbest has quit [Ping timeout: 246 seconds]

16:08 <d1b2> <dub_dub_11> the bram wrappers are in bram_instantiation

16:12 <d1b2> <dub_dub_11> they're very much a "hacked together to do what I needed at the time" version though, and would benefit from me spending more time to flesh out things like the initialisation data generator

16:33 <agg> whitequark: if you get a few minutes, could i ask for a quick look at https://github.com/nmigen/nmigen/pull/578 please? I hope it should be pretty straightforward

16:42 jjeanthom has joined #nmigen

16:59 jjeanthom has quit [Ping timeout: 246 seconds]

17:19 jjeanthom has joined #nmigen

17:25 jjeanthom has quit [Ping timeout: 260 seconds]

18:06 <whitequark> agg: looking now

18:06 <_whitenotifier> [nmigen] whitequark closed pull request #578: ECP5: Replicate OE FF for each output bit - https://git.io/JtJci

18:06 <_whitenotifier> [nmigen/nmigen] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/JtsKY

18:06 <whitequark> agg: thank you!

18:06 <_whitenotifier> [nmigen/nmigen] adamgreig 6ce2b21 - vendor.lattice_ecp5: replicate OE signal for each output bit.

18:07 <agg> thanks :)

18:07 <agg> i am slowly fuzzing out more ECP5 DSP bits, hopefully have nextpnr doing a working MAC sometime soon

18:07 * agg swaps nmigen back to master branch

18:07 <_whitenotifier> [nmigen/nmigen] github-actions[bot] pushed 1 commit to gh-pages [+0/-0/±13] https://git.io/JtsKZ

18:07 <_whitenotifier> [nmigen/nmigen] whitequark 403c433 - Deploying to gh-pages from @ 6ce2b21e196a0f93b82748ed046098331d20b3bf 🚀

18:15 <whitequark> wonderful

18:26 <miek> agg: ooh, nice!

18:34 <awygle> whitequark: re: that PR - the only way to do xdr=4 with dir=io on the ECP5 is using the DQS modules for memory, according to daveshah. is that something nmigen should deal with, either by throwing some specific error or by trying to autogen the memory stuff?

18:34 <modwizcode> agg: what's the workflow you're using to figure out the bits?

18:34 <awygle> Probably not the latter one, that seems Difficult

18:35 <daveshah> You can't really autogen the memory stuff as you need to have a DQS

18:35 <agg> modwizcode: the fuzzing stuff is best described here https://prjtrellis.readthedocs.io/en/latest/db_dev_process/overview.html

18:35 <modwizcode> ty

18:35 <daveshah> And it's a very ECP5 specific structure so it doesn't really make sense to have something like that as a generic thing

18:35 <agg> the details have gotten a bit messy

18:36 <awygle> Yeah that's what I figured

18:36 <awygle> I think putting an exception in the vendor code makes sense

18:37 <agg> does xdr=4 with fixed i or o work?

18:37 <awygle> You can do xdr=2 dir=io, right?

18:37 <awygle> Yes

18:37 <agg> yea, that's what my PR just fixed

18:37 <agg> (and i have working on hardware with npnr)

18:38 <awygle> Okay cool

18:38 <agg> wait, sorry, no, I have working xdr=1 on hardware

18:38 <agg> I did synthesise and run xdr=2

18:38 <agg> but the design doesn't need xdr=2 so I didn't completely check it worked

18:39 <agg> (previously xdr=1 io didn't work from nmigen)

18:39 <agg> (for >1 bit wide resources)

18:40 <awygle> Right, I read the PR. Good catch+fix

18:47 Bertl is now known as Bertl_oO

18:53 FFY00 has quit [Ping timeout: 260 seconds]

18:54 FFY00 has joined #nmigen

18:54 ChipEleven has joined #nmigen

18:56 <ChipEleven> Hello. d1b2, I should clarify from earlier(https://freenode.irclog.whitequark.org/nmigen/2021-01-23#28933194;) that our CPU without FPU, and Vector support will be released under an OSI approved license

18:57 <agg> (d1b2 is a Discord bridge, so the specific user is the name in <> at the start of the message)

18:59 <modwizcode> @dub_dub_11 uhh so weirdly running the generated verilog in the simulator seems to simulate very differently. Which doesn't make sense to me at all...

18:59 <d1b2> <dub_dub_11> oh are you using the vendor primitives to simulate

18:59 <ChipEleven> agg, thanks. Noted.

18:59 <modwizcode> no

18:59 ChipEleven has quit [Quit: Connection closed]

19:00 <modwizcode> I'm just testing the one that does inference

19:00 <d1b2> <dub_dub_11> ah

19:00 <d1b2> <dub_dub_11> wdym by differently?

19:00 <modwizcode> I was checking sim before I synthesized and such to make sure it's the same

19:00 <modwizcode> uh by differently I mean clearly wrongly hmm ss might help

19:00 <d1b2> <dub_dub_11> ah, have you got the right radix

19:00 <modwizcode> Actually yeah

19:00 <modwizcode> I was going to uh check that

19:01 <d1b2> <dub_dub_11> think it defaults to signed

19:01 <d1b2> <dub_dub_11> but gtkwave or we will do unsigned

19:01 <d1b2> <dub_dub_11> sooo it will look kinda weird

19:01 <modwizcode> Yeah you're right I'm just being dumb

19:02 <modwizcode> actually it looks like signed is the one that looks right in vivado

19:03 <d1b2> <dub_dub_11> yeah that should be the case

19:04 <modwizcode> Weirdly now GTK wave's looks wrong

19:04 <modwizcode> OH

19:05 <d1b2> <dub_dub_11> oh it's a pain on gtkwave

19:05 <modwizcode> I know why. god I'm stupid. I was originally looking at the phi signal by mistake. Which is why I saw a difference, but I was looking at the right signal in vivado

19:05 <d1b2> <dub_dub_11> if you want to do an analog wave you have to click signed decimal, then go back and click analogue

19:05 <d1b2> <dub_dub_11> ah

19:05 <modwizcode> that is

19:05 <modwizcode> very dumb

19:15 richbridger has joined #nmigen

19:16 <modwizcode> @dub_dub_11 how do I know if it's a pipelined blockram in the viewer? I can't make heads or tails of it

19:17 <d1b2> <dub_dub_11> if you click on it you should see the attributes of the primitive, including the DO_reg or whatever it is called

19:17 <modwizcode> ok I tried to do that but I didn't see any related properties hmm

19:17 <modwizcode> I see DOA_REG=1, DOB_REG=0

19:17 <d1b2> <dub_dub_11> DOA_REG

19:18 <d1b2> <dub_dub_11> that's the one

19:18 <d1b2> <dub_dub_11> yeah so that's a pipelined bram

19:18 <modwizcode> Yeah so vivado inferred a pipelined bram

19:18 <d1b2> <dub_dub_11> oh interesting

19:19 <d1b2> <dub_dub_11> this was with nco_lut.py?

19:19 <modwizcode> yes

19:19 <d1b2> <dub_dub_11> neat

19:19 <modwizcode> I'm trying to see what it decided as fmax

19:19 <d1b2> <dub_dub_11> yeah that's the other thing is to push that, in theory if it's pipelined and there's nothing else in the design you should be able to hit the switching limit of the bram

19:20 <modwizcode> do I just have to calculate this from the slack? It seems vivado doesn't dump out an estimated fmax

19:21 <d1b2> <dub_dub_11> well you can get an idea of the delay paths looking through the timing report

19:21 <d1b2> <dub_dub_11> but you won't know the "true" fmax unless you keep decreasing the period constraint until it fails
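
The slack-based estimate asked about above is a one-line calculation. A minimal sketch, assuming a single clock and that the timing report gives the worst negative slack (WNS) in nanoseconds; `fmax_from_slack` is a hypothetical helper and the numbers in the usage lines are made up for illustration.

```python
def fmax_from_slack(period_ns, wns_ns):
    """Estimate fmax in MHz from the constrained period and worst slack.

    The critical path takes (period - slack) ns, so positive slack means
    the design could run faster and negative slack means it must run slower.
    """
    critical_path_ns = period_ns - wns_ns
    if critical_path_ns <= 0:
        raise ValueError("slack cannot exceed the clock period")
    return 1e3 / critical_path_ns


print(fmax_from_slack(2.5, 0.5))    # met timing with slack to spare
print(fmax_from_slack(2.0, -0.5))   # failed timing
```

This only says how fast the current placement could run; as noted above, the tools stop optimising once the constraint is met, so finding the real limit still means tightening the period constraint until timing fails.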

19:23 <modwizcode> And then there's the effort setting for the place and route (Implementation as vivado calls it)

19:43 <modwizcode> Well I couldn't do 450MHz at least on the default effort settings

20:05 <d1b2> <dub_dub_11> what's your part

20:05 <modwizcode> I just kinda picked one at random maybe I should have asked what you were using

20:05 <modwizcode> I went with a xc7z010clg400 speed grade -1

20:05 <d1b2> <dub_dub_11> the limit will be the switching limit for your part and speed grade

20:06 <modwizcode> I assume I probably need a better part or speedgrade but I was hoping it'd be achievable without actually picking one

20:06 <modwizcode> what did you use?

20:06 <d1b2> <dub_dub_11> https://www.xilinx.com/support/documentation/data_sheets/ds187-XC7Z010-XC7Z020-Data-Sheet.pdf check here

20:07 <d1b2> <dub_dub_11> for speed grade -1, the BRAM fmax is 388M

20:08 <d1b2> <dub_dub_11> I have a Virtex 5 speed grade -1, for which the limit is 450

20:09 <modwizcode> ah

20:09 <modwizcode> what is block ram cascade anyway

20:09 <d1b2> <dub_dub_11> using multiple primitives with them connected using dedicated column routing

20:09 <d1b2> <dub_dub_11> as opposed to using the fabric routing

20:10 <d1b2> <dub_dub_11> in order to act as a larger bram

20:10 <modwizcode> ah

20:14 <slan> What would be the best approach to access SDRAM in arty with nmigen? Is it litedram or custom tcl injection with a vivado-generated MIG, or?

21:06 <modwizcode> @dub_dub_11 (does this actually ping? do I need the @ to ping?) did you want me to test anything else now?

21:09 <d1b2> <dub_dub_11> yeah it does ping, I think if you can hit 388 or thereabouts that basically confirms it's inferred pipeline registers

21:12 <modwizcode> I can hit 450 (I switched to some random Virtex 7 part though I don't have Virtex 5 parts)

21:12 <modwizcode> It can even do better

21:12 <modwizcode> So I think that confirms it.

21:13 <modwizcode> I should try forcing it to not infer the register but I'm not 100% sure how to do that right now

21:13 <d1b2> <dub_dub_11> is that the bram fmax?

21:13 <modwizcode> uhh I don't know and I doubt it, I tested something else that confirms that we're there.

21:14 <modwizcode> The other test we did was close to the BRAM fmax (when I specced in that Zynq part) which seemed right

21:14 <d1b2> <dub_dub_11> if you send it through some intermediate signals, maybe into a higher level module that might do it

21:14 <modwizcode> I think it might try to be too clever

21:15 <modwizcode> The tool you're using doesn't infer it right? Are you using ISE?

21:17 <d1b2> <dub_dub_11> yeah, ISE

21:17 <modwizcode> Yeah they must have changed the inference.

21:18 <modwizcode> probably doesn't help you though if you can't use Vivado (are these supported by any open source workflows yet?)

21:18 <d1b2> <dub_dub_11> nah and there's not much point really, they're hardly common

21:19 <modwizcode> The Virtex 5 is the older version of the Virtex 7 right? (I can't tell if it's a speced down or just older)

21:26 <d1b2> <dub_dub_11> older yes, mostly the same architecture iirc. number is the generation

21:31 lkcl has quit [Ping timeout: 260 seconds]

21:34 <modwizcode> Ahh okay

21:45 lkcl has joined #nmigen

22:38 FFY00 has quit [Remote host closed the connection]

23:14 chipmuenk has quit [Quit: chipmuenk]

23:54 GenTooMan has joined #nmigen