##openfpga on 2018-12-05 — irc logs at freenode.irclog.whitequark.org

00:05 <whitequark> ok, one bug fixed, one more left

00:07 <pie__> qu1j0t3, that would be tragic

00:07 <qu1j0t3> pie__: :)

00:09 genii has quit [Remote host closed the connection]

00:26 <whitequark> daveshah: it is DONE

00:37 <whitequark> https://github.com/YosysHQ/yosys/pull/717

00:37 <whitequark> daveshah: ^

00:38 <whitequark> "write your own logic optimization program", he said.

00:38 <whitequark> "we are not a billion dollar company", he said.

00:38 <whitequark> well I am not one either :P

00:44 futarisIRCcloud has joined ##openfpga

01:13 <adamgreig> woo! first light on my first fpga pcb

01:14 <adamgreig> and my stupid bodged programmer worked first time too

01:15 <whitequark> nice!!!

01:16 <adamgreig> also the phy has link so I can't be far off blatting network packets :p

01:16 <adamgreig> https://photos.app.goo.gl/S3ob94rCDfYjrPXV8

01:19 <swetland> nice!

01:20 <swetland> I survived kicad schematic creation yesterday, but still need to actually lay out a PCB and send it out... https://pbs.twimg.com/media/DtkQ02BUcAA34cm.jpg:large

01:21 <adamgreig> i sent these to jlcpcb and they got the boards to me in one week, incredible

01:21 <whitequark> is that a switch?

01:21 <adamgreig> yes but not an ethernet switch

01:21 <whitequark> oh

01:21 <whitequark> what kinda switch

01:21 <adamgreig> https://photos.app.goo.gl/dfY9MSRnjg8V2P7X9

01:21 <adamgreig> don't really know yet

01:21 <adamgreig> tbc

01:21 <adamgreig> going to see how far i can push ice40's pseudo-lvds down utp

01:21 <whitequark> that's a lot of FPGAs

01:21 <adamgreig> with some sort of 8b10b and etc

01:21 <adamgreig> i want to make a circuit switched network

01:21 <whitequark> uhhhhh

01:21 <adamgreig> yes yes "not far"

01:21 <whitequark> ice40 cannot do clock recovery

01:21 <adamgreig> well

01:22 <whitequark> it's pointless to do 8b10b for the most part

01:22 <adamgreig> I was hoping to use ddr on the gpio, ice40 at 100MHz, and the data clock is less

01:22 <whitequark> I mean, you've run at least two pairs to each port, right

01:22 <adamgreig> so you can oversample

01:22 <adamgreig> yea, there's a tx and an rx pair on each port

01:22 <whitequark> that sounds like it'll fail but I'm curious.

01:22 <swetland> could do a dedicated diffpair clock and one or more data lanes then, no?

01:22 <swetland> similar to MIPI CSI

01:23 <adamgreig> (and power on the other pairs)

01:23 <whitequark> yeah that's what I would do

01:23 <whitequark> clock pair

01:23 <whitequark> could still do half-duplex I guess

01:23 <adamgreig> not easily; the ice40 lvds is hard wired to tx or rx

01:23 <whitequark> oh right

01:23 <whitequark> ok

01:23 <adamgreig> you don't reckon you could do cdr with 4x oversampling?

01:24 <adamgreig> not looking to push bandwidth or distance records here really

01:24 <adamgreig> already not going to have equalisation and the diff voltage is small too

01:24 <whitequark> i mean... with that level of oversampling, you could run, like, uart

01:24 <adamgreig> well sure :p
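The "with 4x oversampling you could run uart" point can be illustrated with a minimal software sketch. It assumes 8N1 framing (one start bit, eight data bits LSB first, one stop bit) and an ideal channel; the function names are hypothetical, and this is not adamgreig's design, just one way to recover bytes without clock recovery by finding the start-bit edge and sampling mid-bit.

```python
def uart_frame(byte):
    """8N1 frame: start bit (0), eight data bits LSB first, stop bit (1)."""
    return [0] + [(byte >> k) & 1 for k in range(8)] + [1]


def oversample(bits, factor=4):
    """Model a receiver that samples every line bit `factor` times."""
    return [bit for bit in bits for _ in range(factor)]


def uart_receive(samples, factor=4):
    """Recover bytes from an oversampled line that idles high."""
    received = []
    i = 1
    while i < len(samples):
        # A 1 -> 0 transition while idle marks the beginning of a start bit.
        if samples[i - 1] == 1 and samples[i] == 0:
            mid = i + factor // 2      # roughly the middle of the start bit
            last = mid + 9 * factor    # roughly the middle of the stop bit
            if last >= len(samples):
                break
            frame = [samples[mid + n * factor] for n in range(10)]
            if frame[0] == 0 and frame[9] == 1:
                received.append(sum(bit << k for k, bit in enumerate(frame[1:9])))
                i = last               # skip the frame so data edges are ignored
        i += 1
    return received


line = [1, 1] + uart_frame(0x55) + uart_frame(0xA3) + [1, 1]
recovered = uart_receive(oversample(line, 4), 4)
```

With only four samples per bit the sampling point can be up to a quarter of a bit away from the true centre, which is part of why a low oversampling ratio limits the usable data rate.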

01:24 <whitequark> what's the point in 8b10b if you're not actually doing clock recovery?

01:25 <whitequark> is it capacitively coupled even?

01:25 <adamgreig> guess it still gives you some sort of framing

01:25 <adamgreig> no, dc

01:25 <whitequark> so

01:25 <whitequark> you don't need dc balance

01:25 <whitequark> you don't need guaranteed transitions

01:25 <whitequark> you just use it as a framing with 20% overhead

01:25 <whitequark> this is literally uart but more complex

01:25 <adamgreig> you can see how that's appealing, though?

01:25 <whitequark> no?

01:26 <adamgreig> fun to write an 8b10b enc/dec

01:26 <whitequark> that's just a LUT

01:26 <adamgreig> hmm

01:26 <whitequark> i'd probably put it into a BRAM, even

01:27 <adamgreig> well in any event the objective here was strictly to make some fpgas and experiment with connecting them

01:27 <adamgreig> so really anything goes

01:27 <swetland> did that on ZYBO to drive HDMI. sadly without OSERDES you can't really get the data you need for something like that

01:28 <adamgreig> anyway uart also has 20% overhead ;)

01:28 <adamgreig> if I'm going to transmit ten bits for each eight data bits, 8b10b seems like it'll be more fun than a start and stop bit
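The "20% overhead" figure quoted for both schemes can be checked with a small calculation. This sketch assumes 8N1 UART framing (start bit, eight data bits, stop bit); the helper name is made up for illustration.

```python
def line_overhead(data_bits, line_bits):
    """Fraction of transmitted bits that carry no payload."""
    return (line_bits - data_bits) / line_bits


uart_8n1 = line_overhead(8, 1 + 8 + 1)  # start + 8 data + stop
code_8b10b = line_overhead(8, 10)       # 8 data bits sent as a 10-bit symbol
```

Both come out at 0.2, so on raw efficiency alone the two framings are equivalent; the differences discussed in the channel are in complexity and alignment behaviour.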

01:29 <whitequark> uart gives you higher clock rate

01:29 <whitequark> and less device utilization

01:29 <whitequark> with everything else being equal

01:30 <swetland> I think the only advantage to a symbol based system is that if you plug together two sides where one is constantly chattering you might avoid character mis-alignment

01:31 <whitequark> indeded

01:31 <whitequark> *indeed

01:32 <adamgreig> honestly I'd do it just because I've implemented uarts in fpgas before

01:32 <adamgreig> step one is the ethernet side anyway

01:32 <swetland> if you want to run 100Mbps over a reasonable distance, ethernet PHYs are about $1, and RJ45 + Magnetics are about $4 (qty 1), and RMII is a 2bit/clk 50MHz interface, very easy to talk to with FPGAs

01:33 <adamgreig> totally, I already have ethernet on this for "uplink"

01:33 <adamgreig> the objective for the other side is having a synchronised system clock and circuit switched data though

01:33 <adamgreig> which okay you could just send udp packets and maybe even use ptp

01:34 <swetland> yeah, there is plenty of knowledge about how to do clock sync

02:02 pie___ has joined ##openfpga

02:05 pie__ has quit [Ping timeout: 268 seconds]

02:14 egg|egg is now known as egg|zzz|egg

02:28 azonenberg_work has quit [Ping timeout: 245 seconds]

02:37 unixb0y has quit [Ping timeout: 268 seconds]

02:38 unixb0y has joined ##openfpga

02:45 <whitequark> siiiiigh

02:45 <whitequark> so i'm gonna write a techmapper i think.

03:06 Miyu has quit [Ping timeout: 272 seconds]

03:11 catplant has joined ##openfpga

03:33 catplant has quit [Ping timeout: 250 seconds]

03:59 rohitksingh_work has joined ##openfpga

04:00 Bike has quit [Quit: Lost terminal]

04:08 prpplague has joined ##openfpga

04:09 <prpplague> anyone know if the details for orconf2019 have been announced?

04:20 catplant has joined ##openfpga

04:37 catplant has quit [Ping timeout: 250 seconds]

04:45 azonenberg_work has joined ##openfpga

04:53 emeb has quit [Quit: Leaving.]

05:07 <whitequark> daveshah: lmao what the fuck

05:07 <whitequark> naive techmapping: 51 LUT

05:07 <whitequark> naive techmapping followed by my opt_lut: 18 LUT

05:07 <whitequark> abc: ............. 17 LUT

05:08 <whitequark> this isn't even in C, this mostly just uses Yosys techmap pass...

05:16 azonenberg_work has quit [Ping timeout: 250 seconds]

05:17 <swetland> ooh, I need to give this a try. yosys is using 60% more LUTs than icecube2

05:19 jevinskie has joined ##openfpga

05:19 <whitequark> swetland: grab my other PR

05:19 <whitequark> and try doing synth_ice40 -relut

05:20 <swetland> 717?

05:20 <whitequark> 717?

05:20 <whitequark> oh yeah

05:20 <whitequark> that one

05:20 jevinski_ has quit [Ping timeout: 268 seconds]

05:26 _whitelogger has joined ##openfpga

05:26 jevinski_ has joined ##openfpga

05:26 <swetland> ERROR: timing analysis failed due to presence of combinatorial loops, incomplete specification of timing ports, etc.

05:26 genii has joined ##openfpga

05:26 <swetland> w/ tot+717 (vs tot which works without complaint)

05:27 <whitequark> interesting

05:27 <whitequark> can you try to reduce the design?

05:27 jevinskie has quit [Ping timeout: 250 seconds]

05:27 <whitequark> or, can you post it in the issue? yosys json or something like that

05:28 <swetland> I can toss the json up right now and can poke at it a bit later and see if I can find a smaller failure case

05:29 <whitequark> sure, that works

05:32 <swetland> actually is the json (output from yosys) useful here?

05:32 <whitequark> I think so yeah

05:37 <swetland> interesting. only fails if I infer this 256x16b ram instead of invoking SB_RAM40_4K manually.

05:37 pie___ has quit [Quit: Leaving]

05:39 <whitequark> interesting

05:39 <whitequark> if you instantiate, does the design work?

05:42 <swetland> provided I don't use -relut it does work

05:42 <swetland> with -relut nextpnr fails

05:44 <swetland> without -relut both inferred and instantiated version of the design works. with -relut instantiated version will not pass nextpnr, but inferred version does and also works

05:44 <whitequark> what fails exactly?

05:44 <whitequark> timing?

05:45 <whitequark> wait

05:45 <whitequark> with -relut instantiated version will not pass

05:45 <whitequark> nextpnr, but inferred version does and also works

05:45 <whitequark> I'm confused

05:45 <whitequark> didn't you just say the opposite of this?..

05:49 <swetland> sorry, I may have misspoken. if I infer the ram, nextpnr succeeds whether or not I used -relut with yosys synth_ice40 and both resulting bitfiles work

05:50 <swetland> if I instantiate the ram nextpnr only succeeds if I do not use -relut, and the resulting bitfile works

05:50 <swetland> https://www.irccloud.com/pastebin/inCVVlZ1/

05:51 <whitequark> the plain one has *more* LUTs? that seems like an obvious bug

05:53 genii has quit [Remote host closed the connection]

05:54 <swetland> plain.asc is the design built without using -relut. relut.asc is the design built with -relut

05:55 <whitequark> er

05:55 <whitequark> I meant

05:55 <whitequark> relut has *more* LUTs in your paste

05:55 <whitequark> relut: 1248, plain: 1220

05:55 <swetland> that the relut version *also* has an additional bram is even weirder

05:55 <whitequark> huh

06:04 ZipCPU|Laptop has joined ##openfpga

06:14 rohitksingh_work has quit [Read error: No route to host]

06:15 rohitksingh_work has joined ##openfpga

06:19 azonenberg_work has joined ##openfpga

06:25 rofl__ has quit [Read error: Connection reset by peer]

06:28 rohitksingh_work has quit [Read error: Connection reset by peer]

06:30 rohitksingh_work has joined ##openfpga

06:30 jcarpenter2 has joined ##openfpga

06:39 rohitksingh_work has quit [Ping timeout: 240 seconds]

06:41 rohitksingh_work has joined ##openfpga

07:45 f003brv has joined ##openfpga

07:45 <f003brv> Hello friends

07:57 catplant has joined ##openfpga

08:05 <f003brv> hi catplant

08:29 f003brv has quit [Quit: Page closed]

08:34 futarisIRCcloud has quit [Quit: Connection closed for inactivity]

08:34 <daveshah> whitequark: Awesome

08:34 <daveshah> Seems my second observation about abc being like vpr was right...

08:38 <daveshah> I guess getting timing up is the next challenge. One way to approach that might be trying to balance critical path length when merging LUTs

08:38 catplant has quit [Ping timeout: 272 seconds]

08:39 <tnt> Is there any existing to feedback a pnr result back into synthesis to guide it for a second pass to know where to optimize better ?

08:39 <tnt> "existing ways"

08:39 <daveshah> No, not yet

08:40 <daveshah> You could go all the way back through icebox_vlog

08:40 <daveshah> But that's probably going to make things a lot worse

08:55 mumptai has joined ##openfpga

09:00 GuzTech has joined ##openfpga

09:15 catplant has joined ##openfpga

09:18 <tnt> whitequark: mmm, I get what(): Assertion failure: cout_port.net != nullptr (/tmp/ice40/nextpnr/ice40/chains.cc:92)

09:18 <tnt> (with your -relut option)

09:27 <daveshah> Sounds like there might be a dangling carry somewhere

09:28 <daveshah> Surprised that Yosys' own optimisations haven't dealt with that

09:35 <tnt> Mmm, I don't actually see anything wrong in the .json

09:38 <tnt> I also have your carry chain pull request in that tree.

09:48 <daveshah> tnt: Maybe try without that PR?

09:57 <tnt> yeah, it works without that PR.

09:58 <daveshah> Can you post the JSON somewhere?

09:59 <tnt> http://246tnt.com/icebreaker/top.json http://246tnt.com/icebreaker/top-icebreaker.pcf

10:00 <tnt> nextpnr-ice40 --up5k --package sg48 --json top.json --pcf top-icebreaker.pcf --asc top.asc --freq 60 --opt-timing -

10:04 catplant has quit [Quit: WeeChat 2.2]

10:04 catplant has joined ##openfpga

10:07 <tnt> Mmm, the DFFs have a different control set for the different bits of the alu result.

10:07 <daveshah> That should still be dealt with

10:07 <daveshah> looking now

10:12 <daveshah> I think I've pushed a fix

10:12 <daveshah> can you test that it actually functions?

10:15 jevinski_ has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

10:20 <tnt> daveshah: yeah, seems to be working fine.

10:23 <tnt> tx !

10:23 <daveshah> no problem

10:25 <tnt> On another design I also get the comb loop issue raised above (with -relut).

10:26 <daveshah> Can you post that netlist too?

10:28 <tnt> Sure done. Same location, I overwrote the files.

10:28 <daveshah> cheers

10:38 <daveshah> tnt: The problem seems to be that `top.$abc$4480$n629` is undriven

10:39 <daveshah> Running `setundef -undriven -zero` on the netlist does cause nextpnr to accept it

10:39 <daveshah> I suspect that an undriven wire feeding logic is a problem

10:39 <tnt> of course with such explicit names, I directly know where that is in my design :p

10:50 <tnt> ok, yeah, I see what that signal is. Some pretty big comb path with adders and muxes ... so exactly what relut should modify.

10:53 mumptai has quit [Remote host closed the connection]

10:54 jevinskie has joined ##openfpga

10:54 catplant has quit [Quit: aaaaaaa]

10:55 <whitequark> daveshah: hm, any idea what should I do in opt_lut to fix that?

10:55 <whitequark> call setundef -undriven -zero? something else?

10:56 <tnt> Minimal test case : https://pastebin.com/wxvw9VCN

10:56 <daveshah> whitequark: It depends where they are coming from

10:57 <daveshah> If they are genuinely floating, then you should modify the LUT function to remove that input

10:57 <whitequark> mmmm, okay
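"Modify the LUT function to remove that input" can be sketched as a truth-table operation. The encoding is an assumption made for this example: bit `i` of `init` is the LUT output for input pattern `i`, with input 0 as the least significant bit. The helper names are hypothetical and this is not the opt_lut code.

```python
def depends_on(init, width, k):
    """True if toggling input k can ever change the LUT output."""
    for index in range(1 << width):
        if not index & (1 << k):
            low = (init >> index) & 1
            high = (init >> (index | (1 << k))) & 1
            if low != high:
                return True
    return False


def tie_off_input(init, width, k, value):
    """Fix input k to a constant and return the init of the narrower LUT."""
    new_init = 0
    for new_index in range(1 << (width - 1)):
        below = new_index & ((1 << k) - 1)  # inputs below k keep their position
        above = new_index >> k              # inputs above k shift down by one
        old_index = below | (value << k) | (above << (k + 1))
        new_init |= ((init >> old_index) & 1) << new_index
    return new_init


AND2 = 0b1000                           # output is 1 only when both inputs are 1
buffer_init = tie_off_input(AND2, 2, 1, 1)
zero_init = tie_off_input(AND2, 2, 1, 0)
```

Tying an AND2 input to 1 leaves a buffer of the other input, and tying it to 0 leaves a constant 0. `depends_on` is the "find useless inputs" check mentioned for per-LUT minimization.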

11:03 <q3k> whitequark: does the abc-less boneless tech mapping flow then run (in yosys?) any logic minimization step?

11:04 <q3k> i'm not even sure what kind of work does 'opt' do...

11:05 <daveshah> afaik opt is a mix of more advanced coarse-grain optimisations, and some generic stuff like trimming dead logic or merging equivalent stuff

11:06 <daveshah> I don't think opt does any real low-level logic optimisation though

11:06 <q3k> that's what abc did, right?

11:07 <whitequark> q3k: what is "logic minimization"

11:07 <whitequark> exactly

11:07 <whitequark> q3k: like removing redundant LUTs?

11:07 <q3k> whitequark: no, something like espresso

11:07 <tnt> more minimal ... https://pastebin.com/xeVwqGri

11:08 <q3k> whitequark: or you know, karnaugh maps if you did that manually :)

11:08 <whitequark> oh!

11:09 <q3k> i'm not sure it makes sense to run that per-lut (especially on narrow 4luts)

11:09 <whitequark> yes, probably not per lut

11:09 <tnt> per-lut ... at best you'd find useless inputs.

11:09 <q3k> yeah

11:10 <whitequark> might still be valuable

11:10 <whitequark> but not very generic

11:11 <tnt> I'm not sure how a karnaugh maps helps to map a N input comb function to a minimal amount of LUT4 (and then ... what do you consider minimal, depth ? or total # of luts)

11:20 catplant has joined ##openfpga

11:23 <whitequark> tnt: can you give me the json that needs setundef?

11:25 <daveshah> whitequark: Couldn't resist experimenting with the topological ordering idea

11:25 <daveshah> https://github.com/daveshah1/yosys/blob/opt_lut_top_order/passes/opt/opt_lut.cpp

11:25 <daveshah> This now gives identical results to ABC on the big and case

11:25 <daveshah> going to see how it affects larger designs
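For readers unfamiliar with the term, a topological ordering visits each cell only after every cell driving its inputs. The sketch below is a generic Kahn's-algorithm implementation over a hypothetical `fanin` map (cell name to the set of cells driving it); it is not taken from the linked branch, whose details are not shown in this log.

```python
from collections import deque


def topological_order(fanin):
    """Return the cells so that every driver precedes the cells it feeds."""
    cells = set(fanin)
    for sources in fanin.values():
        cells.update(sources)

    pending = {cell: len(set(fanin.get(cell, ()))) for cell in cells}
    fanout = {cell: [] for cell in cells}
    for sink, sources in fanin.items():
        for source in set(sources):
            fanout[source].append(sink)

    # Sorting only makes the result deterministic; any ready cell would do.
    ready = deque(sorted(cell for cell in cells if pending[cell] == 0))
    order = []
    while ready:
        cell = ready.popleft()
        order.append(cell)
        for sink in sorted(fanout[cell]):
            pending[sink] -= 1
            if pending[sink] == 0:
                ready.append(sink)

    if len(order) != len(cells):
        raise ValueError("combinational loop: no topological order exists")
    return order


order = topological_order({"lut_c": {"lut_a", "lut_b"}, "lut_b": {"lut_a"}})
```

A combinational loop leaves some cells permanently pending, so the function raises instead of returning a partial order.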

11:26 <whitequark> daveshah: what the hell, nice

11:26 <whitequark> I was just opening my editor...

11:27 <tnt> whitequark: I posted https://pastebin.com/xeVwqGri

11:27 <tnt> whitequark: it's the verilog source that creates the issue

11:27 <tnt> (well ... a minimal test case)

11:27 <whitequark> tnt: oh thanks!

11:28 <daveshah> Doesn't help boneless much sadly

11:28 <whitequark> daveshah: oh it's okay, boneless has a real awful alu i think

11:28 <whitequark> i mean

11:28 <whitequark> this whole thing grew out of me trying to make a less bad alu for boneless

11:28 <whitequark> and discovering that yosys generates absurdly bad output for it

11:28 <whitequark> and fixing that

11:29 <daveshah> boneless is down to 713 vs 745

11:29 <daveshah> without abc

11:29 <whitequark> that's actually pretty good

11:29 <whitequark> that's approaching abc quality, which is 669

11:29 <daveshah> 482 for me?

11:29 <whitequark> oh, LUTs

11:29 <whitequark> not total cells

11:29 <daveshah> yeah

11:29 <whitequark> ok sure

11:29 <whitequark> still a nice improvement

11:29 <whitequark> what about -abc -relut?

11:30 <daveshah> gives me 463 LUTs

11:31 <daveshah> with the topological ordering, it seems to converge (in the noabc case) after two runs of -relut

11:31 <daveshah> don't know if that is different to before

11:31 <whitequark> oh, that's a bug i'm about to fix

11:31 <whitequark> it should converge immediately

11:33 <tnt> Damn, the default yosys output for that minimal example is really bad ... I mean, there are 3 LUT-1 following each other ...

11:33 <daveshah> picorv32 does pretty well without abc. 1953 LUTs without compared to 1538 LUTs with (so only about 27% overhead)

11:33 * daveshah eats hat....
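The "about 27% overhead" figure follows directly from the two picorv32 LUT counts quoted above. A small check, with a helper name invented for this example:

```python
def relative_overhead(without_abc, with_abc):
    """Extra LUTs used without abc, as a fraction of the abc result."""
    return (without_abc - with_abc) / with_abc


picorv32 = relative_overhead(1953, 1538)
print(f"{picorv32:.1%}")  # prints 27.0%
```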

11:34 rohitksingh_work has quit [Ping timeout: 268 seconds]

11:36 rohitksingh_work has joined ##openfpga

11:37 <daveshah> but Fmax is 16MHz compared to 56MHz with abc

11:38 <whitequark> yes, I've noticed that Fmax gets pretty bad

11:38 <whitequark> there should probably be some kind of K-map based (?) logic rebalancing (?)

11:39 <daveshah> Yes, it's definitely the rebalancing that's the issue

11:39 <whitequark> I mean, that could probably be done naively, even

11:39 <daveshah> This might be as simple as a heuristic when merging LUTs to start with

11:39 <whitequark> oh, yeah!

11:39 <daveshah> Just try and merge the one that with the larger path length

11:42 <whitequark> bleh, probably need to base gate2lut PR on opt_lut PR...

11:42 <whitequark> kind of messy

11:43 <whitequark> or, hm

11:46 <whitequark> hmmmm

11:51 m4ssi has joined ##openfpga

11:58 <whitequark> daveshah: take a look at what i just pushed

12:00 <daveshah> yeap

12:00 <whitequark> converges immediately now?

12:00 <whitequark> or did i miss something?

12:01 <whitequark> seems to converge right away here

12:03 <daveshah> Yes, looks good

12:03 <daveshah> I think that should always converge fine now

12:03 <whitequark> let me add some stats to opt_lut while I'm at it.

12:12 <whitequark> oh, this fails a test...

12:14 <whitequark> ah I think this is the same issue tnt hits

12:16 <whitequark> daveshah: ok, figured the cause i think

12:17 <whitequark> daveshah: Found top.$abc$163$auto$blifparse.cc:492:parse_blif$187 (cell A) feeding top.$auto$alumacc.cc:474:replace_alu$19.slice[2].adder (cell B).

12:17 <whitequark> Cell A is a 1-LUT. Cell B is a 3-LUT. Cells share 0 input(s) and can be merged into one 3-LUT.

12:17 <whitequark> Not combining LUTs into cell A (cell B has attribute \lut_keep).

12:17 <whitequark> Combining LUTs into cell B.

12:17 <whitequark> Connecting input 0 as \d [2].

12:17 <whitequark> Leaving input 1 as \c [2].

12:17 <whitequark> Leaving input 2 as $abc$163$n52.

12:17 <whitequark> Leaving input 3 as $auto$alumacc.cc:474:replace_alu$19.C [2].

12:17 <whitequark> this is... an off by one of some sort?

12:20 <whitequark> ok I think I see

12:21 jevinskie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

12:35 <whitequark> tnt: can you recheck?

12:35 <whitequark> I think I fixed all the bugs you've hit

12:36 <tnt> whitequark: sure

12:42 <tnt> whitequark: seems to work :) builds and the bitstream appear to operate properly on the device.

12:43 <whitequark> wonderful :D

12:45 <whitequark> tnt: what about timing? how bad is it?

12:50 <tnt> It really didn't change anything wrt to timing.

12:51 <tnt> I mean on that particular design it only combined 4 LUTs out of 260.

12:57 <whitequark> ah ok

13:06 <tnt> I tried another where it combined a bit more LUTs but they were not in the critical path either.

13:06 <tnt> whitequark: you only consider LUT -> LUT connections where there is only 1 user of the signal ?

13:07 <whitequark> tnt: yes

13:07 <whitequark> it might make sense to consider more than that, e.g. 1-LUTs can *always* be folded

13:08 <whitequark> yeah, definitely, that would be a significant improvement

13:08 <tnt> yeah, I was looking at a couple of netlists and I saw plenty of cases of 1 or 2 luts feeding other 2/3 luts ...

13:09 <tnt> the original one has to be kept because sometimes the signal goes elsewhere that can't be folded, but that would still be cutting the path for the other signals, at the expense of a higher fanout ...

13:10 <whitequark> right

13:12 <whitequark> hm it might make sense to do that as a part of a more general pass...

13:15 <cr1901_modern> How can you merge a 1-LUT and a 3-LUT into a 3-LUT when none of the inputs are shared?

13:16 <whitequark> cr1901_modern: no, it's a different case

13:16 <whitequark> it's a case of 1-LUT feeding a 3-LUT and something else

13:16 <whitequark> merging 1-LUT into this 3-LUT *and* keeping the original 1-LUT trades fanout for logic levels

13:16 <whitequark> this should be almost always advantageous

13:17 <whitequark> gonna try that soon

13:17 s_frit has joined ##openfpga

13:18 <daveshah> whitequark: Small issue with the LUT merging stuff

13:19 <daveshah> If a CARRY input is 1'b0, then the corresponding LUT input needs to stay 1'b0 too

13:19 rohitksingh_work has quit [Read error: Connection reset by peer]

13:19 <daveshah> it seems this is not being preserved and creates a monstrous carry chain full of legalisation LCs which breaks nextpnr on picorv32

13:19 <whitequark> lol

13:19 <cr1901_modern> I understand the fanout decreases if it's merged, but what do you mean by "trades fanout for logic levels"?

13:19 <whitequark> can you reduce a testcase?

13:19 <daveshah> sure

13:19 <whitequark> cr1901_modern: fanout *increases*

13:20 <cr1901_modern> How does it increase? 1-LUT is no longer driving the 3-LUT if it's merged.

13:20 <cr1901_modern> Oh, whatever was driving the 1-LUT has its fanout increase tho...

13:21 <whitequark> yes.

13:21 <whitequark> actually

13:21 <whitequark> in case of 1-LUT that doesn't increase the fanout at all

13:21 <whitequark> it just moves things around

13:22 * cr1901_modern nods

13:22 <whitequark> daveshah: can you rebase your branch btw?

13:22 <whitequark> I refactored opt_lut a bit, to use a worker

13:22 <cr1901_modern> so what did you mean by the "logic levels" part then?

13:22 <whitequark> daveshah: https://github.com/YosysHQ/yosys/pull/717/commits/5c85d0bb91dc1fc3ce232254bcfd18377afa814f

13:23 <whitequark> this should go nicely with timing reports... once proc learns to not assign some dumbass internal names

13:23 <whitequark> that is on my shortlist

13:23 <whitequark> i want to have ZERO $fuckyou$ names in the reports.

13:24 <tnt> cr1901_modern: well imagine sig_in -> LUT1 -> LUT3 -> sig_out .. if you merge the LUT1 function into the LUT3 and you get sig_in -> LUT3 -> sig_out (and in parallel you may still have sig_in -> LUT1 -> other places that signal went).

13:24 <tnt> cr1901_modern: you reduced the depth of the path from sig_in to sig_out but you increased the sig_in fanout.

13:24 * cr1901_modern nods
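The sig_in -> LUT1 -> LUT3 folding described above can be sketched on truth tables. The encoding is an assumption for this example (bit `i` of `init` is the output for input pattern `i`, input 0 is the LSB), the net names are made up, and the function is an illustration rather than the opt_lut implementation.

```python
def lut_eval(init, inputs):
    """Evaluate a LUT; inputs[0] is the least significant index bit."""
    index = sum(bit << k for k, bit in enumerate(inputs))
    return (init >> index) & 1


def merge_luts(init_a, inputs_a, net_a, init_b, inputs_b):
    """Fold LUT A (driving net_a) into LUT B, which has net_a as an input."""
    merged_inputs = [net for net in inputs_b if net != net_a]
    for net in inputs_a:
        if net not in merged_inputs:
            merged_inputs.append(net)

    merged_init = 0
    for index in range(1 << len(merged_inputs)):
        values = {net: (index >> k) & 1 for k, net in enumerate(merged_inputs)}
        values[net_a] = lut_eval(init_a, [values[net] for net in inputs_a])
        out = lut_eval(init_b, [values[net] for net in inputs_b])
        merged_init |= out << index
    return merged_init, merged_inputs


# LUT1: an inverter on sig_in. LUT3: a three-input AND of n1, c and d.
init, inputs = merge_luts(0b01, ["sig_in"], "n1", 0x80, ["n1", "c", "d"])
```

The result is a single 3-LUT computing `c & d & ~sig_in`. If `n1` has other users, the original LUT1 stays in the netlist and `sig_in` gains one more load, which is the fanout-for-depth trade discussed here.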

13:25 <whitequark> tnt: however you decreased LUT1 fanout

13:25 <whitequark> so in this case it's even

13:25 <whitequark> now, if you are merging LUT2 to LUT3, it is not as clear cut

13:26 <tnt> sure ... but the delay on the net depends on the fanout of that net, not the total fanout of the whole fpga.

13:26 <tnt> so propagation time for sig_in are a bit worse.

13:27 <daveshah> whitequark: MCVE https://www.irccloud.com/pastebin/2isYxevP/carry_merge.v

13:27 <tnt> (tbh, I'm not sure if that works like that on the ice40, I'm just basing that on my experience of xilinx where net driving lots of loads are slower)

13:27 <sorear> whitequark: it completely destroys buffer trees though

13:27 <daveshah> The I2(1'b0) should be preserved so the carry and LUT can be packed

13:28 <daveshah> will sort out rebase in a bit

13:28 <whitequark> sorear: can you elaborate?

13:29 <whitequark> daveshah: so... the constraint here is that I2 must be the same as I2.

13:29 <whitequark> er.

13:29 <whitequark> lut_i.I2 must be the same as carry_i.I1.

13:29 <sorear> *finds keyboard*

13:30 <sorear> let's say you have a signal with a fanout of 256. maybe a clock or a reset

13:30 <daveshah> whitequark: yes, ditto with I1 and I0

13:30 <sorear> an electrical fanout of 256 will be *extremely slow* because you have far more capacitance on the output than a gate is designed to drive

13:31 <sorear> but if you turn it into a 4-level tree of inverters, each with a fanout of 4, you have a faster circuit
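The arithmetic behind the buffer tree is that four levels with a fanout of 4 each reach 4^4 = 256 loads. A minimal sketch, with a hypothetical helper name:

```python
def buffer_tree_levels(loads, fanout):
    """Levels of buffering needed so that no stage drives more than `fanout` loads."""
    levels, reach = 0, 1
    while reach < loads:
        reach *= fanout
        levels += 1
    return levels


levels = buffer_tree_levels(256, 4)
```

This only counts stages; it does not model capacitance or delay.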

13:31 <whitequark> daveshah: ooooh, so effectively... the inputs bound to SB_CARRY should not be considered "free" like normal constant inputs

13:31 <whitequark> and should not be used for reencoding

13:31 <whitequark> that's definitely doable

13:32 <daveshah> Yes

13:32 <sorear> of course multipass optimizers quite frequently do "pessimize something in pass A that you know pass B will clean up", and it probably makes more sense to do this kind of selective duplication after logic optimization (possibly even combined with placement)

13:32 <daveshah> Probably best as a attr on SB_CARRY

13:32 <whitequark> daveshah: are you sure?

13:33 <sorear> so I'm not saying abc/relut would be *wrong* to do this, merely that it's not *prima facie optimal*

13:33 <whitequark> daveshah: oh hm, is this because SB_CARRY can be optimized out?

13:33 <whitequark> sorear: but modern FPGAs have routing buffers instead of routing pass transistors

13:34 <whitequark> so in effect you have buffer trees whether you want it or not, no?

13:34 <sorear> yes, but I was on a terrible phone keyboard and thought "buffer tree" sufficiently implied "ASIC"

13:34 <whitequark> oh!

13:34 <whitequark> I have no idea about anything related to ASICs

13:34 <whitequark> and besides

13:34 <whitequark> opt_lut is not intended for ASIC flow?

13:35 <whitequark> in fact, *abc* is probably good for ASIC flow, it does area optimizations and stuff

13:35 <whitequark> I mean, I assume it is good for at least something. definitely not FPGA flow.

13:35 <tnt> lol

13:36 <sorear> pretty sure I've heard boomcpu complain about critical paths and net naming

13:37 <whitequark> there are 2 kinds of people: those who complain about critical paths and net naming, and those who suffer silently.

13:38 <whitequark> daveshah: now that I think about it... might be a better idea to ditch attributes entirely

13:38 <whitequark> and have something like...

13:39 <whitequark> -dlogic SB_CARRY:1=I0:2=I1:3=CI

13:39 <whitequark> daveshah: this could help ecp5 too, maybe?

13:43 Miyu has joined ##openfpga

13:52 scrts has joined ##openfpga

13:52 rohitksingh has joined ##openfpga

14:01 rohitksingh has quit [Ping timeout: 250 seconds]

14:32 <daveshah> whitequark: looks good

14:33 <daveshah> The problem with ECP5 is that the CCU2C carry primitive is 2 LUT4s with the output XORd with carry for the sum output and 2 LUT2s sharing inits with the bottom of the LUT4s plus some add and ors to generate carry

14:33 <daveshah> It's a pretty tricky one to optimise or even split

14:33 <whitequark> ohhhh I see

14:33 <daveshah> https://github.com/YosysHQ/yosys/blob/master/techlibs/ecp5/cells_sim.v#L19-L49

14:34 <daveshah> if you are curious

14:34 <daveshah> I think ABC might have some support for this, but being ABC the documentation for that sort of thing is just two words "fuck off"

14:34 <whitequark> lol

14:34 <whitequark> i feel like this is a good job for an SMT solver?

14:34 <daveshah> Yes, probably

14:34 <whitequark> anyway, halvarflake is reading some papers on my behalf

14:34 <daveshah> nice

14:36 <whitequark> apparently, there is some obscure connection between equation system solvers on GF(2) and Quine/McCluskey algorithm

14:36 <whitequark> they are equivalent or something??

14:36 <whitequark> and halvarflake's MSc was on the former...

14:37 genii has joined ##openfpga

14:44 Flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]

14:49 <daveshah> whitequark: rebased commit, hopefully didn't break anything in a somewhat messy merge

14:49 <daveshah> https://github.com/daveshah1/yosys/commit/d3aecd47d0c2fafb6d09caf314c0b877487de833

14:56 <daveshah> oops, https://github.com/daveshah1/yosys/commit/cf2475fa1e705a07dc3b15ab208e7ce6b7e6e4b8

15:01 <whitequark> ok, I added dlogic recognition

15:01 <whitequark> now just need to wire it to avoid disturbing those

15:17 <whitequark> ok, I *think* I'm done.

15:25 <whitequark> daveshah: oh holy shit

15:25 <whitequark> this *really* improves timing *dramatically*

15:25 <whitequark> like by 10 MHz

15:25 <daveshah> sweeet

15:25 <daveshah> I guess the timing problems before might have been excessive feedthroughs being inserted

15:25 <whitequark> yeah

15:26 <whitequark> let me check with -noabc too

15:27 <daveshah> The Yosys/nextpnr changes over the last month must mean we are close to a 30-40% improvement in timing overall by now

15:27 <whitequark> that's a lot

15:28 <whitequark> this would make UP5K Glasgow actually usable :D

15:28 <daveshah> next big step will be vpr-style criticality driven placement

15:28 <daveshah> I might play with that now in fact

15:28 <daveshah> I'm not sure if that will actually lead to an overall improvement in performance, or just make the opt-timing pass redundant

15:28 ZipCPU|Laptop has quit [Ping timeout: 245 seconds]

15:29 <daveshah> The other thing I want to try is swapping macros, at the moment I think the inability to perform swaps after constraint legalisation limits Fmax with carrys

15:29 <daveshah> without macro swapping support, LUTCascade will probably cause a step back in QoR too

15:31 <whitequark> daveshah: ah no, I misread the report

15:31 <daveshah> :(

15:31 <whitequark> doesn't seem to lead to that much of an improvement in timing, sadly

15:31 <daveshah> definitely not the first time I did that

15:31 <whitequark> ok

15:31 <daveshah> once I remember thinking that I had like a 30% increase in Fmax

15:31 <daveshah> turns out I was comparing hx8k against lp8k

15:31 <tnt> whitequark: is it on your repo already ?

15:32 <tnt> daveshah: lol

15:32 <whitequark> lol

15:33 <sorear> improve timing 30% with this one weird trick

15:33 <whitequark> daveshah: can you check if this actually works as intended?

15:33 <whitequark> just pushed

15:33 <daveshah> sure

15:34 GuzTech has quit [Quit: Leaving]

15:34 <whitequark> daveshah: I looked at your MCVE and it looks like there's no actual change if I run opt_lut on it at all?

15:34 <whitequark> I mean

15:34 <whitequark> it has one LUT

15:34 <whitequark> opt_lut would not change it...

15:34 <daveshah> It should have two LUTs

15:35 <daveshah> opt_lut was previously illegally merging those two

15:35 <whitequark> oh, `a+b`

15:35 <whitequark> oh sorry

15:35 <daveshah> yeah

15:35 <whitequark> let me recheck

15:36 <daveshah> 2 LUT4s, looks good

15:37 <whitequark> hm, the log is a bit confusing

15:37 <whitequark> let me tweak it a bit

15:37 <daveshah> picorv32 example seems to work fine too now

15:37 <daveshah> :)

15:37 <whitequark> :D :D

15:37 <whitequark> so, what changed? fmax before/after? lc before/after?

15:37 <whitequark> is this -noabc or?

15:38 <daveshah> No -noabc

15:38 <daveshah> But a big jump in timing

15:38 <daveshah> from 67MHz average without -relut to 72MHz with

15:38 <whitequark> ooooh wow

15:38 <daveshah> let me test on a soc design to make sure it still works on hardware

15:38 <whitequark> I test on hardware periodically, seems to work still

15:38 <daveshah> cool

15:39 <daveshah> just want to test it together with my nextpnr carry changes

15:39 <daveshah> That example that's at 72MHz now was pretty much stuck around 52MHz for a long time

15:39 <whitequark> yeah, definitely curious

15:39 <whitequark> oh wow

15:39 <daveshah> like until a few weeks ago

15:40 <daveshah> I don't think I even have min_ce_use in there, so it can probably get even better

15:40 <daveshah> But I know opt-timing and the nextpnr carry changes each added about 10%

15:41 <whitequark> what is opt-timing?

15:41 <daveshah> It's a post-placement path that uses a fairly odd algorithm to minimise the critical path

15:42 <daveshah> *post-placement pass

15:42 <daveshah> basically, a BFS of neighbour bels of critical path bels

15:43 <daveshah> hardware test is working (design is a picorv32 soc, qspi controller, and CSI-2 interface if you are curious)

15:43 <daveshah> that design is now getting 24MHz on an ultraplus

15:43 <daveshah> which is pretty good

15:45 <whitequark> Cell A is a 3-LUT with 3 dedicated connections. Cell B is a 2-LUT.

15:45 <whitequark> Cells share 0 input(s) and can be merged into one 4-LUT.

15:45 <whitequark> Not combining LUTs into cell B (combined LUT wider than cell B).

15:45 <whitequark> Combining LUTs into cell A.

15:45 <daveshah> oops, forgot to add relut to the syn script for that hardware test

15:45 <daveshah> let me actually check again

15:46 <daveshah> yep, still works

15:46 <whitequark> :D :D

15:46 <whitequark> any change in fmax?

15:46 <daveshah> dropped to 22MHz

15:46 <whitequark> or is it just size?

15:46 <whitequark> huh

15:46 <whitequark> average?

15:46 <daveshah> this is one run

15:46 <daveshah> unlike the previous test

15:47 <daveshah> let me run some proper 16-run comparisons on this design too

15:48 <daveshah> size drops from 3371 LCs to 3311 LCs

15:49 dingbat has quit [Quit: Updating details, brb]

15:50 dingwat has joined ##openfpga

15:50 <tnt> I tried it on a couple of designs here (over 10 runs each). Doesn't seem to affect F_avg / F_max (it's within the noise ... < 1 MHz variation on a 70 MHz design)

15:50 dingwat has quit [Client Quit]

15:51 dingwat has joined ##openfpga

15:51 <daveshah> I dare say, this is where a Threadripper was a good buy :P

15:51 <tnt> ~ 5 % less LUTs

15:52 <whitequark> I'm guessing the critical path is some sort of long carry chain

15:52 <daveshah> difference is in the noise here too

15:52 <daveshah> with relut: min = 23.45 MHz, avg = 25.30 MHz, max = 27.32 MHz

15:52 <daveshah> without relut: min = 24.22 MHz, avg = 25.35 MHz, max = 27.05 MHz
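
The "in the noise" judgement can be checked against the figures quoted above. The sketch below is only a rough heuristic, not something anyone in the channel ran: it assumes that a difference in averages smaller than the min-to-max spread of either set of runs is not meaningful. The names `with_relut`, `without_relut` and `within_noise` are made up for illustration.

```python
# Figures quoted in the log above, in MHz.
with_relut = {"min": 23.45, "avg": 25.30, "max": 27.32}
without_relut = {"min": 24.22, "avg": 25.35, "max": 27.05}


def within_noise(a, b):
    """Heuristic: the averages differ by less than the smaller min-to-max spread."""
    spread = min(a["max"] - a["min"], b["max"] - b["min"])
    return abs(a["avg"] - b["avg"]) < spread


print(within_noise(with_relut, without_relut))
```

A proper comparison would keep the per-run results and look at their distribution; min, avg and max are all that the log gives.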

15:52 <daveshah> let me check with min_ce_use too

15:54 <daveshah> whitequark: certainly a big part of it

15:54 <daveshah> https://www.irccloud.com/pastebin/k0Buul5q/critical%20path%20ctrlsoc

15:55 <daveshah> There's some disturbingly long arcs in there too like (14,23) -> (23,16)

15:55 <whitequark> yeah

15:55 <daveshah> This is hopefully improvable with a better placer

15:55 <whitequark> hm, going to merge your topological ordering stuff now

15:56 <daveshah> thanks

16:01 <whitequark> daveshah: that... actually pessimizes boneless.

16:01 <whitequark> by 9 LTUs

16:01 <whitequark> *LUTs

16:01 <whitequark> let me push into a branch...

16:02 <whitequark> daveshah: pull from opt_lut_topo_noabc

16:02 <daveshah> Maybe it is not the best way forward

16:02 <whitequark> think you can take a look at the reason?

16:02 <daveshah> sure

16:05 <daveshah> Think it might have been a merge issue

16:05 <whitequark> oh?

16:05 <daveshah> Accidentally left in a line of old code

16:05 <daveshah> if (lutA_output_ports.size() != 2)

16:05 <daveshah> continue;

16:05 <daveshah> before the loop that iterates over ports

16:05 <daveshah> but now I'm getting an attribute-related assert fail

16:06 <daveshah> *param-related

16:06 <daveshah> almost as if there are LUTs without init

16:06 <whitequark> hm, odd

16:07 <daveshah> actually, looks like that if statement should be there

16:08 <whitequark> yes

16:08 <daveshah> I can't see any other problem

16:08 <whitequark> I think it's required rn

16:08 <whitequark> hm, ok

16:08 <daveshah> I fear that topological ordering is just not always optimal

16:09 <whitequark> I'm going to try and massage LUTs into a form that opt_merge can deal with

16:09 <daveshah> I was thinking too much about the specific tree-of-gates case

16:14 <daveshah> now that the carry issue is fixed, the small picorv32 test is doing much better with noabc btw - from 19MHz up to 45MHz based on one run

16:15 <daveshah> didn't have a frequency constraint, oops

16:16 <daveshah> running 8 runs with --freq 50 and --opt-timing gives

16:16 <daveshah> min = 51.77 MHz, avg = 54.53875 MHz, max = 57.28 MHz

16:16 <daveshah> not bad at all

16:16 <whitequark> wow

16:16 <whitequark> so... -noabc picorv32 now is the same as abc picorv32 1 month ago?

16:16 <daveshah> yeah

16:17 <whitequark> niiiiice

16:17 <tnt> daveshah: that's not on a up5k is it ?

16:17 <daveshah> no, hx8k

16:17 <whitequark> hell no

16:17 <tnt> :)

16:17 <whitequark> up5k can barely run a vga sync gen at 50 MHz

16:18 <q3k> i mean, you gotta pipeline the fuck out of it

16:18 <whitequark> oh?

16:18 <q3k> also maybe it's a bit better now, i haven't tried in a while

16:18 <whitequark> out of what

16:18 <whitequark> syncgen?

16:18 <q3k> yes

16:18 <whitequark> picorv32?

16:18 <tnt> most of my 5k designs are > 60 MHz so far ... I ran a MIPI-DSI display at 80 MHz.

16:18 <whitequark> it's... just counters?

16:18 <q3k> vga on up5k

16:18 <q3k> yep

16:18 <whitequark> how do you pipeline counters like that

16:19 <q3k> i mean, for me the longest chain was counter -> comparison -> pixel_{x,y}

16:19 <whitequark> oh i don't do comparisons

16:19 <q3k> so if you just register counter -> comparison it's much easier

16:19 <whitequark> or rather

16:19 <whitequark> yes

16:19 <q3k> (where comparison was == 0 iirc)

16:19 <whitequark> that's what i'm already doing

16:19 <whitequark> i get like 60 mhz but barely

16:19 <q3k> but yeah, the counter chain for vga barely fit

16:19 <q3k> yes

16:19 <daveshah> I would do a comparison with the vendor tools at this point

16:20 <daveshah> But my icecube license expired and I still haven't heard from them after requesting a new one yesterday

16:20 <daveshah> So the vendor tools are disqualified and get 0MHz

16:20 <q3k> heh

16:20 <whitequark> lmao

16:20 <daveshah> ∞% better

16:20 <tnt> daveshah: lol

16:21 <tnt> But I was thinking of running a couple of designs through icecube to compare because it must be getting pretty close by now.

16:21 <whitequark> daveshah: so... can abc do what opt_lut -dlogic does?

16:21 <whitequark> like, understand the constraints added by SB_CARRY

16:21 <daveshah> I think it theoretically can

16:21 <whitequark> hmm

16:21 <daveshah> Or at least second-hand rumours say it can

16:21 <whitequark> but no one knows how to make it do that? :D

16:22 <daveshah> Including understanding the logic of SB_CARRY

16:22 <daveshah> yeah, basically

16:22 <whitequark> hm

16:22 <sorear> how ice40-specific is the new pass?

16:22 <whitequark> sorear: not at all.

16:22 <whitequark> it supports any hard logic attached to LUT inputs.

16:23 <daveshah> The only ice40-specific thing is assuming all LUTs are the same size

16:23 <whitequark> daveshah: it does not assume that!

16:23 <daveshah> For ECP5 and Xilinx you really want some way of mapping larger LUTs with an increasing cost

16:23 <whitequark> I spent a lot of time making sure it would e.g. pack into a wider LUT if it's possible.

16:23 <whitequark> what it does *not* do is combine to larger LUTs

16:23 <daveshah> yep

16:23 <whitequark> but if you *already* mapped to larger LUTs it will optimize those.

16:25 rohitksingh has joined ##openfpga

16:25 <whitequark> daveshah: the ecp5 cells_sim.v is so weird

16:26 <whitequark> strangely out of order and weird verilog

16:26 <whitequark> hm, nevermind

16:26 <daveshah> The out of order was mostly because mithro also wanted split files for SymbiFlow stuff

16:27 <daveshah> And I think I started it split then combined it

16:27 <daveshah> what verilog is weird?

16:27 <sorear> so you map to larger LUTs, instead of mapping to LUTs, PFUMUXs, and L6MUXs?

16:27 <sorear> s:2nd/LUTs/LUT4s/

16:27 <daveshah> More precisely, we tell ABC to map to larger LUTs then split to small LUTs and muxes with a techmap rule

16:27 <daveshah> This is because the documented subset of ABC doesn't map muxes directly

16:28 <daveshah> I understand this is definitely possible

16:28 <_whitenotifier> [whitequark/Glasgow] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/fp1sF

16:28 <_whitenotifier> [whitequark/Glasgow] whitequark b4deab5 - arch.boneless: fix typo.

16:29 <whitequark> "the documented subset of ABC"...

16:30 <mithro> daveshah: we use the ABC map to LUT8s and then a techmap to split into LUT6s + F7MUX + F8MUX on Xilinx

16:30 <_whitenotifier> [Glasgow] Success. The Travis CI build passed - https://travis-ci.org/whitequark/Glasgow/builds/463927286?utm_source=github_status&utm_medium=notification

16:30 <daveshah> yes, it's the same on the ECP5 just LUT4..LUT7 instead of LUT6..LUT8
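
The "map to a larger LUT, then split into small LUTs and muxes" step described above can be sketched on truth tables alone. This is a minimal illustration, assuming the usual convention that input 0 is the least significant bit of the table index; it is not the Yosys techmap rule, and the function names are made up. A 5-input table splits into two 4-input tables, with input 4 driving a 2:1 mux (the role a block such as PFUMUX would play on ECP5).

```python
def lut_eval(init, inputs):
    """Evaluate a LUT; inputs[0] is the least significant bit of the table index."""
    index = sum(bit << position for position, bit in enumerate(inputs))
    return (init >> index) & 1


def split_lut5(init5):
    """Split a 32-entry LUT5 table into two LUT4 tables selected by input 4."""
    low = init5 & 0xFFFF            # table used when input 4 is 0
    high = (init5 >> 16) & 0xFFFF   # table used when input 4 is 1
    return low, high


def eval_split(low, high, inputs):
    """Evaluate the two LUT4 halves followed by a 2:1 mux on input 4."""
    return lut_eval(high if inputs[4] else low, inputs[:4])


init5 = 0x96696996  # arbitrary example table
low, high = split_lut5(init5)
```

Applying the same split again to a 6- or 7-input table gives the LUT4..LUT7 range mentioned above, with one more mux level per extra input.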

16:30 <daveshah> In fact I think that's what I based my implementation on

16:31 emeb has joined ##openfpga

16:39 <mithro> Anyone know much about LTO and soft-float? I'm getting "undefined reference to `__divsi3'" when building with LTO

16:40 <cr1901_modern> mithro: I would assume the two are in fact unrelated and that __divsi3 isn't provided by compiler_builtins

16:40 <cr1901_modern> but LTO is creating an opt that uses it

16:41 <sorear> __divsi3 isn't soft float, it's soft-integer

16:41 <sorear> soft float is __divsf3

16:41 <sorear> either way it's a symbol from libgcc

16:41 <sorear> is -lgcc somehow getting lost from the LTO build?

16:42 <mithro> compiler_rt/lib/builtins/divsi3.c seems to provide it?

16:42 <sorear> compiler_rt is clang's version of libgcc and provides most of the same stuff

16:43 <mithro> sorear: In migen we seem to be linking against that instead of libgcc

16:43 <sorear> (i'm not very familiar with LTO)

16:44 parport0 has quit [Ping timeout: 272 seconds]

16:44 parport0 has joined ##openfpga

16:44 <mithro> sorear: From what I can see is that LTO is dropping the __divsf3 symbol because it is "unused" until a later pass which generates the symbol?

16:44 Zorix has quit [Ping timeout: 268 seconds]

16:45 rohitksingh has quit [Ping timeout: 244 seconds]

16:45 <sorear> possible?

16:46 <sorear> what toolchain are you using?

16:47 <mithro> sorear: gcc

16:48 m4ssi has quit [Remote host closed the connection]

16:52 s_frit has quit [Remote host closed the connection]

16:52 s_frit has joined ##openfpga

17:09 * shapr hugs mithro for so much awesome

17:09 <mithro> shapr: ?

17:09 <shapr> mithro: TinyFPGA is the specific awesome of the moment

17:09 <mithro> shapr: I didn't actually do the TinyFPGA, that was tinyfpga

17:09 <shapr> ok

17:10 * tinyfpga hugs shapr and mithro

17:10 <shapr> in that case, there are more good reasons for supportive hugs :-)

17:10 * shapr hugs tinyfpga

17:13 <mithro> sorear: Well, if I install the lm32 toolchain with libgcc and use -lgcc it links....

17:17 <mithro> sorear: I wonder if gcc handles libgcc in some special way

17:18 <cr1901_modern> lm32 is configured with --disable-libgcc, fwiw

17:42 <whitequark> daveshah: LMAO

17:42 <whitequark> ok, let me verify because this is absurd

17:46 <daveshah> what is happening?

17:46 <mithro> cr1901_modern - https://gist.github.com/mithro/7fee3383bf08eb0deaa1be2dab28c5af

17:47 <whitequark> daveshah: ahahaha

17:47 <whitequark> so

17:47 <whitequark> abc cannot merge two identical LUTs

17:48 <whitequark> with inputs in different order.

17:48 <daveshah> lol

17:48 <whitequark> I've just proven them identical with equiv_simple to be extra sure that I didn't fuck this up

17:48 <whitequark> I did not, abc is just that bad

17:49 <whitequark> daveshah: even better

17:49 <whitequark> if I XOR these cells, so they are *definitely* in the same comb network

17:49 <daveshah> yes

17:49 <whitequark> it STILL cannot figure out that these are the same LUTs

17:50 <whitequark> i mean? what? why are we using this??

17:50 <whitequark> oh I see

17:50 <whitequark> the cause of this is it doesn't understand what SB_LUT4 is

17:50 <whitequark> let me try again

17:51 <whitequark> daveshah: nevermind, if I unlut them abc manages to figure it out

17:51 <daveshah> that does make more sense

17:51 <whitequark> so it's just the techmapping/techunmapping issue

17:51 bubble_buster has quit [Ping timeout: 252 seconds]

17:51 <whitequark> ok, going to add canonicalization to opt_lut now.

17:52 <whitequark> still not sure if there's some more general approach
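
One way to picture the canonicalization mentioned above: two LUTs that compute the same function but list their inputs in a different order have different init tables, so a structural merge will not see them as equal until both are rewritten to a common input order. The sketch below is an assumption about the general idea, not the opt_lut code. It sorts the inputs by net name and permutes the table to match; `canonicalize_lut` and the net names are hypothetical.

```python
def canonicalize_lut(init, inputs):
    """Sort LUT inputs by net name and permute the truth table to match.

    inputs[0] is the least significant bit of the table index.
    """
    width = len(inputs)
    # order[new_position] = old_position
    order = sorted(range(width), key=lambda position: inputs[position])
    new_init = 0
    for new_index in range(1 << width):
        old_index = 0
        for new_position, old_position in enumerate(order):
            if (new_index >> new_position) & 1:
                old_index |= 1 << old_position
        if (init >> old_index) & 1:
            new_init |= 1 << new_index
    return new_init, [inputs[position] for position in order]


# "a AND NOT b" written with the two possible input orders.
first = canonicalize_lut(0b0010, ["a", "b"])
second = canonicalize_lut(0b0100, ["b", "a"])
```

After canonicalization both cells carry the same table and the same input list, so a pass that merges structurally identical cells can treat them as duplicates.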

17:52 pointfree has quit [Ping timeout: 264 seconds]

17:52 digshadow has quit [Ping timeout: 264 seconds]

17:52 jeandet has quit [Ping timeout: 264 seconds]

17:53 bubble_buster has joined ##openfpga

17:54 pointfree has joined ##openfpga

17:55 jeandet has joined ##openfpga

17:55 digshadow has joined ##openfpga

18:17 ZipCPU|Laptop has joined ##openfpga

18:24 f003brv has joined ##openfpga

19:15 <sensille> shapr: as you recommended haskell so vehemently i now started to read on it :)

19:16 <shapr> sensille: what are your thoughts?

19:16 <shapr> sensille: I like to think I really advocate learning one programming language from each of 1. imperative 2. functional 3. logic

19:16 <sensille> now i know where rust's type system comes from :)

19:16 <shapr> like, *really* learning

19:17 f003brv has quit [Ping timeout: 256 seconds]

19:17 <shapr> I once spent three or four months using only prolog, so I'm not sure I'm even following my own advice

19:17 <shapr> yeh! lots of rust things come from Haskell

19:18 <sensille> but when i read something like 'all (`elem` ['a'..'z']) "Frobozz"' i immediately think: this might be nice, but can this ever perform well?

19:18 <shapr> Haskell is surprisingly fast

19:18 <sensille> (from the book "real world haskell")

19:18 lambdabot has joined ##openfpga

19:18 <shapr> > let ones = 1 : ones in take 15 ones

19:18 <lambdabot> [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]

19:19 <shapr> > let fib = 1 : 1 : zipWith (+) fib (tail fib) in take 15 fibs

19:19 <whitequark> daveshah: so, i'm entertaining myself right now by repeatedly running

19:19 <lambdabot> error:

19:19 <lambdabot> • Variable not in scope: fibs :: [a]

19:19 <lambdabot> • Perhaps you meant ‘fib’ (line 1)

19:19 <shapr> > let fib = 1 : 1 : zipWith (+) fib (tail fib) in take 15 fib

19:19 <whitequark> `lut2mux ; abc -lut 4`

19:19 <lambdabot> [1,1,2,3,5,8,13,21,34,55,89,144,233,377,610]

19:19 <sorear> *blink*

19:19 <whitequark> each time i get slightly different result

19:19 <sorear> @help

19:19 <lambdabot> help <command>. Ask for help for <command>. Try 'list' for all commands

19:19 <shapr> sorear: want to learn Haskell? again? ;-)

19:19 <whitequark> sometimes it infers more logic

19:19 <whitequark> sometimes less

19:19 <sensille> i'd like to see a simple loop like for (i=0; i < 10000000; ++i) a += i; written in haskell and yielding a nice result in disassembly

19:19 <sorear> @list

19:19 <lambdabot> What module? Try @listmodules for some ideas.

19:19 <whitequark> @no

19:19 <lambdabot> Error: expected a Haskell expression or declaration

19:20 <shapr> sensille: tried godbolt?

19:20 <shapr> lambdabot: @leave ##openfpga

19:20 lambdabot has left ##openfpga [##openfpga]

19:20 <shapr> bye now

19:20 <shapr> no more offtopic spam from that bot

19:23 <shapr> sensille: https://haskell.godbolt.org/z/yjg-C5 ?

19:23 <miek> i'm having some trouble bringing up a Glasgow revB - `glasgow factory` seems to read back all 0s from the eeprom but `fx2tool` suggests it programmed ok? https://pastebin.com/raw/ZLjxQWce

19:23 <sensille> shapr: i just used ghc and looked at some results

19:24 <tnt> Is there such things as gearboxes ICs that take 2 * 5G serdes and make a 10G one ?

19:24 <whitequark> miek: hm, interesting

19:24 <sensille> shapr: but it might be too offtopic for this channel

19:24 <whitequark> miek: can you try this firmware? https://cloud.whitequark.org/s/Kzcq5gJP43DRnFq

19:25 <shapr> sensille: in general (very broad brush strokes) , naive straightforward Haskell runs in about twice the time of naive straightforward C or C++

19:26 <shapr> I'd argue that naive straightforward Haskell takes less than half the human thinking time, compared to C or C++, to implement the same solution.

19:27 <shapr> sorear: you have experience on both sides, what do you think?

19:27 <sensille> what i really need to understand is copying data vs. manipulating in place

19:28 <shapr> I'd really like to see someone solving the Advent of Code puzzles on an FPGA

19:28 <shapr> (going back on topic)

19:28 <shapr> sensille: want to try #haskell-beginners or just #haskell for this topic?

19:29 <miek> whitequark: same results with that firmware

19:29 <whitequark> miek: very strange

19:30 <sensille> shapr: i haven't even read one third of the book, so definitely -beginners :(

19:30 <shapr> works for me

19:30 <sensille> s/(/)

19:31 <shapr> oh wow, I found such a project! https://github.com/alokmenghrajani/adventofcode2018 using the icestick even

19:31 <whitequark> miek: if you re-plug the device, it comes up with Z-99999... serial, right?

19:32 <whitequark> or rather

19:32 <whitequark> what VID/PID/DID does it have?

19:33 <miek> so after `factory` it comes up with Z-9999.., after replugging comes up with 20b7:9db1 but no product/manufacturer/serial

19:34 <whitequark> mmm, try this

19:35 <whitequark> do `glasgow factory` then `glasgow flash`

19:37 <whitequark> daveshah: hey, what do you think about lut cascade?

19:37 <whitequark> should this be done on yosys level? nextpnr?

19:37 <daveshah> nextpnr

19:37 <daveshah> I wrote a half finished attempt at it

19:37 <daveshah> Atm it's actually hurting QoR

19:38 <daveshah> This is because nextpnr's placer doesn't handle relative constraints that well

19:38 <miek> https://pastebin.com/raw/vwssivCx

19:38 <daveshah> It can't swap chains after constraint legalisation

19:38 <daveshah> If you want to look: https://github.com/daveshah1/nextpnr/tree/lutcascade

19:43 <whitequark> ahhh ok

19:43 <whitequark> i was just thinking about what i should do next...

19:43 <whitequark> daveshah: any suggestions btw?

19:43 <whitequark> miek: any chance you can take a look at the i2c bus?

19:43 <whitequark> i have never seen anything like this

19:43 <whitequark> wait.

19:43 <whitequark> waaaaait.

19:43 <daveshah> whitequark: imo looking at timing driven synthesis optimisations would be awesome

19:44 <whitequark> writes succeed, but reads come up with 0

19:44 <whitequark> miek: your I2C SDA is stuck at 0.

19:44 <whitequark> SDA and/or SCL.

19:44 <daveshah> At first just critical path based lut merging

19:44 <whitequark> daveshah: tell me more

19:44 <daveshah> This is not something I really know about, just random thoughts that would be fun to play with

19:45 <whitequark> oh ok

19:45 <daveshah> But I think the topological ordering could be used to work out path lengths

19:45 <daveshah> And that could be used to guide the LUT merger to make decisions based on minimising the critical path
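
The idea of using a topological order to work out path lengths can be sketched as follows. This is only an illustration of the general technique, counting logic levels rather than real delays, and it is not taken from Yosys or nextpnr. The `fanin` dictionary format and the cell names are assumptions; `graphlib` needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def path_depths(fanin):
    """Return the logic depth of every cell.

    fanin maps each cell name to the cells driving its inputs.
    Cells that only appear as drivers count as depth 1.
    """
    depth = {}
    # static_order() yields every cell after all of its drivers.
    for cell in TopologicalSorter(fanin).static_order():
        drivers = fanin.get(cell, ())
        depth[cell] = 1 + max((depth[driver] for driver in drivers), default=0)
    return depth


fanin = {"c": ["a", "b"], "d": ["c"], "e": ["a"]}
depths = path_depths(fanin)
critical_depth = max(depths.values())
```

A merger guided by this would prefer merges that shorten the cells whose depth equals `critical_depth`; how to weigh that against area is left open in the discussion above.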

19:45 <whitequark> so what i was thinking about is doing something to `proc` (i think it's proc?)

19:45 <whitequark> so that it would actually make sensible names

19:45 <whitequark> and not $fuck$you

19:46 <daveshah> Yes please

19:46 <whitequark> ok

19:46 <whitequark> gonna do that next.

19:46 <daveshah> Also the alu/macc stuff

19:46 <whitequark> yeah i'm going to start with simple logic

19:46 <daveshah> Carry chains always have stupid names atm

19:46 <whitequark> then alu

19:46 <whitequark> then ffs

19:46 <whitequark> *everything* has stupid names atm.

19:46 <daveshah> Yep

19:47 <whitequark> why cannot yosys work out that x | y should be called _x_or_y_ or something

19:47 <whitequark> grumble grumble

19:47 <daveshah> Where there is no sensible naming, it should be possible to use the src attribute to get a source file and line and use that

19:47 <whitequark> i fucking *knew* it in like *2015* that i will end up writing this

19:47 <whitequark> and here we are

19:47 <whitequark> yes

19:47 <daveshah> From memory Yosys other than abc is quite good at tracking src

19:47 <whitequark> yes

19:47 <whitequark> that i appreciate for sure

19:47 <whitequark> but src is used basically nowhere right now

19:48 azonenberg_work has quit [Ping timeout: 246 seconds]

19:48 <daveshah> Yes, it should be possible to replace almost all autogen names with src for a minimum
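
A rough sketch of the naming idea discussed above: take each cell's `src` attribute and use it as the name, adding a counter so that several cells from the same source line stay distinct. This is not the Yosys implementation; the input format, the `$<n>` suffix and the function name `rename_from_src` are all assumptions made for illustration.

```python
def rename_from_src(cells):
    """cells is a list of (auto_name, attributes); returns {auto_name: new_name}."""
    used = set()
    renamed = {}
    for auto_name, attributes in cells:
        src = attributes.get("src")
        if src is None:
            new_name = auto_name  # no source location, keep the generated name
        else:
            suffix = 0
            new_name = f"{src}${suffix}"
            while new_name in used:  # several cells can share one source line
                suffix += 1
                new_name = f"{src}${suffix}"
        used.add(new_name)
        renamed[auto_name] = new_name
    return renamed


cells = [
    ("$abc$1", {"src": "boneless.v:176"}),
    ("$abc$2", {"src": "boneless.v:176"}),
    ("$auto$3", {}),
]
names = rename_from_src(cells)
```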

19:48 <whitequark> daveshah: looked through dumped verilog

19:49 <whitequark> looks like with -noabc -relut, every single cell has proper src

19:49 <daveshah> Sweet

19:50 <TD-Linux> miek, you might have i2c stuck. it goes all the way to the adc and dac chips so it may be a short on any of those

19:50 <TD-Linux> oh I see it was already answered

19:50 <whitequark> TD-Linux: I think I need to add a detector for that...

19:51 <whitequark> query a nonexistent chip

19:51 <whitequark> if it ACKs, I2C is stuck.
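
The detector described above can be sketched as below. It relies on the fact that an ACK is SDA held low during the ninth clock, so a bus with SDA stuck low appears to ACK every address. The `probe` callable is hypothetical (it stands for "address the chip, return True on ACK") and is not a Glasgow API; the addresses 0x7E and 0x7F are assumed to be unused on the board.

```python
def i2c_looks_stuck(probe, unused_addresses=(0x7E, 0x7F)):
    """Return True if addresses that should be empty all answer with an ACK."""
    return all(probe(address) for address in unused_addresses)


def healthy_probe(address):
    # Stand-in for a working bus with devices at 0x50 and 0x51 only.
    return address in {0x50, 0x51}


def stuck_probe(address):
    # Stand-in for SDA shorted low: every address seems to ACK.
    return True
```

Probing more than one unused address lowers the chance of a false alarm when something unexpected does live at one of them.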

19:51 <TD-Linux> it would be nice, it's an easy failure because i2c goes so many places

19:52 <whitequark> ok sure

19:55 _whitenotifier has quit [Remote host closed the connection]

20:09 <whitequark> TD-Linux: actually

20:09 <whitequark> what the hell is happening on that board?

20:09 <whitequark> i just tried and if I deliberately add an i2c fault it hangs...

20:09 <TD-Linux> it sounds to me kind of what happened when I had scl and sda shorted together

20:10 <whitequark> yeah ,I tried that too

20:10 <daveshah> Missing pullup?

20:10 <daveshah> That might make i2c go funky

20:10 <TD-Linux> but on mine, it would get halfway through factory, change ids, and then return to original id when unplugged and replugged

20:10 <whitequark> oh, that sounds about right!

20:10 <whitequark> TD-Linux: hmmmm

20:12 <TD-Linux> (er to be clear, never appear as the new id)

20:13 <miek> i2c seems to be behaving: https://imgur.com/a/f8LbyVD

20:13 <miek> (that's while running `glasgow -vv flash`)

20:13 <whitequark> miek: that does not seem normal

20:13 <whitequark> those gaps at low

20:13 <whitequark> not sure though

20:13 <whitequark> can you try and decode it?

20:13 <whitequark> miek: i am about 95% sure you have an i2c fault somewhere

20:13 <whitequark> double-check pullups

20:14 <whitequark> double-check continuity and solder bridges

20:15 <miek> ok, will double-check everything and see if i can find an fx2 to decode it

20:15 <TD-Linux> solder wick the dacs and adcs

20:23 azonenberg_work has joined ##openfpga

20:32 <mwk> huh

20:32 <mwk> so 7a15t/7a35t/7a30t/7s50 are basically the same shit with different markings?

20:32 <daveshah> Yes

20:32 <mwk> nice

20:32 <daveshah> They are resource count limited

20:32 <daveshah> So the entire die is guaranteed working

20:33 <whitequark> daveshah:

20:33 <whitequark> Info: 1.2 19.8 Source boneless.v:176$893_LC.O

20:33 <whitequark> Info: 1.8 21.5 Net s_opB[0] budget 3.783000 ns (13,21) -> (12,22)

20:33 <whitequark> Info: Sink boneless.v:269$1058_LC.I0

20:33 <whitequark> Info: 1.3 22.8 Source boneless.v:269$1058_LC.O

20:33 <whitequark> Info: 2.4 25.2 Net boneless.v:269$14[0] budget 3.783000 ns (12,22) -> (11,24)

20:33 <whitequark> Info: Sink boneless.v:269$646$CARRY.I2

20:33 * mwk just got a very grateful friend with a 7a35t and a h4xed bitstream

20:33 <whitequark> daveshah: ooooooh

20:34 <whitequark> so my critical path is a subtractor in ALU

20:34 <whitequark> holy shit this is SO USEFUL

20:34 <daveshah> Nice

20:34 <whitequark> i can't believe no one using yosys spent like 30 minutes writing a pass

20:34 <whitequark> what the hell lmao

20:35 <tnt> is that the 'dress' stuff ? or something else ?

20:36 <whitequark> no

20:36 <whitequark> rename -src

20:36 <whitequark> I don't really care about abc anymore :P

20:39 <whitequark> daveshah: https://github.com/YosysHQ/yosys/pull/722

20:40 <whitequark> might actually (gasp) look at floorplanning

20:40 <miek> so i checked/reflowed a bunch of stuff but no joy. i haven't got anything around to decode easily, but the waveform is identical on the scope between using `glasgow flash` (all 0s) and `fx2tool read_eeprom` (correct readback)

20:40 <whitequark> miek: very strange

20:40 <whitequark> unfortunately, i don't really know how to help you at the moment

20:41 <whitequark> i'll let you know if i have ideas, or ping me in a few days

20:41 <miek> ok, no worries, i'll keep playing around. cheers for the help so far

20:42 <whitequark> daveshah: so, like half of the boneless cpu design is attributed to the FSM

20:42 <whitequark> cells wise

20:42 <whitequark> and nets

20:42 <daveshah> whitequark: PR looks good, thanks

20:44 <daveshah> Interesting that the FSM is so significant even with a 16 bit datapath

20:46 <miek> oh good it gets stranger, wireshark shows the correct data coming in

20:46 <whitequark> miek: ohhhhh

20:47 <whitequark> now *this* is something i know

20:47 <whitequark> try updating your python-libusb

20:47 <whitequark> miek: are you using the one in debian by any chance?

20:48 <miek> ubuntu, but yeah. i just installed one from pip and it works! thanks!

20:48 <whitequark> daveshah: *sigh* https://github.com/YosysHQ/nextpnr/issues/167

20:48 <whitequark> miek: what is the version you have in debian

20:48 <miek> 1.6.3-1

20:49 <daveshah> Guess the nextpnr team all have shitty laptops :P

20:49 <whitequark> hehe

20:51 _whitenotifier has joined ##openfpga

20:51 <_whitenotifier> [whitequark/Glasgow] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/fp1XA

20:51 <_whitenotifier> [whitequark/Glasgow] whitequark 3aafe48 - software: require libusb1>=1.6.6.

20:53 <_whitenotifier> [Glasgow] Success. The Travis CI build passed - https://travis-ci.org/whitequark/Glasgow/builds/464039932?utm_source=github_status&utm_medium=notification

21:45 <miek> yay, little bit of rework and it's passing selftest :)

21:49 <whitequark> sweet!

21:55 <tnt> I still have one failing the loopback test for the second EP pair ... couldn't see anything wrong with the solder joints under magnification.

22:01 <tnt> Mmm, data end up being slightly mangled :/ b'\xaaU.god yzal eht rgvo spmuj xof nworb kciuq ehT') vs b'\xaaU.god yzal eht revo spmuj xof nworb kciuq ehT'

22:02 <whitequark> uh

22:02 <whitequark> those look the same?

22:03 <daveshah> rgvo vs revo

22:03 <whitequark> oh, so bit 2
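
The mismatch between the two buffers pasted above can be located mechanically by XORing them byte by byte. A small sketch (the helper name `differing_bits` is made up):

```python
def differing_bits(expected, actual):
    """Return (index, xor_mask) for each byte position where the buffers differ."""
    return [
        (index, e ^ a)
        for index, (e, a) in enumerate(zip(expected, actual))
        if e != a
    ]


expected = b'\xaaU.god yzal eht revo spmuj xof nworb kciuq ehT'
actual = b'\xaaU.god yzal eht rgvo spmuj xof nworb kciuq ehT'

diffs = differing_bits(expected, actual)
masks = {mask for _, mask in diffs}
```

Here a single byte differs and the mask is 0x02: 'e' (0x65) became 'g' (0x67). Counting from zero, that is data bit 1.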

22:03 <whitequark> fascinating

22:03 <whitequark> tnt: are you using an up to date toolchain?

22:04 <tnt> whitequark: From a few hours ago ... with about every experimental patch from daveshah, you and me ...

22:07 ClausPillow has joined ##openfpga

22:07 <whitequark> mm, okay, so it's probably not gateware

22:07 <whitequark> only the second ep pair though? let me see

22:07 <whitequark> hm those pins aren't really close

22:07 <whitequark> dunno

22:08 <tnt> yeah .. EP2OUT->EP6IN works fine.

22:08 <whitequark> weird.

22:08 <whitequark> always the same error?

22:08 <prpplague> anyone heard of details for orconf 2019?

22:09 <tnt> whitequark: no, seems to change sometimes. b'\xaaW.god yzal eht rgvo spmuj xof nworb kckuq ehT') kckuq vs kciuq

22:09 <whitequark> same bit

22:10 <whitequark> hmmm

22:10 <tnt> Ok, I'll recheck bit 2.

22:10 <whitequark> wonder if it's PTV

22:10 <whitequark> it *could* be PTV but i don't know

22:10 <whitequark> the FX2 bus stuff is still slightly suspect to me

22:33 Zorix has joined ##openfpga

22:34 <tnt> https://imgur.com/a/nnjQZHg

22:34 <tnt> Those look just fine to me :/

22:35 <whitequark> those look damn great

22:35 <whitequark> i have never seen a better qfn solder joint in my life

22:35 <whitequark> tnt: what about sending a stream of 55 aa

22:35 <whitequark> via the loopback pipe

22:36 <whitequark> and then looking at it via a scope?

22:36 genii has quit [Remote host closed the connection]

22:36 <_whitenotifier> [Glasgow] miek opened pull request #88: access.direct.demultiplexer: fix TypeError when length is None - https://git.io/fp15i

22:38 <tnt> I can give it a shot. There was a benchmark somewhere right ? Probably the easiest to mod to send that.

22:38 <whitequark> tnt: the benchmark applet uses an LFSR

22:38 <whitequark> actually

22:38 <whitequark> try running it

22:38 <whitequark> it uses EP2-EP6

22:38 <whitequark> so if it's electrical you'll probably see the bug on those two too

22:39 <_whitenotifier> [Glasgow] Success. The Travis CI build passed - https://travis-ci.org/whitequark/Glasgow/builds/464084330?utm_source=github_status&utm_medium=notification

22:39 <_whitenotifier> [whitequark/Glasgow] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/fp157

22:39 <_whitenotifier> [whitequark/Glasgow] miek 8806951 - access.direct.demultiplexer: fix TypeError when length is None

22:39 <_whitenotifier> [Glasgow] whitequark closed pull request #88: access.direct.demultiplexer: fix TypeError when length is None - https://git.io/fp15i

22:39 <tnt> Seem to 'hang' at the loopback test (i.e. never return)

22:39 <whitequark> no, that doesn't work

22:39 <whitequark> it's uh

22:39 <whitequark> bug #44

22:39 <whitequark> run either source, sink, or both, explicitly

22:40 <tnt> I: glasgow.applet.benchmark: running benchmark mode source for 4.000 MiB

22:40 <tnt> I: glasgow.applet.benchmark: mode source: 10.193 MiB/s

22:40 <tnt> I: glasgow.applet.benchmark: running benchmark mode sink for 4.000 MiB

22:40 <tnt> I: glasgow.applet.benchmark: mode sink: 0.969 MiB/s

22:41 <_whitenotifier> [Glasgow] Success. The Travis CI build passed - https://travis-ci.org/whitequark/Glasgow/builds/464084973?utm_source=github_status&utm_medium=notification

22:42 <whitequark> tnt: hmmm

22:44 <tnt> Can I easily make the benchmark use the other EP ?

22:44 <whitequark> tnt: add a dummy target.multiplexer.claim_interface() call in Benchmark.build

22:44 <whitequark> target.multiplexer.claim_interface(self, args=None)

22:44 <whitequark> something like this

22:45 <tnt> ran just fine too

22:46 <whitequark> bizarre.

22:47 <whitequark> does -v mention EP4/EP8?

22:47 <whitequark> -vv

22:48 <tnt> Yeah T: glasgow.device.hardware: USB: BULK EP8 IN data=<dcf3b8e771cfe29ec43d8 .....

22:48 <whitequark> ok

22:48 <whitequark> your hardware is likely fine

22:49 <whitequark> this is probably my shitty FX2 arbiter then

22:49 <whitequark> I really need to rewrite it and, I dunno, add tests...

22:49 <SolraBizna> why route when you can have a 128-layer board and each signal its own plane

22:50 <tnt> whitequark: Do you use the IO registers ?

22:50 <whitequark> tnt: yes

22:50 <whitequark> before that it barely worked

22:51 <tnt> yeah not surprising, timing would be highly dependent of the PnR results.

22:55 <whitequark> i ned a model of fx2 in migen...

22:55 <whitequark> need*

22:56 <tnt> why would it affect only D1 though ?

22:56 <whitequark> no idea

22:56 <whitequark> i have not observed this particular failure

22:56 <whitequark> can you try hmmm

22:57 <whitequark> tnt: can you locally modify migen to pass --randomize-seed to nextpnr

22:57 <whitequark> and see if that changes things

23:02 <tnt> Yeah it seems it does

23:02 <tnt> Is there a way to force a rebuild ?

23:03 <whitequark> yes

23:03 <whitequark> --rebuild :p

23:06 <tnt> It actually works most of the time ... I guess just not with the default seed on my particular machine.

23:06 <whitequark> tnt: so this is a timing issue... bleh

23:06 <whitequark> :S

23:06 <whitequark> i was afraid of that

23:08 <tnt> nextpnr doesn't really analyze the path to/from D_{IN,OUT} as part of the sync logic. It doesn't seem to know when IO registers are enabled or not.

23:09 <daveshah> Yes, that needs fixing

23:09 <daveshah> It will count as the $async <-> clock paths though

23:10 <whitequark> ohhhh

23:10 <tnt> yeah, that's how I know it doesn't work atm :) because I see those paths in <async>

23:10 <daveshah> I'm not convinced icetime handles them entirely correctly either

23:12 <daveshah> However, if the delay in the <async> path is still less than the clock period then it's not a problem

23:12 <daveshah> If it is, then that will be it

23:12 <whitequark> daveshah: there is also setup/hold timing of fx2

23:12 <whitequark> which is rather complicated.

23:13 <daveshah> Yes, it is on my masters todo list to look at this kind of stuff in nextpnr

23:13 <daveshah> But that won't be until next year now

23:14 <whitequark> the fx2 timing is nightmarish in places

23:14 <whitequark> because it has setup/hold timings... longer than one clock cycle

23:14 <whitequark> like, what?

23:15 <daveshah> yeah that's crazy

23:16 <tnt> whitequark: mmm ...

23:16 <tnt> whitequark: instead of having SB_IO followed by a SB_GB, can't you use SB_GB_IO ?

23:17 <whitequark> tnt: where?

23:17 <whitequark> also, is that actually different?

23:17 <tnt> Yes.

23:17 <whitequark> shit

23:17 <whitequark> ok fine

23:17 <daveshah> Yes SB_IO, SB_GB adds a bit of fabric routing

23:17 <tnt> As is, the clock will be routed to the fabric and brought to a random SB_GB depending on placement.

23:18 <tnt> which means the clock phase will vary run to run.

23:18 <whitequark> ughhhhhh

23:21 <daveshah> Seems that the ice40up5k input register has a whole 4ns of its own setup time

23:21 <daveshah> And clock to out of 1.5ns

23:22 <daveshah> Just the pin and register excluding global network etc
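
[Editor's sketch] The figures quoted above (4 ns input register setup, 1.5 ns clock to out) can be put into a simple setup-slack budget for a synchronous interface. This is a minimal sketch, not the analysis nextpnr or icetime performs. Only the two FPGA numbers come from the log; the 48 MHz clock, the external device timings, the board delay and the skew value are placeholder assumptions. `clock_skew_ns` stands for how much later the FPGA's internal clock edge arrives than the edge at the pin, which is the quantity that changes with clock routing.

```python
def input_slack_ns(period_ns, ext_clk_to_out_ns, board_delay_ns,
                   fpga_setup_ns, clock_skew_ns=0.0):
    """Setup slack for data launched by the external device and captured by the FPGA."""
    # A later FPGA capture clock (positive skew) gives the data more time to arrive.
    return period_ns + clock_skew_ns - (ext_clk_to_out_ns + board_delay_ns + fpga_setup_ns)


def output_slack_ns(period_ns, fpga_clk_to_out_ns, board_delay_ns,
                    ext_setup_ns, clock_skew_ns=0.0):
    """Setup slack for data launched by the FPGA and captured by the external device."""
    # A later FPGA launch clock (positive skew) eats into the available time.
    return period_ns - clock_skew_ns - (fpga_clk_to_out_ns + board_delay_ns + ext_setup_ns)


FPGA_SETUP_NS = 4.0        # from the log: input register setup time
FPGA_CLK_TO_OUT_NS = 1.5   # from the log: clock to out
PERIOD_NS = 1000 / 48      # assumption: 48 MHz interface clock
EXT_CLK_TO_OUT_NS = 10.0   # placeholder, not a datasheet value
EXT_SETUP_NS = 9.0         # placeholder, not a datasheet value
BOARD_DELAY_NS = 0.5       # placeholder
SKEW_NS = 2.0              # placeholder for pin-to-internal clock delay

print("input slack :", input_slack_ns(PERIOD_NS, EXT_CLK_TO_OUT_NS, BOARD_DELAY_NS,
                                      FPGA_SETUP_NS, SKEW_NS))
print("output slack:", output_slack_ns(PERIOD_NS, FPGA_CLK_TO_OUT_NS, BOARD_DELAY_NS,
                                       EXT_SETUP_NS, SKEW_NS))
```

Because the skew term enters the two directions with opposite signs, a change in clock routing can repair one direction while breaking the other. Hold checks are left out of this sketch.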

23:22 <whitequark> daveshah: what the fuck

23:22 <Richard_Simmons> I'm seeing more and more of these Gowin fpgas, yet I still know nothing about them

23:23 <whitequark> this makes the benchmark applet fail on my glasgow

23:23 <whitequark> using SB_GB_IO

23:23 <whitequark> but only in sink mode

23:23 <whitequark> tnt: can you check this patch https://hastebin.com/susujikoro.diff

23:23 <daveshah> Hmm

23:23 <tnt> Well, using SB_GB_IO the phase will be constant .... I didn't say it was going to be right :p

23:24 <whitequark> actually, selftest now consistently fails

23:24 <daveshah> Add a few manually placed LUTs and a manually placed GB to sort it out :P

23:25 <whitequark> AAAAAAAA

23:26 <tnt> Yeah, now both tests fail ... consistently.

23:26 <whitequark> this looks like uh

23:26 <whitequark> 54686520717569636b2062726f776e20666f78206a756d7073206f76657220746865206c617a7920646f672e55aa

23:26 <whitequark> 5454686520717569636b2062726f776e20666f78206a756d7073206f76657220746865206c617a7920646f672e55

23:26 <whitequark> this looks like SLRD strobe is not registered in time
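
[Editor's sketch] Decoding the two hex dumps pasted above makes the failure easier to see. The code assumes the first dump is the expected data and the second is what was actually read back; the log does not say which is which.

```python
expected = bytes.fromhex(
    "54686520717569636b2062726f776e20666f78206a756d7073206f76657220"
    "746865206c617a7920646f672e55aa"
)
observed = bytes.fromhex(
    "5454686520717569636b2062726f776e20666f78206a756d7073206f766572"
    "20746865206c617a7920646f672e55"
)

# The payload is ASCII text followed by the marker bytes 0x55 0xaa.
print(expected[:-2].decode("ascii"))
print(observed[:-1].decode("ascii"))

# The observed stream is the expected stream with its first byte repeated,
# which pushes everything else back by one and drops the final 0xaa.
shifted = expected[:1] + expected[:-1]
print(observed == shifted)
```

A first byte that is sampled twice before the stream advances is consistent with a read strobe taking effect one cycle late.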

23:27 <daveshah> The delay with a GB_IO is probably much lower than with a GB

23:28 <whitequark> lmao this *totally* fucks up all of my strobes

23:29 <daveshah> This is why most FPGAs have input delay blocks

23:29 <tnt> You can use the PLL to select the phase of the clock ...

23:30 <whitequark> tnt: try this https://hastebin.com/izovololen.diff

23:30 <whitequark> in addition

23:30 <whitequark> yeah, this unfucks selftest and benchmark for me

23:30 <whitequark> now, i'm not sure why that is

23:30 <whitequark> will need to read the spec again

23:31 <tnt> Yeah, passes self test.

23:31 <whitequark> shit

23:31 <whitequark> ok

23:31 <whitequark> thanks

23:31 <_whitenotifier> [Glasgow] whitequark created branch sb_gb_io - https://git.io/fp4Wh

23:31 <_whitenotifier> [whitequark/Glasgow] whitequark pushed 2 commits to sb_gb_io [+0/-0/±6] https://git.io/fp1A4

23:31 <_whitenotifier> [whitequark/Glasgow] whitequark 50cc07a - cli: add --synthesis-opts, for passing options to Yosys' synth_ice40.

23:31 <_whitenotifier> [whitequark/Glasgow] whitequark 1df4379 - WIP

23:32 <_whitenotifier> [whitequark/Glasgow] whitequark pushed 1 commit to master [+0/-0/±3] https://git.io/fp1AB

23:32 <_whitenotifier> [whitequark/Glasgow] whitequark efb6bc6 - cli: add --synthesis-opts, for passing options to Yosys' synth_ice40.

23:32 <_whitenotifier> [whitequark/Glasgow] whitequark pushed 1 commit to sb_gb_io [+0/-0/±3] https://git.io/fp1AR

23:32 <_whitenotifier> [whitequark/Glasgow] whitequark eadfa5a - WIP

23:32 <_whitenotifier> [Glasgow] Error. The Travis CI build could not complete due to an error - https://travis-ci.org/whitequark/Glasgow/builds/464103933?utm_source=github_status&utm_medium=notification

23:34 <_whitenotifier> [Glasgow] whitequark opened issue #89: Use SB_GB_IO instead of SB_IO+SB_GB - https://git.io/fp1A2

23:34 <_whitenotifier> [Glasgow] whitequark assigned issue #89: Use SB_GB_IO instead of SB_IO+SB_GB - https://git.io/fp1A2

23:34 <whitequark> tnt: ^

23:34 <_whitenotifier> [Glasgow] Success. The Travis CI build passed - https://travis-ci.org/whitequark/Glasgow/builds/464104158?utm_source=github_status&utm_medium=notification

23:34 <_whitenotifier> [Glasgow] Success. The Travis CI build passed - https://travis-ci.org/whitequark/Glasgow/builds/464104221?utm_source=github_status&utm_medium=notification

23:34 <tnt> It's weird how that one board seems to behave so differently from other people's ... (and from the other board I built). First the FX2 LEDs and now this :p

23:34 <whitequark> well, yeah

23:34 <whitequark> it's interesting

23:36 <whitequark> tnt: think you can try and abuse the design a bit in that branch?

23:36 <whitequark> if it works decently enough i might just merge it...

23:36 <whitequark> or write a model...

23:38 <tnt> You could try to run the design in icecube to get the "official" timing number for the IO (i.e. how much sys_clk is delayed compared to the clock at the io pin, and how much setup/hold is expected on each pin and the clk to out etc ...)

23:38 <whitequark> I don't even have icecube

23:38 <tnt> Ah :) Well, I can give it a shot.

23:38 <tnt> Is there an option to save the .v / .pcf ?

23:39 <tnt> CTRL-C during nextpnr works :p

23:39 <whitequark> `glasgow build -t v`

23:39 <whitequark> just for verilog

23:39 <whitequark> or

23:39 <whitequark> `glasgow build -t zip`

23:39 <whitequark> for the entire design

23:40 <whitequark> caution: zipbomb

23:42 <tnt> E2792: Instance SB_IO_18 incorrectly constrained at SB_IO_OD location

23:42 <tnt> damn

23:44 <tnt> wtf ... they removed all the underscores in the port names from SB_IO to SB_IO_OD ...

23:45 Bike has joined ##openfpga

23:49 ZipCPU|Laptop has quit [Ping timeout: 240 seconds]