##openfpga on 2019-11-24 — irc logs at freenode.irclog.whitequark.org

00:04 freemint has joined ##openfpga

00:04 freemint has quit [Remote host closed the connection]

00:09 <ZirconiumX> I'm pretty sure there's an SPIResource in nmigen-boards

00:12 rektide has joined ##openfpga

00:15 <kernlbob> So I guess Resources and Records can be bidirectional. Thanks.

00:43 <kernlbob> Can I pass a record through a module without unpacking it? Can't say `m.d.comb += submod.spi.eq(self.spi)` since signals go both ways.

01:13 dh73 has quit [Quit: Leaving.]

01:28 davidw has joined ##openfpga

01:28 davidw is now known as Guest83739

01:29 Guest15525 has quit [Read error: Connection reset by peer]

01:31 <kernlbob> I tried using the `connect` method. But it seems to be for records that have fanin or fanout.

01:35 <OmniMancer> My map of bits I know do things in the logic blocks is much more full now

01:36 <kernlbob> I am dropping my line of questions for now. Next week I will try to pull together an example that shows where I'm confused.

01:45 freemint has joined ##openfpga

01:45 freeemint has joined ##openfpga

01:46 freeemint has quit [Remote host closed the connection]

02:06 Guest83739 has quit [Read error: Connection reset by peer]

02:07 Guest83739 has joined ##openfpga

02:18 juri__ has joined ##openfpga

02:18 juri_ has quit [Ping timeout: 240 seconds]

02:35 freemint has quit [Remote host closed the connection]

02:39 Guest83739 has quit [Quit: Leaving]

02:43 freemint has joined ##openfpga

02:44 <OmniMancer> daveshah: what is the best way to give a map of bits that are not a known config?

04:12 IanMalcolm has quit [Read error: Connection reset by peer]

04:13 IanMalcolm has joined ##openfpga

04:16 simeonm has joined ##openfpga

04:37 rohitksingh has quit [Ping timeout: 250 seconds]

04:37 rohitksingh has joined ##openfpga

05:32 _whitelogger has joined ##openfpga

05:34 simeonm has quit [Quit: Bye]

06:15 _whitelogger has joined ##openfpga

06:15 _whitelogger_ has quit [Remote host closed the connection]

06:20 <OmniMancer> now to figure out how IO works

06:50 cr1901_modern has quit [Quit: Leaving.]

06:53 cr1901_modern has joined ##openfpga

07:04 <OmniMancer> so many unknown bits

07:21 m4ssi has joined ##openfpga

08:07 _whitenotifier-f has quit [Ping timeout: 264 seconds]

08:38 <pepijndevos> IO is the most complicated part

08:38 <pepijndevos> And at least on Gowin, there are several variations of any given tile -.-

08:39 <OmniMancer> I think the Anlogic part has different tiles for the left right top and bottom

08:40 <pepijndevos> Yea, and on Gowin there are at least two different per side and they have loooots of options for logic levels, current drive, LVDS, etc, etc

08:41 <pepijndevos> rn I only support just... input and output at the default setting

08:44 * zignig female humans @ ~2k days are dangerous little monsters.

09:21 m4ssi has quit [Remote host closed the connection]

09:22 m4ssi has joined ##openfpga

09:39 Jybz has joined ##openfpga

09:43 rombik_su has joined ##openfpga

10:08 rohitksingh has quit [Ping timeout: 250 seconds]

10:10 m4ssi has quit [Remote host closed the connection]

10:16 Asu has joined ##openfpga

10:20 <pepijndevos> what is the type of a bare parameter LUT = 0; in verilog?

10:20 <pepijndevos> In particular, how wide is it?

10:22 juri__ has quit [Read error: Connection reset by peer]

10:26 juri_ has joined ##openfpga

10:28 <ZirconiumX> pepijndevos: this is Verilog, did you expect this to be reasonable? AIUI it coerces into the width of whatever expression you use it in.

10:28 <pepijndevos> eh ok

10:37 AndrevS has joined ##openfpga

10:39 <whitequark> no

10:39 <whitequark> pepijndevos: 0 is the same as 32'd0

10:39 <whitequark> why 32? because fuck you

10:40 <OmniMancer> Is verilog at all derived from C?

10:40 <pepijndevos> So how does this work with anything bigger than a LUT5? https://github.com/YosysHQ/yosys/blob/master/techlibs/common/simlib.v#L1227

10:41 <pepijndevos> I guess in the sense that everything with curly braces and semicolons is derived from C?

10:41 <pepijndevos> https://en.wikipedia.org/wiki/Verilog influenced by: C, Fortran

10:42 <whitequark> pepijndevos: parameters aren't typed

10:42 <whitequark> so you can pass a 64'd0 as .LUT there

10:42 <pepijndevos> Ah, yea, that's what I was asking: the width of the parameter, so, undefined/any
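
A minimal Python sketch of the width question above; it is an illustration, not the simlib.v code. A LUT with N inputs needs 2**N truth-table bits, so a 32-bit value covers at most a LUT5 and a LUT6 needs a 64-bit parameter value. The helper names `lut_table_bits` and `lut_eval`, and the convention that the input vector indexes the table, are assumptions made for this example.

```python
def lut_table_bits(width):
    """Number of truth-table bits a LUT with `width` inputs needs."""
    return 1 << width


def lut_eval(table, width, inputs):
    """Return the LUT output bit: the input vector indexes into the table."""
    index = inputs & ((1 << width) - 1)  # keep only the low `width` input bits
    return (table >> index) & 1


# A 32-bit value covers a LUT5 exactly; a LUT6 needs 64 table bits.
print(lut_table_bits(5), lut_table_bits(6))

# 6-input AND: only the all-ones input (index 63) selects a 1,
# which already needs bit 63 of the table value.
and6 = 1 << 63
print(lut_eval(and6, 6, 0b111111), lut_eval(and6, 6, 0b011111))
```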

10:43 <OmniMancer> So I think I am missing some phantom bits

10:43 <whitequark> yeah

10:43 <whitequark> otherwise you couldn't have string parameters

10:43 <pepijndevos> Actually VHDL also has unconstrained types, which can map to this... just forgot about them hehe

10:43 <OmniMancer> my understanding of where tiles are works until you get to the first global clock spine

10:43 <whitequark> because strings are just null terminated sequences of octets
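
A rough Python sketch of why an untyped, arbitrarily wide parameter can carry a string. It assumes the usual Verilog packing of 8 bits per character, with the first character in the most significant byte and zero padding at the top. The function names are hypothetical.

```python
def pack_string(text):
    """Pack ASCII text into (value, width_in_bits), first character in the top byte."""
    data = text.encode("ascii")
    return int.from_bytes(data, "big"), 8 * len(data)


def unpack_string(value, width):
    """Recover the text; leading zero bytes are treated as padding, not characters."""
    data = value.to_bytes(width // 8, "big")
    return data.lstrip(b"\x00").decode("ascii")


value, width = pack_string("FAST")
print(hex(value), width)

# The same string stored in a wider (64-bit) vector still decodes correctly.
print(unpack_string(value, 64))
```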

10:44 <pepijndevos> ah, so they took *that* part of C too -.-

10:44 <OmniMancer> lets copy all of the footguns of C

10:45 <whitequark> no, verilog has significantly more footguns

10:45 <whitequark> language designed by idiots

10:46 <OmniMancer> whitequark: I am not saying they didn't add more, but they certainly didn't seem to try to exclude any they had within reach

10:46 <whitequark> systemverilog: let's remove some of the footguns. haha joking we won't actually specify how always_ff works or make emitting a hard error on it standards compliant

10:47 <pepijndevos> I wonder what the story is... like IIRC JS being developed on very short notice as an afterthought or something like that

10:48 <OmniMancer> behold, the problem: https://imgur.com/tjSPD9Q

10:48 <pepijndevos> https://thenewstack.io/brendan-eich-on-creating-javascript-in-10-days-and-what-hed-do-differently-today/

10:49 <sorear> verilog and vhdl were originally created to write sim models while the actual design files were made by a different team with vector graphics programs, synthesis was tacked on later. this doesn't explain why it's gotten *worse* though, synthesis has been an established use case for decades now

10:50 <whitequark> systemverilog was made by people who looked at c++ and thought, "what a honkin' good idea"

10:50 <sorear> might be a C++ thing "we'll add all the features, if you don't like them don't use them"

10:50 <whitequark> that's all the explanation i need

10:51 <pepijndevos> Yea, but, like... IIRC VHDL was designed by some committee, so just wondering if Verilog was just some rando at some company who made a thing they needed.

10:52 <OmniMancer> it seems the global clock spine is actually 8 bits high in the bitstream, while the numbers I was using as a proxy for position in the bitstream only jump by 2 at them

10:52 <whitequark> pepijndevos: sort of like that http://www.sutherland-hdl.com/papers/2002-HDLCon-paper_SystemVerilog.pdf

10:55 Jybz has quit [Quit: Konversation terminated!]

10:58 Asu has quit [Ping timeout: 240 seconds]

10:58 Asu has joined ##openfpga

10:58 Jybz has joined ##openfpga

11:00 Flea86 has joined ##openfpga

11:15 <OmniMancer> now to patch the tile grid

11:28 Jybz has quit [Quit: Konversation terminated!]

11:40 freemint has quit [Quit: Leaving]

11:43 Jybz has joined ##openfpga

12:00 <OmniMancer> now to gather routing info for the routing blocks

12:35 Bike has joined ##openfpga

12:52 Jybz has quit [Quit: Konversation terminated!]

12:56 Jybz has joined ##openfpga

13:21 Jybz has quit [Excess Flood]

13:21 Jybz has joined ##openfpga

13:26 <OmniMancer> what inputs can yosys accept for equivalence checking?

14:19 Bike has quit [Ping timeout: 240 seconds]

14:20 Bike has joined ##openfpga

14:39 Flea86 has left ##openfpga ["Leaving"]

14:47 sgstair_ has joined ##openfpga

14:50 sgstair has quit [Ping timeout: 268 seconds]

15:24 keesj has quit [Ping timeout: 240 seconds]

17:21 emeb has joined ##openfpga

17:54 sgstair_ is now known as sgstair

18:12 X-Scale` has joined ##openfpga

18:13 X-Scale has quit [Ping timeout: 240 seconds]

18:14 X-Scale` is now known as X-Scale

18:38 <GenTooMan> I wonder if anyone has gotten this error "Command '['sh', 'build_top.sh']' returned non-zero exit status 255." other than me from running nmigen.

18:46 <whitequark> needs more context

18:49 lopsided98 has quit [Ping timeout: 245 seconds]

18:49 lopsided98 has joined ##openfpga

18:53 <GenTooMan> That's what I thought I couldn't figure out what it meant.

18:56 <whitequark> well, it prints more than just that line, doesn't it?

18:59 <GenTooMan> Ok pastebin it is.

19:05 <GenTooMan> hmmm maybe I found the error by accident.

19:09 Jybz has quit [Excess Flood]

19:09 Jybz has joined ##openfpga

19:29 andre_ has joined ##openfpga

19:36 andre_ has quit [Quit: umount /dev/irc]

19:40 <GenTooMan> dug a bit through eclipse output(s) and I found the actual error which although just as weird at least I can figure out. I guess running under py dev is sometimes not so good.

19:40 <whitequark> so what's the actual error?

19:44 X-Scale` has joined ##openfpga

19:46 <azonenberg> pepijndevos: i would love to add more sanity checking/linting features to yosys's verilog/systemverilog front end

19:46 X-Scale has quit [Ping timeout: 246 seconds]

19:46 <azonenberg> in particular, a sizeable fraction of the footguns can be eliminated if you implement/restrict users to a well defined subset of the language's theoretical features

19:47 <azonenberg> So i want to add some optional arguments to read_verilog that enforce some additional rules

19:47 <azonenberg> For example, erroring out if you have latches in a combinatorial block or anything but a ff in an always_ff

19:47 <azonenberg> mandating default_nettype none

19:47 X-Scale` is now known as X-Scale

19:48 <whitequark> Clifford's answer for this before was "use an external linter"

19:48 <azonenberg> Yes, i know. But I want to use the actual yosys AST as input if at all possible

19:48 <azonenberg> i'd be OK-ish with exporting RTLIL and linting that

19:49 <azonenberg> as a pass that calls out to an external tool

19:49 <azonenberg> but i want it to be something i can integrate with a synthesis flow as-is

19:49 <GenTooMan> here is the listing https://pastebin.com/FwnBMWsu it's the generated pcf file. I'm not sure why it's setting frequency in the PCF ... let me check the version of nmigen

19:49 <azonenberg> also, some of the options are things that i really think need to be implemented early on in the parser

19:49 <azonenberg> For example, i want to be able to do read_verilog -sv -default_nettype none foo.sv

19:50 <azonenberg> and have it force default_nettype none, regardless of any declaration in the file or previous state set by earlier files

19:50 <azonenberg> Speaking of which, making default_nettype none default is IMO a bug in the spec

19:50 <azonenberg> and making preprocessor state persistent across files is another

19:51 <azonenberg> I would like to add nonstandard arguments to read_verilog that ensures source files are unaffected by compile order, i.e. no state persists across files other than the AST itself

19:51 <azonenberg> so that you dont have to worry about a third party HDL block added to your project defining macros that break your code, or vice versa

19:51 <azonenberg> yes, this has bit me before

19:51 <azonenberg> it's extra fun when the offending file is autogenerated by vivado and can't be patched because it'll just change again

19:52 <daveshah> GenTooMan: that was a new nextpnr addition

19:52 <azonenberg> sorry i meant to say above, making default_nettype wire default is the bug in the spec

19:52 <azonenberg> Use of an undeclared identifier should always be an error

19:53 <GenTooMan> daveshah oh ... I am using "nmigen-0.2.dev4+gf207f3f" of nmigen so I have to update nextpnr?

19:53 <daveshah> I guess so

19:53 <GenTooMan> :D

19:54 <daveshah> azonenberg: BTW, I added always_comb, ff, etc checking recently

19:54 <azonenberg> daveshah: oh awesome

19:54 <azonenberg> too bad i cant use it because all of my sv code uses structs and enums :p

19:54 <daveshah> And it is an error not a warning

19:55 <azonenberg> Also, what's the status of 7 series support for nextpnr + whatever tool from prjxray spits out a bitstream?

19:55 <daveshah> Yeah, structs and enums need doing soon

19:55 <azonenberg> Can you actually do at least basic stuff with mainline yosys+nextpnr yet?

19:55 <whitequark> azonenberg: can't you use read_verilog twice for that?

19:55 <daveshah> I have a proof of concept for nextpnr on Artix 7

19:55 <whitequark> AFAIK no state persists after read_verilog finishes

19:55 <azonenberg> whitequark: can you be more specific? use twice for what

19:55 <daveshah> It is not mainline and may never be

19:55 <whitequark> so that third party HDL does not affect your code

19:55 <daveshah> Routing is too slow to be useful right now but I'm working on it

19:55 <azonenberg> whitequark: In that case, that isn't SV standards compliant

19:55 <whitequark> hm

19:56 <azonenberg> whitequark: actually, its even better

19:56 <azonenberg> if memory serves me right, the standard says that within a group of files being compiled state persists

19:56 <azonenberg> but it seems to allow for you to have a group be arbitrarily few/many files

19:56 <whitequark> hmm

19:56 <azonenberg> This led to a fun bit of code on $sidegig a while ago that would compile fine in vivado and break in vcs/synopsys design compiler, and vice versa

19:57 <azonenberg> the workaround was to add c-style `ifndef `define include guards around a bunch of definitions

19:57 <whitequark> azonenberg: oh I see

19:57 <azonenberg> so that no matter whether things persisted or not, it would work

19:57 <whitequark> read_verilog *does* persist state

19:57 <whitequark> but yosys has the verilog_defines command

19:57 <whitequark> so you can do verilog_defines -reset

19:57 <azonenberg> does that wipe default_nettype, timescale (i guess that doesnt matter for synthesis) etc?

19:57 <azonenberg> or only actual `define

19:58 <azonenberg> basically, IMO we need to have both a standards compliance mode and a clean-slate mode

19:58 <whitequark> looking

19:59 <whitequark> azonenberg: `timescale is ignored

19:59 <whitequark> default_nettype is reset each time read_verilog is called

19:59 <whitequark> yosys also supports `resetall, not sure if it's in the standard

19:59 <whitequark> which you can presumably use in your HDL rather than in a script
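
A small Python sketch of the scripted approach described above: it only builds the text of a Yosys script that issues `verilog_defines -reset` before each `read_verilog -sv` call, so macros from one file do not leak into the next. The function name and the file names are hypothetical, and the script is not executed here.

```python
def isolated_read_script(files):
    """Build Yosys script lines that clear `define state before each file."""
    lines = []
    for path in files:
        lines.append("verilog_defines -reset")    # drop macros left by earlier files
        lines.append(f"read_verilog -sv {path}")  # read each file in its own call
    return "\n".join(lines)


# Hypothetical file names: third-party code first, then our own design.
print(isolated_read_script(["vendor_ip.sv", "my_design.sv"]))
```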

20:00 <azonenberg> Hmmm, as much as i like resetting each time i think for compatibility with other stuff we do need to allow it to persist

20:00 <azonenberg> But that definitely needs to be an option

20:01 <daveshah> azonenberg: if you want to peek at it this is the nextpnr repo for artix7/ultrascale experiments

20:01 <daveshah> But expect issues https://github.com/daveshah1/nextpnr-xilinx

20:02 <azonenberg> daveshah: And what densities/packages do you support?

20:02 <azonenberg> right now i have 50t and 100t artix in ftg256 handy

20:02 <azonenberg> a 70t kintex in fbg484 on a board with limited gpio

20:02 <daveshah> I've only tried it with the "35t" that the Arty has

20:02 <azonenberg> what package is that?

20:02 <azonenberg> the 35 is a fused 50 so it should work on that

20:02 <daveshah> Some 324 one I think

20:03 <azonenberg> ah ok, so we'd have to find the mapping between csg324 and ftg256 pads

20:03 <daveshah> I don't think Xray has the package data for the 256 50t

20:03 <daveshah> It has the 484 50t iirc

20:03 <daveshah> It would be easy to add

20:03 <daveshah> Just some Vivado tcl scripts

20:03 <daveshah> But I don't know Xray well enough

20:03 <daveshah> Probably similar for the 100t tilegrid

20:03 <daveshah> It has slow routing and no timing data (so ignore any Fmax it gives)

20:03 <daveshah> ie don't actually use it

20:05 <azonenberg> well if somebody wanted to add package data for 256 50t i can start alpha testing with a few blinkies etc

20:05 <azonenberg> just to make sure i have things compiling etc

20:06 <azonenberg> what's the next steps? Is timing driven packing/placement in progress, or figuring out more bitstream parts, or what?

20:06 <daveshah> Yeah I've no idea what's actually involved

20:06 <daveshah> I don't have any involvement or insight into the bitstream side tbh

20:06 <daveshah> They have timing data inside VPR already, iirc

20:06 <daveshah> I just haven't the time or energy to do the same on the nextpnr side yet

20:07 <daveshah> VPR is the primary flow for xc7 and always will be

20:07 <azonenberg> oh?

20:07 Laksen has joined ##openfpga

20:07 <whitequark> i thought vpr was unsuitable for real-world architectures

20:07 <daveshah> Well, it seems it can be fudged

20:07 <azonenberg> I thought nextpnr was supposed to be a better, more scalable tool that was to entirely replace vpr moving forward for actual-hardware projects?

20:08 <whitequark> why should it be primary flow if it "can be fudged"?

20:08 <daveshah> I personally don't think I'll be able to maintain nextpnr for xc7 to a standard I would be happy about as well as the existing iCE40 and ECP5 flows

20:08 <whitequark> oh, I see

20:08 <azonenberg> Re VPR, how usable is that now for xc7?

20:08 <whitequark> does iCE40 flow need any ongoing maintenance?

20:08 <daveshah> I think basic stuff is working but I think routing is also quite slow, like nextpnr

20:09 <daveshah> To some extent iCE40 still does, eg not supporting HeAP yet

20:09 <whitequark> um, I've been using HeAP on iCE40 for like a year

20:10 <daveshah> Oh I meant defaulting not supporting

20:10 <whitequark> right

20:10 <daveshah> Because of the odd little edge cases that each arch has that each new cad algorithm needs to cope with

20:10 <daveshah> Same with things like the new router, any API changes, IO timing analysis, etc

20:11 <whitequark> makes sense

20:12 <daveshah> On a personal level, I'd rather work on supporting Lattice's new parts than Xilinx particularly as that will share a lot with ECP5

20:12 <daveshah> But while keeping some kind of experimental xc7 thing as a way of making sure the CAD algorithms can scale to that size of device

20:12 <whitequark> I'd be much happier personally with better Lattice support, but that's predicated on the kind of work I do

20:13 <azonenberg> Meanwhile, while i dont have the time to commit to much on the tooling side, 7 series/ultrascale is top priority for me because most of my projects lately wont fit in a lattice

20:13 <whitequark> azonenberg is in the exact opposite situation

20:13 <daveshah> Realistically nextpnr has a way to go before being able to route such big designs anyway

20:13 <azonenberg> i need 100k+ cells of capacity, 10G serdes, etc

20:13 <daveshah> That's the other issue with putting time into xc7, that Ultrascale is coming round the corner too

20:13 <azonenberg> one of the things holding me back from playing with the ecp5/ice40 tools more is that i simply dont have needs for anything that small

20:14 <azonenberg> daveshah: kintex7 is not going away any time soon

20:14 <azonenberg> ultrascale has no low end parts

20:14 <azonenberg> and artix/kintex is far more cost effective for most applications

20:14 <daveshah> I don't think Xilinx really care about low ends

20:14 <azonenberg> my understanding is that xilinx reached a point a few years ago where it no longer made sense to make low/mid/high end parts on the same node

20:14 <daveshah> They are investing in Efinix instead

20:15 <azonenberg> So now they have active families still getting development and even new devices (like spartan7) across 28, 20, and newer

20:15 <azonenberg> i don't see the 28nm stuff going away any time soon... heck, you can still get coolrunners

20:16 <daveshah> No, but xc7 will become less interesting over time

20:16 <daveshah> I guess check the open source tooling again this time next year and maybe things will be better

20:16 <azonenberg> what i mean is, i see xc7 being xilinx's key product in the sub-$500 price range for the next few years at least

20:17 <daveshah> Well they have some Zynq UltraScales and Versals at around that level

20:17 <daveshah> I know you don't like SoCs

20:17 <azonenberg> Pure FPGAs will always have a place

20:17 <daveshah> but iirc the smallest Zynq UltraScale is around 50k

20:17 <daveshah> LUTs

20:17 <azonenberg> i don't see xilinx discontinuing fpgas and only making socs

20:17 <azonenberg> having socs be the main marketing push sure

20:18 <daveshah> Well afaik Versal, their next generation, is only SoCs

20:18 <azonenberg> versal i dont think is a new generation per se

20:18 <azonenberg> i think it's a new *family*, like zynq is

20:18 <azonenberg> zynq is not intended to replace artix

20:18 <daveshah> It's a new node

20:19 <azonenberg> ok that i did not know

20:19 <azonenberg> but i expect we'll see virtexes on that node shortly

20:20 <daveshah> Given they have top end Versals with HBM intended for the virtex market I'm not so sure

20:20 <azonenberg> Hmmm

20:20 <azonenberg> well, if they drop pure-fpga support that may be the kicker that gets me to start moving to another fpga vendor :p

20:21 <azonenberg> i find it hard to believe they'd kill off the asic prototyping etc space

20:21 <azonenberg> that seems hugely profitable

20:21 <azonenberg> given how high the markup on the massive virtexes is

20:21 <daveshah> Well they'd just tell those people to use the ARM core for startup and then ignore it

20:25 mumptai has joined ##openfpga

20:26 <whitequark> are FPGAs ever power limited?

20:27 <azonenberg> Yes. A lot of the old spartan6 btc miner boards were thermally limited AIUI

20:27 <azonenberg> you couldn't run them at the fmax identified by timing analysis or they'd melt

20:28 <whitequark> hm

20:29 <whitequark> right, but I mean more, do FPGAs ever have "dark silicon"?

20:29 <whitequark> I'm guessing not

20:29 <azonenberg> i think some of them actually did dynamic frequency scaling with thermal sensor input closing the loop

20:29 <azonenberg> re dark silicon, not that i know of. Those things were at near 100% lut load all the time and just varied frequency

20:29 <whitequark> as in, is there ever a point where "adding more fabric" becomes an unviable strategy

20:29 <whitequark> so you add an ARM core

20:29 <azonenberg> Only if your problem can't parallelize more

20:29 <whitequark> as an FPGA vendor that is

20:29 <azonenberg> the reason they add CPUs is because software devs cost less than RTL devs

20:30 <whitequark> not as a user

20:30 <whitequark> no, I know

20:30 <whitequark> it's a tangent

20:30 <azonenberg> And because for low speed state machine stuff, you dont need everything unrolled in rtl all the time

20:30 <azonenberg> i dont think power is ever a reason to not add more fabric

20:30 <azonenberg> usually the cap on fpga size is yield

20:31 <whitequark> oh, is that why they go for multi-chip modules?

20:32 <azonenberg> They're not MCMs per se, 2.5D is kind of a special case in that it's still all silicon

20:33 <azonenberg> you basically just have the top few metal layers on a separate substrate

20:33 <whitequark> wait, what

20:34 <azonenberg> the big virtexes have a passive interposer made on silicon, i think it's a 65nm process for the 28nm virtexes that i heard somewhere (no source for that number handy)

20:34 <rombik_su> whitequark: https://www.xilinx.com/products/silicon-devices/3dic.html

20:35 <azonenberg> they have TSVs in the interposer connecting to flip chip bumps on the fpga dies

20:35 <azonenberg> then more flip chip bumps on the interposer to the package

20:35 <azonenberg> so there's actually two layers of silicon with nanoscale interconnects, not pcb traces, connecting the adjacent dies

20:35 <rombik_su> Yup, SSI (the substrate) is 65 nm

20:35 <azonenberg> On virtex7. Not sure if smaller for ultrascale

20:35 <azonenberg> there's something like 10k signal lines, plus clock/config trees, between each of the SLRs (FPGA dies)

20:36 <rombik_su> *(the silicon substrate)

20:36 <whitequark> right, that's what I was thinking

20:36 <whitequark> instead of one huge die you have a few smaller ones plus interconnect

20:36 <whitequark> so yield is higher

20:36 <azonenberg> Yes

20:36 <azonenberg> This also lets them have fewer mask sets

20:37 <azonenberg> So for example, an xc7vh580t is 2 SLRs, 90700 slices, 1680 DSPs

20:37 <azonenberg> so 45350 slices and 840 dsps each

20:38 <azonenberg> an xc7vh870t, as far as i can tell, is three *of the same fpga die*

20:38 <azonenberg> just a new interposer

20:38 <azonenberg> so only one 28nm mask set for two SKUs

20:38 <azonenberg> in fact, it might even be the same interposer with one of the three footprints unpopulated/capped off with a dummy die for thermal reasons
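
The per-die arithmetic above can be checked with a short Python sketch. The totals are the ones quoted in the chat for the two-SLR part. The three-die line is only a naive projection under the "same die, new interposer" guess, not a datasheet figure. The helper name `per_die` is made up for this example.

```python
def per_die(total_slices, total_dsps, die_count):
    """Split device totals evenly across identical SLRs."""
    if total_slices % die_count or total_dsps % die_count:
        raise ValueError("totals do not divide evenly across the dies")
    return total_slices // die_count, total_dsps // die_count


slices, dsps = per_die(90700, 1680, 2)
print(slices, dsps)          # resources per SLR
print(3 * slices, 3 * dsps)  # naive projection for three of the same die
```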

20:43 <azonenberg> They do have more than one fpga die on each process because the lower end SKUs are monolithic, and they also have a few variants of high end optimized for more DSP vs more SERDES etc

20:43 <azonenberg> but there's multiple obvious cases of reuse

20:44 <daveshah> Incidentally, I'm sure that I heard some of the Kintex US+ were using some Zynq US+ dies with the ARM cores locked out

20:44 <daveshah> Didn't look at that myself though

20:44 <daveshah> That would just be for mask commonality though

20:45 <azonenberg> that sounds less likely because the zynqs are a totally different boot process

20:45 nrossi has quit [Quit: Connection closed for inactivity]

20:45 <azonenberg> i dont think zynqs are capable of booting the fpga only in e.g. master spi mode

20:45 <nats`> <daveshah> Incidentally, I'm sure that I heard some of the Kintex US+ were using some Zynq US+ dies with the ARM cores locked out <= that's true

20:45 <azonenberg> oh?

20:46 <azonenberg> at least in 7 series, which i'm more familiar with, the arm comes up first and then loads the fpga

20:46 <azonenberg> so unless they have bootrom code to check an efuse bit then master-spi into the fpga

20:46 <azonenberg> which i guess isnt beyond the realm of possibility

20:46 <azonenberg> also it would be a totally different package but i guess that can be worked around

20:49 <rombik_su> Flip-chips can accommodate it with a different substrate PCB design, I guess

20:50 <azonenberg> yeah thats what i'm thinking, and then just have some extra fpga gpios to use the balls that you can't use with MIO

21:04 rombik_su has quit [Quit: Leaving]

21:06 rombik_su has joined ##openfpga

21:13 Laksen has quit [Quit: Leaving]

21:59 rohitksingh has joined ##openfpga

22:08 zino has joined ##openfpga

22:13 Jybz has quit [Quit: Konversation terminated!]

22:32 <mwk> azonenberg: that's not completely true

22:33 <mwk> the fpga part still has the usual config port bonded out

22:33 <mwk> it's just that the pins are labelled as NC

22:33 ZombieChicken has joined ##openfpga

22:33 <mwk> in theory, it could be possible (unless they disabled it via some mechanism unknown to me) to ignore the ARM core and boot it normally instead

22:34 <mwk> via the usual config modes

22:34 <mwk> perhaps even hold the ARM in reset

22:34 <rombik_su> They can disable ARM via efuse or with strap pins hardwired in substrate for example

22:34 <mwk> that's for xc7

22:35 <mwk> for ultrascale, they even use the exact same hw as kintex and zynq

22:35 <mwk> as in, only the smallest kintex u+ part is not actually the zynq, all the other ones have a disabled ARM core

22:36 <mwk> the obvious difference is packaging — the kintex parts don't have the PSS banks bonded out

22:36 <mwk> there may also be some fuses involved

22:38 <rombik_su> IIRC, in Zynq MPSoC's PL side can be put in user mode without bothering with PS side startup, when in 7s you'll need to bring PSU up in order to use PL side

22:40 <azonenberg> oh so they realized how dumb it was to make PS be in charge of everything and backed off on that?

22:42 <daveshah> I think they've effectively put a MicroBlaze or two in charge of everything instead

22:42 <mwk> microblaze? why?

22:43 <daveshah> idk

22:43 Asu has quit [Remote host closed the connection]

22:43 <daveshah> Their core of choice

22:43 <daveshah> https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18841724/PMU+Firmware

22:45 <rombik_su> Well, they had it lying around, it's mature and well-known among users so I think they just decided to harden it

22:47 <daveshah> Yeah, looks like they did some triple redundancy too

22:47 <daveshah> https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18841708/Zynq+Ultrascale+plus+Security+Features

22:47 <daveshah> For thrice the fun

22:47 <rombik_su> :D

22:52 <rombik_su> Xilinx is very good at creating jobs - it's like a full-time job for a SWE to be able to jiggle all the bells and whistles on Zynq US+

22:52 <whitequark> lol

23:05 feuerrot has quit [Ping timeout: 240 seconds]

23:07 AndrevS has quit [Quit: umount /dev/irc]

23:10 feuerrot has joined ##openfpga

23:16 <azonenberg> meanwhile i'm still waiting for a big-fpga little-cpu chip

23:16 <azonenberg> like, a stm32 class processing engine bolted onto a big kintex7 sized fpga

23:16 <azonenberg> independent, and each capable of operating without the other

23:16 <rombik_su> Like, for housekeeping stuff?

23:17 <azonenberg> i'm thinking more like a few hundred MHz M7

23:17 <azonenberg> with a fair bit of on die sram plus an axi bus connected to the fpga for mapping external dram etc via the fpga if desired

23:17 <azonenberg> the other thing i'd like is a large fpga with a bunch of tiny cortex-m0 sized cpus scattered around the fabric

23:17 <mwk> tbh what's wrong with, say, 7z100 for that use case?

23:18 <azonenberg> mwk: couple of things

23:18 <azonenberg> first is, cortex-a is much harder to bring up bare metal than M

23:18 <azonenberg> if you dont want or need linux, zynq is difficult to use

23:18 <mwk> that sounds like sw tooling problem though

23:19 <azonenberg> the other is, the zynq 7 arch (less familiar with u/u+) is designed to be cpu-first

23:19 <azonenberg> you cannot, for example, boot the fpga in master mode and leave the cpu in power-down mode until a gpio is toggled and you boot it

23:19 <azonenberg> the CPU can reconfigure the FPGA at any time without the FPGA's consent

23:19 <azonenberg> so you can't use the FPGA for an isolated security domain

23:20 <azonenberg> i want them to be basically two separate chips on the same die, just with a fast bus between them

23:20 <rombik_su> azonenberg: Do you use XSDK with Zynq?

23:20 <mwk> well that's kind of solvable with trustzone

23:20 <azonenberg> rombik_su: my experience with zynq to date has all been PL side while someone else did the software

23:21 <azonenberg> i attempted to get it to boot up bare metal and print hello world to the console and got nowhere

23:21 <rombik_su> That's strange, hello world in XSDK is just couple of clicks

23:21 <azonenberg> so i gave up and integralstick used a stm32f7 + artix7

23:21 <azonenberg> well thats the other problem, i didnt want to use huge amounts of generated code

23:22 <rombik_su> Oh, I see

23:22 <azonenberg> trying to even use the zynq PS in a pure systemverilog design without using the IP integrator is essentially impossible

23:22 <mwk> true

23:22 <azonenberg> you have to make a block design because that's the only way to get all of the myriad SFRs configured right at boot

23:22 <azonenberg> What i want is a pure FPGA that has a bunch of hidden GPIOs that connect to a MCU

23:22 * mwk would like to make some proper free toolchain for the PS side of zynq some day

23:23 <azonenberg> and brings out a memory bus plus a few other things

23:23 <rombik_su> mwk: please don't forget proper OpenOCD support :D

23:23 <azonenberg> With support for either fpga-first, mcu-first, or isolated architectures

23:24 <azonenberg> with a stm32, a few dozen lines of linker script and i can compile a C program with gcc, no assembly startup routines or anything, and be running C right out of reset

23:24 <azonenberg> the only thing you need asm for is disabling/enabling interrupts, so two functions with one line of inline asm each

23:24 <azonenberg> memory is working immediately

23:24 <azonenberg> good luck doing that with a cortex-A based platform

23:26 <azonenberg> i also have not yet managed to get debug working on a cortex-A using jtaghal

23:26 <azonenberg> meanwhile i have full in-circuit program support for stm32, as well as partial debug

23:26 <rombik_su> 7000s is very PS-centric by design

23:26 <azonenberg> rombik_su: yes i know, and i strongly disagree with that architectural decision

23:26 <azonenberg> i want either PL-centric or peer-to-peer

23:27 <azonenberg> Xilinx in general seems to be trying to push their products on software devs with things like HLS, the ACAP, etc

23:27 <azonenberg> i want a soc for rtl engineers who know software

23:27 <azonenberg> not for software folks who don't grok digital logic :p

23:28 <pie__> whats a PS

23:29 <mwk> the ARM core on zynq

23:29 <pie__> oh ok

23:29 <mwk> "processor system", I think

23:29 <azonenberg> To be precise, the PS is the arm core plus peripherals, bus, caches, dram controller, etc

23:29 <rombik_su> The acceleration market seems to be growing fast, so they probably want to cater to those guys (mostly software/web folks).

23:29 <azonenberg> rombik_su: yes, i know

23:29 <azonenberg> i get why they're doing it

23:29 <azonenberg> doesn't mean i have to like it :)

23:29 <mwk> as opposed to PL, which is the FPGA part (programmable logic)

23:29 <mwk> also PS is called PSS in some places, but I have no idea what the extra S stands for

23:30 <zino> As a Software person I appreciate this trend to give me lots of IO fairly easily accessible from a semi-sensible OS. :)

23:30 <azonenberg> mwk: PSS = PS7 = Processing System 7

23:30 <mwk> what, the S stands for Seven?

23:31 <azonenberg> That is my understanding

23:31 <mwk> ugh.

23:31 <azonenberg> Soooo to give you an idea of how much i hate dealing with axi and zynq in general

23:31 <mwk> probably true though.

23:32 <azonenberg> for $sidegigclient's project, i had to move circa 1 Gbps of data between my logic in PL, and a C program in the PS

23:32 <azonenberg> you know what i did?

23:32 <mwk> hmm, or is it

23:32 * mwk looks through her notes

23:33 <azonenberg> i brought up the second ethernet MAC on the PS, connected the EMIO GMII bus to my PL, and wrote my logic to use Ethernet framing instead of AXI around PS-PL communications

23:33 <rombik_su> azonenberg: I'm sorry, my connection is lagging, it looks like I'm repeating your points, but in fact I sent those long before. )

23:33 <azonenberg> then on the PS side, you can just use SOCK_RAW/AF_PACKET and sendto() the PL :D
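
The PS-side half of the approach azonenberg describes (raw Ethernet frames over AF_PACKET instead of AXI) might look roughly like the sketch below. This is not his code. The interface name, the MAC addresses, and the choice of EtherType 0x88B5 (one of the IEEE "local experimental" values) are assumptions made for illustration, and the send path needs Linux plus CAP_NET_RAW.

```python
import socket
import struct

# Assumption: an IEEE "local experimental" EtherType, so the PL logic can
# tell these frames apart from ordinary IP traffic.
ETH_P_EXPERIMENTAL = 0x88B5
MIN_PAYLOAD = 46  # minimum Ethernet payload length, excluding the FCS


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_frame(dst_mac: str, src_mac: str, payload: bytes,
                ethertype: int = ETH_P_EXPERIMENTAL) -> bytes:
    """Build an Ethernet II frame (no FCS; the MAC appends it)."""
    header = struct.pack("!6s6sH", mac_to_bytes(dst_mac),
                         mac_to_bytes(src_mac), ethertype)
    # Pad short payloads with zeros up to the Ethernet minimum.
    return header + payload.ljust(MIN_PAYLOAD, b"\x00")


def send_to_pl(ifname: str, frame: bytes) -> int:
    """Send one raw frame out of `ifname` (Linux only, needs CAP_NET_RAW)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        return sock.sendto(frame, (ifname, 0))
```

A caller would do something like `send_to_pl("eth1", build_frame("02:00:00:00:00:01", "02:00:00:00:00:02", b"data"))`, where `eth1` stands in for whichever interface is backed by the second PS MAC.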

23:33 <mwk> azonenberg: the PSS name is also used on ultrascale

23:33 <mwk> which have PS8

23:34 <daveshah> Yeah, PSS_ALTO is familiar

23:35 <daveshah> It's also odd because PS7 isn't the 7th PS but refers to 7 series

23:35 <daveshah> Whereas PS8 is in 9 series effectively

23:36 <mwk> eh, ultrascale and ultrascale+ are not all that different

23:36 <mwk> could count as the same series

23:37 <mwk> or maybe it was intended to be used in first ultrascale, but hit some delays

23:37 <daveshah> That seems quite possible

23:37 <daveshah> Versal is PS9 btw

23:37 <mwk> or maybe it's xilinx just fucking up a numbering scheme again

23:37 <rombik_su> A lot of the docs (UltraScale Architecture Libraries Guide UG974, for example) cover both US and US+

23:38 <azonenberg> daveshah: lol PS9 is versal?

23:38 <daveshah> Yep

23:39 <daveshah> All the Versal primitives are in latest Vivado

23:39 <azonenberg> but versal is 10 series

23:39 <azonenberg> u = 8, u+ = 9

23:39 <daveshah> That was the problem I just mentioned with PS8

23:39 <mwk> azonenberg: nobody ever said that

23:39 <azonenberg> yeah there is no zynq 8

23:39 <mwk> the only series there is is series 7

23:39 <azonenberg> mwk: i'm talking about internally, not in marketing names

23:39 <daveshah> Versal seems to be RAMB18E5 and RAMB36E5 too

23:40 <mwk> internally series 7 is called Fuji

23:40 <mwk> no number

23:40 <daveshah> I'm not sure what happened to E3 and E4

23:40 <mwk> daveshah: seems they cannot make up their mind whether Ex correspond to a given hw generation, or whether they're just a number to be incremented whenever they change a primitive

23:41 <mwk> there's lots of precedent for that, unfortunately

23:41 <whitequark> lol

23:41 <daveshah> I still like that Lattice use increasing letters instead (appending A, B, C, D, etc)

23:41 <rombik_su> mwk: yup, OSERDESE2 is 7-series, OSERDESE3 is US/US+

23:41 <daveshah> This led to the super cute ECP5 JTAGG primitive

23:41 <azonenberg> mwk: the last details i had was Fuji is 7 series, Olympus is 8/ultrascale, Diablo is 9/ultrascale+, Everest/Everestea is Versal

23:41 <mwk> rombik_su: no

23:41 <daveshah> JTAG egg

23:42 <mwk> I mean yes

23:42 <mwk> but not just that

23:42 <mwk> when you actually look at internal primitives, *both* OSERDESE2 and E3 are series 7

23:42 <azonenberg> Someone also claimed to have found some internal device ID constants for virtex-9 and kintex-9 which were separate from vu+/ku+

23:42 <mwk> E2 are the 3.3V banks, E3 are 1.8V banks

23:42 <azonenberg> but still considered diablo family devices

23:42 <azonenberg> so who knows

23:43 <mwk> azonenberg: oh there are craploads of unused ID codes

23:43 <azonenberg> Yeah

23:43 <mwk> like there are twice as many unreleased 7 series devices as released ones

23:43 <azonenberg> who knows if any are future devices vs dead

23:43 <azonenberg> Oh?

23:43 <azonenberg> All i know about is the xc7a350t

23:43 <azonenberg> then going way back the xc2c1024

23:43 <azonenberg> those are the only thoroughly confirmed but never released xilinx parts i know of

23:43 <azonenberg> well, that and the whole Starfighter series

23:44 <mwk> well, at least there are IDCODEs for them

23:44 <mwk> I have no idea how far in production they actually were

23:44 <azonenberg> the xc7a350t had a whole resource definition entry in the overview table with exact lut/bram etc counts

23:44 <azonenberg> and a package/pinout defined

23:44 <azonenberg> i guess they figured it was competing with kintex and killed it

23:45 <azonenberg> Coolrunner-II was named BladeRunner originally afaik, hence the ISE data directory being called XBR (Xilinx BladeRunner)

23:45 <mwk> oh that explains that weird name

23:45 <azonenberg> StarFighter was the planned 90/130nm successor to launch alongside Spartan-3

23:45 <mwk> so the strangest part of series 7 is actually the unreleased thing with big DACs/ADCs

23:45 <azonenberg> based on patents, i strongly conjecture that it was to be a hybrid architecture with CPLD-style function blocks connected in a 2D routing tile array like an FPGA

23:45 <mwk> sorta like RFSoC, I think

23:46 <azonenberg> mwk: my guess is it took too long to finish and they rebuilt it around ultrascale fpga fabric

23:46 <azonenberg> That's the first i've heard of it though

23:46 <mwk> and it was supposed to be a heterogenous SSI thing, like the GTZ transceivers

23:46 emeb has quit [Quit: Leaving.]

23:46 <mwk> ie. the ADC/DAC were actually separate die in the same package

23:47 <mwk> well, it's amazing how many things you can find if you look at the device database closely in xdl

23:47 <azonenberg> Lol

23:48 <mwk> the other interesting unreleased thing is Spartan 3A variant with embedded MCU and USB 2.0 phy

23:48 <daveshah> Haha

23:48 <azonenberg> i also think spartan6 was originally going to have an in-package wirebonded flash variant like spartan3an

23:48 <azonenberg> cant recall where i got that info

23:48 <mwk> azonenberg: correct

23:48 <azonenberg> not sure why they killed that, it would have cost them very little to do

23:48 <mwk> it's even present in the final silicon

23:48 <azonenberg> no mask respin

23:48 <azonenberg> maybe buggy and they didnt want to do a new stepping?

23:48 <mwk> you can instantiate the primitive to talk to the nonexistent flash

23:49 <azonenberg> Lol

23:49 <azonenberg> has anyone tried bonding a flash to a live decapped spartan6?

23:49 <mwk> doubt so

23:49 <azonenberg> i kinda wanna do that now lol

23:49 <mwk> they would've fixed it in some device if they cared

23:49 <azonenberg> if you can give me some guesses of the bond pad locations

23:49 <mwk> I mean, they don't make all device sizes at the same time

23:49 <mwk> lower right corner of the thing IIRC

23:49 * mwk checks

23:50 <azonenberg> now that i have a decap setup working i might actually be willing to try this, would be an excuse to get the wirebonder at work a workout (havent used it in years)

23:50 <whitequark> let's say i use BUFGCE on Xilinx

23:50 <whitequark> do the input clock and output clock have a defined phase relationship?

23:51 <mwk> whitequark: yes

23:51 <whitequark> further, can I gate a chunk of a single clock domain without introducing a race condition?

23:51 <mwk> I mean, sort of

23:51 <mwk> the guarantee is that if you have several BUFG*s with the same input clock, they have same phase when they reach the flops

23:51 <mwk> so BUFGCE is in-phase with a BUFG with the same input

23:52 <whitequark> e.g. clka -> BUFGCE -> clkb; always @(posedge clka) x <= y; always @(posedge clkb) z <= x;

23:52 <azonenberg> whitequark: that's not how you'd do it, i think

23:52 <mwk> whitequark: this will work assuming that clka is distributed through a BUFG (which it should be)

23:52 <kc8apf> azonenberg: you'll probably be my closest decapping setup after my move. Do you let others use it?

23:52 <azonenberg> kc8apf: where are you moving?

23:52 <whitequark> context: I am thinking about nondeterminism in simulators

23:52 <kc8apf> North of Monroe

23:53 <azonenberg> kc8apf: what state is that? :p

23:53 <kc8apf> WA

23:53 <mwk> azonenberg: okay, so I have two hypotheses

23:53 <azonenberg> oh, cool - in that case then probably yes i'll be close

23:53 <whitequark> in Verilog for example if you write a clock divider it'll be racy

23:53 <mwk> one is that everything is in lower-right

23:53 <azonenberg> I'm willing to do lab work for guests, although at least for the moment i'm not going to be fully hands off

23:53 <mwk> because that's where SPI_ACCESS is

23:53 <azonenberg> it would probably be me doing the work while you watch

23:53 <azonenberg> mwk: do you actually have pads IDed or guessed? even if you don't know which is which signal?

23:54 mumptai has quit [Quit: Verlassend]

23:54 <whitequark> whereas a clock divider in VHDL isn't

23:54 <kc8apf> Fair enough. McMaster has let me do more hands on with oversight

23:54 <azonenberg> kc8apf: we'll work out details if you pay me a visit

23:54 <kc8apf> Fairly rare that I need decapping though

23:54 <azonenberg> Depending on your level of experience etc

23:54 <mwk> the other is... MISO in lower left, CLK and MOSI in lower-right, CS in upper-right

23:54 <whitequark> conversely, a clock *gating* circuit in Verilog is fine, but is racy in VHDL because it introduces "propagation delay" and now you have to manually balance the clock tree

23:54 <azonenberg> kc8apf: i'm also working on building out the facilities for visiting researchers

23:54 <kc8apf> Nice

23:55 <mwk> azonenberg: no, I haven't even seen a die shot

23:55 <mwk> and S6 is honestly too much of a mess to identify shit

23:55 <azonenberg> mwk: check siliconpr0n if you want to look at some top metal shots

23:55 <azonenberg> i dont think we have any delayered ones yet

23:55 <whitequark> so it looks like the behavior of the FPGA actually sorta matches what VHDL does

23:55 <mwk> I just know where roughly the things are in the tile structure

23:55 <azonenberg> kc8apf: as of now the conference room/presentation area is painted and set up but lacks a conference table, chairs, projector, or screen

23:55 <whitequark> but you have to manually insert BUFGs without relying on the toolchain for the sim to be correct
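
The VHDL-side effect whitequark describes can be reproduced with a toy delta-cycle model. The sketch below is not a real simulator. It hard-codes the two flops from the `clka`/`clkb` snippet above, and it assumes the gate is written as a plain signal assignment (`clkb <= clka and en`) with a hypothetical enable signal `en`. Every assignment takes effect one delta cycle after the event that triggered it, so the gated clock edge arrives one delta after `x` has already been updated.

```python
def run_rising_edge(state, gated):
    """Settle all delta cycles caused by one rising edge of clka.

    state: dict with keys clka, clkb, en, y, x, z (values 0 or 1).
    gated: if True, the z flop is clocked by clkb = clka & en (an
           assignment that costs one delta); if False, by clka directly.
    Returns the settled signal values as a new dict.
    """
    sig = dict(state, clka=1)      # the testbench drives clka high
    changed = {"clka"}             # signals with an event in this delta
    z_clock = "clkb" if gated else "clka"
    while changed:
        pending = {}               # assignments visible in the next delta
        # Gating process: clkb <= clka and en
        if gated and changed & {"clka", "en"}:
            pending["clkb"] = sig["clka"] & sig["en"]
        # Flop A: always @(posedge clka) x <= y
        if "clka" in changed and sig["clka"] == 1:
            pending["x"] = sig["y"]
        # Flop B: always @(posedge <z_clock>) z <= x
        if z_clock in changed and sig[z_clock] == 1:
            pending["z"] = sig["x"]
        changed = {name for name, value in pending.items()
                   if sig[name] != value}
        sig.update(pending)
    return sig


start = dict(clka=0, clkb=0, en=1, y=1, x=0, z=0)
balanced = run_rising_edge(start, gated=False)
skewed = run_rising_edge(start, gated=True)
print("shared clock:", balanced["x"], balanced["z"])
print("gated clock: ", skewed["x"], skewed["z"])
```

With a shared clock, `z` captures the old `x`, as a two-stage pipeline should. With the one-delta gate in the clock path, `z` captures the value `x` took on the same edge, which is the kind of mismatch that in a VHDL-style model has to be fixed by balancing the delta delays on both clock paths by hand.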

23:56 rombik_su has quit [Read error: Connection reset by peer]

23:56 <kc8apf> I'll be building out some electronics but no chem. Most of the current space will be for automotive, woodworking, and metalwork

23:56 <azonenberg> and only one of the three walls planned to have a whiteboard actually has one

23:56 <mwk> I suppose if you have a decapped FPGA, you could look for unused pads

23:56 <whitequark> actually, here's something i'm curious about

23:56 <mwk> in the corners

23:56 <azonenberg> mwk: if only it was that easy :p

23:56 <mwk> azonenberg: well I have no way to do any better

23:57 <azonenberg> There's a LOT of pads lol

23:57 <whitequark> imagine you have two DFFs connected in series, with a common clock

23:57 <azonenberg> https://siliconpr0n.org/archive/doku.php?id=azonenberg:xilinx:xc6slx4

23:57 <azonenberg> that is a rather dirty/scratched die, i can definitely do better

23:57 <azonenberg> but its the only s6 decap i'm aware of right now

23:57 <azonenberg> actually dig did a 100

23:58 <azonenberg> https://siliconpr0n.org/archive/doku.php?id=mcmaster:xilinx:xc6slx100

23:58 <mwk> *shrug* corners

23:58 <whitequark> for this to work correctly, the clock skew between them must be less than the D-Q propagation delay (minus setup time), right?

23:58 <mwk> I can't really narrow it down any closer

23:58 <azonenberg> whitequark: in real asics, you tend to have hold time violations with closely spaced shift regs

23:58 <mwk> all I can do is the above hypothesis about which wires are in which corner

23:58 <whitequark> azonenberg: aha. thought so

23:58 <daveshah> azonenberg/whitequark: this is also true on xilinx hardware, at least newer devices

23:59 <whitequark> oooh

23:59 <daveshah> Particularly across certain clocking boundaries

23:59 <daveshah> But the tools will always add routing delay to fix it

23:59 <azonenberg> yeah but in fpga its pretty easy to add routing delay

23:59 <daveshah> This is not true in iCE40 or ECP5

23:59 <daveshah> Where clock skew is low enough compared to minimum clock to out and routing for it not to be a problem
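
The two-flop question above is usually written as a hold check: the new data launched by the first flop must not reach the second flop before that flop's hold window (measured from its own, possibly later, clock edge) has closed. In the usual textbook form the flop parameter involved is the hold time, which is why azonenberg and daveshah talk about hold violations. A minimal sketch, with made-up picosecond values that do not describe any real device:

```python
def hold_slack_ps(clk_to_q_min, route_min, hold, skew):
    """Hold slack for a flop-to-flop path, in picoseconds.

    clk_to_q_min: minimum clock-to-output delay of the launching flop
    route_min:    minimum routing/logic delay between the two flops
    hold:         hold time of the capturing flop
    skew:         capture clock arrival minus launch clock arrival
    A negative result means a hold violation.
    """
    return clk_to_q_min + route_min - hold - skew


def extra_delay_needed_ps(clk_to_q_min, route_min, hold, skew):
    """Data-path delay a tool would have to add to reach zero slack."""
    return max(0, -hold_slack_ps(clk_to_q_min, route_min, hold, skew))


# Hypothetical numbers: low skew, as described for iCE40/ECP5 above.
print(hold_slack_ps(clk_to_q_min=100, route_min=50, hold=30, skew=80))
# Hypothetical numbers: large skew across a clocking boundary.
print(extra_delay_needed_ps(clk_to_q_min=100, route_min=0, hold=30, skew=200))
```

The second call corresponds to the case daveshah mentions, where the tools insert routing delay on the data path until the slack is no longer negative.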