freemint has joined ##openfpga
freemint has quit [Remote host closed the connection]
<ZirconiumX> I'm pretty sure there's an SPIResource in nmigen-boards
rektide has joined ##openfpga
<kernlbob> So I guess Resources and Records can be bidirectional. Thanks.
<kernlbob> Can I pass a record through a module without unpacking it? Can't say `m.d.comb += submod.spi.eq(self.spi)` since signals go both ways.
dh73 has quit [Quit: Leaving.]
davidw has joined ##openfpga
davidw is now known as Guest83739
Guest15525 has quit [Read error: Connection reset by peer]
<kernlbob> I tried using the `connect` method. But it seems to be for records that have fanin or fanout.
<OmniMancer> My map of bits I know do things in the logic blocks is much more full now
<kernlbob> I am dropping my line of questions for now. Next week I will try to pull together an example that shows where I'm confused.
freemint has joined ##openfpga
freeemint has joined ##openfpga
freeemint has quit [Remote host closed the connection]
Guest83739 has quit [Read error: Connection reset by peer]
Guest83739 has joined ##openfpga
juri__ has joined ##openfpga
juri_ has quit [Ping timeout: 240 seconds]
freemint has quit [Remote host closed the connection]
Guest83739 has quit [Quit: Leaving]
freemint has joined ##openfpga
<OmniMancer> daveshah: what is the best way to give a map of bits that are not a known config?
IanMalcolm has quit [Read error: Connection reset by peer]
IanMalcolm has joined ##openfpga
simeonm has joined ##openfpga
rohitksingh has quit [Ping timeout: 250 seconds]
rohitksingh has joined ##openfpga
_whitelogger has joined ##openfpga
simeonm has quit [Quit: Bye]
_whitelogger has joined ##openfpga
_whitelogger_ has quit [Remote host closed the connection]
<OmniMancer> now to figure out how IO works
cr1901_modern has quit [Quit: Leaving.]
cr1901_modern has joined ##openfpga
<OmniMancer> so many unknown bits
m4ssi has joined ##openfpga
_whitenotifier-f has quit [Ping timeout: 264 seconds]
<pepijndevos> IO is the most complicated part
<pepijndevos> And at least on Gowin, there are several variations of any given tile -.-
<OmniMancer> I think the Anlogic part has different tiles for the left right top and bottom
<pepijndevos> Yea, and on Gowin there are at least two different per side and they have loooots of options for logic levels, current drive, LVDS, etc, etc
<pepijndevos> rn I only support just... input and output at the default setting
* zignig female humans @ ~2k days are dangerous little monsters.
m4ssi has quit [Remote host closed the connection]
m4ssi has joined ##openfpga
Jybz has joined ##openfpga
rombik_su has joined ##openfpga
rohitksingh has quit [Ping timeout: 250 seconds]
m4ssi has quit [Remote host closed the connection]
Asu has joined ##openfpga
<pepijndevos> what is the type of a bare parameter LUT = 0; in verilog?
<pepijndevos> In particular, how wide is it?
juri__ has quit [Read error: Connection reset by peer]
juri_ has joined ##openfpga
<ZirconiumX> pepijndevos: this is Verilog, did you expect this to be reasonable? AIUI it coerces into the width of whatever expression you use it in.
<pepijndevos> eh ok
AndrevS has joined ##openfpga
<whitequark> no
<whitequark> pepijndevos: 0 is the same as 32'd0
<whitequark> why 32? because fuck you
<OmniMancer> Is verilog at all derived from C?
<pepijndevos> So how does this work with anything bigger than a LUT5? https://github.com/YosysHQ/yosys/blob/master/techlibs/common/simlib.v#L1227
<pepijndevos> I guess in the sense that everything with curly braces and semicolons is derived from C?
<pepijndevos> https://en.wikipedia.org/wiki/Verilog influenced by: C, Fortran
<whitequark> pepijndevos: parameters aren't typed
<whitequark> so you can pass a 64'd0 as .LUT there
<pepijndevos> Ah, yea, that's what I was asking: the width of the parameter, so, undefined/any
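A minimal sketch of why the parameter has to be allowed to grow with the LUT size, assuming the usual convention that a k-input LUT is a 2**k-bit truth table indexed by its inputs. The helper names are hypothetical and are not taken from simlib.v.

```python
def lut_init_bits(k: int) -> int:
    """Number of truth-table bits a k-input LUT needs."""
    return 1 << k


def lut_eval(init: int, inputs: list) -> int:
    """Look up one output bit; inputs[0] is the least significant address bit."""
    index = 0
    for position, bit in enumerate(inputs):
        index |= (bit & 1) << position
    return (init >> index) & 1


# A 32-bit value is exactly enough for a LUT5; a LUT6 needs 64 bits.
print(lut_init_bits(5))  # 32
print(lut_init_bits(6))  # 64

# Hypothetical LUT6 init: 6-input AND, only the top truth-table bit is set.
and6 = 1 << 63
print(lut_eval(and6, [1, 1, 1, 1, 1, 1]))               # 1
print(lut_eval(and6 & 0xFFFFFFFF, [1, 1, 1, 1, 1, 1]))  # 0 once truncated to 32 bits
```

The last line is the failure mode the question is worried about: if the value were silently cut to 32 bits, the upper half of a LUT6 truth table would be lost.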
<OmniMancer> So I think I am missing some phantom bits
<whitequark> yeah
<whitequark> otherwise you couldn't have string parameters
<pepijndevos> Actually VHDL also has unconstrained types, which can map to this... just forgot about them hehe
<OmniMancer> my understanding of where tiles are works until you get to the first global clock spine
<whitequark> because strings are just null terminated sequences of octets
<pepijndevos> ah, so they took *that* part of C too -.-
<OmniMancer> lets copy all of the footguns of C
<whitequark> no, verilog has significantly more footguns
<whitequark> language designed by idiots
<OmniMancer> whitequark: I am not saying they didn't add more, but they certainly didn't seem to try to exclude any they had within reach
<whitequark> systemverilog: let's remove some of the footguns. haha joking we won't actually specify how always_ff works or make emitting a hard error on it standards compliant
<pepijndevos> I wonder what the story is... like IIRC JS being developed on very short notice as an afterthought or something like that
<OmniMancer> behold, the problem: https://imgur.com/tjSPD9Q
<sorear> verilog and vhdl were originally created to write sim models while the actual design files were made by a different team with vector graphics programs, synthesis was tacked on later. this doesn't explain why it's gotten *worse* though, synthesis has been an established use case for decades now
<whitequark> systemverilog was made by people who looked at c++ and thought, "what a honkin' good idea"
<sorear> might be a C++ thing "we'll add all the features, if you don't like them don't use them"
<whitequark> that's all the explanation i need
<pepijndevos> Yea, but, like... IIRC VHDL was designed by some committee, so just wondering if Verilog was just some rando at some company who made a thing they needed.
<OmniMancer> it seems the global clock spine is actually 8 bits high in the bitstream, while the numbers I was using as a proxy for position in the bitstream only jump by 2 at them
Jybz has quit [Quit: Konversation terminated!]
Asu has quit [Ping timeout: 240 seconds]
Asu has joined ##openfpga
Jybz has joined ##openfpga
Flea86 has joined ##openfpga
<OmniMancer> now to patch the tile grid
Jybz has quit [Quit: Konversation terminated!]
freemint has quit [Quit: Leaving]
Jybz has joined ##openfpga
<OmniMancer> now to gather routing info for the routing blocks
Bike has joined ##openfpga
Jybz has quit [Quit: Konversation terminated!]
Jybz has joined ##openfpga
Jybz has quit [Excess Flood]
Jybz has joined ##openfpga
<OmniMancer> what inputs can yosys accept for equivalence checking?
Bike has quit [Ping timeout: 240 seconds]
Bike has joined ##openfpga
Flea86 has left ##openfpga ["Leaving"]
sgstair_ has joined ##openfpga
sgstair has quit [Ping timeout: 268 seconds]
keesj has quit [Ping timeout: 240 seconds]
emeb has joined ##openfpga
sgstair_ is now known as sgstair
X-Scale` has joined ##openfpga
X-Scale has quit [Ping timeout: 240 seconds]
X-Scale` is now known as X-Scale
<GenTooMan> I wonder if anyone has gotten this error "Command '['sh', 'build_top.sh']' returned non-zero exit status 255." other than me from running nmigen.
<whitequark> needs more context
lopsided98 has quit [Ping timeout: 245 seconds]
lopsided98 has joined ##openfpga
<GenTooMan> That's what I thought I couldn't figure out what it meant.
<whitequark> well, it prints more than just that line, doesn't it?
<GenTooMan> Ok pastebin it is.
<GenTooMan> hmmm maybe I found the error by accident.
Jybz has quit [Excess Flood]
Jybz has joined ##openfpga
andre_ has joined ##openfpga
andre_ has quit [Quit: umount /dev/irc]
<GenTooMan> dug a bit through eclipse output(s) and I found the actual error which although just as weird at least I can figure out. I guess running under py dev is sometimes not so good.
<whitequark> so what's the actual error?
X-Scale` has joined ##openfpga
<azonenberg> pepijndevos: i would love to add more sanity checking/linting features to yosys's verilog/systemverilog front end
X-Scale has quit [Ping timeout: 246 seconds]
<azonenberg> in particular, a sizeable fraction of the footguns can be eliminated if you implement/restrict users to a well defined subset of the language's theoretical features
<azonenberg> So i want to add some optional arguments to read_verilog that enforce some additional rules
<azonenberg> For example, erroring out if you have latches in a combinatorial block or anything but a ff in an always_ff
<azonenberg> mandating default_nettype none
X-Scale` is now known as X-Scale
<whitequark> Clifford's answer for this before was "use an external linter"
<azonenberg> Yes, i know. But I want to use the actual yosys AST as input if at all possible
<azonenberg> i'd be OK-ish with exporting RTLIL and linting that
<azonenberg> as a pass that calls out to an external tool
<azonenberg> but i want it to be something i can integrate with a synthesis flow as-is
<GenTooMan> here is the listing https://pastebin.com/FwnBMWsu it's the generated pcf file. I'm not sure why it's setting frequency in the PCF ... let me check the version of nmigen
<azonenberg> also, some of the options are things that i really think need to be implemented early on in the parser
<azonenberg> For example, i want to be able to do read_verilog -sv -default_nettype none foo.sv
<azonenberg> and have it force default_nettype none, regardless of any declaration in the file or previous state set by earlier files
<azonenberg> Speaking of which, making default_nettype none default is IMO a bug in the spec
<azonenberg> and making preprocessor state persistent across files is another
<azonenberg> I would like to add nonstandard arguments to read_verilog that ensures source files are unaffected by compile order, i.e. no state persists across files other than the AST itself
<azonenberg> so that you dont have to worry about a third party HDL block added to your project defining macros that break your code, or vice versa
<azonenberg> yes, this has bit me before
<azonenberg> it's extra fun when the offending file is autogenerated by vivado and can't be patched because it'll just change again
<daveshah> GenTooMan: that was a new nextpnr addition
<azonenberg> sorry i meant to say above, making default_nettype wire default is the bug in the spec
<azonenberg> Use of an undeclared identifier should always be an error
<GenTooMan> daveshah oh ... I am using "nmigen-0.2.dev4+gf207f3f" of nmigen so I have to update nextpnr?
<daveshah> I guess so
<GenTooMan> :D
<daveshah> azonenberg: BTW, I added always_comb, ff, etc checking recently
<azonenberg> daveshah: oh awesome
<azonenberg> too bad i cant use it because all of my sv code uses structs and enums :p
<daveshah> And it is an error not a warning
<azonenberg> Also, what's the status of 7 series support for nextpnr + whatever tool from prjxray spits out a bitstream?
<daveshah> Yeah, structs and enums need doing soon
<azonenberg> Can you actually do at least basic stuff with mainline yosys+nextpnr yet?
<whitequark> azonenberg: can't you use read_verilog twice for that?
<daveshah> I have a proof of concept for nextpnr on Artix 7
<whitequark> AFAIK no state persists after read_verilog finishes
<azonenberg> whitequark: can you be more specific? use twice for what
<daveshah> It is not mainline and may never be
<whitequark> so that third party HDL does not affect your code
<daveshah> Routing is too slow to be useful right now but I'm working on it
<azonenberg> whitequark: In that case, that isn't SV standards compliant
<whitequark> hm
<azonenberg> whitequark: actually, its even better
<azonenberg> if memory serves me right, the standard says that within a group of files being compiled state persists
<azonenberg> but it seems to allow for you to have a group be arbitrarily few/many files
<whitequark> hmm
<azonenberg> This led to a fun bit of code on $sidegig a while ago that would compile fine in vivado and break in vcs/synopsys design compiler, and vice versa
<azonenberg> the workaround was to add c-style `ifndef `define include guards around a bunch of definitions
<whitequark> azonenberg: oh I see
<azonenberg> so that no matter whether things persisted or not, it would work
<whitequark> read_verilog *does* persist state
<whitequark> but yosys has the verilog_defines command
<whitequark> so you can do verilog_defines -reset
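A toy model of the compile-order problem discussed above, as a sketch only. It is not how Yosys or any other tool implements its preprocessor: it handles just `define, `ifndef and `endif, and the file contents and the WIDTH macro are made up for illustration. Passing a fresh dict per file loosely corresponds to resetting the defines between files.

```python
def preprocess(lines, defines):
    """Tiny model of `define / `ifndef / `endif handling.

    `defines` is mutated in place, which is what lets state leak
    from one file into the next when the caller reuses the dict.
    """
    out = []
    skipping = []  # one flag per open `ifndef
    for line in lines:
        words = line.split()
        if words and words[0] == "`ifndef":
            skipping.append(words[1] in defines)
        elif words and words[0] == "`endif":
            skipping.pop()
        elif any(skipping):
            continue
        elif words and words[0] == "`define":
            defines[words[1]] = " ".join(words[2:])
        else:
            # Naive textual substitution of known macros.
            for name, value in defines.items():
                line = line.replace("`" + name, value)
            out.append(line)
    return out


vendor = ["`define WIDTH 4", "wire [`WIDTH-1:0] a;"]
mine = ["`ifndef WIDTH", "`define WIDTH 8", "`endif", "wire [`WIDTH-1:0] b;"]

# Persistent state: the vendor file's WIDTH is still visible in my file.
shared = {}
preprocess(vendor, shared)
print(preprocess(mine, shared))  # ['wire [4-1:0] b;']

# Clean slate per file: my own default is used.
print(preprocess(mine, {}))      # ['wire [8-1:0] b;']
```

The same source line elaborates to two different widths depending only on what was read earlier, which is why a clean-slate option and a standards-compliant persistent mode are both useful.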
<azonenberg> does that wipe default_nettype, timescale (i guess that doesnt matter for synthesis) etc?
<azonenberg> or only actual `define
<azonenberg> basically, IMO we need to have both a standards compliance mode and a clean-slate mode
<whitequark> looking
<whitequark> azonenberg: `timescale is ignored
<whitequark> default_nettype is reset each time read_verilog is called
<whitequark> yosys also supports `resetall, not sure if it's in the standard
<whitequark> which you can presumably use in your HDL rather than in a script
<azonenberg> Hmmm, as much as i like resetting each time i think for compatibility with other stuff we do need to allow it to persist
<azonenberg> But that definitely needs to be an option
<daveshah> azonenberg: if you want to peek at it this is the nextpnr repo for artix7/ultrascale experiments
<daveshah> But expect issues https://github.com/daveshah1/nextpnr-xilinx
<azonenberg> daveshah: And what densities/packages do you support?
<azonenberg> right now i have 50t and 100t artix in ftg256 handy
<azonenberg> a 70t kintex in fbg484 on a board with limited gpio
<daveshah> I've only tried it with the "35t" that the Arty has
<azonenberg> what package is that?
<azonenberg> the 35 is a fused 50 so it should work on that
<daveshah> Some 324 one I think
<azonenberg> ah ok, so we'd have to find the mapping between csg324 and ftg256 pads
<daveshah> I don't think Xray has the package data for the 256 50t
<daveshah> It has the 484 50t iirc
<daveshah> It would be easy to add
<daveshah> Just some Vivado tcl scripts
<daveshah> But I don't know Xray well enough
<daveshah> Probably similar for the 100t tilegrid
<daveshah> It has slow routing and no timing data (so ignore any Fmax it gives)
<daveshah> ie don't actually use it
<azonenberg> well if somebody wanted to add package data for 256 50t i can start alpha testing with a few blinkies etc
<azonenberg> just to make sure i have things compiling etc
<azonenberg> what's the next steps? Is timing driven packing/placement in progress, or figuring out more bitstream parts, or what?
<daveshah> Yeah I've no idea what's actually involved
<daveshah> I don't have any involvement or insight into the bitstream side tbh
<daveshah> They have timing data inside VPR already, iirc
<daveshah> I just haven't the time or energy to do the same on the nextpnr side yet
<daveshah> VPR is the primary flow for xc7 and always will be
<azonenberg> oh?
Laksen has joined ##openfpga
<whitequark> i thought vpr was unsuitable for real-world architectures
<daveshah> Well, it seems it can be fudged
<azonenberg> I thought nextpnr was supposed to be a better, more scalable tool that was to entirely replace vpr moving forward for actual-hardware projects?
<whitequark> why should it be primary flow if it "can be fudged"?
<daveshah> I personally don't think I'll be able to maintain nextpnr for xc7 to a standard I would be happy about as well as the existing iCE40 and ECP5 flows
<whitequark> oh, I see
<azonenberg> Re VPR, how usable is that now for xc7?
<whitequark> does iCE40 flow need any ongoing maintenance?
<daveshah> I think basic stuff is working but I think routing is also quite slow, like nextpnr
<daveshah> To some extent iCE40 still does, eg not supporting HeAP yet
<whitequark> um, I've been using HeAP on iCE40 for like a year
<daveshah> Oh I meant defaulting not supporting
<whitequark> right
<daveshah> Because of the odd little edge cases that each arch has that each new cad algorithm needs to cope with
<daveshah> Same with things like the new router, any API changes, IO timing analysis, etc
<whitequark> makes sense
<daveshah> On a personal level, I'd rather work on supporting Lattice's new parts than Xilinx particularly as that will share a lot with ECP5
<daveshah> But while keeping some kind of experimental xc7 thing as a way of making sure the CAD algorithms can scale to that size of device
<whitequark> I'd be much happier personally with better Lattice support, but that's predicated on the kind of work I do
<azonenberg> Meanwhile, while i dont have the time to commit to much on the tooling side, 7 series/ultrascale is top priority for me because most of my projects lately wont fit in a lattice
<whitequark> azonenberg is in the exact opposite situation
<daveshah> Realistically nextpnr has a way to go before being able to route such big designs anyway
<azonenberg> i need 100k+ cells of capacity, 10G serdes, etc
<daveshah> That's the other issue with putting time into xc7, that Ultrascale is coming round the corner too
<azonenberg> one of the things holding me back from playing with the ecp5/ice40 tools more is that i simply dont have needs for anything that small
<azonenberg> daveshah: kintex7 is not going away any time soon
<azonenberg> ultrascale has no low end parts
<azonenberg> and artix/kintex is far more cost effective for most applications
<daveshah> I don't think Xilinx really care about low ends
<azonenberg> my understanding is that xilinx reached a point a few years ago where it no longer made sense to make low/mid/high end parts on the same node
<daveshah> They are investing in Efinix instead
<azonenberg> So now they have active families still getting development and even new devices (like spartan7) across 28, 20, and newer
<azonenberg> i don't see the 28nm stuff going away any time soon... heck, you can still get coolrunners
<daveshah> No, but xc7 will become less interesting over time
<daveshah> I guess check the open source tooling again this time next year and maybe things will be better
<azonenberg> what i mean is, i see xc7 being xilinx's key product in the sub-$500 price range for the next few years at least
<daveshah> Well they have some Zynq UltraScales and Versals at around that level
<daveshah> I know you don't like SoCs
<azonenberg> Pure FPGAs will always have a place
<daveshah> but iirc the smallest Zynq UltraScale is around 50k
<daveshah> LUTs
<azonenberg> i don't see xilinx discontinuing fpgas and only making socs
<azonenberg> having socs be the main marketing push sure
<daveshah> Well afaik Versal, their next generation, is only SoCs
<azonenberg> versal i dont think is a new generation per se
<azonenberg> i think it's a new *family*, like zynq is
<azonenberg> zynq is not intended to replace artix
<daveshah> It's a new node
<azonenberg> ok that i did not know
<azonenberg> but i expect we'll see virtexes on that node shortly
<daveshah> Given they have top end Versals with HBM intended for the virtex market I'm not so sure
<azonenberg> Hmmm
<azonenberg> well, if they drop pure-fpga support that may be the kicker that gets me to start moving to another fpga vendor :p
<azonenberg> i find it hard to believe they'd kill off the asic prototyping etc space
<azonenberg> that seems hugely profitable
<azonenberg> given how high the markup on the massive virtexes is
<daveshah> Well they'd just tell those people to use the ARM core for startup and then ignore it
mumptai has joined ##openfpga
<whitequark> are FPGAs ever power limited?
<azonenberg> Yes. A lot of the old spartan6 btc miner boards were thermally limited AIUI
<azonenberg> you couldn't run them at the fmax identified by timing analysis or they'd melt
<whitequark> hm
<whitequark> right, but I mean more, do FPGAs ever have "dark silicon"?
<whitequark> I'm guessing not
<azonenberg> i think some of them actually did dynamic frequency scaling with thermal sensor input closing the loop
<azonenberg> re dark silicon, not that i know of. Those things were at near 100% lut load all the time and just varied frequency
<whitequark> as in, is there ever a point where "adding more fabric" becomes an unviable strategy
<whitequark> so you add an ARM core
<azonenberg> Only if your problem can't parallelize more
<whitequark> as an FPGA vendor that is
<azonenberg> the reason they add CPUs is because software devs cost less than RTL devs
<whitequark> not as a user
<whitequark> no, I know
<whitequark> it's a tangent
<azonenberg> And because for low speed state machine stuff, you dont need everything unrolled in rtl all the time
<azonenberg> i dont think power is ever a reason to not add more fabric
<azonenberg> usually the cap on fpga size is yield
<whitequark> oh, is that why they go for multi-chip modules?
<azonenberg> They're not MCMs per se, 2.5D is kind of a special case in that it's still all silicon
<azonenberg> you basically just have the top few metal layers on a separate substrate
<whitequark> wait, what
<azonenberg> the big virtexes have a passive interposer made on silicon, i think it's a 65nm process for the 28nm virtexes that i heard somewhere (no source for that number handy)
<azonenberg> they have TSVs in the interposer connecting to flip chip bumps on the fpga dies
<azonenberg> then more flip chip bumps on the interposer to the package
<azonenberg> so there's actually two layers of silicon with nanoscale interconnects, not pcb traces, connecting the adjacent dies
<rombik_su> Yup, SSI (the substrate) is 65 nm
<azonenberg> On virtex7. Not sure if smaller for ultrascale
<azonenberg> there's something like 10k signal lines, plus clock/config trees, between each of the SLRs (FPGA dies)
<rombik_su> *(the silicon substrate)
<whitequark> right, that's what I was thinking
<whitequark> instead of one huge die you have a few smaller ones plus interconnect
<whitequark> so yield is higher
<azonenberg> Yes
<azonenberg> This also lets them have fewer mask sets
<azonenberg> So for example, an xc7vh580t is 2 SLRs, 90700 slices, 1680 DSPs
<azonenberg> so 45350 slices and 840 dsps each
<azonenberg> an xc7vh870t, as far as i can tell, is three *of the same fpga die*
<azonenberg> just a new interposer
<azonenberg> so only one 28nm mask set for two SKUs
<azonenberg> in fact, it might even be the same interposer with one of the three footprints unpopulated/capped off with a dummy die for thermal reasons
<azonenberg> They do have more than one fpga die on each process because the lower end SKUs are monolithic, and they also have a few variants of high end optimized for more DSP vs more SERDES etc
<azonenberg> but there's multiple obvious cases of reuse
<daveshah> Incidentally, I'm sure that I heard some of the Kintex US+ were using some Zynq US+ dies with the ARM cores locked out
<daveshah> Didn't look at that myself though
<daveshah> That would just be for mask commonality though
<azonenberg> that sounds less likely because the zynqs are a totally different boot process
nrossi has quit [Quit: Connection closed for inactivity]
<azonenberg> i dont think zynqs are capable of booting the fpga only in e.g. master spi mode
<nats`> <daveshah> Incidentally, I'm sure that I heard some of the Kintex US+ were using some Zynq US+ dies with the ARM cores locked out <= that's true
<azonenberg> oh?
<azonenberg> at least in 7 series, which i'm more familiar with, the arm comes up first and then loads the fpga
<azonenberg> so unless they have bootrom code to check an efuse bit then master-spi into the fpga
<azonenberg> which i guess isnt beyond the realm of possibility
<azonenberg> also it would be a totally different package but i guess that can be worked around
<rombik_su> Flip-chips can accommodate it with a different substrate PCB design, I guess
<azonenberg> yeah thats what i'm thinking, and then just have some extra fpga gpios to use the balls that you can't use with MIO
rombik_su has quit [Quit: Leaving]
rombik_su has joined ##openfpga
Laksen has quit [Quit: Leaving]
rohitksingh has joined ##openfpga
zino has joined ##openfpga
Jybz has quit [Quit: Konversation terminated!]
<mwk> azonenberg: that's not completely true
<mwk> the fpga part still has the usual config port bonded out
<mwk> it's just that the pins are labelled as NC
ZombieChicken has joined ##openfpga
<mwk> in theory, it could be possible (unless they disabled it via some mechanism unknown to me) to ignore the ARM core and boot it normally instead
<mwk> via the usual config modes
<mwk> perhaps even hold the ARM in reset
<rombik_su> They can disable ARM via efuse or with strap pins hardwired in substrate for example
<mwk> that's for xc7
<mwk> for ultrascale, they even use the exact same hw as kintex and zynq
<mwk> as in, only the smallest kintex u+ part is not actually the zynq, all the other ones have a disabled ARM core
<mwk> the obvious difference is packaging — the kintex parts don't have the PSS banks bonded out
<mwk> there may also be some fuses involved
<rombik_su> IIRC, in Zynq MPSoC's PL side can be put in user mode without bothering with PS side startup, when in 7s you'll need to bring PSU up in order to use PL side
<azonenberg> oh so they realized how dumb it was to make PS be in charge of everything and backed off on that?
<daveshah> I think they've effectively put a MicroBlaze or two in charge of everything instead
<mwk> microblaze? why?
<daveshah> idk
Asu has quit [Remote host closed the connection]
<daveshah> Their core of choice
<rombik_su> Well, they had it lying around, it's mature and well-known among users so I think they just decided to harden it
<daveshah> Yeah, looks like they did some triple redundancy too
<daveshah> For thrice the fun
<rombik_su> :D
<rombik_su> Xilinx is very good at creating jobs - it's like a full-time job for a SWE to be able to jiggle all the bells and whistles on Zynq US+
<whitequark> lol
feuerrot has quit [Ping timeout: 240 seconds]
AndrevS has quit [Quit: umount /dev/irc]
feuerrot has joined ##openfpga
<azonenberg> meanwhile i'm still waiting for a big-fpga little-cpu chip
<azonenberg> like, a stm32 class processing engine bolted onto a big kintex7 sized fpga
<azonenberg> independent, and each capable of operating without the other
<rombik_su> Like, for housekeeping stuff?
<azonenberg> i'm thinking more like a few hundred MHz M7
<azonenberg> with a fair bit of on die sram plus an axi bus connected to the fpga for mapping external dram etc via the fpga if desired
<azonenberg> the other thing i'd like is a large fpga with a bunch of tiny cortex-m0 sized cpus scattered around the fabric
<mwk> tbh what's wrong with, say, 7z100 for that use case?
<azonenberg> mwk: couple of things
<azonenberg> first is, cortex-a is much harder to bring up bare metal than M
<azonenberg> if you dont want or need linux, zynq is difficult to use
<mwk> that sounds like sw tooling problem though
<azonenberg> the other is, the zynq 7 arch (less familiar with u/u+) is designed to be cpu-first
<azonenberg> you cannot, for example, boot the fpga in master mode and leave the cpu in power-down mode until a gpio is toggled and you boot it
<azonenberg> the CPU can reconfigure the FPGA at any time without the FPGA's consent
<azonenberg> so you can't use the FPGA for an isolated security domain
<azonenberg> i want them to be basically two separate chips on the same die, just with a fast bus between them
<rombik_su> azonenberg: Do you use XSDK with Zynq?
<mwk> well that's kind of solvable with trustzone
<azonenberg> rombik_su: my experience with zynq to date has all been PL side while someone else did the software
<azonenberg> i attempted to get it to boot up bare metal and print hello world to the console and got nowhere
<rombik_su> That's strange, hello world in XSDK is just couple of clicks
<azonenberg> so i gave up and integralstick used a stm32f7 + artix7
<azonenberg> well thats the other problem, i didnt want to use huge amounts of generated code
<rombik_su> Oh, I see
<azonenberg> trying to even use the zynq PS in a pure systemverilog design without using the IP integrator is essentially impossible
<mwk> true
<azonenberg> you have to make a block design because that's the only way to get all of the myriad SFRs configured right at boot
<azonenberg> What i want is a pure FPGA that has a bunch of hidden GPIOs that connect to a MCU
* mwk would like to make some proper free toolchain for the PS side of zynq some day
<azonenberg> and brings out a memory bus plus a few other things
<rombik_su> mwk: please don't forget proper OpenOCD support :D
<azonenberg> With support for either fpga-first, mcu-first, or isolated architectures
<azonenberg> with a stm32, a few dozen lines of linker script and i can compile a C program with gcc, no assembly startup routines or anything, and be running C right out of reset
<azonenberg> the only thing you need asm for is disabling/enabling interrupts, so two functions with one line of inline asm each
<azonenberg> memory is working immediately
<azonenberg> good luck doing that with a cortex-A based platform
<azonenberg> i also have not yet managed to get debug working on a cortex-A using jtaghal
<azonenberg> meanwhile i have full in-circuit program support for stm32, as well as partial debug
<rombik_su> 7000s is very PS-centric by design
<azonenberg> rombik_su: yes i know, and i strongly disagree with that architectural decision
<azonenberg> i want either PL-centric or peer-to-peer
<azonenberg> Xilinx in general seems to be trying to push their products on software devs with things like HLS, the ACAP, etc
<azonenberg> i want a soc for rtl engineers who know software
<azonenberg> not for software folks who don't grok digital logic :p
<pie__> whats a PS
<mwk> the ARM core on zynq
<pie__> oh ok
<mwk> "processor system", I think
<azonenberg> To be precise, the PS is the arm core plus peripherals, bus, caches, dram controller, etc
<rombik_su> Acceleration market seems to be growing fast, so they probably want to cater to those guys (mostly software/web folks).
<azonenberg> rombik_su: yes, i know
<azonenberg> i get why they're doing it
<azonenberg> doesn't mean i have to like it :)
<mwk> as opposed to PL, which is the FPGA part (programmable logic)
<mwk> also PS is called PSS in some places, but I have no idea what the extra S stands for
<zino> As a Software person I appreciate this trend to give me lots of IO fairly easily accessible from a semi-sensible OS. :)
<azonenberg> mwk: PSS = PS7 = Processing System 7
<mwk> what, the S stands for Seven?
<azonenberg> That is my understanding
<mwk> ugh.
<azonenberg> Soooo to give you an idea of how much i hate dealing with axi and zynq in general
<mwk> probably true though.
<azonenberg> for $sidegigclient's project, i had to move circa 1 Gbps of data between my logic in PL, and a C program in the PS
<azonenberg> you know what i did?
<mwk> hmm, or is it
* mwk looks through her notes
<azonenberg> i brought up the second ethernet MAC on the PS, connected the EMIO GMII bus to my PL, and wrote my logic to use Ethernet framing instead of AXI around PS-PL communications
<rombik_su> azonenberg: I'm sorry, my connection is lagging, it looks like I'm repeating your points, but in fact I sent those long before. )
<azonenberg> then on the PS side, you can just use SOCK_RAW/AF_PACKET and sendto() the PL :D
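A minimal sketch of the PS-side half of that trick, assuming Linux on the PS. The MAC addresses, the interface name and the choice of EtherType are placeholders: 0x88B5 is one of the IEEE 802 local experimental EtherTypes, not something stated in the conversation. The sketch binds an AF_PACKET socket and calls send() rather than sendto().

```python
import socket
import struct

ETH_P_LOCAL_EXPERIMENTAL = 0x88B5  # IEEE 802 local experimental EtherType
MIN_PAYLOAD = 46                   # pad so the frame reaches 60 bytes before FCS


def build_frame(dst: bytes, src: bytes, payload: bytes,
                ethertype: int = ETH_P_LOCAL_EXPERIMENTAL) -> bytes:
    """Pack destination MAC, source MAC, EtherType and a zero-padded payload."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    header = struct.pack("!6s6sH", dst, src, ethertype)
    return header + payload.ljust(MIN_PAYLOAD, b"\x00")


def send_frame(ifname: str, frame: bytes) -> int:
    """Send one raw frame on a Linux interface (needs CAP_NET_RAW)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((ifname, 0))
        return sock.send(frame)


# Locally administered placeholder addresses for the PL and PS ends.
frame = build_frame(b"\x02\x00\x00\x00\x00\x01",
                    b"\x02\x00\x00\x00\x00\x02",
                    b"hello PL")
print(len(frame))          # 60
print(frame[12:14].hex())  # 88b5
# send_frame("eth1", frame)  # hypothetical interface name for the second MAC
```

The appeal of this design is that the PL only has to parse a fixed 14-byte header on a GMII stream, and the PS side needs no custom driver.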
<mwk> azonenberg: the PSS name is also used on ultrascale
<mwk> which have PS8
<daveshah> Yeah, PSS_ALTO is familiar
<daveshah> It's also odd because PS7 isn't the 7th PS but refers to 7 series
<daveshah> Whereas PS8 is in 9 series effectively
<mwk> eh, ultrascale and ultrascale+ are not all that different
<mwk> could count as the same series
<mwk> or maybe it was intended to be used in first ultrascale, but hit some delays
<daveshah> That seems quite possible
<daveshah> Versal is PS9 btw
<mwk> or maybe it's xilinx just fucking up a numbering scheme again
<rombik_su> A lot of the docs (UltraScale Architecture Libraries Guide UG974, for example) cover both US and US+
<azonenberg> daveshah: lol PS9 is versal?
<daveshah> Yep
<daveshah> All the Versal primitives are in latest Vivado
<azonenberg> but versal is 10 series
<azonenberg> u = 8, u+ = 9
<daveshah> That was the problem I just mentioned with PS8
<mwk> azonenberg: nobody ever said that
<azonenberg> yeah there is no zynq 8
<mwk> the only series there is is series 7
<azonenberg> mwk: i'm talking about internally, not in marketing names
<daveshah> Versal seems to be RAMB18E5 and RAMB36E5 too
<mwk> internally series 7 is called Fuji
<mwk> no number
<daveshah> I'm not sure what happened to E3 and E4
<mwk> daveshah: seems they cannot make up their mind whether Ex corresponds to a given hw generation, or whether it's just a number to be incremented whenever they change a primitive
<mwk> there's lots of precedent for that, unfortunately
<whitequark> lol
<daveshah> I still like that Lattice use increasing letters instead (appending A, B, C, D, etc)
<rombik_su> mwk: yup, OSERDESE2 is 7-series, OSERDESE3 is US/US+
<daveshah> This led to the super cute ECP5 JTAGG primitive
<azonenberg> mwk: the last details i had was Fuji is 7 series, Olympus is 8/ultrascale, Diablo is 9/ultrascale+, Everest/Everestea is Versal
<mwk> rombik_su: no
<daveshah> JTAG egg
<mwk> I mean yes
<mwk> but not just that
<mwk> when you actually look at internal primitives, *both* OSERDESE2 and E3 are series 7
<azonenberg> Someone also claimed to have found some internal device ID constants for virtex-9 and kintex-9 which were separate from vu+/ku+
<mwk> E2 are the 3.3V banks, E3 are 1.8V banks
<azonenberg> but still considered diablo family devices
<azonenberg> so who knows
<mwk> azonenberg: oh there are craploads of unused ID codes
<azonenberg> Yeah
<mwk> like there are twice as many unreleased 7 series devices as released ones
<azonenberg> who knows if any are future devices vs dead
<azonenberg> Oh?
<azonenberg> All i know about is the xc7a350t
<azonenberg> then going way back the xc2c1024
<azonenberg> those are the only thoroughly confirmed but never released xilinx parts i know of
<azonenberg> well, that and the whole Starfighter series
<mwk> well, at least there are IDCODEs for them
<mwk> I have no idea how far in production they actually were
<azonenberg> the xc7a350t had a whole resource definition entry in the overview table with exact lut/bram etc counts
<azonenberg> and a package/pinout defined
<azonenberg> i guess they figured it was competing with kintex and killed it
<azonenberg> Coolrunner-II was named BladeRunner originally afaik, hence the ISE data directory being called XBR (Xilinx BladeRunner)
<mwk> oh that explains that weird name
<azonenberg> StarFighter was the planned 90/130nm successor to launch alongside Spartan-3
<mwk> so the strangest part of series 7 is actually the unreleased thing with big DACs/ADCs
<azonenberg> based on patents, i strongly conjecture that it was to be a hybrid architecture with CPLD-style function blocks connected in a 2D routing tile array like an FPGA
<mwk> sorta like RFSoC, I think
<azonenberg> mwk: my guess is it took too long to finish and they rebuilt it around ultrascale fpga fabric
<azonenberg> That's the first i've heard of it though
<mwk> and it was supposed to be a heterogeneous SSI thing, like the GTZ transceivers
emeb has quit [Quit: Leaving.]
<mwk> ie. the ADC/DAC were actually separate die in the same package
<mwk> well, it's amazing how many things you can find if you look at the device database closely in xdl
<azonenberg> Lol
<mwk> the other interesting unreleased thing is Spartan 3A variant with embedded MCU and USB 2.0 phy
<daveshah> Haha
<azonenberg> i also think spartan6 was originally going to have an in-package wirebonded flash variant like spartan3an
<azonenberg> cant recall where i got that info
<mwk> azonenberg: correct
<azonenberg> not sure why they killed that, it would have cost them very little to do
<mwk> it's even present in the final silicon
<azonenberg> no mask respin
<azonenberg> maybe buggy and they didnt want to do a new stepping?
<mwk> you can instantiate the primitive to talk to the nonexistent flash
<azonenberg> Lol
<azonenberg> has anyone tried bonding a flash to a live decapped spartan6?
<mwk> doubt so
<azonenberg> i kinda wanna do that now lol
<mwk> they would've fixed it in some device if they cared
<azonenberg> if you can give me some guesses of the bond pad locations
<mwk> I mean, they don't make all device sizes at the same time
<mwk> lower right corner of the thing IIRC
* mwk checks
<azonenberg> now that i have a decap setup working i might actually be willing to try this, would be an excuse to get the wirebonder at work a workout (havent used it in years)
<whitequark> let's say i use BUFGCE on Xilinx
<whitequark> do the input clock and output clock have a defined phase relationship?
<mwk> whitequark: yes
<whitequark> further, can I gate a chunk of a single clock domain without introducing a race condition?
<mwk> I mean, sort of
<mwk> the guarantee is that if you have several BUFG*s with the same input clock, they have same phase when they reach the flops
<mwk> so BUFGCE is in-phase with a BUFG with the same input
<whitequark> e.g. clka -> BUFGCE -> clkb; always @(posedge clka) x <= y; always @(posedge clkb) z <= x;
<azonenberg> whitequark: that's not how you'd do it, i think
<mwk> whitequark: this will work assuming that clka is distributed through a BUFG (which it should be)
<kc8apf> azonenberg: you'll probably be my closest decapping setup after my move. Do you let others use it?
<azonenberg> kc8apf: where are you moving?
<whitequark> context: I am thinking about nondeterminism in simulators
<kc8apf> North of Monroe
<azonenberg> kc8apf: what state is that? :p
<kc8apf> WA
<mwk> azonenberg: okay, so I have two hypotheses
<azonenberg> oh, cool - in that case then probably yes i'll be close
<whitequark> in Verilog for example if you write a clock divider it'll be racy
<mwk> one is that everything is in lower-right
<azonenberg> I'm willing to do lab work for guests, although at least for the moment i'm not going to be fully hands off
<mwk> because that's where SPI_ACCESS is
<azonenberg> it would probably be me doing the work while you watch
<azonenberg> mwk: do you actually have pads IDed or guessed? even if you don't know which is which signal?
mumptai has quit [Quit: Verlassend]
<whitequark> whereas a clock divider in VHDL isn't
<kc8apf> Fair enough. McMaster has let me do more hands on with oversight
<azonenberg> kc8apf: we'll work out details if you pay me a visit
<kc8apf> Fairly rare that I need decapping though
<azonenberg> Depending on your level of experience etc
<mwk> the other is... MISO in lower left, CLK and MOSI in lower-right, CS in upper-right
<whitequark> conversely, a clock *gating* circuit in Verilog is fine, but is racy in VHDL because it introduces "propagation delay" and now you have to manually balance the clock tree
<azonenberg> kc8apf: i'm also working on building out the facilities for visiting researchers
<kc8apf> Nice
<mwk> azonenberg: no, I haven't even seen a die shot
<mwk> and S6 is honestly too much of a mess to identify shit
<azonenberg> mwk: check siliconpr0n if you want to look at some top metal shots
<azonenberg> i dont think we have any delayered ones yet
<whitequark> so it looks like the behavior of the FPGA actually sorta matches what VHDL does
<mwk> I just know where roughly the things are in the tile structure
<azonenberg> kc8apf: as of now the conference room/presentation area is painted and set up but lacks a conference table, chairs, projector, or screen
<whitequark> but you have to manually insert BUFGs without relying on the toolchain for the sim to be correct
rombik_su has quit [Read error: Connection reset by peer]
<kc8apf> I'll be building out some electronics but no chem. Most of the current space will be for automotive, woodworking, and metalwork
<azonenberg> and only one of the three walls planned to have a whiteboard actually has one
<mwk> I suppose if you have a decapped FPGA, you could look for unused pads
<whitequark> actually, here's something i'm curious about
<mwk> in the corners
<azonenberg> mwk: if only it was that easy :p
<mwk> azonenberg: well I have no way to do any better
<azonenberg> There's a LOT of pads lol
<whitequark> imagine you have two DFFs connected in series, with a common clock
<azonenberg> that is a rather dirty/scratched die, i can definitely do better
<azonenberg> but its the only s6 decap i'm aware of right now
<azonenberg> actually dig did a 100
<mwk> *shrug* corners
<whitequark> for this to work correctly, the clock skew between them must be less than the D-Q propagation delay (minus setup time), right?
<mwk> I can't really narrow it down any closer
<azonenberg> whitequark: in real asics, you tend to have hold time violations with closely spaced shift regs
<mwk> all I can do is the above hypothesis about which wires are in which corner
<whitequark> azonenberg: aha. thought so
<daveshah> azonenberg/whitequark: this is also true on xilinx hardware, at least newer devices
<whitequark> oooh
<daveshah> Particularly across certain clocking boundaries
<daveshah> But the tools will always add routing delay to fix it
<azonenberg> yeah but in fpga its pretty easy to add routing delay
<daveshah> This is not true in iCE40 or ECP5
<daveshah> Where clock skew is low enough compared to minimum clock to out and routing for it not to be a problem
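[editor's note] The two-flop question above is usually phrased as a hold-time check: the data launched by the first flop must not reach the second flop before that flop's hold window closes. A minimal sketch of that arithmetic follows, assuming skew is measured as capture-clock arrival minus launch-clock arrival. The function names and all numbers are illustrative, not taken from any datasheet or tool.

```python
def hold_slack_ps(clk_to_q_min: int, data_delay_min: int,
                  hold: int, skew: int) -> int:
    """Hold slack in picoseconds for two flops in series.

    skew = capture clock arrival - launch clock arrival (assumed convention).
    A non-negative result means the hold requirement is met.
    """
    return clk_to_q_min + data_delay_min - hold - skew


def extra_delay_needed_ps(clk_to_q_min: int, data_delay_min: int,
                          hold: int, skew: int) -> int:
    """Extra data-path delay a tool would need to add to fix a violation."""
    slack = hold_slack_ps(clk_to_q_min, data_delay_min, hold, skew)
    return max(0, -slack)


# Illustrative numbers only: large skew across a clocking boundary.
print(extra_delay_needed_ps(clk_to_q_min=100, data_delay_min=50,
                            hold=80, skew=200))
# Low skew relative to clock-to-out plus routing: nothing to fix.
print(extra_delay_needed_ps(clk_to_q_min=100, data_delay_min=50,
                            hold=80, skew=20))
```

The second case corresponds to the situation described for iCE40 and ECP5, where skew is small enough that minimum clock-to-out plus routing already covers it.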