_whitelogger has joined ##openfpga
futarisIRCcloud has joined ##openfpga
_whitelogger has joined ##openfpga
dj_pi has quit [Ping timeout: 246 seconds]
unixb0y has quit [Ping timeout: 245 seconds]
unixb0y has joined ##openfpga
balrog has quit [Quit: Bye]
balrog has joined ##openfpga
dj_pi has joined ##openfpga
ZombieChicken has quit [Ping timeout: 256 seconds]
<whitequark> eddyb: you can just use a PCI-to-PCIe bridge chip
ZombieChicken has joined ##openfpga
gsi__ has joined ##openfpga
gsi_ has quit [Ping timeout: 244 seconds]
dj_pi has quit [Ping timeout: 246 seconds]
<TD-Linux> eddyb, I would start with booting it tbh
<TD-Linux> tbh I would start with a 3.3v 386sx. they are slow enough that you can sigrok them with a cheap logic analyzer
<gruetzkopf> ?
<gruetzkopf> the highlight on pci maaay have been a stupid idea
<sorear> we *could* be discussing pci-dss
<whitequark> lol
genii has joined ##openfpga
genii has quit [Remote host closed the connection]
_whitelogger has joined ##openfpga
ZombieChicken has quit [Remote host closed the connection]
ZombieChicken has joined ##openfpga
flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]
<eddyb> TD-Linux: yeah the logic levels on the Pentium FSBs seem obnoxious, but 66 or 100MHz aren't that hard to reach, right? anyway, I agree, modulo not having a 386 on hand already (but if they're cheap, eh)
<TD-Linux> yeah you should be able to reach that.
<eddyb> whitequark: heh, I suppose that is easier than spending who knows how long building an opensource version of such a bridge in an FPGA
<whitequark> 66 or 100 MHz isn't that easy to get reliably working on a parallel bus
<whitequark> remember how many months it took me to get stuff working on glasgow?
<eddyb> TD-Linux: oh wow there aren't that many steps between 386 and NetBurst. anyway, 486 seems like it's slow enough for Glasgow (although you wouldn't be able to connect it to everything)
<TD-Linux> er I thought you were using ecp5
<eddyb> whitequark: oops I was scrolled up. ugh reality of digital signals strikes again
<eddyb> TD-Linux: depends, I'd much rather trust a Glasgow to do initial poking (esp to watch an already running system) than my own thing
<whitequark> you don't have enough pins on a glasgow to watch a huge parallel bus
<whitequark> you have sixteen.
<whitequark> you need what, 96 at a minimum?
<whitequark> that's a fuckton of pins and at 66 MHz it is uhhh
<eddyb> ah, sure, I meant more the control lines
<whitequark> 6 Gbps?
<whitequark> not even Glasgow revE will be able to dump that straight into a PC
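The bandwidth figure above follows from simple arithmetic. A minimal sketch, assuming one sample per pin per bus clock and no compression (the function name is illustrative only):

```python
# Back-of-the-envelope capture bandwidth for a wide parallel bus.
PINS = 96             # pin count quoted in the discussion
BUS_CLOCK_HZ = 66e6   # bus clock quoted in the discussion


def raw_capture_rate_bps(pins: int, clock_hz: float) -> float:
    """One sample per pin per bus clock, with no compression."""
    return pins * clock_hz


rate = raw_capture_rate_bps(PINS, BUS_CLOCK_HZ)
print(f"{rate / 1e9:.2f} Gbit/s")  # 6.34 Gbit/s
```

This rounds to the "6 Gbps" quoted in the conversation.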
<TD-Linux> for a 386 I was mostly debugging it running off a boot ROM so watching just the address lines was enough as I could trace along with a ROM dump
<whitequark> yeah, you could do compression and all sorts of stuff
<TD-Linux> (I only saw 16 lines so even that only worked for early boot)
<TD-Linux> 386sx has only 16 data lines which helps
<whitequark> ahhh
<eddyb> TD-Linux: ahh right that is a neat trick. I was thinking some address lines and some of the control lines, but forgot instruction memory
<whitequark> so it might be possible to run several glasgows in parallel
<whitequark> using the sync port
<eddyb> TD-Linux: oh, that's a benefit to using 386 then, 486 has 32 data lines
<eddyb> TD-Linux: oops, I misunderstood what the SX meant
<eddyb> or rather, 486SX doesn't get the bus width downgrade, just no integrated FPU AFAICT
<TD-Linux> do you mean the Processor Formerly Known as the i486?
<TD-Linux> eddyb, certainly faster processors would be cooler. but I think 386->486->pentium has only incremental bus changes so you could scale up over time
<sorear> [34]86, or, "back in my day we did multiprocessing *without* anything equivalent to CAS"
<TD-Linux> sorear, but how do you modern folks do multitasking without a TSS???
<sorear> tss is mostly ignorable? it never did much of anything you couldn't do otherwise on other architectures
<sorear> it's of no theoretical impact
<whitequark> sorear: the iopl map is mildly interesting
<whitequark> it's like a shitty version of paging though
<whitequark> with better granularity
<sorear> the iopl map is sorta weirdly taped on to the tss
<fseidel> only part of TSS that really matters is ESP0, and you can
<fseidel> usually get a register for that on most ISAs
<sorear> i don't really understand the gotcha td-linux is trying to pull on me
<TD-Linux> I was being ironic™
<eddyb> oh hey why do the datasheets talk about address lines also being inputs for cache invalidation?
<fseidel> ah okay
<whitequark> eddyb: dma by peripherals into main ram, i think
<eddyb> can you do anything close to ccNUMA with 486s?
<TD-Linux> up until recent windows on 32 bit you could get direct io port access from a process. I dunno if that was via TSS or some other mechanism though
<eddyb> whitequark: oh, that makes far more sense than anything I could think of
<TD-Linux> this is relevant for my windows xp vm with a pci express parallel port iommu'd into it
* eddyb blinks
<eddyb> TD-Linux: I'm trying to figure out if 386SX could do PCI, is it at all possible with the 16-bit bus?
rohitksingh has joined ##openfpga
<TD-Linux> the pci bus is generally not directly attached to the fsb, so yes
<TD-Linux> though I don't know of any 386 machines that had pci
Jybz has joined ##openfpga
emeb_mac has quit [Ping timeout: 246 seconds]
rohitksingh has quit [Ping timeout: 246 seconds]
<eddyb> TD-Linux: hmm, can you regroup the 16-bit access? or am I silly and they are 32-bit over several clock cycles? or does PCI not require 32-bit?
<TD-Linux> it's just 32 bit in two halves
<eddyb> TD-Linux: cursed idea: run qemu on a RISC-V CPU on the FPGA
<whitequark> what's cursed about it
<whitequark> it's just slow
<eddyb> writing your own emulation, sure, but running a software emulator inside a hardware emulator... well, it would be tricky to provide e.g. ethernet to that, as opposed to a PCI ethernet board (or one implemented in the FPGA)
<whitequark> an fpga is not an emulator
<whitequark> it implements that circuit, it doesn't implement a different circuit with a behavior that is equivalent in some way
Jybz has quit [Ping timeout: 255 seconds]
<daveshah> Synopsys would disagree...
Jybz has joined ##openfpga
<whitequark> hrm
<whitequark> "Xilinx UltraScale FPGA, the industry’s most advanced emulation chip"
* whitequark stares
<whitequark> eddyb: anyway it's not much of a problem, you use the OS network driver in qemu and then implement the driver for whatever emulated ethernet you have on the FPGA
<whitequark> it's very doable
<whitequark> you bring it up step by step
<eddyb> yeah but isn't it easier to just skip the qemu and RISC-V and hook up the ethernet to the 386 directly?
<whitequark> yes? you were the one who suggested qemu in the first place
<whitequark> i mean, it's not always easier
<sorear> it's an emulator, just look at the number of I/O standards it can support
<sorear> /notentirelyserious
<whitequark> if you already have linux running on risc-v in an fpga, it's easier to run qemu on it than to connect a 386 to it
<whitequark> in no small part because well
<whitequark> what kind of stuff would you run on that 386? DOS? have you seen how much of a nightmare DOS networking is?
<eddyb> either way, by "hardware emulator" I meant the chipset/motherboard emulation by gateware/uC... wait, is this one of those "emulator" vs "simulator" things?
<eddyb> I barely memorized concurrency vs parallelism :P
wren6991 has joined ##openfpga
<eddyb> whitequark: ah, yes, then I agree. either by using someone else's SoC or building it completely without the 386 connected to it
<eddyb> at that point you might be able to skip the FPGA if you can get RAM working without the uC being in the loop for that. but what's the fun in that?
<whitequark> what
Jybz has quit [Quit: Konversation terminated!]
<xobs> Hey all. Does anyone here have any familiarity with some of the features of the ICE40 SB_WARMBOOT?
<xobs> In particular, I'm looking for a way to pass state between reboots.
<xobs> I know the Lattice documentation says that you can leave BRAM uninitialized, but I haven't seen /how/ to do that. Except maybe to leave that part out of the bitstream.
<sorear> I'm pretty sure that's exactly what that means
<sorear> the bitstream is a list of commands, either it contains an "initialize this BRAM with this data" command or it doesn't
<xobs> sorear: do you have more information? nextpnr says that I'm using 29/30 "ICESTORM_RAM" blocks, but when I look at the bitstream I only see it setting 4 BRAM banks.
<xobs> And is it possible to only partially initialize a BRAM bank? Or is it an all-or-nothing kind of deal?
<sorear> i would assume, that the former is all RAMs that are inferred while the latter is just those with an `initial` value
<sorear> I don't know, but if a HDL memory larger than 256x16 is split among multiple physical BRAMs, you could partially initialize the HDL-level memory
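To illustrate how an HDL-level memory larger than 256x16 spreads over several physical BRAMs, here is a rough sketch. It assumes a naive tiling onto 256x16 blocks only; real synthesis may pick a different aspect ratio, so treat the count as an estimate. The function name is hypothetical.

```python
import math

BRAM_DEPTH = 256  # words per physical BRAM in the 256x16 configuration
BRAM_WIDTH = 16   # bits per word in that configuration


def brams_needed(depth: int, width: int) -> int:
    """Naive tiling of a depth x width memory onto 256x16 blocks."""
    return math.ceil(depth / BRAM_DEPTH) * math.ceil(width / BRAM_WIDTH)


print(brams_needed(1024, 32))  # 8
```

Each of those physical blocks either has initialization data in the bitstream or does not, which is what makes partial initialization of the HDL-level memory plausible.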
<xobs> I guess the question then becomes: how can I ensure a BRAM is uninitialized, and fixed to a given ICESTORM_RAM unit, such that multiple bitstreams share the same uninitialized memory?
<daveshah> You can manually lock a BRAM's position by inferring a SB_RAM40_4K and setting the "BEL" attribute to the name of a BRAM bel (see the list of bels in the GUI)
<daveshah> nextpnr always initialises all four BRAM quadrants currently
<daveshah> You'd have to modify it to not initialise BRAM quadrants containing no initialised BRAMs
<xobs> daveshah: ooh, cool! that's very promising.
<xobs> and it sure beats the approach I was considering taking, which was to write something into the external SPI flash indicating state.
<daveshah> If Yosys' inferred BRAM naming is stable enough (I think it is but you'd have to check), you could also set placement constraints using a Python pre-pack script
Asu has joined ##openfpga
<daveshah> eg ctx.cells["name"].attrs["BEL"] = "BEL name"
<xobs> I'll have to see how well that works, but it's good to have an approach I can take.
<xobs> I'll probably need to optimize my design somewhat. I'm still super impressed it routes and meets timing.
<xobs> "ICESTORM_LC: 5252/ 5280 99%" (for those of you who aren't in #tomu)
<daveshah> Not bad!
<daveshah> I think the iCE40 has a high routing to logic ratio compared to many FPGAs
<xobs> Ah, that would explain it then. That makes me feel much better about it working.
<sorear> if you have a large random graph of LUTs, the amount of routing resources you need (total wire length) scales as the third power of the chip size, because the wires get longer AND you need more of them
<sorear> real designs are not random graphs (cf Rent's rule) but the scaling laws still give small FPGAs an easier time having enough routing
<xobs> Along those lines, the critical path slowing my 12 MHz domain down looks suspicious. I see lots of references to "adder". Is there a good way to trace down the inefficiencies?
<xobs> Unfortunately, after scala -> python -> yosys -> json -> nextpnr, things get a bit lost in translation. "Sink $abc$53477$auto$blifparse.cc:492:parse_blif$53784_LC.I1" for example.
<sorear> so, of all the pipeline steps ABC (logic optimizer) is the worst offender in terms of mangling otherwise useful node names
<sorear> there's been some work recently on alternatives (whitequark's "synth_ice40 -noabc -relut" IIRC) but if you're at 99% ICESTORM_LC *with* ABC you may not be able to test this
<daveshah> ABC does preserve about 50-70% of net names and src attributes with dress now
<daveshah> But it doesn't copy these onto the cells at the moment
<xobs> That ends up at 118%, but might be worth looking at if I pull out some of the logic.
<whitequark> wow, only 118%?
<whitequark> that's actually a really good result
<whitequark> I'd expect more like 150%
<whitequark> since -noabc doesn't even try to optimize for area
<daveshah> If you get rid of any UltraPlus IP you could look at the critical path on LP8K
<daveshah> It should be more or less proportionally slower
<daveshah> *up5k is proportionally slower than lp8k
<xobs> daveshah: that's a good idea. the UP5K IP I have is the RAM. Which can be mocked around.
<daveshah> For the adders, BTW, Yosys tends to discard useful names but preserves the src attribute
<daveshah> Running 'rename -src' followed by 'write_json' might help (although will only go as far as the Verilog)
<whitequark> I think the naming of adders in Yosys is a bug that needs to be fixed
<wren6991> daveshah: Oooh that's really helpful, alumacc is the next worst offender for me after abc, this seems to help
<wren6991> Oh I just noticed that setup on SB_IO CLOCK_ENABLE is marked as a clk -> async path. That doesn't seem right?
<wren6991> Although I guess it's tricky as there are two clocks in the IO tile which can sample it
<whitequark> wren6991: i've implemented rename -src after being frustrated with alumacc, specifically
<wren6991> whitequark: yay :) it's awesome, thank you
<wren6991> For SB_IO: it seems like your design could be reported as meeting timing, but actually fail to meet setup on CLOCK_ENABLE? Luckily I have a little bit of slack on it
<daveshah> It should also count as a setup path for any used clocks
<daveshah> The async path will probably occur if either input or output clock is unused
<wren6991> Thank you, and you're right, in this case it's just an output clock
<wren6991> Although maybe that logic doesn't make sense, because if you have two independent clocks then you can't really drive CLOCK_ENABLE synchronously, whereas if you're only using one clock, it will be sampled synchronous to that clock if it's driven
<daveshah> Yes, the timing analysis might be better off ignoring disconnected clock ports than treating them as async
<wren6991> Yes, if you aren't driving a clock, you are most likely using the nonregistered path for that direction (in/out) anyway, so CLOCK_ENABLE wouldn't be pertinent to that path
<whitequark> if you're not driving a clock and using the registered path it's just a constant 0
<whitequark> which is safe albeit meaningless
<wren6991> Ooh did we confirm that those registers are tied to FPGA reset?
<wren6991> and it's useful for saving a bit of power without spending a LUT to drive the latch enable :)
<wren6991> (on input paths anyway)
wren6991 has quit [Quit: Page closed]
rohitksingh has joined ##openfpga
ZombieChicken has quit [Remote host closed the connection]
_whitelogger has joined ##openfpga
<tnt> I'm thinking it would be informative for the timing histogram to have independent bin sizes for the positive and negative ranges. (and also always have a bin boundary at 0). Does that appeal to anyone else ?
<whitequark> agree
Laksen has joined ##openfpga
<whitequark> daveshah: I need some help
<whitequark> are you familiar with the SAME_EDGE mode of ODDR/ODDR2?
<daveshah> Touched that stuff about 3 years ago, not that familiar though
<daveshah> xc7 I guess?
<whitequark> yes
<whitequark> well, if you use the same clock for C0 and C1 (inverted) on series 6 and set DDR_ALIGNMENT to C0 then it will be the same behavior
<whitequark> daveshah: so the question i have is
<whitequark> can i emulate that behavior on ice40?
<whitequark> i think i'll need a posedge flop for DDR output and negedge flop for DDR input
<whitequark> but i'm confused as to how exactly i would instantiate them
<daveshah> Yes, that is what I would assume
<daveshah> I think for output a posedge flop on the D_OUT_1 path should do it
<daveshah> Not so sure about input
<whitequark> yeah, output is easy
<whitequark> there is SAME_EDGE and SAME_EDGE_PIPELINED
<tnt> I'm pretty sure all you need to do to get SAME_EDGE is to put a fabric posedge FF in front of D_OUT_1
<whitequark> i think SAME_EDGE is one posedge flop on D_IN_1 and SAME_EDGE_PIPELINED is a posedge flop on D_IN_0 *and* D_IN_1
<whitequark> but I'm not sure
<daveshah> That makes sense
<tnt> yes +1
<daveshah> If Icarus accepts the Xilinx sim models, I'd do a side by side sim to be sure
gnufan_home1 has joined ##openfpga
gnufan_home has quit [Ping timeout: 244 seconds]
cr1901_modern has quit [Ping timeout: 246 seconds]
cr1901_modern has joined ##openfpga
<whitequark> daveshah: tnt: ack
<whitequark> I am thinking about providing basic DDR primitives in nmigen
<whitequark> universally supported on every architecture
<whitequark> I am thinking the output should be SAME_EDGE and input should be SAME_EDGE_PIPELINED to avoid timing horrors
<tnt> whitequark: sometimes you can't tolerate the latency ...
<whitequark> hmmm
<whitequark> of pipeline?
<daveshah> I really like the idea of generic DDR primitives
<whitequark> i have half of that code in glasgow in a very ad-hoc way
<whitequark> and i see people reimplement them constantly
<daveshah> Yes, I've been there with CSI stuff
<daveshah> Although that had deserialisation needs too
cr1901_modern1 has joined ##openfpga
cr1901_modern has quit [Ping timeout: 246 seconds]
<whitequark> daveshah: do you think SAME_EDGE_PIPELINED is useful enough or would i need to provide an option to use SAME_EDGE too?
<whitequark> imo if you want your code to be generic and portable you really have to accommodate 1 cycle of pipelining
<whitequark> anything less and you're free to use the primitives yourself, because it will probably be more than just that
<whitequark> but i may be wrong
<eddyb> whitequark: speaking of which, would you recommend nmigen, or bare RTLIL (ILANG?), as the target for some busywork-reducing DSL (I'd rather not touch Verilog)? nmigen has the obvious advantage of being able to implement the DSL in Python and construct objects instead of emitting text, and I'd probably want to avoid reimplementing some of its features too
<eddyb> but I'm less experienced/comfortable with Python atm
<whitequark> RTLIL is the name
<whitequark> i have no idea what your objective is
<daveshah> whitequark: yeah, I'd go for SAME_EDGE_PIPELINED
<eddyb> fair
<whitequark> daveshah: ack
<whitequark> the general nmigen design does not just do DDR, it allows arbitrary gearing
<whitequark> so the ecp5 4:1 primitives are fine, too, just provide ECLK
<whitequark> on xc7 this should probably actually use xSERDES, not xDDR
<daveshah> Very nice
<whitequark> ^_^
<daveshah> The only remaining thing I see are input/output delays
<whitequark> yes.
<whitequark> I was thinking about those.
<whitequark> I feel like I will inevitably need some form of delays
<whitequark> any ideas on how to expose them best?
<daveshah> I think a reasonably standard option would be a fixed delay in ps or variable delay with inc/reset inputs
<whitequark> hmm
<daveshah> ECP5 doesn't officially give the mapping from delay value to picosecond
<whitequark> i thought ecp5 only exposes edge-aligned/center-aligned?
<tnt> DDR is non-ambiguous but for higher ratio you need some way to sync.
<daveshah> You can work it out looking at SDF files
<daveshah> ECP5 also has a manual delay option
<whitequark> oh interesting
<daveshah> FYI, ECP5 IDDRX1F is SAME_EDGE_PIPELINED only
<daveshah> Looking at the vendor model
<whitequark> ooh I see
<daveshah> DEL_MODE set to USER_DEFINED and provide a DEL_VALUE between 0 and 127
<daveshah> It also has load/direction/increment inputs
<whitequark> wait, how does it even do edge-aligned/center-aligned?
<whitequark> does it just use the clock constraint to work out picoseconds or something?
<daveshah> It's not even based on clock constraint from my experience
<daveshah> Just a fixed value
<whitequark> what.
<whitequark> how does that even work??
<daveshah> It's to compensate for internal clock network delays
<whitequark> oh wow
<daveshah> Yeah, I expected it to be based on clock constraint or at least speed grade too
<whitequark> hmm, so there would be additional fields in the primitive for delays then, right?
<whitequark> right now there's i0,i1,...iN for gearing 1:N
<whitequark> and o0,o1...oN as well as oe
<whitequark> all depending on pin configuration
<daveshah> So I'd have a fixed delay value parameter, eg in ps, and also increment and reset/load inputs for variable delay
<whitequark> and then there'd be something like .delay.rst, .delay.stb, .delay.dir?
<daveshah> Xilinx doesn't have dir
<whitequark> oh?
<whitequark> s7 has...
<whitequark> and xc6 has too
<whitequark> incdec and inc, respectively
<whitequark> unless i'm missing something
<eddyb> whitequark: so, my objective is more or less "make some state machines easier to write". I guess it would make more sense if I would take some example, involving a simple memory bus with arbitrary wait times, maybe stick "add together two vectors, one element at a time, over that bus" to it (to avoid having a full CPU), and implemented it in a few different HDLs
<daveshah> whitequark: not seeing INCDEC?
<daveshah> This is extracted from the Xilinx libs so should be accurate
<daveshah> You can dynamically load an arbitrary value too, which could emulate direction, but not sure if this has glitch issues compared to incrementing
<whitequark> that has INC
<daveshah> Right but no direction control
<whitequark> i think INC is direction, CE is strobe, no?
<whitequark> according to the doc i see
<whitequark> As long as CE remains High, IDELAY will increment or decrement by TIDELAYRESOLUTION
<whitequark> every clock (C) cycle. The state of INC determines whether IDELAY will increment or
<daveshah> Oh, I see
<whitequark> decrement; INC = 1 increments, INC = 0 decrements, synchronously to the clock (C). If CE
<whitequark> is Low the delay through IDELAY will not change regardless of the state of INC.
<daveshah> That makes sense
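The CE/INC rule quoted from the documentation can be modelled in a few lines. This is a behavioural sketch only, not vendor code: the tap count (32), the per-tap resolution (about 78 ps) and the wrap-around at the ends are assumptions made for illustration, and the class name is hypothetical.

```python
class DelayTapModel:
    """Behavioural sketch of the quoted CE/INC rule (not vendor code)."""

    def __init__(self, taps: int = 32, tap_ps: float = 78.0):
        self.taps = taps      # assumed number of taps
        self.tap_ps = tap_ps  # assumed resolution per tap, in picoseconds
        self.value = 0

    def clock(self, ce: bool, inc: bool) -> int:
        """Advance one clock (C) cycle and return the new tap value."""
        if ce:
            step = 1 if inc else -1  # INC = 1 increments, INC = 0 decrements
            self.value = (self.value + step) % self.taps  # wrap is an assumption
        return self.value

    @property
    def delay_ps(self) -> float:
        return self.value * self.tap_ps


d = DelayTapModel()
for _ in range(3):
    d.clock(ce=True, inc=True)   # three increments
d.clock(ce=False, inc=True)      # CE low: no change regardless of INC
d.clock(ce=True, inc=False)      # one decrement
print(d.value)                   # 2
```

In this reading, INC acts as the direction input and CE as the strobe, matching the `.delay.dir` / `.delay.stb` naming floated earlier.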
<eddyb> (and then come up with my own DSL-based version, that I would prefer to write. obviously it would probably suck for many things, but if it can do one thing at all I'd be glad. it could likely be just some Python classes/functions on top of nmigen, tbh, I should probably try that out first before going full DSL)
<whitequark> eddyb: do you mean like you want to simplify writing parsers or inverse parsers in gateware?
<eddyb> whitequark: I did see your tweet about the parser thing :P
<eddyb> this *might* be inspired by that
<whitequark> so what do you actually want to do
<eddyb> make it easier for myself to experiment with gateware, tbh. I find some/most of the existing solutions quite tedious and error-prone, but that is probably 99% inexperience talking
<whitequark> have you actually used nmigen
<eddyb> I've only read, not written, sorry :(
<eddyb> ugh, nevermind the DSL thing, it made more sense in my head before I asked it, I just need to make something small enough so I can focus on it, but non-trivial enough to better articulate the abstraction gaps I want to fill in
<tnt> :/ ... I've been searching for 30 min wtf my design stopped working ... turns out I accidentally switched minicom to Odd parity.
rohitksingh has quit [Ping timeout: 246 seconds]
<whitequark> daveshah: is there any difference between DELAYF and DELAYG?
<whitequark> they look basically the same other than the pins
X-Scale has quit [Ping timeout: 246 seconds]
<daveshah> yeah the only difference is one has dynamic control inputs (DELAYF I think) and the other doesn't
cr1901_modern1 has quit [Quit: Leaving.]
cr1901_modern has joined ##openfpga
<whitequark> i mean, you can use any combination of DELAYF and DELAYG anywhere in the design right?
<whitequark> or are there some restrictions?
X-Scale has joined ##openfpga
emeb has joined ##openfpga
rohitksingh has joined ##openfpga
<daveshah> whitequark: yeah, there is one DELAY block per IO with separate control set from all other DELAYs
<daveshah> The only limitation is you can't have both an input and an output delay on the same pin
<whitequark> but this applies to both DELAYF and DELAYG, right?
<daveshah> Both types can be mixed freely and there's no control block, unlike Xilinx
<daveshah> Yes
<whitequark> ack
<whitequark> xilinx has a control block?
<tnt> IDELAYCTRL
<whitequark> how does it work? i can't quite grasp it
<tnt> you just feed it a 200 MHz clock that some internal logic will use to calibrate the delay taps.
<G33KatWork> instantiate one per IO bank, supply refclk, use ODELAY and IDELAY on that bank. The delay taps are tied to the reference clock of course and there are some restrictions what reference clocks you can supply based on the speed grade. at least that was the case for me when I used it on a zynq
<tnt> you need a proper rst to it when the clock is stable and it will assert rdy when it's all ready to process the IO delay commands.
<whitequark> how do you associate it to IO bank?
<tnt> location constraint in the UCF for instance
<whitequark> so you have to propagate its name to UCF? gross
<tnt> well I think you can also use attributes on the instance.
<tnt> (* LOC="..." *) IIRC for Xilinx.
<whitequark> ah
<whitequark> hmm
<tnt> you might need a big table with io->pin to IDELAYCTRL location for every xilinx device though ...
<whitequark> that's kinda really annoying
<whitequark> but i guess i can always punt to the user
<whitequark> the clock needs to be specified manually anyway...
<whitequark> does IDELAYCTRL clock need to relate to IO clock?
<G33KatWork> seems not to be the case. At least I can't find anything in the documentation
<whitequark> wait, the clock freq is fixed?
<whitequark> so why can't i just instantiate all of them?
<whitequark> power?
<tnt> you can. I'm pretty sure I've done it in the past.
rohitksingh has quit [Ping timeout: 244 seconds]
<emeb> tnt: hilarious how much up5k can be overclocked - mangling your riscv-usb project to learn and got PLL feedback divider wrong for my board. Was running it @ 1.5x faster than timing expected and it still worked fine.
<tnt> emeb: yeah, something I still have to measure is how well the timing estimates from nextpnr match the real max freq. Not sure how to do that meaningfully though.
<whitequark> does nextpnr do pvt corners?
<emeb> tnt: was it you who posted analysis of speed vs Vcore recently?
<emeb> showed nice linear relation IIRC.
<tnt> yes
<tnt> but that doesn't really tell anything wrt to nextpnr timing model vs reality.
<daveshah> whitequark: it doesn't support adjustment of pvt
<daveshah> Its data structures do support min and max delays
<daveshah> These will be used once we have hold time analysis done
<daveshah> It would be easy enough to add voltage/temp options
<daveshah> Speed grades for ECP5 are effectively a case of this and are supported already
<sorear> in principle, you can use the nextpnr timing model to calculate a frequency for a ring oscillator and then compare that to reality
<daveshah> (I guess the 5G ECP5 variant is actually a different Vcore too)
<sorear> but neither nextpnr nor icetime supports that, so you'd need to find someone who understands the timing model well enough to calculate by hand (not me)
<whitequark> daveshah: so i'm thinking, the timing fuzzer (?) y'all have built
<tnt> sorear: yeah, I "tried" sort of. By breaking the ring oscillator and putting a register in the middle and use the path len as a guide for the delay.
<whitequark> can it be used for qualifying the real device too?
<daveshah> The iCE40 one not, that's basically just an sdf parser
<daveshah> The ecp5 one could be
<sorear> I'd feel comfortable doing that if you had a few hundred stage ring oscillator
<sorear> if it's just 3 or 5 stages, the register will add a lot of delay
<daveshah> It assigns different pip types classes, then builds a system of linear equations between pip class delays and vendor timing analysis value for that net
<daveshah> Knowing the routed path of the net
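The fitting step described above (per-class pip delays recovered from per-net totals) can be sketched as an ordinary least-squares problem. This is not the actual fuzzer code: the data, the class count and all function names below are hypothetical, and the solver is a plain Gauss-Jordan elimination on the normal equations.

```python
def solve(a, b):
    """Solve a square linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_class_delays(counts, net_delays):
    """Least-squares fit; counts[i][j] = pips of class j used by net i."""
    k = len(counts[0])
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(k)]
           for i in range(k)]
    atb = [sum(row[i] * d for row, d in zip(counts, net_delays))
           for i in range(k)]
    return solve(ata, atb)


# Hypothetical data: 4 routed nets, 3 pip classes, delays in ns.
counts = [[2, 1, 0], [1, 0, 3], [0, 2, 1], [1, 1, 1]]
true_delays = [0.5, 0.8, 0.3]
net_delays = [sum(c * t for c, t in zip(row, true_delays)) for row in counts]

estimated = fit_class_delays(counts, net_delays)
print([round(x, 3) for x in estimated])
```

With measured rather than vendor-reported net delays as the right-hand side, the same fit would in principle characterize a real device, which is the use raised in the conversation.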
<whitequark> yeah that one
<daveshah> Yeah, that would work
<daveshah> You'd need a way of measuring the delays
<daveshah> I know someone who spent many years on this problem
<daveshah> (ignore the cursed name of the technique)
<tnt> sorear: the timing report details the estimated time for each segment, so I'm just ignoring the reg setup time and clock-to-out estimate.
<sorear> this is probably obvious but one-way delays don't matter (except for I/O), only delays summed over a loop that returns to the same point on the chip…
gsi__ is now known as gsi_
Jybz has joined ##openfpga
<tnt> Ok, so according to nextpnr, the path delay should be 45 ns. That should be the half period of the ring oscillator, giving a 90 ns period and a 11.1 MHz frequency. The real frequency I get is 17.44 MHz.
<tnt> Can't see a difference using I0/I1/I2/I3.
<tnt> (it's probably swamped in all the other delays)
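The ring-oscillator comparison above works out as follows. A small sketch using only the two numbers quoted in the messages (variable names are illustrative):

```python
PREDICTED_HALF_PERIOD_NS = 45.0  # path delay reported by nextpnr
MEASURED_FREQ_MHZ = 17.44        # frequency observed on the real device

# A ring oscillator's period is twice the loop delay.
predicted_freq_mhz = 1e3 / (2 * PREDICTED_HALF_PERIOD_NS)
measured_half_period_ns = 1e3 / (2 * MEASURED_FREQ_MHZ)
ratio = MEASURED_FREQ_MHZ / predicted_freq_mhz

print(f"predicted: {predicted_freq_mhz:.2f} MHz")               # 11.11 MHz
print(f"measured loop delay: {measured_half_period_ns:.2f} ns")  # 28.67 ns
print(f"real / predicted: {ratio:.2f}x")                         # 1.57x
```

So in this one measurement the hardware ran roughly 1.57 times faster than the timing model predicted.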
rohitksingh has joined ##openfpga
rohitksingh has quit [Ping timeout: 246 seconds]
<emeb> hmm... for some reason my SPI flash erase/write stuff has stopped working.
<emeb> read of existing data still works fine, but it acts like protection is preventing erase & write.
<emeb> yet status register reads 0x00.
<tnt> emeb: strange it would enable by itself ...
<emeb> tnt: what do you mean "enable by itself" ?
<emeb> I preface all my erase & write operations with a write enable cmd.
<emeb> and check the status after completing those commands and see that status reads 0x03 for a bit until those operations finish.
<emeb> but prior to issuing WEN + erase or WEN + page write I see no protection
<tnt> oh, nm, I misread, I thought that you had checked a protection register and it was reading as protected.
<emeb> tnt: I also issue a cmd 0x98 global unlock but that doesn't seem to do anything.
<whitequark> emeb: what does the status register look like after a failed write or erase?
<emeb> whitequark: it reads 0x00
<emeb> well, it reads 0x03 for a bit, then 0x00
<emeb> so it acts like it's doing something, but I don't see a change in the memory contents
<whitequark> that doesn't look like protection then
<whitequark> yeah
<whitequark> what flash?
<emeb> W25Q32JV
<emeb> (same winbond that lots of folk use)
<whitequark> that's super interesting
<whitequark> any LA trace?
<emeb> whitequark: not ATM - I'm just banging on it w/ firmware in the FPGA right now. Need to hook up the scope and see what the SPI bus is doing next.
<whitequark> can you read S8 and S9 registers?
<emeb> sure - need to write some code for that. just a sec...
<emeb> whitequark: a bit confused about what S8/S9 are - don't see those mentioned in the datasheet.
<whitequark> emeb: 35h/15h
<emeb> kk
<emeb> ah - got it. those are bits in status reg 2
<emeb> s8 (SRL) is 0
<emeb> s9 (QE) is 1 - that's odd.
<emeb> didn't turn on QSPI mode.
<emeb> and there's the warning about QE being enabled while the /wp and /hold pins are strapped to the rails....
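The status values mentioned in this exchange can be decoded with a couple of bit masks. A minimal sketch: the S8 (SRL) and S9 (QE) positions come from the messages above, BUSY and WEL are taken as the two low bits of status register 1, and the example value 0x02 for status register 2 is hypothetical (only its two low bits were reported). Function names are illustrative.

```python
# Status register 1 (read while an erase/program is in flight)
BUSY = 1 << 0  # S0: operation in progress
WEL = 1 << 1   # S1: write enable latch

# Status register 2, low two bits
SRL = 1 << 0   # S8: status register lock
QE = 1 << 1    # S9: quad enable


def decode_sr1(value: int) -> dict:
    return {"BUSY": bool(value & BUSY), "WEL": bool(value & WEL)}


def decode_sr2(value: int) -> dict:
    return {"SRL": bool(value & SRL), "QE": bool(value & QE)}


print(decode_sr1(0x03))  # {'BUSY': True, 'WEL': True}
print(decode_sr1(0x00))  # {'BUSY': False, 'WEL': False}
print(decode_sr2(0x02))  # {'SRL': False, 'QE': True}
```

Read this way, 0x03 followed by 0x00 means the chip accepted the write enable and reported busy for a while, which fits the observation that it "acts like it's doing something".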
GuzTech has joined ##openfpga
Jybz has quit [Read error: Connection reset by peer]
Jybz has joined ##openfpga
ZombieChicken has joined ##openfpga
GuzTech_ has joined ##openfpga
<emeb> weird - can't seem to clear that QE bit either. always comes up set even after a WEN and write status reg 2 + 00
<emeb> like the chip is permanently write protected.
GuzTech has quit [Ping timeout: 246 seconds]
<daveshah> Have you checked Vcc is good?
<emeb> Yeah - solid 3.3V
GuzTech_ has quit [Remote host closed the connection]
GuzTech has joined ##openfpga
emeb has quit [Quit: Leaving.]
emeb_mac has joined ##openfpga
emeb_mac has quit [Ping timeout: 246 seconds]
Asu has quit [Read error: Connection reset by peer]
Asu has joined ##openfpga
Jybz has quit [Quit: Konversation terminated!]
GuzTech_ has joined ##openfpga
GuzTech_ has quit [Remote host closed the connection]
GuzTech has quit [Ping timeout: 240 seconds]
Asu has quit [Remote host closed the connection]