<ZirconiumX>
I think one of the most important skills for FPGA work is having good internal estimates for how big something should be
<daveshah>
10% bigger than the FPGA you are using :p
<ZirconiumX>
Like, I can synthesise a pixel pipeline for ECP5 and get a design with 736 LUT4s and 1489 FFs, but I have no idea if that's good or inefficient for this
<rombik_su>
Do you mean, like, how efficient your RTL code is? Or how well it can be inferred by a particular tool?
<qu1j0t3>
daveshah: :-)
<ZirconiumX>
rombik_su: The former
_whitelogger has joined ##openfpga
Bike has joined ##openfpga
freemint has quit [Ping timeout: 245 seconds]
lutsabound has joined ##openfpga
OmniMancer has quit [Quit: Leaving.]
emeb has joined ##openfpga
balrog has quit [Quit: Bye]
balrog has joined ##openfpga
<sorear>
“pixel pipeline” is not a single well-defined thing
<ZirconiumX>
Sure, but I can send you the reference I'm working from plus the source code
rohitksingh has joined ##openfpga
<hackerfoo>
There's probably not a good answer for how big something should be because there are many tradeoffs.
<hackerfoo>
And it doesn't matter as long as it does what you want running on the hardware you have.
<ZirconiumX>
Given that there is barely any size difference between one pipeline and sixteen, I'm pretty sure I have a bug somewhere...
show has quit [Quit: WeeChat 2.5]
forksand has quit [Ping timeout: 265 seconds]
show has joined ##openfpga
forksand has joined ##openfpga
Jybz has quit [Quit: Konversation terminated!]
mumptai has joined ##openfpga
emeb_mac has joined ##openfpga
Asu has quit [Ping timeout: 268 seconds]
rombik_su has quit [Quit: Leaving]
Bob_Dole has joined ##openfpga
zkms has quit [Quit: zkms]
zkms has joined ##openfpga
Asu has joined ##openfpga
<ZirconiumX>
So, I'm presuming when your design synthesises to 24k DFFs, you need to look at optimising it
<ZirconiumX>
I mean, that's "only" 1.5k DFFs per pipeline, but still
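A rough sanity check of the numbers above, as a minimal sketch: it assumes the copies share nothing, so the total should scale linearly from the single-pipeline figure of 1489 FFs quoted earlier. The function names are hypothetical and not from any tool.

```python
def scaled_estimate(per_unit: int, copies: int) -> int:
    """Naive linear estimate: assumes no logic is shared between copies."""
    return per_unit * copies


def scaling_ratio(measured_total: int, per_unit: int, copies: int) -> float:
    """Measured size divided by the naive estimate.

    Well below 1.0 hints that copies were merged or optimised away;
    well above 1.0 hints at overhead outside the replicated block.
    """
    return measured_total / scaled_estimate(per_unit, copies)


if __name__ == "__main__":
    ffs_one_pipeline = 1489  # single-pipeline FF count from the log
    print(scaled_estimate(ffs_one_pipeline, 16))                   # 23824
    print(round(scaling_ratio(24_000, ffs_one_pipeline, 16), 2))   # 1.01
```

A ratio close to 1 is consistent with sixteen independent copies. The earlier observation of "barely any size difference between one pipeline and sixteen" would show up as a ratio near 1/16.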
<daveshah>
Well, depends what your design is
<daveshah>
If a Core i9 CPU synthesised to 24k DFFs then I'd say there'd be little need to optimise further
<ZirconiumX>
16-pixel GPU pipeline
<ZirconiumX>
Really I should build a front-end for it (so that Quartus won't yell at me for it not fitting into the I/O pins of my chip) and then try to meet timing
Asu has quit [Remote host closed the connection]
<daveshah>
Does Intel have hard shift registers?
<ZirconiumX>
Not that I know of
<daveshah>
Xilinx has them and they make pipeline delays much more efficient subject to some constraints (eg no reset)
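To put a number on why LUT-based shift registers help with pipeline delays, here is a back-of-the-envelope sketch. It assumes one shift-register LUT holds 16 stages of a single bit (as with an SRL16-style primitive) and ignores any extra output register. The function names and the default depth are illustrative assumptions.

```python
import math


def ff_delay_cost(width: int, depth: int) -> int:
    """FFs needed to delay a `width`-bit value by `depth` cycles using plain registers."""
    return width * depth


def srl_delay_cost(width: int, depth: int, srl_depth: int = 16) -> int:
    """Shift-register LUTs needed for the same delay.

    Assumes each LUT stores up to `srl_depth` stages of one bit and that
    the delayed signal needs no reset (the constraint mentioned above).
    """
    return width * math.ceil(depth / srl_depth)


if __name__ == "__main__":
    # A 32-bit value delayed by 8 cycles.
    print(ff_delay_cost(32, 8))    # 256 flip-flops
    print(srl_delay_cost(32, 8))   # 32 shift-register LUTs
```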
<ZirconiumX>
I suppose we shall see
zkms has quit [Ping timeout: 276 seconds]
<sorear>
so lattice gives you a single nearly-free DFF after each LUT
<sorear>
I think retiming would allow pipeline stages to be hidden in the preceding or following logic in many cases?
zkms has joined ##openfpga
<whitequark>
yes
<daveshah>
I was thinking about ZirconiumX's previous comment about 736 LUT4s and 1489 FFs so that wouldn't work
<ZirconiumX>
That's for a single pipeline (so far)
<daveshah>
And indeed on ECP5 in theory the LUTs and FFs are usable separately
<daveshah>
So no need for retiming
<daveshah>
Although routing congestion requires some care as to packing density
<ZirconiumX>
On Cyclone V, you can use LUTs and spare DFFs separately
<sorear>
can you say what exactly you're counting when you say "16 pixels"?
<sorear>
was this the PS2 emulator or am I thinking of someone else's project?
<ZirconiumX>
This is part of the PS2 GPU emulator, yeah
<ZirconiumX>
sorear: each pipeline is RGBA32, plus X/Y/Z, texture coordinates and channel culling
<ZirconiumX>
(I'm calling it culling; I don't know the correct term for it, but you can selectively not overwrite certain channels)
wpwrak has quit [Ping timeout: 240 seconds]
wpwrak has joined ##openfpga
<mwk>
masking?
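The "selectively not overwrite certain channels" behaviour can be sketched as a per-bit write mask on a 32-bit pixel. This is only an illustration: the convention that a set mask bit protects the old value and the byte position of each channel are assumptions, not taken from the project's source.

```python
def masked_write(old: int, new: int, protect_mask: int) -> int:
    """Combine a new 32-bit pixel with the existing one.

    Bits set in `protect_mask` keep the old framebuffer value;
    all other bits take the new value. (Assumed convention.)
    """
    keep = old & protect_mask
    update = new & ~protect_mask
    return (keep | update) & 0xFFFFFFFF


if __name__ == "__main__":
    old_pixel = 0x11223344
    new_pixel = 0xAABBCCDD
    protect_top_byte = 0xFF000000  # e.g. alpha, if alpha sits in the top byte
    print(hex(masked_write(old_pixel, new_pixel, protect_top_byte)))  # 0x11bbccdd
```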
lopsided98 has quit [Quit: Disconnected]
lopsided98 has joined ##openfpga
mumptai has quit [Remote host closed the connection]
freemint has joined ##openfpga
emeb has quit [Quit: Leaving.]
<adamgreig>
what document is best for understanding ecp5 I/O pads? specifically the available primitives and their attributes etc. I've got tn 2032 but it seems a bit lacking...
<adamgreig>
i'd really like the ecp5 equivalent of the ice40 "technology library" doc
<adamgreig>
or if anyone knows if lvds pairs (non-serdes) can be runtime swapped from in to out, that would be good too :p