X-Scale` has joined ##openfpga
X-Scale has quit [Ping timeout: 268 seconds]
X-Scale` is now known as X-Scale
inoor has joined ##openfpga
emeb_mac has joined ##openfpga
X-Scale has quit [Ping timeout: 265 seconds]
X-Scale has joined ##openfpga
finsternis has quit [Excess Flood]
finsternis has joined ##openfpga
OmniMancer has joined ##openfpga
rohitksingh has quit [Ping timeout: 264 seconds]
Bike has quit [Quit: Lost terminal]
_whitelogger has joined ##openfpga
freemint has quit [Ping timeout: 268 seconds]
_whitelogger has joined ##openfpga
rohitksingh has joined ##openfpga
rohitksingh has quit [Ping timeout: 250 seconds]
rohitksingh has joined ##openfpga
jemk has quit [Ping timeout: 246 seconds]
jemk has joined ##openfpga
Jybz has joined ##openfpga
_whitelogger has joined ##openfpga
_whitelogger has joined ##openfpga
inoor has quit [Quit: inoor]
emeb_mac has quit [Quit: Leaving.]
Asu has joined ##openfpga
bwidawks has joined ##openfpga
bwidawsk has quit [Ping timeout: 250 seconds]
bwidawks is now known as bwidawsk
SpaceCoaster has quit [Ping timeout: 250 seconds]
SpaceCoaster has joined ##openfpga
freemint has joined ##openfpga
rohitksingh has quit [Ping timeout: 250 seconds]
rombik_su has joined ##openfpga
<ZirconiumX> I think one of the most important skills for FPGA work is having good internal estimates for how big something should be
<daveshah> 10% bigger than the FPGA you are using :p
<ZirconiumX> Like, I can synthesise a pixel pipeline for ECP5 and get a design with 736 LUT4s and 1489 FFs, but I have no idea if that's good or inefficient for this
<rombik_su> Do you mean, like, how efficient your RTL code is? Or how well it can be inferred by a particular tool?
<qu1j0t3> daveshah: :-)
<ZirconiumX> rombik_su: The former
_whitelogger has joined ##openfpga
Bike has joined ##openfpga
freemint has quit [Ping timeout: 245 seconds]
lutsabound has joined ##openfpga
OmniMancer has quit [Quit: Leaving.]
emeb has joined ##openfpga
balrog has quit [Quit: Bye]
balrog has joined ##openfpga
<sorear> “pixel pipeline” is not a single well defined thing
<ZirconiumX> Sure, but I can send you the reference I'm working from plus the source code
rohitksingh has joined ##openfpga
<hackerfoo> There's probably not a good answer for how big something should be because there are many tradeoffs.
<hackerfoo> And it doesn't matter as long as it does what you want running on the hardware you have.
<ZirconiumX> Given that there is barely any size difference between one pipeline and sixteen, I'm pretty sure I have a bug somewhere...
show has quit [Quit: WeeChat 2.5]
forksand has quit [Ping timeout: 265 seconds]
show has joined ##openfpga
forksand has joined ##openfpga
Jybz has quit [Quit: Konversation terminated!]
mumptai has joined ##openfpga
emeb_mac has joined ##openfpga
Asu has quit [Ping timeout: 268 seconds]
rombik_su has quit [Quit: Leaving]
Bob_Dole has joined ##openfpga
zkms has quit [Quit: zkms]
zkms has joined ##openfpga
Asu has joined ##openfpga
<ZirconiumX> So, I'm presuming when your design synthesises to 24k DFFs, you need to look at optimising it
<ZirconiumX> I mean, that's "only" 1.5k DFFs per pipeline, but still
<daveshah> Well, depends what your design is
<daveshah> If a Core i9 CPU synthesised to 24k DFFs then I'd say there'd be little need to optimise further
<ZirconiumX> 16-pixel GPU pipeline
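As a rough sanity check on the figures quoted above, the sketch below scales the single-pipeline numbers (736 LUT4s, 1489 FFs) linearly to 16 pipelines and compares the result with the 24k DFF total. The function names are hypothetical, and the check assumes the single-pipeline and 16-pipeline counts are comparable, which the chat does not confirm (they may come from different toolchains or targets).

```python
# Figures quoted in the chat for one pixel pipeline synthesised for ECP5.
LUT4_PER_PIPELINE = 736
FF_PER_PIPELINE = 1489


def expected_resources(pipelines, luts=LUT4_PER_PIPELINE, ffs=FF_PER_PIPELINE):
    """Naive linear estimate: assumes no logic is shared between pipelines."""
    return {"lut4": pipelines * luts, "ff": pipelines * ffs}


def scaling_ratio(measured_total, measured_single, pipelines):
    """Measured total divided by ideal linear growth.

    A value near 1.0 means the design scales linearly. A value far below
    1.0 suggests logic was shared or optimised away, which is one symptom
    of the kind of bug mentioned earlier in the conversation.
    """
    return measured_total / (measured_single * pipelines)


estimate = expected_resources(16)
ratio = scaling_ratio(24_000, FF_PER_PIPELINE, 16)
```

With these inputs the linear estimate for 16 pipelines is 11776 LUT4s and 23824 FFs, so a 24k DFF total is consistent with roughly 1.5k DFFs per pipeline.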
<ZirconiumX> Really I should build a front-end for it (so that Quartus won't yell at me for it not fitting into the I/O pins of my chip) and then try to meet timing
Asu has quit [Remote host closed the connection]
<daveshah> Does Intel have hard shift registers?
<ZirconiumX> Not that I know of
<daveshah> Xilinx has them and they make pipeline delays much more efficient subject to some constraints (eg no reset)
<ZirconiumX> I suppose we shall see
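To illustrate why LUT-based shift registers make pipeline delays cheaper, here is a minimal cost sketch. It assumes each LUT can act as a shift register of up to 32 bits (as with the Xilinx SRLC32E primitive; older parts use the 16-bit SRL16E) and that the delay line needs no reset, per the constraint mentioned above. The function name and example widths are hypothetical.

```python
import math


def delay_line_cost(width, depth, srl_depth=32):
    """Return (flip-flops, shift-register LUTs) for a width-bit, depth-stage delay.

    ff_cost:  one flip-flop per bit per stage.
    srl_cost: one LUT per bit per srl_depth stages (assumed SRL capacity).
    """
    ff_cost = width * depth
    srl_cost = width * math.ceil(depth / srl_depth)
    return ff_cost, srl_cost


# Example: delaying a 32-bit RGBA value by 8 cycles.
ff_cost, srl_cost = delay_line_cost(32, 8)
```

Under these assumptions the 8-cycle, 32-bit delay costs 256 flip-flops when built from registers, or 32 LUTs when built from shift-register primitives.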
zkms has quit [Ping timeout: 276 seconds]
<sorear> so lattice gives you a single nearly-free DFF after each LUT
<sorear> I think retiming would allow pipeline stages to be hidden in the preceding or following logic in many cases?
zkms has joined ##openfpga
<whitequark> yes
<daveshah> I was thinking about ZirconiumX's previous comment about 736 LUT4s and 1489 FFs, so that wouldn't work
<ZirconiumX> That's for a single pipeline (so far)
<daveshah> And indeed on ECP5 in theory the LUTs and FFs are usable separately
<daveshah> So no need for retiming
<daveshah> Although routing congestion requires some care as to packing density
<ZirconiumX> On Cyclone V, you can use LUTs and spare DFFs separately
<sorear> can you say what exactly you're counting when you say "16 pixels"?
<sorear> was this the PS2 emulator or am I thinking of someone else's project?
<ZirconiumX> This is part of the PS2 GPU emulator, yeah
<ZirconiumX> sorear: each pipeline is RGBA32, plus X/Y/Z, texture coordinates and channel culling
<ZirconiumX> (I'm calling it culling; I don't know the correct term for it, but you can selectively not overwrite certain channels)
wpwrak has quit [Ping timeout: 240 seconds]
wpwrak has joined ##openfpga
<mwk> masking?
lopsided98 has quit [Quit: Disconnected]
lopsided98 has joined ##openfpga
mumptai has quit [Remote host closed the connection]
freemint has joined ##openfpga
emeb has quit [Quit: Leaving.]
<adamgreig> what document is best for understanding ecp5 I/O pads? specifically the available primitives and their attributes etc. I've got tn 2032 but it seems a bit lacking...
<adamgreig> i'd really like the ecp5 equivalent of the ice40 "technology library" doc
<adamgreig> or if anyone knows if lvds pairs (non-serdes) can be runtime swapped from in to out, that would be good too :p
<daveshah> adamgreig: the direct equivalent is http://www.latticesemi.com/view_document?document_id=52656
<daveshah> Just using a regular bidirectional IO in LVDS mode is fine
<daveshah> Unlike Xilinx there's no need to use a different primitive for differential IOs
<adamgreig> ah cool, thanks!
<adamgreig> that doc looks perfect
<daveshah> The primitive you want to use is probably "BB"
<adamgreig> I see there's ILVDS and OLVDS
zkms has quit [Quit: zkms]
zkms has joined ##openfpga
<daveshah> Those primitives are redundant though
<daveshah> There's no reason to use them over a single ended primitive and IOSTANDARD of LVDS
<adamgreig> doesn't look like they're mentioned in any of the ecp5 specific docs either
<adamgreig> got you, cool