freemint has quit [Ping timeout: 250 seconds]
wpwrak has quit [Ping timeout: 240 seconds]
wpwrak has joined ##openfpga
X-Scale has quit [Ping timeout: 240 seconds]
rohitksingh has quit [Remote host closed the connection]
rohitksingh has joined ##openfpga
Bike has quit [Quit: Lost terminal]
lutsabound has quit [Quit: Connection closed for inactivity]
X-Scale has joined ##openfpga
emeb_mac has quit [Quit: Leaving.]
OmniMancer has joined ##openfpga
Asu has joined ##openfpga
ym has joined ##openfpga
pie_ has joined ##openfpga
pie__ has quit [Ping timeout: 246 seconds]
ym has quit [Read error: Connection reset by peer]
Asu has quit [Remote host closed the connection]
freemint has joined ##openfpga
rohitksingh has quit [Ping timeout: 250 seconds]
Bob_Dole has quit [Read error: Connection reset by peer]
<ZirconiumX> daveshah: So I just noticed you committed a change for ABC9 that disables &mfs for ECP5. What does &mfs do, exactly?
<daveshah> ZirconiumX: it is some kind of SAT-based area recovery, aiui
<daveshah> But because of the way it brute forces things it struggles with entities with more than 6 inputs
<daveshah> There was a recent change on the abc side to try and fix that (it wasn't doing anything for ECP5 before)
<daveshah> But it seems to have broken totally now
<ZirconiumX> Right, okay, thank you
<ZirconiumX> 2.18. Executing XILINX_DSP pass (pack resources into DSPs).
<ZirconiumX> ERROR: Assert `nusers(P.extract_end(i)) <= 1' failed in ./passes/pmgen/xilinx_dsp_pm.h:588.
<ZirconiumX> I'm presuming this is a bug.
freemint has quit [Ping timeout: 250 seconds]
<daveshah> Yes, it looks like it
<ZirconiumX> Unfortunately I'm currently in class and don't have the time/battery power to bugpoint it
freemint has joined ##openfpga
<ZirconiumX> So it seems the pixel pipelines really like having multiplier blocks available.
<ZirconiumX> (alpha blending, I'd imagine)
freemint has quit [Ping timeout: 268 seconds]
freemint has joined ##openfpga
<GenTooMan> ZirconiumX: Alpha blending is one of several things you would use a multiplier on; a lot of transformations happen on color information
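To illustrate why blending maps onto multiplier blocks, here is a minimal sketch of a generic 8-bit alpha blend. It is a textbook formulation, not the Graphics Synthesizer's actual blend equation, and the names `alpha_blend` and `blend_pixel` are made up for this example. Each colour channel costs two multiplies per pixel.

```python
def alpha_blend(src: int, dst: int, alpha: int) -> int:
    """Blend one 8-bit colour channel; alpha is 0..255 (255 = opaque source)."""
    # Two multiplies per channel; +127 rounds to nearest before the divide.
    return (src * alpha + dst * (255 - alpha) + 127) // 255


def blend_pixel(src_rgb, dst_rgb, alpha):
    """Apply the same alpha to each of the three colour channels."""
    return tuple(alpha_blend(s, d, alpha) for s, d in zip(src_rgb, dst_rgb))


# Roughly half-transparent red over blue.
print(blend_pixel((255, 0, 0), (0, 0, 255), 128))  # (128, 0, 127)
```

In hardware, every one of those products would typically be mapped onto a DSP or multiplier block, which is consistent with the pipelines wanting many of them.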
mwk has quit [Ping timeout: 240 seconds]
Stary has quit [Ping timeout: 265 seconds]
freemint has quit [Remote host closed the connection]
mwk has joined ##openfpga
freemint has joined ##openfpga
<ZirconiumX> Yeah, I know; I'm the one writing a GPU here :P
freemint has quit [Ping timeout: 240 seconds]
<GenTooMan> Academic research? Curiosity? Challenge? Someone told you to?
<ZirconiumX> GenTooMan: 2 and 3 :P
freemint has joined ##openfpga
Stary has joined ##openfpga
azonenberg_work has quit [Ping timeout: 264 seconds]
carl0s has joined ##openfpga
freemint has quit [Ping timeout: 250 seconds]
freemint has joined ##openfpga
emeb has joined ##openfpga
ym has joined ##openfpga
<sorear> How fast are your pixel pipelines, and how fast do they need to be? Could you time-multiplex 16 architectural pipelines on fewer than 16 physical ones?
OmniMancer has quit [Quit: Leaving.]
<ZirconiumX> sorear: To emulate the hardware, each pipeline has to run at about 150MHz
<ZirconiumX> That would be "real time", as such
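As a rough sketch of the trade-off sorear is asking about: if 16 architectural pipelines at 150 MHz are time-multiplexed onto fewer physical ones, the physical clock has to rise in proportion. This assumes one pixel per pipeline per clock and no switching overhead, neither of which is stated in the discussion; `required_clock_mhz` is a hypothetical helper.

```python
ARCH_PIPELINES = 16      # pixel pipelines the emulated GPU exposes
ARCH_CLOCK_MHZ = 150.0   # approximate per-pipeline clock for real time


def required_clock_mhz(physical_pipelines: int) -> float:
    """Clock each physical pipeline needs to match total pixel throughput."""
    return ARCH_CLOCK_MHZ * ARCH_PIPELINES / physical_pipelines


for n in (16, 8, 4):
    print(f"{n:2d} physical pipelines -> {required_clock_mhz(n):.0f} MHz")
```

Under these assumptions, halving the number of physical pipelines doubles the clock needed to stay real time.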
<tnt> ZirconiumX: PSX ?
<ZirconiumX> PS2
<ZirconiumX> Don't call the PS1 the PSX, because the PSX is PS2 based
<tnt> Arf :)
ironsteel has joined ##openfpga
<ZirconiumX> Yes, SCEI are not known for their consistency
<ZirconiumX> Or common sense :P
<tnt> Well ... Xbox, Xbox 360, Xbox One, Xbox One X, Xbox One S ...
<ZirconiumX> 150MHz is possible on my Cyclone V chip, I *think*.
<ZirconiumX> Strictly, the PS2's GPU has 16 pixel pipelines, but only 8 are capable of texturing. Which seems a bit stupid to me.
<ZirconiumX> Regardless: you definitely can't fit a full GS on an iCE40, but it might fit on a decently-sized ECP5.
<tnt> Yeah, ice40 seemed like a stretch :)
<ZirconiumX> On the other hand, I'm trying to keep it at least semi-parametric so that if somebody needs some semi-crappy GPU, they have one
<ZirconiumX> It's by no means state of the art though :P
<ZirconiumX> https://pastebin.com/RFeu0r2c <--- synthesis statistics, if anybody's curious.
<sorear> ecp5 has more multipliers doesn’t it?
<ZirconiumX> It has multipliers
<ZirconiumX> It doesn't matter as much how many :P
<GenTooMan> Well at least you can fit it into the FPGA.
ironsteel has quit [Ping timeout: 240 seconds]
show has quit [Quit: WeeChat 2.6]
azonenberg_work has joined ##openfpga
<tpw_rules> ZirconiumX: do the gamecube one next
<tpw_rules> it's better imo
<ZirconiumX> The Flipper is a pretty complex thing
<ZirconiumX> It's a proto-programmable pipeline
<tpw_rules> exactly
<ZirconiumX> Compared to the Graphics Synthesizer, which is fixed function
<TD-Linux> ZirconiumX, I think it might have been just playing the spec game. at the time people were comparing consoles by # of polygons per second, and the PS2 was heavily billed as a "supercomputer"
<ZirconiumX> It was, compared to its successor :P
<tpw_rules> ?
<TD-Linux> true, but they were competing against dreamcast here :)
<ZirconiumX> There was a lot of political scaremongering about how because the PS3 was so powerful for the money terrorists could use it as a budget supercomputer, tpw_rules
<tpw_rules> i mean yes
<tpw_rules> but like the ps3 was quite powerful
<ZirconiumX> It turns out it was much easier to program the PS2 (which even had an official Linux distro) than the PS3.
<tpw_rules> mmm cell
<tpw_rules> also the ps1 was awful
<ZirconiumX> It got the job done, I suppose.
<TD-Linux> tpw_rules, you say that but you've never used a PC-FX
<ZirconiumX> The documentation seems to be nonexistent, but they *did* get first-mover advantage
<ZirconiumX> TD-Linux: the Dreamcast seems like a pretty well designed system
<ZirconiumX> Although SuperH is/was pretty weird
<TD-Linux> yeah they learned from the saturn
<ZirconiumX> I mean, they also worked with Microsoft
<TD-Linux> my friend has one of the very early "devkits", it was just a PowerVR pci card
<ZirconiumX> A friend tried to buy a Sega Katana. It was not quite but effectively dead on arrival
<ZirconiumX> Fortunately the seller took it back
<TD-Linux> ZirconiumX, also that's surprisingly small re your synthesis stats. Does that include texture samplers?
<ZirconiumX> At present no; the GPU is far from complete, but it's at least very modular
mumptai has joined ##openfpga
<Xark> ZirconiumX: PS2 GS might "fit" on an FPGA, but getting 38.4 GB/s from 2560-bit memory sounds tricky. :)
<ZirconiumX> Where are you pulling 2560-bit from?
<Xark> From Wikipedia (but matches my recollection): eDRAM bus width: 2560-bit (composed of three independent buses: 1024-bit write, 1024-bit read, 512-bit read/write)
ironsteel has joined ##openfpga
<Xark> Regardless of the specifics, pretty impressive bandwidth (especially for the time - but tiny memory size).
<ZirconiumX> That's marketing wank, to paraphrase my boyfriend. Yes, it's technically 2560-bit, but the core work happens in a 2 kilobyte page of cache.
<Xark> ZirconiumX: Fine, lowly 1024-bit memory. :)
<ZirconiumX> Which is not quite as absurd :P
<Xark> ZirconiumX: PS2 did have a lot of hype (Emotion Engine - *eyeroll*).
<ZirconiumX> The bandwidth is only used to fill the page cache quickly.
<ZirconiumX> Oh yeah, I'm well aware.
<Xark> ZirconiumX: PS2 frame buffer bandwidth was pretty amazing (it needed to be since frame buffer operations were so limited, needed a lot of "layers" to look good).
<ZirconiumX> Yeah, it's an incredibly wide renderer
<GenTooMan> The PS2 was definitely more than the PS1 by far, it at least had an FPU. Although non IEEE standards compliant.
<tpw_rules> and perspective correct texturing!
<ZirconiumX> tpw_rules: kind of.
<azonenberg_work> Xark: 38 GB/s of ram bandwidth is not that much
<azonenberg_work> that's 38 bits wide @ 1 GHz not counting refresh overhead etc
<azonenberg_work> 64 bit DDR3 1066 should handle it easily
<tpw_rules> ZirconiumX: does it still not
<ZirconiumX> It's 150MHz :P
<azonenberg_work> just slap a sodimm down
<Xark> GenTooMan: The non-compliance was mostly a benefit for games (divide by zero is "fine"). :)
<ZirconiumX> tpw_rules: it's optionally perspective correct :P
<tpw_rules> sigh
<ZirconiumX> It's got two different sets of texture coordinates to boot
<TD-Linux> obviously you need to add rdram support to litedram
<cr1901_modern> ^this but unironically
<TD-Linux> actually tho please don't enable rambus
<TD-Linux> ZirconiumX, is the page cache manually managed?
Asu has joined ##openfpga
show1 has joined ##openfpga
rohitksingh has joined ##openfpga
<sorear> azonenberg_work: are you missing a factor of 8 there
ironsteel has quit [Ping timeout: 265 seconds]
<ZirconiumX> TD-Linux: since I didn't get a notification for that, I apologise
<ZirconiumX> Not directly, although you can force a flush of it to internal RAM
<ZirconiumX> tpw_rules: regarding the "optional perspective correction": https://puu.sh/EvfXl/46a5abe981.png
rohitksingh has quit [Ping timeout: 276 seconds]
<azonenberg_work> sorear: oh oops gigabytes
<azonenberg_work> let me recalculate
<azonenberg_work> i'm used to doing bandwidth in bits :p
<azonenberg_work> So 304 Gbps of bandwidth
<azonenberg_work> Yeah thats a little more
<ZirconiumX> Just a touch :P
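The bytes-versus-bits correction can be checked with a few lines. This sketch compares the quoted figure against the theoretical peak of a 64-bit DDR3-1066 interface; it ignores refresh and other DRAM overheads, and the helper names are invented for illustration.

```python
def gbytes_to_gbits(gbytes_per_s: float) -> float:
    """Convert GB/s to Gbit/s."""
    return gbytes_per_s * 8


def ddr_peak_gbytes(transfers_per_s: float, bus_bits: int) -> float:
    """Theoretical peak bandwidth of a DDR interface in GB/s."""
    return transfers_per_s * bus_bits / 8 / 1e9


target = 38.4                        # GB/s figure quoted for the GS eDRAM
ddr3 = ddr_peak_gbytes(1066e6, 64)   # 64-bit DDR3-1066, peak

print(f"38 GB/s   = {gbytes_to_gbits(38):.0f} Gbit/s")
print(f"38.4 GB/s = {gbytes_to_gbits(target):.1f} Gbit/s")
print(f"64-bit DDR3-1066 peak = {ddr3:.2f} GB/s")
print(f"interfaces needed at peak = {target / ddr3:.1f}")
```

At peak rates a single such interface provides roughly 8.5 GB/s, so several would be needed to match the quoted number.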
<ZirconiumX> ...I'm actually trying to work out where they got the 38.4 GB/s figure from
emeb_mac has joined ##openfpga
<ZirconiumX> GPU frequency is 4 * 768 * 48000 Hz = 147.456 MHz; bus is 2560-bit; that's 377.48 gigabits / second, or 47.185 gigabytes / second.
<ZirconiumX> However, that's ignoring DRAM refresh.
<TD-Linux> ZirconiumX, wtf you're supposed to be my instant personal source for obscure PS2 GS tidbits! :^)
<ZirconiumX> I am sorry master, I have failed you /s
<sorear> ZirconiumX: dram has a lot of necessary wait states for row activation, etc, not just refresh
<TD-Linux> ZirconiumX, maybe they are not counting the 512 bit r/w bus? (is that the cpu's mmio bus?)
<TD-Linux> ZirconiumX, if you leave out the 512-bit bus and round the GPU frequency to 150MHz then you get exactly 38.4GB/s
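The arithmetic behind these figures can be reproduced as a quick sketch. It treats bandwidth as clock times bus width with no refresh or wait states, which is a simplification; `bandwidth_gbytes` is a name made up for this example.

```python
def bandwidth_gbytes(clock_hz: float, bus_bits: int) -> float:
    """Raw bandwidth in GB/s: one bus-wide transfer per clock, no overhead."""
    return clock_hz * bus_bits / 8 / 1e9


exact_clock = 4 * 768 * 48_000   # 147,456,000 Hz, as derived in the log
full_bus = 1024 + 1024 + 512     # write + read + read/write = 2560 bits
no_rw_bus = 1024 + 1024          # leaving out the 512-bit read/write bus

print(f"{bandwidth_gbytes(exact_clock, full_bus):.3f} GB/s  (147.456 MHz, 2560-bit)")
print(f"{bandwidth_gbytes(150e6, full_bus):.3f} GB/s  (150 MHz, 2560-bit)")
print(f"{bandwidth_gbytes(150e6, no_rw_bus):.3f} GB/s  (150 MHz, 2048-bit)")
```

Only the last combination, 150 MHz with the two 1024-bit buses alone, lands on 38.4 GB/s; with the full 2560-bit width the rounded clock gives 48 GB/s.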
<sorear> 150MHz is … very high for DRAM
<TD-Linux> welcome to rdram
<sorear> without documentation I am uncomfortable making any assumptions about the bus protocol or array clocking
rohitksingh has joined ##openfpga
<ZirconiumX> TD-Linux: It's not RDRAM :P
<ZirconiumX> It's listed as "embedded DRAM"
<TD-Linux> wait is this a separate memory to system ram
<ZirconiumX> Yes
<TD-Linux> ah it is, lol that makes a lot more sense :)
<ZirconiumX> That's how you have the stupidly wide memory bus :P
* TD-Linux took apart a ps2 before and didn't see enough lanes for 1024 bits
<balrog> embedded dram? that's on die?
<ZirconiumX> Yeah
<balrog> (or on package at least)
<TD-Linux> hbm 0.5
<ZirconiumX> ^
<GenTooMan> Perhaps writing some documentation would help? If you use DDR. The 512/1024/1024 bit buses are for the embedded dram which is 32mbits in size.
<TD-Linux> obviously just get one of those $30k xilinx fpgas with edram
<ZirconiumX> They apparently made a "GS I-32" with 32 megabytes of on-die DRAM.
<TD-Linux> for arcades or something?
<ZirconiumX> Go google the GS Cube
<ZirconiumX> They planned to do real-time rendering with it
<ZirconiumX> So it's also like SLI 0.5 :P
<TD-Linux> sounds like a project exclusively designed for hype
<ZirconiumX> They made Final Fantasy: The Spirits Within with it
<ZirconiumX> And depending on your point of view, that says a lot about the GS Cube.
<TD-Linux> yeah I was about to say, is that supposed to be a plus
<TD-Linux> was that just for previews?
rohitksingh has quit [Ping timeout: 250 seconds]
<ZirconiumX> No, the entire thing, apparently.
<TD-Linux> ouch. how did that work, given the fixed function pipeline? no raytracing at all?
<ZirconiumX> No raytracing, just lots and *lots* of rasterising
Asu has quit [Remote host closed the connection]
mwk has quit [Ping timeout: 265 seconds]
balrog has quit [Ping timeout: 276 seconds]
balrog has joined ##openfpga
carl0s has quit [Remote host closed the connection]
Bike has joined ##openfpga
mumptai has quit [Remote host closed the connection]
freemint has quit [Ping timeout: 250 seconds]
freemint has joined ##openfpga
mwk has joined ##openfpga
emeb has quit [Quit: Leaving.]
azonenberg_work has quit [Ping timeout: 240 seconds]