<keesj>
can't we skip the whole risc-v thing and create an open fpga?
<whitequark>
fpgas are comparatively expensive and power-hungry
<whitequark>
so if you are trying to upend an entire field, it makes sense to start with something more broadly useful
<emily>
wonder what fpgas sifive stuff is tested on
<keesj>
like.. a flipflop?
<keesj>
The design of an FPGA is probably more complex compared to a CPU
<azonenberg>
keesj: For reference, i made a full bitstream compatible emulator of the xc2c32a in verilog
<azonenberg>
it filled something like a third of an xc7a200t if memory serves me right
<keesj>
damn... I was just thinking about FPGA in FPGA
<azonenberg>
and this is a $2 CPLD we're talking about
<keesj>
amazing
<keesj>
well.. I guess a lot can be optimized during compilation (e.g. not emulating luts but using them?)
<azonenberg>
Somewhat related idea: I think it would be possible to create a "virtualized" FPGA implementation that takes advantage of the underlying hardware to be more efficient
<whitequark>
like using LUTRAM for LUTs?
<azonenberg>
in particular a xilinx SRL can be used as a 5-input reprogrammable lut
<whitequark>
oh yeah or that
<azonenberg>
So you can do out-of-band serial loading of truth tables
<azonenberg>
then have a full speed lut5 architecture
<azonenberg>
with much less overhead than using 32 dff's to store the truth table, which is probably what would happen if you compiled asic-targeted fpga rtl
<azonenberg>
block ram would of course just be block ram
<azonenberg>
I'm not sure how *useful* such a thing would be, but i think it can be done
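The SRL-as-LUT idea above can be sketched as a small behavioral model. This is only an illustration in Python, not RTL: the class name `SRL32` and the helper `load_truth_table` are hypothetical, and the model assumes an SRLC32E-style primitive in which each clocked shift pushes the new bit in at tap 0 and a 5-bit address selects the tap to read combinationally.

```python
class SRL32:
    """Behavioral model of a 32-bit addressable shift register (SRLC32E-style)."""

    def __init__(self):
        # bits[0] is the most recently shifted-in bit, bits[31] the oldest.
        self.bits = [0] * 32

    def shift_in(self, d):
        # One clock edge with clock-enable high: new bit enters at tap 0.
        self.bits = [d & 1] + self.bits[:-1]

    def read(self, addr):
        # The 5-bit address picks a tap; no clock is involved in the read.
        return self.bits[addr & 31]


def load_truth_table(srl, func):
    """Serially load func(a0..a4) so that tap N holds the output for input N."""
    # The entry for address 31 goes in first so it ends up at the deepest tap.
    for addr in range(31, -1, -1):
        a = [(addr >> i) & 1 for i in range(5)]
        srl.shift_in(func(*a))


# Out-of-band load of an arbitrary 5-input function, then use it as a LUT5.
lut = SRL32()
load_truth_table(lut, lambda a0, a1, a2, a3, a4: (a0 & a1) ^ a4)
```

After the 32 shift cycles, `lut.read(addr)` behaves like a fixed LUT5 until the next reload, which is the point of the comparison with 32 separate flip-flops plus a 32:1 mux.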
<azonenberg>
along similar lines, i considered the possibility of making a JIT in my coolrunner emulator
<azonenberg>
basically rather than having 80 dff's for each mux in the PLA AND array, and a very bulky wide input mux architecture made out of luts
<azonenberg>
take the PLA truth tables coming off the JTAG and compile them down to lut equations
<azonenberg>
then feed that into SRLs
<keesj>
somebody is going to create a virtual FPGA platform, like has happened for CPUs, perhaps
<whitequark>
amazon has that
<azonenberg>
thus giving you a five input programmable AND gate in one lut
<azonenberg>
rather than a 3-input like you would get naively using 3 dffs and 3 input signals
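A rough sketch of the "compile the product term down to a LUT equation" step, in Python. The literal encoding (`'1'` for the true input, `'0'` for the complemented input, `'-'` for unused) and the function name `product_term_init` are assumptions made for illustration; they are not the actual xc2c32a bitstream format or anything from the emulator being discussed.

```python
def product_term_init(literals):
    """Return the 32-bit truth table of a 5-input AND of selected literals.

    literals: five entries, one per input a0..a4, each being
      '1' (input must be 1), '0' (input must be 0), or '-' (input ignored).
    Bit N of the result is the gate output when the inputs encode N.
    """
    init = 0
    for addr in range(32):
        match = all(
            lit == '-' or int(lit) == (addr >> i) & 1
            for i, lit in enumerate(literals)
        )
        if match:
            init |= 1 << addr
    return init


# a0 AND a1 AND (NOT a4), with a2 and a3 unused.
init = product_term_init(['1', '1', '-', '-', '0'])
print(hex(init))  # 0x8888
```

The resulting 32-bit value is what would then be shifted serially into an SRL, so one LUT covers a five-input programmable AND term.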
<azonenberg>
whitequark: not exactly, amazon has a virtualized software platform running bare metal FPGA with a bit of i/o IP bolted in
<whitequark>
yep
<whitequark>
it's a first step
<azonenberg>
true virtualization would involve relocatable blocks of logic that connect to a hard noc or something, can be swapped in and out cleanly, and ideally are somewhat independent of underlying microarchitecture
<azonenberg>
Related: are 7 series clock regions all the same size?
<azonenberg>
is it plausible to make a bitstream IP block that could be used on multiple fpga dice given appropriate support logic around it?
<azonenberg>
so something you can drop into an xc7a100t or a 200t
<azonenberg>
at different bitstream addresses
<azonenberg>
and not have to re-PAR or ideally even re-bitgen
<azonenberg>
the columnar architecture is much more uniform than previous stuff but i dont think they've reached that level yet
<keesj>
you guys are on another level
<azonenberg>
Ultrascale i think is even more uniform
<mwk>
azonenberg: series 7 clock regions aren't always the same size
<mwk>
first, left regions != right regions, but that part's rather obvious
<mwk>
but even within left half / right half: there are differences in cut-out areas for hard blocks
<azonenberg>
yeah true... the config block is really annoying
<azonenberg>
idk why they put it in the middle and not on the side like they did with io and serdes
<azonenberg>
it doesnt seem like there's much of a benefit to that
<mwk>
PCIE, transceivers, config center, ARM core
<mwk>
and if you mean between devices in the same family... haha forget it
<mwk>
ah fuck that shit, ignore me
<mwk>
why am I even bothering to IRC with 150s ping
<mwk>
except facebook notifications, those are unstoppable (good luck reading whatever it's notifying you about though)
<azonenberg>
mwk: Do you think that is the direction we're moving towards down the road?
<azonenberg>
more regular, uniform fabric s.t. it will be possible to buy an IP core and drop it at any offset you want in the fabric modulo some alignment
<azonenberg>
the impression i get of xilinx's strategy is that they are targeting acceleration-type workloads and this level of fine grained design (not having to re-par the netlist constantly) seems like it would help that
<sorear>
emily: they released bitfiles for xilinx vc707 at one point and I know that board was heavily used at ucb-bar
<mwk>
azonenberg: no
<mwk>
at least it's not something I could see from looking at xc[4-6]v, xc7, or xcu architectures
<mwk>
heck, if anything, it's getting *less* regular, xc2v actually had regular column pattern
<whitequark>
mwk: can you explain how Xilinx READ_FIRST/WRITE_FIRST/NO_CHANGE port semantics are actually implemented?
<whitequark>
for any single port, it's just a mux
<whitequark>
but how does it work between two ports?
<whitequark>
why does UG473 recommend WRITE_FIRST for TDP where asynchronous clocks are used? this doesn't make any sense to me
<mwk>
whitequark: I don't know, it's just a simple 2-bit field in the bitstream
<mwk>
but my hypothesis is
<mwk>
it could change relative timing of read vs write strobes for a port
<mwk>
it's been some time since I looked into it, but iirc it made sense if write strobe was shifted
<whitequark>
mwk: ohhhh, yeah, changing strobe timings is cheaper than a mux for a single port
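For reference, the externally visible single-port behavior of the three write modes can be modeled as below. This is a minimal Python sketch of the documented semantics only: the class `SinglePortRam` is hypothetical, the port enable is assumed to be always high, and it says nothing about how the silicon implements the modes (strobe timing or otherwise) or about what happens when two ports collide.

```python
class SinglePortRam:
    """Behavioral model of one synchronous RAM port with a selectable write mode."""

    MODES = ("WRITE_FIRST", "READ_FIRST", "NO_CHANGE")

    def __init__(self, depth, mode):
        if mode not in self.MODES:
            raise ValueError("unknown write mode: " + mode)
        self.mem = [0] * depth
        self.mode = mode
        self.dout = 0  # output latch

    def clock(self, addr, we=False, din=0):
        """One rising clock edge; returns the value on the output latch."""
        if we:
            old = self.mem[addr]
            self.mem[addr] = din
            if self.mode == "WRITE_FIRST":
                self.dout = din   # new data appears on the output
            elif self.mode == "READ_FIRST":
                self.dout = old   # previous contents appear on the output
            # NO_CHANGE: the output latch keeps its earlier value
        else:
            self.dout = self.mem[addr]
        return self.dout


ram = SinglePortRam(16, "READ_FIRST")
ram.clock(3, we=True, din=0xAA)  # output shows the old contents of address 3
ram.clock(3)                     # output now shows 0xAA
```

Seen from a single port, all three modes differ only in what the output latch captures on a write cycle, which is why a mux is the obvious mental model.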