lutsabound has quit [Quit: Connection closed for inactivity]
emeb has quit [Quit: Leaving.]
seldridge has quit [Ping timeout: 272 seconds]
seldridge has joined #yosys
leviathan has joined #yosys
kuldeep has quit [Read error: Connection reset by peer]
kuldeep has joined #yosys
rohitksingh_work has joined #yosys
seldridge has quit [Ping timeout: 252 seconds]
emeb_mac has quit [Quit: Leaving.]
dys has joined #yosys
dys has quit [Ping timeout: 244 seconds]
mjoldfield has quit []
leviathan has quit [Remote host closed the connection]
mjoldfield has joined #yosys
m4ssi has joined #yosys
mjoldfield has quit [Read error: Connection reset by peer]
mjoldfield has joined #yosys
mjoldfield has quit [Read error: Connection reset by peer]
mjoldfield has joined #yosys
jfng has quit [Remote host closed the connection]
jayaura has quit [Remote host closed the connection]
Rixon[m] has quit [Remote host closed the connection]
nrossi has quit [Remote host closed the connection]
danieljabailey has quit [Ping timeout: 246 seconds]
danieljabailey has joined #yosys
nrossi has joined #yosys
jfng has joined #yosys
leviathan has joined #yosys
maikmerten has joined #yosys
rohitksingh_work has quit [Read error: Connection reset by peer]
lutsabound has joined #yosys
rohitksingh has quit [Ping timeout: 252 seconds]
rohitksingh has joined #yosys
<maikmerten>
can BRAM-inference for iCE40 also work when reading with a blocking assignment? I'm trying to define a cache that determines a cache hit/miss within one clock cycle, so I need the tag information available in that clock tick. I guess that's not something BRAM can provide?
<daveshah>
maikmerten: no, that won't work on ice40
<daveshah>
you'd have to find an fpga with distributed ram
<maikmerten>
thanks :-)
<maikmerten>
I guess having a cache-lookup cycle it'll be then ;-)
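A rough behavioral sketch (Python rather than HDL) of why a memory with a registered read port cannot deliver the tag in the same clock tick in which the address is presented. The class name `SyncReadRAM` and its `clock` method are hypothetical and model only the one-cycle read delay, not any actual iCE40 primitive.

```python
class SyncReadRAM:
    """Toy model of a memory whose read data is registered (synchronous read)."""

    def __init__(self, depth):
        self.mem = [0] * depth
        self.dout = 0  # output register, changes only on a clock edge

    def clock(self, addr):
        """Simulate one clock edge.

        Returns what logic could see on the output *before* the edge,
        i.e. in the same cycle in which `addr` was presented.
        """
        visible_this_cycle = self.dout
        self.dout = self.mem[addr]  # new data is visible from the next cycle on
        return visible_this_cycle


ram = SyncReadRAM(256)
ram.mem[5] = 0xABCD
seen_same_cycle = ram.clock(5)  # still the old register content
seen_next_cycle = ram.dout      # the requested entry, one cycle later
```

A tag compare fed from `dout` can therefore only happen in the cycle after the address was applied, which is the extra cache-lookup cycle mentioned above.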
<sorear>
also, distributed ram tends to be much lower capacity than block ram on chips with both
AlexDaniel has quit [Read error: Connection reset by peer]
seldridge has joined #yosys
<daveshah>
sorear: not always a downside in some cases like register files, where a whole bram would be a waste anyway
<sorear>
indeed
<sorear>
but a cache memory is more likely to be sized to use the ~entire chip
<sorear>
especially on non-UP ice40 where you only have 16KB total
leviathan has quit [Read error: Connection reset by peer]
leviathan has joined #yosys
<maikmerten>
for getting my feet wet with caches, I'm going for a cache with 256 entries, each 32 bit wide (which fits nicely into 2 iCE40 BRAMs) and 256 16 bit tags (1 bit valid, 15 bit address)
<maikmerten>
and then work my way up ;-)
<maikmerten>
will only accept aligned 32-bit words... essentially geared towards being an instruction cache
<sorear>
direct-mapped cache covering 32 MB of address space?
<maikmerten>
yeah, it's going to be a horrible cache-thrash fest
<maikmerten>
direct-mapped, write-through, massive 256 words in capacity... what's not to like? ;-)
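The numbers in this exchange can be checked with a small sketch. It assumes the address is split into 2 byte-offset bits (aligned 32-bit words), 8 index bits (256 entries) and 15 tag bits, and that one iCE40 BRAM holds 4 kbit. The function name `split_address` and the constants are made up for illustration.

```python
OFFSET_BITS = 2    # aligned 32-bit words: 2 byte-offset bits
INDEX_BITS = 8     # 256 cache entries
TAG_BITS = 15      # 16-bit tag entry minus 1 valid bit
BRAM_BITS = 4096   # assumed capacity of one iCE40 block RAM


def split_address(addr):
    """Split a byte address into (tag, index, offset) for the cache above."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = (addr >> (OFFSET_BITS + INDEX_BITS)) & ((1 << TAG_BITS) - 1)
    return tag, index, offset


# Address space the tag + index + offset bits can distinguish.
covered_bytes = 1 << (OFFSET_BITS + INDEX_BITS + TAG_BITS)

# Block RAMs needed for the data and tag arrays.
data_brams = (256 * 32) // BRAM_BITS
tag_brams = (256 * 16) // BRAM_BITS

print(covered_bytes // (1024 * 1024), "MB covered")
print(data_brams, "data BRAMs,", tag_brams, "tag BRAM")
```

Two addresses that are 1024 bytes apart share an index and differ only in the tag, so they evict each other, which is where the thrashing comes from.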
<sorear>
what sort of external memory
<maikmerten>
8-bit wide external SRAM
rohitksingh has quit [Ping timeout: 268 seconds]
<maikmerten>
so currently my CPU needs in total about 6 cycles to fetch the next instruction
<maikmerten>
this is basically the RISC-V equivalent to an Intel 8088 ;-)
<maikmerten>
(but that one at least had a prefetch queue)
lutsabound has quit [Quit: Connection closed for inactivity]
<sorear>
what can the external sram do latency/throughput?
<maikmerten>
I'm currently using the SRAM with one cycle latency (present address at one clock edge, get the data one cycle later)
<maikmerten>
the chip itself can do 10ns cycle times
<maikmerten>
but due to the board layout and connectors and because I'm sampling data mid-cycle, I can only drive it at ~30 MHz
<sorear>
async SRAM?
<maikmerten>
for now I've settled for 25.125 MHz, which happens to be very close to 640x480@60Hz VGA timings
<maikmerten>
yes
<maikmerten>
512Kx8
<sorear>
i guess you could do 4-bit color and dedicate 50% of the memory cycles to scan-out
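A back-of-the-envelope check of the 50% figure, under the assumptions that the display is 640x480 at 4 bits per pixel, that the 8-bit SRAM delivers one byte per memory cycle, and that the memory runs at the pixel clock. All variable names are illustrative.

```python
WIDTH, HEIGHT = 640, 480
BITS_PER_PIXEL = 4
SRAM_BYTES = 512 * 1024  # 512Kx8 part

# Size of one frame in the external SRAM.
frame_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8

# One byte holds two 4-bit pixels, so scan-out needs one byte
# every two pixel clocks while pixels are being displayed.
pixels_per_byte = 8 // BITS_PER_PIXEL
scanout_share = 1 / pixels_per_byte

print(frame_bytes, "bytes per frame")
print(f"{scanout_share:.0%} of memory cycles during active scan-out")
print("fits in SRAM:", frame_bytes <= SRAM_BYTES)
```

The share applies while pixels are actually being displayed; during blanking intervals no scan-out reads are needed, so the average load would be somewhat lower.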
<tpb>
Title: GitHub - maikmerten/hx8k-breakout-extension: A PCB with SRAM, buttons, LEDs and some pmod-compatible connectors for the Lattice HX8K Breakout Board (at github.com)
<maikmerten>
yes, with some cleverness I guess one could drive VGA from that SRAM as well.
<sorear>
oh I thought you were saying you were already going to do VGA
<sorear>
although real-time chargen/sprites is also an option
<maikmerten>
well, in the future I might want to do VGA. Back when I did something similar in VHDL (pre-yosys), I already had a simple RISC-V SoC with VGA
<maikmerten>
that one generated 40x25 characters
<maikmerten>
with a chargen for 256 8x8 pixel chars
<maikmerten>
which is rather compact and can be done in BRAM
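To put "rather compact" into numbers, here is a small sizing sketch. It assumes 1 bit per pixel in the character generator, one byte per character cell in the text buffer, and 4 kbit per iCE40 BRAM. Whether the original design kept the text buffer in BRAM as well is not stated, so that part is purely illustrative.

```python
import math

BRAM_BITS = 4096  # assumed capacity of one iCE40 block RAM

# Character generator: 256 glyphs of 8x8 pixels, 1 bit per pixel.
chargen_bits = 256 * 8 * 8
chargen_brams = math.ceil(chargen_bits / BRAM_BITS)

# Text buffer: 40x25 cells, one byte (glyph index) per cell.
text_cells = 40 * 25
text_bits = text_cells * 8
text_brams = math.ceil(text_bits / BRAM_BITS)

# Pixel grid that the characters cover before any scaling.
pixel_width, pixel_height = 40 * 8, 25 * 8

print(chargen_bits // 8, "bytes of glyph data in", chargen_brams, "BRAMs")
print(text_cells, "character cells in", text_brams, "BRAMs")
print(pixel_width, "x", pixel_height, "pixels")
```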
seldridge has quit [Ping timeout: 268 seconds]
AlexDaniel has joined #yosys
rohitksingh has joined #yosys
leviathan has quit [Remote host closed the connection]
seldridge has joined #yosys
m4ssi has quit [Remote host closed the connection]
seldridge has quit [Ping timeout: 272 seconds]
seldridge has joined #yosys
rohitksingh has quit [Ping timeout: 240 seconds]
ZipCPU has quit [Ping timeout: 250 seconds]
dys has joined #yosys
<maikmerten>
okay, a first implementation of my "aligned word only", "only word reads get into cache", direct-mapped, 256 entry cache works now
<maikmerten>
dhrystones go from 4105 per second to 4655 per second, a 13.4% performance increase
<maikmerten>
resource util goes from 1857 LCs (no cache) to 1938 LCs (with cache), BRAMs from 5 to 8 (of 32)
<maikmerten>
(so a 4.4% increase of LC usage for 13.4% better performance... that's ok I guess)
<sorear>
does it cache instructions, data, or both
<maikmerten>
every aligned word read gets offered to the cache
<maikmerten>
so no proper separation
<maikmerten>
also, every write invalidates the respective cache line
<maikmerten>
so the cache is a) small and b) gets invalidated a lot
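The policy described here (aligned word reads fill the cache, every write goes through to memory and invalidates the matching line) can be sketched as a small behavioral model. This is not the actual HDL: the class `DirectMappedCache`, its method names, and the dict used as backing memory are assumptions for illustration, and writes are treated as word-sized for simplicity.

```python
class DirectMappedCache:
    """Behavioral model: direct-mapped, write-through, invalidate on write."""

    def __init__(self, entries=256):
        self.entries = entries
        self.valid = [False] * entries
        self.tags = [0] * entries
        self.data = [0] * entries

    def _split(self, addr):
        word = addr >> 2  # drop the two byte-offset bits
        return word // self.entries, word % self.entries  # (tag, index)

    def read_word(self, addr, backing):
        """Read an aligned word; returns (value, hit)."""
        assert addr % 4 == 0, "only aligned word reads are cached"
        tag, index = self._split(addr)
        if self.valid[index] and self.tags[index] == tag:
            return self.data[index], True
        value = backing[addr]  # miss: fetch from external memory
        self.valid[index] = True
        self.tags[index] = tag
        self.data[index] = value
        return value, False

    def write(self, addr, value, backing):
        """Write through to memory and invalidate the line at that index."""
        backing[addr & ~3] = value
        _, index = self._split(addr)
        # Assumption: the line is invalidated without comparing tags.
        self.valid[index] = False


mem = {0x100: 11, 0x100 + 1024: 22}
cache = DirectMappedCache()
first = cache.read_word(0x100, mem)          # miss, fills the line
second = cache.read_word(0x100, mem)         # hit
conflict = cache.read_word(0x100 + 1024, mem)  # same index, evicts 0x100
cache.write(0x100, 33, mem)                  # invalidates the shared line
after_write = cache.read_word(0x100, mem)    # miss again, new value
```

Because a write invalidates the line regardless of what it holds, data writes that alias with hot instruction addresses keep knocking entries out, which matches the "gets invalidated a lot" observation.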
maikmerten has quit [Quit: Leaving]
seldridge has quit [Ping timeout: 240 seconds]
seldridge has joined #yosys
develonepi3 has quit [Remote host closed the connection]
m4ssi has joined #yosys
m4ssi has quit [Remote host closed the connection]