clifford changed the topic of #yosys to: Yosys Open SYnthesis Suite: http://www.clifford.at/yosys/ -- Channel Logs: https://irclog.whitequark.org/yosys
emeb has quit [Quit: Leaving.]
emeb_mac has joined #yosys
proteusguy has quit [Ping timeout: 272 seconds]
gsi__ has joined #yosys
gsi_ has quit [Ping timeout: 248 seconds]
rektide has joined #yosys
proteusguy has joined #yosys
AlexDaniel has quit [Ping timeout: 246 seconds]
zignig has joined #yosys
citypw has joined #yosys
adjtm has joined #yosys
adjtm_ has quit [Ping timeout: 258 seconds]
proteusguy has quit [Ping timeout: 244 seconds]
PyroPeter has quit [Ping timeout: 258 seconds]
proteusguy has joined #yosys
PyroPeter has joined #yosys
adjtm has quit [Remote host closed the connection]
adjtm has joined #yosys
X-Scale has quit [Read error: Connection reset by peer]
vonnieda has joined #yosys
<bwidawsk> is there like a go-to riscv I should be working with? I'd been looking at the Wishbone VexRiscv
<sorear> are you looking for anything in particular?
<bwidawsk> sorear› just looking for something that will synthesize with diamond and yosys, and is "complex" to do a comparison
rohitksingh has joined #yosys
rohitksingh has quit [Ping timeout: 272 seconds]
rohitksingh_work has joined #yosys
_whitelogger has joined #yosys
<corecode> ac
<corecode> woops
emeb_mac has quit [Ping timeout: 245 seconds]
voxadam has quit [Ping timeout: 248 seconds]
voxadam has joined #yosys
dys has quit [Ping timeout: 244 seconds]
gsi__ is now known as gsi_
fsasm_ has joined #yosys
dys has joined #yosys
m4ssi has joined #yosys
citypw has quit [Ping timeout: 272 seconds]
citypw has joined #yosys
fsasm_ has quit [Ping timeout: 272 seconds]
vidbina has joined #yosys
rohitksingh has joined #yosys
dys has quit [Ping timeout: 245 seconds]
vidbina has quit [Ping timeout: 272 seconds]
AlexDaniel has joined #yosys
dys has joined #yosys
AlexDaniel has quit [Ping timeout: 244 seconds]
citypw has quit [Ping timeout: 245 seconds]
citypw has joined #yosys
jakobwenzel has quit [Remote host closed the connection]
jakobwenzel has joined #yosys
<ZirconiumX> So, adding DFFSRs (in the form of the only AC-family chip I could find, the 74AC11074) actually reduces the overall chip count, despite the '74 only having 2 DFFSRs
<ZirconiumX> (thanks daveshah)
fsasm has joined #yosys
shorne has joined #yosys
AlexDaniel has joined #yosys
rohitksingh has quit [Ping timeout: 245 seconds]
proteusguy has quit [Remote host closed the connection]
rohitksingh has joined #yosys
AlexDaniel has quit [Read error: Connection reset by peer]
rohitksingh has quit [Ping timeout: 248 seconds]
citypw has quit [Ping timeout: 272 seconds]
rohitksingh has joined #yosys
rohitksingh has quit [Ping timeout: 248 seconds]
X-Scale has joined #yosys
citypw has joined #yosys
vonnieda has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
rohitksingh_work has quit [Read error: Connection reset by peer]
fsasm has quit [Ping timeout: 246 seconds]
fsasm has joined #yosys
m4ssi has quit [Quit: Leaving]
emeb has joined #yosys
AlexDaniel has joined #yosys
rohitksingh has joined #yosys
unkraut has quit [Remote host closed the connection]
unkraut has joined #yosys
<ZirconiumX> daveshah: My adder techmap pass seems to produce more Yosys warnings. Mind telling me how I fucked up this time?
<tpb> Title: 74xx-liberty/74_adder.v at master · ZirconiumX/74xx-liberty · GitHub (at github.com)
<ZirconiumX> ../74_adder.v:32: Warning: Range [3:0] select out of bounds on signal `\AA': Setting 1 MSB bits to undef.
<ZirconiumX> ../74_adder.v:33: Warning: Range [3:0] select out of bounds on signal `\BB': Setting 1 MSB bits to undef.
<daveshah> ZirconiumX: AA needs to be WIDTH-1:0 not Y_WIDTH-1:0
<daveshah> same for BB
<ZirconiumX> Woops, thank you
emeb_mac has joined #yosys
rohitksingh has quit [Ping timeout: 246 seconds]
vonnieda has joined #yosys
emeb_mac has quit [Ping timeout: 245 seconds]
<ZirconiumX> daveshah: While reading through the synth_ice40 pass, I noticed that the iCE40 uses DFFE cells. Is the E here an enable or something?
<daveshah> Yes
<daveshah> Clock enable
rohitksingh has joined #yosys
<ZirconiumX> daveshah: So from some research a DFFE is essentially a transparent latch?
<daveshah> ZirconiumX: No, that's a D latch
<daveshah> A DFFE is effectively a flipflop with an AND gate on the clock (or a mux in front of the data input)
<daveshah> *a D flipflop
<tnt> "an AND gate on the clock" ... well ... don't implement it like that :p
<ZirconiumX> I'm confused, then
<tnt> If clock-enable is low, a dffe will ignore rising edges.
<tnt> yes
<tnt> 74AC377
<tnt> Octal D-Type Flip-Flop with Clock Enable
rohitksingh has quit [Ping timeout: 245 seconds]
<ZirconiumX> Sadly that particular part was not made in the AC family
<ZirconiumX> Oh, seems TI are lying then :P
<tnt> Well ... not all manufacturer make all parts in each family ...
<tnt> each has its unique set of gates they make in each family.
<ZirconiumX> True, I suppose
rohitksingh has joined #yosys
<ZirconiumX> Hmmm
<ZirconiumX> What advantage does a DFFE have over a plain DFF?
<ZirconiumX> Probably more flexibility, at least
<daveshah> It saves a mux, in situations when you only update the DFF sometimes
<daveshah> any HDL of the form if (a) q <= d maps to a DFFE nicely
flammit_ has joined #yosys
<ZirconiumX> Ah, I see
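[Editor's note] The point that a DFFE "saves a mux" can be illustrated with a small behavioural model. This is a minimal sketch and not Yosys code: the class names DFFE and DFFWithMux are hypothetical, and the model only captures rising-edge behaviour for a single bit. It shows that a flop with a clock enable behaves the same as a plain flop whose data input is a 2:1 mux recirculating its own output.

```python
class DFFE:
    """D flip-flop with clock enable: captures d on a rising edge only when en is high."""

    def __init__(self, q=0):
        self.q = q

    def rising_edge(self, d, en):
        if en:
            self.q = d
        return self.q


class DFFWithMux:
    """Plain D flip-flop fed by a 2:1 mux that recirculates q when en is low."""

    def __init__(self, q=0):
        self.q = q

    def rising_edge(self, d, en):
        mux_out = d if en else self.q  # the mux that a DFFE makes unnecessary
        self.q = mux_out               # a plain DFF always captures its input
        return self.q


def equivalent_over(stimulus):
    """Return True if both models produce the same q for every (d, en) pair."""
    a, b = DFFE(), DFFWithMux()
    return all(a.rising_edge(d, en) == b.rising_edge(d, en) for d, en in stimulus)


# One (d, en) pair per rising clock edge.
stimulus = [(1, 0), (1, 1), (0, 0), (0, 1), (1, 1), (0, 0)]
dffe = DFFE()
trace = [dffe.rising_edge(d, en) for d, en in stimulus]
print(trace, equivalent_over(stimulus))
```

The pattern `if (a) q <= d` mentioned above corresponds to the `rising_edge` method of the DFFE model, with `a` as the enable.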
citypw has quit [Ping timeout: 244 seconds]
nengel has joined #yosys
Wolf481pl has joined #yosys
rohitksingh has quit [Ping timeout: 258 seconds]
flammit has quit [*.net *.split]
ZipCPU has quit [*.net *.split]
Wolf480pl has quit [*.net *.split]
attie has quit [*.net *.split]
flammit_ is now known as flammit
ZipCPU has joined #yosys
rrika has quit [Ping timeout: 246 seconds]
rrika has joined #yosys
<ZirconiumX> daveshah: It looks like dfflibmap can't match/create DFFE cells. Is this correct, or am I just blind?
<daveshah> Yeah, looks like no-one has ever implemented this
<daveshah> It might be that DFFEs are more common in an FPGA context, which doesn't use dfflibmap
<ZirconiumX> Presumably then I should use techmap for this instead?
<daveshah> Yes
<ZirconiumX> DFF vs DFFE in 74-series logic is going to be an interesting tradeoff that might not pay off; you're saving muxes, sure, but the 16373 lets you fit twice as many DFFs in a chip.
<ZirconiumX> Well, assuming my math on this (broken) pass is correct, it *should* be a fairly major gain
<tpb> Title: (broken) DFF to 74AC377 DFFE pass · ZirconiumX/74xx-liberty@4fa6b83 · GitHub (at github.com)
<ZirconiumX> This leaks $_DFFE_PP_ cells, though
citypw has joined #yosys
citypw has quit [Ping timeout: 245 seconds]
proteusguy has joined #yosys
<daveshah> ZirconiumX: you need a general techmap call (`-map +/techmap.v`) before you try to map the $_DFFE_PP_
<ZirconiumX> daveshah: Ah, thank you
<ZirconiumX> Before: 7729
<ZirconiumX> After: 6734
<ZirconiumX> That's pretty huge
<daveshah> I would expect to see a significant drop in the number of MUX2s?
<ZirconiumX> Indeed, we go from 1,316 to 876
<tnt> wiring 6700 chips is still going to be fun :p
<ZirconiumX> This is for the whole benchmark
<ZirconiumX> Biggest winner is axilxbar, with about 25% less gates
<ZirconiumX> PicoRV32 is currently at 1,532 gates
rohitksingh has joined #yosys
<ZirconiumX> daveshah: Actually, I just had a thought. Yosys would expect each individual DFFE to have its own enable bit, but the 74AC377 has a single enable bit for 8 flops
<ZirconiumX> So this would be technically incorrect, right?
<ZirconiumX> Or at least, modelled incorrectly
<daveshah> ZirconiumX: the iCE40 flipflops are similar (as are most FPGAs)
<daveshah> have a look at how tnt implemented dffe_min_ce_use in synth_ice40
<ZirconiumX> Ah, thank you, daveshah
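[Editor's note] The shared-enable concern can be made concrete with a rough chip-count sketch. It assumes, as stated above, that one 74AC377 holds eight flops behind a single enable pin, so only flops with the same enable net can share a package. The function names and the example net names are hypothetical. The threshold split is only an approximation of what dffe_min_ce_use is understood to do (keeping DFFEs only for enables used by at least a minimum number of flops); that reading is an assumption, not taken from the Yosys source.

```python
from collections import Counter
from math import ceil

FLOPS_PER_377 = 8  # assumption from the discussion: eight flops share one enable pin


def count_377_packages(flop_enables):
    """Count packages when flops can only share a chip if they share an enable net.

    flop_enables: iterable of enable-net names, one entry per DFFE bit.
    """
    groups = Counter(flop_enables)
    return sum(ceil(n / FLOPS_PER_377) for n in groups.values())


def split_by_min_ce_use(flop_enables, min_ce_use):
    """Split enable groups into those kept as DFFEs and those turned back into DFF + mux."""
    groups = Counter(flop_enables)
    keep = {en: n for en, n in groups.items() if n >= min_ce_use}
    unmap = {en: n for en, n in groups.items() if n < min_ce_use}
    return keep, unmap


# Hypothetical design: 14 enabled flops spread over three enable nets.
example = ["en_a"] * 10 + ["en_b"] * 3 + ["en_c"]
naive = ceil(len(example) / FLOPS_PER_377)   # ignores the shared enable pin
grouped = count_377_packages(example)        # respects it
keep, unmap = split_by_min_ce_use(example, min_ce_use=4)
print(naive, grouped, keep, unmap)
```

In this example the naive per-flop count underestimates the package count by half, which is consistent with the gain shrinking once the shared enable is modelled.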
<ZirconiumX> It's still an improvement, but very much less so
<ZirconiumX> At 7563 chips, currently
<ZirconiumX> Adding an opt_merge before unmapping like synth_ice40 does helped bring that down to 7378
maikmerten has joined #yosys
rohitksingh has quit [Ping timeout: 268 seconds]
<ZirconiumX> I'm reading the "memory_bram" documentation (as Clifford suggested); what is a transparent read?
<ZirconiumX> (of SRAM)
<daveshah> A transparent read is where the read port will reflect writes in the current clock cycle (aka read after write)
<ZirconiumX> So if you write X to address Y on one port and simultaneously command a read from address Y, SRAM is transparent if you get X out?
<daveshah> Yes
<daveshah> Yosys can fake it with a mux if the SRAM isn't capable natively
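[Editor's note] The difference between a transparent read and read-before-write, and the idea of faking transparency with a mux, can be sketched with a toy single-clock memory model. The SyncRam class and its parameters are hypothetical and only illustrate the behaviour described above; this is not how Yosys represents memories internally.

```python
class SyncRam:
    """Toy synchronous RAM with one read port and one write port on the same clock."""

    def __init__(self, depth, transparent):
        self.mem = [0] * depth
        self.transparent = transparent

    def clock(self, raddr, waddr=None, wdata=None):
        """Advance one clock cycle and return the read data for raddr."""
        old = self.mem[raddr]          # what a read-before-write memory returns
        if waddr is not None:
            self.mem[waddr] = wdata
        if self.transparent and waddr == raddr:
            return wdata               # bypass mux: forward the value being written
        return old


transparent_ram = SyncRam(depth=4, transparent=True)
plain_ram = SyncRam(depth=4, transparent=False)

# Write 7 to address 2 while reading address 2 in the same cycle.
same_cycle_transparent = transparent_ram.clock(raddr=2, waddr=2, wdata=7)
same_cycle_plain = plain_ram.clock(raddr=2, waddr=2, wdata=7)
next_cycle_plain = plain_ram.clock(raddr=2)
print(same_cycle_transparent, same_cycle_plain, next_cycle_plain)
```

The `if self.transparent and waddr == raddr` branch plays the role of the extra mux: an address comparison selects the incoming write data instead of the stored value.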
<ZirconiumX> The SRAM I'm looking at at the moment appears to stall the read if you do that
<ZirconiumX> Is that transparent?
<daveshah> That sounds like not transparent, ie read before write
<ZirconiumX> My plan with this is to designate one port as write and one as read
<daveshah> I'm not sure if this really fits one way or another
<daveshah> Yosys doesn't have a concept of BRAM stalling - this wouldn't map from Verilog well either
<ZirconiumX> So this chip wouldn't work?
rohitksingh has joined #yosys
<daveshah> You'd probably need to be a bit clever with how you drove it
<daveshah> Read on one clock cycle and write on the other or something, so you didn't have the collision
<ZirconiumX> Yeah, it'd need some anti-collision circuitry
<ZirconiumX> Or even properties to verify collisions could not happen
dys has quit [Ping timeout: 244 seconds]
dys has joined #yosys
<ZirconiumX> So, I managed to coerce memory_bram into working by fooling it into thinking the write port is clocked
<ZirconiumX> 1,009 chips, even though the RAM chips are very underused
jevinskie has joined #yosys
<ZirconiumX> 6,140 ICs
<ZirconiumX> Oh, hey jevinskie
<jevinskie> Howdy!
m4ssi has joined #yosys
m4ssi has quit [Remote host closed the connection]
<bwidawsk> daveshah› what is a SLICE in this context after PNR?
<daveshah> bwidawsk: a unit of two LUT4s, two flipflops, two MUX2s and two bits of carry logic
<bwidawsk> ah, this is like an ALM in altera parlance
<bwidawsk> thanks
<daveshah> Yup
maikmerten has quit [Remote host closed the connection]
<bwidawsk> daveshah› interestingly, synthesis time alone is a decent amount faster on blinky with gcc over clang
<bwidawsk> roughly 25% faster (granted we're talking ~2s here)
<daveshah> Interesting
<daveshah> Would be good to know if that applies to bigger benchmarks too
<bwidawsk> daveshah› I'll try to provide that info after I figure out how to get the same data from diamond on blinky
Thorn has quit [Ping timeout: 268 seconds]
Thorn has joined #yosys
<bwidawsk> daveshah› actually, I had it backwards - clang is faster, and it's more like 10%
<bwidawsk> it does fluctuate a bit...
SpaceCoaster has quit [Ping timeout: 245 seconds]
<bwidawsk> not sure if I did something wrong, but diamond and yosys use the same number of luts, but diamond uses half the number of slices
<bwidawsk> daveshah› https://0x0.st/zedC.txt
<daveshah> This is probably because nextpnr's packing density is pretty poor
<daveshah> This isn't counting carries (CCU2Cs) which take up a slice too
<bwidawsk> daveshah› does it make sense to add that?
<daveshah> Yes
<bwidawsk> daveshah› in nextpnr side, it's already counted by TRELLIS_SLICE, correct?
<daveshah> Yes
<bwidawsk> that brings it up to 184 vs. 232 then
* bwidawsk needs to add a cost column :P
<daveshah> I was referring to the synthesis side, in terms of LUT usage
<daveshah> Diamond SLICEs should already include CCU2Cs too
<bwidawsk> I have this
<bwidawsk> Number of SLICEs: 117 out of 41820 (0%)
<bwidawsk> SLICEs as Logic/ROM: 117 out of 41820 (0%)
<bwidawsk> SLICEs as RAM: 0 out of 31365 (0%)
<bwidawsk> SLICEs as Carry: 67 out of 41820 (0%)
<bwidawsk> Number of LUT4s: 233 out of 83640 (0%)
<bwidawsk> Number used as logic LUTs: 99
<bwidawsk> Number used as distributed RAM: 0
<bwidawsk> Number used as ripple logic: 134
<bwidawsk> Number used as shift registers: 0
<daveshah> So that is equivalent to 117 TRELLIS_SLICE
<daveshah> I'm curious what the Yosys output is
<bwidawsk> daveshah› https://0x0.st/zedC.txt
<bwidawsk> oops
<bwidawsk> daveshah› https://0x0.st/zenc.txt
<daveshah> So the total number of LUTs Yosys has inferred is 233 + 74*2
<daveshah> Yosys doesn't include the two LUT4s in the CCU2C in its statistic, whereas I believe Diamond does
<bwidawsk> so quite a bit worse then, huh?
<bwidawsk> let me post the entirety of the map output
<bwidawsk> daveshah› https://0x0.st/zenm.txt
<daveshah> Yeah, Yosys has some serious area issues for ECP5 at the moment
<daveshah> Mostly because the lack of proper LUT timings in ABC makes it much too eager to use muxes to build large LUTs
<bwidawsk> if I'm trying to paint open tools in a good light, should I do ice40 then?
<daveshah> I expect you'll find it much the same
<bwidawsk> :/
<daveshah> Probably a bit better, but we still definitely lag behind
<daveshah> Things will pick up once this PR is merged https://github.com/YosysHQ/yosys/pull/1098
<tpb> Title: WIP "abc9" pass for timing-aware techmapping (experimental, FPGA only, no FFs) by eddiehung · Pull Request #1098 · YosysHQ/yosys · GitHub (at github.com)
<daveshah> But it might be a month or two
<bwidawsk> for my sake, would it make sense to just merge it and try that out?
<bwidawsk> I don't care if it's in master so long as I can say in good faith, it will be
<daveshah> It's probably not giving a massive improvement yet, there is still some work that isn't even pushed
<daveshah> If you do try it, you'll need to add -abc9 to synth_ecp5 and synth_ice40
<bwidawsk> daveshah› actually, if you read the notes from the diamond log, it seems like they misreport the total number of luts
<bwidawsk> Notes:-
<bwidawsk> 1. Total number of LUT4s = (Number of logic LUT4s) + 2*(Number of
<bwidawsk> distributed RAMs) + 2*(Number of ripple logic)
<bwidawsk> it's a bit confusing
<daveshah> That is equivalent to in Yosys doing LUT4 count + 2*CCU2C count
<daveshah> If you want to see Yosys doing better in area, you can try adding -nomux to synth_ecp5
<bwidawsk> for diamond then, i should be doing 99 + 2 * 134, correct?
<daveshah> No the Diamond number of LUT4s is correct
<daveshah> Yosys has done badly and there's no escaping
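[Editor's note] The LUT accounting in this exchange can be checked with a few lines of arithmetic. The figures are the ones quoted in the log (99 logic LUTs, 67 carry SLICEs, 0 distributed RAM, 233 total for Diamond; 233 LUT4s plus 74 CCU2Cs for Yosys). Treating the 67 "SLICEs as Carry" as the count that gets doubled in Diamond's note is an interpretation, chosen because 2 * 67 matches the reported 134 ripple-logic LUTs. The function names are hypothetical.

```python
def diamond_total_lut4(logic_luts, dist_ram_slices, carry_slices):
    """Diamond's note: logic LUT4s + 2 * distributed RAMs + 2 * ripple logic."""
    return logic_luts + 2 * dist_ram_slices + 2 * carry_slices


def yosys_lut4_in_diamond_terms(lut4_cells, ccu2c_cells):
    """Each CCU2C contains two LUT4s that the Yosys LUT4 statistic leaves out."""
    return lut4_cells + 2 * ccu2c_cells


diamond = diamond_total_lut4(logic_luts=99, dist_ram_slices=0, carry_slices=67)
yosys = yosys_lut4_in_diamond_terms(lut4_cells=233, ccu2c_cells=74)
print(diamond, yosys, round(yosys / diamond, 2))
```

Under these assumptions Diamond's reported total already includes the carry LUTs, so it should not be doubled again, while the Yosys figure has to be raised before the two are compared.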
<bwidawsk> and presumably, Fmax goes up if you fix those somewhat
AlexDaniel has quit [Remote host closed the connection]
<daveshah> Yes
fsasm has quit [Ping timeout: 246 seconds]
<daveshah> Do you have the design somewhere?
<bwidawsk> it's your blinky from prjtrellis
<daveshah> bwidawsk: I'll push a PR tomorrow, turns out the subtractor mapping in Yosys was very suboptimal and hurting that design particularly badly
<daveshah> Should be down to more like 284 LUT4s in Diamond terms
<bwidawsk> daveshah› thanks!
<bwidawsk> if you add me to the cc, I would be happy to test it
vonnieda has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
jevinskie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
jevinskie has joined #yosys
<bwidawsk> sadly the soc_ecp5_evn project doesn't just work in diamond
rohitksingh has quit [Ping timeout: 272 seconds]
<bwidawsk> looks like it doesn't like how EHXPLLL is instantiated
<bwidawsk> it all looks right afaict...
tpb has quit [Remote host closed the connection]
tpb has joined #yosys