clifford changed the topic of #yosys to: Yosys Open SYnthesis Suite: http://www.clifford.at/yosys/ -- Channel Logs: https://irclog.whitequark.org/yosys
tpb has quit [Remote host closed the connection]
tpb has joined #yosys
maartenBE has quit [Ping timeout: 240 seconds]
Degi has quit [Ping timeout: 264 seconds]
maartenBE has joined #yosys
Degi has joined #yosys
emeb_mac has quit [Ping timeout: 265 seconds]
emeb has quit [Ping timeout: 240 seconds]
emeb has joined #yosys
emeb has quit [Ping timeout: 265 seconds]
emeb has joined #yosys
emeb_mac has joined #yosys
cr1901_modern has quit [Ping timeout: 260 seconds]
emeb has quit [Quit: Leaving.]
citypw has joined #yosys
cr1901_modern has joined #yosys
SpaceCoaster has joined #yosys
FFY00 has quit [Remote host closed the connection]
FFY00 has joined #yosys
az0re has joined #yosys
craigo has joined #yosys
xtro has quit [Quit: Lost terminal]
FL4SHK has quit [Ping timeout: 246 seconds]
FL4SHK has joined #yosys
emeb_mac has quit [Quit: Leaving.]
craigo has quit [Quit: Leaving]
FL4SHK has quit [Ping timeout: 256 seconds]
craigo has joined #yosys
Asu has joined #yosys
N2TOH_ has joined #yosys
N2TOH has quit [Ping timeout: 260 seconds]
kraiskil has joined #yosys
kraiskil has quit [Ping timeout: 256 seconds]
kraiskil has joined #yosys
<pepijndevos> daveshah, in Gowin it seems pip delay depends on fanout. How is this represented in nextpnr(-generic)?
kraiskil has quit [Ping timeout: 240 seconds]
<Lofty> pepijndevos: in nextpnr's API, that's up to you; you can probably model it as a base + constant * fanout model
<daveshah> This isn't something nextpnr-generic supports at the moment
<Lofty> But pip delay in general depends on fanout due to capacitance, I think
<daveshah> Yes although not all arches model this
<pepijndevos> right, that was more or less what I expected heh
<pepijndevos> ty
<daveshah> ECP5 does so that would be a starting point if you want inspiration
<pepijndevos> ah good to know
<pepijndevos> I think I'll keep postponing a gowin nextpnr target a while longer until at least clock routing works
<pepijndevos> I'll probably ping you at that time for some recommendations for a good starting point for a new arch.
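A minimal sketch of the "base + constant * fanout" pip delay model Lofty describes above. The function name and the numeric values are hypothetical, chosen only for illustration; this is not nextpnr's API, and nextpnr-generic does not support fanout-dependent delay at the time of this log.

```python
def pip_delay_ns(base_ns: float, per_fanout_ns: float, fanout: int) -> float:
    """Estimate pip delay as a fixed base plus a term linear in fanout.

    All parameters are illustrative assumptions, not measured Gowin
    or ECP5 figures.
    """
    if fanout < 0:
        raise ValueError("fanout must be non-negative")
    return base_ns + per_fanout_ns * fanout


if __name__ == "__main__":
    # Hypothetical numbers: 0.2 ns base, 0.05 ns per driven sink.
    for fanout in (1, 4, 16):
        print(fanout, pip_delay_ns(0.2, 0.05, fanout))
```

The linear term stands in for the extra capacitive load each additional sink adds; an architecture that needs more accuracy could swap in a lookup table indexed by fanout.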
citypw has quit [Ping timeout: 240 seconds]
<pepijndevos> ugh... seems Ghidra somehow forgot the analysis data, so I guess I'll be working on something else...
citypw has joined #yosys
<tux3> Aw, I can't get Yosys to accept clocks in interfaces anymore, not even with hacky workarounds :/
<tux3> >Handling const CLK on $memory$flatten\<snipped path>.mem[437]$55298 ($dff) from module top (removing D path).
<mwk> hmm
<mwk> could you show an example?
<tux3> I haven't minimized it, but essentially I use my AXI4-lite interface to talk to a small SRAM, so I have a `module my_sram(axi4lite.slave bus)`, and some master talks to it
<tux3> Well, I did have this bug open https://github.com/YosysHQ/yosys/issues/1592 originally, but at the time I could workaround it by putting my clock in a modport
<tpb> Title: Clock in interface port mis-synthesized away (but accepted in modport) · Issue #1592 · YosysHQ/yosys · GitHub (at github.com)
<tux3> It's probably not an easy fix or a recent regression, so I might just keep this project on the proprietary toolchains for a while more /shrug
<mwk> just to be sure, does it work with 4a05cad7f8a6ee57292e5360eb06305e13fc308b?
<mwk> because it may indeed be interfaces, or it may be my screwup in refactoring the pass that detects const clocks in the first place
<mwk> hmmm
<mwk> as for the original bug, it seems to be an issue with opt_clean
<mwk> you're in luck, I have repairing this godforsaken pass on my immediate todo list
<tux3> yay. still compiling
<tux3> waiting on abc, it seems to take a lot longer on 4a05cad7f8a6ee57292e5360eb06305e13fc308b?
<tux3> yosys is now using 9.5GB RAM and rising, starting to wonder if a BRAM turned into a reg =]
<tux3> oh it's done. very different output/performance, but same result on 4a05cad7f8a6ee57292e5360eb06305e13fc308b
<mwk> ... weird
<tux3> hhhhm
<tux3> uh
<mwk> ... hmmm
<mwk> so I fixed the clean issue, and it no longer nukes the submodule, but... clock is still not connected
<mwk> not good
<tux3> I'm saying it "failed" because i see ICESTORM_RAM: 12/ 32, which is what I had when my clock got removed
<tux3> But uh, actually on 4a05 I have ICESTORM_LC: 138640/ 7680 1805%
<whitequark> try (*ram_block*)
<tux3> So maybe it actually "worked!"
<whitequark> this forces the memory into a BRAM or fails the build
<mwk> okay so forget about the opt_clean thing
<mwk> that's a legitimate bug triggered by your testcase, but there clearly is another problem, in the SV frontend
<mwk> and it was about (* keep *) wires getting removed, so nothing that would matter for synthesis
<tux3> assuming (*ram_block*) \n logic [data_width-1:0] mem [(1<<addr_width)-1:0]; is the correct syntax, synthesizing
<tux3> ERROR: cell type '$mem' is unsupported (instantiated as 'foo.bram_rdata[1]_$mem_RD_DATA_4')
<mwk> ... that's not the best error message ever, but it does mean that blockram inference is failing
<tux3> More info above in the log: https://paste.debian.net/1159736/
<tpb> Title: debian Pastezone (at paste.debian.net)
<mwk> ... we'd really have to look into the design
<tux3> Happy to send it over if it helps, but fair warning the code is uh not very good
<tux3> It's just a toy riscv core at about 7k lines
<whitequark> compiler maintainers don't really care about code quality for the most part
<mwk> whatever it is, I've seen worse
<whitequark> also that
<daveshah> The worse the code is the more bugs it usually finds :)
<mwk> (and if not, I reserve the right to tell random people the war story)
<whitequark> at worst we might point out non-synthesizable constructs
<whitequark> but as long as it's all valid synthesizable verilog i honestly can't be bothered even thinking whether it's elegant or not
<daveshah> In terms of Verilog, dodgy async stuff is often interesting from a finding weird edge cases point of view
<whitequark> right, that's a different point of view :)
<tux3> well my testbenches pass, and avhdl is happy with it, but it's entirely possible I accidentally have nonsense verilog
<tux3> let me tar something up with a Makefile that doesn't require my custom tools to build
<mwk> for completeness, I've opened https://github.com/YosysHQ/yosys/pull/2337 for the opt_clean issue affecting your example in #1592, but... this actually doesn't fix the main problem
<tpb> Title: opt_clean: Fix module keep rules. by mwkmwkmwk · Pull Request #2337 · YosysHQ/yosys · GitHub (at github.com)
<mwk> (the device module & instantiation is now kept alive as expected, but the clock/reset lines are unconnected)
<tux3> I've kept the (*ram_block*) so this just fails to build, but the "correct" result is if nextpnr reports something like 28/32 BRAMS used (or, I guess, 1800% utilization)
<tux3> the offending memory is at ./src/dev/axi4lite_sram.sv:42
<mwk> tux3: well at least the memory inference failure is quite obvious
<mwk> you have an asynchronous read port, so using blockram isn't possible
<mwk> ideally you should read synchronously from memory in the same module where it is defined, but yosys with flatten gives you a bit more freedom
<mwk> it would be fine if you had an async read port directly feeding a register, but you have a problem at this line: wire [data_width-1:0] rdata = bram_rdata[bram_read_index];
<tux3> oh right, I wanted to move the reg outside the bram module, I was hoping flatten would see it
<mwk> this inserts muxes between the $mem and the $dff, preventing merging it
<tux3> Makes sense
<tux3> I can just make my bus output comb and put it after the mux, then. Shouldn't be a problem
<tux3> Thanks, I should have known this =]
craigo has left #yosys ["Leaving"]
<mwk> I don't quite understand what you want to do
<tux3> I guess writing `.unregistered(0)` at line 44 is a good enough "fix" to clear that issue, even if it's technically wrong
<mwk> yeah you'd have to do some pipeline fixing
<mwk> also I don't know why you're so carefully splitting the bram into bram_blocks, that's something yosys does for you
<tux3> for my caches the block size is tied to tag size, addr space, etc, I had in mind to make it "portable" by having each arch set reasonable parameters
<tux3> not sure if that makes sense, I just got used to building things up from blocks and arch-specific params
<tux3> I'm still not sure how the async ram "works" with 4a05cad but on master the whole module gets optimized out. But I guess keeping the clean synchronous ram works all the time, so I'll just do that.
<tux3> I really appreciate the help, thank you. (And sorry that I don't have a more interesting bug to show for it!)
_whitelogger has joined #yosys
emeb has joined #yosys
SpaceCoaster has quit [Read error: Connection reset by peer]
maartenBE has quit [Ping timeout: 265 seconds]
maartenBE has joined #yosys
<pepijndevos> daveshah, I plotted wire delay for gowin, and it seems offset of 0.5 is pretty good, but the proportional part is more like 0.05, correct? https://ibb.co/bXTnSPL Y is ns, X is wire length
<tpb> Title: delay — ImgBB (at ibb.co)
<pepijndevos> Assuming wire length is measured in grid units
<daveshah> Yes, that sounds believable
<daveshah> It is manhattan distance yeah
<pepijndevos> ok thanks
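A rough sketch of the linear wire delay estimate discussed above: an offset of 0.5 ns plus about 0.05 ns per grid unit of Manhattan distance. The two constants come from pepijndevos's plot as quoted in the log; the function names and the coordinate representation are assumptions made for this example.

```python
from typing import Tuple

Coord = Tuple[int, int]  # (x, y) tile position in grid units (assumed layout)


def manhattan_distance(src: Coord, dst: Coord) -> int:
    """Manhattan distance between two tile coordinates, in grid units."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def estimate_wire_delay_ns(src: Coord, dst: Coord,
                           offset_ns: float = 0.5,
                           per_unit_ns: float = 0.05) -> float:
    """Linear delay estimate: fixed offset plus a term proportional to distance."""
    return offset_ns + per_unit_ns * manhattan_distance(src, dst)


if __name__ == "__main__":
    # Distance |3-0| + |4-0| = 7 grid units.
    print(estimate_wire_delay_ns((0, 0), (3, 4)))
```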
<pepijndevos> hmmm, so gowin wires can be tapped at the ends and halfway, so I'm not really sure how that works timing wise. They don't list delays for e.g. 4 distance
<pepijndevos> Although... I'm not sure but probably the actual wire length doesn't matter so much as the parasitics of the wire
<pepijndevos> huh... I'm confused... so there is a tile full of muxes that select from many inputs to a fixed output. Does the pip delay correspond to that output, or to the input it selects from?
<pepijndevos> I'm assuming the former
citypw has quit [Ping timeout: 240 seconds]
kraiskil has joined #yosys
thardin has quit [Ping timeout: 265 seconds]
_whitelogger has joined #yosys
kraiskil has quit [Ping timeout: 240 seconds]
m4ssi has joined #yosys
Asu has quit [Quit: Konversation terminated!]
<awygle> has anybody tried replicating the numbers from table 3.20 in the ECP5 datasheet with nextpnr and/or diamond?
<awygle> it's a list of "basic functions" and their "register-to-register performance"
<daveshah> Nope
<daveshah> Not that I know of
m4ssi has quit [Remote host closed the connection]
<awygle> i got 207.3 MHz for a 64-bit adder as compared to the 441 MHz suggested in the datasheet, using nextpnr
<awygle> (and yosys, which is probably more relevant now that i think about it)
<awygle> abc9 improves it slightly to 217.34 MHz
<daveshah> It might be related to register packing or something
<daveshah> There may also be issues with parts being pulled apart by the connections to the IO
<awygle> could be. i have inputs->reg x2->adder->reg->output
<daveshah> That is probably OK
<daveshah> Is the speed grade the same as the datasheet?
<awygle> yep, i'm running --speed 8
<awygle> and the table says "-8 timings"
<daveshah> I see
<daveshah> It's probably suboptimal register placement for some reason
<daveshah> But quite frankly I have bigger things to worry about than microbenchmark performance
<awygle> yeah, definitely
<awygle> i just was curious about achievable speeds
<awygle> i don't have a 64-bit wide datapath in my critical section anyway so it's more or less irrelevant
<daveshah> The register placement here is quite different to a real design anyway
<daveshah> As the influence from the IO is going to be much higher
<awygle> yeah
FL4SHK has joined #yosys
<awygle> i'm gonna increase the "registers between this and the I/O" to like 16 each, and see if that changes anything
<awygle> and then i'll be done playing with it
<daveshah> You could also try --out-of-context --placer sa to disable IO insertion (and work around the resulting probably singular matrix)
<awygle> ok, will do
<awygle> that got me to ~250 MHz
<awygle> that's close to what they have for "64 bit counter"
<awygle> so i wonder if their 64-bit adder is actually a half-adder
cr1901_modern has quit [Ping timeout: 265 seconds]
<daveshah> Yeah, a counter and adder shouldn't be so different
<awygle> on a 64-bit counter i got 263 MHz, so yeah i think it's fair to say they're cheating
<awygle> er, 269 MHz rather, so beating their claimed 263 MHz
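To put the Fmax figures above side by side, it can help to convert them to clock periods. The sketch below only does that arithmetic on the numbers quoted in the log; the helper name and the labels are hypothetical.

```python
def period_ns(fmax_mhz: float) -> float:
    """Clock period in nanoseconds for a given Fmax in MHz."""
    if fmax_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / fmax_mhz


if __name__ == "__main__":
    # Figures as quoted in the conversation.
    results = {
        "64-bit adder, nextpnr": 207.3,
        "64-bit adder, abc9": 217.34,
        "64-bit adder, datasheet": 441.0,
        "64-bit counter, nextpnr": 269.0,
        "64-bit counter, datasheet": 263.0,
    }
    for name, mhz in results.items():
        print(f"{name}: {mhz} MHz -> {period_ns(mhz):.3f} ns")
```

Seen as periods, the adder gap is roughly 4.8 ns measured against roughly 2.3 ns claimed, while the counter results differ by well under 0.1 ns.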
<awygle> oh daveshah, do you have any idea about the edgeclk vs sclk question i asked the other day? the TN for high-speed I/O says the SCLK topology must be used for frequencies <250 MHz and the eclk must be used for frequencies >400 MHz, does that mean i can do whatever in between those two?
<daveshah> No idea
<daveshah> litedram also uses edge clock at 100MHz+ fine
<awygle> mk, i'll probably just design for eclk then
<daveshah> I didn't know that rule was even a thing
<awygle> makes the timing a lot looser
<awygle> oh but litedram is using the DQS stuff so it probably doesn't count come to think of it
<daveshah> maybe
<daveshah> depends what they mean by SCLK/ECLK topology really
<awygle> it's section 5 of TN-02035-1.2 that i'm looking at, if you feel compelled to investigate
<awygle> but don't worry about it on my account
emeb_mac has joined #yosys
strongsaxophone has joined #yosys
kristianpaul has quit [Read error: Connection reset by peer]
kristianpaul has joined #yosys
cr1901_modern has joined #yosys
kraiskil has joined #yosys
kraiskil has quit [Ping timeout: 246 seconds]
m4ssi has joined #yosys
m4ssi has quit [Remote host closed the connection]
cr1901_modern has quit [Ping timeout: 240 seconds]
cr1901_modern has joined #yosys
strongsaxophone has quit [Quit: Lost terminal]
m4ssi has joined #yosys
m4ssi has quit [Remote host closed the connection]
lf_ has quit [Ping timeout: 244 seconds]
lf has joined #yosys
emeb has quit [Quit: Leaving.]