Degi has quit [Quit: ZNC 1.6.6+deb1ubuntu0.2 - http://znc.in]
Degi has joined #yosys
enigma has quit [Quit: leaving]
anticw_ has quit [Remote host closed the connection]
anticw has joined #yosys
Degi_ has joined #yosys
Degi has quit [Ping timeout: 246 seconds]
Degi_ is now known as Degi
strongsaxophone has quit [Remote host closed the connection]
citypw has joined #yosys
Cerpin has quit [Read error: Connection reset by peer]
Cerpin has joined #yosys
emeb_mac has quit [Quit: Leaving.]
knielsen_ has joined #yosys
dys has joined #yosys
Jybz has joined #yosys
knielsen_ is now known as knielsen
N2TOH has quit [Ping timeout: 240 seconds]
N2TOH has joined #yosys
strongsaxophone has joined #yosys
emeb has joined #yosys
citypw has quit [Ping timeout: 256 seconds]
dys has quit [Ping timeout: 256 seconds]
strongsaxophone has quit [Remote host closed the connection]
strongsaxophone has joined #yosys
<grazfather>
Hm, from the readme: "type identifiers must currently be enclosed in (parentheses) when declaring signals of that type (this is syntactically incorrect SystemVerilog)". So what would that look like? I get errors with `(input bla::x) name` and other variations of what i put in parents
<grazfather>
parens
<ZirconiumX>
Maybe it's the "bla::x" bit?
<daveshah>
You would put just the bla::x in parens
strongsaxophone has quit [Ping timeout: 246 seconds]
strongsaxophone has joined #yosys
strongsaxophone has quit [Ping timeout: 265 seconds]
strongsaxophone has joined #yosys
voxadam has quit [Read error: Connection reset by peer]
voxadam has joined #yosys
<thardin>
yosys is deceptively good at optimizing things
<thardin>
I thought I was able to fit a 128x128-bit multiplier in a hx1k, but most of it got optimized out
<ZirconiumX>
thardin: it can always be made better :P
<thardin>
in reality 24x24-bit is closer to what the hx1k can actually do
<ZirconiumX>
If you can afford the timing delay, the UP5K has hardware multipliers
<thardin>
could probably do some pipelining
<thardin>
mostly just kind of getting a feel for what the different devices can do at the moment
<ZirconiumX>
The downside is that it's a UP5K :P
<thardin>
made a 24-bit Toom-3 squarer implementation, takes ~7 seconds just in yosys
<thardin>
99% LCs used :p
<ZirconiumX>
I have a fun testbench for combinational logic
N2TOH has quit [Ping timeout: 250 seconds]
N2TOH has joined #yosys
<ZirconiumX>
It calculates all possible chess moves on a chessboard in a single cycle through pure combinational logic
<thardin>
nice
<ZirconiumX>
3.8k LUT4s :P
<thardin>
I think I'll throw perf at yosys and see where it spends most of its time
<thardin>
what's the delay through all of those?
<ZirconiumX>
There are a couple of optimisation PRs out there
<thardin>
I see it's single-threaded. a bit of openmp #pragmas in strategic places would be an easy way to speed it up
<ZirconiumX>
Unfortunately it's not thread-safe
<ZirconiumX>
And making it thread-safe is not trivial
<ZirconiumX>
Internally, Yosys represents a netlist as a giant graph, and so any operation on this graph would need to be made thread-safe, so that no two threads could operate on the same node at the same time
<ZirconiumX>
Or else the graph needs to be partitioned
<ZirconiumX>
The easiest place to partition it would be at the module level
<ZirconiumX>
However, the modules get almost immediately flattened unless you pass `-noflatten`
<ZirconiumX>
(and not flattening is generally suboptimal)
<thardin>
building with ENABLE_DEBUG to hopefully get some symbols in my perf log
<ZirconiumX>
Actually, you get symbols
<ZirconiumX>
`make` outputs a file with full symbols
<ZirconiumX>
`make install` strips the installed executable but leaves the original intact
<ZirconiumX>
So the `yosys` in your source directory will have symbols
<thardin>
ah
<thardin>
already did make clean so, thumb twiddling time
<thardin>
I have ccache though, so future builds should be faster
N2TOH has quit [Ping timeout: 256 seconds]
<thardin>
I have found it can take a long time for yosys+nextpnr to figure out that "nope your design doesn't fit". would be nice if it could come to that conclusion faster
<ZirconiumX>
You can generally judge pretty quickly from the nextpnr "Device utilisation" info
<thardin>
mm
<qu1j0t3>
ZirconiumX: that is a nice benchmark! you should publish it
<thardin>
or just yosys taking more than 10 seconds or so
<mwk>
yosys doesn't even have the device geometry info
<mwk>
it only knows *what* resources exist on the target device, not how many
<ZirconiumX>
mwk: I feel like some `select -assert-max` might be a good rule of thumb
<mwk>
(it could perhaps be fixed some day, but that'd require some serious changes)
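
A minimal C++ sketch of the "fail fast" idea discussed above: compare per-cell-type usage against a capacity table and report the first type that overflows. This is not how Yosys or nextpnr work internally. As mwk notes, Yosys has no device geometry, so the capacity table, the `Counts` alias, and `first_overflow` are all hypothetical names introduced for illustration, and the numbers in the usage note are illustrative only.

```cpp
#include <map>
#include <string>

// Hypothetical: cell type name -> number of cells (used or available).
using Counts = std::map<std::string, long>;

// Returns the first cell type whose usage exceeds the given capacity,
// or an empty string if everything fits. A type that is missing from
// the capacity table is treated as having capacity 0.
std::string first_overflow(const Counts &used, const Counts &capacity) {
    for (const auto &kv : used) {
        auto it = capacity.find(kv.first);
        long limit = (it == capacity.end()) ? 0 : it->second;
        if (kv.second > limit)
            return kv.first;
    }
    return "";
}
```

For example, with an assumed capacity of 1280 `SB_LUT4` cells, a design using 3800 of them would be reported as overflowing on `SB_LUT4`, while one using 1000 would return an empty string. In an actual flow, the `select -assert-max` suggestion from the chat plays a similar role, with the limit supplied by hand.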
<ZirconiumX>
qu1j0t3: It's written in nMigen and maybe not the most readable thing ever
<qu1j0t3>
still!
<qu1j0t3>
also a good nmigen case study
<ZirconiumX>
I also have a very partial implementation of the PS2's GPU pipelines
peepsalot has quit [Quit: Connection reset by peep]
<thardin>
15% of yosys-abc's time is spent just in clock_gettime()
<ZirconiumX>
"yosys-abc" != "yosys"
<ZirconiumX>
ABC is the bit performing LUT mapping
peepsalot has joined #yosys
<thardin>
I see
<thardin>
for yosys malloc() and free() accounts for 16%
<ZirconiumX>
The wonderful thing about ABC is that ABC is a wonderful thing /s
<thardin>
let's see what callgrind thinks
<thardin>
dumdidum.. taking a few minutes
<thardin>
there we go
<thardin>
RTLIL::put_reference dominates self%
<thardin>
SigSpec::~SigSpec() also accounts for quite a lot
<ZirconiumX>
daveshah: I think much of the spaced out solution that HeAP provides gets ""optimised"" by the SA refinement pass
<ZirconiumX>
Since the total solution wire lengths after SA aren't that far apart
<daveshah>
Yes, that could be a problem
<daveshah>
You could go even lower with beta to see what that does
<daveshah>
The SA refinement pass does ultimately have quite a small radius
<ZirconiumX>
Well, I'll see if 0.6 routes, but I'm not confident because the difference post-SA is 2000 wirelen units
<thardin>
changing to in(const IdString& rhs) reduces the number of calls to put_reference from 61.6M to 45.4M
<daveshah>
thardin: that seems like a very worthwhile change then
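
A rough C++ sketch of why switching a parameter to `const&` can cut reference-count traffic, as thardin measured. `Handle` is a hypothetical stand-in for a reference-counted type and is not Yosys's actual `IdString`. It only counts how many "get" and "put" operations happen, under the assumption that every copy takes a reference and every destruction releases one.

```cpp
// Hypothetical reference-counted handle; counts every get/put operation.
struct Handle {
    static long ref_ops;
    int id;
    explicit Handle(int i) : id(i) { ++ref_ops; }      // "get reference"
    Handle(const Handle &o) : id(o.id) { ++ref_ops; }  // "get reference" on copy
    ~Handle() { ++ref_ops; }                           // "put reference"
};
long Handle::ref_ops = 0;

// Pass by value: each call copies both arguments and destroys the copies.
bool same_by_value(Handle a, Handle b) { return a.id == b.id; }

// Pass by const reference: no copies, so no reference-count operations.
bool same_by_ref(const Handle &a, const Handle &b) { return a.id == b.id; }

// Returns how many reference-count operations `calls` comparisons cost.
long count_ref_ops(bool by_ref, int calls) {
    Handle x(1), y(2);
    long before = Handle::ref_ops;  // ignore the two constructions above
    for (int i = 0; i < calls; ++i) {
        if (by_ref)
            (void)same_by_ref(x, y);
        else
            (void)same_by_value(x, y);
    }
    return Handle::ref_ops - before;
}
```

In this sketch each by-value call costs four operations (two copies, two destructions) and each by-reference call costs none, so `count_ref_ops(false, 1000)` returns 4000 and `count_ref_ops(true, 1000)` returns 0. The real saving in Yosys depends on which call sites actually copied, which is why the measured drop was partial rather than total.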
<thardin>
yeah
N2TOH has joined #yosys
<thardin>
commented on PR 1767
<ZirconiumX>
0.6 still fails to route
<az0re>
Does anyone know if there is any support at all for solving exists-forall problems directly in yosys (i.e. not yosys-smtbmc)? Nothing I have tried with the `sat` command has worked. Instead I get errors like: "ERROR: Failed to import cell $allconst$3 (type $allconst) to SAT database."
<thardin>
std::map<RTLIL::Cell*, RTLIL::Cell*, CompareCells> sharemap(CompareCells(this)); in opt_merge.cc now sticks out. ~13% of runtime
<thardin>
lots of methods in SigPool could also use some const&
<ZirconiumX>
...Dang, even after removing the SA refinement pass it still fails to route with beta = 0
<ZirconiumX>
I think the design is just too difficult for ECP5 to route
<daveshah>
Yup
<ZirconiumX>
would -nowidelut help here?
<daveshah>
Probably
<ZirconiumX>
So, 24%, let's see how this goes.
<ZirconiumX>
Wire length is already way down
<ZirconiumX>
Nope :(
<ZirconiumX>
Wonder how Quartus handles it.
<thardin>
got tripped up by the way kernel/ gets copied on every build
<thardin>
"why is my header change getting reverted?"
<ZirconiumX>
"Info (170202): The Fitter performed an Auto Fit compilation. No optimizations were skipped because the design's timing and routability requirements required full optimization."
<ZirconiumX>
Proud of that
<thardin>
inlining some getters and making use of const& in some places reduces the instruction count of rmunused_module_signals() from 2800M to 2000M
<thardin>
probably more overall
<thardin>
>I have a patch for that incoming
<thardin>
scooped again!
<ZirconiumX>
You can try some openmp pragmas if you really want.