lutsabound has quit [Quit: Connection closed for inactivity]
dh73 has quit [Quit: Leaving.]
dh73 has joined #yosys
dh73 has quit [Client Quit]
nrossi has joined #yosys
Jybz has joined #yosys
Jybz has quit [Quit: Konversation terminated!]
tannewt has joined #yosys
rohitksingh has joined #yosys
FabM_cave has joined #yosys
acdimalev has joined #yosys
kraiskil has joined #yosys
fevv8[m] has quit [Write error: Connection reset by peer]
pepijndevos[m] has quit [Write error: Connection reset by peer]
promach3 has quit [Remote host closed the connection]
m4ssi has joined #yosys
fevv8[m] has joined #yosys
pepijndevos[m] has joined #yosys
promach3 has joined #yosys
jakobwenzel has quit [Quit: jakobwenzel]
jakobwenzel has joined #yosys
acdimalev has quit [Quit: .]
d0nker5 has quit [Ping timeout: 245 seconds]
d0nker5 has joined #yosys
d0nker5 has quit [Ping timeout: 240 seconds]
d0nker5 has joined #yosys
rohitksingh has quit [Ping timeout: 245 seconds]
rohitksingh has joined #yosys
anon93 has joined #yosys
pie_ has quit [Ping timeout: 276 seconds]
rohitksingh has quit [Ping timeout: 246 seconds]
citypw has quit [Ping timeout: 240 seconds]
anon93 has quit [Remote host closed the connection]
develonepi3 has joined #yosys
<develonepi3>
Hello all. Has anyone tried to use Docker to build the yosys tools (arachne-pnr, icestorm, yosys and nextpnr) for Raspberry Pi or for Ubuntu?
<ZirconiumX>
develonepi3: I think everything builds natively.
<ZirconiumX>
Though, don't use arachne-pnr anymore; nextpnr is superior.
<develonepi3>
Zirconiumx: Yes, I have built on both Rpi3B+ & Rpi4B but it still takes a long time. Was just wondering since I have older versions of nextpnr.
<ZirconiumX>
develonepi3: Are you using HeAP or SA?
<ZirconiumX>
develonepi3: It's a very memory-intensive build because it's creating binary blob databases containing all the routing information it needs
<develonepi3>
Zirconiumx: Don't think so.
<ZirconiumX>
`nextpnr-ecp5 --help` should say what the default choice is for `--placer`.
<develonepi3>
Zirconiumx: I am using nextpnr-ice40; it has no --placer option, so it is fairly old. Last update was on Aug 4.
<ZirconiumX>
Plus, consider that place and route is a slow operation anyway.
m4ssi has quit [Remote host closed the connection]
emeb has joined #yosys
pie_ has joined #yosys
develonepi3 has quit [Remote host closed the connection]
pie_ has quit [Remote host closed the connection]
<tnt>
Is it possible, after doing a 'read' of all the input files, to get yosys to write a single file that contains all it needs to pursue the synth without needing any other files? Goal is to read all sources (including any includes and init files for memories) on one machine, then ship that file to a build server that will do most of the work and just spit out the resulting .json.
pie_ has joined #yosys
FL4SHK has quit [Ping timeout: 250 seconds]
captain_morgan20 has quit [Ping timeout: 250 seconds]
captain_morgan20 has joined #yosys
<daveshah>
tnt: that is the aim of -E
<daveshah>
I don't know if it handles init files correctly
FL4SHK has joined #yosys
gorbak25 has quit [Ping timeout: 250 seconds]
<whitequark>
-E seems orthogonal
gorbak25 has joined #yosys
<whitequark>
I think doing read;hierarchy;write_ilang would work?
<tnt>
Yeah reading the description of -E it seems like it would just list all the files rather than make a single file containing everything.
<tnt>
whitequark: the ilang only contains a bunch of modules with "attribute \src "misc/xclk_strobe.v:36"" pointing to the actual sources :/
pie_ has quit [Ping timeout: 246 seconds]
<tnt>
Oh wait, seems hierarchy needs the -top option to really do anything and then it might do what I want.
<tnt>
I need to add read_verilog -D_ABC -lib +/ecp5/cells_sim.v +/ecp5/cells_bb.v as well for all the blackboxes, but that might work.
<whitequark>
tnt: yep, need -top or -auto-top
<whitequark>
the \src attribute is purely for debug info
<whitequark>
yosys never reads those paths
<tnt>
Not sure mem init are handled properly though :/
<whitequark>
they are
<whitequark>
there is no way to read an external init file from rtlil
<whitequark>
which is actually kind of unfortunate
<tnt>
Oh yeah, I see, indeed they are. My bad.
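The two-machine flow discussed above (read everything locally, elaborate with hierarchy -top, dump RTLIL, then synthesize elsewhere) could be sketched as a pair of generated Yosys scripts. This is only a sketch under assumptions: the file names, the top module name and the helper function names are hypothetical, the commands are the ones quoted in the chat plus read_ilang and synth_ecp5, and nobody in the channel confirmed the complete flow end to end. Newer Yosys releases call these commands write_rtlil and read_rtlil.

```python
def frontend_script(sources, top, out_file):
    """Commands for the local machine: read all sources, elaborate, dump RTLIL."""
    lines = [
        # Blackbox definitions for the ECP5 primitives, exactly as quoted in the chat.
        "read_verilog -D_ABC -lib +/ecp5/cells_sim.v +/ecp5/cells_bb.v",
    ]
    # One read_verilog per source file; includes and memory init files
    # are resolved here, on the machine that actually has them.
    lines += [f"read_verilog {src}" for src in sources]
    # hierarchy needs -top (or -auto-top) to elaborate the design.
    lines.append(f"hierarchy -top {top}")
    lines.append(f"write_ilang {out_file}")
    return "\n".join(lines) + "\n"


def backend_script(in_file, top, json_file):
    """Commands for the build server: load the RTLIL dump and synthesize."""
    lines = [
        f"read_ilang {in_file}",
        f"synth_ecp5 -top {top} -json {json_file}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file and module names, for illustration only.
    print(frontend_script(["top.v", "misc/xclk_strobe.v"], "top", "design.il"))
    print(backend_script("design.il", "top", "out.json"))
```

Each returned string can be saved to a file and passed to yosys -s. The \src attributes in the dump still name the original paths, but as noted above they are debug information only and are never read back.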
pie_ has joined #yosys
FabM_cave has quit [Quit: Leaving]
dys has joined #yosys
pie_ has quit [Ping timeout: 250 seconds]
pie_ has joined #yosys
rohitksingh has joined #yosys
nrossi has quit [Quit: Connection closed for inactivity]
<janrinze>
does nextpnr / yosys slice large memory into dp16kd per bit or does it do it 'wide'? I noticed that larger memory has a large number of muxes.
<tnt>
it tries different things
<tnt>
in the yosys output log you'll see the different config/layout it considered and the score it attributed to each and why it picked the one it did.
<janrinze>
I can use a block of 64Kx16 made of dp16kd at 140 MHz but the same amount with reg [15:0] memory[0:65535] will max out at 70 MHz.
<ZirconiumX>
Unfortunately memory_bram sucks.
<ZirconiumX>
This is one of the reasons why: at present it has no concept of RAM speeds.
<janrinze>
I'm trying to get my head wrapped around how the reg [15:0] memory[0:65535] gets converted to dp16kd. The code in nextpnr looks pretty obscure to me (forgive me my ignorance.)
dys has quit [Ping timeout: 250 seconds]
<ZirconiumX>
janrinze: the conversion is done in Yosys
<ZirconiumX>
In memory_bram
<daveshah>
Yosys does tend to prefer bit slicing large RAMs
<daveshah>
This should improve timing as it reduces the need for read decode muxes
<janrinze>
daveshah: that's precisely what I hoped. Unfortunately in the timing analysis I see that it has around 9 muxes. When using bit slicing it would only need 3, i.i.r.c.
<janrinze>
64K can be done with 4xDP16KD per bit. Since we have 4bit LUT it will require 3 LUT to select the correct output bit.
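The arithmetic in the last message can be checked with a small estimate. This is a rough sketch, not what memory_bram actually computes: the function name and the returned field names are made up for illustration, and the read mux is modelled as a plain tree of 2:1 muxes costing one LUT4 each, which reproduces the count of 3 quoted above but is only a simple upper bound on what a mapper may produce.

```python
import math


def bram_estimate(depth, width, block_depth=16384, block_width=1):
    """Estimate block RAM and read-mux cost for a depth x width memory.

    block_depth x block_width is the geometry of one block RAM,
    by default the 16K x 1 mode mentioned in the chat.
    """
    banks = math.ceil(depth / block_depth)    # blocks stacked along the address axis
    columns = math.ceil(width / block_width)  # blocks side by side across the data word
    # Selecting one of `banks` outputs with a tree of 2:1 muxes takes
    # banks - 1 muxes; assume one LUT4 per 2:1 mux.
    lut4_per_bit = banks - 1
    return {
        "blocks": banks * columns,
        "mux_inputs": banks,
        "lut4_per_bit": lut4_per_bit,
        "lut4_total": lut4_per_bit * width,
    }


if __name__ == "__main__":
    print(bram_estimate(65536, 16))  # bit-sliced 64K x 16
    print(bram_estimate(32768, 16))  # bit-sliced 32K x 16
```

Under these assumptions a bit-sliced 64Kx16 memory needs 64 blocks and a 4:1 read mux per data bit, while 32Kx16 needs 32 blocks and a single 2:1 mux per bit.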
<tnt>
look at the yosys output log ...
<tnt>
that might give a clue
<tnt>
It definitely picks 16k x 1 geometry, but there are still a lot of LUTs. Probably the semantics of DP16K don't perfectly match what it wants and it needs a bunch of external logic ...
<tnt>
One of the reasons I don't like inference. How do I express _i_don_t_care_ how r/w conflicts are handled, for instance?
<janrinze>
yes, it does pick 16kx1. but a lot of glue around it apparently.
<ZirconiumX>
janrinze: do the two sides of your RAM use different clocks?
<janrinze>
tnt: when both reading and writing are clocked then it's easy to use DP16KD
<janrinze>
ZirconiumX: In the VGA I use DP16KD directly since it doesn't require any init values. Also the VGA side is read-only.
<janrinze>
for the CPU I like to set up memory so it will run the bootloader or bare metal code out-of-the-box.
<tnt>
interestingly 32kx16 looks fine. but 64kx16 the lut goes from 21 to 139 ...
<janrinze>
hmm.. perhaps i should try to split it in two 32 KB blocks :D
<janrinze>
eehhr.. 32kx16 blocks x2 of course.
<tnt>
First thing: it generates a CE signal independently for each RAM, which is useless; it doesn't matter if the CE of the 'unused' RAM is enabled.
<tnt>
And then the 4:1 MUX structure is ... insane.
<tnt>
That's what it uses to mux the output of 4 RAM into the output bit ...
<daveshah>
Yes, the problem is that ABC can't use the slice mux structures properly
<tnt>
-nowidelut seems to "fix" it
<daveshah>
Particularly with ABC9 in a real design this will only be a big problem if the structure is on the critical path
<daveshah>
Otherwise it will likely be relaxed
<daveshah>
It also seems like the CE issue could be improved in memory_bram by registering the address MSBs rather than the decoded signals
kraiskil has quit [Ping timeout: 240 seconds]
<janrinze>
daveshah: that's exactly how i use it with the VGA memory. The latched address MSBs are used to select the appropriate bit for reading.
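The idea of latching the address MSBs and using them only to pick the output can be modelled behaviourally. The class below is a hypothetical illustration in Python, not code from Yosys or from janrinze's design: it assumes four banks with synchronous read, keeps every bank enabled on each read, and registers only the bank-select bits.

```python
class BankedRam:
    """Behavioural model of a memory built from several synchronous-read banks."""

    def __init__(self, bank_depth=16384, banks=4):
        self.bank_depth = bank_depth
        self.banks = [[0] * bank_depth for _ in range(banks)]
        self.bank_out = [0] * banks  # registered read data of each bank
        self.sel_q = 0               # registered copy of the address MSBs

    def write(self, addr, value):
        # Writes still have to be steered to exactly one bank.
        self.banks[addr // self.bank_depth][addr % self.bank_depth] = value

    def clock(self, addr):
        """One read clock edge: all banks read, only the MSBs are latched."""
        low = addr % self.bank_depth
        # No per-bank chip enable: reading an unselected bank is harmless.
        self.bank_out = [bank[low] for bank in self.banks]
        self.sel_q = addr // self.bank_depth

    def data(self):
        # The output mux is driven by the latched MSBs, not by decoded enables.
        return self.bank_out[self.sel_q]


if __name__ == "__main__":
    ram = BankedRam()
    ram.write(40000, 0x1234)
    ram.clock(40000)
    print(hex(ram.data()))
```

The only state added next to the banks is the small select register, so no decoded per-bank enable signals need to be generated or registered on the read side.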
<tnt>
Just tried -nowidelut on the supercon badge design. Hurts timing a bit it seems (on the super representative sample of 3 seeds ...) but reduces the pnr time by 15% and the slice count by 10%.
<daveshah>
Yes, nowidelut is effectively an area/timing tradeoff
<ZirconiumX>
I've been wondering about nowidelut on the Cyclone V. Since it's natively a LUT6 without being able to mux it higher, what would it even do?
<ZirconiumX>
There's the option of using LUT4s only, of which two can be packed into an ALM. Guess I'd have to conduct experiments