clifford changed the topic of #yosys to: Yosys Open SYnthesis Suite: http://www.clifford.at/yosys/ -- Channel Logs: https://irclog.whitequark.org/yosys
<corecode> daveshah: sorry to keep asking you
<corecode> how do i find the extra_bits_db (padin_glb_netwk)
<corecode> hm, is it as easy as reading this file?
<corecode> yea i see numbers that are very similar, but one off, like for the 1k
seldridge has quit [Ping timeout: 258 seconds]
emeb has left #yosys [#yosys]
<corecode> heh, icecube says Internal Error: Assumption 'arch->IsPlaceable(y, x)' failed in plGraphConverter.cpp line 1359
<corecode> when placing a DFF at tile (7, 20)
emeb_mac has joined #yosys
citypw has joined #yosys
gsi__ has joined #yosys
gsi_ has quit [Ping timeout: 244 seconds]
leviathanch has joined #yosys
seldridge has joined #yosys
rohitksingh has joined #yosys
seldridge has quit [Ping timeout: 268 seconds]
pie___ has joined #yosys
rohitksingh has quit [Ping timeout: 268 seconds]
pie__ has quit [Ping timeout: 245 seconds]
jevinskie has joined #yosys
rohitksingh_work has joined #yosys
promach has quit [Quit: WeeChat 2.3-dev]
promach has joined #yosys
m_w has quit [Quit: Leaving]
develonepi3 has quit [Remote host closed the connection]
emeb_mac has quit [Quit: Leaving.]
rohitksingh_work has quit [Read error: Connection reset by peer]
rohitksingh_work has joined #yosys
dys has quit [Ping timeout: 245 seconds]
awordnot has quit [Quit: Ping timeout (120 seconds)]
awordnot has joined #yosys
leviathanch has quit [Remote host closed the connection]
<promach> ZipCPU: see the updated version https://i.imgur.com/oPAVWdh.png
<promach> wait, I just found another corner case :|
<tpb> Title: A signed multiply verilog code using row adder tree multiplier and modified baugh-wooley algorithm · GitHub (at gist.github.com)
<daveshah> corecode: yeah, just try a SB_GB_IO (or internal oscillator, probably for two of the cases) at each global input location
<daveshah> and see which extra_bit appears in the asc file
rohitksingh_work has quit [Read error: Connection reset by peer]
rohitksingh_work has joined #yosys
citypw has quit [Ping timeout: 258 seconds]
<sxpert> promach: image is gone
<promach> sxpert: see the gist, wait let me update the image
<sxpert> ah ok
rohitksingh_work has quit [Read error: Connection reset by peer]
<promach> there is still a bug when A_WIDTH != B_WIDTH
<sxpert> I see
<promach> sxpert: ok, it works now
<sxpert> the url probably changed
<promach> wait, let me update the code. give me 15 minutes
<tnt> I'm wondering if there are algorithms that specifically target LUT4 archs so that each layer uses all 4 inputs of the LUT and not just 2.
<tnt> like doing the partial multiply 2 bits at a time instead of 1 bit at a time.
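A rough sketch of what "2 bits at a time" could mean arithmetically: split one operand into 2-bit digits so that there are half as many partial products to add. This is unsigned only, the function names are made up for illustration, and it says nothing about how well the result actually maps onto LUT4s.

```python
def partial_products_radix4(a, b, b_width):
    """Return the shifted partial products of a * b, taking b two bits at a time.

    Unsigned operands only; b must fit in b_width bits.
    """
    if a < 0 or b < 0 or b >= (1 << b_width):
        raise ValueError("operands must be unsigned and b must fit in b_width bits")
    products = []
    for shift in range(0, b_width, 2):
        digit = (b >> shift) & 0b11          # one 2-bit digit of b (0..3)
        products.append((a * digit) << shift)
    return products


def multiply_radix4(a, b, b_width):
    """Sum the radix-4 partial products to get the full product."""
    return sum(partial_products_radix4(a, b, b_width))
```

With a 4-bit `b` this gives 2 partial products instead of the 4 that a bit-at-a-time scheme would produce.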
leviathanch has joined #yosys
<tpb> Title: A signed multiply verilog code using row adder tree multiplier and modified baugh-wooley algorithm · GitHub (at gist.github.com)
<promach> try it out and see whether this meets your needs
<promach> and probably find out some corner cases that my assert() is not capable of finding
<promach> sxpert : just try it out first
<promach> probably I will need to solve the induction bugs before knowing which assert() I had missed
<sxpert> as ZipCPU would say, have you tried formal methods ?
rohitksingh_work has joined #yosys
<promach> sxpert : induction is part of yosys-smtbmc
<promach> and yosys-smtbmc is a formal tool
<promach> and induction can help find bugs within the actual verilog source code itself as well as the formal code
<promach> sxpert : https://i.imgur.com/BLYZTi6.png should stay valid until induction shows me otherwise later
s_frit has quit [Remote host closed the connection]
s_frit has joined #yosys
<promach> use pencil and paper method to check this countermeasure
<promach> I cannot be 100 percent sure about the correctness of this countermeasure since I have not done a rigorous maths proof of it
<promach> and the code has not passed induction yet
rohitksingh_work has quit [Ping timeout: 245 seconds]
<corecode> yea i don't know what the colbufs are, so i don't know how to look for them in the files
s_frit has quit [Remote host closed the connection]
s_frit has joined #yosys
jevinskie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<tnt> corecode: AFAIR they're buffers that distribute global networks to various parts of the chip, and if some global isn't needed in some area, you can disable those buffers to save power.
<tnt> and I think nextpnr currently just enables them all, regardless of whether they're actually needed or not.
s_frit has quit [Remote host closed the connection]
s_frit has joined #yosys
rohitksingh has joined #yosys
jevinskie has joined #yosys
citypw has joined #yosys
seldridge has joined #yosys
jevinskie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
maikmerten has joined #yosys
jevinskie has joined #yosys
develonepi3 has joined #yosys
gsi__ is now known as gsi_
rohitksingh has quit [Ping timeout: 246 seconds]
gruetzkopf has quit [Quit: No Ping reply in 180 seconds.]
kerel has quit [Remote host closed the connection]
gruetzkopf has joined #yosys
<develonepi3> daveshah: I have seen your presentation several times and have enjoyed it very much. Your comments that "Most FPGA Development use closed-source tools, FPGA vendors don't document bitstreams." are right on point. You and others (ZipCPU and Clifford Wolf) have advanced the FPGA discipline more in the past few years than others have in decades. I think that we are now at a cusp where more people will start using FPGAs. I have been working on
<develonepi3> compressing numerical meteorological model data for many years. This work (Karhunen-Loeve transform (KLT) in the vertical direction and JPEG 2000 on XY slices) has been abandoned. I recently started working on Bare Metal for the Raspberry Pi3B+ using Ultibo. I think this is now achievable with your ECP5 efforts and the Raspberry Pi3B+ running Bare-Metal.
kerel has joined #yosys
m4ssi has quit [Remote host closed the connection]
jevinskie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rohitksingh has joined #yosys
<corecode> i doubt you'd see a performance improvement for compute between running linux or ultibo
<ZipCPU> corecode: I'm curious why you'd say that
develonepi3 has quit [Quit: Leaving]
<corecode> ZipCPU: because typically compute means that there is no kernel executing for most of the time
emeb has joined #yosys
dys has joined #yosys
develonepi3 has joined #yosys
<ZipCPU> ... and, go on
<corecode> given that the kernel doesn't run much at all, it is unlikely that you see performance differences
<ZipCPU> Sorry, I guess I misread your response. You meant between Linux and Ultibo, and I thought you meant (Linux and Ultibo) vs FPGA
<corecode> oh no
<daveshah> FWIW, if this is floating point heavy then there's little chance of the ECP5 beating the Pi, you'd probably need something much fancier, unless you are very clever about how you describe it
<corecode> hi daveshah
<daveshah> otoh I can easily see the ECP5 winning if you get it fixedpoint/integer
<daveshah> hi corecode!
<corecode> you're just the guy i was looking for
<corecode> i'm trying to button up this icestorm stuff
<ZipCPU> daveshah: When I last examined the algorithm, it was I/O (i.e. SDRAM) bound
<corecode> what am i looking for in the colbuf_logic output?
<corecode> because i got it running (except for one tile)
<corecode> but now i don't know what i am looking for
<daveshah> You should see 4-tuples (colbuf_x, colbuf_y, user_x, user_y)?
<daveshah> Hopefully colbuf_x and user_x are the same
<corecode> no, that must be a different script
<corecode> you mean colbuf_io?
<corecode> there are 3 different colbuf scripts
<daveshah> ah, I think you might need to use colbuf.py to parse the colbuf_logic output
<daveshah> ie, pass all the .exp files created by colbuf_logic to colbuf.py
<corecode> aha
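A small sketch of the sanity check daveshah describes, assuming the 4-tuples (colbuf_x, colbuf_y, user_x, user_y) have already been pulled out into Python tuples. The log does not show the actual text format colbuf.py prints, so no parsing is attempted here, and the sample coordinates are made up.

```python
from collections import defaultdict


def group_by_colbuf(entries):
    """Map each column buffer (x, y) to the sorted list of tiles it was seen driving."""
    driven = defaultdict(list)
    for colbuf_x, colbuf_y, user_x, user_y in entries:
        driven[(colbuf_x, colbuf_y)].append((user_x, user_y))
    return {colbuf: sorted(tiles) for colbuf, tiles in driven.items()}


def column_mismatches(entries):
    """Return entries where the colbuf and the user tile are in different columns."""
    return [e for e in entries if e[0] != e[2]]


# Made-up sample tuples, not taken from any real device.
sample = [(3, 4, 3, 1), (3, 4, 3, 2), (5, 4, 5, 1), (5, 4, 6, 1)]
print(group_by_colbuf(sample))
print(column_mismatches(sample))
```

Any entry reported by `column_mismatches` would be worth a second look, since colbuf_x and user_x are expected to agree.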
<corecode> last time i used it, i got assertion errors
<corecode> because icebox is missing some data
<corecode> this is a big maze
<corecode> what are those colbufs?
<daveshah> basically, the global network is split up into segments to save power
<daveshah> the colbufs are the buffers for a given line segment
<daveshah> there's an illustration at http://www.clifford.at/icestorm/io_tile.html
<tpb> Title: Project IceStorm IO Tile Documentation (at www.clifford.at)
<corecode> aaah! more documentation i didn't know about
<daveshah> the grey circles are the column buffers and the red lines indicate the tiles in which globals are driven by that buffer
<daveshah> there's a script to generate an svg like that somewhere too
<corecode> yea
tannewt has quit [Ping timeout: 252 seconds]
tannewt has joined #yosys
rohitksingh has quit [Remote host closed the connection]
<corecode> daveshah: aha! colbuf_io*.sh does not produce output with a ColBufCtrl line
<corecode> daveshah: so what's going on there
<daveshah> corecode: quite possible that the lm4k doesn't have IO colbufs (i.e. they are enabled all the time)
<corecode> does this chip not have ColBufCtrl in IO cells?
<corecode> and what the hell is up with (7,20), why did the icecube placer throw an internal assert
<daveshah> not something I've ever seen
<daveshah> perhaps that tile is broken
<corecode> that they noticed later?
<daveshah> within the realms of possibility
<corecode> so from reading the code in icebox, it seems that the other dies have e.g. the bottom IO tile connected to the colbuf
<corecode> because y==0 also maps to col buf source y=4
<corecode> but somehow when running the colbuf_io script, i don't see any colbufctrl - what does that mean?
<corecode> the signal has to be routed somehow?
<corecode> or they always route it directly, and not with a colbuf?
<corecode> why would this aspect be so different from the 5k
<daveshah> my guess is that it is always routed
<corecode> wait, maybe the problem is a different one
<corecode> so if i understand the colbuf_io code right, it sets up an IO cell that uses a clk, therefore a global network
<daveshah> yeah
<corecode> the input clock (from a different pin) will have to be routed via the global network
<corecode> or?
<corecode> because what i'm seeing is that the clk signal is routed via standard routing
<daveshah> ah, that explains that one
<daveshah> modify the example code to put an SB_GB in between clock pin and SB_IO
<daveshah> eg SB_GB gbuf (
<daveshah> .USER_SIGNAL_TO_GLOBAL_BUFFER(clk_in),
<daveshah> .GLOBAL_BUFFER_OUTPUT(clk)
<daveshah> );
<corecode> the ram code does that
<corecode> interesting that it worked for others?
<corecode> i'm also using the newer icecube, so maybe that's a difference
<daveshah> yes, quite possibly an icecube change
<corecode> thanks
<corecode> knowing what to look for really helps :)
<corecode> so how do i get the numbers for the extra_bits?
<corecode> there are some comments i don't quite get
<corecode> e.g.
<corecode> (1, 331, 143): ("padin_glb_netwk", "3"), # (1 3) (331 144) (331 144) routing T_0_0.padin_3 <X> T_0_0.glb_netwk_3
<corecode>
<corecode> where does the first tuple come from?
<daveshah> That comment is the text description of the bit from the GLB file
<daveshah> the first tuple comes from the .extra_bit in the asc or exp
<corecode> if it said (1, 331, 144) i'd understand it
<daveshah> There's an offset for some strange reason
<corecode> for some only?
<daveshah> I can't remember the specifics
<corecode> ok
<daveshah> So long as the first tuple comes from the asc/exp it should be fine
<corecode> so those i just need to place a global input and observe what extra bits are being set
<daveshah> Yes
<daveshah> in some cases, it might be an oscillator rather than a global input
<corecode> yes
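The "place a global input and see which extra bit shows up" step can be sketched as a diff between two asc files: one baseline design and one with a single global input (or oscillator) added. This assumes extra bits appear as lines of the form `.extra_bit 1 331 143` (three integers), matching the first tuple in the db entry quoted above. The helper names are made up for illustration.

```python
def parse_extra_bits(asc_text):
    """Collect every (bank, x, y) triple from '.extra_bit' lines in asc text."""
    bits = set()
    for line in asc_text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0] == ".extra_bit":
            bits.add(tuple(int(f) for f in fields[1:]))
    return bits


def new_extra_bits(baseline_asc, test_asc):
    """Extra bits present in the test design but not in the baseline."""
    return parse_extra_bits(test_asc) - parse_extra_bits(baseline_asc)


# Tiny hand-written asc fragments for illustration only.
baseline = ".device 1k\n"
with_global = ".device 1k\n.extra_bit 1 331 143\n"
print(new_extra_bits(baseline, with_global))
```

Repeating this once per global input location should give one new tuple each time, which is the key to record for that global network.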
<corecode> i guess i know what global network it is
<corecode> and gbufin locations i need as well, but i guess i should get that with the same test
<daveshah> I think the datasheet should have those
<daveshah> beware the gbufin locations in the datasheet are for global input pins
<daveshah> SB_GBs which drive from fabric are at the same locations, but drive a different network to the pin at that location
<corecode> that's also related to the padin_pio_db?
<daveshah> Yes, those are the input pins that drive each global
<corecode> it seems some dbs require a specific sequence, and i don't know what the sequence needs to match
<daveshah> padin_pio_db is in global network number order
<corecode> i'm not sure which terminology is what
<daveshah> "padin" refers to the dedicated route from a specific IO pin to a specific global network
<daveshah> "gbufin" refers to the route from fabric (the fabout into an IO tile) into a global network
m4ssi has joined #yosys
m4ssi has quit [Remote host closed the connection]
m4ssi has joined #yosys
<corecode> yea that was it, with the extra SB_GB the io tiles could be tested too
m_w has joined #yosys
maikmerten has quit [Remote host closed the connection]
leviathanch has quit [Remote host closed the connection]
m4ssi has quit [Remote host closed the connection]
FL4SHK has joined #yosys
indy has quit [Quit: ZNC - http://znc.sourceforge.net]
indy has joined #yosys
indy has quit [Quit: ZNC - http://znc.sourceforge.net]
show1 has joined #yosys
indy has joined #yosys
<corecode> i'm surprised there is no automation for the global network stuff
<corecode> or i am missing it
<daveshah> I don't think there is anything
<daveshah> As it is only 8 globals per device I don't think anyone bothered
<corecode> ok
<corecode> i guess i need to create a different footprint option to capture all information
<daveshah> Yes, quite possibly
<corecode> oh the pllauto script is fantastic
<daveshah> I did that one when doing the UltraPlus
<daveshah> It doesn't do routing, but that is pretty quick to figure out with icebox_vlog (and only needs one design)
<corecode> the ultralite is very similar
<corecode> i
<corecode> i'm just trying to verify that the values are the same
<daveshah> Yeah, very sensible
<corecode> i think the bit assignments are the same, pins/cells are different
<corecode> possibly the special different bits are different as well
<daveshah> The UltraPlus had a strange layout bitstream wise where one "half" was twice the height of the other
<corecode> and what do these extra bits do?
<daveshah> The 8 padin ones?
<corecode> no, the extra height ones
<daveshah> Oh, they are used to go from 3520 LUTs in the Ultra to 5280 in the UltraPlus
<corecode> ah
<corecode> is the lm4k the ultra?
<daveshah> No, that's the iCE40LM which is older
<daveshah> iCE5LP is Ultra
<corecode> what device string is that in icebox?
<corecode> 5k is ultraplus
<daveshah> lm4k is LM
<daveshah> Ultra isn't in icebox
<daveshah> iceunpack uses u4k for Ultra
<corecode> ah!
<corecode> but it is supported by nextpnr?
<daveshah> Neither are supported by nextpnr
<corecode> aaah
<corecode> so what does it take to go from icestorm to nextpnr?
<daveshah> Just looking at all the device specific cases and adding a new case
<daveshah> e.g. Adding the database import from icestorm to CMake and adding the device name
<corecode> that sounds quite moderate
<daveshah> Many of the cases will be the same as the up5k
<daveshah> It won't be much work at all
<corecode> oh i guess the IRDRV_BLOCK etc aren't even mapped?
<tpb> Title: nats / ice40up5k_base_project · GitLab (at code.electrolab.fr)
<nats`> the up5k is supported by nextpnr
<nats`> at least I made a small template to use it with yosys and nextpnr
<emeb> works great
<corecode> man this is still going to be quite a ways
<daveshah> corecode: no, you'll need to add support for those too
<daveshah> You can probably base that off the support for the UltraPlus RGB driver though
<corecode> what happens if i don't add support?
<daveshah> You can't use that primitive
<corecode> :D
<corecode> fine
<daveshah> Everything else will work fine
<corecode> i want to just get my design going
<corecode> if icecube wouldn't fall on its face, i wouldn't be shaving this yak
<corecode> hm LEDIP_BLOCK, RGBDRV_BLOCK, LEDDRVCUR_BLOCK, IRDRV_BLOCK
<corecode> what's what now?
<daveshah> LEDIP_BLOCK will be a PWM generator, basically just hard digital logic
<daveshah> RGBDRV_BLOCK will be a 3-channel constant-current driver for an RGB LED
<daveshah> Don't know what LEDDRVCUR_BLOCK is, nothing like that on the UltraPlus
<daveshah> IRDRV_BLOCK will be the IR driver
<daveshah> These should all have corresponding SB_ verilog primitives
<emeb> The Ultra parts had some sort of current source that had to be specifically hooked to the LED driver.
<emeb> Ultra Plus doesn't need that.
<corecode> i gotta say, this work is one of the least pleasurable things, and i've wasted a lot of time on obscure stuff
tpb has quit [Remote host closed the connection]
tpb has joined #yosys
<emeb> corecode: what are you doing?