az0re has quit [Remote host closed the connection]
tpb has joined #symbiflow
Degi_ has joined #symbiflow
Degi has quit [Ping timeout: 240 seconds]
Degi_ is now known as Degi
<daniellimws>
mithro: Where?
<mithro>
@daniellimws: A bunch of people submitted pull requests to help fix https://github.com/SymbiFlow/ideas/issues/47 -- it's always good to have a second pair of eyes check that everything in the checklist is covered
<tpb>
Title: Updated the repositories to compile with the current copyright best practices · Issue #47 · SymbiFlow/ideas · GitHub (at github.com)
<daniellimws>
mithro: Oh sure
<daniellimws>
mithro: For the v2x repo you asked me to also add the "coding: utf-8" thing to the top of Python files. Should we require this for the Python files in other repos too?
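(For reference, the "coding: utf-8" declaration being discussed is the PEP 263 encoding marker, which must sit on the first or second line of a Python file; a minimal example:)

```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""Example module; the coding declaration must be on line 1 or 2."""
```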
<mithro>
@daniellimws Done!
az0re has joined #symbiflow
<daniellimws>
mithro: By third_party code, do you mean all the git submodules? For example, inside https://github.com/SymbiFlow/symbiflow-xc7z-automatic-tester there are devmemX, linux-xlnx, u-boot-xlnx, and zynq_bootloader; should they be put into a third_party directory?
<tpb>
Title: GitHub - SymbiFlow/symbiflow-xc7z-automatic-tester: Tool for automatically testing FPGA designs using a Zynq Series 7 board. (at github.com)
<mithro>
daniellimws: Yeap!
<sf-slack>
<timo.callahan> Hi @borisnotes, I'm trying to find an IRC client that works well; until then, please use *symbiflow* to ask me questions (I can see them, since the posts are echoed to Slack).
futarisIRCcloud has joined #symbiflow
citypw has joined #symbiflow
circ-user-VOF5C has joined #symbiflow
circ-user-VOF5C is now known as tcal
Bertl is now known as Bertl_zZ
<_whitenotifier-9>
[symbiflow-arch-defs] rakeshm75 opened issue #1436: Branch : Quicklogic : Primitives for the post-layout simulations - https://git.io/Jfk0Q
<tpb>
Title: tools: add symbiflow support and re-enable CI on all supported tools by acomodi · Pull Request #73 · SymbiFlow/fpga-tool-perf · GitHub (at github.com)
<tpb>
Title: GitHub - HackerFoo/vtr-nix-test: Using Nix to run VTR tests (at github.com)
<hackerfoo>
So now I can test algorithms as fast as I can make them.
<shapr>
exciting!
<shapr>
Seems like everyone I know is switching to nix to reduce compile times.
<hackerfoo>
I think the biggest improvement is moving data around automatically and keeping it organized. When running hundreds of configurations and flag settings, it's hard not to mess something up and draw the wrong conclusions.
<hackerfoo>
Or to do something dumb like making a code change or switching branches but forgetting to recompile.
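(The organization hackerfoo describes is essentially content-addressing: each run's output lives in a path derived from its exact inputs, so stale or mismatched results can't be confused. A rough Python sketch of the idea -- the names here are illustrative, not from any SymbiFlow tool:)

```python
import hashlib
import json
from pathlib import Path

def result_dir(config: dict, root: Path = Path("results")) -> Path:
    """Map a run configuration to a unique, reproducible output directory.

    Loosely mirrors what a Nix derivation does: the output path is a
    hash of every input, so runs with different flags (or a different
    branch) can never overwrite or shadow each other.
    """
    key = json.dumps(config, sort_keys=True).encode()
    return root / hashlib.sha256(key).hexdigest()[:16]

# Each of hundreds of flag settings gets its own directory:
print(result_dir({"branch": "master", "seed": 42, "flags": ["--timing-driven"]}))
```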
<shapr>
My company is switching to nix to reduce compile times.
<hackerfoo>
No, I don't.
<shapr>
is there some other way outsiders could support your efforts?
<hackerfoo>
ZirconiumX: I'd estimate ~2% from previous runs, but now I can manage load much better because I broke every run into its own derivation, where before a derivation might run a few tens of runs in parallel.
<ZirconiumX>
It's funny, because I seem to have a fairly different methodology for "improvements"
<hackerfoo>
I'm limiting tasks to 80 on a 96-core machine to avoid saturating it, but I might bring that up to 90.
<ZirconiumX>
I'd rather take the "latency" hit of multiple samples than limit "throughput"
<ZirconiumX>
So I use SPRT.
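(SPRT here is Wald's Sequential Probability Ratio Test: keep sampling until the evidence for "new is stronger" or "no improvement" crosses a decision bound. A minimal sketch for win/loss outcomes -- the p0/p1/alpha/beta values are illustrative, and yosys-testrunner's actual multinomial model, mentioned below, is more general:)

```python
import math

def sprt(wins, losses, p0=0.50, p1=0.55, alpha=0.05, beta=0.05):
    """Wald's SPRT for Bernoulli trials: H0 win prob p0 vs H1 win prob p1."""
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    llr = (wins * math.log(p1 / p0)
           + losses * math.log((1 - p1) / (1 - p0)))
    if llr >= upper:
        return "H1"   # decisive: the new version is stronger
    if llr <= lower:
        return "H0"   # decisive: no evidence of improvement
    return None       # undecided: run another sample
```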
<hackerfoo>
shapr (IRC): Google is funding the work on SymbiFlow (I'm an employee), but there are others, such as Dave Shah (https://www.patreon.com/fpga_dave), who are funded by the community.
<tpb>
Title: David Shah is creating open source FPGA tools | Patreon (at www.patreon.com)
<hackerfoo>
shapr (IRC): I can help you get started if you are interested in contributing work to the project.
<tpb>
Title: GitHub - ZirconiumX/yosys-testrunner: Program for robustly comparing two Yosys versions (at github.com)
<hackerfoo>
Maybe we could combine both approaches for even better data.
<hackerfoo>
ZirconiumX: Thanks.
<ZirconiumX>
The point of SPRT is to have bounded error while maintaining optimality
<ZirconiumX>
I can go to incredibly small error for an old versus new test because it's generally pretty decisive
<ZirconiumX>
And I change the nextpnr seed each run to explore the design space as much as possible
<ZirconiumX>
As much as anything, I'm not asking "how do these compare", I'm asking "is this stronger"
<ZirconiumX>
My current measurement for "stronger" is "higher Fmax", but I specifically picked a multinomial model to support multiple outcome conditions
<hackerfoo>
I have the advantage of unlimited cores, so there isn't a need to saturate them. Many tasks are limited by available RAM, so it might be best to stay at 80 cores, for 8GB per task.
<ZirconiumX>
Okay, sure
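(The load cap is straightforward to express; a hedged Python sketch using the 80-worker / 8 GB-per-task figures from the discussion above -- run_one is a placeholder for whatever invokes a single run:)

```python
from concurrent.futures import ProcessPoolExecutor

# Cap concurrency below the 96-core count so the machine never
# saturates; with RAM as the real limit, 80 workers leaves ~8 GB
# per task (the figures discussed above).
MAX_WORKERS = 80

def run_one(config):
    """Placeholder: invoke a single VPR (or similar) run for one config."""
    raise NotImplementedError

def run_all(configs):
    with ProcessPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(run_one, configs))
```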
<hackerfoo>
Right now my aim is to minimize runtime without sacrificing (too much) quality.
<hackerfoo>
The goal is to improve VPR's runtime by 10x.
<sf-slack>
<pgielda> @hackerfoo - this is what we already did with the (original) Distant approach (not using RBE) -- doing everything on one machine up to a certain point (configure, common parts), then just cloning the hard drive a lot of times, forking a lot of small machines in GCP, and running everything in parallel. Internally we were running it on Ryzens, which was even faster for single-core things (GCP did not have Epycs at that point, and even now I cannot spawn more than ~24 cores of n2s in total on GCP)
<sf-slack>
<pgielda> (I mean the "What would I do if I had 5 n1-highmem-96" thing)
<hackerfoo>
I had to fix a bug in Nix to handle waiting on more than 1k fds, but it should now be able to scale to thousands of cores.
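(The usual culprit behind a ~1k fd ceiling is select(), which is capped at FD_SETSIZE, typically 1024, while poll() has no such limit. Nix's actual fix lives in its C++ code; this Python sketch only illustrates the poll-based pattern:)

```python
import select

def wait_readable(fds, timeout_ms=1000):
    """Wait for readability on arbitrarily many descriptors via poll()."""
    poller = select.poll()           # no FD_SETSIZE cap, unlike select()
    for fd in fds:
        poller.register(fd, select.POLLIN)
    return [fd for fd, _event in poller.poll(timeout_ms)]
```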
zkms has quit [Quit: zkms]
zkms has joined #symbiflow
<sf-slack>
<pgielda> Cool
<sf-slack>
<pgielda> I have to play with Nix once I have time
<sf-slack>
<pgielda> Oh, the main reason why it was better to spawn smaller machines was that a lot of things were single-core anyway, and single-core performance drops once you are using more cores
<sf-slack>
<pgielda> In fact it's a pity nobody has something like GCP-style on-demand machines that have fewer cores, but cores that are super fast, specially cooled, even unreasonably expensive -- there are still workloads where you cannot really benefit from a lot of cores, but rather from a smaller number of really fast cores.
<sf-slack>
<pgielda> Ryzens/Threadrippers seem to be closest to that ;)
<daveshah>
For place and route, memory bandwidth/latency is another big constraint
<sf-slack>
<pgielda> (while still having a lot of cores ; ))
<sf-slack>
<pgielda> yes, true.
<daveshah>
which means more small machines almost certainly win there too
<sf-slack>
<pgielda> those machines had a lot of RAM
<sf-slack>
<pgielda> I have a lot of RAM everywhere just because of software like that; otherwise it's mostly eaten by the browser or not used ;)
<hackerfoo>
pgielda: I think they're all large machines anyway, so the smaller machines are just pieces of a bigger machine. So my theory is that 96 single-core machines will have more overhead than a single 96-core machine, be far less flexible, and require 96 disks instead of one large shared disk.
<sf-slack>
<pgielda> you can actually attach the same read-only disk to multiple machines in GCP
<hackerfoo>
And it would be worse if they were actually individual machines.
<sf-slack>
<pgielda> it just gets slower, but up to a certain point it scales
<sf-slack>
<pgielda> but I agree
<hackerfoo>
I thought of that. I think a ZFS disk can be shared read/write. I haven't tried it yet.
<sf-slack>
<pgielda> also, bigger drives are faster in GCP I think, so up to a certain point it makes sense to make the drive bigger than needed
<sf-slack>
<pgielda> but it's all slightly too complicated when you can run one big machine, I agree
<hackerfoo>
daveshah: I have a Hades Canyon NUC (i7, 32GB RAM, nvme) that outruns my Xeon workstation by 1.5x-2x single threaded, and probably beats the cloud machines by 2-4x. I think this is partly clock speed, but that doesn't entirely explain the difference. It might be better at handling the heat (ironically), because it's a 100C part with massive blowers on it.
<daveshah>
For short term single threaded workloads the i9 in my laptop beats my Threadripper desktop
<daveshah>
Definitely not the case for more than a minute or two though
<hackerfoo>
It would be interesting to collect data to determine the best machine. We could make an FPGA benchmark suite.
OmniMancer has joined #symbiflow
<_whitenotifier-9>
[vtr-xml-utils] acomodi opened issue #11: Add license automated check in Travis - https://git.io/JfITh
<_whitenotifier-9>
[symbiflow-xc7z-automatic-tester] acomodi opened issue #3: Add license automated check in Travis - https://git.io/JfITj
<_whitenotifier-9>
[python-sdf-timing] acomodi opened issue #40: Add license automated check in Travis - https://git.io/JfIke
<_whitenotifier-9>
[symbiflow-bitstream-viewer] acomodi opened issue #6: Add license automated check in Travis - https://git.io/JfIkv
<_whitenotifier-9>
[fasm] acomodi opened issue #22: Add license automated check in Travis - https://git.io/JfIkf
<_whitenotifier-9>
[symbiflow-arch-defs] acomodi opened issue #1438: Add license automated check in Travis - https://git.io/JfIkJ
<_whitenotifier-9>
[prjxray] acomodi opened issue #1306: Add license automated check in Travis - https://git.io/JfIkU
<_whitenotifier-9>
[sv-tests] acomodi opened issue #773: Add license automated check in Travis - https://git.io/JfIkI
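(A hypothetical sketch of the kind of automated license check these issues request -- the marker string and script are illustrative, and the checks actually added to each repo may differ:)

```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""Fail CI (e.g. a Travis job) if a tracked Python file lacks a license header."""
import subprocess
import sys

MARKER = "SPDX-License-Identifier"  # illustrative; a repo may grep for its own header text

def tracked_python_files():
    out = subprocess.run(["git", "ls-files", "*.py"],
                         capture_output=True, text=True, check=True)
    return out.stdout.splitlines()

def has_header(path):
    with open(path, encoding="utf-8") as f:
        return MARKER in f.read(2048)  # header must appear near the top

def main():
    missing = [p for p in tracked_python_files() if not has_header(p)]
    if missing:
        print("Files missing a license header:")
        print("\n".join(missing))
        sys.exit(1)

if __name__ == "__main__":
    main()
```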