##openfpga on 2019-01-04 — irc logs at freenode.irclog.whitequark.org

00:04 ayjay_t has quit [Read error: Connection reset by peer]

00:05 ayjay_t has joined ##openfpga

00:06 Zorix has quit [Quit: Leaving]

00:07 Zorix has joined ##openfpga

00:12 mumptai has quit [Quit: Leaving]

00:13 catdemon has joined ##openfpga

00:14 catplant has quit [Quit: WeeChat 2.2]

00:17 jcreus has quit [Ping timeout: 246 seconds]

00:20 catdemon is now known as catplant

00:21 futarisIRCcloud has quit [Quit: Connection closed for inactivity]

00:38 Bike has joined ##openfpga

01:03 Vincenttl has joined ##openfpga

01:19 pie__ has quit [Remote host closed the connection]

01:20 pie__ has joined ##openfpga

01:34 <whitequark> tnt: what would you prefer instead of pins being forced to 0?

01:34 <whitequark> tristate?

02:07 balrog has quit [Quit: Bye]

02:10 jevinskie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

02:11 <sorear> probably excessively clever idea: since the ice40 bitstream is documented, we can use gateware to do processing/validation of it, which could be done on tinyfpga/tomu type boards to make them much more difficult to brick - a bootloader could refuse to overwrite itself and refuse to run a bitstream which enables the SPI I/O

02:12 balrog has joined ##openfpga

02:13 jevinskie has joined ##openfpga

02:16 <sorear> obvious problem 1: langsec really does not like it when you run the same bitstream through two parsers, one of which is closed source

02:17 <sorear> obvious problem 2: it's unclear how much of a problem "you can write a bitstream that kills the flash" is in practice, and "you can overwrite the bootloader using the bootloader" has a much simpler fix in isolation, which I believe is at least partially implemented

02:17 <whitequark> sorear: you can't actually do that

02:17 <whitequark> the bitstream is not a bitstream

02:17 <whitequark> it's a sequence of packets that address CRAM in blocks

02:17 <whitequark> you can't parse it in gateware

02:19 <sorear> I am aware that the format has an annoying amount of order flexibility. We only need to handle the packet orderings which are actually generated by nextpnr and icecube, although the general case doesn't seem impossible to me

02:20 <sorear> to handle the general case you'd have to reimplement the state machine and check each CRAM block against both address and content

02:20 <sorear> to NOT handle the general case you'd look for the expected commands in the expected order
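
The non-general case sorear describes (accept only the command orderings that the known tools emit) can be prototyped in software before committing it to gateware. The sketch below is only an illustration: the packet layout (one header byte, opcode in the high nibble, payload length in the low nibble), the opcode values, and the function names are assumptions made for this example and should be checked against the documented ice40 format before relying on them.

```python
from typing import Iterator, List, Tuple

# Hypothetical opcodes, chosen only for this sketch.
OP_SET_BANK = 0x1
OP_WRITE = 0x2
OP_WAKEUP = 0x3


def packets(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Split a byte string into (opcode, payload) packets.

    Assumed layout: header byte = (opcode << 4) | payload_length.
    """
    i = 0
    while i < len(data):
        opcode, length = data[i] >> 4, data[i] & 0x0F
        payload = data[i + 1 : i + 1 + length]
        if len(payload) != length:
            raise ValueError("truncated packet")
        yield opcode, payload
        i += 1 + length


def matches_expected_order(data: bytes, expected: List[int]) -> bool:
    """Accept the stream only if its opcodes appear exactly in the expected order."""
    try:
        return [op for op, _ in packets(data)] == expected
    except ValueError:
        return False
```

A gateware version would be a small state machine stepping through the same expected list; checking the payloads (addresses and contents) is the part this sketch leaves out.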

02:23 <sorear> (trivially, the chip itself is a digital circuit which turns a bitstream into an initialized config memory, so "you can't parse it in a digital circuit" is absurd; the trivial solution is not feasible because it requires more RAM than the total BRAM, but a nontrivial solution is)

02:24 <sorear> (I'm not sure whether you include "synthesized hard logic" in the definition of "gateware")

02:26 <whitequark> hrm

02:26 <whitequark> fine

02:26 <ZipCPU> sorear: So ... you want to validate a stream of data with a configuration that has less information than the data stream, and you want to prevent writing the flash on a bad bitstream ... I don't think the numbers add up for this. In order to know if it's a good or bad bitstream, you'd have to process the entire bitstream before any flash erase command

02:26 <ZipCPU> You'd also need a place to store the bitstream you were in the middle of processing

02:27 <sorear> ZipCPU: for tinyfpga, you'd write normally but disable SB_WARMBOOT until the bitstream looks OK

02:27 <ZipCPU> Unless you have some external memory, I can't imagine that the chip would have enough room to contain within it the copy of its configuration

02:28 <ZipCPU> So ... the plan would be to write the bitstream but just hold off the SB_WARMBOOT until you know it is valid? But then removing power when done undoes what you are protecting against

02:29 <sorear> the ice40 uses a different SPI flash address for power-on versus SB_WARMBOOT, power-on always goes to the bootloader, not the user-provided design

02:29 <ZipCPU> So you aren't discussing writing the primary/bootloader partition at all?

02:30 <sorear> correct

02:30 <whitequark> you could potentially do this

02:30 <whitequark> do two writes

02:30 <whitequark> on the first write, check for validity and remember CRC

02:30 <whitequark> ah no this doesn't work

02:31 <sorear> you do one write, checking validity, to the non-default boot address

02:31 <sorear> if it's valid, switch to it

02:31 <whitequark> oh hm

02:31 <sorear> since it's the non-default address, it won't be used unless explicitly switched to

02:32 <ZipCPU> The power shutoff will still get you, since the bootloader defaults to loading the second address after a short period of time

02:32 <sorear> that's a gateware function, not a chip function, it can be modified to check validity (using SPI reads) before chainloading

02:35 <ZipCPU> Yes, I suppose it could

02:35 <swetland> you could conceivably have a tiny initial image which decides based on gpio state and serialno of flash images, which to boot

02:36 <swetland> A. miniloader B. bootloader-1 C. bootloader-2 D. image

02:36 <swetland> miniloader loads image unless bootloader is requested, prefers "newest" of -1/-2 bootloader unless previous is requested

02:37 <swetland> bootloader only ever writes to the opposing bootloader image when updating the bootloader, to avoid clobbering itself (clearly working) with an unknown image
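
swetland's A/B scheme can be written down as a few lines of selection logic. This is a sketch under assumptions: the `Slot` record, the `valid` flag (for example a checksum result), and the function names are hypothetical and only illustrate the rules stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    name: str
    serial: int   # higher serial number means newer
    valid: bool   # assumption: some integrity check has been run on the slot


def pick_bootloader(b1: Slot, b2: Slot, want_previous: bool = False) -> Optional[Slot]:
    """Prefer the newest valid bootloader unless the previous one is requested."""
    candidates = sorted((s for s in (b1, b2) if s.valid), key=lambda s: s.serial)
    if not candidates:
        return None
    if want_previous and len(candidates) == 2:
        return candidates[0]
    return candidates[-1]


def miniloader(bootloader_requested: bool, b1: Slot, b2: Slot,
               want_previous: bool = False) -> str:
    """Boot the user image unless a bootloader is requested (e.g. via GPIO)."""
    if not bootloader_requested:
        return "image"
    chosen = pick_bootloader(b1, b2, want_previous)
    return chosen.name if chosen else "halt"


def update_target(running: Slot, b1: Slot, b2: Slot) -> Slot:
    """A running bootloader only ever writes to the opposing slot."""
    return b2 if running is b1 else b1
```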

02:37 <sorear> you're proposing something even more complicated than what I said and I'm not sure why

02:38 <sorear> and there isn't really such a thing as a "tiny image", we don't have compression support in the open tools for any fpga aiui

02:42 <swetland> If the goal is to prevent bricking of the bootloader, the most reliable approach is never overwrite a working bootloader with an unknown one (and you can't "know" it works until you actually boot it), and having the first stage be extremely simple and wp-locked (ideally never needing an update post-production). This also covers the loss-of-power-while-writing case.

02:42 <swetland> but yeah, if every image must be full-size in flash that gets punitive for larger parts

02:44 <sorear> swetland: I may have failed to make clear that in my proposal the bootloader (there is only one) is absolutely never overwritten?

02:44 <sorear> I'm talking about ways to prevent the user design from touching the bootloader

02:45 <sorear> so I don't think what you're saying is relevant to what I'm saying?

02:45 <swetland> ah, sorry, yes I missed that

02:56 <sorear> I feel like I'm coming across as aggressive here :/ I'd like to not do that

02:58 <swetland> no worries -- I misread the ongoing conversation and thought a different problem was being discussed. it happens

02:59 <swetland> probably due to past projects I've been involved in *always* wanting to be able to update bootloaders in the field, no matter how terrifying ^^

03:01 unixb0y has quit [Ping timeout: 246 seconds]

03:02 unixb0y has joined ##openfpga

03:23 ayjay_t has quit [Read error: Connection reset by peer]

03:23 ayjay_t has joined ##openfpga

03:25 futarisIRCcloud has joined ##openfpga

03:30 Miyu has quit [Ping timeout: 272 seconds]

03:40 pie__ has quit [Remote host closed the connection]

03:41 pie__ has joined ##openfpga

04:02 rohitksingh_work has joined ##openfpga

04:03 hl has quit [Ping timeout: 246 seconds]

04:23 dj_pi has joined ##openfpga

04:27 <whitequark> daveshah: also thinking about how to integrate carry logic into flowmap

04:27 <whitequark> i think i can have a step that like

04:27 <whitequark> looks at $alu cells

04:28 <whitequark> then, if the entire alu chain is packed into a lut, leaves it that way

04:29 <whitequark> but if a $alu is outside of a fanout free cone of another $alu cell, it takes both and un-packs them

04:29 <whitequark> so that the later techmap pass can pack them using dedicated carry logic

04:29 <whitequark> a really nice thing about flowmap is that it works entirely in terms of source cells

04:29 <whitequark> you don't have to painfully reconstruct cell boundaries from AIGs

04:30 m4ssi has joined ##openfpga

04:32 <whitequark> this way my cmp2lut techmap becomes just... completely redundant

04:32 emeb has left ##openfpga [##openfpga]

04:39 m4ssi has quit [Remote host closed the connection]

05:04 <catplant> nice

05:17 <azonenberg> whitequark: sooo

05:17 <azonenberg> i dont know how this will play with you or not

05:17 <azonenberg> but on architectures with higher-order luts, say lut6

05:17 <azonenberg> it's possible to do an adder plus some boolean operations in one lut

05:18 <azonenberg> say, (a ^ 0xdead) + (b ^ 0xbeef) should be one lut per bit plus carry chains

05:18 <azonenberg> xst isnt good at folding this, idk about vivado

05:18 <azonenberg> But it's an optimization to consider
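
A bit-level model shows why azonenberg's example needs only one LUT per bit plus the carry chain: the XOR constants are fixed, so they fold into each bit's truth table. The sketch below assumes a generic propagate/generate carry cell (a carry mux plus a sum XOR) and does not model any particular vendor primitive; the function name is hypothetical.

```python
def add_with_folded_xor(a: int, b: int, ka: int, kb: int, width: int = 16) -> int:
    """Compute ((a ^ ka) + (b ^ kb)) mod 2**width one bit at a time."""
    carry = 0
    result = 0
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        kai, kbi = (ka >> i) & 1, (kb >> i) & 1
        # Per-bit LUT: the constant bits kai/kbi are baked into the truth table,
        # so the LUT only has ai and bi as real inputs.
        x = ai ^ kai
        y = bi ^ kbi
        propagate = x ^ y
        generate = x  # equals y whenever propagate is 0
        # Dedicated carry logic: sum XOR and carry mux.
        s = propagate ^ carry
        carry = carry if propagate else generate
        result |= s << i
    return result
```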

05:19 <whitequark> azonenberg: this will probably end up as custom-ish logic in flowmap

05:19 <whitequark> however, flowmap is really easily adaptable to this kind of stuff

05:19 <whitequark> in part because it doesnt throw away information about adders in the original design

05:19 <whitequark> moreover

05:19 <azonenberg> Also, what about 3-input adders?

05:19 <azonenberg> iirc xilinx-land should be able to do those

05:20 <whitequark> probably needs a preliminary packing step in yosys that would produce eg $alu3 cells

05:21 <azonenberg> would that require any changes to the rtlil core? or would this be considered a temporary techmap-only cell type?

05:21 <whitequark> that would work the same as a $alu cell

05:21 <whitequark> just with 3 inputs

05:21 <whitequark> it would be yet another internal cell type, like $lut, $aoi, whatever

05:21 <whitequark> for any internal cell, yosys may choose to generate or not generate any of them depending on context

05:22 <whitequark> probably, there would be an option to $alumacc pass

05:22 <whitequark> whether to generate $alu3

05:45 pie__ has quit [Remote host closed the connection]

05:46 pie__ has joined ##openfpga

05:56 Bike has quit [Quit: Lost terminal]

06:27 <tnt> whitequark: yeah, tristate would be better I think. Basically I have plenty of pins connected to a test target and when running different applets, I use different pins. But then I need the unused one not to interfere ... so tristate, possibly weak pullup if oscillation is a concern.

06:28 <whitequark> tnt: good point

06:28 <whitequark> can you open an issue

06:30 <tnt> sure.

06:37 _whitelogger_ has joined ##openfpga

06:40 _whitelogger has quit [Ping timeout: 250 seconds]

06:40 <tnt> sorear: huh ... reading the backlog I don't get how you'd determine if a bitstream is "bad" ? I mean, accessing the flash to load/store user data is a perfectly valid thing for a bitstream to do, so how exactly do you plan to guarantee a bitstream at no point will access the bootloader area ?

06:41 <tnt> kind of looks like you'd need to do formal verification of the bitstream on the fpga itself, that seems ... unrealistic.

06:53 _whitelogger has joined ##openfpga

07:01 <sorear> > and refuse to run a bitstream which enables the SPI I/O

07:01 <sorear> in the example given, you can't use the flash to load/store user data

07:01 <whitequark> well that's kind of shit actually

07:02 <sorear> a substantially more complex instantiation could require a filter module to syntactically exist at a specific place (vaguely similar to the partial reconfig approach used by f1)

07:04 <whitequark> sorear: i have an easier solution

07:04 <whitequark> just write-protect the flash

07:04 <tnt> yeah, I was going to say the same ... much easier.

07:04 <whitequark> it literally has a pin for this exact purpose

07:10 <catplant> sorear: another idea, lock the boot sectors?

07:11 <sorear> I don't think this chip has sector-level locking

07:11 <catplant> most do?

07:12 <tnt> which flash chip is it ?

07:12 <sorear> AT25SF161, appears to have the ability to lock ranges (which is good enough) but not random sectors

07:13 <catplant> yeah

07:13 <catplant> that's what we meant

07:17 <tnt> Is the goal to protect against accidental or adversarial erasure of the bootloader ?

07:21 <catplant> alternatively

07:22 <catplant> can you add more spi flash?

07:23 <catplant> also ice40 question

07:23 <catplant> if you nvcm flash it, can you SB_WARMBOOT to spi flash?

07:24 <whitequark> yes

07:24 <catplant> nvcm flash a recovery bootloader???

07:25 <whitequark> yes

07:55 futarisIRCcloud has quit [Quit: Connection closed for inactivity]

08:22 ayjay_t has quit [Read error: Connection reset by peer]

08:22 ayjay_t has joined ##openfpga

08:22 <daveshah> I don't know if anyone has actually tested warm booting to flash?

08:23 <daveshah> I understand that it might be possible, but no one really knows

08:24 <daveshah> re: logic mapping. The plan for carries seems sensible

08:25 <daveshah> I'd also like to see support for optimisation around and/or mapping to the dedicated mux2s in Xilinx and Lattice

08:25 <daveshah> These can be used both to implement larger LUTs and large multiplexers (latter probably with some dedicated techmap rule)

08:26 <whitequark> hmmm

08:26 <whitequark> or what about rebalancing multiplexer chains?

08:26 <daveshah> Yes, that would make sense too

08:27 <daveshah> I'd like to see generic balancing at some point too

08:31 <whitequark> daveshah: can you take a look at https://sci-hub.tw/10.1109/92.285741 ?

08:31 <whitequark> they introduce a variable Rim but I don't understand how it's computed

08:31 <whitequark> Rex and Rslk are simple enough

08:32 jcreus has joined ##openfpga

08:39 <daveshah> I think it's basically a case of attempting simple packing and seeing how it reduces the LUT count and then looking for any trivially redundant LUTs after that

08:42 <whitequark> yeah, i don't understand how exactly they do that

08:42 <whitequark> which is my problem

08:42 <whitequark> what is "simple packing"

08:42 <daveshah> My feeling is a transformation along the lines of opt_lut

08:43 <whitequark> hmmmm.

08:52 <whitequark> daveshah: any specific proposals for calculating Rim?

08:52 <daveshah> No, not really

08:53 <whitequark> I guess I could try, hm,

09:02 <whitequark> daveshah: yeah. you are right. that is what they are considering.

09:09 ayjay_t has quit [Read error: Connection reset by peer]

09:09 <whitequark> so, hm, i can compute the number of possible LUT combine operations

09:09 ayjay_t has joined ##openfpga

09:10 <whitequark> and use that as the value of Rim?

09:13 asante has joined ##openfpga

09:18 m4ssi has joined ##openfpga

09:22 <mithro> tnt: the USB decoder in gtkwave seems to be working

09:22 <mithro> whitequark / catplant: It's still an active investigation topic

09:23 <mithro> catplant: nvcm flash a recovery bootloader is exactly what I'm investigating it for

09:26 Miyu has joined ##openfpga

09:26 <daveshah> whitequark: yes, that seems sensible to me

09:29 <daveshah> btw, a nice extension to that paper would be to use a slightly more advanced model of slack

09:29 <mithro> https://usercontent.irccloud-cdn.com/file/QiHO5rlt/image.png

09:29 <daveshah> instead of just LUT depth, also considering the delay of blackboxes

09:29 <mithro> Sooo coool....

09:30 <daveshah> this is something we were going to implement in abc xaig

09:30 <whitequark> daveshah: i mean

09:30 <whitequark> if yosys gains some way to provide timing models

09:30 <whitequark> i can probably use that

09:30 <daveshah> it'll probably just be a simple text file

09:31 <whitequark> yeah i dont wanna write that infrastructure

09:31 Miyu has quit [Ping timeout: 250 seconds]

09:32 <daveshah> sure, we'll be writing something like that for the XAIG stuff

09:33 <tnt> mithro: good :)

09:35 <tnt> whitequark: atm flowmap only does packing right ? It doesn't attempt to modify the logic "tree" to find an equivalent tree that packs better ?

09:36 <whitequark> it doesn't, that's FlowSYN

09:36 <mithro> tnt: I made the color choices deterministic and deleted some of the worst choices

09:37 <tnt> mithro: hehe, yeah, I picked the whole list from the gtkwave source without paying attention to what works/doesn't :) Like ... white on white isn't ideal.

09:38 <mithro> Any idea how to make gtkwave load them automatically?

09:38 <mithro> tnt: the decoder also seems to be nondeterministic in some way...

09:39 <tnt> mithro: I think you can save your 'workspace' in gtkwave ?

09:40 <tnt> I noticed the non-determinism as well ... but mostly on 'invalid' signals. Must be something in the decoder itself because I don't touch the signals in any way. I might try to save the vcd that gtkwave exports and make sure it's always consistent.

09:41 <mithro> It seems to miss the first edge sometimes

09:43 <daveshah> whitequark: aiui even flowsyn still isn't designed to do general combinational optimisations.

09:43 <daveshah> I think a nice way to approach a functional reduction type optimisation might be a random simulation plus SAT (I think ABC has something along these lines)

09:44 <daveshah> use random simulation to find signals that might be identical (e.g. with a hash table of sim traces), then use SAT to check whether they really are
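
The two-stage idea daveshah outlines can be sketched as follows. This is a toy model, not how ABC or yosys implement it: signals are plain Python functions over an input vector, and an exhaustive comparison stands in for the SAT query (which is only practical for a handful of inputs). All names are hypothetical.

```python
import random
from collections import defaultdict
from itertools import product


def candidate_classes(signals, n_inputs, n_vectors=64, seed=0):
    """Bucket signals by their response to random input vectors.

    signals: dict mapping a name to a function (tuple of bits) -> bit.
    Signals in the same bucket *might* be identical.
    """
    rng = random.Random(seed)
    vectors = [tuple(rng.getrandbits(1) for _ in range(n_inputs))
               for _ in range(n_vectors)]
    buckets = defaultdict(list)
    for name, fn in signals.items():
        trace = tuple(fn(v) for v in vectors)  # the sim trace is the hash key
        buckets[trace].append(name)
    return [names for names in buckets.values() if len(names) > 1]


def proven_equal(f, g, n_inputs):
    """Exhaustive check, standing in for a SAT equivalence query."""
    return all(f(v) == g(v) for v in product((0, 1), repeat=n_inputs))


def equivalent_pairs(signals, n_inputs):
    """Return (representative, duplicate) pairs that survive the proof step."""
    pairs = []
    for names in candidate_classes(signals, n_inputs):
        rep = names[0]
        for other in names[1:]:
            if proven_equal(signals[rep], signals[other], n_inputs):
                pairs.append((rep, other))
    return pairs
```

Comparing only against the first member of each bucket keeps the sketch short; a fuller version would split a bucket again whenever the proof step finds a counterexample.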

09:45 <whitequark> hmmm

09:48 <daveshah> this way you could optimise around greybox cells like carry chains, etc,

09:51 <daveshah> maybe I'll have a play with this at some point

10:42 octycs has joined ##openfpga

10:44 emka has joined ##openfpga

10:51 <whitequark> daveshah: question

10:51 <whitequark> i have an internal representation where all luts are simultaneously mapped

10:52 <whitequark> this is done by having a graph where each node is mapped to some (best, depth wise) lut

10:52 <whitequark> has edges to each other possible lut

10:52 <whitequark> and then there's a set of nodes that is actually implemented

10:53 <whitequark> daveshah: can you take a look at the flowmap_area branch in my repo and help me out with updating this IR for LUT splitting?

10:53 <whitequark> specifically, there is "lut_path_lengths" and "lut_trans_outputs"

10:53 <whitequark> which i need to update

10:53 <whitequark> but i'm having trouble convincing myself that what i want to do is correct

10:54 <whitequark> so let's say i have a lutv and split lutw out of it

10:54 <whitequark> really this means two things

10:54 <whitequark> first, lutw is added into the list of implemented luts

10:55 <whitequark> it already has the set of gates as well as all the necessary edges

10:55 <whitequark> so that's easy

10:55 <whitequark> second, lutv is being reduced. this is the hard part.

10:57 <whitequark> it's hard because if w is not a gate connected directly to fanin of v, w can internally depend on other gates in v

10:58 <whitequark> i currently traverse the cone of all gates implementing v but excluding w

10:58 <whitequark> and noting the list of inputs that can be eliminated by breaking out w

10:59 <whitequark> i am thinking this can be used later in two ways

10:59 <whitequark> first, all the edges corresponding to these inputs may be broken

11:00 <whitequark> second, all the gates with fanin comprising only these inputs may be removed from v

11:00 <whitequark> (and recursively, all gates whose fanin is those inputs and the gates we just also removed)
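
The recursive removal whitequark describes is a small fixpoint computation. The sketch below is an assumption-laden illustration, not code from the flowmap_area branch: `fanin` maps each gate inside v to the set of things driving it (LUT inputs or other gates), and the function name is hypothetical.

```python
def removable_gates(fanin, eliminated_inputs):
    """Return the gates of v that depend only on eliminated inputs.

    A gate is removable when every one of its drivers is either an
    eliminated input or a gate that has already been found removable.
    """
    dead = set(eliminated_inputs)
    removed = set()
    changed = True
    while changed:
        changed = False
        for gate, drivers in fanin.items():
            if gate not in removed and drivers and drivers <= dead:
                removed.add(gate)
                dead.add(gate)
                changed = True
    return removed
```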

11:02 <whitequark> now, i think lut_path_length is easier

11:02 <whitequark> because only the path length in input cone of w may change (is this correct?)

11:02 <whitequark> so, i restart the process as written for w

11:05 <whitequark> regarding lut_trans_outputs, i am not so sure

11:05 <whitequark> i think it will actually never change

11:05 <whitequark> but i can't seem to prove it

11:07 <whitequark> ah, no, it will change

11:26 _whitelogger has joined ##openfpga

12:04 <whitequark> Solution has -25.0% area overhead.

12:31 <daveshah> sorry, was afk

12:34 <daveshah> what exactly is lut_trans_outputs?

12:36 <daveshah> the assumption for path lengths seems fine

12:40 <edmund> tnt: we spoke about the SX1257 PMOD.

12:41 <whitequark> daveshah: lut_trans_outputs, for each lut, is the set of POs that will be affected by this lut becoming one level deeper

12:42 <tnt> edmund: true

12:43 <tnt> edmund: kbeckmann is working on it already it seems.

12:44 <daveshah> whitequark: ahhh i see

12:44 <daveshah> the code makes sense now

12:46 <edmund> tnt: at ccc you considered sending me a proposal for making the PMOD and a supporting design.

12:46 <kbeckmann> edmund: I'm soon done with the schematic, will start with the pcb layout after that. I'll probably be done in a few days.

12:48 <edmund> esden is also looking into suggesting an ice40UP5 badge for the Maker conference in Los Angeles in Nov 2019, including the SX1257.

12:48 <edmund> kbeckmann: awesome

12:48 <daveshah> there were two badge discussions, one radio and one face detection I think?

12:48 <edmund> kbeckmann: did you go with the suggestions by tnt?

12:49 <tnt> daveshah: no reason it can't be both :D

12:49 <daveshah> :D

12:50 <tnt> identify everyone and report to big brother ... killer app for a maker conf.

12:50 <kbeckmann> yeah currently we're thinking of using the sx1257 + i2c<->spi bridge (SC18IS602B). so it will only use up one dual pmod slot with 8 data pins.

12:50 <kbeckmann> io pins sorry.

12:50 <edmund> daveshah: It will be all at one badge. The sensor might be a Himax HM01B0

12:50 <daveshah> I see

12:50 <miek> another rf ic worth looking at is the AT86RF215

12:50 <daveshah> sounds scary, as tnt says

12:51 <tnt> need to find a non evil demo app for that combination of peripherals.

12:57 <tnt> miek: that chip looks interesting, but quite a bit more complex.

12:58 <tnt> I kind of like the sx1257 for its simplicity and sort of matching the ice40 'theme'.

12:58 <cr1901_modern> What would an SX1257 PMOD do?

12:58 <daveshah> yeah, there would be little point using the AT chip with an FPGA

12:58 <daveshah> cr1901_modern: the SX1257 is a low bandwidth RF frontend & up/downconverter

12:58 <daveshah> with a 32Msps delta-sigma IQ baseband interface

12:58 <daveshah> so you can implement your own radio protocol in an FPGA

12:59 <cr1901_modern> So it's basically the fronent to an SDR?

12:59 <cr1901_modern> frontend*

12:59 <tnt> yes

12:59 <daveshah> frontend, LO, mixer, 1-bit ADC/DAC

12:59 <cr1901_modern> How do you use a 1 bit ADC?

13:00 <daveshah> it's oversampled delta-sigma

13:00 <cr1901_modern> I know there's some weird proof that if you sample a 1-bit ADC biased w/ small noise, you can get the actual signal value to arbitrary precision

13:00 <daveshah> 32 or 36 MSPS for 500kHz bandwidth

13:01 <daveshah> i.e. 1MHz of radio bandwidth, because it's an IQ interface
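
To make the "oversampled delta-sigma" answer concrete, here is a minimal numerical sketch: a first-order modulator turns a slowly varying input into a stream of +1/-1 values, and averaging many of those values recovers the input. The first-order loop and the plain averaging filter are simplifying assumptions for illustration; they are not taken from the SX1257 datasheet, and the function names are hypothetical.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator.

    Inputs are expected in [-1, 1]; the output is a list of +1.0 / -1.0.
    """
    integrator = 0.0
    feedback = 0.0
    out = []
    for x in samples:
        integrator += x - feedback          # accumulate the quantisation error
        bit = 1.0 if integrator >= 0 else -1.0
        feedback = bit                      # 1-bit DAC in the feedback path
        out.append(bit)
    return out


def decimate_by_average(bits, ratio):
    """Crude decimation filter: average each block of `ratio` one-bit samples."""
    return [sum(bits[i:i + ratio]) / ratio
            for i in range(0, len(bits) - ratio + 1, ratio)]
```

The longer the averaging window relative to the signal bandwidth, the closer the recovered value gets to the input, which is the trade between sample rate and resolution that the oversampling buys.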

13:01 <miek> the AT chip has an IQ interface too, for some reason they leave it out of the overview

13:02 <daveshah> interesting

13:02 <daveshah> looks like that uses 14-bit ADCs and serialises over LVDS

13:02 <tnt> miek: yes it does. But it also has full MAC on board, so not using it is a bit meh. Also if we want people to play 'easily' with it, a 133 Mbps lvds interface on a up5k is not the easiest.

13:03 <daveshah> the 32Msps delta-sig is certainly more qt

13:04 <tnt> Mmm, the ATRF isn't full duplex ?

13:04 <miek> tnt: yeah, fair enough - might be something to play with on the ecp5 instead. they do have a slightly cheaper variant with just the IQ interface too

13:05 <tnt> yeah, it's half duplex only AFAICT. The SX1257 is full duplex.

13:06 <daveshah> I wonder if full duplex is actually practical to use though?

13:06 <daveshah> ie if you can get enough spacing in any of the ISM bands it supports?

13:08 <tnt> 915 band is 26 MHz wide, that's plenty.

13:08 <daveshah> ah, I was thinking about 868/9

13:08 <daveshah> yeah, that sounds fine

13:09 <tnt> The EU 868 band is definitely a whole lot narrower. I have no idea how good the OOB rejection is on that device, but it's something to test ...

13:10 <daveshah> there is 91x in the UK too (and probably most of the EU), but with strict duty cycle limits

13:10 <daveshah> https://usercontent.irccloud-cdn.com/file/td7LiqqK/Screenshot%20from%202019-01-04%2013-10-17.png

13:16 <tnt> wtf ... there is an upduino shield for the HM01B0 : https://www.digikey.be/product-detail/en/HM01B0-UPD-EVN/220-2226-ND/9759580

13:17 <miek> looks like they've got a pin-compatible variant for 400-510MHz too, the SX1255

13:17 <daveshah> tnt: and a total dearth of example code

13:17 <tnt> miek: yup and a new variant for the 700 MHz band as well.

13:17 <daveshah> just a sample bitstream, which is almost useless

13:20 <edmund> tnt: detecting a face is hard enough with 1 Mbit SPRAM, identifying people is impossible.

13:20 octycs has quit [Ping timeout: 250 seconds]

13:20 <daveshah> well, you could send a compressed image over radio once you've detected a face

13:21 <daveshah> and do the actual identification on the backend (where you need to compare against a large database anyway)

13:21 <daveshah> that's how I would architect such a system, anyway

13:25 <edmund> daveshah: 0.3 mW TX power is not that useful to transfer images to a base station :-)

13:26 <daveshah> I was assuming there would be a PA on there

13:29 <edmund> daveshah: I would rather go without a PA to limit long range interference in large crowds and motivate the development of meshed solutions.

13:29 <daveshah> The advantage of a PA is that it tends to lead to shorter, higher power transmit bursts than longer, lower power transmissions. This means your transceiver and logic are running for a shorter time

13:31 <edmund> tnt: yes https://www.digikey.be/product-detail/en/HM01B0-UPD-EVN/220-2226-ND/9759580 is the easiest and fastest way to get a HM01B0

13:32 <edmund> tnt: @GregDavill already did a PMOD board for the HM01B0

13:33 <edmund> tnt: https://ton.twitter.com/i/ton/data/dm/1075338056595435524/1075338031224090624/YgRnUUPJ.jpg:large

13:33 <tnt> edmund: I think that image is not public :)

13:35 <edmund> tnt: I also already got the Datasheet of the sensor, but Greg did not yet find time to write the Firmware for it.

13:36 <edmund> https://drive.google.com/file/d/1Io3zDOPpZZHCFCT5rVNZwNADGkHA9bvV/view?usp=sharing

13:36 octycs has joined ##openfpga

13:47 Miyu has joined ##openfpga

14:02 <whitequark> daveshah: ok, i think i have the breaking heuristic working

14:02 <daveshah> nice

14:03 <whitequark> Potential for breaking node $techmap$techmap$add$logic.v:9$6.$auto$alumacc.cc:474:replace_alu$95.$and$<techmap.v>:260$220_Y [4]: 300 (Rex=0, Rim=1, Rslk=0).

14:03 <whitequark> these names are way out of hand

14:03 rohitksingh_work has quit [Read error: Connection reset by peer]

14:04 <daveshah> lol

14:12 Flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]

14:32 jcreus has quit [Ping timeout: 250 seconds]

14:56 rohitksingh has joined ##openfpga

15:22 genii has joined ##openfpga

15:58 <kbeckmann> esden: is the pinout finalized on the icebreaker? Asking because there are only 2 global pins exposed on the 3 pmods and they are on different pins. Would be nice to keep them on the same pin in case you want to use a clock input from a PMOD.

15:59 <esden> Sorry it is the way it is. We will not shuffle any pins any more. That ship has sailed many months ago. :(

16:00 <kbeckmann> i fully understand!

16:00 <esden> The part itself has a very annoying bondout

16:01 <esden> The PMOD basically reflect that

16:03 <esden> Let’s hope we will do better in the future iCEBreakers using other FPGA ;)

16:04 <kbeckmann> alright :). planning on building an ECP5 board?

16:05 <daveshah> Using a pin adjacent to a global pin shouldn't be too bad global wise

16:06 * zkms nods

16:07 <daveshah> You might want to manually constrain the global buffer though, as it won't be locked and could end up elsewhere to satisfy reset/CE constraints of other GBs

16:26 <esden> kbeckmann: maaaybe? ;) :P

16:30 pie__ has quit [Remote host closed the connection]

16:31 pie__ has joined ##openfpga

16:42 <whitequark> daveshah: so

16:42 <whitequark> regarding lut_trans_outputs

16:42 <whitequark> (which i've renamed to lut_critical_outputs which is slightly less opaque)

16:42 <whitequark> any ideas on updating it correctly?

17:04 <daveshah> Let me have a look, my mental model of this stuff is still not very good

17:04 <daveshah> thanks for renaming btw

17:04 <whitequark> i spent like 10 minutes trying to come up with a descriptive name for that..

17:18 X-Scale has quit [Ping timeout: 240 seconds]

17:21 <daveshah> whitequark: so first thoughts, if a LUT split increases the critical path for a PO, that could remove that PO from other lut_critical_outputs

17:21 <daveshah> is my understanding correct?

17:32 <whitequark> daveshah: hmmm

17:33 <whitequark> i'm not entirely sure actually

17:35 <sorear> i have a … strong association for "PO"

17:36 <daveshah> we should probably be using the term CO (combinational output) rather than PO in fact

17:36 <daveshah> because it also includes the set of register and blackbox inputs

17:39 <whitequark> daveshah: so. hm. labels never change.

17:39 <whitequark> so that part of condition stays the same.

17:39 <whitequark> but, a new node may be introduced

17:43 m4ssi has quit [Remote host closed the connection]

17:49 <daveshah> yeah, it's the new node that might be the problem

17:50 <whitequark> daveshah: so, hm

17:50 <daveshah> if the cut never increases the depth (which I presume is implied by labels not changing?) then my fear of other unrelated lut_critical_outputs changing because of the crit path for a PO increasing shouldn't be a problem

17:50 <whitequark> the cut never increases the depth

17:50 <daveshah> ok, that makes things much easier

17:51 <whitequark> wait, hm

17:52 <whitequark> increases the depth of what?

17:52 <daveshah> the critical path

17:52 <daveshah> for any CO

17:52 <whitequark> http://cadlab.cs.ucla.edu/~cong/papers/dac93.pdf figure 2

17:52 <whitequark> it can increase depth after all

17:53 <whitequark> actually my use of labels there might be wrong, even

17:53 <daveshah> the problem is that that then takes other nodes off the critical path too

17:54 <whitequark> mmm, you are right

17:54 <whitequark> i see it now

17:54 <whitequark> i think i need to invalidate entire cones

17:54 <whitequark> that's ok

17:54 <daveshah> yeah

17:54 <whitequark> in real circuits, the graph is very wide and very disjoint

17:55 <whitequark> so invalidating a cone is cheap

17:55 <whitequark> however

17:55 <whitequark> do you understand the precise cone that needs to be invalidated

17:56 <daveshah> not without further thought

17:57 <daveshah> I there are optimisations depending on the nature of the split - many probably won't change anything at all?

17:57 <whitequark> I think you accidentally a word there

17:58 <daveshah> did I?

17:58 <whitequark> "I there are"

17:58 <whitequark> i'm not sure which verb you meant

17:59 <daveshah> ah I think there are

17:59 <daveshah> sorry

17:59 <whitequark> oh yeah

17:59 <whitequark> this is why there is the potential heuristic

17:59 <whitequark> it tries to choose the most promising splits in an ad-hoc way

18:00 <whitequark> cutmap works much more reliably, but i don't understand it yet *and* the output of cutmap still becomes better after flowmap-r

18:00 <whitequark> so i think i'll have all three

18:00 <daveshah> seems reasonable

18:00 <whitequark> i also think flowmap-r is the only one of these that lets you choose an area-depth tradeoff

18:00 <daveshah> yeah, that's really nice

18:00 <whitequark> i'm not sure, cutmap might be able to do it too, the paper isn't super clear

18:01 <whitequark> but it did say that flowmap-r starts to really improve on cutmap results with -optarea of 2..

18:01 <whitequark> so i assume cutmap cannot do it

18:01 <whitequark> probably it can only compute the optimal solution?

18:01 <whitequark> for some value of optimal

18:02 <daveshah> yeah

18:03 <daveshah> > Afterwards, FlowMap-r [6] is able to trade the depths of nodes on non-critical paths or even the depth of the entire network for a smaller area

18:03 <whitequark> aha

18:03 <whitequark> so yeah i need all of them

18:03 <whitequark> flowmap-r enables flow-pack

18:04 <whitequark> and is further enabled by cutmap

18:04 <whitequark> i didn't start with implementing flow-pack because they use some weird boolean decomposition there

18:04 <whitequark> before flowmap-r

18:04 <whitequark> that's mostly redundant with flowmap-r

18:05 <whitequark> daveshah: ok, thinking out loud. if we split lutw off lutv, lutw is a predecessor of lutv

18:05 <whitequark> therefore, the output cone of lutv should be safe

18:06 <daveshah> in terms of lut_critical_outputs or labels?

18:06 <whitequark> lut_critical_outputs

18:06 <daveshah> yes

18:06 <daveshah> that seems correct to me

18:07 <whitequark> so... i invalidate lutw and its input cone, right?

18:07 <whitequark> now, labels

18:08 <daveshah> I don't think this works if overall depth increases?

18:08 <whitequark> labels were fine for the initial solution but not fine after the first breaking

18:08 <whitequark> because depths now dont correspond to labels

18:08 <daveshah> there might be other paths that were critical, unrelated to lutw

18:08 <whitequark> so i have to *first* recompute depths

18:08 <whitequark> hmm

18:09 <whitequark> so, for depths, the entire output cone of lutw is affected

18:09 <daveshah> yeap

18:10 <whitequark> and everything that has a successor whose depth changed has critical outputs affected

18:10 <whitequark> and everything that has anything in fanout whose critical outputs changed is affected

18:10 <whitequark> does this seem enough?

18:10 <daveshah> yes

18:10 <daveshah> I think so

18:10 <whitequark> i have debug asserts that verify that this is valid

18:11 <whitequark> so let's see if this works or crashes
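
The invalidation rule agreed on above can be sketched as follows. This is a minimal illustration, not the actual flowmap code: the dict-of-fanout-lists netlist, the helper names `reverse_edges` and `cone`, and the example graph are all hypothetical. It assumes the netlist is a DAG, and it over-approximates the "critical outputs" set by taking the full input cone of every node whose depth may have changed.

```python
from collections import deque

def reverse_edges(fanout):
    """Build fanin lists from fanout lists."""
    fanin = {node: [] for node in fanout}
    for node, succs in fanout.items():
        for succ in succs:
            fanin[succ].append(node)
    return fanin

def cone(start, edges):
    """All nodes reachable from `start` along `edges`, including `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical netlist: lutw was split off lutv, so lutw feeds lutv.
# c -> lutx -> out2 is an unrelated, disjoint cone.
fanout = {
    "a": ["lutw"],
    "b": ["lutw", "lutv"],
    "lutw": ["lutv"],
    "lutv": ["out"],
    "out": [],
    "c": ["lutx"],
    "lutx": ["out2"],
    "out2": [],
}
fanin = reverse_edges(fanout)

# Depths: the entire output cone of lutw may be affected.
depth_dirty = cone("lutw", fanout)

# Critical outputs: anything with a (transitive) successor whose depth
# may have changed. Conservative: union of input cones of depth_dirty.
critical_dirty = set()
for node in depth_dirty:
    critical_dirty |= cone(node, fanin)

print(sorted(depth_dirty))
print(sorted(critical_dirty))
```

Because real netlists are wide and mostly disjoint, the unrelated cone (`c`, `lutx`, `out2`) is never visited, which is why invalidating a cone stays cheap.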

18:23 X-Scale has joined ##openfpga

18:31 <whitequark> oh

18:31 <whitequark> depths are the same as path lengths, just from the other side of the graph

18:41 <daveshah> are path lengths not constant from PI to PO on a given path, whereas depths increase from PI to PO?

18:44 <whitequark> so a path length is the longest path from node to PO

18:44 <whitequark> and depth is the longest path from PI to node

18:44 <whitequark> right?

18:44 emeb has joined ##openfpga

18:46 <daveshah> whitequark: At least in typical PnR papers, path length is the length of the longest path that a node is involved in

18:47 <whitequark> ohh

18:47 <whitequark> Given a depth bound D, the slack on node v is defined as follows: If v is not a PI or PO, the slack of v is D - (L_v + P_v), where L_v is the level of v in the network, and P_v is the length of the longest path from v to any PO node. If v is a PI or PO, the slack of v is zero.

18:47 <whitequark> I'm following this.

18:51 <daveshah> In this case, it does seem like P is max length from v to output

18:51 <daveshah> L + P is what I would normally think of as path length

18:51 <whitequark> ah I see

18:51 <whitequark> what would you call P?

18:52 <daveshah> I don't know

18:52 <daveshah> Never heard of a term for it

18:53 jcreus has joined ##openfpga

18:53 <whitequark> altitude? :D

18:53 <whitequark> gonna go with that, to avoid aliasing the term "path length"
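
A small worked example of the quantities discussed above, assuming a DAG in which every edge counts as one level. The level L is the longest path from any PI to the node, the "altitude" P is the longest path from the node to any PO, and the slack is D - (L + P) for internal nodes and zero for PIs and POs. The graph and all function names here are hypothetical, chosen only for illustration.

```python
def topo_order(fanout):
    """Kahn's algorithm: nodes ordered so every edge goes forward."""
    indeg = {n: 0 for n in fanout}
    for succs in fanout.values():
        for s in succs:
            indeg[s] += 1
    order = [n for n, d in indeg.items() if d == 0]
    for n in order:  # the list grows while we iterate over it
        for s in fanout[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)
    return order

def levels_and_altitudes(fanout):
    order = topo_order(fanout)
    level = {n: 0 for n in fanout}
    for n in order:  # forward pass: longest path from a PI
        for s in fanout[n]:
            level[s] = max(level[s], level[n] + 1)
    altitude = {}
    for n in reversed(order):  # backward pass: longest path to a PO
        altitude[n] = max((altitude[s] + 1 for s in fanout[n]), default=0)
    return level, altitude

def slacks(fanout, depth_bound):
    level, altitude = levels_and_altitudes(fanout)
    has_fanin = {s for succs in fanout.values() for s in succs}
    result = {}
    for n in fanout:
        is_pi = n not in has_fanin
        is_po = not fanout[n]
        if is_pi or is_po:
            result[n] = 0
        else:
            result[n] = depth_bound - (level[n] + altitude[n])
    return result

# Hypothetical network: a -> n1 -> n2 -> o and b -> n3 -> o
fanout = {
    "a": ["n1"], "b": ["n3"],
    "n1": ["n2"], "n2": ["o"], "n3": ["o"],
    "o": [],
}
level, altitude = levels_and_altitudes(fanout)
slack = slacks(fanout, depth_bound=3)
print(level, altitude, slack)
```

With a depth bound of 3, `n1` and `n2` lie on the critical path and get slack 0, while `n3` has L + P = 2 and therefore slack 1.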

18:55 <whitequark> oh *facedesk*

18:55 <whitequark> the paper said LEVELS not LABELS

18:55 <whitequark> but i constantly mix up these terms

18:55 <whitequark> so i used labels and it sort of worked by accident

18:55 <whitequark> mystery solved

19:41 <whitequark> hm

19:41 <whitequark> my critical output update function doesn't work properly :S

19:41 <whitequark> or invalidation, maybe

19:46 <whitequark> oh god

19:46 <whitequark> i did it again

19:46 <whitequark> i forgot a &

19:50 <whitequark> i forgot *another* & what the fuck

19:50 <whitequark> i hate c++

19:51 <shapr> I think it's too large a language

19:51 <shapr> we have several C++ codebases at work and they're entirely unlike each other.

19:53 <whitequark> none of the c++ codebases i work with are anything like each other

19:53 <whitequark> including the ones i wrote myself

19:54 <qu1j0t3> :)

19:54 <qu1j0t3> yes, one's style evolves

19:54 <qu1j0t3> i watched my scala evolve from java-in-scala to pure FP scala

19:54 <whitequark> it's backwards with c++

19:54 <whitequark> you learn to use fewer and fewer c++ features

19:54 <whitequark> but do more

19:56 <qu1j0t3> yes

19:56 <qu1j0t3> that's been my trajectory in scala too, though

19:56 <adamgreig> is there a small good language trapped inside c++? maybe it's just c with namespaces

19:56 <qu1j0t3> sane subsets are a thing in many langs

19:57 <qu1j0t3> adamgreig: yeah it's called C

19:57 * qu1j0t3 runz

19:57 <whitequark> nothing about c is good

19:57 * qu1j0t3 isn't a particular fan of C but the joke was irresistible

19:57 <whitequark> or sane.

19:57 <qu1j0t3> whitequark: I won't argue!

19:58 <adamgreig> a strong foundation for c++

19:58 <whitequark> yes

20:05 <swetland> I really think the problem is that C++ is more of a language toolkit than a language ^^

20:06 <whitequark> the problem is that everyone making C++ is smart but misguide

20:06 <whitequark> d

20:06 <swetland> so you have to start by deciding how you're going to use it (though this ends up being a totally adhoc process in most small projects)

20:16 <shapr> We have one codebase that religiously follows herb sutter

20:16 * shapr shrugs

20:16 <shapr> I hope rust takes over or C++ loses weight.

20:21 <qu1j0t3> Herb is not unlike Pike in his disdain for taking input

20:35 rohitksingh has quit [Ping timeout: 258 seconds]

20:37 <IanMalcolm> I feel like there are a couple of subsets of C++ which are both safe and nice to work with, and they are all rust

20:43 mumptai has joined ##openfpga

21:07 octycs has quit [Quit: No Ping reply in 180 seconds.]

21:08 octycs has joined ##openfpga

21:10 octycs has quit [Client Quit]

22:41 jcreus has quit [Read error: Connection reset by peer]

22:57 ayjay_t has quit [Read error: Connection reset by peer]

22:57 ayjay_t has joined ##openfpga

23:31 pie__ has quit [Remote host closed the connection]

23:44 Bike has joined ##openfpga

23:52 pie_ has joined ##openfpga

23:59 Miyu has quit [Ping timeout: 250 seconds]

23:59 <azonenberg> whitequark: yeah, i find that I write most of my projects in "C+" :p