##openfpga on 2016-11-30 — irc logs at freenode.irclog.whitequark.org

00:00 <azonenberg_work> Then extrapolating from that measured value to another geometry of the same material

00:01 <jhol> so VTR has some demo XML that has some guesstimate values for a 28nm process - which is what the ice40 is, so I figured I could use the C value, and then scale the R value to get the propagation time that matches the values that icebox has

00:02 <azonenberg_work> well R and C will both scale with a longer wire

00:02 <azonenberg_work> Looking at the vtr code directly though

00:02 <azonenberg_work> does it ever use r/c directly?

00:02 <azonenberg_work> or only r*c

00:03 <jhol> https://docs.verilogtorouting.org/en/latest/arch/reference/#wire-segments

00:04 <jhol> I can't imagine why it would need either R or C independently

00:04 <azonenberg_work> i mean

00:04 <azonenberg_work> do you give it drive current?

00:04 <azonenberg_work> b/c otherwise RC is useless :p

00:04 <jhol> it also references it in switches: https://docs.verilogtorouting.org/en/latest/arch/reference/#switches

00:05 <jhol> "switch" means mux or tri-state buffer

00:05 <azonenberg_work> Honestly, results might be easier if you set C=1 for everything

00:05 <azonenberg_work> and had R be delay units :p

00:05 <jhol> yeah - makes sense

00:06 <clifford> jhol, never used VPR so far (still on my todo list), but for some primitives (switches, etc.) you can specify absolute delay values. I

00:06 <jhol> yeah

00:07 <clifford> I'd try setting R=0 (which should give you a delay of 0) and model delay using things like switch Tdel

00:07 <azonenberg_work> That's an option too i guess

00:08 <azonenberg_work> Btw i have been away from openfpga stuff for a few weeks due to other things going on, but planning to spend a while hacking on things next month

00:08 <jhol> yeah I find VTR XML slightly cumbersome to model with

00:08 <jhol> some things are trivial to express in XML

00:08 <jhol> some things cause my brain to hurt

00:08 <clifford> re. R and C values: The R values along the path add up, and the 1/C for all driven components add up.

00:09 <jhol> yup

00:09 <clifford> So it's (R1+R2+R3+...)*1/(1/C1 + 1/C2 + 1/C3 + ...)
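The R/C discussion above can be sketched with the textbook Elmore approximation for an RC ladder. This is a minimal illustration, not a description of what VPR actually computes; the function names and the example values are assumptions made up for this sketch. In the textbook model, capacitances hanging off the same node sum directly (C1 + C2 + ...), whereas the reciprocal form is the rule for capacitors in series.

```python
def elmore_delay(segments):
    """Elmore delay of an RC ladder (textbook approximation).

    segments: list of (R, C) pairs ordered from driver to sink. R is the
    series resistance of a stage, C the capacitance to ground at the node
    after that resistance. Units are arbitrary but must be consistent.
    """
    delay = 0.0
    upstream_r = 0.0
    for r, c in segments:
        upstream_r += r              # resistance between driver and this node
        delay += upstream_r * c      # each node's C charges through that R
    return delay


def r_scale_to_match(segments, target_delay):
    """Factor to multiply every R by so the ladder hits target_delay.

    Hypothetical helper: mirrors the idea of keeping C fixed and scaling R
    until the model matches a measured delay.
    """
    return target_delay / elmore_delay(segments)


# With C = 1 everywhere, a single-segment wire's R is directly its delay.
unit_wire = [(5.0, 1.0)]
three_stage = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0)]
```

Because the delay is linear in R, scaling all R values by one factor scales the modelled delay by the same factor, which is why fixing C and tuning only R is enough to match a measured propagation time.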

00:09 <azonenberg_work> I'm going to be meeting up with whitequark for a couple of weekend hackathons to try and bang out most of the remaining greenpak4 stuff

00:11 <clifford> From what I've seen so far (reading docs, not really used it) it's not really designed to be used with real architectures. It's a nice playground for P&R algorithms but for real hardware one would likely need to manually add some additional features to the C code.

00:11 <jhol> yes indeed

00:12 <azonenberg_work> clifford: yeah that was my understanding too

00:12 <azonenberg_work> but that's what we're all here for :p

00:12 <jhol> an example being that you can only set the size of a block in terms of its height - but not its width

00:12 <clifford> especially on the top-level grid side it seems to be insufficient to support real-world stuff like iCE40.

00:12 <jhol> so good luck if you have a block where width!=1

00:13 <azonenberg_work> Well, we may have to write our own PARs for real arches

00:13 <azonenberg_work> Not like we haven't done this before

00:13 <jhol> I think ice40 can be supported - just about

00:13 <jhol> except for the "extra" blocks

00:13 <azonenberg_work> clifford: How hard do you think it'd be to extend something like arachne to a more complex arch like 7 series?

00:13 <jhol> azonenberg_work: forget it

00:14 <jhol> I tried doing some prep work for it

00:14 <azonenberg_work> jhol: And how would vpr handle it?

00:14 <jhol> it would need rewriting

00:14 <azonenberg_work> Are we better off doing a full rewrite?

00:14 <clifford> My guess would be (not checked the code) that VPR reads the XML and then flattens the grid anyways. So I guess it should not be too hard to fix this kind of stuff..

00:14 <azonenberg_work> i mean we will have to tweak anything

00:14 <azonenberg_work> only question is, how heavily

00:15 <jhol> azonenberg_work: so I spent a month refactoring apnr in the summer to see what could be done

00:15 <clifford> If my guess is right, a good solution might be to simply add a feature for a "manual grid" where you just place each and every cell individually.

00:15 <jhol> at the moment the ice40 is completely hard-coded into it

00:15 <jhol> I concluded that it needs to gain an architecture-XML feature like VTR has

00:16 <jhol> - a way to soft-define architectures

00:16 <azonenberg_work> jhol: fwiw i am going to be doing a bunch of work on the coolrunners at some point over the coming weeks-to-months as well

00:16 <jhol> \o/

00:16 <jhol> so cool

00:16 <azonenberg_work> i want to get back to the xc2c32a project

00:16 <clifford> (I guess that would be equivalent to having a 1x1 grid and using a lot of hierarchy in that one block. :)
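The "manual grid" idea can be sketched as follows: a compact, column-based description is expanded into an explicit per-cell map, and individual cells can then be overridden by hand. This is only an illustration of the concept; whether VPR flattens its grid this way is a guess in the discussion above, and the function name, tile names and arguments here are all hypothetical.

```python
def flatten_grid(width, height, column_types, default="clb", overrides=None):
    """Expand a column-based grid description into an explicit cell map.

    column_types: dict mapping column index -> tile type for that column.
    overrides:    dict mapping (x, y) -> tile type, i.e. the "manual grid"
                  where individual cells are placed one by one.
    Returns a dict mapping every (x, y) coordinate to a tile type.
    """
    cells = {}
    for x in range(width):
        for y in range(height):
            cells[(x, y)] = column_types.get(x, default)
    # Manual placements win over the regular column pattern.
    cells.update(overrides or {})
    return cells


# Toy 3x2 fabric: an I/O column on the left and one hand-placed RAM tile.
grid = flatten_grid(3, 2, {0: "io"}, overrides={(2, 1): "ram"})
```

A fully manual grid is the limiting case where `column_types` is empty and every cell is listed in `overrides`.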

00:16 <azonenberg_work> i have a bunch of various stuff floating around

00:16 <azonenberg_work> but need to combine it cleanly and refactor a LOT

00:16 <jhol> so looking at the VTR code - it's not a very clean code-base

00:17 <jhol> and I'm not sure how responsive they'd be to some ##openfpga blitzkrieg on the code-base

00:17 <clifford> jhol, but afaik it should yield pretty good results even for larger devices.

00:17 <jhol> but it could be improved

00:18 <azonenberg_work> jhol: speaking of clean code (or lack thereof)

00:18 <azonenberg_work> i'm in the process of refactoring https://github.com/azonenberg/jtaghal

00:18 <jhol> oh nice!

00:18 <azonenberg_work> If you have any thoughts on the arch etc i'd love to hear

00:18 <jhol> maybe you could connect it to OpenOCD

00:19 <azonenberg_work> Being able to bridge the two is definitely on the longer term TODO

00:19 <azonenberg_work> My focus is mostly on low-level ops, i'm not trying to make gdb bridges etc necessarily

00:19 <azonenberg_work> But i want to be able to do things like talk to USER* instructions on a xilinx part after loading a bitfile

00:19 <azonenberg_work> And be able to work directly with a bit and not a svf

00:19 <azonenberg_work> I have that much working

00:20 <azonenberg_work> What i want to do at some point is refactor a lot of the coolrunner code to reflect my new understanding of the device microstructure

00:20 <azonenberg_work> it will be a lot cleaner with less magic numbers

00:20 <azonenberg_work> So basically...

00:20 <azonenberg_work> coolrunner devices use the jedec JED file format for the bitstreams

00:20 <azonenberg_work> This is supposed to be something you can directly burn to a device

00:21 <azonenberg_work> But xilinx being xilinx, they didn't do that

00:21 <azonenberg_work> the JED file is using virtual addressing

00:21 <azonenberg_work> And the chip has to be burned with physical addressing

00:21 <azonenberg_work> The translation right now is hard-coded (for the xc2c32a)

00:21 <azonenberg_work> and i think i support the 64, never bothered to do the rest

00:22 <azonenberg_work> the newly cleaned up code is going to use a procedural permutation based on the actual device structure

00:22 <azonenberg_work> should be way more readable

00:22 <jhol> yeah it would be cool to get a more general version of xc3sprog

00:22 <azonenberg_work> and extensible to larger devices
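The difference between a hard-coded translation table and a procedural permutation can be sketched as below. The geometry is a toy invented for illustration (row-major virtual order mapped to column-major physical order); it is NOT the real XC2C32A fuse layout, and the names are hypothetical rather than taken from jtaghal.

```python
# Hypothetical toy geometry, chosen only to make the example small.
ROWS, COLS = 4, 6


def virtual_to_physical(v, rows=ROWS, cols=COLS):
    """Map a virtual (file-order) bit index to a physical bit index.

    Assumes, purely for illustration, that the file stores bits row by row
    while the device is programmed column by column.
    """
    row, col = divmod(v, cols)
    return col * rows + row


def permute(virtual_bits, rows=ROWS, cols=COLS):
    """Reorder a list of bits from virtual order into physical order."""
    physical = [0] * len(virtual_bits)
    for v, bit in enumerate(virtual_bits):
        physical[virtual_to_physical(v, rows, cols)] = bit
    return physical


# The same function covers a larger toy device by changing two parameters,
# where a lookup table would have to be regenerated by hand.
bigger = permute(list(range(8 * 12)), rows=8, cols=12)
```

The appeal of the procedural form is that the mapping is derived from structural parameters, so supporting another device size means changing those parameters instead of maintaining another table of magic numbers.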

00:22 <azonenberg_work> jtaghal is not just an xc3sprog replacement

00:22 <azonenberg_work> it's meant for interactive debug on fpga designs

00:22 <azonenberg_work> thats what i used it for heavily on antikernel

00:23 <jhol> clifford: I think if VTR is going to be used, the XML language will need some improvement. but if VTR is not going to be used e.g. someone does an arachne-pnr II, it will still need some kind of architecture description language

00:23 <jhol> it would be cool to see a growing git repository of FPGA structure data in this format, from all these FPGA reverse engineering projects going on

00:23 <azonenberg_work> jtaghal plus the antikernel debug bridge code (which i am currently in the process of de-convolving from the jtaghal repo, it belongs in the antikernel code) lets me get a packet-based interface directly to arbitrary IP cores on the device

00:26 <clifford> jhol, if my guess is right that they just flatten the description anyways, then the XML arch description for VPR might as well be a pre-flattened intermediate format that's been generated from something else. At least that could be useful for experimenting with different architecture description languages.

00:27 <azonenberg_work> clifford: I think there will, long term, have to be at least 2 separate arch formats

00:28 <azonenberg_work> one for 2D fabrics and one for crossbar-based stuff like coolrunner and greenpak

00:28 <azonenberg_work> as those are fundamentally different interconnects with totally different par algorithms etc

00:30 <jhol> yup

00:31 <jhol> clifford: did you get a chance to test the ice40 kernel driver yet?

00:31 <jhol> no stress if not

00:32 <azonenberg_work> diamondman: btw

00:32 <azonenberg_work> what is the status of your work on the xilinx platform cable?

00:32 <azonenberg_work> in particular the DLC10

00:33 <clifford> jhol, no. it is on my todo list, but that list is overflowing ever since I'm back from california because I didn't work on anything on that list for almost two weeks..

00:33 <jhol> yeah I figured that might be the case

00:34 tecepe has quit [Ping timeout: 244 seconds]

00:34 <azonenberg_work> clifford: yeah my list is going to get nuts

00:34 <azonenberg_work> i'm about to go to hong kong for 3 weeks

00:34 <azonenberg_work> and then christmas

00:34 <azonenberg_work> so my inboxes will be overflowing with... who knows :p

00:35 <jhol> clifford: can I suggest you just grab the kernel source, and kick off the build

00:35 <jhol> this will save a lot of time if/when you actually want to do the test

00:35 <jhol> it takes ages to clone linux, and ages to build the RPi kernel the first time

00:38 digshadow has quit [Ping timeout: 260 seconds]

00:41 digshadow has joined ##openfpga

00:47 <felix_> jhol: haven't had time to test the patches; need to get some paid work done. i only have prepared a fresh sdcard with a raspbian for it by now

00:49 <jhol> :) - thanks for doing that much!

00:50 <felix_> if vpr can't be adapted for the artix7, i'd really suggest to write the new codebase in rust and not in c/c++. has some nice features, not too much overhead, more abstraction and more memory safety

00:51 <felix_> clifford: you're also at the 33c3, right? if you want, i'd like to discuss if and how i could help with the artix7

00:52 <jhol> I think there is a big question about governance of VTR

00:52 <jhol> I'm not sure if their interests are really aligned with ours

00:52 <jhol> - do they care about code quality as much as we need them to?

00:52 <felix_> i don't know; only had a brief look at the project

00:53 <azonenberg_work> yeah because half the reason that we are doing this project

00:53 <jhol> well they currently don't have good enough code quality

00:53 <azonenberg_work> is because the vendor eda tools are terrible :p

00:53 <jhol> - but then can you make a leopard change his spots?

00:54 <jhol> so I'm not sure how they'd feel about a blitzkrieg from this little community, coming in and telling them all the ways their project sucks

00:55 <jhol> I want to know how long it to cotton seed to write APNR - it seems to have been a matter of months? if so, perhaps writing a more flexible APNR would not be so bad, and would be a better way forward

00:55 <jhol> *took cotton seed...

00:55 digshadow has quit [Ping timeout: 256 seconds]

00:56 <felix_> yeah, writing a good placer isn't trivial. the router is probably a bit easier

00:57 <jhol> I think it would need to be a new project, because I have found cseed is nowhere near responsive enough to incoming patches

00:57 <jhol> I don't know much about his situation, but most of what I contributed bit-rotted without any proper feedback

00:58 <felix_> meh

01:00 carl0s has joined ##openfpga

01:08 tecepe has joined ##openfpga

01:14 <clifford> felix_, yes. I'll be at 33c3. Let's talk.

01:17 <jhol> felix_: I think the point is you don't need to write a "good" placer or anything on the first pass, you just need to set up a clean enough architecture that encompasses the problem with some quick-and-dirty algorithms, leaving the way open for refinement with more advanced code

01:25 <felix_> clifford: sounds good; i'm looking forward to that

01:25 <felix_> jhol: ack

01:27 <felix_> a clean architecture should be the highest priority; otherwise it's probably not really worth the effort

01:28 <felix_> anyway; i have to get some sleep now. good night.

01:57 maaku has joined ##openfpga

02:13 carl0s has quit [Quit: Leaving]

02:21 amclain has quit [Quit: Leaving]

03:15 kuldeep has quit [Ping timeout: 250 seconds]

03:22 kuldeep has joined ##openfpga

03:29 digshadow has joined ##openfpga

04:41 scrts has quit [Ping timeout: 252 seconds]

04:52 pie_ has quit [Ping timeout: 268 seconds]

04:53 scrts has joined ##openfpga

05:22 maaku has quit [Quit: No Ping reply in 180 seconds.]

05:24 maaku has joined ##openfpga

05:32 <rqou> housemates need to put our networking equipment on a UPS

05:33 <rqou> tripped the breaker again just now

05:33 <rqou> unfortunately due to the wiring mess there's no good way to get network equipment on the UPS without a huge extension cord

05:36 <rqou> i'm pretty sure the reason we're suddenly having problems is because it's winter and there are space heaters running

05:41 <rqou> azonenberg_work: reading backlog you mentioned that you can't extract parasitics even when given the process parameters?

05:42 <rqou> for my microfab class we used a proprietary tool tsuprem4 to simulate doping and various processing steps on a wafer

05:42 <rqou> is that not sufficient to calculate parasitic information? or does it not work for modern processes?

05:43 <rqou> (of course you don't actually have the parameters needed to feed into a tool such as tsuprem4; this is just out of curiosity)

05:48 maaku_ has joined ##openfpga

05:49 maaku has quit [Ping timeout: 260 seconds]

05:54 DocScrutinizer05 has quit [Disconnected by services]

05:54 DocScrutinizer05 has joined ##openfpga

06:07 qu1j0t3 has quit [Quit: WeeChat 0.4.3]

06:26 maaku_ has quit [Quit: No Ping reply in 180 seconds.]

06:28 maaku has joined ##openfpga

06:57 qu1j0t3 has joined ##openfpga

07:40 <azonenberg> rqou: what i meant is

07:41 <azonenberg> you need to know the parameters :p

07:41 <azonenberg> and while you may be able to predict how e.g. actual doping intensity will vary with beam current for a given implanter

07:41 <azonenberg> you still have to calibrate that tool

07:43 <azonenberg> Because if my experience with SEMs is any hint, the beam current at the source has nothing to do with ions/mm^2 at the wafer surface

07:43 <azonenberg> so many apertures and lenses etc in the way

07:44 <azonenberg> Sure, you can model incremental changes to an existing process

08:00 <digshadow> rqou: just looked at the bonding machine

08:00 <digshadow> doesn't look worht it

08:00 <digshadow> worth

08:01 pie_ has joined ##openfpga

08:03 <rqou> hmm why not?

08:03 <rqou> what about the other stuff? (i didn't look too closely)

08:19 cr1901_modern has quit [Read error: Connection reset by peer]

08:28 Bike has quit [Quit: natural]

09:53 LeelooMinai has quit [Quit: No Ping reply in 180 seconds.]

09:54 LeelooMinai has joined ##openfpga

10:18 mIKEjONE1 has joined ##openfpga

10:19 dingbat has quit [Ping timeout: 258 seconds]

10:19 mIKEjONES has quit [Ping timeout: 258 seconds]

10:19 azonenberg has quit [Ping timeout: 258 seconds]

10:22 defparam_ has joined ##openfpga

10:25 SuperChickeNES has joined ##openfpga

10:25 ChickeNES has quit [Ping timeout: 258 seconds]

10:25 defparam has quit [Ping timeout: 258 seconds]

10:25 qu1j0t3 has quit [Ping timeout: 258 seconds]

10:25 hobbes- has quit [Ping timeout: 258 seconds]

10:29 hobbes- has joined ##openfpga

10:31 azonenberg has joined ##openfpga

10:31 azonenberg has quit [*.net *.split]

10:36 dingbat has joined ##openfpga

10:36 qu1j0t3 has joined ##openfpga

10:36 azonenberg has joined ##openfpga

10:36 dingbat has quit [Changing host]

10:36 dingbat has joined ##openfpga

11:30 openfpga-bb has quit [Ping timeout: 244 seconds]

11:30 openfpga-bb has joined ##openfpga

11:38 pie_ has quit [Ping timeout: 248 seconds]

12:02 maaku has quit [Quit: No Ping reply in 180 seconds.]

12:03 <rqou> i just learned that the sony ps2 architecture is even more insane than i originally thought

12:03 maaku has joined ##openfpga

12:03 <rqou> late model ps2s have a ppc cpu emulating a mips r3000 that is supposed to be the io coprocessor and ps1 back-compat processor

12:04 <rqou> (the magic google term is "DECKARD")

12:04 <rqou> so (late) PS2s have mips, ppc, and the custom vu processors all with different ISAs

12:13 cr1901_modern has joined ##openfpga

12:14 pie_ has joined ##openfpga

12:14 pie_ has quit [Changing host]

12:14 pie_ has joined ##openfpga

13:00 pie_ has quit [Ping timeout: 258 seconds]

14:05 scrts has quit [Ping timeout: 250 seconds]

14:07 scrts has joined ##openfpga

14:59 pie_ has joined ##openfpga

15:57 pie_ has quit [Ping timeout: 260 seconds]

16:34 scrts has quit [Ping timeout: 244 seconds]

16:39 <kristianpaul> f1 instances, fpga instances in aws :p

16:40 <qu1j0t3> :)

16:51 tecepe has quit [Ping timeout: 248 seconds]

16:56 scrts has joined ##openfpga

17:07 amclain has joined ##openfpga

17:09 SuperChickeNES has quit [Quit: ZNC 1.6.1 - http://znc.in]

17:09 ChickeNES has joined ##openfpga

17:12 Bike has joined ##openfpga

17:12 pie_ has joined ##openfpga

17:31 digshadow has quit [Quit: Leaving.]

18:03 tecepe has joined ##openfpga

18:15 <defparam_> AWS just unveiled FPGA instances on EC2: https://aws.amazon.com/blogs/aws/developer-preview-ec2-instances-f1-with-programmable-hardware/ - Xilinx UltraScale+ VU9P

18:17 <defparam_> ah just saw your comment ;)

18:19 * felix_ wonders if the fpga bitstreams are getting sanitized before getting loaded in the fpga

18:23 <jhol> I'd love to reverse engineer the ultrascale+ if only dev boards were not so damn expensive

18:23 <jhol> (and I didn't have a 5 month old baby to look after)

18:25 <defparam_> "Dedicated PCIe x16 interface to the CPU"... "The FPGAs are dedicated to the instance and are isolated for use in multi-tenant environments"... I wonder what mitigations they have for DMA attacks.. IOMMU?

18:26 <felix_> artix7 is imho a much more interesting target at the beginning, since the chips are also rather fast but way cheaper. and having understood the artix7 will really help to understand the bigger and newer series

18:26 <felix_> probably iommu

18:27 <jhol> yes I agree - and I suspect previous and future generations of Xilinx devices are going to have a lot in common also

18:30 <defparam_> "In addition to building applications and services for your own use, you will be able to package them up for sale and reuse in AWS Marketplace." - looks like they are creating a store for IP resell

18:32 pie_ has quit [Ping timeout: 258 seconds]

18:34 <kristianpaul> they are just plugging fancy fpga boards in their current hw

18:34 <kristianpaul> dedicate instances most likely

18:35 <kristianpaul> but having xilinx ide in a AMI is not that bad..

18:39 kuldeep has quit [Ping timeout: 240 seconds]

18:41 kuldeep has joined ##openfpga

18:48 m_w has joined ##openfpga

19:01 <azonenberg> jhol: i can almost guarantee that the 7 series are basically all the same

19:01 <azonenberg> just maybe more routing resources in the bigger devices

19:01 <azonenberg> the higher end*

19:07 digshadow has joined ##openfpga

19:07 scrts has quit [Ping timeout: 260 seconds]

19:08 pie_ has joined ##openfpga

19:15 pie_ has quit [Ping timeout: 260 seconds]

19:41 X-Scale has quit [Ping timeout: 244 seconds]

19:58 <felix_> yeah, i'd also suspect that the whole 7 series is made from the same parametrizable building blocks, but the kintex and virtex devices have more routing resources

19:59 <azonenberg> Yes, almost certainly

20:00 <azonenberg> I would expect the bitstream layout will have some blocks mirrored left-right etc for electrical reasons

20:00 <azonenberg> the two slices in a CLB likely have different layouts as well b/c a CLB seems like the actual base block of layout according to what i've seen of other devices

20:00 <azonenberg> well, CLB + switch box

20:00 <felix_> it seems that there are mainly two kinds of switch boxes and they're probably mirrored

20:01 <felix_> yes

20:13 jhol has quit [Quit: Coyote finally caught me]

20:14 jhol has joined ##openfpga

20:58 <openfpga-github> [yosys] azonenberg pushed 12 new commits to master: https://git.io/v137B

20:58 <openfpga-github> yosys/master 277f478 oldtopman: Added optional flag for linking curses with readline.

20:58 <openfpga-github> yosys/master f257ccf Clifford Wolf: Added "yosys-smtbmc --append"

20:58 <openfpga-github> yosys/master 73653de Clifford Wolf: Merge pull request #274 from oldtopman/lcurses...

21:32 scrts has joined ##openfpga

21:33 m_w has quit [Remote host closed the connection]

21:38 pie_ has joined ##openfpga

21:38 <pie_> re: aws fpga instances, damn thats interesting

21:38 pie_ has quit [Changing host]

21:38 pie_ has joined ##openfpga

21:39 maaku has quit [Quit: No Ping reply in 180 seconds.]

21:41 maaku has joined ##openfpga

22:00 digshadow has quit [Quit: Leaving.]

22:00 digshadow has joined ##openfpga

23:45 <rqou> waiting for someone to use it to discover a chipset bug and pwn the system :P

23:45 <rqou> azonenberg?

23:46 <pie_> hrhr

23:47 <pie_> well i wonder how much recon you could do with only software access?

23:48 <pie_> idk, i saw some really oly thing a while back i have no idea what im talking about, and i doubt theyd make something like this possible, but maybe you could have it jtag itself? :P

23:48 <pie_> though i suppose for devs to be able to use the platform well theyd have to give out hardware documentation

23:48 <rqou> the fpga almost certainly isn't connected to the pcie jtag pins, that's too easy :P

23:49 <rqou> no, i was saying that someone should find a silicon bug somewhere in the chipset iommu logic or similar

23:51 <pie_> *old

23:51 <pie_> ah i see

23:51 <azonenberg> rqou: lol that would be fun

23:52 <rqou> i've never looked into the details, but i assume that pcie has a nice number of footguns

23:52 <azonenberg> I know it allows arbitrary DMA but if you do passthrough that may be virtualized by the iommu

23:52 <azonenberg> Definitely curious about how to pwn a host with ti

23:52 <azonenberg> it*

23:53 <pie_> at the least it would be a look into the whole fpga integrated computer thing

23:54 <rqou> this blogger (http://danluu.com/cpu-bugs/) implies that intel has been skimping out on validation

23:54 <pie_> a la that one open source thing

23:54 <pie_> rqou, i think ive read something about that but it may have been the same guy

23:54 <pie_> novena or what was it?

23:55 <rqou> novena is something completely different

23:55 <rqou> it's just a mostly-completely-open laptop with an arm i.MX and a S6 fpga

23:55 <pie_> then again fpga + cpu is nothing new so im just going to be quiet now

23:55 <azonenberg> rqou: well i dont entirely trust intel's implementations

23:55 <pie_> rqou, yeah i didnt mean its like AWS

23:56 <azonenberg> I would LOVE to find a full-on userspace-to-ring0 privesc in x86 one day

23:56 <azonenberg> Just havent had the time to even begin thinking about how to do it :p

23:56 <pie_> what a day that would be

23:56 <rqou> not inconceivable :P

23:57 <pie_> well we have those cache attacks

23:57 <azonenberg> it happened already but it was in the sysret insn

23:57 <rqou> there was a sysret and swapgs thingy a couple years back due to intel and amd not doing quite the same thing

23:57 <azonenberg> and there was a s/w workaround

23:57 <azonenberg> but imagine a pipeline bug in, say, mov

23:57 <pie_> oh mommy

23:57 <azonenberg> it isnt practical to virtualize every mov

23:57 <azonenberg> or patch them

23:57 <rqou> and then there was the alignment trap DoS that danluu was talking about to DoS the hypervisor

23:58 <rqou> i imagine some kind of bug could definitely exist in e.g. "some crazy non-temporal load involving avx" :P

23:59 <azonenberg> Yep

23:59 <rqou> there was the thing at the last defcon about using the prefetch opcode to probe memory you weren't supposed to be able to

23:59 <azonenberg> lol

23:59 <rqou> you could e.g. find the kaslr slide with it

23:59 <azonenberg> did not see that talk

23:59 <azonenberg> and thats certainly interesting