##openfpga on 2019-07-18 — irc logs at freenode.irclog.whitequark.org

00:15 <zignig> " mostly because zignig wanted a text assembler " , thanks !

00:39 genii has quit [Remote host closed the connection]

00:43 emeb_mac has joined ##openfpga

00:48 X-Scale` has joined ##openfpga

00:51 X-Scale has quit [Ping timeout: 248 seconds]

00:51 X-Scale` is now known as X-Scale

01:04 Bike has quit [Read error: No route to host]

01:04 Bike has joined ##openfpga

01:25 X-Scale` has joined ##openfpga

01:27 X-Scale has quit [Ping timeout: 268 seconds]

01:27 X-Scale` is now known as X-Scale

01:42 dh73 has quit [Remote host closed the connection]

02:18 emeb has quit [Quit: Leaving.]

02:41 dj_pi has joined ##openfpga

02:53 dj_pi has quit [Ping timeout: 272 seconds]

03:07 Bike has quit [Quit: Lost terminal]

03:11 X-Scale` has joined ##openfpga

03:13 X-Scale has quit [Ping timeout: 272 seconds]

03:13 X-Scale` is now known as X-Scale

03:33 <_whitenotifier-3> [Boneless-CPU] zignig synchronize pull request #4: directives bikeshed - https://git.io/fjXmy

04:00 <_whitenotifier-3> [Boneless-CPU] zignig synchronize pull request #4: directives bikeshed - https://git.io/fjXmy

04:36 rohitksingh_work has joined ##openfpga

06:54 emeb_mac has quit [Ping timeout: 245 seconds]

07:04 Miyu has joined ##openfpga

07:13 m4ssi has joined ##openfpga

07:58 pie_ has quit [Ping timeout: 245 seconds]

07:58 Asu has joined ##openfpga

08:15 <eddyb> okay this is a really helpful example of nmigen combinational logic (specifically, a lot of it in one place) :D https://github.com/whitequark/Boneless-CPU/blob/master/boneless/gateware/decoder.py#L214

08:16 <whitequark> yeah

08:18 <tnt> whitequark: can you tell "don't care" to nmigen ?

08:19 <eddyb> whitequark: would you say it'd be a waste of time if someone made a simple DSL that expanded to pretty much that style of code? just a bit of syntax sugar

08:20 <whitequark> tnt: in cases? yes, you can use with m.Case("110010----1010"):
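
The don't-care pattern in that `m.Case` can be read as a per-bit matching rule. What follows is a minimal pure-Python sketch of that rule only, not nmigen code: the `matches` helper is hypothetical, and it assumes the leftmost pattern character is the most significant bit.

```python
def matches(pattern: str, value: int) -> bool:
    """Return True if `value` fits `pattern`; a '-' accepts either bit value."""
    width = len(pattern)
    # Render the value as a fixed-width binary string, most significant bit first.
    bits = format(value & ((1 << width) - 1), f"0{width}b")
    return all(p in ("-", b) for p, b in zip(pattern, bits))


PATTERN = "110010----1010"

print(matches(PATTERN, 0b11001000001010))  # don't-care bits all 0
print(matches(PATTERN, 0b11001011111010))  # don't-care bits all 1
print(matches(PATTERN, 0b11001000001011))  # a fixed bit differs
```

The first two calls match and the third does not, because only the four `-` positions are free to vary.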

08:20 <whitequark> eddyb: what style of code

08:21 <whitequark> expanded how

08:21 <whitequark> i have no idea what you mean here

08:21 <eddyb> whitequark: really dumb small things like `=` or `<-` instead of `.eq` and `foo(...)` instead of `with m.Foo(...)`

08:21 <tnt> whitequark: no, more like in "results" of the comb logic. Like when decoding instructions, most of the time, half the output signals don't matter at all and as far as I'm concerned the logic optimizer can make them 1 or 0, whatever yields the smallest/fastest logic.

08:21 <whitequark> tnt: as a deliberate choice the base nmigen language does not include any way to get 'x

08:21 <eddyb> <3

08:21 <whitequark> i've been thinking about adding something like Rust's unsafe to nmigen, where you can opt into having those

08:22 <whitequark> but to do that, i want to first see if it actually gives any real benefit

08:22 <tnt> :(

08:22 <whitequark> because i'm not convinced it does

08:22 <whitequark> tnt: like one of the goals behind boneless is to optimize it as far as i can manually and then add 'x to it

08:22 <whitequark> and see if it makes any difference

08:23 <tnt> I know when I did my decoding logic, this had quite a bit of gain. No point in configuring an ALU path if the 'enable' bit at the end forces the output to 0.

08:23 <whitequark> is it because 'x is useful, or because 'x works around a problem elsewhere in the toolchain?

08:23 <whitequark> useful on its own*

08:24 <tnt> Oh, no, I mean 'x in verilog didn't help (at least not with yosys). I had to use an external tool where I could specify input / output and the output could include "don't cares".

08:24 <whitequark> ohhhhh

08:25 <whitequark> yeah but nmigen is a yosys frontend, if it didn't help in verilog, what would be the point of adding it to nmigen?

08:25 <eddyb> hmm, funnily enough, these classes look a bit like React but with "elaborate" instead of "render". and I've tried to build a DSL around a React-like before (which failed because proper integration would've required an entire compiler for a Java-like language, and that made me switch focus to making it easier to write toy compilers)

08:26 <tnt> whitequark: well, there was always the hope that your logic optimization improvements would make yosys more "don't care" aware :p

08:27 <tnt> I had opened https://github.com/YosysHQ/yosys/issues/765 as a simple example

08:27 <whitequark> tnt: i would actually expect that it would optimize all the other cases better

08:27 <whitequark> i.e. you would not have to insert don't cares to get faster logic

08:28 <whitequark> all you would have to do is to ensure that the "don't care" choice is not on the critical path

08:28 <whitequark> but it can still be perfectly deterministic

08:28 <tnt> The way I wrote it, I don't really "insert" don't cares. I just make the default output of decoding "doesn't matter". And then for each instruction / path, I make sure to set the actual bits that will be used.

08:29 <whitequark> yes, I know. I make the default output of decoding something simple it already does.

08:30 <whitequark> anyway

08:30 <eddyb> oh and this is a nice example of a FSM :D https://github.com/whitequark/Boneless-CPU/blob/master/boneless/gateware/core.py#L292

08:30 <whitequark> like I said, I'm not absolutely opposed to 'x in nMigen

08:30 <whitequark> I just think it's premature

08:31 <whitequark> for example, consider that perfectly safe Rust code can and does beat C++ at the things C++ is good at

08:31 <tnt> Sure, but I just don't consider 'x' to be unsafe :p

08:32 <whitequark> it is though

08:32 <tnt> I don't see why.

08:32 <whitequark> it's the same semantically as LLVM's `poison`

08:32 <eddyb> whitequark: you should add something like `MaybeUninit` :P

08:32 <whitequark> because if you ever take a decision based on 'x that results in observable behavior change, everything your circuit does from that point is unpredictable

08:32 <eddyb> safe to create, but to read it you need to do `.unsafeAssumeInit()` or something

08:33 <whitequark> (in general, it's possible to restrict the damage)

08:33 <whitequark> i.e. as long as 'x gets into a storage element, all bets are off, in general

08:33 <whitequark> as soon as*

08:33 <tnt> Sure, but (1) the simulator should show me that. (2) that's a bug, no different than if you assign the wrong value to the bit manually.

08:34 <eddyb> whitequark: hmm, you could theoretically ensure that it doesn't reach storage?

08:34 <whitequark> tnt: the simulator will show you that if you have a testcase that hits it. are you sure you do?

08:34 <whitequark> and it is very, very different.

08:34 <whitequark> the reason it is different is, for example, consider an FSM

08:34 <tnt> It's as much tested as the rest of the core :p

08:35 <whitequark> no matter how wrong are the control inputs to your FSM, if it has 5 states, it will stay within those 5 states

08:35 <whitequark> if you have a 'x as a control input, it can get into a state that is completely illegal

08:35 <whitequark> like if it is 1-hot encoded, it could get into a state with multiple hot bits.

08:35 <whitequark> or zero

08:35 <whitequark> I did, in fact, hit that bug

08:36 <whitequark> that's why it's unsafe: it is nondeterministic, and that nondeterminism propagates
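
The one-hot FSM argument above can be made concrete with a small model. This is a sketch under stated assumptions, not code from Boneless or nmigen: the three-state ring, the `go` input, and the helper names are all hypothetical, and 'x is modelled in a simplified way, as every place the logic reads `go` being free to resolve to 0 or 1 independently.

```python
from itertools import product


def next_state(state, go_uses):
    """Next state of a hypothetical one-hot 3-state ring.

    `go_uses` holds the value seen at each of the six places where the
    next-state logic reads the control input `go`.
    """
    s0, s1, s2 = state
    a, b, c, d, e, f = go_uses
    return (
        (s2 & a) | (s0 & (1 - b)),  # advance from s2, or hold s0
        (s0 & c) | (s1 & (1 - d)),  # advance from s0, or hold s1
        (s1 & e) | (s2 & (1 - f)),  # advance from s1, or hold s2
    )


def is_one_hot(state):
    return sum(state) == 1


start = (1, 0, 0)

# A defined input: every read of `go` sees the same value.
defined = [next_state(start, (go,) * 6) for go in (0, 1)]

# An 'x input in this model: each read may resolve differently.
undefined = {next_state(start, uses) for uses in product((0, 1), repeat=6)}

print(all(is_one_hot(s) for s in defined))
print(sorted(s for s in undefined if not is_one_hot(s)))
```

With a defined input, however wrong, the state stays one-hot. In this model an 'x input can also produce a state with two hot bits or with none, which matches the failure described above.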

08:37 <eddyb> if it stays within comb logic then the situations in which you have 'x anywhere should be quantifiable and you "just" have to require that none of those situations propagate all the way down to the outputs

08:37 <whitequark> tnt: in fact you could have a formal proof that your FSM never gets into an illegal state, and then you feed 'x to that module, and it still will

08:37 <tnt> I consider it my job to prevent that propagation by design of my logic.

08:37 <whitequark> that doesn't matter one bit as to whether it's unsafe or not

08:38 <whitequark> just like C programmers consider it their job to prevent UB from being invoked, yet every one of them writes programs that are full of UB.

08:38 <tnt> meh ... I guess we'll just have to agree to disagree. To each his own views.

08:38 <eddyb> I think you downgrade it from the halting problem to a SAT problem, if you don't let it reach the state? (assuming you don't want your users to write proofs, in which case you don't even need SAT)

08:38 <whitequark> tnt: no.

08:38 <whitequark> whether 'x is unsafe is not a point we can agree to disagree on, because it has basis in fact.

08:38 <whitequark> whether 'x is *worth it* is a matter of opinion and therefore disagreement

08:39 <eddyb> s/in which case/cause if you were,/

08:41 <whitequark> eddyb: you can just add logic that tracks the 'x state

08:41 <whitequark> the problem is that this negates all efficiency advantages of using 'x in the first place

08:41 <whitequark> pretty much like -fsanitize=undefined

08:42 <whitequark> of course what it doesn't negate is tracking whether your design is stuck in an illegal state or not, which is a completely different advantage of 'x, and is actually what it is introduced for in Verilog

08:43 <whitequark> i.e. the purpose of 'x in simulation and in synthesis is different

08:43 <eddyb> tnt: I'm pretty sure it's unsafe in a similar way to languages without memory safety: it can violate local reasoning, at a distance, in ways which would otherwise be impossible

08:43 <whitequark> ^ exactly

08:44 <whitequark> using 'x is 100% fine in every case where your *complete* design is covered by formal proofs that ensure that 'x never propagates to storage elements

08:44 <eddyb> it's kind of insane, but local reasoning can let you prove some things by construction, things that would otherwise require painstaking manual proofs or hit the halting problem with automation

08:45 <whitequark> once that is no longer the case, whether *any* part of your design works correctly is down to *every other* part of your design working correctly, which is not something that designs made by humans generally do, ever

08:47 <whitequark> "not having 'x anywhere" was actually a foundational, uncompromising principle of oMigen that I'm considering relaxing in nMigen, heh

08:52 <eddyb> whitequark: hmm can I stick (something like) this on an iCEstick and have it blink an LED? (well, I'd have to slow it down a lot to see it :P) https://github.com/whitequark/Boneless-CPU/blob/master/examples/software/toggle.asm

08:52 <eddyb> oh I guess I might have to hook up the core to IO, lol

08:53 <whitequark> eddyb: note I haven't tested that core on a real FPGA at all yet

08:53 <whitequark> but in theory yes

08:54 <eddyb> tempted to just do this today so I can understand the whole process better. overall, it seems like Boneless is small enough that I can study (and maybe experiment off of) it

08:55 <eddyb> whitequark: oh heh

09:05 <whitequark> tnt: oh and one last thing. there are good ways to make 'x much safer, for example, a new $freeze cell that takes 0, 1, or x, and outputs 0 or 1, but it's unspecified which if the input is x

09:05 <whitequark> so then you can stick that cell onto every input of your module, and you get local reasoning back again
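
A rough sketch of why such a cell restores local reasoning, using the same simplified model of 'x as "each read may resolve independently". No `$freeze` cell exists in the discussion beyond the proposal above, so the `freeze` function here is purely hypothetical, as are the other helper names.

```python
from itertools import product

X = "x"  # stand-in for an undefined value


def resolutions(value, uses):
    """All ways `uses` independent reads of `value` may resolve."""
    if value == X:
        return list(product((0, 1), repeat=uses))
    return [(value,) * uses]


def xor_self_unfrozen(a):
    """Possible results of `a ^ a` when each read of `a` resolves on its own."""
    return sorted({p ^ q for p, q in resolutions(a, 2)})


def freeze(value):
    """Possible outputs of the hypothetical cell: always one definite bit."""
    return [0, 1] if value == X else [value]


def xor_self_frozen(a):
    """Possible results of `f ^ f` where `f` is the single frozen value."""
    return sorted({f ^ f for f in freeze(a)})


print(xor_self_unfrozen(X))
print(xor_self_frozen(X))
```

Without the freeze, `a ^ a` may come out as either 0 or 1 in this model. After freezing, which bit was chosen is unspecified, but every reader sees the same one, so the result is always 0.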

09:05 <eddyb> that's funny, this still happens: `*** buffer overflow detected ***: iceprog terminated`

09:06 <eddyb> (with the iCEstick LED example from the icestorm repo)

09:06 <whitequark> but then it wouldn't be Verilog, it would be Yosys' Safer Verilog or something, since there is no way to get the same behavior from Vivado

09:06 <whitequark> which is why it'll never be widespread.

09:06 <eddyb> I don't even know what that is from, maybe NixOS compiles some sanitizer into the binary or something lol

09:06 <whitequark> eddyb: glibc prints those i think

09:20 <eddyb> lol ERROR: Could not install packages due to an EnvironmentError: [Errno 30] Read-only file system: '/nix/store/4c4ajgdnhlqk994hilagk5cgv7vw9yzg-python3-3.7.3/lib/python3.7/site-packages/six.py'

09:20 <eddyb> I should look up how this is actually supposed to be done :P

09:32 <eddyb> ah, virtualenv https://nixos.org/nixpkgs/manual/#how-to-consume-python-modules-using-pip-in-a-virtualenv-like-i-am-used-to-on-other-operating-systems

09:39 <eddyb> whitequark: how do you actually run the tests? my naive attempts don't get very far

09:39 <whitequark> python3 setup.py test

09:40 <whitequark> or python3 -m unittest

09:40 <eddyb> OOOOOH

09:40 <eddyb> okay I see. `pip install .` also worked to get me `boneless-as`

09:41 lopsided98 has quit [Ping timeout: 276 seconds]

09:41 <whitequark> yep

09:41 <whitequark> or `python3 setup.py develop --user`

09:42 lopsided98 has joined ##openfpga

09:47 <eddyb> oh, cool, VSCode's Python extension autodetect my venv

09:50 <eddyb> whitequark: how do I run the "main functions" in alsru/control/core/decoder? what are they for?

09:50 <whitequark> eddyb: they're for generating verilog/rtlil for separate units of the CPU

09:51 <eddyb> ooh so they expose nmigen itself for those units?

09:52 <whitequark> what's "nmigen itself"

09:53 <eddyb> oh, I was looking at the top of the file and missed `from nmigen import cli` below. I wasn't sure if `cli` was from `nmigen` or from `boneless`

09:55 <eddyb> whitequark: like, nmigen the tool that does the things nmigen... ugh failing to use words, I should go to lunch. anyway how do I get python to run those? naive attempts fail

09:55 <whitequark> python3 -m boneless.gateware.alsru generate -t il foo.il

09:56 <eddyb> ooooh I was missing the boneless. at the start /facepalm

09:56 <eddyb> whitequark: thanks!

10:01 <eddyb> whitequark: `python -m boneless.gateware.core core-fsm+memory generate core.v` throws `nmigen.back.verilog.YosysError: ERROR: Parser error in line 30: syntax error`

10:02 <whitequark> eddyb: do you have yosys from master branch?

10:02 <whitequark> (or 0.9)

10:02 <eddyb> awww `Yosys 0.8+ (git sha1 d9daf09cf3, gcc 7.4.0 -fPIC -Os)`

10:03 <whitequark> yeah that's from 3 months ago

10:04 <eddyb> makes sense https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/compilers/yosys/default.nix#L11

10:16 <eddyb> ugh I have a build server, why am I torturing myself building this locally

10:36 <eddyb> heh this works https://gist.github.com/eddyb/f1f74f9b0d220fddb3d459b2235f6f26

10:36 <eddyb> whitequark: yeah, that was it, I have a big .v now :D

10:37 <whitequark> sweet

10:45 <zignig> eddyb: I can confirm that Boneless runs on the tinyfpgaBX , i've left it running blink for a few days.

10:48 <eddyb> zignig: aww I won't be the first. do you have the code for that up anywhere?

10:52 <zignig> I do , https://github.com/zignig/gizmotron/tree/master/v3

10:52 <zignig> what board are you using ? , it's set up for tinyfpgabx and uses nmigen-boards at the moment .

10:53 <eddyb> zignig: iCEstick atm

10:55 <eddyb> zignig: heh so this is your MMIO? https://github.com/zignig/gizmotron/blob/master/v3/core_v3.py#L79-L82

10:56 <zignig> ok , you will need zignig/gizmotron/tree/master/v3 , and edit the core_v3.py

10:56 <zignig> MMIO?

10:56 <eddyb> memory-mapped IO

10:56 <zignig> yep,

10:56 <whitequark> boneless doesn't have mmio

10:56 <whitequark> it has a separate address space for peripherals

10:56 flea86 has joined ##openfpga

10:57 <eddyb> oh, is that what `ext` is?

10:57 <whitequark> yes

11:00 <zignig> eddyb: I have been developing on it and it's borked at the moment

11:01 <zignig> eddyb: blink.asm is fixed now.

11:02 <eddyb> cool :D

11:04 <zignig> just change out the nmigen import to icestick and , change BB to icestick and get rid of the resources at the top.

11:17 <eddyb> zignig: I can just copy this and run it in a venv with boneless installed, right?

11:19 <cr1901> python3 path/to/nmigen/file.py -h, if you're using nmigen.cli

11:19 <cr1901> Ahhh shit, scrollback fail, nevermind

11:20 <eddyb> cr1901: that is the kind of thing that python didn't like, heh. I think because there are dependencies to other parts of boneless

11:20 <cr1901> I'd have to see the error, I haven't built boneless since Jan or so.

11:24 <zignig> eddyb: no idea , should be good! :)

11:25 <zignig> cr1901: have you looked at the new v3 code? the arch code has some dense python magic in it.

11:26 <eddyb> cr1901: that is, it works via e.g. python -m boneless.gateware.core but not python boneless/gateware/core.py

11:29 <zignig> cr1901: are you looking to rewrite your simulator for core_v3?

11:30 <whitequark> i think instruction semantics should be a part of arch., not a separate simulator

11:30 <whitequark> then the formal model can be generated from it too

11:31 <eddyb> zignig: if you like bitfield metaprogramming tricks, check this out https://github.com/eddyb/wiREd/blob/master/disasm/arm.js#L209

11:31 <cr1901> zignig: Basically, it looks like my code is obsolete and I have nothing to do :P

11:32 <zignig> a good plan , perhaps the instructions should have a python version of the operation as a field

11:32 <whitequark> yeah that's the idea i had

11:33 <zignig> cr1901: sort of, I found the simulator really useful for debugging the asm.

11:34 <zignig> whitequark: class ADD (C_ARITH, M_RRR, T_ADD, F_RRR ): sim = "rd = ra + rb"

11:34 <whitequark> not really like that no

11:35 <cr1901> The point is if insn semantics are part of the arch, then there is little code for me to actually write to do a simulator.

11:35 <eddyb> zignig: btw, `#!/usr/bin/python` is not portable (and might not even play nice with a venv)

11:36 <eddyb> you should use `#!/usr/bin/env python` (or even better, `#!/usr/bin/env python3`)

11:36 <zignig> eddyb: yeah , it's still in the "hacky crap" phase of development ;)

11:37 <eddyb> (mostly pointing it out because I hadn't even seen that use of `env` before NixOS - where it's the entirety of `/usr`. `/bin` only has `/bin/sh` and `/lib` doesn't exist :P)

11:38 <zignig> eddyb: fixed.

11:38 <whitequark> /usr/bin/env is very common outside of nixos

11:39 <eddyb> yeah I just mean there's a chance someone might never notice it until it actually makes a difference (which it sure can outside of NixOS, but it's not guaranteed :P)

11:39 <eddyb> zignig: `nmigen.build.res.ResourceError: Resource clk16#0 does not exist`

11:40 <eddyb> let me actually check what this supports

11:40 <cr1901> icestick doesn't have a clk16

11:40 <zignig> eddyb: you will have to use the clock on the ice stick , change it to clk12

11:40 <eddyb> 16 refers to 16MHz?

11:40 <cr1901> it has a clk12

11:41 <zignig> eddyb: it does , it's the default clock of the board

11:41 <cr1901> How to abstract something like this away is an open discussion right now

11:42 <zignig> cr1901: what do you mean ?

11:42 <cr1901> i.e. it should be possible to write platform-agnostic designs w/o much effort. Multiple ways to do it (subclass, mixins). I'm not sure which way is best, and others have asked too.

11:43 <eddyb> oh fun `unrecognised option '--placer'`

11:43 <whitequark> no subclasses or mixins

11:43 <whitequark> i've described the design this will use on the issue tracker, you can read it

11:43 <zignig> nmigen-board is a very good step towards that. I think a 'default' clock might be a good plan , for beginner.

11:43 <eddyb> it's not even clear what's outputting that

11:43 <zignig> eddyb: need the latest nextpnr

11:43 <eddyb> I was about to ask, lol

11:44 <cr1901> which issue on the issue tracker?

11:44 * zignig has migen,yosys and nextpnr pull and build on a cron job.

11:44 <whitequark> cr1901: https://github.com/m-labs/nmigen/issues/57#issuecomment-506598973

11:49 <cr1901> whitequark: Oh look, I'm tagged in that. This works fine for clocks specifically. I'm thinking of a different issue, such as how one would rewrite hdmi2usb's build infrastructure to take advantage of nmigen's dep injection.

11:49 <whitequark> hmmm

11:49 <whitequark> what do you mean?

11:52 <cr1901> I was looking to reduce code duplication in hdmi2usb, which has a tendency to create SoC's per platform (Base, Video class, etc) in their own isolated Python files. I have been wondering how to leverage the "platform" input to elaborate to nmigen, such that >>

11:52 <cr1901> _all_ platforms ultimately share the same single SoC Elaboratable for a given SoC class (Base, Video, etc)

11:52 <cr1901> and then use either subclassing or mixins that the SoC Elaboratable uses to completely abstract away platform differences.

11:53 <whitequark> ohhhh I see

11:53 <cr1901> I hadn't quite worked out how it concretely works.

11:53 <whitequark> I agree that requires research. I do not have any especially good ideas for designing that.

11:53 <eddyb> zignig: so this is fun, I think I can just run `nix build --store ssh-ng://build.lyken.rs` and it will build all the dependencies I have in `default.nix` on the server (if the official build servers don't have them, or I'm overriding their sources to get up-to-date versions)

11:54 <cr1901> Somewhere in the m-labs scrollback this came up. Getorix (sp) seems to be very interested in it too

11:54 <eddyb> (that server has -j48 and it doesn't make noise in the office :P)

11:55 <zignig> eddyb: looked at nix but not had a try yet, sounds like gentoo without the extreme waiting and rollback.

11:56 <eddyb> I mean, there are official build servers, like with most distros. but I guess you could compare it to gentoo in terms of being able to customize packages

11:56 * zignig observes that whitequark's 'not especially good ideas' are still probably awesome.

11:56 <eddyb> it's not perfect and sometimes you have to fight it to do something that assumes too much about Linux

11:57 <eddyb> but hey, if steam works on it, anything is possible, right :P?

11:57 <zignig> eddyb: indeed , do you have blinky yet , huh huh huh ?

11:57 <eddyb> zignig: it's building pypy3 for some reason, and taking a long time to do so

11:57 <zignig> cr1901: have you got a v3 on a icebreaker yet ?

11:58 <cr1901> I haven't tried, and I'm a bit preoccupied till Sunday most likely

11:58 <zignig> cr1901: away mission or just busy ?

11:58 <cr1901> but in my plans

11:58 <cr1901> both?

11:58 <zignig> :/

11:58 <cr1901> :P

11:59 <cr1901> but in my plans

11:59 <cr1901> it'll be useful on icebreaker plus 128kB SRAM

12:00 <zignig> cr1901: oooh, 64KW is 128Kb ... nice , need a SRAM driver. sounds like a plan.

12:00 <eddyb> while this is compiling every python package in existence or whatever it's doing, I'll go work on the world's most inefficient parser for arbitrary CFGs

12:01 * zignig hands eddyb a go faster button.

12:02 <eddyb> it's not updating often but it's outputting stuff like `[rtyper] specializing: 15800 / 157489 blocks (10%)`

12:02 <eddyb> zignig: is the button connected to a server with more than 48 logical cores :P?

12:03 <whitequark> zignig: doesn't icebreaker have SPRAM?

12:03 <whitequark> and of course boneless is specifically made to use SPRAM well

12:03 <zignig> whitequark: not that I am aware of , but cr1901 might have a pmod.

12:03 <whitequark> no I mean on the FPGA

12:03 <cr1901> No I meant SPRAM

12:03 <cr1901> I was being sloppy

12:04 <cr1901> zignig: Port this if you're bored. I wrote it for micropython support for mithro. It's known to work: https://github.com/timvideos/litex-buildenv/blob/master/gateware/ice40.py#L6

12:05 <whitequark> oh yeah I forgot we don't have SPRAM inference...

12:05 <cr1901> It's wishbone, so it's misoc/litex compat (no idea about heavyX)

12:05 <zignig> cr1901: don't have any SPRAM at the moment, i've only got a tiny BX at the moment. will get an EX or an orangecrab

12:06 <zignig> when they come out.

12:06 <whitequark> SPRAM is on-chip single port RAM on the iCE40UP5K

12:07 <zignig> whitequark: ah , ok. does it need a special driver ? or does nextpnr infer it ?

12:07 <whitequark> it needs the code cr1901 linked you, for now

12:07 <whitequark> this will be improved in yosys some day

12:07 <zignig> noted.

12:08 Asu` has joined ##openfpga

12:08 Asu has quit [Ping timeout: 268 seconds]

12:08 <zignig> ah , Instances. that reminds me , need to look into Warmboot and PLL at some point.

12:10 <tnt> zignig: note that you can't initialize the spram, so you need to "boot" from EBR.

12:11 <eddyb> one could make an UART bootloader, right?

12:11 <whitequark> tnt: ohhhhh

12:11 <whitequark> this makes me think if boneless' careful adjustment to only ever use 1 port (in the smallest configuration) is actually bad

12:11 <eddyb> or maybe read it in via SPI

12:12 <tnt> whitequark: yeah, the slight detail I had overlooked at first :p

12:12 <zignig> eddyb: I've written most of one for the defunct core_v2, which I will port once I have a handle on the assembler.

12:12 <whitequark> because if I use a BRAM in front...

12:12 <whitequark> it might as well be a cache

12:12 <whitequark> hmm

12:12 <whitequark> tnt: wild idea

12:12 <whitequark> a special hack in the instruction decoder that hardcodes memory writes from the external bus

12:13 <whitequark> I think this can be done very cheaply, possibly even at 0 LUT cost

12:13 <whitequark> so it would just cycle "read external, write memory, increment"

12:13 <zignig> whitequark: some magic in the BusArbiter ?

12:14 <whitequark> maybe reusing the PC counter to do the increment

12:14 <tnt> eddyb: yeah, at one point I was just having a small hardcoded spi flash reader that preloaded the spram and then released the reset line of a picorv32 once it had been loaded.

12:14 <whitequark> yeah I think I'll do it that way

12:14 <whitequark> because it saves a few muxes on the critical path

12:14 <whitequark> zignig: no, purely in the decoder
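
The "read external, write memory, increment" cycle proposed above can be sketched behaviourally. This is only an illustration of the idea, not the Boneless decoder: `boot_copy`, `read_external`, and the fixed `length` are hypothetical, and it assumes the same counter addresses both the external bus and memory.

```python
def boot_copy(read_external, memory, length):
    """Copy `length` words from an external source into `memory`.

    `pc` stands in for the program counter being reused as the copy index.
    """
    pc = 0
    while pc < length:
        word = read_external(pc)  # read external
        memory[pc] = word         # write memory
        pc += 1                   # increment
    return pc


# Usage: preload a small memory from a stand-in for external storage.
external_image = [0x1234, 0xBEEF, 0x0001, 0xFFFF]
memory = [0] * 8
end = boot_copy(lambda addr: external_image[addr], memory, len(external_image))

print(end, [hex(w) for w in memory[:4]])
```

Once the loop ends, normal fetch could start from a memory that now holds the image, which is the same job tnt's hardcoded SPI flash reader did for the picorv32.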

12:14 <eddyb> heh I think there was an arch that even put special instruction forms for `FFxx` addresses and half of that was RAM, the other half IO

12:14 <eddyb> am I thinking of the gameboy?

12:14 <eddyb> s/put/had

12:14 <whitequark> lots of arches have something like that

12:15 pie_ has joined ##openfpga

12:16 <cr1901> mips is like that: depending on which memory address you access, you bypass cache, MMU, or both

12:17 <cr1901> or neither*

12:18 <eddyb> cr1901: that stuff had me very confused until someone (whitequark?) explained it to me

12:19 <eddyb> because I started from what other people have tried to document about the N64, I kept thinking it was a weird thing Nintendo did, instead of something specific to MIPS

12:21 <cr1901> MIPS also doesn't have page table walks, so you have to write the code to figure out whether a page is in memory manually. Honestly, I think this is less hassle than doing it in h/w and wish riscv didn't mandate a hardware walk.

12:21 <cr1901> Mips solution seems pretty good to actually getting MMU shit to work (oh wow I praised MIPS)

12:21 <whitequark> software walks are very hard to get right

12:22 <whitequark> first, you have to pin your tlb handler in the tlb, which wastes often a lot of space in it

12:22 <whitequark> and in general the logic to keep it there is not trivial

12:22 <whitequark> second, your tlb handler uses registers, so you need some way to save the registers

12:23 <whitequark> but you can't assume you can access pretty much any other memory

12:23 <cr1901> you need to pin two tlb entries- one for the handler, and... the other I forget, tbh :P

12:23 <whitequark> for the page tables?

12:23 <cr1901> might've been something more specific.

12:24 <cr1901> whitequark: It's been a while since I studied it. I remember thinking at the end "this seems like so much less hassle than testing and implementing hardware walk logic". I can appreciate that it only _seems_ easier, and that doing either s/w or h/w walk is shit to implement.

12:25 <whitequark> iirc there was something about VIVT/PIPT TLB and such

12:25 <whitequark> but MMUs and caches were never my strong point

12:25 <cr1901> Somebody uses those?

12:26 <cr1901> eddyb: Yea another thing... you ever heard the term VIPT?

12:26 <whitequark> no I mean, IIRC with sw walk you have fewer possible combinations

12:26 <whitequark> but I'm not sure

12:31 <eddyb> cr1901: I don't think so, no?

12:32 <tnt> whitequark: took me a bit of time to understand what you meant, but yeah, that kind of instruction would definitely be useful for a lot of things when moving data in from external devices into local memory. The opposite ( read memory / write external ) would also be useful I think if it can fit at zero cost in the decoding logic.

12:32 <eddyb> ugh I just remembered a stray thought from yesterday ("ccNUMA over air")

12:32 <whitequark> tnt: hopefully

12:32 <whitequark> but I need to finish this yosys pass first

12:34 <cr1901> eddyb: Short version, VIPT is an optimization where cache and MMU accesses are done slightly in parallel. It is extremely common. It also leads to a lovely situation where two addresses in cache can point into the same page (or is it "two different pages can point to the same address"?).

12:34 <cr1901> So you have to write a "page coloring algorithm" to ensure that pages will never be aligned in such a way that aliasing occurs, and it's all just a fricking mess I don't understand :'D

12:34 <eddyb> tnt: next thing you know you're implementing macroop fusion :P

12:35 <eddyb> (if Boneless had compressed instructions, would they be one byte each?)

12:36 <whitequark> boneless already has multicycle instructions and multiword instructions

12:36 <eddyb> oh right nvm

13:11 flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]

13:36 dh73 has joined ##openfpga

13:54 Sprite_tm has quit [Remote host closed the connection]

13:57 rohitksingh_work has quit [Read error: Connection reset by peer]

14:04 emeb has joined ##openfpga

14:21 genii has joined ##openfpga

14:26 Asu` has quit [Ping timeout: 246 seconds]

14:27 Asu has joined ##openfpga

14:30 rohitksingh has joined ##openfpga

14:30 cr1901 has quit [Quit: Leaving.]

14:31 cr1901 has joined ##openfpga

14:37 carl0s has joined ##openfpga

15:07 moho1 has quit [Quit: WeeChat 2.2]

15:08 <eddyb> zignig: wow this took forever and didn't even succeed "builder for '/nix/store/r9s808a51vrag0hhsscrr58y2sgsh8a3-pypy3-7.0.0.drv' failed with exit code 1"

15:44 rohitksingh has quit [Ping timeout: 248 seconds]

15:48 cr1901 has quit [Quit: Leaving.]

15:48 cr1901 has joined ##openfpga

16:06 m4ssi has quit [Remote host closed the connection]

16:15 Sprite_tm has joined ##openfpga

16:16 <Sprite_tm> Hey guys/gals, I'm getting an issue when programming my ECP5 using openocd/jtag...

16:16 <Sprite_tm> Error: tdo check error at line 26713

16:16 <Sprite_tm> Error: READ = 0x6c00000

16:16 <Sprite_tm> Error: WANT = 0x0000100

16:16 <Sprite_tm> Error: MASK = 0x0002100

16:16 <Sprite_tm> I can decode this using the ECP docs into Config: SRAM, SPIm fail 1, BSE error:User aborted configuration, Execution error

16:17 <Sprite_tm> Unfortunately, that still doesn't tell me what's actually going wrong here. FWIW, jtag frequency doesn't seem to matter.
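
The READ, WANT, and MASK values in that error are related by a masked comparison: an SVF TDO check only compares the bits selected by MASK. A minimal sketch of that comparison, using the values from the log; the `tdo_mismatch` helper is hypothetical and is not an openocd function.

```python
def tdo_mismatch(read: int, want: int, mask: int) -> int:
    """Return the bits where `read` and `want` differ, limited to `mask`."""
    return (read ^ want) & mask


READ, WANT, MASK = 0x6C00000, 0x0000100, 0x0002100

diff = tdo_mismatch(READ, WANT, MASK)
checked_bits = [bit for bit in range(32) if (MASK >> bit) & 1]
failing_bits = [bit for bit in range(32) if (diff >> bit) & 1]

print(hex(diff), checked_bits, failing_bits)
```

Only bits 8 and 13 are checked here, and only bit 8 differs from the expected value. The other bits set in READ are outside the mask, so they do not cause the check to fail, though they do carry the status that was decoded by hand above.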

16:20 <daveshah> What board and environment?

16:21 <Sprite_tm> Custom board. ECP5 LFE5U-45F 8BG381C

16:21 <Sprite_tm> What do you mean by 'environment'?

16:21 <daveshah> Programmer

16:22 <daveshah> and anything else relevant (other JTAG devices, etc)

16:22 <Sprite_tm> Tiao Tumpa FT2232H board. It's known to work.

16:22 <Sprite_tm> No other JTAG devices.

16:23 <daveshah> Is this your first test of this board, or has it worked before/with Diamond?

16:23 <Sprite_tm> No, 1st test. It's a bringup test, at the moment 2/2 boards fail :/

16:24 <Sprite_tm> I have no clue why. I had pretty similar hardware on the previous incarnation, with the most striking difference that that used an -uM-45F.

16:24 <daveshah> Have you changed the script to the correct device type?

16:25 <Sprite_tm> Yes. Both the openocd script as well as the ecppack command line.

16:25 <daveshah> You should change nextpnr, not ecppack

16:25 <daveshah> For all but the 12k, no device argument to ecppack is needed

16:25 <Sprite_tm> Ah, sorry, nextpnr indeed. Ecppack didn't have an argument.

16:26 <Sprite_tm> or no device argument at least.

16:27 <daveshah> Just to check, can you post the full command line (although a bad device ID would probably fail much earlier) for nextpnr and ecppack?

16:27 <Sprite_tm> nextpnr-ecp5 --json $< --lpf $(CONSTR) --textcfg $@ --45k --package CABGA381 --speed 8 --freq $(FREQ)

16:27 <Sprite_tm> ecppack --spimode $(FLASH_MODE) --freq $(FLASH_FREQ) --svf-rowsize 100000 --svf $(PROJ).svf --input $< --bit $@

16:28 <daveshah> Can you try ecppack without freq and spimode? It might be confusing things

16:28 <Sprite_tm> O_O

16:28 <Sprite_tm> That works.

16:30 <Sprite_tm> Any idea what that's about? I'd love to keep those args tbh, boot is 4 secs or so otherwise.

16:30 <Sprite_tm> If not, I'll just experiment to see what works btw.

16:30 <daveshah> Those commands set various SPI-flash specific things in the bitstream, which it seems the ECP5 won't accept via JTAG

16:30 <Sprite_tm> Also, mode is qspi and flash_freq is 38.8 (MHz).

16:31 <Sprite_tm> Huh. I'm 99% sure the -UM-45F had no issue with it.

16:31 <Sprite_tm> Can't test as I had to rewire my JTAG-cable :/

16:31 <daveshah> Odd. Maybe different silicon revisions or something

16:31 <Sprite_tm> Perhaps indeed. Ah well, good to know.

16:32 <Sprite_tm> Thanks for the help, I'd've been off checking about five million things before I'd suspect those settings.

16:32 <Sprite_tm> If anything, it's on the Internet now so other people will get a search result :)

16:38 <daveshah> Anyway, just pushed a fix so this won't happen again (SPI options won't be included in the SVF file)

16:40 <whitequark> daveshah: https://imgur.com/a/RHiXInO

16:40 <whitequark> i'm using a borderline insane approach but it seems to actually work pretty well

16:42 <daveshah> Oh nice, think I understand it a bit better looking at that

16:43 rohitksingh has joined ##openfpga

16:43 <whitequark> it's a BDD! it's a CFG! it's an SSA-like IR! it's a combination etc

16:43 <whitequark> what i realized is that yosys' switch statements are exactly isomorphic to ML pattern matching

16:44 <whitequark> so i simply did the CPS transform on it and then used a standard approach from Le Fessant's paper

16:44 moho1 has joined ##openfpga

16:44 <whitequark> this produces code linear in size *and* is polynomial complexity itself

16:45 <whitequark> except now there's a proc_* pass that contains a complete optimizing compiler with an IR and several passes

16:45 <Sprite_tm> daveshah: Awesome, that's a very rapid turnaround time :P

16:48 <daveshah> whitequark: hehe, I think we will add a few more things like that as we add more and more higher-level optimisations

16:48 <whitequark> yes but, yo dawg

16:50 <whitequark> daveshah: https://github.com/whitequark/yosys/commits/proc_match

16:58 Miyu has quit [Ping timeout: 272 seconds]

17:05 azonenberg_work has joined ##openfpga

17:10 cr1901 has quit [Quit: Leaving.]

17:10 cr1901 has joined ##openfpga

17:44 bubble_buster has quit [Ping timeout: 257 seconds]

17:44 bubble_buster has joined ##openfpga

17:46 daveshah has quit [Ping timeout: 257 seconds]

17:47 Jybz has joined ##openfpga

17:48 daveshah has joined ##openfpga

17:52 rohitksingh has quit [Ping timeout: 244 seconds]

18:11 Asu is now known as Asu_

18:16 mkdir has joined ##openfpga

18:16 <mkdir> hello, how do you declare float var in verilog

18:16 <mkdir> 'real' type throws error

18:17 <whitequark> what error? in what program?

18:17 <mkdir> ERROR: syntax error, unexpected TOK_REAL

18:17 <mkdir> ice40

18:17 <mkdir> sorry yosys

18:18 <whitequark> floats are not synthesizable

18:18 <whitequark> they're simulation only

18:18 <mkdir> mm so what is real used for? and what's a better type for floats? reg[63:0] hz = 0.5

18:18 <mkdir> works okay

18:19 <whitequark> don't expect to be able to use floats like you'd do it in C

18:19 <whitequark> you will need dedicated logic that implements your desired float semantics

18:19 <mkdir> oh

18:19 <whitequark> synthesizable verilog arithmetic is *only* integers (well booleans too)

18:20 <mkdir> hmm so how do we deal with real numbers

18:20 <mkdir> and why does the real type exist?

18:20 <mkdir> i thought it was for floats

18:20 <mkdir> reference: https://www.csee.umbc.edu/portal/help/VHDL/verilog/types.html

18:20 <whitequark> i told you: real is simulation only

18:20 <mkdir> ooh

18:21 <whitequark> it's for e.g. comparing your logic's behavior to something more precise

18:21 <whitequark> or for representing voltage levels or something

18:21 <whitequark> 98% of verilog cannot be used at all in synthesis

18:21 <mkdir> so this works: reg[7:0] hz = 0.5;

18:21 <mkdir> but maybe not the way i think?

18:21 <whitequark> i think that just coerces to integer and ends up 0 or 1
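
Since synthesizable arithmetic is integer-only, one common way to carry a fraction such as 0.5 is fixed point: store the value scaled by a power of two and do ordinary integer maths on it. Nobody in the channel proposed this; it is offered here only as an illustration, modelled in Python, with an assumed format of 8 fractional bits and hypothetical helper names.

```python
FRAC_BITS = 8  # assumed format: 8 fractional bits (scale factor 256)


def to_fixed(x):
    """Scale a real number to an integer with FRAC_BITS fractional bits."""
    return int(round(x * (1 << FRAC_BITS)))


def fixed_mul(a, b):
    """Multiply two fixed-point values; the product carries twice the
    fractional bits, so shift right once to get back to the format."""
    return (a * b) >> FRAC_BITS


def to_float(a):
    """Convert back to a real number, for checking results only."""
    return a / (1 << FRAC_BITS)


half = to_fixed(0.5)      # stored as the integer 128
three = to_fixed(3.0)     # stored as the integer 768
product = fixed_mul(half, three)
print(half, three, product, to_float(product))
```

In hardware the same idea is just a wider integer register plus a shift after each multiply, which is far cheaper than a floating-point unit.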

18:21 <mkdir> ah

18:22 <daveshah> You probably don't want to be using floating point stuff on an iCE40 anyway

18:22 <whitequark> also that, especially not 64 bit floats

18:22 <mkdir> hmm ok

18:23 <daveshah> I remember someone who copied some HSV to RGB code into an HLS tool, asked for a single cycle implementation, and got about an 80k LUT design out

18:23 <daveshah> Because it was all 64 bit floats

18:23 <whitequark> lol

18:24 <daveshah> You can hit similar things even with integers using division or modulo (by anything other than a constant power of two)

18:24 <whitequark> how would yosys even implement /5?

18:24 <mkdir> so what data types are recommended?

18:24 <mkdir> reg and wire?

18:25 <mkdir> what's the diff between

18:25 <ZirconiumX> You can store stuff to a reg but not a wire

18:25 <whitequark> reg can be assigned from `always` statement, wire can be assigned with the `assign` statement

18:25 <whitequark> note that reg is not necessarily implemented as a register

18:26 <daveshah> whitequark: the algorithm is in https://github.com/YosysHQ/yosys/blob/master/techlibs/common/techmap.v#L281

18:26 <daveshah> Just lots of subtracts and compares...
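
"Lots of subtracts and compares" is the shape of a shift-and-subtract divider: one compare and one conditional subtract per result bit. The sketch below is plain restoring division modelled in Python. It is not a transcription of the linked techmap.v, and whether that file uses exactly this scheme is not confirmed here; the function name and the `width` parameter are assumptions.

```python
def divmod_shift_subtract(dividend, divisor, width):
    """Unsigned restoring division of a width-bit dividend.

    Each loop iteration corresponds to one compare plus one conditional
    subtract, which is why a combinational divider grows quickly.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in reversed(range(width)):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        if remainder >= divisor:        # compare
            remainder -= divisor        # subtract
            quotient |= 1 << i
    return quotient, remainder


print(divmod_shift_subtract(123, 5, 8))
```

For a fully unrolled 32-bit divide that is 32 comparators and 32 subtractors chained one after another, which is the cost daveshah is warning about.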

18:26 <mkdir> but reg can be assigned outside of always too right

18:26 <mkdir> but wire cannot?

18:26 <ZirconiumX> So non-restoring division?

18:27 <whitequark> mkdir: nope, reg can only be assigned from always

18:27 <whitequark> and wire only from assign

18:27 <whitequark> the distinction between them (for synthesis) is purely syntactical

18:29 <mkdir> whitequark: https://pastebin.com/LnTG116H

18:29 <mkdir> look at code from Lattice

18:29 <whitequark> sure

18:29 <mkdir> reg types are defined outside always

18:29 <mkdir> oh

18:29 <mkdir> I see

18:29 <mkdir> wire cannot be inside

18:29 <whitequark> you define both wire and reg outside always, generally

18:30 <mkdir> hmm i see well then maybe i did not understand the previous statement

18:30 <mkdir> mkdir: nope, reg can only be assigned from always

18:30 <whitequark> you can't write `reg x; assign x = ...`

18:31 <ZirconiumX> If you see here, div_cntr* are all reg, and all have non-blocking <= assignments in an always block

18:31 <whitequark> you have to write `reg x; ... always @(...) x <= 1`

18:31 <whitequark> or `x = 1`

18:31 <whitequark> depending on whether the always block is clocked or not

18:31 <ZirconiumX> Outside the always block, the LED* outputs (which are wires) are assigned

18:32 <mkdir> ooh ok thanks i see

18:32 <azonenberg> Anybody here have a few minutes to sanity check my understanding of some bignum/crypto code?

18:34 <whitequark> nope too scared to look

18:36 <azonenberg> lol

18:40 mkdir has quit [Remote host closed the connection]

18:40 mkdir has joined ##openfpga

18:57 Lord_Nightmare has quit [Quit: ZNC - http://znc.in]

19:00 Lord_Nightmare has joined ##openfpga

19:08 mkdir has quit [Ping timeout: 260 seconds]

19:15 carl0s has quit [Remote host closed the connection]

19:20 SpaceCoaster has quit [Quit: ZNC 1.6.5+deb1+deb9u2 - http://znc.in]

19:24 SpaceCoaster has joined ##openfpga

19:40 emeb_mac has joined ##openfpga

19:41 Miyu has joined ##openfpga

19:42 <kc8apf> bignum, sure

19:42 <kc8apf> crypto, nope

19:44 <TD-Linux> might be interesting to convert integer divides to dsp blocks instead

19:53 Asu_ has quit [Remote host closed the connection]

19:54 Asu has joined ##openfpga

19:59 <whitequark> tnt: looked through boneless in search of places where 'x could help.

19:59 <azonenberg> kc8apf: i'm trying to port the C reference implementation of x25519 in NaCl to HDL

19:59 <whitequark> found two where 5 levels of muxes (2 luts) become a wire instead

19:59 <whitequark> and yes, my pass can gracefully handle that, in fact, although it doesn't right now

20:00 <kc8apf> why NaCl?

20:00 <azonenberg> Because that was the most readable 25519 implementation i found

20:00 <azonenberg> djb's ref implementation on the website was in assembly

20:01 <azonenberg> the versions in libsodium were all fancy optimized CPU code

20:01 <tnt> whitequark: how does it "know" it can be reduced to wires ?

20:01 <azonenberg> undoing all the bignum stuff and converting back to a straight logic[255:0] is a big pain

20:01 <azonenberg> the nacl "ref" implementation is the least optimized one i could find

20:01 <azonenberg> so i had the least work to undo :p

20:01 <kc8apf> ugh. premature optimization

20:01 <whitequark> tnt: well, the pass can see that 4 of 5 decision points (mux levels) lead either to one particular branch, or 'x

20:01 <azonenberg> kc8apf: https://www.antikernel.net/temp/smult.c

20:01 <kc8apf> I really shouldn't sign up for any additional projects. I'm already getting lost in Cyclone V docs

20:01 <azonenberg> and i use "readable" lightly

20:02 <whitequark> so it discards the branch that leads to 'x together with the decision point itself

20:02 <kc8apf> and I should be poking at BMCs

20:02 <azonenberg> note the 100% lack of comments outside the header

20:02 <whitequark> that reduces it to 1 decision point, which looks like `i_insn[5] ? 1 : 0`

20:02 <whitequark> which will be later reduced to a wire by a later pass
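
The reduction whitequark describes can be pictured on a small decision tree: any decision point with one branch leading to 'x is replaced by its other branch. The following is only a toy model of that idea, not the proc_match pass; the tree encoding, the `X` marker and the bit numbers other than `i_insn[5]` are invented for the example.

```python
X = object()  # stands in for Verilog 'x (don't care)


def simplify(node):
    """Simplify a decision tree.

    A node is either a leaf (an int or X) or a tuple
    (select_bit, value_if_0, value_if_1).
    """
    if not isinstance(node, tuple):
        return node
    sel, if0, if1 = node
    if0, if1 = simplify(if0), simplify(if1)
    if if0 is X:          # one side is don't-care: keep the other side
        return if1
    if if1 is X:
        return if0
    if if0 == if1:        # both sides agree: the decision is redundant
        return if0
    return (sel, if0, if1)


# Five decision points; four of them lead either onward or to 'x.
tree = (0, (1, (2, (3, (5, 0, 1), X), X), X), X)
print(simplify(tree))
```

What is left is the single decision on bit 5 selecting between 0 and 1, which a later pass can turn into a wire.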

20:02 <azonenberg> ultra dense code with almost no whitespace

20:02 <tnt> whitequark: ah oki, so it's using 'x' to mark "don't cares". I had mis-understood, I thought somehow the pass "magically" knew which ... I didn't understand how obviously.

20:02 <azonenberg> no descriptions of what the code actually DOES

20:02 <azonenberg> no theory of operation, etc

20:02 <whitequark> yes, that was me considering whether boneless has any places where 'x in decoder would actually help

20:03 <azonenberg> i'm slowly porting things and still don't actually understand how some parts of it work

20:03 <whitequark> it would not improve clock speed, but it would shave a few luts... maybe like 5

20:03 <kc8apf> classic C programmers worried about all the wrong things

20:03 <kc8apf> see all the reciprocal handling, probably unnecessary

20:04 <azonenberg> like, i've written ultra optimized code too

20:04 <kc8apf> but everyone thinks they need to avoid standard libraries

20:04 <azonenberg> But it was HEAVILY commented

20:04 <azonenberg> yeah, but at least i know this version works

20:04 <azonenberg> So as long as i feed test vectors to it and to my code, i can make mine readable :p

20:04 <tnt> whitequark: I mean, depending how you coded the decode logic it might/might-not help. Like if you kind of "manually" assigned those bit of the op code to directly control that mux and hardwired it already, obviously it won't help much. (that's just a random example)

20:04 <whitequark> kc8apf: https://github.com/jedisct1/libsodium/commit/3cefff9e52b51a310c28c8dbb6b3edf1171c3d8b

20:04 <whitequark> tnt: so what I did in the boneless decoder is that it is 100% table driven

20:05 <whitequark> i.e. it uses pattern matching to map input to output, for every input and output

20:05 <whitequark> even when there is actually a field in the instruction

20:05 <whitequark> the idea is that, best case, the synthesis tool unfolds all that and it's just wires

20:05 <whitequark> but if i make a mistake somewhere, i still get correct behavior
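
A table-driven decoder of this kind can be sketched as a list of bit patterns with `-` as don't-care, in the style of the `m.Case("110010----1010")` patterns mentioned earlier in the log. The code below is a plain Python model, not nMigen and not the Boneless decoder; the 4-bit patterns and the operation names are hypothetical.

```python
def matches(pattern, value):
    """True if value matches the pattern; '-' matches either bit."""
    if value < 0 or value >> len(pattern):
        return False  # value does not fit in the pattern width
    bits = format(value, "0{}b".format(len(pattern)))
    return all(p in ("-", b) for p, b in zip(pattern, bits))


def decode(table, value, default=None):
    """Return the output of the first matching row, like a switch."""
    for pattern, output in table:
        if matches(pattern, value):
            return output
    return default


# Hypothetical 4-bit encoding, for illustration only.
TABLE = [
    ("00--", "ALU"),
    ("01--", "LOAD"),
    ("1---", "JUMP"),
]

print(decode(TABLE, 0b0110))
print(decode(TABLE, 0b1011))
```

Every output is derived from the full instruction word by matching, even where the table makes it a direct copy of a field, so a mistake in the table shows up as extra logic and not as wrong behaviour.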

20:05 <tnt> Let me have a look at the code, that might be easier for me to understand.

20:06 <whitequark> https://github.com/whitequark/Boneless-CPU/blob/master/boneless/gateware/decoder.py

20:06 <whitequark> e.g. the entire self.o_cond switch is a no-op, or at least is supposed to be

20:07 <whitequark> I only helped the synthesizer in one place... manually hoisted o_cond and o_flag out of the jump opcodes, so it is not reset to 0 when the operation is not a jump

20:08 <TD-Linux> azonenberg, if you aren't tied to that p

20:09 <TD-Linux> particular curve the folks in #secp256k1 are pretty good

20:09 <azonenberg> I'm doing 25519 per requirements of a protocol i want to use it on

20:09 <azonenberg> doesn't support any other curves

20:09 <azonenberg> Also https://www.antikernel.net/temp/md5_kernel.cu

20:09 <azonenberg> This is hyper-optimized code i wrote ten years ago for password cracking

20:10 <azonenberg> back when gpgpu was just taking off

20:11 <azonenberg> doing things like using duff's device for unrolled loops with variable iteration counts using a single conditional

20:11 <azonenberg> (which i independently invented, I didn't know this was a standard thing with a name until years later)

20:12 <tnt> whitequark: I see. But you kinda helped a bit manually by having multiple "with m.Switch(self.i_insn):"

20:12 <whitequark> tnt: did I?

20:13 <tnt> (like L347 and L358 for instance)

20:13 <azonenberg> ah ok this isnt the exact code i was looking for

20:13 <whitequark> tnt: nope, that is done solely for readability

20:13 <azonenberg> but somewhere in that tool i think i have a strcat() that never touches ram

20:14 <whitequark> tnt: actually i had to add a pass to my proc_match pass so that these nested switches would be *as efficient* as a flat one

20:14 <whitequark> ie it is worse

20:14 <tnt> whitequark: Oh wait, my bad, ABS and LIT aren't instructions.

20:14 <whitequark> oh yeah that too

20:14 <azonenberg> kc8apf: https://www.antikernel.net/temp/sha1_kernel.cu

20:15 <azonenberg> https://www.antikernel.net/temp/sha1_kernel_core.h

20:15 <azonenberg> this is what good optimization looks like... WITH COMMENTS

20:15 Jybz has quit [Quit: Konversation terminated!]

20:31 <tnt> whitequark: jfyi, that's what I currently use https://gist.github.com/smunaut/6e3576436ed4e72f0e6aafc080f9e630#file-f16_dec_gen-py-L541

20:33 <whitequark> tnt: i'm 95% certain i can match or beat that with my new pass

20:35 <tnt> heh yeah, I'd have to try it, but that requires rewriting it to output switch statements rather than a truth table.

20:35 <whitequark> i need to finish the pass first... i have the priority decoders taken care of

20:35 <whitequark> but no connection to pmux cells yet

20:35 <whitequark> i want to stuff in a few more optimizations too

20:37 <whitequark> one optimization to improve delay with some coding styles, and another optimization to improve techmapping of very long literal comparisons

20:37 <whitequark> currently if you compare with something like 000000001 it synthesizes into a long mux chain

20:37 <whitequark> but it should just be a $eq cell

20:46 dh73 has quit [Remote host closed the connection]

20:47 dh73 has joined ##openfpga

22:08 <ZirconiumX> Since CRC works by XORing when it reaches a most-significant 1 bit, then CRC on all-zeroes is either the polynomial or the initial register value, right?

22:08 <ZirconiumX> Trying to break a CRC on a bitstream and it's not going well

22:10 Miyu has quit [Ping timeout: 245 seconds]

22:11 <adamgreig> lots of scope for weird tricks with crc though, like reversing input/output order and inverting output

22:11 <ZirconiumX> Yeah...

22:12 <adamgreig> http://reveng.sourceforge.net/

22:13 <ZirconiumX> Been running backwards and forwards through stuff with that

22:14 <ZirconiumX> It's not very trivial to extract 916-byte frames from a hexdump to give to reveng

22:15 <mwk> ZirconiumX: not exactly

22:15 <mwk> it's common to use non-0 initial register value (all-1 usually)

22:17 <mwk> if the polynomial is any good, using all-0 input will effectively run the polynomial as LFSR and give you a random-looking value

22:17 <ZirconiumX> Mmm

22:18 <hackerfoo> If you know (can guess) the exact algorithm, you could just try all 64k polynomials, right?

22:18 <mwk> no

22:19 <mwk> there are more parameters

22:19 <mwk> and there are only 64k polynomials if it's a 16-bit CRC

22:19 <mwk> also, if it's a CRC, cracking it is much much simpler

22:20 <mwk> the whole thing is linear

22:20 <hackerfoo> Ah. Well, I found this: https://www.cosc.canterbury.ac.nz/greg.ewing/essays/CRC-Reverse-Engineering.html

22:21 <mwk> well, affine with some common modifications

22:21 <ZirconiumX> Yup

22:21 <mwk> oh yes

22:21 <mwk> that's the article I was looking for

22:21 <ZirconiumX> Read it

22:21 <adamgreig> the article is a classic but surprisingly unhelpful in practice i found

22:21 <hackerfoo> This looks neat: http://reveng.sourceforge.net/

22:22 <adamgreig> unless you're on top of your linear algebra it's still a bit of a guessing game

22:22 <mwk> just beware

22:22 <adamgreig> or anyway it was for me and i was meant to be very on top of linear algebra at that point :p

22:22 <mwk> we're talking FPGAs

22:22 <mwk> the modifications can be something no one in their right mind would implement in software

22:23 <mwk> on one FPGA family, Xilinx computes CRC-16, but feeds 36-bit words into it

22:23 <mwk> which are a concatenation of just-written 32-bit data word and current 4-bit destination register address

22:23 <mwk> on another FPGA family, Xilinx computes a 22-bit CRC from 22-bit words

22:23 <mwk> and figuring out *what* goes into a CRC is often harder than figuring out the algorithm itself

22:24 <adamgreig> or at least, once you know exactly what goes in and what comes out it should be relatively quick to get the algorithm

22:24 <mwk> in my experience, it's the other way around

22:24 <mwk> first figure out more-or-less the algorithm

22:25 <mwk> inducing neighboring bitflips and comparing the resulting XOR values for many places in the bitstream is useful here
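
The bitflip trick works because a CRC is affine: for messages of equal length, the XOR of the two checksums depends only on which bit was flipped, not on the message contents or the initial value. A small sketch under stated assumptions: a plain MSB-first CRC-16 with no reflection and no final XOR, and polynomial 0x1021 picked arbitrarily for the demo. It is not the CRC of any particular FPGA bitstream, and the function names are made up.

```python
def crc16(data, poly, init):
    """Bitwise MSB-first CRC-16 without reflection or final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def flip_signature(msg, bit_index, poly, init):
    """XOR of the CRCs of msg and of msg with one bit flipped.

    bit_index counts from the most significant bit of the first byte.
    """
    flipped = bytearray(msg)
    flipped[bit_index // 8] ^= 0x80 >> (bit_index % 8)
    return crc16(msg, poly, init) ^ crc16(bytes(flipped), poly, init)


POLY, INIT = 0x1021, 0xFFFF          # assumed parameters for the demo
msg_a = b"hello world!"
msg_b = b"\x00" * len(msg_a)
last = len(msg_a) * 8 - 1

sig_last = flip_signature(msg_a, last, POLY, INIT)
sig_prev = flip_signature(msg_a, last - 1, POLY, INIT)
print(hex(sig_last), hex(sig_prev))
```

For this unreflected variant, flipping the very last message bit yields the polynomial itself, and each neighbouring bit is one further shift-register step away, which is what makes comparing neighbouring flips informative.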

22:26 azonenberg has quit [Remote host closed the connection]

22:28 azonenberg has joined ##openfpga

22:34 <ZirconiumX> So, I literally just put in 914 zero bytes followed by 6C 93 and RevEng cracked it

22:35 <ZirconiumX> Now to see if it holds for the other data I have

22:35 <ZirconiumX> (CRC-16/MODBUS)
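
For reference, CRC-16/MODBUS is the reflected form of polynomial 0x8005 (0xA001 when bit-reversed) with initial value 0xFFFF and no final XOR. A minimal bitwise sketch follows; the function name is arbitrary, and the assumption that the frame is 914 payload bytes followed by the checksum low byte first is taken from the message above, not verified against real bitstream data.

```python
def crc16_modbus(data):
    """Bitwise CRC-16/MODBUS: reflected 0x8005, init 0xFFFF, no final XOR."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


# Standard check string used by CRC catalogues.
print(hex(crc16_modbus(b"123456789")))

# Checksum of an all-zero 914-byte payload, low byte first, for
# comparison with the trailer bytes quoted in the log.
payload = bytes(914)
trailer = crc16_modbus(payload).to_bytes(2, "little")
print(trailer.hex())
```

With these parameters, appending the checksum low byte first makes the CRC over the whole frame come out as zero, which is a convenient way to validate frames without splitting off the trailer.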

22:47 <ZirconiumX> Yes, it is

22:47 <ZirconiumX> Well, that was easier than expected

22:59 Asu has quit [Quit: Konversation terminated!]

23:01 dh73 has quit [Ping timeout: 260 seconds]

23:20 Bike has joined ##openfpga

23:57 genii has quit [Read error: Connection reset by peer]