<d1b2>
<a> but i'm getting weird responses when i try to use it
<d1b2>
<a> i can tell that it's working at least at the basic level, since the command-sequence for "write" works consistently and it gives some sort of error when an invalid i2c slave address is given
<d1b2>
<a> but when i try to do a read operation, i get back a 1 response from the address-write portion of it, and then 255, 255 from the read portion
peepsalot has quit [Quit: Connection reset by peep]
<d1b2>
<TiltMeSenpai> uhh i2c has pullups
<d1b2>
<TiltMeSenpai> if you see nack and 255 on data, it means you're knocking on the door but nobody's home
<d1b2>
<TiltMeSenpai> if I'm interpreting your question right
<d1b2>
<a> huhh
<d1b2>
<a> but when i try to write it works fine
<d1b2>
<a> w/ the same periph addr
<d1b2>
<TiltMeSenpai> the device might not support read addresses? I don't really know
<d1b2>
<TiltMeSenpai> or it could be looking for some non-7-bit address?
<d1b2>
<a> works fine with a different i2c impl
<d1b2>
<TiltMeSenpai> do you have an oscilloscope?
<d1b2>
<TiltMeSenpai> or logic analyzer
<d1b2>
<TiltMeSenpai> something weird is going on that's stopping the target from driving the bus, but it's hard to say what without looking at the waveform
<d1b2>
<TiltMeSenpai> oh wait if you're running on a glasgow, you can use the trace option
<d1b2>
<a> i can hook up the i2c output to a logic analyzer and grab waveforms
<d1b2>
<TiltMeSenpai> yeah if you add --trace output.vcd the glasgow should end up writing a vcd with measured values to output.vcd
<d1b2>
<TiltMeSenpai> might be easier than grabbing a logic analyzer and hooking things up
<d1b2>
<a> oh this is running on a different fpga
<d1b2>
<TiltMeSenpai> oh are you just using the gateware?
<d1b2>
<a> yeah i just pulled the relevant gateware into a normal fpga project
<d1b2>
<TiltMeSenpai> ah I see
daknig2 has quit [Ping timeout: 256 seconds]
<whitequark>
a: that i2c implementation isn't the best in the world, but i think i tested it quite a bit
<whitequark>
awygle: we should discuss it at today's meeting
<awygle>
oh, sure
<d1b2>
<a> huhh there might be some FIFO issues here too actually, is there any reason a SyncFIFOBuffered would behave weirdly?
<d1b2>
<a> reading from the same FIFO at different times is yielding different results
<whitequark>
hmm
<whitequark>
which fpga? is it multiclock?
<d1b2>
<a> ECP5, one clock
<whitequark>
nothing that comes to my mind
<d1b2>
<a> there's a 1 second delay after the i2c operations are written to the command fifo before i try to read from the output fifo
<d1b2>
<a> but if i let other things happen before reading from the FIFO, then the FIFO output is different
<d1b2>
<a> depth is v large so that doesn't appear to be an issue
* awygle
confirms meeting time for the fourth time
daknig2 has joined #nmigen
jeanthom has joined #nmigen
hitomi2504 has joined #nmigen
<d1b2>
<a> fixed it, turned out that the data wasn't being put into the FIFO fast enough to keep up with the I2C transaction
<d1b2>
<a> that and a small bug in the FIFO read logic
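[A generic sketch of the SyncFIFOBuffered handshake being debugged here, not a's actual design; the width and depth are invented. The usual read-side bug is asserting r_en without checking r_rdy, so the read port is "consumed" while the FIFO is still empty; the write side likewise needs to wait for w_rdy.]

    from nmigen import Elaboratable, Module, Signal
    from nmigen.lib.fifo import SyncFIFOBuffered

    class FifoLoop(Elaboratable):
        """Pass data through a FIFO, honouring both handshakes."""
        def __init__(self):
            self.i       = Signal(8)
            self.i_valid = Signal()
            self.o       = Signal(8)
            self.o_valid = Signal()

        def elaborate(self, platform):
            m = Module()
            m.submodules.fifo = fifo = SyncFIFOBuffered(width=8, depth=64)
            m.d.comb += [
                fifo.w_data.eq(self.i),
                fifo.w_en.eq(self.i_valid & fifo.w_rdy),  # only write when there is space
                fifo.r_en.eq(fifo.r_rdy),                 # only read when data is present
                self.o.eq(fifo.r_data),
                self.o_valid.eq(fifo.r_rdy),
            ]
            return m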
<d1b2>
<a> how does glasgow handle this? does it store the data in a separate FIFO until it's all been received over USB, and then quickly dump it into the main FIFO?
jock-tanner has joined #nmigen
proteus-guy has joined #nmigen
jeanthom has quit [Ping timeout: 256 seconds]
Asu has joined #nmigen
jock-tanner has quit [Ping timeout: 256 seconds]
jeanthom has joined #nmigen
daknig2 has quit [Ping timeout: 256 seconds]
daknig2 has joined #nmigen
daknig2 has quit [Ping timeout: 256 seconds]
nengel has joined #nmigen
jeanthom has quit [*.net *.split]
nengel has quit [*.net *.split]
Degi has quit [*.net *.split]
alexhw_ has quit [*.net *.split]
mwk has quit [*.net *.split]
nengel has joined #nmigen
jeanthom has joined #nmigen
alexhw_ has joined #nmigen
mwk has joined #nmigen
Degi has joined #nmigen
proteus-guy has quit [Ping timeout: 256 seconds]
jock-tanner has joined #nmigen
jock-tanner has quit [Ping timeout: 260 seconds]
emeb has joined #nmigen
cstrauss[m] has joined #nmigen
<DaKnig>
where can I look for examples of using Array, FIFO, Memory...?
<DaKnig>
I'm using vivado, for me it spits out verilog
<lkcl_>
DaKnig: no, it doesn't. it generates yosys native ilang.
<lkcl_>
you can then pass that *to* yosys, by using "read_ilang {filename}" then "write_verilog {filename}" if you want verilog
<DaKnig>
oh cool
<DaKnig>
so that's how it does it
<lkcl_>
but that is yosys's job, not nmigen's
<DaKnig>
"the nmigen toolchain" :)
<lkcl_>
yup. there's also a ghdl plugin for yosys, it works, but people here recommend using verilator for conversion of vhdl, as being more complete. i think it's verilator
<FL4SHK>
the GHDL plugin doesn't support records with unconstrained elements in them
<FL4SHK>
i.e. one of the most desirable features from VHDL 2008
<lkcl_>
FL4SHK: ahh this is what the microwatt team ran into.
<DaKnig>
I would really like VHDL to get as much attention as Verilog in the open source circles
<DaKnig>
not enough tools, and they don't work as well...
<FL4SHK>
old VHDL is pretty good btw
<lkcl_>
DaKnig: from working with microwatt for several months, i am now deeply impressed with VHDL
<FL4SHK>
it's just not as good as VHDL 2008
<FL4SHK>
...VHDL 2008's generic packages feature, one of the best things about the language, is very poorly supported
<FL4SHK>
now, nMigen is actually a lot better in the high level things department
<lkcl_>
i had a lot of trouble compiling microwatt.
<lkcl>
followed by "python3 {whatever.py} generate -t il"
<lkcl>
if you want to see the equivalent verilog, change that to "-t v"
<lkcl>
DaKnig: i thoroughly recommend opening the ilang in yosys ("read_ilang {filename.il}") and doing "show top" or "show {press tab key}" if there are submodules
<lkcl>
you'll need xdot and graphviz installed (apt-get install graphviz) for that to work
nengel has quit [Ping timeout: 240 seconds]
<lkcl>
it's quite fascinating to see the results of a linearly-written program as a gate-level tree, graphically.
<lkcl>
i've fixed several early rookie mistakes by always examining the graphviz after every edit
FFY00 has quit [Quit: dd if=/dev/urandom of=/dev/sda]
FFY00 has joined #nmigen
<DaKnig>
does it show the primitives too?
hitomi2504 has quit [Quit: Nettalk6 - www.ntalk.de]
<whitequark>
a: glasgow has two FIFOs, the one that holds data while it's being received through USB is in the FX2
<whitequark>
DaKnig: regarding arrays with heterogeneous elements, an array is basically a more compact way to write a Switch()/Case() construct
<whitequark>
if you use an array on the left-hand side, what happens is it expands into a Switch(index) and every Case() contains a single eq where the right-hand side is the same
<whitequark>
if you use an array on RHS, then in every Case the LHS is the same
<whitequark>
(though you could use an Array on LHS and Array on RHS and that has a more complex expansion, but the basic principle is the same)
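[A small sketch of the expansion whitequark describes; the register file below is invented for illustration. Indexing an Array with a Signal on the left-hand side behaves like a Switch over the index with one assignment per Case.]

    from nmigen import Array, Module, Signal

    def write_with_array(m, regs, index, data):
        # Array on the LHS: one line...
        m.d.sync += regs[index].eq(data)

    def write_with_switch(m, regs, index, data):
        # ...which expands into roughly this Switch/Case form.
        with m.Switch(index):
            for i in range(len(regs)):
                with m.Case(i):
                    m.d.sync += regs[i].eq(data)

    m     = Module()
    regs  = Array(Signal(8, name=f"reg{i}") for i in range(4))
    index = Signal(range(4))
    data  = Signal(8)
    write_with_array(m, regs, index, data)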
<FL4SHK>
still working on my Python-based assembler here
<FL4SHK>
going to support variable width instructions
<FL4SHK>
...in reality it's just that a source code instruction may end up as two instructions in the final binary
<FL4SHK>
specifically, instructions that use 32-bit immediates
<whitequark>
kinda like boneless :)
<FL4SHK>
yes
<FL4SHK>
does boneless do that?
<FL4SHK>
instructions that use immediates basically have two possible sizes in my architecture
<FL4SHK>
...in reality there's just a `pre` instruction that expands the size of the immediate in the following instruction
<FL4SHK>
as such it's not truly variable width
<FL4SHK>
`pre` is a real, separate instruction
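[A toy sketch of the `pre` idea FL4SHK describes; the opcodes and field widths here are invented, not his encodings or boneless's. An immediate that fits the base field is emitted as one word, otherwise a `pre` word carries the upper bits first.]

    IMM_BITS    = 4        # immediate field of the base instruction (invented)
    PRE_OPCODE  = 0xF000   # invented opcode for the `pre` prefix
    ADDI_OPCODE = 0x1000   # invented opcode for an add-immediate

    def assemble_addi(reg, imm):
        """Return one or two 16-bit words for `addi reg, imm`."""
        words = []
        if imm >= (1 << IMM_BITS):
            words.append(PRE_OPCODE | (imm >> IMM_BITS))      # upper bits go in `pre`
            imm &= (1 << IMM_BITS) - 1
        words.append(ADDI_OPCODE | (reg << IMM_BITS) | imm)   # low bits in the instruction
        return words

    assert assemble_addi(3, 0x7) == [0x1037]             # fits: one word
    assert assemble_addi(3, 0x345) == [0xF034, 0x1035]   # needs a `pre` prefix word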
<tpw_rules>
that's precisely what boneless does
<FL4SHK>
ah
<lkcl>
DaKnig: yes, everything is shown. do try it out, you'll soon see
<FL4SHK>
seems great minds think alike
<FL4SHK>
though in my case, I didn't get the idea on a whim or anything
<whitequark>
i got it from Philip Freidlin I think
<lkcl>
whitequark: cool! i didn't realise Arrays could be used as a type of Switch statement
<FL4SHK>
I was the TA for a course my last semester of grad school
<whitequark>
lkcl: that's basically all they are
<whitequark>
*Philip Freidin
<FL4SHK>
And the architecture that the students had to work with had a `pre` instruction
<tpw_rules>
FL4SHK: one quirk in boneless is that since there's only 3 immediate bits in the arithmetic instructions, they actually index a lookup table if not preceded by `pre` (EXTI)
<tpw_rules>
with like 0, 1, 0xFF, and some others
<FL4SHK>
3 immediate bits???
<lkcl>
immediates are one of the biggest nuisances in ISAs. the Mill uses bit-level compression
<FL4SHK>
just 3?
<FL4SHK>
`pre`, or EXTI, pretty much solves the issue IMO
<tpw_rules>
some have 5 and some have 8 but the arithmetic are 3
<FL4SHK>
3 is very tiny
<FL4SHK>
the architecture the students had to work with was 16-bit with 4-bit immediates
<FL4SHK>
everyone had to provide their own encoding
<lkcl>
tpw_rules: the Mill does a really fascinating compact job: kinda like Huffman Encoding but targeted at FP as well as INT
<FL4SHK>
I personally love the task of building an encoding from an instruction set
<lkcl>
FL4SHK: ooo, 4-bit immediates :)
<FL4SHK>
...as well as building the instruction set
<FL4SHK>
it was pretty ARM-like
<FL4SHK>
the instructor called it PINKY
<FL4SHK>
only had a z flag
<FL4SHK>
still had the set less than instruction of MIPS
<tpw_rules>
lkcl: that sounds kind of bothersome? the two bit aligned ISAs i know are slow and excessively complex
<lkcl>
FL4SHK: me too. i designed an instruction set based around 2-bit "groupings", in 1990.
<lkcl>
tpw_rules: they have the advantage of static data allocation and it really shines for vectorisation. i don't know the full details, they have a working LLVM port
<lkcl>
tpw_rules: for vectors, being able to expand a single bit "0" out to a full 8/16/32/64-bit register is a hell of a saving.
<FL4SHK>
Here's what the architecture I'll be building my main computer with is like
<FL4SHK>
it's a vector machine
<FL4SHK>
and it treats cache lines as vectors
<FL4SHK>
cache lines are used as the primary thing over registers
<lkcl>
FL4SHK: ooo :)
<FL4SHK>
I built a machine like this once
<FL4SHK>
didn't have a data cache
<FL4SHK>
did have instruction cache, though
<FL4SHK>
I think it was a 32 kiB icache?
<FL4SHK>
I can't remember.
<lkcl>
there's... this is a known type of architecture. i forget the name.
<FL4SHK>
The machine didn't have a hardware-enforced mapping from cache line number to address btw
<lkcl>
FL4SHK: it's designed for 3D, right?
<FL4SHK>
well, it would certainly be *good* at that
<FL4SHK>
load instructions and store instructions would set the mapping of cache line number to address
<FL4SHK>
and as such multiple cache lines could share the same data by way of sharing address
<lkcl>
ok. the reason i ask is because in 3D (as we're finding out for LibreSOC), the workloads are basically large-amounts-of-LD, large-amounts-of-processing, large-amounts-of-ST
<FL4SHK>
this type of machine that I built is called "Line Associative Registers"
<FL4SHK>
such a machine hasn't been manufactured before TMK
<lkcl>
there's no overlap. the data that's processed is *not* shared significantly with other batches.
<lkcl>
FL4SHK: it's cool coming up with something new, isn't it? :)
<FL4SHK>
I didn't come up with Line Associative Registers
<lkcl>
awww
<FL4SHK>
but my master's project was to create a LARs processor
<FL4SHK>
I made the first one with floating point of any variety
<FL4SHK>
...but it was this bad fp format, bfloat16
<lkcl>
interesting
<FL4SHK>
it has type tagged registers
<lkcl>
urrrr i know it
<FL4SHK>
bfloat16 was only chosen for the simplicity of implementation
<FL4SHK>
and simplicity of, well, testing!
<lkcl>
i love type-tagged registers: it's what we'll be using in Libre-SOC.
<FL4SHK>
here the type tag is set by the load or store instruction
<lkcl>
oh: have you looked at the Mill?
<FL4SHK>
Not yet
<FL4SHK>
I know of it
<lkcl>
yeeees, i was going to say, that's exactly what the Mill does :)
<FL4SHK>
the Mill sounded interesting
<DaKnig>
lkcl: don't forget, mov is turing complete
<lkcl>
the LD operation basically specifies the width... that's it.
<lkcl>
there's no ADD8, no ADD16, no FPADD16 instruction: there's just... ADD.
<FL4SHK>
LARs as a concept was designed to mitigate the memory bottleneck
<DaKnig>
ld+st is turing complete so you don't need actual processing :)
<lkcl>
DaKnig: cool! :)
<FL4SHK>
transport triggered architectures come to mind
<lkcl>
FL4SHK: however you'll likely find, just like the Mill, that you'll need a "widen" and a "narrow" instruction
<FL4SHK>
lkcl: the thing I built didn't have those
<FL4SHK>
loads and stores set both the address and type tag of a cache line
<lkcl>
which (for int) does zero/sign-extending and (for float) does FP conversion
<FL4SHK>
the architecture does automatic casting
<FL4SHK>
if two registers have different types
<FL4SHK>
let's say you add rB and rC, storing the result in rA
<lkcl>
FL4SHK: yep, totally with you. i totally get it: this is the basis of Simple-V, the Vectorisation system i invented
<FL4SHK>
oh
<lkcl>
and the Mill does it as well
<FL4SHK>
okay then
<FL4SHK>
what are `widen` and `narrow`?
<lkcl>
Mill "widen" instruction: sign-extend and zero-extend. anything that was tagged as say "INT8" will be "widened" to whatever-the-widen-instruction says
<lkcl>
INT16, INT32 etc.
<FL4SHK>
LARs instruction sets don't have specific sign extend and zero extend instructions
<FL4SHK>
an add instruction will take care of it
<FL4SHK>
due to automatic casting
<lkcl>
basically it uses the "tag" that originally came from the LD, and the new "tag" of the...
<lkcl>
ah yes!
<lkcl>
ah... but does the "ADD" instruction *contain* the new tag?
<FL4SHK>
no, the tag is only set by loads and stores
<FL4SHK>
loads and stores also don't necessarily access memory due to associativity
<lkcl>
ok... so to do a convert from INT8 to INT32 you would need to do a "fake load of zero" into an INT32
<lkcl>
*then* ADD the INT8 number to that zero-loaded register
<FL4SHK>
loads being associative means you can just walk a register without needing to actually touch memory
<lkcl>
which in terms of the number of opcodes and cycles is sub-optimal
* lkcl
tck, tck.... thinking....
<FL4SHK>
you might have a delay of like, one clock cycle
<FL4SHK>
zero extension and sign extension are less necessary on this machine because you can just do 8-bit, 16-bit, 32-bit, etc. arithmetic natively
<lkcl>
i'm trying to think this through... what does it mean "loads are associative"?
<FL4SHK>
well
<FL4SHK>
if you load from an address already in the LAR file
<FL4SHK>
you don't need to read from memory
<DaKnig>
how long are those registers, again?
<DaKnig>
lines, whatever you called em
<lkcl>
ok, yes. got it.
<FL4SHK>
you just set the destination LAR to the contents of your other LAR that already has this data
<FL4SHK>
a load instruction, counterintuitively, might cause you to actually write back to memory
<lkcl>
so one of those would best be "reserved" as a "always containing zeros" line, by convention at the assembly / ABI level
<FL4SHK>
that's probably something you want
<FL4SHK>
a zero register
<FL4SHK>
DaKnig: like register widths, cache line widths, etc. in a normal architecture, that's set by whoever makes the instruction set
<lkcl>
yehyeh i was thinking that.
<FL4SHK>
I don't remember what my most recent LARs architecture did for that
<FL4SHK>
I think I picked 256 bytes per LAR?
<FL4SHK>
64 LARs, 256 bytes each
<FL4SHK>
oh, other thing, lkcl
<FL4SHK>
Fully LARs-based machines have only LARs, no regular cache or registers
<FL4SHK>
this includes the instruction side
<FL4SHK>
you have instruction LARs and data LARs
<FL4SHK>
instruction LARs get loaded via `fetch` instructions
* lkcl
raises eyebrows at instruction LARs :)
<FL4SHK>
...and normally you'd want your source code to not have `fetch` instructions
<FL4SHK>
software is supposed to provide a guarantee that the pipeline fetching always fetches from an ILAR
<FL4SHK>
...I'd personally say to throw an exception if there's a miss
<FL4SHK>
I was planning on doing that.
<FL4SHK>
here's how software provides the guarantee
<FL4SHK>
get the Binutils level software inserting `fetch`es
<FL4SHK>
for my next LARs processor, which is *not* the one I want to use for my main computer
<FL4SHK>
I want to not have virtual memory in that machine
<FL4SHK>
DaKnig: don't forget about my FIFOs that I made
<FL4SHK>
those also show how to use arrays
<FL4SHK>
I'm also using one of my FIFOs
<FL4SHK>
I was originally not using first word fall through
<FL4SHK>
but now I am
<lkcl>
FL4SHK: fwft is reeaaally tricky. whitequark went to a lot of trouble to write formal correctness proofs for the FIFO classes in nmigen
<FL4SHK>
lkcl: it is?
<FL4SHK>
Maybe it's not first word fall through that I did, then?
<FL4SHK>
I tested it
<FL4SHK>
I might have implemented something other than fwft
<lkcl>
FL4SHK: well... it is for most people. you appear to have a well-above-average capability in hardware design :)
<FL4SHK>
like... I find CPUs much more difficult than I did that thing
<FL4SHK>
or at least the types of CPUs I'm making
<FL4SHK>
simple ones are easy
<lkcl>
stuff that's known to be *really* hard computer science you're like, "pffh" :)
<lkcl>
yehyeh
<FL4SHK>
I can make a multi-cycle CPU in my sleep
<FL4SHK>
...er, by that I mean, a big freaking state machine CPU
<lkcl>
FL4SHK: try a multi-issue Out-of-Order engine some time
<FL4SHK>
*that's* something I haven't done before
<lkcl>
yehh i went with a FSM for the early version of Libre-SOC, just to get "instructions into pipelines" without having to worry about register dependencies
<FL4SHK>
I want to make a multi-core, out of order, multi-issue LARs machine
<FL4SHK>
nobody has done this before
<FL4SHK>
...oh, and virtual memory
<lkcl>
it took me *six months* of study with Mitch Alsup's help to understand the CDC 6600.
<FL4SHK>
what kind of things are you referring to as really hard computer science?
<FL4SHK>
all I did was shift reads to be asynchronous
<FL4SHK>
just like I've done with block RAM before
<awygle>
FWIW I found the docs page of Array more confusing than helpful. Once I realized it's a list you can index with a signal it clicked.
<Yehowshua>
Imagine if one day you could just do something like m.submodules.pcie = PCIe()
<lkcl>
FL4SHK: ah yes that *might* be different.
<lkcl>
Yehowshua: yeah, litex and fusesoc are intended to be that kind of level. and it's what we'll need
<FL4SHK>
I referred to it as a first word fall through FIFO but
<FL4SHK>
it still does some stuff synchronously
<FL4SHK>
it does what I needed it to
<FL4SHK>
Maybe first word fall through isn't what I needed at all
<lkcl>
FL4SHK: does it mean that: if the FIFO is empty, and it is written to, that the data coming in is available for reading *on the same cycle*
<lkcl>
that's "fwft" as best i can tell.
<lkcl>
without fwft, you will always have a 1 clock cycle delay, guaranteed, between incoming and outgoing data.
<lkcl>
even if the FIFO is currently empty
Yehowshua has quit [Remote host closed the connection]
<lkcl>
FL4SHK: if you're looking to do Out-of-Order, i recommend looking up "Design of a Computer" by James Thornton. it's available online (free) thanks to Thornton giving permission around 2010
<lkcl>
he was very old. his wife wrote a hand-written letter to the person who asked if he could put a copy of the book online
<lkcl>
and if you're interested in precise exceptions, branch speculation etc. i have some augmentation chapters written by Mitch Alsup that help explain how to do that, on top of the original 6600.
<FL4SHK>
don't need branch prediction for the type of thing I'm doing
<lkcl>
he also showed me how to do O-o-O memory management, which would be relevant for the LARs concept
<d1b2>
<TiltMeSenpai> is this "The Control Data 6600"?
<FL4SHK>
the fact that software guarantees no instruction ILAR misses means you can make some other assumptions
<lkcl>
that took me 3-4 weeks to understand, on its own.
<lkcl>
d1b2, TiltMeSenpai: yes :)
<FL4SHK>
I didn't have a delay of 1 clock cycle for reading
<FL4SHK>
but I did for writing
<FL4SHK>
so this is probably something else
<lkcl>
what about simultaneous read-and-write, on the same clock cycle? what happens then?
<FL4SHK>
the only thing that's not synchronous is reading from the array inside the FIFO
<lkcl>
yes it sounds like it isn't fwft. fwft is definitely the following conditions (all on same clock cycle):
<lkcl>
* FIFO is empty
<lkcl>
* write occurs
<lkcl>
* read occurs
<lkcl>
* write "falls through" to the read port
<FL4SHK>
that really doesn't sound that bad
<FL4SHK>
it sounds a lot like something I've done for register files before
<lkcl>
it's the sort of thing that's enough of a nuisance that people don't want to have to reinvent it (and get it wrong)
<FL4SHK>
Where you'd need to read what was currently being written
<lkcl>
yes, funnily enough, it's exactly the same.
<Lofty>
Isn't that transparent read?
<FL4SHK>
that's really not that hard to me
<lkcl>
FL4SHK: yes - but it takes time, and people get it wrong, and then things break.
<FL4SHK>
doesn't sound hard at all to me
<lkcl>
Lofty: on regfiles? i believe so. it's kinda like having an operand forwarding bus built-in to the regfile
<FL4SHK>
it's just a matter of dealing with "next state" stuff
<Lofty>
It's not specific to register files
<Lofty>
It's a property of the memory
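[A short sketch of the transparent read property Lofty mentions, using nMigen's Memory; the sizes are invented. With transparent=True, a read of the address being written in the same cycle returns the new data rather than the old contents.]

    from nmigen import Elaboratable, Module, Memory

    class TransparentRegfile(Elaboratable):
        def __init__(self, width=32, depth=16):
            self.mem    = Memory(width=width, depth=depth)
            self.rdport = self.mem.read_port(transparent=True)
            self.wrport = self.mem.write_port()

        def elaborate(self, platform):
            m = Module()
            m.submodules.rdport = self.rdport
            m.submodules.wrport = self.wrport
            # drive wrport.addr/data/en and rdport.addr from the surrounding logic
            return m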
<FL4SHK>
I haven't needed a first word fall through FIFO before, though
<lkcl>
FL4SHK: it took our team 2-3 weeks to write the regfiles with transparent reads, and unit tests, and formal correctness proofs.
<FL4SHK>
what
<FL4SHK>
why?
<FL4SHK>
it took me that long for, say, my LAR file in my original LARs machine
<FL4SHK>
*that* was a hard project
<lkcl>
because this stuff is hard - for us - and we're not confident it would "work", so had to make sure by spending the time writing unit tests that gave us the confidence in the code
<lkcl>
i don't think you fully appreciate: you really do have a well-above-average level of competence in hardware design :)
<lkcl>
that was a compliment btw
<FL4SHK>
oh, well, thanks
<whitequark>
Lofty: that's a transparent read if you look at a memory alone
<whitequark>
but it's called first-word fallthrough on a FIFO
<whitequark>
same concept though
<Lofty>
Ah, I see, thank you
<awygle>
isn't it still FWFT even if it has latency >0, as long as you don't have to tick the output port to get the new data to show up?
peepsalot has joined #nmigen
<FL4SHK>
lkcl: to me, a register file is something you don't even really need to verify
<FL4SHK>
other than by looking at the source code
<lkcl>
FL4SHK: certain industries absolutely cannot take the author's word for it - or the source code.
<FL4SHK>
lkcl: I don't think it takes much to formally verify a register file
<FL4SHK>
like, 2 to 3 weeks is a *lot*
<FL4SHK>
a LAR File, on the other hand
<FL4SHK>
that's difficult to formally verify
<FL4SHK>
or at least it was for me
<FL4SHK>
...I'd probably have an easier time with it today
<FL4SHK>
LAR files are not simple like caches
<lkcl>
indeed. as will be the OoO Dependency Matrices.
<lkcl>
when you have an out-of-order design, formal correctness proofs - that data has been correctly read/written in the right order to the register file - become far more challenging
<FL4SHK>
when you say formal verification, do you mean with yosys?
<FL4SHK>
because that's the definition I was using
<lkcl>
symbiyosys - yes.
<lkcl>
which uses SAT solvers like yices2, etc., yes
<lkcl>
i have been thinking about how to verify the OoO Dependency Matrices for some time.
<lkcl>
how to guarantee that the instruction issue order is the same as the completion order *where it actually matters*
<FL4SHK>
I'd write everything in nMigen at this point
<lkcl>
because - haha - in some cases it doesn't matter. yes, that's what we're doing. everything's in nmigen.
<FL4SHK>
I'll need to study up on out of order machines
<lkcl>
"add r1, r2, r3" and "add r4, r5, r6" do *not* matter what the completion order is because there's no dependency hazards
<FL4SHK>
I understand that much
<lkcl>
FL4SHK: the "normal" algorithm - the one that everyone quotes - is the Tomasulo Algorithm.
<FL4SHK>
I've heard of it
<lkcl>
there's a really good video on youtube by an indian guy, who explains it really well
<FL4SHK>
but I don't know its details
<FL4SHK>
since I'm unaware of its details, I might come up with my own thing
<FL4SHK>
it'd be fun to say "hey, look, this is my own algorithm"
jeanthom has quit [Ping timeout: 240 seconds]
<lkcl>
once you understand that, i can point you at a page which allows understanding of the precise capability of the (augmented) 6600.
<lkcl>
:)
<FL4SHK>
I don't think I want to see the Tomasulo Algorithm
<lkcl>
there's some things you definitely need to think through.
<lkcl>
do you want interrupts to be serviceable immediately?
<FL4SHK>
if it's a machine with out of order execution, probably not
<lkcl>
do you want multiple LOAD/STOREs to be done in parallel without data corruption?
<lkcl>
FL4SHK: actually, the precise-augmented 6600 *can* handle interrupt-servicing immediately, because there's a way to cancel outstanding in-flight instructions
<FL4SHK>
What does "outstanding" mean in this case?
<lkcl>
"work that's in pipelines or waiting to be put *into* pipelines that hasn't hit the register file yet"
<lkcl>
aka "in-flight"
<FL4SHK>
I'll need to think about what I should do for micro ops
<lkcl>
some OoO designs use a "rollback history" system. others "hold off" from writing anything that could cause "damage"
<FL4SHK>
I don't want to study existing ideas for micro ops
<FL4SHK>
nooo don't tell me
<lkcl>
micro-ops according to Mitch Alsup is... haha :)
<lkcl>
you really do want to discover this stuff for yourself, don't you? :)
<FL4SHK>
yes
* lkcl
zzip. with extra gaffa tape.
<lkcl>
mmMmmh, mmhmhhh!
<lkcl>
if you get stuck just ask.
<FL4SHK>
things that I don't outright need to know like what the virtual memory system needs to do for OSes to work
<FL4SHK>
I don't know if I want to know much in advance.
Yehowshua has joined #nmigen
<Yehowshua>
I guess its meeting time?
<lkcl>
FL4SHK: what will be fascinating is if you document all of this and put it online as libre software
<lkcl>
oh?
<lkcl>
oh!
<whitequark>
yep, meeting time
<FL4SHK>
software, eh
<FL4SHK>
I thought this was hardware!!!
<FL4SHK>
the reality is that hardware and software might as well be the same thing...
<whitequark>
i'll begin with the status update. not much to report; i've been looking into cleaning up cxxsim and getting it into master proper
<whitequark>
that will probably take a little more time
<whitequark>
the main issue i'm having is organizing the simulator guts; there are a bunch of things that are shared (the public interface essentially) and a bunch of things that are similar but not really shared exactly
<whitequark>
right now there are two Simulator classes that inherit from the same "core" class
<whitequark>
i think that's not a particularly great design; it's confusing which one you're referring to, it's easy to have their interfaces accidentally diverge, and it requires you to substitute or rename imports
<Yehowshua>
So I remember sometime ago that you can express new logic **in** simulation
<Yehowshua>
How does CXX handle that?
<whitequark>
oh?
<whitequark>
can you elaborate?
<Yehowshua>
Yes - you could do a.eq(b) in a process
<Yehowshua>
Lemme find an example
<whitequark>
oh right
<whitequark>
when you do that in a process, it doesn't add logic; it executes once and instantaneously
<whitequark>
more like a regular assignment
<lkcl>
whitequark: yes. we've already started doing this:
<lkcl>
cxxsim = True
<lkcl>
if cxxsim:
<lkcl>
    from nmigen.sim.cxxsim import Simulator, Settle
<lkcl>
else:
<lkcl>
    from nmigen.back.pysim import Simulator, Settle
<whitequark>
lkcl: yeah so that sucks imo
* cr1901_modern
is here in read only mode mostly
<whitequark>
i mostly did it because i wanted to get something out for you folks to test
<lkcl>
whitequark: really appreciated
<lkcl>
the usual "solution" is a Factory class system
<whitequark>
what i think would be a better design is having a single Simulator, all the commands, etc, and the Simulator would take a SimulationEngine that would actually implement it
<whitequark>
the Engine would be mostly or completely opaque to user code
<whitequark>
so it'd be something like `from nmigen.sim import PythonEngine, CxxEngine`
<lkcl>
that'd work. it's one step away from a full "class Factory" (where engine is a string and the Factory class looks that up in a dictionary)
<whitequark>
there is a good reason I don't want to make the engine argument a string
<whitequark>
the reason is that importing CxxEngine, right now, pulls in Python's ctypes
<whitequark>
but... that's unlikely to work all that well on PyPy, and I know you folks use PyPy among other things
<whitequark>
PyPy needs cffi, but that's an external dependency
<lkcl>
yuk
<lkcl>
oh
<whitequark>
well, it doesn't strictly speaking *need* cffi
<whitequark>
it just works much faster with cffi
<whitequark>
and this is hella important, because the ctypes overhead is massive
<lkcl>
i meant to say: i had a suggestion instead of using ctypes?
<awygle>
i'm here. sigh.
* lkcl
waves to awygle
<whitequark>
it is in fact so large that cxxsim is only ~2x faster than pysim!
<whitequark>
where in reality it should be something like ~100x faster
<lkcl>
the idea is: at the same time as auto-generating the cxxsim.cc, actually auto-generate a c-based python module that matches it.
<Yehowshua>
So it's not very hard to write driver code around CXXRtl directly in C++.
<whitequark>
lkcl: yeah and that would work even worse on pypy
<Yehowshua>
I'm also not opposed to writing such drivers
<lkcl>
rather than try to use swig (or other c-to-python wrapper), just auto-generate the c module.
<lkcl>
whitequark: sigh :)
<Yehowshua>
Like for a large design, the user could double down and write the driver code themselves
<whitequark>
Yehowshua: i'm quite certain we can speed things up
<whitequark>
i'm not presenting this to you as some sort of insurmountable problem
<whitequark>
i'm merely describing how much overhead ctypes has
<Yehowshua>
Ah
<whitequark>
i haven't even tried solving this so far in the code that i wrote
<lkcl>
Yehowshua: true. we just discussed that yesterday, how simulated verilator peripherals are written in c
<whitequark>
simulated cxxrtl peripherals (aka cxxrtl black boxes) are, naturally, written in c++
<lkcl>
i wasn't suggesting *replacing* the use of ctypes. but... if a speed up of 50x is achievable for /usr/bin/python3 by not using ctypes, that's quite compelling
<whitequark>
a speed up of 50x would, i think, only be possible by ditching python completely
<whitequark>
i.e. have the simulation call back into python only when it actually needs something from python
<Yehowshua>
hmm... what about pypy... Does it have CTypes support?
<Yehowshua>
Jitted pypy gets quite fast
<whitequark>
basically, nmigen would tell cxxrtl "here is your clocks, and here is the trigger you need to call me back on, and now do this all on your own"
<whitequark>
Yehowshua: yes but ctypes has some design problems that prevent pypy from being efficient with it
<whitequark>
which is one reason for cffi's existence
<cr1901_modern>
I'm confused... Is the "only 2x faster" thing a new regression? I could've sworn cxxsim was an order of magnitude faster previously...
<whitequark>
cr1901_modern: it's only 2x (on minerva, the speedup will be higher on larger designs) faster when used through nmigen
<whitequark>
let me rephrase
<whitequark>
cxxrtl is 100x faster than pysim, but cxxsim (nmigen's cxxrtl integration) is 2x faster
<cr1901_modern>
ahhh okay
<lkcl>
with LibreSOC, we're seeing about... 10-20 instructions per second executed in pysim
<lkcl>
that's without the IEEE754 FPU added
<whitequark>
what about cxxsim?
<lkcl>
we're running into that ready/valid bug, on every aspect of the design
<whitequark>
right but i don't think it should affect how fast the design runs
<lkcl>
ready/valid signalling is a core aspect of the data "management" because it's an OoO design rather than a simple, straightforward pipeline
<whitequark>
(the problem is virtually certainly on python side)
<lkcl>
none of the unit tests pass, so i can't get it running to the point where i can tell how fast it is
<whitequark>
mm, okay
<lkcl>
because they _all_ use ready/valid style communication, unfortunately.
<whitequark>
btw Yehowshua could you minimize the testcase further? that would really speed up the process of fixing the bug
<whitequark>
it's somewhat subtle
<Yehowshua>
Yes. I was having Michael work on that
<whitequark>
great
<lkcl>
Yehowshua: cutting out the actual "shift" should do the trick and just have a countdown.
<whitequark>
anyway, let's go on to the next item
<Yehowshua>
I'm curious about nmigen-soc
<whitequark>
yeah?
<Yehowshua>
What's the vision? How does it compare/complement LiteX?
<lkcl>
and with OpenPITON?
<whitequark>
nmigen-soc is a SoC *gateware toolkit*
<whitequark>
so it gives you all the buses, and it gives you a BSP generator
<whitequark>
but it doesn't, for example, know how to build firmware
<Yehowshua>
OK
<whitequark>
and it doesn't have a BIOS, it doesn't have any preference on what your language is
<whitequark>
could be C, C++, Rust, whatever
<whitequark>
LiteX could (in principle) be built on top of nmigen-soc
<Yehowshua>
I see now
<whitequark>
if/when LiteX migrates to nmigen
<lkcl>
whitequark: it sounds very much like what we need to complete Libre-SOC.
<lkcl>
if you're familiar with OpenPITON, they can specify the full spec of (an) SoC as a JSON file.
<lkcl>
templating *shudder* then creates the ennntiirre SoC including AXI4 bus infrastructure
<Yehowshua>
Looking at issue 10, I see that the peripherals would be asynchronous - as in able to cross multiple clock domains
<lkcl>
it sounds to me like nmigen-soc would be the "bedrock" of a nmigen equivalent of that
<whitequark>
yeah
<lkcl>
how do CSRs work? they're just registers (in effect) but named so that, if needed, they can be "addressed" by wishbone/AXI4?
<whitequark>
yeah more or less
<lkcl>
ok
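[A minimal sketch of lkcl's question, assuming the nmigen-soc csr API (csr.Element, csr.Multiplexer) roughly as it existed at the time; the register and widths are invented. The Element is just a register; the Multiplexer gives it an address so a Wishbone or AXI bridge can reach it.]

    from nmigen import Elaboratable, Module
    from nmigen_soc import csr

    class ExamplePeripheral(Elaboratable):
        def __init__(self):
            self.ctrl = csr.Element(8, "rw")                         # an 8-bit R/W register
            self.mux  = csr.Multiplexer(addr_width=2, data_width=8)  # assigns it an address
            self.mux.add(self.ctrl)
            self.bus  = self.mux.bus    # CSR bus; bridge this to Wishbone/AXI as needed

        def elaborate(self, platform):
            m = Module()
            m.submodules.mux = self.mux
            # self.ctrl.w_data/w_stb and r_data/r_stb connect the register to user logic
            return m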
<awygle>
correct me if i'm wrong, but it seems like nmigen-soc is a placeholder and a sketch right now, and there's a fair bit of design work to be done before it's "finalized". is that accurate?
<jfng>
yes
<awygle>
hi jfng, was wondering if you were here :)
<whitequark>
awygle: i think the parts that are already there are reasonably functional, and shouldn't radically change
<whitequark>
but there are a lot of things missing
<lkcl>
oh.
<jfng>
you can already use parts of it to build SoCs (e.g. the busses part)
<whitequark>
so i wouldn't say it's a placeholder (nmigen-stdio is, though), but it's definitely unfinalized
<Yehowshua>
So jnfg, I've talked to you a bit before. awygle, haven't really said hi to you before - so hello.
<jfng>
what is currently lacking is the integration tools
<awygle>
howdy :D
<FL4SHK>
calculus seems a little out of the ordinary for hardware
<lkcl>
right. one thing that we need for Libre-SOC is a way to check if a particular address is valid
<FL4SHK>
except not really
<lkcl>
*before* actually issuing the request.
<FL4SHK>
sorry, just a joke
* FL4SHK
leaves again
<lkcl>
FL4SHK :)
<whitequark>
the main reason i haven't spent a lot of time on stdio/soc yet is because we don't have streams or interfaces yet
<cr1901_modern>
I remember a long time ago (~1 year ago) we discussed "how do we build firmware for nmigen SoCs"? Is the idea now that nmigen-soc will _not_ handle building at all?
<whitequark>
which is why those are on the agenda today, among other things
<jfng>
a subset of which (peripherals) is the current focus of development
<lkcl>
this because we're doing an out-of-order design and there will be multiple (parallel) memory requests outstanding
<FL4SHK>
what do you mean by interfaces whitequark?
<whitequark>
FL4SHK: we'll get there in a bit :)
<FL4SHK>
might be SV interfaces
<jfng>
(hi awygle :) )
<FL4SHK>
but I find that classes do that job mostly
<whitequark>
what i have in mind is not intentionally related to SV interfaces
<whitequark>
yeah sorry :/ we should get one of those IRC bots that convert links to titles
<Yehowshua42>
lkcl will be replaced
<whitequark>
anyway, I wanted to make nMigen records something that you could use like you use any ordinary value
<lkcl>
Yehowshua42: lol
Yehowshua has quit [Ping timeout: 245 seconds]
<whitequark>
initially, Record was a direct subclass of Value, and treated specially everywhere
<whitequark>
this was somewhat controversial because people tried to inherit from it, and I didn't really want to support that
<lkcl>
whitequark: yeah. if it was vhdl it would not be possible. however: python, go figure. it just feels... "natural" to inherit (and extend).
<whitequark>
eventually what I arrived at is making a "UserValue", which is a special Value you *can* inherit from, under the condition that it always lowers to some other nMigen value
<whitequark>
which is what Record does (it lowers to a concatenation of its fields)
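[A tiny illustration of the lowering whitequark describes; the layout is invented. A Record with a 1-bit and an 8-bit field behaves as a single 9-bit value, the concatenation of its fields.]

    from nmigen import Record

    rec = Record([("ready", 1), ("data", 8)])
    assert len(rec) == 9   # lowered width: ready in bit 0, data in bits 1..8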
<lkcl>
i remember we had quite a discussion about the implications of modifying inherited instances after they'd been used (once)
<whitequark>
this was insightful because there are clearly quite a few more use cases for this kind of thing
<Yehowshua42>
So if I understand correctly, UserValue is its own thing
<awygle>
yes, the ability to "plug in" custom data types to the nmigen infrastructure is very useful
<whitequark>
unfortunately UserValue has a fatal flaw
<whitequark>
which is to say, it inherits from Value, and Value has a ton of methods with various names, and is getting regularly expanded
<whitequark>
what this means in practice is that, unless we fix the flaw, we can never add methods to Value
<whitequark>
because it'll break user code that uses the same names e.g. in record fields
<whitequark>
I can definitely foresee someone using a field called "shift_left"
<lkcl>
... which pollutes the namespace into which you'd consider adding "things"
<whitequark>
yes
<Yehowshua42>
Is it possible to have something that doesn't inherit from value but always lowers to value? Maybe the user can define how it lowers?
<whitequark>
that is precisely what #355 is about
<lkcl>
i encountered this problem when creating RecordObject. i "fixed" it by overriding __getattr__
<Yehowshua42>
one thing I'd add to the list of things to discuss if we have time is bringing industry attention to nMigen
<lkcl>
by over-riding __setattr__ it becomes possible to check if the thing being added is already in use ("shift_left")
<awygle>
FL4SHK: the problem with that is we might break all your code because of a totally unrelated change in a new version of nmigen
<whitequark>
yep
<lkcl>
and to raise an exception
<whitequark>
lkcl: but it would be better if this problem didn't exist in first place
<lkcl>
whitequark: true :)
<whitequark>
anyway, Yehowshua42 has the right solution here: a separate class that doesn't inherit from Value, but which is *castable* to Value
<whitequark>
we didn't already have that implemented because... UserValue predates Value.cast IIRC
<lkcl>
oo intriguing
<lkcl>
i like it
<whitequark>
there are a few things that we should decide
Yehowshua42 has quit [Quit: Ping timeout (120 seconds)]
<whitequark>
(1) what should it be called? UserValue is a bad name because it'll also be used internally in nMigen
<whitequark>
already used in fact
<awygle>
trait To<Value>
<awygle>
i'm obviously joking about the specifics but i think that's the general tone we should shoot for. i don't _love_ CustomValue although i'd be OK with it
<whitequark>
yeah, same
<whitequark>
I'm open to better names
<whitequark>
ValueCastable?
<awygle>
ToValue, Castable, AsValue, Lowerable
<whitequark>
to go with Elaboratable?
* lkcl
chooses engineering-style names
Yehowshua has joined #nmigen
<jfng>
UserValue.as_value() seems redundant
<Lofty>
I think ValueCastable works quite well, actually
<Yehowshua>
I agree with Lofty
<lkcl>
oh wait... so the idea is similar to Elaboratable in concept?
<awygle>
it's a bit wordy but it 's probably the clearest
<Lofty>
I'm willing to take wordiness for clarity
<Yehowshua>
Same
<whitequark>
lkcl: yeah, kinda. Elaboratable is an interface that lowers to a Fragment (usually a Module when you write it); ValueCastable lowers to Value
<awygle>
i'd say yes. it's a marker class in the same way that Elaboratable is
<lkcl>
ValueCastable... yyeah.
<whitequark>
excellent, let's go with that name then
<whitequark>
the cast method should be .as_value I'm pretty sure, we already have the convention going
<Yehowshua>
Defining how something lowers - is this a new idea? Does some other HDL have that?
<whitequark>
.as_signed()
<awygle>
yeah strong agree with as_value
<Lofty>
Yep, as_value works nicely.
<cr1901_modern>
+1 to as_value
<lkcl>
is that to be implemented *by* each inheritor of ValueCastable?
<whitequark>
Yehowshua: not sure, but it seems straightforward
<whitequark>
lkcl: correct
<d1b2>
<TiltMeSenpai> .as_value() feels very rust-y, I like it
<Lofty>
I think there are a fair few people with Rust experience here
<whitequark>
TiltMeSenpai: nmigen is generally fairly rusty; not overwhelmingly so, but I do borrow ideas
<d1b2>
<TiltMeSenpai> yeah
<Yehowshua>
Yeah - it is straightforward. So much so that I'm wondering why someone else didn't think of this?
<awygle>
then it should be to_value :p
<Yehowshua>
Rust is a bootiful language
<awygle>
or into_value
<awygle>
rather
<lkcl>
mmmm then i can see that getting "old" quite quickly - enough so that people start putting it into classes that they then inherit from
<whitequark>
into_* is about ownership
<awygle>
i know, i'm joking
<whitequark>
right ok
<awygle>
(mostly joking, partially lamenting python's weak type system)
<awygle>
either way nothing productive
<lkcl>
one of the really nice things about Record (and RecordObject): there's one function (the constructor) and that's it
<whitequark>
lkcl: i don't expect there will be many direct subclasses of ValueCastable
<Yehowshua>
python is a snek. sneak's have no bones...
<whitequark>
but i don't see anything wrong with subclassing it further
<Yehowshua>
**snakes - commenting on weak types
<Lofty>
I mean, nMigen tries to compensate for the type system as much as possible
<whitequark>
it's completely your code, do whatever you want, nmigen will take it
<lkcl>
whitequark: oh this is "internal to nmigen" classes we're talking, rather than user-created ones?
<whitequark>
nope, ValueCastable is something downstream code would use
<whitequark>
your RecordObject would probably inherit from it
<whitequark>
eventually
<whitequark>
nmigen Record would inherit from it
<awygle>
somebody somewhere needs to implement the lowering, and we should expose that
<Lofty>
This really does sound like a trait at this point :P
<lkcl>
ok. right. and then RecordObject would be used (without having every instance that inherits from RecordObject have its own to_value)
<whitequark>
Lofty: it *is* a trait
<whitequark>
lkcl: yes
<awygle>
if you want Record lowering semantics you should be able to inherit from Record
<lkcl>
okaaay
<whitequark>
(Record is going away though)
<whitequark>
(but yeah)
<awygle>
well yes, but it's the only example we all have experience with currently
<whitequark>
yeah
<Yehowshua>
as in will be deprecated?
<whitequark>
yes
<awygle>
nice segue :p
<Lofty>
And eventually removed
<whitequark>
everything public goes through a deprecation cycle of at least one release
<Yehowshua>
Well - you have to do what you have to
<whitequark>
more if it turns out to be a major issue for downstream code
<Yehowshua>
Just own it like Apple killing CD drivers
<Yehowshua>
**drives
<whitequark>
we don't *just* break downstream code if we can do it at all
<lkcl>
Yehowshua: we may have to take a copy of Record and maintain it externally (in nmutil) at the crossover/deprecation point
<whitequark>
we have one more aspect on ValueCastable
<awygle>
oh sorry
<Yehowshua>
lkcl - and or eventually re-write to use valuecastable
<whitequark>
it is related to an edge case of (incorrect) user code returning different things when .as_value() is called multiple times
<Yehowshua>
In fact - I think re-writing codebases every once in a while is a good exercise
<Yehowshua>
Albeit painful
<lkcl>
Yehowshua: it depends on timing of the Oct 2020 tape-out
<lkcl>
whitequark: oh?
<Yehowshua>
Yeah - not for a while
<whitequark>
if you return different results from .as_value() when it is called (by nmigen, during casting) multiple times
<whitequark>
you can end up with wildly internally inconsistent ASTs
<whitequark>
and things will break in a confusing way well after the fact
<Lofty>
I want to just say "this is undefined behaviour", but obviously that's not a solution; how could one catch a situation like that?
<awygle>
is this something we can check for? we have the lazy lowering stuff in the current UserValue
<lkcl>
this is similar to the original discussion we had for RecordObject: what happens if someone adds things to the RecordObject *after* constructor time?
<awygle>
we could check for equality there
<whitequark>
awygle: we can't check for equality
<lkcl>
except the problem's now moved to to_value()
<whitequark>
Values override ==
<awygle>
it being python you can still override __eq__ but then it should be obvious you're making things worse
<Lofty>
This seems like a "murphy versus machiavelli" problem, almost.
<whitequark>
no no, .as_value() returns a Value, and Value does override __eq__
<cr1901_modern>
whitequark: Not saying returning different results is a good idea, but... why can't that be caught internally in nmigen via an isinstance() dance?
<whitequark>
uh, how would isinstance() help?
<Lofty>
They're all instances of Value, right?
<awygle>
mm i guess Value.cast is static so you can't really store previous results there, you'd have to do it internal to the ValueCastable, which means the user can still do Bad Things
* lkcl
wonders two things. (A) is it Officially Nmigen's Problem at all (B) if it is, can hashing of the AST be done, keeping a dictionary of first-usage and comparing it against subsequent uses?
<awygle>
or that
<whitequark>
(a) yes, it's a footgun, (b) that's not trivial, hence the discussion
<whitequark>
let me explain the options we have
<whitequark>
and why they're all bad
<cr1901_modern>
Oh, the "child" type is erased when is_value() is called
<whitequark>
awygle: i'm not being defensive against the user doing Bad Things on purpose; that is not possible in Python
<whitequark>
(or even in Rust really, though it is harder there)
<Lofty>
<Lofty> This seems like a "murphy versus machiavelli" problem, almost.
<lkcl>
Lofty: lol
<whitequark>
(you can already shell out to gdb and change private fields in your program without UB)
<whitequark>
can always*
<awygle>
i copy, i meant "without realizing, while thinking they're doing the right thing"
<whitequark>
what i'm defending against is *accidental misuse*
<whitequark>
yes
<whitequark>
so there are two general options we have for this case
<whitequark>
detect or prevent
<Lofty>
I'm wondering if there's some feasible way of evaluating as_value exactly once.
<whitequark>
and there are two places we can do it in
<whitequark>
as_value() itself, and Value.cast()
<whitequark>
detection would mean memorizing the value when it's first returned, and comparing them the next time the function is called
<whitequark>
this seems pointless: it's strictly more work than prevention
<whitequark>
okay, it's not actually pointless, it eagerly shows that there is a bug in the user code
<Yehowshua>
I'm scratching my head on how to implement prevention
<whitequark>
prevention in this case would mean memorizing the value when it's first returned, and then never calling the user function again
<lkcl>
agreed: if you're going to go to the trouble of detection, and you *know* it's going to cause problems in 100% of cases, logically that suggests prevention
<Lofty>
So prevention would be "evaluate exactly once"
<lkcl>
mmm except...
* lkcl
thinks
<whitequark>
yes. we currently do that. but it's not a very good implementation
<lkcl>
someone calling it twice would go "why doesn't it do what i want the second time??"
<awygle>
i actually think detection is better than prevention
<awygle>
yes, for that reason
<awygle>
detection will loudly tell you you've made a mistake. prevention will silently do _something_ which may or may not be right
<lkcl>
unless.. haha, you actually monkey-patch the module and **REMOVE** the to_value function after it's first called :)
<Lofty>
lkcl: machiavelli
<Yehowshua>
Why not combine detection and prevention
<Yehowshua>
That is prevent
<Yehowshua>
But then inform the user you prevented
<lkcl>
or, you monkey-patch it to replace it with "don't call this again, here's why"
<Lofty>
If a detection trips, it should be fatal
<lkcl>
*or*...
<whitequark>
it should be a hard error if we detect at all
<Lofty>
How do you know which one is the intended result?
<whitequark>
no monkey patches
<lkcl>
you use an over-ride on __getattr__ which checks if the thing being accessed is named "to_value"
<whitequark>
no overrides
<whitequark>
no metaclasses
<awygle>
what we need is linear types basically, but we can't have those. currently i like the memoization option the best.
<whitequark>
no weird junk people will get confused by
<cr1901_modern>
I don't see what's wrong with memoization
<whitequark>
yes, let me explain
<whitequark>
so the way memoization would be implemented is by adding a private field on the user class (it will end up being named _Value__casted or something) in Value.cast
<whitequark>
that's fine
<whitequark>
we can then either detect or prevent or whatever
<whitequark>
the problem is that .as_value() is a public function
* lkcl
thinks...
jeanthom has joined #nmigen
<lkcl>
oh. i wonder if, just like in Elaboratable detects "def elaborate", if it's possible to require that a base-class function be called
<whitequark>
and, suppose one has a PackedStruct and one wants to rotate it or shift it for whatever reason
<lkcl>
that base-class function will set the flag "to_value_has_been_called"
<whitequark>
so you'd write packed_struct.rotate_left(10) but that doesn't work cuz it's not a Value
<FL4SHK>
I have a concern
<FL4SHK>
will I have to change my existing code that uses `Record`?
<whitequark>
FL4SHK: eventually, yes, because Record will be deprecated and removed
<whitequark>
for now, no, there will be a compat shim
<FL4SHK>
How often do you plan on doing breaking changes?
<Yehowshua>
And FL4SHK, you can pull in Record directly
<whitequark>
every 0.x release I remove features deprecated in 0.(x-1)
<FL4SHK>
oh my
<FL4SHK>
will it ever become very stable?
<whitequark>
the release cadence is... I think jfng suggested 3 months ideally? right now it's more like 6 months
<Lofty>
It *is* 0.x software
<FL4SHK>
I see
<Yehowshua>
Well, its only been around a little over a year
<FL4SHK>
nMigen?
<whitequark>
FL4SHK: year, year and a half from now, we might have 1.0
<Lofty>
Yep
<FL4SHK>
I thought it was longer than that
<whitequark>
something like that
<whitequark>
nMigen is very young
<cr1901_modern>
December 2018
<FL4SHK>
I see
<awygle>
starting to think we should put the deprecation policy in the readme
<Lofty>
Honestly, I don't think we need to rush to 1.0 anyway, but that's kinda irrelevant
<whitequark>
awygle: it will be in the docs
<awygle>
we get that question a fair bit
<whitequark>
quite prominently
<Yehowshua>
Which leads me to my next question
<FL4SHK>
Breaking changes are scary because I have old code sometimes
<Yehowshua>
About the docs
<FL4SHK>
and I can't always update it
<cr1901_modern>
>so you'd write packed_struct.rotate_left(10) but that doesn't work cuz it's not a Value
<cr1901_modern>
Was this a finished thought?
<whitequark>
FL4SHK: such is the life with 0.x dependencies
<FL4SHK>
All right.
<Lofty>
FL4SHK: the old versions will still be on PyPI, so you can pin against them, I think
<whitequark>
i go to quite a bit of effort to make upgrades painless
<whitequark>
e.g. the deprecation errors tell you how to fix your code, typically
<whitequark>
*warnings
<Yehowshua>
Yup - noticed with nmigen.back.pysim -> nmigen.sim.pysim
<FL4SHK>
Any idea if very basic stuff might change?
<whitequark>
with Record specifically you could also extract it from the nmigen codebase and stuff it into your own codebase and use it indefinitely
<whitequark>
FL4SHK: not really
<FL4SHK>
That's about what I figured
<whitequark>
Record is one of the few major warts
<FL4SHK>
probably not going to have to deal with very many breaking changes on my end, then
<FL4SHK>
I bet PackedStruct will stick around
<whitequark>
the other is the build system DSL, but that's far off, and will probably have an automatic migration system
<whitequark>
PackedStruct should be the final design for that component
<FL4SHK>
I largely only need plain old Python classes and `PackedStruct`
<awygle>
whitequark: to try to finish the thought out, i suspect we could do _some_ kind of python shenanigans to ensure that as_value did memoization _itself_, thereby avoiding the problem. the question is, is that too much magic.
<FL4SHK>
What about Layout?
<FL4SHK>
will it be sticking around?
<Yehowshua>
@awg
<FL4SHK>
american wire gauge
<Yehowshua>
awgle is right
<whitequark>
we'll get to that discussion once we finish #355, ok?
<Yehowshua>
FL4SHK - ur funny
<whitequark>
let's stay on topic
<whitequark>
so
<whitequark>
19:19 < cr1901_modern> >so you'd write packed_struct.rotate_left(10) but that doesn't work cuz it's not a Value
<whitequark>
this was not a finished thought, we veered way off topic
* cr1901_modern
nods
<whitequark>
what you *should* write when you realize `packed_struct.rotate_left` doesn't work, is `Value.cast(packed_struct).rotate_left(10)`
<whitequark>
but what you might *want* to write is `packed_struct.as_value().rotate_left(10)`
<whitequark>
and that doesn't detect or protect against incorrect implementations of .as_value()
<lkcl>
urrr yuk
<cr1901_modern>
hrm... :(
<whitequark>
awygle is right: we'll need *some* python shenanigans there
<awygle>
whitequark: radical proposal for discussion - what if Value.cast didn't work for this?
<cr1901_modern>
import inspect
<cr1901_modern>
and inspect the call frame?
<awygle>
so that `as_value` is the canonical and only way to do this
<whitequark>
awygle: that makes the problem worse
<whitequark>
because we don't control as_value
<lkcl>
to explain that: if those are the implementations of.. say... the upgraded-Record-replacement's rotate_left function, fine
<awygle>
mm.. yes, fair.
<whitequark>
the other reason it makes the problem worse is that all of the nMigen guts use Value.cast
<whitequark>
anyway, let's see which our options to fix this are
<awygle>
i shoulda stopped talking once everybody was saying how right i was :p
<Yehowshua>
Well its either lots of shenanigans or lots of educating
<whitequark>
- we could do memoization in Value.cast and somehow detect if .as_value() is called directly
<whitequark>
- we could do memoization in .as_value() (using a decorator, probably) and then detect if .as_value() is not defined using this decorator
<cr1901_modern>
So now, if you enforce the decoration rule in Value.cast, that means that _everything_ that Value.cast takes must now be decorated? Is the current behavior for Value.cast to accept a superset of types beyond "stuff that implements as_value()"? >>
<cr1901_modern>
Basically "will the new behavior break compatibility with Value.cast as-is right now"?
<whitequark>
er, not at all
<whitequark>
it doesn't change the existing behavior in any way
<whitequark>
it adds new behavior
<whitequark>
well
<whitequark>
the way we have it worked out is actually simpler than that
<whitequark>
now, *instantiating* a ValueCastable is different
<whitequark>
say you have this code:
<whitequark>
class MyRecord(ValueCastable):
<whitequark>
def as_value(): ...
<whitequark>
if you do MyRecord() and as_value() is not decorated with @ValueCastable.memoize [preliminary name], an exception is thrown
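[A plain-Python sketch of the check-and-memoize mechanism whitequark outlines; ValueCastable.memoize is the preliminary name from this discussion, not a shipped nMigen API. The base class refuses to instantiate a subclass whose as_value() is not wrapped, and the wrapper caches the first result so repeated casts cannot return different ASTs.]

    import functools

    class ValueCastable:
        def __new__(cls, *args, **kwargs):
            if not getattr(cls.as_value, "_memoized", False):
                raise TypeError("{}.as_value() must be decorated with "
                                "@ValueCastable.memoize".format(cls.__name__))
            return super().__new__(cls)

        @staticmethod
        def memoize(func):
            @functools.wraps(func)
            def wrapper(self):
                if not hasattr(self, "_cached_value"):
                    self._cached_value = func(self)   # call user code exactly once
                return self._cached_value
            wrapper._memoized = True
            return wrapper

        def as_value(self):
            raise NotImplementedError

    class MyRecord(ValueCastable):
        @ValueCastable.memoize
        def as_value(self):
            return object()             # stands in for building an nMigen Value

    r = MyRecord()
    assert r.as_value() is r.as_value()  # same object every time, by construction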
* lkcl
apologies: need to rest. will be back (and checking irc logs)
<cr1901_modern>
ahhh
<d1b2>
<emeb> golly! hacked up a custom platform definition for my up5k board and tried the acm_serial LUNA example and it actually enumerated!
<whitequark>
okay, two hours is enough of a meeting
<whitequark>
I guess we're done for today
<jfng>
can we spend 5 minutes on nmigen-soc issues?
<cr1901_modern>
I see... ValueCastable has the memoization logic, so Value.cast() and as_value() do the same thing
<whitequark>
jfng: oh, yeah
* cr1901_modern
will go back to read-only
<_whitenotifier-b>
[nmigen] awygle commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC8h
* lkcl
would like to hear about nmigen-soc
<awygle>
wq lemme know if i mis-summarized or missed anything
<_whitenotifier-b>
[nmigen] whitequark commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC4I
<whitequark>
nope, all seems correct
<_whitenotifier-b>
[nmigen] awygle commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC4L
<jfng>
the question would be, can we consider it done ?
<_whitenotifier-b>
[nmigen] whitequark commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC4m
<jfng>
i believe most of the scaffolding needed for csr peripherals is done
<_whitenotifier-b>
[nmigen] whitequark commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC4c
<jfng>
one issue that has not been addressed is awygle's concern about compatibility between peripherals
<whitequark>
jfng: we decided to go for Approach A, plus a wrapper that puts the two together if you use a peripheral as a "black box"
<whitequark>
right?
<awygle>
my question on #10 would be, does this design preclude eventually having a "bus-agnostic" way to describe memories (as well as registers) which could be used to write bus-agnostic peripherals? i believe this is a desirable use case
<lkcl>
awygle: and access the CSRs directly?
<awygle>
lkcl: i want a way to say "i have these control registers and this memory-map" and be able to instantiate that with a Wishbone bus, or an AXI bus, or an Avalon bus, or a custom bus, without having to change anything about the peripheral
<awygle>
which makes me nervous about the proposed `wishbone.Peripheral` mixin
<lkcl>
awygle: cool. i can see how that would be useful / desirable
<awygle>
but it's not necessarily a problem, i just want to raise it as a use case i am interested in
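Purely illustrative sketch of that use case; none of these classes or methods exist in nmigen-soc, they only show the shape of "describe the registers and memory map once, then instantiate behind any bus":

    # Hypothetical bus-agnostic description (all names invented for illustration):
    class MySram(BusAgnosticPeripheral):
        def __init__(self):
            super().__init__()
            self.ctrl   = self.csr(width=8, access="rw")
            self.status = self.csr(width=8, access="r")
            self.buf    = self.memory(depth=1024, width=32)

    # The same description wrapped for a concrete bus without touching MySram:
    wb_sram  = WishboneAdapter(MySram())
    axi_sram = AXIAdapter(MySram())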
<whitequark>
awygle: is that actually possible, mechanically?
<whitequark>
unless your memory is something dumb, you're probably actually handling bursts yourself
<lkcl>
there's also a real-world use-case i can think of: Raptor Engineering is doing an LPC implementation.
<whitequark>
and i'm not sure if there is a way to abstract over WB and AXI bursts
<jfng>
sorry, i wasn't clear enough
<lkcl>
it's possible for LPC to "flip" - dynamically - into UART mode.
<jfng>
my question is for a peripheral with only CSRs, no memories, no WB
<awygle>
whitequark: that's a fair question, but 1) lots of memories are dumb 2) you can probably map to a lowest common denominator if performance isn't critical 3) if performance is critical you should still be able to write AXI-only peripherals
<whitequark>
awygle: do we need many kinds of dumb memory peripherals?
<jfng>
just two attributes: `csr_bus` and `periph_info` for metadata
<awygle>
i am open to learning that it's not possible or not useful, i just don't want to preclude it at this early stage
<lkcl>
as in: some CSRs *reprogram* the behaviour of the peripheral to be a completely different type of interface.
<lkcl>
sorry completely different type of peripheral
<awygle>
i don't want an AXISramPeripheral and a WBSramPeripheral
<whitequark>
why not?
<whitequark>
does it cause problems?
<awygle>
twice the verification effort, i guess?
<whitequark>
hmm
Kekskruemel has joined #nmigen
<whitequark>
but most of the verification of a memory mapped peripheral is verification of the bus, right?
<whitequark>
like, even for a DDR controller, presumably most of it would live in stdio
<whitequark>
and be verified there
<awygle>
i think if we do have AXISramPeriph and WBSramPeriph, then most of the code in any given peripheral will be mapping from AXI and/or WB to a bus-neutral control interface (in nmigen-stdio) anyway. but again, i could be wrong about that.
<lkcl>
it makes sense to me for AXIsramPeripheral and WBSramPeripheral to be created by way of mix-ins
<awygle>
but we're drifting away from jfng's request pretty harshly
<lkcl>
and likewise {AnyOtherBus}SramPeripheral
<whitequark>
yeah
<jfng>
memories are a whole other topic yes
<awygle>
in the absence of memories i don't have any real issue with the current proposal but i don't really see the value of the wishbone.Peripheral mixin, i guess
<jfng>
my question is: do we need a mixin csr.Peripheral class ?
<jfng>
which would validate two attributes: the csr bus interface, and the peripheral metadata
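A minimal sketch of what such a mixin could amount to, assuming it does nothing beyond checking that the two attributes are provided (attribute names taken from the discussion, everything else illustrative):

    class Peripheral:
        """Hypothetical csr.Peripheral mixin: validate `csr_bus` and `periph_info`."""

        @property
        def csr_bus(self):
            try:
                return self._csr_bus
            except AttributeError:
                raise NotImplementedError("peripheral must provide a CSR bus interface")

        @property
        def periph_info(self):
            try:
                return self._periph_info
            except AttributeError:
                raise NotImplementedError("peripheral must provide its metadata")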
<whitequark>
hmm
<jfng>
the alternative i see is pure naming conventions
<whitequark>
what would a peripheral with both CSR and Wishbone look like?
<whitequark>
inherit from wishbone.Peripheral alone? inherit from both wishbone.Peripheral and csr.Peripheral?
<awygle>
what is the use case for having both?
<lkcl>
and then because they inherit from both, they know to "talk" to each other?
<jfng>
a wrapper peripheral class, maybe with a decorator, that would implement the bridge
<jfng>
so it would inherit from wishbone.Peripheral alone
<whitequark>
jfng: can you remind me what the outcome was of the discussion of split CSR/Wishbone vs. unified CSR/Wishbone?
<whitequark>
ie do the peripherals with both CSR and Wishbone export a single Wishbone bus, or both CSR and Wishbone buses
<whitequark>
I recall we reached a decision but I can't remember which one it is
<whitequark>
and there was some really good reason for that decision too
<jfng>
oh no, i forgot
<awygle>
yknow what i'ma just shut up because i'm not very informed here. i've laid out my use case, i trust y'all to either support it or decide it's a bad idea.
<lkcl>
i have a vague recollection that AXI4 has CSRs separate somehow. it could just be a convention though
<lkcl>
we may actually have to use a modified version of Wishbone.
<lkcl>
(adding support for speculative read/writes)
<whitequark>
awygle: i'm going to defer that decision, i think nothing forces us to preclude it for now, so i'll keep the option open to have the kind of middleware you request
<whitequark>
but no promise that it would absolutely be the way we go
<lkcl>
anything that's "merged" would make that... difficult.
<awygle>
copy
<lkcl>
Wishbone is based on a "take-it-or-leave-it" type of contract.
<whitequark>
yeah, nmigen-soc will strictly stick to upstream Wishbone
<lkcl>
Out-of-Order designs need the "House Contract of Sale" contract. "offer, exchange, complete"
<whitequark>
jfng: okay, we need to figure that out (again)
<whitequark>
because i think it would be the key for making this decision
<whitequark>
maybe ask key2? iirc he was involved
<lkcl>
which would mean that, if it's not "separable" (so that we can mix in alternative buses), we'd have to hard-fork nmigen-soc. or write a replacement.
<lkcl>
which would be a lot of duplicated effort.
<whitequark>
jfng: iirc, i argued for split CSR/Wishbone buses in peripherals that have their own Wishbone bus because you can always turn it into a merged one, but not the other way around
<lkcl>
to explain: the Out-of-Order design that we're doing can have up to *eight* in-flight memory read/writes simultaneously outstanding
<whitequark>
and we could have a wrapper that turns the split one into a merged one if desired
<whitequark>
on the other hand, the split design can have somewhat lower resource consumption
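For the "wrapper" direction, a sketch assuming nmigen-soc's WishboneCSRBridge and a hypothetical split peripheral that exposes both `csr_bus` and `wb_bus` attributes:

    from nmigen_soc.csr.wishbone import WishboneCSRBridge

    # Put the CSR side behind Wishbone; both windows can then be added to the
    # SoC's wishbone decoder, so the outside world sees a single merged bus.
    csr_to_wb = WishboneCSRBridge(periph.csr_bus)
    # decoder.add(periph.wb_bus, addr=...)      # bulk memory window
    # decoder.add(csr_to_wb.wb_bus, addr=...)   # CSR window, behind the bridge
    # (the bridge is an Elaboratable and would also be added as a submodule)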
<lkcl>
where normal Wishbone expects one and only one bus read/write at a time, and for stalling to propagate back to the main core.
<lkcl>
whitequark: indeed (wrapper makes split -> merged but not possible the other way)
<whitequark>
jfng: on the other hand, i think key2's counterargument was that it is necessary to ensure synchronization between CSR writes and memory writes
<whitequark>
so the split design isn't actually entirely viable
<jfng>
i found the logs (22/03), and a split csr/wb interface + an easy to use wrapper was indeed the conclusion
<whitequark>
hmm
<lkcl>
is there a log somewhere of key2's counterargument?
<whitequark>
it was in private communication
<lkcl>
ahh ok
<lkcl>
whoops
<lkcl>
what was his concern?
<whitequark>
if you have different latencies on CSR and WB/AXI interfaces you may have a bad time
<whitequark>
eg if you flush a FIFO
<lkcl>
that if you split things, you have to make a synchronisation protocol (in effect, something pretty similar to wishbone stb/ack)?
<whitequark>
once you command this through CSR, you want to know that the FIFO is indeed flushed
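Sketched with nmigen-soc CSR elements to make the concern concrete; `ctrl` and `status` are csr.Element instances and `flush_req`/`flush_busy` are assumed signals inside the peripheral (all illustrative):

    # A write to bit 0 of the control CSR requests the flush; software then
    # polls the status CSR until the peripheral reports that it has finished.
    with m.If(ctrl.w_stb & ctrl.w_data[0]):
        m.d.sync += flush_req.eq(1)              # starts the multi-cycle flush;
                                                 # cleared elsewhere when done
    m.d.comb += status.r_data[0].eq(flush_busy)  # 1 while the flush is in flight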
<lkcl>
yes. this i call the "take-it-or-leave-it" protocol :)
<lkcl>
and a FIFO, interestingly, interferes with that... and requires the "Contract of Sale" style API.
<lkcl>
funny.
<jfng>
if we take a split bus approach, and use mixins for peripherals, then it would be very tempting to inherit from two (wb, csr) mixins
<jfng>
but that would not work, i think
<whitequark>
jfng: why not?
<jfng>
assuming each mixin must provide a `periph_info` attribute
<whitequark>
right, that was exactly my concern
<jfng>
there must be a single point of truth `periph_info` attribute for the whole peripheral, but this assumes that its memory layout is hierarchical
<lkcl>
hhmmm i "get" key2's concern about synchronisation. it really does mean that some sort of ready/valid/busy/ack signalling is needed on CSRs.
<lkcl>
the use of e.g. Wishbone (or AXI4) *masks* that need
<jfng>
this signaling you need can be done by bridging your csrs behind a WB4 bus
<lkcl>
because normally (i.e. in the merged design), the use *of* the Bus - which has that ready/valid/busy/ack protocol built-in - *provides* the very protocol needed so that delays can..
<lkcl>
jfng: i am kinda advocating that the protocol used to communicate between split buses *is* wishbone :)
<lkcl>
even when say AXI4 is used
<lkcl>
because it contains the exact ready/valid/busy/ack communications protocol needed for managing (say) FIFO-based CSRs.
<whitequark>
lkcl: i'm pretty sure you actually want AXI4
<whitequark>
because that has out-of-order transactions
<lkcl>
whitequark: well... *thinks*...
<whitequark>
the reason nmigen-soc bothers with Wishbone at all is that a lot of existing cores and designs use it, and people are familiar with it
<whitequark>
WB4 isn't all that good, and WB itself is essentially a legacy bus at this point
<lkcl>
yeah... it's not sophisticated, that's for sure.
<lkcl>
to clarify context: i'm referring to the protocol used to communicate between the buses in the split-peripheral option
<lkcl>
as an *internal* protocol
<whitequark>
jfng: so i think the periph_info issue is fixable, but we should decide something about synchronization first
<whitequark>
what do you think about this?
<lkcl>
in the "merged" design you don't see the problem because the Bus provides the very protocol needed to ensure that FIFO-based CSRs get correctly updated
<lkcl>
acknowledgement comes back a few cycles later (when the FIFO is flushed)
<lkcl>
if all CSRs were single-cycle update, there would not be a problem
<lkcl>
am i making sense? :)
<jfng>
could we provide some metadata about bus latency ?
<whitequark>
jfng: how would a CPU core use it?
<whitequark>
wait states?
* lkcl
yup. tired. leave you to it to discuss, will check the logs. thank you to you both (and everyone)
<whitequark>
lkcl: CSRs can't all be single-cycle update because their width is unlimited
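To make the width point concrete, a sketch assuming the nmigen-soc csr API (the widths are arbitrary):

    from nmigen_soc import csr

    # A 64-bit register on an 8-bit CSR bus: a full update spans 64/8 = 8 bus
    # transactions, so it cannot be a single-cycle update and needs holding
    # registers to stay atomic.
    wide = csr.Element(width=64, access="rw")
    mux  = csr.Multiplexer(addr_width=4, data_width=8)
    mux.add(wide)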
<whitequark>
jfng: tbh, i am tired too, what do you think about discussing this later this week, or having another meeting next monday?
<whitequark>
so on 27th rather than 3rd
<jfng>
np, i need to go home too
<whitequark>
we could spend the 1st and 3rd mondays on questions in general and the 2nd/4th mondays just between us implementers :)
<awygle>
oh real quick, happy to implement 355 but not sure what my schedule is like for the next <undetermined>, so don't let me hold up 0.3 if i don't get to it, is all
<whitequark>
mm okay
<whitequark>
it's not a huge change, so worst case I can just make it myself
<whitequark>
we'll have 0.3.rc1 first
<awygle>
mhm
Asu has quit [Remote host closed the connection]
Kekskruemel has quit [Quit: Leaving]
jeanthom has quit [Ping timeout: 264 seconds]
<Degi>
Is Record([("abc", 1),("def", 1)]) the wrong syntax for making a record with 2 subsignals? Since .def gives an invalid syntax error
<whitequark>
`def` is a Python keyword
<whitequark>
you can use getattr() to access that field
<whitequark>
it's just a bad placeholder name :)
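For reference, a tiny example of that workaround:

    from nmigen import Record

    rec = Record([("abc", 1), ("def", 1)])
    rec.abc                # normal attribute access
    getattr(rec, "def")    # `def` is a keyword, so rec.def is a SyntaxError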
<Degi>
Oh indeed...
<Degi>
Heh yeah I've noticed xD
<_whitenotifier-b>
[nmigen/nmigen-soc] whitequark pushed 1 commit to master [+0/-0/±6] https://git.io/JJCuo
<_whitenotifier-b>
[nmigen/nmigen-soc] rroohhh c754caf - test: make nmigen 0.3+ compatible