ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen · IRC meetings each 1st & 3rd Monday at 1800 UTC · next meeting July 20th
<_whitenotifier-b> [nmigen] whitequark commented on issue #441: PS7 block not initialized on series-7 Zynq targets - https://git.io/JJcgl
Yehowshua has joined #nmigen
<Yehowshua> ktemkin, I'm perusing Luna - how might I set the device class in the descriptor packets?
<Yehowshua> Oh I see it
<Yehowshua> Weird - it is set correctly...
daknig2 has joined #nmigen
Yehowshua has quit [Remote host closed the connection]
daknig2 has quit [Ping timeout: 240 seconds]
Yehowshua has joined #nmigen
Yehowshua has quit [Remote host closed the connection]
<aaaa> what is "UnusedElaboratable: <nmigen.compat.genlib.fifo.CompatSyncFIFOBuffered object at 0x7f02393c7cc0> created but never used"
<aaaa> since i use the fifo immediately afterwards
<aaaa> both the read and write side are connected (one side to an internal module, one side is exposed as a port)
<aaaa> but i get that error
<aaaa> and readable/writable are stuck as zero
Degi has quit [Ping timeout: 260 seconds]
Degi has joined #nmigen
<miek> are you adding it to submodules?
<aaaa> uhhh
<aaaa> no
<aaaa> that fixed it
<aaaa> thanks
<aaaa> or at least it doesn't error now, testing it rn
jock-tanner has quit [Ping timeout: 256 seconds]
jaseg has quit [Ping timeout: 256 seconds]
jaseg has joined #nmigen
electronic_eel has quit [Ping timeout: 256 seconds]
electronic_eel has joined #nmigen
PyroPeter_ has joined #nmigen
PyroPeter has quit [Ping timeout: 260 seconds]
PyroPeter_ is now known as PyroPeter
daknig2 has joined #nmigen
<ianloic> random q: was there a `nmigen` command that's now no longer a thing?
<awygle> i completely forgot #381 was still an issue and not something that was implemented >_> oops
<awygle> is there anything blocking it from being implemented currently, and is it too big to make 0.3?
<awygle> (to be clear i'm offering to do it not badgering for it to be done)
<d1b2> <a> having a bit of an issue with the glasgow i2c implementation - i've got it isolated and hooked up into a simple design (https://github.com/GlasgowEmbedded/glasgow/blob/9a83d8574fd4031fe067eaf5c5ae748ecff7b4d6/software/glasgow/applet/interface/i2c_initiator/__init__.py#L19)
<d1b2> <a> but i'm getting weird responses when i try to use it
<d1b2> <a> i can tell that it's working at least at the basic level, since the command-sequence for "write" works consistently and it gives some sort of error when an invalid i2c slave address is given
<d1b2> <a> but when i try to do a read operation, i get back a 1 response from the address-write portion of it, and then 255, 255 from the read portion
peepsalot has quit [Quit: Connection reset by peep]
<d1b2> <TiltMeSenpai> uhh i2c has pullups
<d1b2> <TiltMeSenpai> if you see nack and 255 on data, it means you're knocking on the door but nobody's home
<d1b2> <TiltMeSenpai> if I'm interpreting your question right
<d1b2> <a> huhh
<d1b2> <a> but when i try to write it works fine
<d1b2> <a> w/ the same periph addr
<d1b2> <TiltMeSenpai> the device might not support read addresses? I don't really know
<d1b2> <TiltMeSenpai> or it could be looking for some non-7-bit address?
<d1b2> <a> works fine with a different i2c impl
<d1b2> <TiltMeSenpai> do you have an oscilloscope?
<d1b2> <TiltMeSenpai> or logic analyzer
<d1b2> <TiltMeSenpai> something weird is going on that's stopping the target from driving the bus, but it's hard to say what without looking at the waveform
<d1b2> <TiltMeSenpai> oh wait if you're running on a glasgow, you can use the trace option
<d1b2> <a> i can hook up the i2c output to a logic analyzer and grab waveforms
<d1b2> <TiltMeSenpai> yeah if you add --trace output.vcd the glasgow should end up writing a vcd with measured values to output.vcd
<d1b2> <TiltMeSenpai> might be easier than grabbing a logic analyzer and hooking things up
<d1b2> <a> oh this is running on a different fpga
<d1b2> <TiltMeSenpai> oh are you just using the gateware?
<d1b2> <a> yeah i just pulled the relevant gateware into a normal fpga project
<d1b2> <TiltMeSenpai> ah I see
daknig2 has quit [Ping timeout: 256 seconds]
<whitequark> a: that i2c implementation isn't the best in the world, but i think i tested it quite a bit
<whitequark> awygle: we should discuss it on today's meeting
<awygle> oh, sure
<d1b2> <a> huhh there might be some FIFO issues here too actually, is there any reason a SyncFIFOBuffered would behave weirdly?
<d1b2> <a> reading from the same FIFO at different times is yielding different results
<whitequark> hmm
<whitequark> which fpga? is it multiclock?
<d1b2> <a> ECP5, one clock
<whitequark> nothing that comes to my mind
<d1b2> <a> there's a 1 second delay after the i2c operations are written to the command fifo before i try to read from the output fifo
<d1b2> <a> but if i let other things happen before reading from the FIFO, then the FIFO output is different
<d1b2> <a> depth is v large so that doesn't appear to be an issue
* awygle confirms meeting time for the fourth time
daknig2 has joined #nmigen
jeanthom has joined #nmigen
hitomi2504 has joined #nmigen
<d1b2> <a> fixed it, turned out that the data wasn't being put into the FIFO fast enough to keep up with the I2C transaction
<d1b2> <a> that and a small bug in the FIFO read logic
<d1b2> <a> how does glasgow handle this? does it store the data in a separate FIFO until it's all been received over USB, and then quickly dump it into the main FIFO?
jock-tanner has joined #nmigen
proteus-guy has joined #nmigen
jeanthom has quit [Ping timeout: 256 seconds]
Asu has joined #nmigen
jock-tanner has quit [Ping timeout: 256 seconds]
jeanthom has joined #nmigen
daknig2 has quit [Ping timeout: 256 seconds]
daknig2 has joined #nmigen
daknig2 has quit [Ping timeout: 256 seconds]
nengel has joined #nmigen
jeanthom has quit [*.net *.split]
nengel has quit [*.net *.split]
Degi has quit [*.net *.split]
alexhw_ has quit [*.net *.split]
mwk has quit [*.net *.split]
nengel has joined #nmigen
jeanthom has joined #nmigen
alexhw_ has joined #nmigen
mwk has joined #nmigen
Degi has joined #nmigen
proteus-guy has quit [Ping timeout: 256 seconds]
jock-tanner has joined #nmigen
jock-tanner has quit [Ping timeout: 260 seconds]
emeb has joined #nmigen
cstrauss[m] has joined #nmigen
<DaKnig> where can I look for examples of using Array, FIFO, Memory...?
<lkcl_> DaKnig: find nmigen/examples -name "*.py" | xargs grep "Array"
<lkcl_> :)
<DaKnig> yeah good point
<DaKnig> :)
<lkcl_> if those don't exist, then... (1sec..)
* lkcl_ just looking...
<lkcl_> Array: https://git.libre-soc.org/?p=nmutil.git;a=blob;f=src/nmutil/multipipe.py;hb=HEAD#l46
<lkcl_> set up here: https://git.libre-soc.org/?p=nmutil.git;a=blob;f=src/nmutil/multipipe.py;hb=HEAD#l121
<lkcl_> and accessed here: https://git.libre-soc.org/?p=nmutil.git;a=blob;f=src/nmutil/multipipe.py;hb=HEAD#l222
<lkcl_> so for Array, you create a list (or a generator of some kind) and pass that to "Array"
<lkcl_> then you can "index" that "Array" using a Signal.
<lkcl_> it's real simple
<DaKnig> but this means different members can have different type
<DaKnig> does this compile down to actual arrays on the verilog level
<lkcl_> ah... i've never tried that however it might actually work.
<FL4SHK> I've not tried arrays that have heterogeneous elements
<FL4SHK> I don't normally need things like that
<lkcl_> you mean "on the ilang level", yes
<FL4SHK> I guess also my FIFO
<DaKnig> ilang?
<DaKnig> IL?
<DaKnig> or uh, IR?
<FL4SHK> ilang is what nMigen spits out right?
<lkcl_> DaKnig: yes yosys ilang
<DaKnig> it spits out verilog
<lkcl_> FL4SHK: yes.
<DaKnig> I m using vivado, for me it spits verilog
<lkcl_> DaKnig: no, it doesn't. it generates yosys native ilang.
<lkcl_> you can then pass that *to* yosys, by using "read_ilang {filename}" then "write_verilog {filename}" if you want verilog
<DaKnig> oh cool
<DaKnig> so thats how it does it
<lkcl_> but that is yosys's job, not nmigen's
<DaKnig> "the nmigen toolchain" :)
<lkcl_> yup. there's also a ghdl plugin for yosys, it works, but people here recommend using verilator for conversion of vhdl, as being more complete. i think it's verilator
<FL4SHK> the GHDL plugin doesn't support records with unconstrained elements in them
<FL4SHK> i.e. one of the most desirable features from VHDL 2008
<lkcl_> FL4SHK: ahh this is what the microwatt team ran into.
<DaKnig> I would really like VHDL to get as much attention as Verilog in the open source circles
<DaKnig> not enough tools, that dont work as well...
<FL4SHK> old VHDL is pretty good btw
<lkcl_> DaKnig: from working with microwatt for several months, i am now deeply impressed with VHDL
<FL4SHK> it's just not as good as VHDL 2008
<FL4SHK> ...VHDL 2008's generic packages feature, one of the best things about the language, is very poorly supported
<FL4SHK> now, nMigen is actually a lot better in the high level things department
<lkcl_> i had a lot of trouble compiling microwatt.
<FL4SHK> What exactly is Microwatt?
<lkcl_> FL4SHK: 1 sec
<FL4SHK> ah, POWER
<FL4SHK> I like making new instruction sets
<FL4SHK> I've got one I'm in the process of making the assembler for
<lkcl_> it's a POWER9-compliant core written in VHDL by a research team in IBM that i worked with, 20 years ago
<FL4SHK> 20 years ago I was 6
<FL4SHK> first grade, didn't care about computers
<DaKnig> tbh microwatt is a confusing name, I get the pun but theres something else with that name in the field already
<lkcl_> FL4SHK: yes, i remember you saying: i'm actually really interested to hear how that goes over time.
<lkcl_> DaKnig: been there. when they wanted to make it a 5-stage pipeline design i told them they'd have to change the name to "megawatt" :)
<DaKnig> lol
<FL4SHK> lkcl_: I decided to make a smaller processor first before the big one
<FL4SHK> the instruction set is defined
<FL4SHK> there are 36 instructions
<FL4SHK> purely 32-bit machine
<FL4SHK> ...almost
<FL4SHK> it has full products for multiplies
<DaKnig> where can I see the ilang that nmigen spits
lkcl has joined #nmigen
<lkcl> bleh, apologies: mobile broadband internet connection
<lkcl> DaKnig, SyncFIFO example - https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/minerva/units/loadstore.py;hb=HEAD#l189
lkcl_ has quit [Ping timeout: 260 seconds]
<lkcl> that example's interesting in that it's combined with a Record (defined at line 185)
<lkcl> FL4SHK: yes, doing a small design before moving to a larger one, very sensible.
<lkcl> DaKnig: take any of the examples https://github.com/nmigen/nmigen/tree/master/examples/basic
<lkcl> then type "python3 {whatever.py} --help"
<lkcl> followed by "python3 {whatever.py} generate -t il"
<lkcl> if you want to see the equivalent verilog, change that to "-t v"
<lkcl> DaKnig: i thoroughly recommend opening the ilang in yosys ("read_ilang {filename.il}") and doing "show top" or "show {press tab key}" if there are submodules
<lkcl> you'll need xdot and graphviz installed (apt-get install graphviz) for that to work
nengel has quit [Ping timeout: 240 seconds]
<lkcl> it's quite fascinating to see the results of a linearly-written program as a gate-level tree, graphically.
<lkcl> i've fixed several early rookie mistakes by always examining the graphviz after every edit
FFY00 has quit [Quit: dd if=/dev/urandom of=/dev/sda]
FFY00 has joined #nmigen
<DaKnig> does it show the primitives too?
hitomi2504 has quit [Quit: Nettalk6 - www.ntalk.de]
<whitequark> a: glasgow has two FIFOs, the one that holds data while it's being received through USB is in the FX2
<whitequark> DaKnig: regarding arrays with heterogeneous elements, an array is basically a more compact way to write a Switch()/Case() construct
<whitequark> if you use an array on left-hand side, what happens is it expands on a Switch(index) and every Case() contains a single eq where the right-hand side is the same
<whitequark> if you use an array on RHS, then in every Case the LHS is the same
<whitequark> (though you could use an Array on LHS and Array on RHS and that has a more complex expansion, but the basic principle is the same)
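The expansion whitequark describes can be sketched with a plain-Python model. This is not nMigen code and does not reproduce nMigen's internal lowering; the helper names `expand_rhs_array`, `expand_lhs_array` and `render_switch` are hypothetical, and the rendered `m.d.comb` text is only there to make the shape of the resulting Switch/Case visible.

```python
def expand_rhs_array(target, elements):
    """Model `target.eq(array[index])`: one Case per element, same LHS in each."""
    return [(i, f"{target}.eq({elem})") for i, elem in enumerate(elements)]


def expand_lhs_array(elements, source):
    """Model `array[index].eq(source)`: one Case per element, same RHS in each."""
    return [(i, f"{elem}.eq({source})") for i, elem in enumerate(elements)]


def render_switch(index, cases):
    """Render the (case value, statement) pairs as Switch/Case-style text."""
    lines = [f"with m.Switch({index}):"]
    for value, stmt in cases:
        lines.append(f"    with m.Case({value}):")
        lines.append(f"        m.d.comb += {stmt}")
    return "\n".join(lines)


# Array on the right-hand side: the LHS ("out") is the same in every Case.
rhs_cases = expand_rhs_array("out", ["a", "b", "c"])
# Array on the left-hand side: the RHS ("inp") is the same in every Case.
lhs_cases = expand_lhs_array(["a", "b", "c"], "inp")

print(render_switch("sel", rhs_cases))
print(render_switch("sel", lhs_cases))
```

Because each element only ever appears inside its own Case, nothing in this model forces the elements to share a type, which is consistent with the heterogeneous-element question above.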
<FL4SHK> still working on my Python-based assembler here
<FL4SHK> going to support variable width instructions
<FL4SHK> ...in reality it's just that a source code instruction may end up as two instructions in the final binary
<FL4SHK> specifically, instructions that use 32-bit immediates
<whitequark> kinda like boneless :)
<FL4SHK> yes
<FL4SHK> does boneless do that?
<FL4SHK> instructions that use immediates basically have two possible sizes in my architecture
<FL4SHK> ...in reality there's just a `pre` instruction that expands the size of the immediate in the following instruction
<FL4SHK> as such it's not truly variable width
<FL4SHK> `pre` is a real, separate instruction
<tpw_rules> that's precisely what boneless does
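A minimal sketch of the `pre` idea, assuming a made-up encoding: a 16-bit in-instruction immediate field and unsigned 32-bit immediates. Neither FL4SHK's ISA nor Boneless is being reproduced here; `IMM_BITS`, `assemble_addi` and `effective_immediate` are hypothetical names chosen for illustration.

```python
IMM_BITS = 16  # hypothetical width of the immediate field inside one instruction


def assemble_addi(rd, ra, imm):
    """Assemble `addi rd, ra, imm` into one or two symbolic instructions.

    If the immediate fits in IMM_BITS, a single instruction is emitted.
    Otherwise a `pre` instruction carrying the upper bits is emitted first.
    """
    if not 0 <= imm < (1 << 32):
        raise ValueError("immediate must be an unsigned 32-bit value")
    low = imm & ((1 << IMM_BITS) - 1)
    high = imm >> IMM_BITS
    if high == 0:
        return [("addi", rd, ra, low)]
    return [("pre", high), ("addi", rd, ra, low)]


def effective_immediate(instructions):
    """Decode side: combine an optional `pre` with the following instruction."""
    prefix = None
    for ins in instructions:
        if ins[0] == "pre":
            prefix = ins[1]          # remember the upper bits for the next instruction
        else:
            imm = ins[3]
            if prefix is not None:
                imm |= prefix << IMM_BITS
            return imm
    raise ValueError("no instruction with an immediate found")


short_form = assemble_addi(1, 2, 5)
long_form = assemble_addi(1, 2, 0x12345678)
print(short_form, long_form)
```

Every instruction in the binary stays the same width; only the number of instructions emitted per source line varies, which is why FL4SHK calls it not truly variable width.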
<FL4SHK> ah
<lkcl> DaKnig: yes, everything is shown. do try it out, you'll soon see
<FL4SHK> seems great minds think alike
<FL4SHK> though in my case, I didn't get the idea on a whim or anything
<whitequark> i got it from Philip Freidlin I think
<lkcl> whitequark: cool! i didn't realise Arrays could be used as a type of Switch statement
<FL4SHK> I was the TA for a course my last semester of grad school
<whitequark> lkcl: that's basically all they are
<whitequark> *Philip Freidin
<FL4SHK> And the architecture that the students had to work with had a `pre` instruction
<tpw_rules> FL4SHK: one quirk in boneless is that since there's only 3 immediate bits in the arithmetic instructions, they actually index a lookup table if not preceded by `pre` (EXTI)
<tpw_rules> with like 0, 1, 0xFF, and some others
<FL4SHK> 3 immediate bits???
<lkcl> immediates are one of the biggest nuisances in ISAs. the Mill uses bit-level compression
<FL4SHK> just 3?
<FL4SHK> `pre`, or EXTI, pretty much solves the issue IMO
<tpw_rules> some have 5 and some have 8 but the arithmetic are 3
<FL4SHK> 3 is very tiny
<FL4SHK> the architecture the students had to work with was 16-bit with 4-bit immediates
<FL4SHK> everyone had to provide their own encoding
<lkcl> tpw_rules: the Mill does a really fascinating compact job: kinda like Huffman Encoding but targetted at FP as well as INT
<FL4SHK> I personally love the task of building an encoding from an instruction set
<lkcl> FL4SHK: ooo, 4-bit immediates :)
<FL4SHK> ...as well as building the instruction set
<FL4SHK> it was pretty ARM-like
<FL4SHK> the instructor called it PINKY
<FL4SHK> only had a z flag
<FL4SHK> still had the set less than instruction of MIPS
<tpw_rules> lkcl: that sounds kind of bothersome? the two bit aligned ISAs i know are slow and excessively complex
<lkcl> FL4SHK: me too. i designed an instruction set based around 2-bit "groupings", in 1990.
<lkcl> tpw_rules: they have the advantage of static data allocation and it really shines for vectorisation. i don't know the full details, they have a working LLVM port
<lkcl> tpw_rules: for vectors, being able to expand a single bit "0" out to a full 8/16/32/64-bit register is a hell of a saving.
<FL4SHK> Here's what the architecture I'll be building my main computer with is like
<FL4SHK> it's a vector machine
<FL4SHK> and it treats cache lines as vectors
<FL4SHK> cache lines are used as the primary thing over registers
<lkcl> FL4SHK: ooo :)
<FL4SHK> I built a machine like this once
<FL4SHK> didn't have a data cache
<FL4SHK> did have instruction cache, though
<FL4SHK> I think it was a 32 kiB icache?
<FL4SHK> I can't remember.
<lkcl> there's... this is a known type of architecture. i forget the name.
<FL4SHK> The machine didn't have a hardware-enforced mapping from cache line number to address btw
<lkcl> FL4SHK: it's designed for 3D, right?
<FL4SHK> well, it would certainly be *good* at that
<FL4SHK> load instructions and store instructions would set the mapping of cache line number to address
<FL4SHK> and as such multiple cache lines could share the same data by way of sharing address
<lkcl> ok. the reason i ask is because in 3D (as we're finding out for LibreSOC), the workloads are basically large-amounts-of-LD, large-amounts-of-processing, large-amounts-of-ST
<FL4SHK> this type of machine that I built is called "Line Associative Registers"
<FL4SHK> such a machine hasn't been manufactured before TMK
<lkcl> there's no overlap. the data that's processed is *not* shared significantly with other batches.
<lkcl> FL4SHK: it's cool coming up with something new, isn't it? :)
<FL4SHK> I didn't come up with Line Associative Registers
<lkcl> awww
<FL4SHK> but my master's project was to create a LARs processor
<FL4SHK> I made the first one with floating point of any variety
<FL4SHK> ...but it was this bad fp format, bfloat16
<lkcl> interesting
<FL4SHK> it has type tagged registers
<lkcl> urrrr i know it
<FL4SHK> bfloat16 was only chosen for the simplicity of implementation
<FL4SHK> and simplicity of, well, testing!
<lkcl> i love type-tagged registers: it's what we'll be using in Libre-SOC.
<FL4SHK> here the type tag is set by the load or store instruction
<lkcl> oh: have you looked at the Mill?
<FL4SHK> Not yet
<FL4SHK> I know of it
<lkcl> yeeees, i was going to say, that's exactly what the Mill does :)
<FL4SHK> the Mill sounded interesting
<DaKnig> lkcl: dont forget , mov is turing complete
<lkcl> the LD operation basically specifies the width... that's it.
<lkcl> there's no ADD8, no ADD16, no FPADD16 instruction: there's just... ADD.
<FL4SHK> LARs as a concept was designed with mitigating the memory bottleneck
<DaKnig> ld+st is turing complete so you dont need actual processing :)
<lkcl> DaKnig: cool! :)
<FL4SHK> transport triggered architectures come to mind
<lkcl> FL4SHK: however you'll likely find, just like the Mill, that you'll need a "widen" and a "narrow" instruction
<FL4SHK> lkcl: the thing I built didn't have those
<FL4SHK> loads and stores set both the address and type tag of a cache line
<lkcl> which (for int) does zero/sign-extending and (for float) does FP conversion
<FL4SHK> the architecture does automatic casting
<FL4SHK> if two registers have different types
<FL4SHK> let's say you add rB and rC, storing the result in rA
<lkcl> FL4SHK: yep, totally with you. i totally get it: this is the basis of Simple-V, the Vectorisation system i invented
<FL4SHK> oh
<lkcl> and the Mill does it as well
<FL4SHK> okay then
<FL4SHK> what are `widen` and `narrow`?
<lkcl> Mill "widen" instruction: sign-extend and zero-extend. anything that was tagged as say "INT8" will be "widened" to whatever-the-widen-instruction says
<lkcl> INT16, INT32 etc.
<FL4SHK> LARs instruction sets don't have specific sign extend and zero extend instructions
<FL4SHK> an add instruction will take care of it
<FL4SHK> due to automatic casting
<lkcl> basically it uses the "tag" that originally came from the LD, and the new "tag" of the...
<lkcl> ah yes!
<lkcl> ah... but does the "ADD" instruction *contain* the new tag?
<FL4SHK> no, the tag is only set by loads and stores
<FL4SHK> loads and stores also don't necessarily access memory due to associativity
<lkcl> ok... so to do a convert from INT8 to INT32 you would need to do a "fake load of zero" into an INT32
<lkcl> *then* ADD the INT8 number to that zero-loaded register
<FL4SHK> loads being associative means you can just walk a register without needing to actually touch memory
<lkcl> which in terms of the number of opcodes and cycles is sub-optimal
* lkcl tck, tck.... thinking....
<FL4SHK> you might have a delay of like, one clock cycle
<FL4SHK> zero extension and sign extension are less necessary on this machine because you can just do 8-bit, 16-bit, 32-bit, etc. arithmetic natively
<lkcl> i'm trying to think this through... what does it mean "loads are associative"?
<FL4SHK> well
<FL4SHK> if you load from an address already in the LAR file
<FL4SHK> you don't need to read from memory
<DaKnig> how long are those registers, again?
<DaKnig> lines, whatever you called em
<lkcl> ok, yes. got it.
<FL4SHK> you just set the destination LAR to the contents of your other LAR that already has this data
<FL4SHK> a load instruction, counterintuitively, might cause you to actually write back to memory
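A very simplified model of the two behaviours FL4SHK mentions here: a load whose address is already held by another LAR copies from that LAR instead of reading memory, and a load into a dirty LAR may first write the old contents back. This is one reading of the chat, not a description of any real LARs design; the class `LarFile`, its fields, and the one-value-per-line memory are assumptions, and coherence between LARs that share an address is not modelled.

```python
class LarFile:
    """Toy model of a file of line-associative registers (hypothetical)."""

    def __init__(self, count, memory):
        self.memory = memory  # dict: address -> value (one value stands in for a line)
        self.lars = [{"addr": None, "data": None, "dirty": False} for _ in range(count)]
        self.mem_reads = 0
        self.mem_writes = 0

    def load(self, dst, addr):
        old = self.lars[dst]
        # A load can trigger a write-back if the destination holds modified data.
        if old["dirty"] and old["addr"] is not None:
            self.memory[old["addr"]] = old["data"]
            self.mem_writes += 1
        # Associative lookup: reuse data from another LAR that already holds addr.
        for i, lar in enumerate(self.lars):
            if i != dst and lar["addr"] == addr:
                self.lars[dst] = {"addr": addr, "data": lar["data"], "dirty": False}
                return
        # Miss: only now is memory actually read.
        self.mem_reads += 1
        self.lars[dst] = {"addr": addr, "data": self.memory[addr], "dirty": False}

    def write(self, dst, value):
        """Modify the contents of a LAR in place, marking it dirty."""
        self.lars[dst]["data"] = value
        self.lars[dst]["dirty"] = True


lf = LarFile(4, {0x100: 11, 0x200: 22})
lf.load(0, 0x100)   # miss: reads memory
lf.load(1, 0x100)   # hit in LAR 0: no memory read
lf.write(0, 99)
lf.load(0, 0x200)   # writes 99 back to 0x100, then reads 0x200
print(lf.mem_reads, lf.mem_writes)
```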
<lkcl> so one of those would best be "reserved" as a "always containing zeros" line, by convention at the assembly / ABI level
<FL4SHK> that's probably something you want
<FL4SHK> a zero register
<FL4SHK> DaKnig: like register widths, cache line widths, etc. in a normal architecture, that's set by whoever makes the instruction set
<lkcl> yehyeh i was thinking that.
<FL4SHK> I don't remember what my most recent LARs architecture did for that
<FL4SHK> I think I picked 256 bytes per LAR?
<FL4SHK> 64 LARs, 256 bytes each
<FL4SHK> oh, other thing, lkcl
<FL4SHK> Fully LARs-based machines have only LARs, no regular cache or registers
<FL4SHK> this includes the instruction side
<FL4SHK> you have instruction LARs and data LARs
<FL4SHK> instruction LARs get loaded via `fetch` instructions
* lkcl raises eyebrows at instruction LARs :)
<FL4SHK> ...and normally you'd want your source code to not have `fetch` instructions
<FL4SHK> software is supposed to provide a guarantee that the pipeline fetching always fetches from an ILAR
<FL4SHK> ...I'd personally say to throw an exception if there's a miss
<FL4SHK> I was planning on doing that.
<FL4SHK> here's how software provides the guarantee
<FL4SHK> get the Binutils level software inserting `fetch`es
<FL4SHK> for my next LARs processor, which is *not* the one I want to use for my main computer
<FL4SHK> I want to not have virtual memory in that machine
<lkcl> FL4SHK: yes, VM is a lot of work.
Yehowshua has joined #nmigen
<Yehowshua> DaKnig, I'm a bit late, but I have some memory/Array, FIFO, and wishbone tutorials here https://github.com/BracketMaster/nmigen-by-example
<Yehowshua> Let me know if you have any issues
* emeb makes bookmark
<FL4SHK> DaKnig: don't forget about my FIFOs that I made
<FL4SHK> those also show how to use arrays
<FL4SHK> I'm also using one of my FIFOs
<FL4SHK> I was originally not using first word fall through
<FL4SHK> but now I am
<lkcl> FL4SHK: fwft is reeaaally tricky. whitequark went to a lot of trouble to write formal correctness proofs for the FIFO classes in nmigen
<FL4SHK> lkcl: it is?
<FL4SHK> Maybe it's not first word fall through that I did, then?
<FL4SHK> I tested it
<FL4SHK> I might have implemented something other than fwft
<lkcl> FL4SHK: well... it is for most people. you appear to have a well-above-average capability in hardware design :)
<FL4SHK> like... I find CPUs much more difficult than I did that thing
<FL4SHK> or at least the types of CPUs I'm making
<FL4SHK> simple ones are easy
<lkcl> stuff that's known to be *really* hard computer science you're like, "pffh" :)
<lkcl> yehyeh
<FL4SHK> I can make a multi-cycle CPU in my sleep
<FL4SHK> ...er, by that I mean, a big freaking state machine CPU
<lkcl> FL4SHK: try a multi-issue Out-of-Order engine some time
<FL4SHK> *that's* something I haven't done before
<lkcl> yehh i went with a FSM for the early version of Libre-SOC, just to get "instructions into pipelines" without having to worry about register dependencies
<FL4SHK> I want to make a multi-core, out of order, multi-issue LARs machine
<FL4SHK> nobody has done this before
<FL4SHK> ...oh, and virtual memory
<lkcl> it took me *six months* of study with Mitch Alsup's help to understand the CDC 6600.
<FL4SHK> what kind of things are you referring to as really hard computer science?
<lkcl> OoO design for example. it's... yeah.
<FL4SHK> I don't consider that ridiculously easy
<Yehowshua> lkcl, I was pretty impressed on the AFIFO whitequark did. These are common utilities hardware designers just shouldn't have to worry about
<FL4SHK> fwft might not even be what I actually implemented
<FL4SHK> all I did was shift reads to be asynchronous
<FL4SHK> just like I've done with block RAM before
<awygle> FWIW I found the docs page of Array more confusing than helpful. Once I realized it's a list you can index with a signal it clicked.
<Yehowshua> Imagine if one day you could just do something like m.submodules.pcie = PCIe()
<lkcl> FL4SHK: ah yes that *might* be different.
<lkcl> Yehowshua: yeah, litex and fusesoc are intended to be that kind of level. and it's what we'll need
<FL4SHK> I referred to it as a first word fall through FIFO but
<FL4SHK> it still does some stuff synchronously
<FL4SHK> it does what I needed it to
<FL4SHK> Maybe first word fall through isn't what I needed at all
<lkcl> FL4SHK: does it mean that: if the FIFO is empty, and it is written to, that the data coming in is available for reading *on the same cycle*
<lkcl> that's "fwft" as best i can tell.
<lkcl> without fwft, you will always have a 1 clock cycle delay, guaranteed, between incoming and outgoing data.
<lkcl> even if the FIFO is currently empty
Yehowshua has quit [Remote host closed the connection]
<lkcl> FL4SHK: if you're looking to do Out-of-Order, i recommend looking up "Design of a Computer" by James Thornton. it's available online (free) thanks to Thornton giving permission around 2010
<lkcl> he was very old. his wife wrote a hand-written letter to the person who asked if he could put a copy of the book online
<lkcl> and if you're interested in precise exceptions, branch speculation etc. i have some augmentation chapters written by Mitch Alsup that help explain how to do that, on top of the original 6600.
<FL4SHK> don't need branch prediction for the type of thing I'm doing
<lkcl> he also showed me how to do O-o-O memory management, which would be relevant for the LARs concept
<d1b2> <TiltMeSenpai> is this "The Control Data 6600"?
<FL4SHK> the fact that software guarantees no instruction ILARs misses means you can get some other assumptions
<lkcl> that took me 3-4 weeks to understand, on its own.
<lkcl> d1b2, TiltMeSenpai: yes :)
<FL4SHK> I didn't have a delay of 1 clock cycle for reading
<FL4SHK> but I did for writing
<FL4SHK> so this is probably something else
<lkcl> what about simultaneous read-and-write, on the same clock cycle? what happens then?
<FL4SHK> the only thing that's not synchronous is reading from the array inside the FIFO
<lkcl> yes it sounds like it isn't fwft. fwft is definitely the following conditions (all on same clock cycle):
<lkcl> * FIFO is empty
<lkcl> * write occurs
<lkcl> * read occurs
<lkcl> * write "falls through" to the read port
<FL4SHK> that really doesn't sound that bad
<FL4SHK> it sounds a lot like something I've done for register files before
<lkcl> it's the sort of thing that's enough of a nuisance that people don't want to have to reinvent it (and get it wrong)
<FL4SHK> Where you'd need to read what was currently being written
<lkcl> yes, funnily enough, it's exactly the same.
<Lofty> Isn't that transparent read?
<FL4SHK> that's really not that hard to me
<lkcl> FL4SHK: yes - but it takes time, and people get it wrong, and then things break.
<FL4SHK> doesn't sound hard at all to me
<lkcl> Lofty: on regfiles? i believe so. it's kinda like having an operand forwarding bus built-in to the regfile
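The "read what is currently being written" behaviour can be sketched as a cycle-level Python model. It is a behavioural illustration only, not the Libre-SOC regfile or an nMigen Memory; the class `RegFile` and the `transparent` flag are hypothetical names.

```python
class RegFile:
    """Toy register file with one read port and one write port."""

    def __init__(self, count, transparent=True):
        self.regs = [0] * count
        self.transparent = transparent

    def cycle(self, raddr, wen=False, waddr=0, wdata=0):
        """Model one clock cycle and return the data seen on the read port."""
        if self.transparent and wen and waddr == raddr:
            # Transparent read: the value being written is forwarded to the reader.
            rdata = wdata
        else:
            # Otherwise the reader sees the value stored before this cycle's write.
            rdata = self.regs[raddr]
        if wen:
            self.regs[waddr] = wdata
        return rdata


transparent_rf = RegFile(4, transparent=True)
plain_rf = RegFile(4, transparent=False)

same_cycle_transparent = transparent_rf.cycle(raddr=2, wen=True, waddr=2, wdata=7)
same_cycle_plain = plain_rf.cycle(raddr=2, wen=True, waddr=2, wdata=7)
next_cycle_plain = plain_rf.cycle(raddr=2)
print(same_cycle_transparent, same_cycle_plain, next_cycle_plain)
```

The forwarding branch is the part lkcl compares to an operand forwarding bus built into the regfile.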
<FL4SHK> it's just a matter of dealing with "next state" stuff
<Lofty> It's not specific to register files
<Lofty> It's a property of the memory
<FL4SHK> I haven't needed a first word fall through FIFO before, though
<lkcl> FL4SHK: it took our team 2-3 weeks to write the regfiles with transparent reads, and unit tests, and formal correctness proofs.
<FL4SHK> what
<FL4SHK> why?
<FL4SHK> it took me that long for, say, my LAR file in my original LARs machine
<FL4SHK> *that* was a hard project
<lkcl> because this stuff is hard - for us - and we're not confident it would "work", so had to make sure by spending the time writing unit tests that gave us the confidence in the code
<lkcl> i don't think you fully appreciate: you really do have a well-above-average level of competence in hardware design :)
<lkcl> that was a compliment btw
<FL4SHK> oh, well, thanks
<whitequark> Lofty: that's a transparent read if you look at a memory alone
<whitequark> but it's called first-word fallthrough on a FIFO
<whitequark> same concept though
<Lofty> Ah, I see, thank you
<awygle> isn't it still FWFT even if it has latency >0, as long as you don't have to tick the output port to get the new data to show up?
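The distinction awygle raises can be sketched with a toy model in which the only difference between the two modes is whether a read strobe is needed before the head word shows up on the output. This is an assumption-laden illustration, not nMigen's SyncFIFO; `FifoModel`, `tick` and `r_data` are hypothetical names, and the model has one tick of latency in both modes.

```python
from collections import deque


class FifoModel:
    """Toy FIFO comparing FWFT and non-FWFT output behaviour."""

    def __init__(self, fwft):
        self.fwft = fwft
        self.queue = deque()
        self.r_data = None  # value currently visible on the output port

    def tick(self, w_en=False, w_data=None, r_en=False):
        """Advance one clock edge."""
        if r_en and self.queue:
            popped = self.queue.popleft()
            if not self.fwft:
                # Non-FWFT: data only reaches the output after a read strobe.
                self.r_data = popped
        if w_en:
            self.queue.append(w_data)
        if self.fwft:
            # FWFT: the head of the queue is always shown; r_en just consumes it.
            self.r_data = self.queue[0] if self.queue else None


fwft_fifo = FifoModel(fwft=True)
plain_fifo = FifoModel(fwft=False)

fwft_fifo.tick(w_en=True, w_data=5)
plain_fifo.tick(w_en=True, w_data=5)
fwft_after_write = fwft_fifo.r_data     # visible without any read strobe
plain_after_write = plain_fifo.r_data   # nothing on the output yet

plain_fifo.tick(r_en=True)
plain_after_read = plain_fifo.r_data
print(fwft_after_write, plain_after_write, plain_after_read)
```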
peepsalot has joined #nmigen
<FL4SHK> lkcl: to me, a register file is something you don't even really need to verify
<FL4SHK> other than by looking at the source code
<lkcl> FL4SHK: certain industries absolutely cannot take the author's word for it - or the source code.
<lkcl> hence this:
<FL4SHK> lkcl: I don't think it takes much to formally verify a register file
<FL4SHK> like, 2 to 3 weeks is a *lot*
<FL4SHK> a LAR File, on the other hand
<FL4SHK> that's difficult to formally verify
<FL4SHK> or at least it was for me
<FL4SHK> ...I'd probably have an easier time with it today
<FL4SHK> LAR files are not simple like caches
<lkcl> indeed. as will be the OoO Dependency Matrices.
<lkcl> when you have an out-of-order design, formal correctness proofs - that data has been correctly read/written in the right order to the register file - become far more challenging
<FL4SHK> when you say formal verification, do you mean with yosys?
<FL4SHK> because that's the definition I was using
<lkcl> symbiyosys - yes.
<lkcl> which uses SAT solvers like yices2, etc., yes
<lkcl> i have been thinking about how to verify the OoO Dependency Matrices for some time.
<lkcl> how to guarantee that the instruction issue order is the same as the completion order *where it actually matters*
<FL4SHK> I'd write everything in nMigen at this point
<lkcl> because - haha - in some cases it doesn't matter. yes, that's what we're doing. everything's in nmigen.
<FL4SHK> I'll need to study up on out of order machines
<lkcl> "add r1, r2, r3" and "add r4, r5, r6" do *not* matter what the completion order is because there's no dependency hazards
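A small sketch of the check behind lkcl's example, assuming the operand order is destination first, then two sources. The function `hazards` is a hypothetical helper; it classifies the usual read-after-write, write-after-read and write-after-write register hazards between an earlier and a later instruction.

```python
def hazards(first, second):
    """Return the set of register hazards between two instructions.

    Each instruction is a tuple (dest, src1, src2); `first` is earlier in
    program order than `second`.
    """
    dest1, *srcs1 = first
    dest2, *srcs2 = second
    found = set()
    if dest1 in srcs2:
        found.add("RAW")  # second reads what first writes
    if dest2 in srcs1:
        found.add("WAR")  # second overwrites what first still has to read
    if dest1 == dest2:
        found.add("WAW")  # both write the same register
    return found


independent = hazards(("r1", "r2", "r3"), ("r4", "r5", "r6"))
dependent = hazards(("r1", "r2", "r3"), ("r4", "r1", "r6"))
print(independent, dependent)
```

An empty set means the two instructions may complete in either order, which is the case for the pair quoted above.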
<FL4SHK> I understand that much
<lkcl> FL4SHK: the "normal" algorithm - the one that everyone quotes - is the Tomasulo Algorithm.
<FL4SHK> I've heard of it
<lkcl> there's a really good video on youtube by an indian guy, who explains it really well
<FL4SHK> but I don't know its details
<FL4SHK> since I'm unaware of its details, I might come up with my own thing
<FL4SHK> it'd be fun to say "hey, look, this is my own algorithm"
jeanthom has quit [Ping timeout: 240 seconds]
<lkcl> once you understand that, i can point you at a page which allows understanding of the precise capability of the (augmented) 6600.
<lkcl> :)
<FL4SHK> I don't think I want to see the Tomasulo Algorithm
<lkcl> there's some things you definitely need to think through.
<lkcl> do you want interrupts to be serviceable immediately?
<FL4SHK> if it's a machine with out of order execution, probably not
<lkcl> do you want multiple LOAD/STOREs to be done in parallel without data corruption?
<lkcl> FL4SHK: actually, the precise-augmented 6600 *can* handle interrupt-servicing immediately, because there's a way to cancel outstanding in-flight instructions
<FL4SHK> What does "outstanding" mean in this case?
<lkcl> "work that's in pipelines or waiting to be put *into* pipelines that hasn't hit the register file yet"
<lkcl> aka "in-flight"
<FL4SHK> I'll need to think about what I should do for micro ops
<lkcl> some OoO designs use a "rollback history" system. others "hold off" from writing anything that could cause "damage"
<FL4SHK> I don't want to study existing ideas for micro ops
<FL4SHK> nooo don't tell me
<lkcl> micro-ops according to Mitch Alsup is... haha :)
<lkcl> you really do want to discover this stuff for yourself, don't you? :)
<FL4SHK> yes
* lkcl zzip. with extra gaffa tape.
<lkcl> mmMmmh, mmhmhhh!
<lkcl> if you get stuck just ask.
<FL4SHK> things that I don't outright need to know like what the virtual memory system needs to do for OSes to work
<FL4SHK> I don't know if I want to know much in advance.
Yehowshua has joined #nmigen
<Yehowshua> I guess it's meeting time?
<lkcl> FL4SHK: what will be fascinating is if you document all of this and put it online as libre software
<lkcl> oh?
<lkcl> oh!
<whitequark> yep, meeting time
<FL4SHK> software, eh
<FL4SHK> I thought this was hardware!!!
<FL4SHK> the reality is that hardware and software might as well be the same thing...
<jfng> o/
<whitequark> awygle? ktemkin?
<Yehowshua> yeah - have you heard of https://downloadmoreram.com?
<Yehowshua> its hardware software
<FL4SHK> you can actually download more RAM into an FPGA by making it out of logic
<FL4SHK> Array is my RAM
<FL4SHK> FL4SHK's Programmable Gatorade
<Yehowshua> Salut jfng
<whitequark> okay, my agenda items: status update, #381, #355, Record split
<whitequark> jfng, anything from you on the SoC side?
<Yehowshua> Thanks lkcl
<jfng> no, maybe discussing #20 and #10
<whitequark> alright, we can do that
<whitequark> no, nmigen-soc #10 and #20
<jfng> lkcl: i meant nmigen-soc issues
<lkcl> jfng: doh :)
<whitequark> i'll begin with the status update. not much to report; i've been looking into cleaning up cxxsim and getting it into master proper
<whitequark> that will probably take a little more time
<whitequark> the main issue i'm having is organizing the simulator guts; there are a bunch of things that are shared (the public interface essentially) and a bunch of things that are similar but not really shared exactly
<whitequark> right now there are two Simulator classes that inherit from the same "core" class
<whitequark> i think that's not a particularly great design; it's confusing to which one you refer, it's easy to have their interfaces accidentally diverge, it requires you to substitute or rename imports
<Yehowshua> So I remember sometime ago that you can express new logic **in** simulation
<Yehowshua> How does CXX handle that?
<whitequark> oh?
<whitequark> can you elaborate?
<Yehowshua> Yes - you could do a.eq(b) in a process
<Yehowshua> Lemme find an example
<whitequark> oh right
<whitequark> when you do that in a process, it doesn't add logic; it executes once and instantaneously
<whitequark> more like a regular assignment
<lkcl> whitequark: yes. we've already started doing this:
<lkcl> cxxsim = True
<lkcl> if cxxsim:
<lkcl>     from nmigen.sim.cxxsim import Simulator, Settle
<lkcl> else:
<lkcl>     from nmigen.back.pysim import Simulator, Settle
<whitequark> lkcl: yeah so that sucks imo
* cr1901_modern is here in read only mode mostly
<whitequark> i mostly did it because i wanted to get something out for you folks to test
<lkcl> whitequark: really appreciated
<lkcl> the usual "solution" is a Factory class system
<whitequark> what i think would be a better design is having a single Simulator, all the commands, etc, and the Simulator would take a SimulationEngine that would actually implement it
<whitequark> the Engine would be mostly or completely opaque to user code
<whitequark> so it'd be something like `from nmigen.sim import PythonEngine, CxxEngine`
<whitequark> `Simulator(engine=CxxEngine)`
<whitequark> or maybe
<whitequark> `Simulator(engine=CxxEngine(builder=user_build_fn))`
<whitequark> (for people who want to customize exactly how the C++ code is built)
<Yehowshua> This feels natural `Simulator(engine=CxxEngine)`
<whitequark> right but it doesn't let you pass options to the engine
<whitequark> maybe `Simulator(engine=CxxEngine())`
<Yehowshua> yeah
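A minimal sketch of the "one Simulator, pluggable engine object" design proposed above. Every class here is a hypothetical stand-in written for illustration, not nMigen's actual API; it only shows why passing an engine instance (rather than a class or a string) leaves room for per-engine options such as a custom builder.

```python
# Hypothetical sketch; none of these classes are nMigen's real API.
class PythonEngine:
    """Stand-in for a pure-Python simulation backend."""
    name = "python"

    def run(self, design):
        return f"{self.name} engine ran {design}"


class CxxEngine:
    """Stand-in for a C++ backend; `builder` customizes how the code is built."""
    name = "cxx"

    def __init__(self, builder=None):
        # A real engine would pull in ctypes or cffi here; this sketch does not.
        self.builder = builder or (lambda source: f"default build of {source}")

    def run(self, design):
        return f"{self.name} engine ran {design} ({self.builder(design)})"


class Simulator:
    """Single user-facing class; the engine stays opaque to user code."""

    def __init__(self, design, *, engine=None):
        self.design = design
        self.engine = engine if engine is not None else PythonEngine()

    def run(self):
        return self.engine.run(self.design)


print(Simulator("top").run())
print(Simulator("top", engine=CxxEngine()).run())
print(Simulator("top", engine=CxxEngine(builder=lambda s: f"custom build of {s}")).run())
```

Because the engine is an object the caller constructs, a backend with heavy imports is only loaded by code that actually asks for it.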
<lkcl> that'd work. it's one step away from a full "class Factory" (where engine is a string and the Factory class looks that up in a dictionary)
<whitequark> there is a good reason I don't want to make the engine argument a string
<whitequark> the reason is that importing CxxEngine, right now, pulls in Python's ctypes
<whitequark> but... that's unlikely to work all that well on PyPy, and I know you folks use PyPy among other things
<whitequark> PyPy needs cffi, but that's an external dependency
<lkcl> yuk
<lkcl> oh
<whitequark> well, it doesn't strictly speaking *need* cffi
<whitequark> it just works much faster with cffi
<whitequark> and this is hella important, because the ctypes overhead is massive
<lkcl> i meant to say: i had a suggestion instead of using ctypes?
<awygle> i'm here. sigh.
* lkcl waves to awygle
<whitequark> it is in fact so large that cxxsim is only ~2x faster than pysim!
<whitequark> where in reality it should be something like ~100x faster
<lkcl> the idea is: at the same time as auto-generating the cxxsim.cc, actually auto-generate a c-based python module that matches it.
<Yehowshua> So its not very hard to write driver code around CXXRtl directly in C++.
<whitequark> lkcl: yeah and that would work even worse on pypy
<Yehowshua> I'm also not opposed to writing such drivers
<lkcl> rather than try to use swig (or other c-to-python wrapper), just auto-generate the c module.
<lkcl> whitequark: sigh :)
<Yehowshua> Like for a large design, the user could double down and write the driver code themselves
<whitequark> Yehowshua: i'm quite certain we can speed things up
<whitequark> i'm not presenting this to you as some sort of insurmountable problem
<whitequark> i'm merely describing how much overhead ctypes has
<Yehowshua> Ah
<whitequark> i haven't even tried solving this so far in the code that i wrote
<lkcl> Yehowshua: true. we just discussed that yesterday, how simulated verilator peripherals are written in c
<whitequark> simulated cxxrtl peripherals (aka cxxrtl black boxes) are, naturally, written in c++
<lkcl> i wasn't suggesting *replacing* the use of ctypes. but... a speed up of 50x, if achievable by not using ctypes is possible for /usr/bin/python3, that's quite compelling
<whitequark> a speed up of 50x would, i think, only be possible by ditching python completely
<whitequark> i.e. have the simulation call back into python only when it actually needs something from python
<Yehowshua> hmm... what about pypy... Does it have CTypes support?
<Yehowshua> Jitted pypy gets quite fast
<whitequark> basically, nmigen would tell cxxrtl "here is your clocks, and here is the trigger you need to call me back on, and now do this all on your own"
<whitequark> Yehowshua: yes but ctypes has some design problems that prevent pypy from being efficient with it
<whitequark> which is one reason for cffi's existence
<cr1901_modern> I'm confused... Is the "only 2x faster" thing a new regression? I could've sworn cxxsim was an order of magnitude faster previously...
<whitequark> cr1901_modern: it's only 2x (on minerva, the speedup will be higher on larger designs) faster when used through nmigen
<whitequark> let me rephrase
<whitequark> cxxrtl is 100x faster than pysim, but cxxsim (nmigen's cxxrtl integration) is 2x faster
<cr1901_modern> ahhh okay
<lkcl> with LibreSOC, we're seeing about... 10-20 instructions per second executed in pysim
<lkcl> that's without the IEEE754 FPU added
<whitequark> what about cxxsim?
<lkcl> we're running into that ready/valid bug, on every aspect of the design
<whitequark> right but i don't think it should affect how fast the design runs
<lkcl> ready/valid signalling is a core aspect of the data "management" because it's an OoO design rather than a simple, straightforward pipeline
<whitequark> (the problem is virtually certainly on python side)
<lkcl> none of the unit tests pass, so i can't get it running to the point where i can tell how fast it is
<whitequark> mm, okay
<lkcl> because they _all_ use ready/valid style communication, unfortunately.
<whitequark> btw Yehowshua could you minimize the testcase further? that would really speed up the process of fixing the bug
<whitequark> it's somewhat subtle
<Yehowshua> Yes. I was having Michael work on that
<whitequark> great
<lkcl> Yehowshua: cutting out the actual "shift" should do the trick and just have a countdown.
<whitequark> anyway, let's go on to the next item
<Yehowshua> I'm curious about nmigen-soc
<whitequark> yeah?
<Yehowshua> What's the vision? How does it compar/complement LiteX?
<Yehowshua> **compare
<lkcl> and with OpenPITON?
<whitequark> nmigen-soc is a SoC *gateware toolkit*
<whitequark> so it gives you all the buses, and it gives you a BSP generator
<whitequark> but it doesn't, for example, know how to build firmware
<Yehowshua> OK
<whitequark> and it doesn't have a BIOS, it doesn't have any preference on what your language is
<whitequark> could be C, C++, Rust, whatever
<whitequark> LiteX could (in principle) be built on top of nmigen-soc
<Yehowshua> I see now
<whitequark> if/when LiteX migrates to nmigen
<lkcl> whitequark: it sounds very much like what we need to complete Libre-SOC.
<lkcl> if you're familiar with OpenPITON, they can specify the full spec of (an) SoC as a JSON file.
<lkcl> templating *shudder* then creates the ennntiirre SoC including AXI4 bus infrastructure
<Yehowshua> Looking at issue 10, I see that the peripherals would be asynchronous - as in able to cross multiple clock domains
<lkcl> it sounds to me like nmigen-soc would be the "bedrock" of a nmigen equivalent of that
<whitequark> yeah
<lkcl> how do CSRs work? they're just registers (in effect) but named so that, if needed, they can be "addressed" by wishbone/AXI4?
<whitequark> yeah more or less
<lkcl> ok
<awygle> correct me if i'm wrong, but it seems like nmigen-soc is a placeholder and a sketch right now, and there's a fair bit of design work to be done before it's "finalized". is that accurate?
<jfng> yes
<awygle> hi jfng, was wondering if you were here :)
<whitequark> awygle: i think the parts that are already there are reasonably functional, and shouldn't radically change
<whitequark> but there are a lot of things missing
<lkcl> oh.
<jfng> you can already use parts of it to build SoCs (e.g. the busses part)
<whitequark> so i wouldn't say it's a placeholder (nmigen-stdio is, though), but it's definitely unfinalized
<Yehowshua> So jfng, I've talked to you a bit before. awygle, haven't really said hi to you before - so hello.
<jfng> what is currently lacking is the integration tools
<awygle> howdy :D
<FL4SHK> calculus seems a little out of the ordinary for hardware
<lkcl> right. one thing that we need for Libre-SOC is a way to check if a particular address is valid
<FL4SHK> except not really
<lkcl> *before* actually issuing the request.
<FL4SHK> sorry, just a joke
* FL4SHK leaves again
<lkcl> FL4SHK :)
<whitequark> the main reason i haven't spent a lot of time on stdio/soc yet is because we don't have streams or interfaces yet
<cr1901_modern> I remember a long time ago (~1 year ago) we discussed "how do we build firmware for nmigen SoCs"? Is the idea now that nmigen-soc will _not_ handle building at all?
<whitequark> which is why those are on the agenda today, among other things
<jfng> a subset of which (peripherals) is the current focus of development
<lkcl> this because we're doing an out-of-order design and there will be multiple (parallel) memory requests outstanding
<FL4SHK> what do you mean by interfaces whitequark?
<whitequark> FL4SHK: we'll get there in a bit :)
<FL4SHK> might be SV interfaces
<jfng> (hi awygle :) )
<FL4SHK> but I find that classes do that job mostly
<whitequark> what i have in mind is not intentionally related to SV interfaces
<FL4SHK> ah
<whitequark> right, let's actually go through the items, ok?
<whitequark> #381, which awygle raised earlier today
<_whitenotifier-b> [nmigen] awygle edited issue #342: Separate `Record` into `PackedStruct` and `Interface` components - https://git.io/Jv7U7
<whitequark> please read the discussion in https://github.com/nmigen/nmigen/issues/381
<whitequark> it's short and i don't want to rehash it here
<whitequark> some more context: the main reason this mechanism exists / is being discussed right now is an annoying quirk of oMigen
<lkcl> whitequark: did you ever consider (as is done in Chisel3 and BSV) adding "direction" at the lowest level (Value, Signal)?
<whitequark> it had Records, which were just plain classes, and which you couldn't assign to nor use in expressions directly
<whitequark> lkcl: patience
<whitequark> we'll get to that
<lkcl> whitequark: :)
Yehowshua42 has joined #nmigen
<whitequark> there's like three layers of infrastructure to build before we get there
<awygle> i had the impression from the discussion on #381 that there was a reasonable amount of consensus for https://github.com/nmigen/nmigen/issues/381#issuecomment-624212080
<whitequark> anyway, to use oMigen records, you had to access either their fields or .raw_bits()
<whitequark> argh
<whitequark> sorry, wrong link
<whitequark> (but we'll discuss #381 too)
<awygle> oh, yes, much more controversial lol
<Yehowshua42> Oh!
<Yehowshua42> I was like #381 is not short!
<whitequark> yeah sorry :/ we should get one of those IRC bots that convert links to titles
<Yehowshua42> lkcl will be replaced
<whitequark> anyway, I wanted to make nMigen records something that you could use like you use any ordinary value
<lkcl> Yehowshua42: lol
Yehowshua has quit [Ping timeout: 245 seconds]
<whitequark> initially, Record was a direct subclass of Value, and treated specially everywhere
<whitequark> this was somewhat controversial because people tried to inherit from it, and I didn't really want to support that
<lkcl> whitequark: yeah. if it was vhdl it would not be possible. however: python, go figure. it just feels... "natural" to inherit (and extend).
<whitequark> eventually what I arrived at is making an "UserValue", which is a special Value you *can* inherit from, under the condition that it always lowers to some other nMigen value
<whitequark> which is what Record does (it lowers to a concatenation of its fields)
<lkcl> i remember we had quite a discussion about the implications of modifying inherited instances after they'd been used (once)
<whitequark> this was insightful because there are clearly quite a few more use cases for this kind of thing
<Yehowshua42> So if I understand correctly, UserValue is its own thing
<awygle> yes, the ability to "plug in" custom data types to the nmigen infrastructure is very useful
<whitequark> unfortunately UserValue has a fatal flaw
<whitequark> which is to say, it inherits from Value, and Value has a ton of methods with various names, and is getting regularly expanded
<whitequark> what this means in practice is that, unless we fix the flaw, we can never add methods to Value
<whitequark> because it'll break user code that uses the same names e.g. in record fields
<whitequark> I can definitely foresee someone using a field called "shift_left"
<lkcl> ... which pollutes the namespace into which you'd consider adding "things"
<whitequark> yes
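The namespace hazard described above can be reproduced in plain Python. This is a toy sketch: `Value` and `UserRecord` are hypothetical stand-ins, and the only point is that a field named like a base-class method shadows that method on the instance.

```python
class Value:
    """Toy stand-in for a base class that gains methods over time."""

    def shift_left(self, amount):  # imagine this was added in a later release
        return f"({self!r} << {amount})"


class UserRecord(Value):
    """User type that exposes its fields as attributes."""

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)  # silently shadows any base-class method

    def __repr__(self):
        return "UserRecord"


rec = UserRecord(shift_left=1, data=0xAB)
# The field wins: the inherited method is no longer reachable on this instance.
print(callable(rec.shift_left))    # False
print(callable(Value.shift_left))  # True
```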
<Yehowshua42> Is it possible to have something that doesn't inherit from value but always lowers to value? Maybe the user can define how it lowers?
<whitequark> that is precisely what #355 is about
<lkcl> i encountered this problem when creating RecordObject. i "fixed" it by overriding __getattr__
<lkcl> 1 sec
<FL4SHK> I derive from `Record`s often
<lkcl> actually __setattr__
<lkcl> https://git.libre-soc.org/?p=nmutil.git;a=blob;f=src/nmutil/iocontrol.py;h=bdbd055e41cf2a0f3337ab2aa66d2ee4d127d0dc;hb=aeed18a63d687cdaa9c00b98d46c66583fef6e2d#l94
<Yehowshua42> one thing I'd add to the list of things to discuss if we have time is bringing industry attention to nMigen
<lkcl> by over-riding __setattr__ it becomes possible to check if the thing being added is already in use ("shift_left")
<awygle> FL4SHK: the problem with that is we might break all your code because of a totally unrelated change in a new version of nmigen
<whitequark> yep
<lkcl> and to raise an exception
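A rough sketch of the `__setattr__` check described above, assuming a toy `Value` base class. This is not the code from the linked nmutil file; `CheckedRecord` is a hypothetical name used only to show the idea of rejecting field names that already exist on the class.

```python
class Value:
    def shift_left(self, amount):
        return ("shl", self, amount)


class CheckedRecord(Value):
    """Hypothetical record that refuses field names already used by its bases."""

    def __setattr__(self, name, value):
        if hasattr(type(self), name):
            raise AttributeError(
                f"field name {name!r} clashes with an existing attribute")
        super().__setattr__(name, value)


rec = CheckedRecord()
rec.data = 0xAB  # fine: no attribute of that name on the class

clash = None
try:
    rec.shift_left = 1  # rejected: Value already defines shift_left
except AttributeError as exc:
    clash = str(exc)
print(clash)
```

The check turns a silent shadowing bug into a loud error, but it still breaks user code whenever the base class grows a new method.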
<whitequark> lkcl: but it would be better if this problem didn't exist in first place
<lkcl> whitequark: true :)
<whitequark> anyway, Yehowshua42 has the right solution here: a separate class that doesn't inherit from Value, but which is *castable* to Value
<whitequark> we didn't already have that implemented because... UserValue predates Value.cast IIRC
<lkcl> oo intriguing
<lkcl> i like it
<whitequark> there are a few things that we should decide
Yehowshua42 has quit [Quit: Ping timeout (120 seconds)]
<whitequark> (1) what should it be called? UserValue is a bad name because it'll also be used internally in nMigen
<whitequark> already used in fact
<awygle> trait To<Value>
<awygle> i'm obviously joking about the specifics but i think that's the general tone we should shoot for. i don't _love_ CustomValue although i'd be OK with it
<whitequark> yeah, same
<whitequark> I'm open to better names
<whitequark> ValueCastable?
<awygle> ToValue, Castable, AsValue, Lowerable
<whitequark> to go with Elaboratable?
* lkcl chooses engineering-style names
Yehowshua has joined #nmigen
<jfng> UserValue.as_value() seems redundant
<Lofty> I think ValueCastable works quite well, actually
<Yehowshua> I agree with Lofty
<lkcl> oh wait... so the idea is similar to Elaboratable in concept?
<awygle> it's a bit wordy but it's probably the clearest
<Lofty> I'm willing to take wordiness for clarity
<Yehowshua> Same
<whitequark> lkcl: yeah, kinda. Elaboratable is an interface that lowers to a Fragment (usually a Module when you write it); ValueCastable lowers to Value
<awygle> i'd say yes. it's a marker class in the same way that Elaboratable is
<lkcl> ValueCastable... yyeah.
<whitequark> excellent, let's go with that name then
<whitequark> the cast method should be .as_value I'm pretty sure, we already have the convention going
<Yehowshua> Defining how something lowers - is this a new idea? Does some other HDL have that?
<whitequark> .as_signed()
<awygle> yeah strong agree with as_value
<Lofty> Yep, as_value works nicely.
<cr1901_modern> +1 to as_value
<lkcl> is that to be implemented *by* each inheritor of ValueCastable?
<whitequark> Yehowshua: not sure, but it seems straightforward
<whitequark> lkcl: correct
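A minimal sketch of the castable-but-not-a-Value idea agreed on above, using toy classes. The names `ValueCastable`, `as_value` and `Value.cast` come from the discussion; the bodies are assumptions for illustration, and `Pair` is a hypothetical user type.

```python
class Value:
    """Toy value type; real nMigen values are far richer."""

    def __init__(self, expr):
        self.expr = expr

    def rotate_left(self, amount):
        return Value(f"rotl({self.expr}, {amount})")

    @staticmethod
    def cast(obj):
        if isinstance(obj, Value):
            return obj
        if isinstance(obj, ValueCastable):
            return obj.as_value()
        raise TypeError(f"{obj!r} cannot be cast to Value")


class ValueCastable:
    """Marker base class: each subclass says how it lowers to a Value."""

    def as_value(self):
        raise NotImplementedError


class Pair(ValueCastable):
    def __init__(self, hi, lo):
        self.hi, self.lo = hi, lo
        # No clash with Value.rotate_left, because Pair is not a Value.
        self.rotate_left = "a field, not a method"

    def as_value(self):
        return Value(f"cat({self.lo}, {self.hi})")


pair = Pair("a", "b")
print(Value.cast(pair).rotate_left(10).expr)  # rotl(cat(b, a), 10)
```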
<d1b2> <TiltMeSenpai> .as_value() feels very rust-y, I like it
<Lofty> I think there are a fair few people with Rust experience here
<whitequark> TiltMeSenpai: nmigen is generally fairly rusty; not overwhelmingly so, but I do borrow ideas
<d1b2> <TiltMeSenpai> yeah
<Yehowshua> Yeah - it is straight forward. So much so that I'm wondering why didn't someone else think of this?
<awygle> then it should be to_value :p
<Yehowshua> Rust is a bootiful language
<awygle> or into_value
<awygle> rather
<lkcl> mmmm then i can see that getting "old" quite quickly - enough so that people start putting it into classes that they then inherit from
<whitequark> into_* is about ownership
<awygle> i know, i'm joking
<whitequark> right ok
<awygle> (mostly joking, partially lamenting python's weak type system)
<awygle> either way nothing productive
<lkcl> one of the really nice things about Record (and RecordObject): there's one function (the constructor) and that's it
<whitequark> lkcl: i don't expect there will be many direct subclasses of ValueCastable
<Yehowshua> python is a snek. sneak's have no bones...
<whitequark> but i don't see anything wrong with subclassing it further
<Yehowshua> **snakes - commenting on weak types
<Lofty> I mean, nMigen tries to compensate for the type system as much as possible
<whitequark> it's completely your code, do whatever you want, nmigen will take it
<lkcl> whitequark: oh this is "internal to nmigen" classes we're talking, rather than user-created ones?
<whitequark> nope, ValueCastable is something downstream code would use
<whitequark> your RecordObject would probably inherit from it
<whitequark> eventually
<whitequark> nmigen Record would inherit from it
<awygle> somebody somewhere needs to implement the lowering, and we should expose that
<Lofty> This really does sound like a trait at this point :P
<lkcl> ok. right. and then RecordObject would be used (without having every instance that inherits from RecordObject have its own to_value)
<whitequark> Lofty: it *is* a trait
<whitequark> lkcl: yes
<awygle> if you want Record lowering semantics you should be able to inherit from Record
<lkcl> okaaay
<whitequark> (Record is going away though)
<whitequark> (but yeah)
<awygle> well yes, but it's the only example we all have experience with currently
<whitequark> yeah
<Yehowshua> as in will be deprecated?
<whitequark> yes
<awygle> nice segue :p
<Lofty> And eventually removed
<whitequark> everything public goes through a deprecation cycle of at least one release
<Yehowshua> Well - you have to do what you have to
<whitequark> more if it turns out to be a major issue for downstream code
<Yehowshua> Just own it like Apple killing CD drivers
<Yehowshua> **drives
<whitequark> we don't *just* break downstream code if we can do it at all
<whitequark> *avoid it at all
<awygle> i linked this before but the proposal re: Record is summarized here https://github.com/nmigen/nmigen/issues/342#issuecomment-656390339
<lkcl> Yehowshua: we may have to take a copy of Record and maintain it externally (in nmutil) at the crossover/deprecation point
<whitequark> we have one more aspect on ValueCastable
<awygle> oh sorry
<Yehowshua> lkcl - and or eventually re-write to use valuecastable
<whitequark> it is related to an edge case of (incorrect) user code returning different things when .as_value() is called multiple times
<Yehowshua> In fact - I think re-writing codebases every once in a while is a good exercise
<Yehowshua> Albeit painful
<lkcl> Yehowshua: it depends on timing of the Oct 2020 tape-out
<lkcl> whitequark: oh?
<Yehowshua> Yeah - not for a while
<whitequark> if you return different results from .as_value() when it is called (by nmigen, during casting) multiple times
<whitequark> you can end up with wildly internally inconsistent ASTs
<whitequark> and things will break in a confusing way well after the fact
<Lofty> I want to just say "this is undefined behaviour", but obviously that's not a solution; how could one catch a situation like that?
<awygle> is this something we can check for? we have the lazy lowering stuff in the current UserValue
<lkcl> this is similar to the original discussion we had for RecordObject: what happens if someone adds things to the RecordObject *after* constructor time?
<awygle> we could check for equality there
<whitequark> awygle: we can't check for equality
<lkcl> except the problem's now moved to to_value()
<whitequark> Values override ==
<awygle> it being python you can still override __eq__ but then it should be obvious you're making things worse
<Lofty> This seems like a "murphy versus machiavelli" problem, almost.
<whitequark> no no, .as_value() returns a Value, and Value does override __eq__
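To illustrate why a plain `==` cannot be used to compare two lowered results: in an HDL-style value type, `__eq__` builds a comparison expression instead of answering True or False. The class below is a toy stand-in, not nMigen's `Value`.

```python
class Value:
    def __init__(self, expr):
        self.expr = expr

    def __eq__(self, other):
        # Builds a comparison expression rather than answering True/False.
        return Value(f"({self.expr} == {other.expr})")

    # Assumption for this sketch: keep instances hashable by identity.
    __hash__ = object.__hash__


first = Value("a")
second = Value("b")
result = first == second
print(type(result).__name__, result.expr)  # Value (a == b)
```

Any detection scheme therefore needs some other notion of sameness, such as object identity or a structural comparison of the expression tree.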
<cr1901_modern> whitequark: Not saying returning different results is a good idea, but... why can't that be caught internally in nmigen via an isinstance() dance?
<whitequark> uh, how would isinstance() help?
<Lofty> They're all instances of Value, right?
<awygle> mm i guess Value.cast is static so you can't really store previous results there, you'd have to do it internal to the ValueCastable, which means the user can still do Bad Things
* lkcl wonders two things. (A) is it Officially Nmigen's Problem at all (B) if it is, can hashing of the AST be done, keeping a dictionary of first-usage and comparing it against subsequent uses?
<awygle> or that
<whitequark> (a) yes, it's a footgun, (b) that's not trivial, hence the discussion
<whitequark> let me explain the options we have
<whitequark> and why they're all bad
<cr1901_modern> Oh, the "child" type is erased when as_value() is called
<whitequark> awygle: i'm not being defensive against the user doing Bad Things on purpose; that is not possible in Python
<whitequark> (or even in Rust really, though it is harder there)
<Lofty> <Lofty> This seems like a "murphy versus machiavelli" problem, almost.
<lkcl> Lofty: lol
<whitequark> (you can already shell out to gdb and change private fields in your program without UB)
<whitequark> can always*
<awygle> i copy, i meant "without realizing, while thinking they're doing the right thing"
<whitequark> what i'm defending against is *accidental misuse*
<whitequark> yes
<whitequark> so there are two general options we have for this case
<whitequark> detect or prevent
<Lofty> I'm wondering if there's some feasible way of evaluating as_value exactly once.
<whitequark> and there are two places we can do it in
<whitequark> as_value() itself, and Value.cast()
<whitequark> detection would mean memorizing the value when it's first returned, and comparing them the next time the function is called
<whitequark> this seems pointless: it's strictly more work than prevention
<whitequark> okay, it's not actually pointless, it eagerly shows that there is a bug in the user code
<Yehowshua> I'm scratching my head on how to implement prevention
<whitequark> prevention in this case would mean memorizing the value when it's first returned, and then never calling the user function again
<lkcl> agreed: if you're going to go to the trouble of detection, and you *know* it's going to cause problems in 100% of cases, logically that suggests prevention
<Lofty> So prevention would be "evaluate exactly once"
<lkcl> mmm except...
* lkcl thinks
<whitequark> yes. we currently do that. but it's not a very good implementation
<lkcl> someone calling it twice would go "why doesn't it do what i want the second time??"
<awygle> i actually think detection is better than prevention
<awygle> yes, for that reason
<awygle> detection will loudly tell you you've made a mistake. prevention will silently do _something_ which may or may not be right
<lkcl> unless.. haha, you actually monkey-patch the module and **REMOVE** the to_value function after it's first called :)
<Lofty> lkcl: machiavelli
<Yehowshua> Why not combine detection and prevention
<Yehowshua> That is prevent
<Yehowshua> But then inform the user you prevented
<lkcl> or, you monkey-patch it to replace it with "don't call this again, here's why"
<Lofty> If a detection trips, it should be fatal
<lkcl> *or*...
<whitequark> it should be a hard error if we detect at all
<Lofty> How do you know which one is the intended result?
<whitequark> no monkey patches
<lkcl> you use an over-ride on __getattr__ which checks if the thing being accessed is named "to_value"
<whitequark> no overrides
<whitequark> no metaclasses
<awygle> what we need is linear types basically, but we can't have those. currently i like the memoization option the best.
<whitequark> no weird junk people will get confused by
<cr1901_modern> I don't see what's wrong with memoization
<whitequark> yes, let me explain
<whitequark> so the way memoization would be implemented is by adding a private field on the user class (it will end up being named _Value__casted or something) in Value.cast
<whitequark> that's fine
<whitequark> we can then either detect or prevent or whatever
<whitequark> the problem is that .as_value() is a public function
* lkcl thinks...
jeanthom has joined #nmigen
<lkcl> oh. i wonder if, just like in Elaboratable detects "def elaborate", if it's possible to require that a base-class function be called
<whitequark> and, suppose one has a PackedStruct and one wants to rotate it or shift it for whatever reason
<lkcl> that base-class function will set the flag "to_value_has_been_called"
<whitequark> so you'd write packed_struct.rotate_left(10) but that doesn't work cuz it's not a Value
<FL4SHK> I have a concern
<FL4SHK> will I have to change my existing code that uses `Record`?
<whitequark> FL4SHK: eventually, yes, because Record will be deprecated and removed
<whitequark> for now, no, there will be a compat shim
<FL4SHK> How often do you plan on doing breaking changes?
<Yehowshua> And FL4SHK, you can pull in Record directly
<whitequark> every 0.x release I remove features deprecated in 0.(x-1)
<FL4SHK> oh my
<FL4SHK> will it ever become very stable?
<whitequark> the release cadence is... I think jfng suggested 3 months ideally? right now it's more like 6 months
<Lofty> It *is* 0.x software
<FL4SHK> I see
<Yehowshua> Well, its only been around a little over a year
<FL4SHK> nMigen?
<whitequark> FL4SHK: year, year and a half from now, we might have 1.0
<Lofty> Yep
<FL4SHK> I thought it was longer than that
<whitequark> something like that
<whitequark> nMigen is very young
<cr1901_modern> December 2018
<FL4SHK> I see
<awygle> starting to think we should put the deprecation policy in the readme
<Lofty> Honestly, I don't think we need to rush to 1.0 anyway, but that's kinda irrelevant
<whitequark> awygle: it will be in the docs
<awygle> we get that question a fair bit
<whitequark> quite prominently
<Yehowshua> Which leads me to my next question
<FL4SHK> Breaking changes are scary because I have old code sometimes
<Yehowshua> About the docs
<FL4SHK> and I can't always update it
<cr1901_modern> >so you'd write packed_struct.rotate_left(10) but that doesn't work cuz it's not a Value
<cr1901_modern> Was this a finished thought?
<whitequark> FL4SHK: such is the life with 0.x dependencies
<FL4SHK> All right.
<Lofty> FL4SHK: the old versions will still be on PyPI, so you can pin against them, I think
<whitequark> i go to quite a bit of effort to make upgrades painless
<whitequark> e.g. the deprecation errors tell you how to fix your code, typically
<whitequark> *warnings
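A small sketch of a deprecation warning that tells the user how to migrate, in the spirit of the policy described above. The `deprecated` decorator and the `old_name`/`new_name` functions are hypothetical; they are not nMigen's own helpers.

```python
import functools
import warnings


def deprecated(message):
    """Hypothetical decorator: warn with a hint on how to migrate."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 points the warning at the caller, not at this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def new_name():
    return 42


@deprecated("old_name() is deprecated; use new_name() instead")
def old_name():
    return new_name()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_name()

print(value, caught[0].category.__name__, caught[0].message)
```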
<Yehowshua> Yup - noticed with nmigen.back.pysim -> nmigen.sim.pysim
<FL4SHK> Any idea if very basic stuff might change?
<whitequark> with Record specifically you could also extract it from the nmigen codebase and stuff it into your own codebase and use it indefinitely
<whitequark> FL4SHK: not really
<FL4SHK> That's about what I figured
<whitequark> Record is one of the few major warts
<FL4SHK> probably not going to have to deal with very many breaking changes on my end, then
<FL4SHK> I bet PackedStruct will stick around
<whitequark> the other is the build system DSL, but that's far off, and will probably have an automatic migration system
<whitequark> PackedStruct should be the final design for that component
<FL4SHK> I largely only need plain old Python classes and `PackedStruct`
<awygle> whitequark: to try to finish the thought out, i suspect we could do _some_ kind of python shenanigans to ensure that as_value did memoization _itself_, thereby avoiding the problem. the question is, is that too much magic.
<FL4SHK> What about Layout?
<FL4SHK> will it be sticking around?
<Yehowshua> @awg
<FL4SHK> american wire gauge
<Yehowshua> awygle is right
<whitequark> we'll get to that discussion once we finish #355, ok?
<Yehowshua> FL4SHK - ur funny
<whitequark> let's stay on topic
<whitequark> so
<whitequark> 19:19 < cr1901_modern> >so you'd write packed_struct.rotate_left(10) but that doesn't work cuz it's not a Value
<whitequark> this was not a finished thought, we veered way off topic
* cr1901_modern nods
<whitequark> what you *should* write when you realize `packed_struct.rotate_left` doesn't work, is `Value.cast(packed_struct).rotate_left(10)`
<whitequark> but what you might *want* to write is `packed_struct.as_value().rotate_left(10)`
<whitequark> and that doesn't detect or protect incorrect implementations of .as_value()
<lkcl> urrr yuk
<cr1901_modern> hrm... :(
<whitequark> awygle is right: we'll need *some* python shenanigans there
<awygle> whitequark: radical proposal for discussion - what if Value.cast didn't work for this?
<cr1901_modern> import inspect
<cr1901_modern> and inspect the call frame?
<awygle> so that `as_value` is the canonical and only way to do this
<whitequark> awygle: that makes the problem worse
<whitequark> because we don't control as_value
<lkcl> to explain that: if those are the implementations of.. say... the upgraded-Record-replacement's rotate_left function, fine
<awygle> mm.. yes, fair.
<whitequark> the other reason it makes the problem worse is that all of the nMigen guts use Value.cast
<whitequark> anyway, let's see what our options to fix this are
<awygle> i shoulda stopped talking once everybody was saying how right i was :p
<Yehowshua> Well its either lots of shenanigans or lots of educating
<whitequark> - we could do memoization in Value.cast and somehow detect if .as_value() is called directly
<whitequark> - we could do memoization in .as_value() (using a decorator, probably) and then detect if .as_value() is not defined using this decorator
<whitequark> so basically in the second case, code would look like this:
<whitequark> class MyRecord(ValueCastable):
<whitequark>     @ValueCastable.memoize
<whitequark>     def as_value(self):
<whitequark> and the @ValueCastable.memoize decorator would be mandatory
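A minimal sketch of what a memoizing decorator like the one proposed above could look like. This is an illustration only, not nMigen's implementation: the `ValueCastable` class here is a hypothetical stand-in, the `_is_memoized` marker and the cache attribute name are assumptions, and `object()` stands in for building an nMigen `Value`.

```python
import functools


class ValueCastable:
    """Hypothetical stand-in for the proposed base class; not nMigen's code."""

    @staticmethod
    def memoize(func):
        # Cache the result on the instance so every call returns the same object.
        attr = "_memoized_" + func.__name__

        @functools.wraps(func)
        def wrapper(self):
            if not hasattr(self, attr):
                setattr(self, attr, func(self))
            return getattr(self, attr)

        # Assumed marker that a later check could use to detect the decorator.
        wrapper._is_memoized = True
        return wrapper


class MyRecord(ValueCastable):
    @ValueCastable.memoize
    def as_value(self):
        # Stand-in for constructing a Value; a fresh object per undecorated call.
        return object()


rec = MyRecord()
first = rec.as_value()
second = rec.as_value()
print(first is second)
```

With the decorator in place, calling `as_value()` directly and going through a cast helper that calls `as_value()` would both see the same cached object, which is the property discussed here.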
<awygle> that's not too bad. small boilerplate but minimal magic.
<lkcl> whitequark: that would at least have people questioning *why* that decorator is mandatory
<Yehowshua> That works. Somewhat opaque for new users
<whitequark> lkcl: excellent
<whitequark> that is part of what i want to achieve
<whitequark> because the docs for the decorator would explain the hazard associated with not memoizing
<lkcl> at which point they'd be referred either here or to the documentation
<lkcl> exactly, yes
<whitequark> this has two good effects
<whitequark> first, .as_value() and Value.cast() do the same thing
<whitequark> second, it's done in a fairly straightforward python-like way
<whitequark> even though the mandatory nature of the decorator is unusual
<whitequark> the actual process through which this happens is not
<Lofty> So the decorator is there as a reminder that you won't be called multiple times?
<awygle> i am going to be also making lunch for the rest of this meeting so if i take a bit to respond to a ping that's why
<cr1901_modern> >and the @ValueCastable.memoize decorator would be mandatory
<cr1901_modern> Can this be enforced at compile time? If not, at least user oversights will be constrained to one location
<whitequark> cr1901_modern: that's the next point of the discussion
<whitequark> how do we enforce it?
<awygle> by "compile time" i assume you mean "during elaboration"?
<whitequark> we can enforce it in Value.cast, but maybe you never call Value.cast
<whitequark> I think it should be enforced by overriding ValueCastable.__new__
<whitequark> like we do for Elaboratable
<Lofty> I was thinking about constructor/destructor stuff or something, yeah
<lkcl> zowee. i never encountered an actual (real) use for __new__ before. cool
<awygle> oh hm, that's not a bad idea. i didn't realize that was how elaboratable worked.
<Yehowshua> So during object generation, you check if its happened before?
<whitequark> awygle: specifically, that's how the unused elaboratable warnings work
<awygle> i was about to say "the 'right' way to do this is metaclasses but bleh". __new__ is arguably less 'right' but certainly less fucky
<whitequark> the existence of the elaborate method is enforced in Fragment.get
<whitequark> yes, and metaclasses cause issues during subclassing sometimes
<whitequark> as you've discovered already
<awygle> mhm
<Yehowshua> Gotta run - I'll read the irc logs later
Yehowshua has quit [Remote host closed the connection]
<lkcl> Yehowshua: k.
<cr1901_modern> >we can enforce it in Value.cast, but maybe you never call Value.cast
<cr1901_modern> This means that Value.cast _must_ now be called on something that run the memoize decorator?
<whitequark> in general, .as_value() must always be decorated
<whitequark> okay, I think that's about right for ValueCastable
<cr1901_modern> Actually I should back up for a sec, since I'm a bit behind on nmigen features:
<cr1901_modern> Value.cast is new/not implemented yet, correct?
<whitequark> it is very much implemented and even described in docs
<cr1901_modern> So now, if you enforce the decoration rule in Value.cast, that means that _everything_ that Value.cast takes must now be decorated? Is the current behavior for Value.cast to accept a superset of types beyond "stuff that implements as_value()"? >>
<cr1901_modern> Basically "will the new behavior break compatibility with Value.cast as-is right now"?
<whitequark> er, not at all
<whitequark> it doesn't change the existing behavior in any way
<whitequark> it adds new behavior
<whitequark> well
<whitequark> the way we have it worked out is actually simpler than that
<whitequark> Value.cast() simply calls ValueCastable.as_value()
<whitequark> now, *instantiating* a ValueCastable is different
<whitequark> say you have this code:
<whitequark> class MyRecord(ValueCastable):
<whitequark>     def as_value(self): ...
<whitequark> if you do MyRecord() and as_value() is not decorated with @ValueCastable.memoize [preliminary name], an exception is thrown
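The instantiation-time check described above can be sketched as follows. This is a hedged illustration under stated assumptions, not nMigen's actual code: the base class is a hypothetical stand-in, the `_is_memoized` marker is an invented way for `__new__` to recognise the decorator, and `TypeError` is an arbitrary choice of exception.

```python
import functools


class ValueCastable:
    """Hypothetical base class sketching the proposal; not nMigen's code."""

    def __new__(cls, *args, **kwargs):
        # Runs on every instantiation, before __init__, so the check cannot
        # be skipped even if the cast helper is never called.
        as_value = getattr(cls, "as_value", None)
        if as_value is None:
            raise TypeError(f"{cls.__name__} must define as_value()")
        if not getattr(as_value, "_is_memoized", False):
            raise TypeError(
                f"{cls.__name__}.as_value() must be decorated with "
                "@ValueCastable.memoize")
        return super().__new__(cls)

    @staticmethod
    def memoize(func):
        attr = "_memoized_" + func.__name__

        @functools.wraps(func)
        def wrapper(self):
            if not hasattr(self, attr):
                setattr(self, attr, func(self))
            return getattr(self, attr)

        wrapper._is_memoized = True  # assumed marker checked in __new__
        return wrapper


class GoodRecord(ValueCastable):
    @ValueCastable.memoize
    def as_value(self):
        return object()  # stand-in for building a Value


class BadRecord(ValueCastable):
    def as_value(self):  # decorator forgotten
        return object()


good = GoodRecord()  # accepted

try:
    BadRecord()
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

Putting the check in `__new__` rather than in a metaclass keeps subclassing ordinary, which matches the concern raised earlier about metaclasses.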
* lkcl apologies: need to rest. will be back (and checking irc logs)
<cr1901_modern> ahhh
<d1b2> <emeb> golly! hacked up a custom platform definition for my up5k board and tried the acm_serial LUNA example and it actually enumerated!
<whitequark> okay, two hours is enough of a meeting
<whitequark> I guess we're done for today
<jfng> can we spend a 5mins on nmigen-soc issues ?
<cr1901_modern> I see... ValueCastable has the memoization logic, so Value.cast() and as_value() do the same thing
<whitequark> jfng: oh, yeah
* cr1901_modern will go back to read-only
<_whitenotifier-b> [nmigen] awygle commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC8h
* lkcl would like to hear about nmigen-soc
<awygle> wq lemme know if i mis-summarized or missed anything
<_whitenotifier-b> [nmigen] whitequark commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC4I
<whitequark> nope, all seems correct
<_whitenotifier-b> [nmigen] awygle commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC4L
<jfng> the question would be, can we consider it done ?
<_whitenotifier-b> [nmigen] whitequark commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC4m
<jfng> i believe most of the scaffolding needed for csr peripherals is done
<_whitenotifier-b> [nmigen] whitequark commented on issue #355: [RFC] Redesign UserValue to avoid breaking code that inherits from it - https://git.io/JJC4c
<jfng> one issue that has not been addressed is awygle's concern about compatibility between peripherals
<whitequark> jfng: we decided to go for Approach A, plus a wrapper that puts the two together if you use a peripheral as a "black box"
<whitequark> right?
<awygle> my question on #10 would be, does this design preclude eventually having a "bus-agnostic" way to describe memories (as well as registers) which could be used to write bus-agnostic peripherals? i believe this is a desirable use case
<lkcl> awygle: and access the CSRs directly?
<awygle> lkcl: i want a way to say "i have these control registers and this memory-map" and be able to instantiate that with a Wishbone bus, or an AXI bus, or an Avalon bus, or a custom bus, without having to change anything about the peripheral
<awygle> which makes me nervous about the proposed `wishbone.Peripheral` mixin
<lkcl> awygle: cool. i can see how that would be useful / desirable
<awygle> but it's not necessarily a problem, i just want to raise it as a use case i am interested in
<whitequark> awygle: is that actually possible, mechanically?
<whitequark> unless your memory is something dumb, you're probably actually handling bursts yourself
<lkcl> there's also a real-world use-case i can think of: Raptor Engineering is doing an LPC implementation.
<whitequark> and i'm not sure if there is a way to abstract over WB and AXI bursts
<jfng> sorry, i wasn't clear enough
<lkcl> it's possible for LPC to "flip" - dynamically - into UART mode.
<jfng> my question is for a peripheral with only CSRs, no memories, no WB
<awygle> whitequark: that's a fair question, but 1) lots of memories are dumb 2) you can probably map to a lowest common denominator if performance isn't critical 3) if performance is critical you should still be able to write AXI-only peripherals
<whitequark> awygle: do we need many kinds of dumb memory peripherals?
<jfng> just two attributes: `csr_bus` and `periph_info` for metadata
<awygle> i am open to learning that it's not possible or not useful, i just don't want to preclude it at this early stage
<lkcl> as in: some CSRs *reprogram* the behaviour of the peripheral to be a completely different type of interface.
<lkcl> sorry completely different type of peripheral
<awygle> i don't want an AXISramPeripheral and a WBSramPeripheral
<whitequark> why not?
<whitequark> does it cause problems?
<awygle> twice the verification effort, i guess?
<whitequark> hmm
Kekskruemel has joined #nmigen
<whitequark> but most of the verification of a memory mapped peripheral is verification of the bus, right?
<whitequark> like, even for a DDR controller, presumably most of it would live in stdio
<whitequark> and be verified there
<awygle> i think if we do have AXISramPeriph and WBSramPeriph, then most of the code in any given peripheral will be mapping from AXI and/or WB to a bus-neutral control interface (in nmigen-stdio) anyway. but again, i could be wrong about that.
<lkcl> it makes sense to me for AXIsramPeripheral and WBSramPeripheral to be created by way of mix-ins
<awygle> but we're drifting away from jfng's request pretty harshly
<lkcl> and likewise {AnyOtherBus}SramPeripheral
<whitequark> yeah
<jfng> memories are a whole other topic yes
<awygle> in the absence of memories i don't have any real issue with the current proposal but i don't really see the value of the wishbone.Peripheral mixin, i guess
<jfng> my question is: do we need a mixin csr.Peripheral class ?
<lkcl> sorry... {AnyOtherBus}{SomePeripheralNeedingCSRs}
<jfng> which would validate two attributes: the csr bus interface, and the peripheral metadata
<whitequark> hmm
<jfng> the alternative i see is pure naming conventions
<whitequark> what would a peripheral with both CSR and Wishbone look like?
<whitequark> inherit from wishbone.Peripheral alone? inherit from both wishbone.Peripheral and csr.Peripheral?
<awygle> what is the use case for having both?
<lkcl> and then because they inherit from both, they know to "talk" to each other?
<jfng> a wrapper peripheral class, maybe with a decorator, that would implement the bridge
<jfng> so it would inherit from wishbone.Peripheral alone
<whitequark> jfng: can you remind me what was the outcome of the discussion of split CSR/Wishbone and unified CSR/Wishbone?
<whitequark> ie do the peripherals with both CSR and Wishbone export a single Wishbone bus, or both CSR and Wishbone buses
<whitequark> I recall we reached a decision but I can't remember which one it is
<whitequark> and there was some really good reason for that decision too
<jfng> oh no, i forgot
<awygle> yknow what i'ma just shut up because i'm not very informed here. i've laid out my use case, i trust y'all to either support it or decide it's a bad idea.
<lkcl> i have a vague recollection that AXI4 has CSRs separate somehow. it could just be a convention though
<lkcl> we may actually have to use a modified version of Wishbone.
<lkcl> (adding support for speculative read/writes)
<whitequark> awygle: i'm going to defer that decision, i think nothing forces us to preclude it for now, so i'll keep the option open to have the kind of middleware you request
<whitequark> but no promise that it would absolutely be the way we go
<lkcl> anything that's "merged" would make that... difficult.
<awygle> copy
<lkcl> Wishbone is based on a "take-it-or-leave-it" type of contract.
<whitequark> yeah, nmigen-soc will strictly stick to upstream Wishbone
<lkcl> Out-of-Order designs need the "House Contract of Sale" contract. "offer, exchange, complete"
<whitequark> jfng: okay, we need to figure that out (again)
<whitequark> because i think it would be the key for making this decision
<whitequark> maybe ask key2? iirc he was involved
<lkcl> which would mean that, if it's not "separatable" (so that we can mix in alternative buses), we'd have to hard-fork nmigen-soc. or write a replacement.
<lkcl> which would be a lot of duplicated effort.
<whitequark> jfng: iirc, i argued for split CSR/Wishbone buses in peripherals that have their own Wishbone bus because you can always turn it into a merged one, but not the other way around
<lkcl> to explain: the Out-of-Order design that we're doing can have up to *eight* in-flight memory read/writes simultaneously outstanding
<whitequark> and we could have a wrapper that turns the split one into a merged one if desired
<whitequark> on the other hand, the split design can have somewhat lower resource consumption
<lkcl> whereas normal Wishbone expects one and only one bus read/write at a time, and for stalling to propagate back to the main core.
<lkcl> whitequark: indeed (wrapper makes split -> merged but not possible the other way)
<whitequark> jfng: on the other hand, i think key2's counterargument was that it is necessary to ensure synchronization between CSR writes and memory writes
<whitequark> so the split design isn't actually entirely viable
<jfng> i found the logs (22/03), and a split csr/wb interface + an easy to use wrapper was indeed the conclusion
<whitequark> hmm
<lkcl> is there a log somewhere of key2's counterargument?
<whitequark> it was in private communication
<lkcl> ahh ok
<lkcl> whoops
<lkcl> what was his concern?
<whitequark> if you have different latencies on CSR and WB/AXI interfaces you may have a bad time
<whitequark> eg if you flush a FIFO
<lkcl> that if you split things, you have to make a synchronisation protocol (in effect, something pretty similar to wishbone stb/ack)?
<whitequark> once you command this through CSR, you want to know that the FIFO is indeed flushed
<lkcl> yes. this i call the "take-it-or-leave-it" protocol :)
<lkcl> and a FIFO, interestingly, interferes with that... and requires the "Contract of Sale" style API.
<lkcl> funny.
<jfng> if we take a split bus approach, and use mixins for peripherals, then it would be very tempting to inherit from two (wb,csr) mixins
<jfng> but that would not work, i think
<whitequark> jfng: why not?
<jfng> assuming each mixin must provide a `periph_info` attribute
<whitequark> right, that was exactly my concern
<lkcl> h
<jfng> there must be a single point of truth `periph_info` attribute for the whole peripheral, but this assumes that its memory layout is hierarchical
<lkcl> hhmmm i "get" key2's concern about synchronisation. it really does mean that some sort of ready/valid/busy/ack signalling is needed on CSRs.
<lkcl> the use of e.g. Wishbone (or AXI4) *masks* that need
<jfng> this signaling you need can be done by bridging your csrs behind a WB4 bus
<lkcl> because normally (i.e. in the merged design), the use *of* the Bus - which has that ready/valid/busy/ack protocol built-in - *provides* the very protocol needed so that delays can..
<lkcl> jfng: i am kinda advocating that the protocol used to communicate between split buses *is* wishbone :)
<lkcl> even when say AXI4 is used
<lkcl> because it contains the exact ready/valid/busy/ack communications protocol needed for managing (say) FIFO-based CSRs.
<whitequark> lkcl: i'm pretty sure you actually want AXI4
<whitequark> because that has out-of-order transactions
<lkcl> whitequark: well... *thinks*...
<whitequark> the reason nmigen-soc bothers with Wishbone at all is that a lot of existing cores and designs use it, and people are familiar with it
<whitequark> WB4 isn't all that good, and WB itself is essentially a legacy bus at this point
<lkcl> yeah... it's not sophisticated, that's for sure.
<lkcl> to clarify context: i'm referring to a protocol used to communicate between split peripheral option
<lkcl> as an *internal* protocol
<whitequark> jfng: so i think the periph_info issue is fixable, but we should decide something about synchronization first
<whitequark> what do you think about this?
<lkcl> in the "merged" design you don't see the problem because the Bus provides the very protocol needed to ensure that FIFO-based CSRs get correctly updated
<lkcl> acknowledgement comes back a few cycles later (when the FIFO is flushed)
<lkcl> if all CSRs were single-cycle update, there would not be a problem
<lkcl> am i making sense? :)
<jfng> could we provide some metadata about bus latency ?
<whitequark> jfng: how would a CPU core use it?
<whitequark> wait states?
* lkcl yup. tired. leave you to it to discuss, will check the logs. thank you to you both (and everyone)
<whitequark> lkcl: CSRs can't be all single cycle update because their width is unlimited
<whitequark> jfng: tbh, i am tired too, what do you think about discussing this later this week, or having another meeting next monday?
<whitequark> so on 27th rather than 3rd
<jfng> np, i need to go home too
<whitequark> we could spend 1st and 3rd monday discussing questions in general and 2nd/4th monday just between us implementers :)
<jfng> that's a good idea
<awygle> yeah i like that
<whitequark> okay, let's do that
<awygle> we might be able to talk about #381 :p
<whitequark> yeah...
<awygle> go get some rest y'all
ChanServ changed the topic of #nmigen to: nMigen hardware description language · code at https://github.com/nmigen · logs at https://freenode.irclog.whitequark.org/nmigen · IRC meetings each Monday at 1800 UTC · next meeting July 27th
<whitequark> see yall later
<awygle> oh real quick, happy to implement 355 but not sure what my schedule is like for the next <undetermined>, so don't let me hold up 0.3 if i don't make it to it is all
<whitequark> mm okay
<whitequark> it's not a huge change, so worst case I can just make it myself
<whitequark> we'll have 0.3.rc1 first
<awygle> mhm
Asu has quit [Remote host closed the connection]
Kekskruemel has quit [Quit: Leaving]
jeanthom has quit [Ping timeout: 264 seconds]
<Degi> Is Record([("abc", 1),("def", 1)]) the wrong syntax for making a record with 2 subsignals? Since .def gives an invalid syntax error
<whitequark> `def` is a Python keyword
<whitequark> you can use getattr() to access that field
<whitequark> it's just a bad placeholder name :)
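The keyword problem is plain Python behaviour and can be reproduced without nMigen. In the sketch below, `Fields` is a hypothetical stand-in for a record whose field names become attributes, and the string values are placeholders for signals.

```python
import keyword


class Fields:
    """Hypothetical stand-in for a record exposing its fields as attributes."""

    def __init__(self, names):
        for name in names:
            setattr(self, name, "signal:" + name)


rec = Fields(["abc", "def"])

# Ordinary attribute syntax works for non-keyword names.
print(rec.abc)

# Writing `rec.def` is rejected by the parser because `def` is a keyword,
# so the field has to be fetched by name instead.
field = getattr(rec, "def")
print(field)

# keyword.iskeyword() can be used to screen field names up front.
print(keyword.iskeyword("def"), keyword.iskeyword("abc"))
```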
<Degi> Oh indeed...
<Degi> Heh yeah I've noticed xD
<_whitenotifier-b> [nmigen/nmigen-soc] whitequark pushed 1 commit to master [+0/-0/±6] https://git.io/JJCuo
<_whitenotifier-b> [nmigen/nmigen-soc] rroohhh c754caf - test: make nmigen 0.3+ compatible
<_whitenotifier-b> [nmigen-soc] whitequark closed pull request #23: test: make nmigen 0.3+ compatible - https://git.io/JJZod
lkcl_ has joined #nmigen
lkcl has quit [Ping timeout: 264 seconds]