ianloic_ has quit [Read error: Connection reset by peer]
ianloic_ has joined #nmigen
lf_ has joined #nmigen
lf has quit [Ping timeout: 268 seconds]
Jay_jayjay has joined #nmigen
<falteckz>
Is there a function that knows the clock speed of a domain and will give me a number of periods for n time? Such that I could say - how many clock cycles for 100 milliseconds of 'sync' domain?
<falteckz>
I don't believe it's the responsibility of a submodule to know the clock speed, but I believe it's fair to ask the domain for it
<falteckz>
I guess getting the hertz of a domain is also sufficient
<falteckz>
Looks like ClockDomains are just wrappers around an IO port for a Clock - which makes sense, since an oscillator/clock would either be something global to the fabric or some IO
<_whitenotifier>
[YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLHqX
<agg>
ClockDomains don't necessarily have anything to do with an IO port, they can be fully internal
<whitequark>
falteckz: there isn't, and in general there cannot be
<whitequark>
since you can reconfigure a PLL at runtime
<whitequark>
for the default clock, you can use platform.default_clk_period
<whitequark>
(or default_clk_frequency)
<falteckz>
I think I want to avoid a PLL for now - will use platform properties.
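A minimal sketch of that approach (the Blinker class and its delay_cycles parameter are illustrative, not anything from this discussion): the submodule takes a plain cycle count, and the top-level build script derives it from platform.default_clk_frequency:
```python
from nmigen import Elaboratable, Module, Signal

class Blinker(Elaboratable):
    def __init__(self, delay_cycles):
        # the submodule never asks the domain for its frequency;
        # it just receives a cycle count
        self.delay_cycles = delay_cycles
        self.led = Signal()

    def elaborate(self, platform):
        m = Module()
        timer = Signal(range(self.delay_cycles + 1))
        with m.If(timer == self.delay_cycles):
            m.d.sync += [timer.eq(0), self.led.eq(~self.led)]
        with m.Else():
            m.d.sync += timer.eq(timer + 1)
        return m

# in the build script, where the platform is known:
#   cycles = int(platform.default_clk_frequency * 100e-3)  # cycles in 100 ms of 'sync'
#   top = Blinker(delay_cycles=cycles)
```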
<falteckz>
agg, by fully internal, it's still wired to some IO line internally - which is what I intended to say with "global". Unless I'm misunderstanding
<falteckz>
Unless you mean to say, a clock domain generated by logic - for which I shudder.
<agg>
or generated by a PLL
<agg>
or an internal oscillator primitive
<falteckz>
Right, yeah I intended to include those two options in the "something global"
<falteckz>
I perhaps used the wrong words
<agg>
no, fair enough
aquijoule_ has joined #nmigen
Bertl_oO is now known as Bertl_zZ
emeb has quit [Quit: Leaving.]
lkcl has quit [Ping timeout: 272 seconds]
lkcl has joined #nmigen
Jay_jayjay has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
<lsneff>
whitequark: maximum signal number?
<lsneff>
u32, u64?
<whitequark>
amount of signals? 4b is more than enough imo
<lsneff>
👍
<awygle>
i still think a specialization of ClockDomain that does know the frequency would be valuable
<falteckz>
Is there not a simulator platform that is passed to elaborate?
<whitequark>
awygle: everyone agrees it would be nice, but i'm not convinced it is possible to introduce in a way that does more good than harm
<whitequark>
falteckz: not yet
<awygle>
i understand
<whitequark>
awygle: if we got first-class PLL support, then a clock tree analysis could assign frequencies to clocks... but that can only happen after elaboration anyway
<whitequark>
so not actually useful to code that wants introspection
<whitequark>
if it was just a field that someone fills out with no guarantees backing it, then i think that passing frequencies around explicitly makes it more clear that there is no reason for this information to be correct, since, well, you are doing it yourself
<falteckz>
whitequark, perhaps that will work for me for now, passing the domain and the frequency to assume for that domain
<awygle>
why would a PLL propagating clock constraints have to wait until after elaboration?
<awygle>
(in the simple case)
<whitequark>
you've answered it yourself: because i am concerned not with the simple case but with the complex one
<falteckz>
*Shakes feature backlog at edge cases, angrily*
<whitequark>
so if the PLL is fed directly by an input pin with a constraint on it, sure
<awygle>
perhaps i wasn't clear when i said "a specialization of ClockDomain". i specifically intended that to mean "which only covers the simple cases, and which code can easily test for"
<whitequark>
oh
<whitequark>
ok, that's actually more reasonable
<whitequark>
i think the main problem with *that* plan is the fact that domains are late bound so it's hard to get at the actual ClockDomain object most of the time
<whitequark>
and passing it around is something we advise against
<whitequark>
so once we figure out something about that part, yes, that can probably be introduced
<awygle>
can you elaborate on why that is? i have personally found it confusing but i am sure that there's a good reason
<falteckz>
The period of 16MHz is 62.5 nanoseconds, right? So why is the simulated vcd showing me a clock period of 16,000 nanoseconds?
<whitequark>
sounds like period and frequency are mixed up somewhere
<falteckz>
Oh
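For reference, a minimal sketch of where such a mixup can creep in (assuming the nmigen.sim pysim API): Simulator.add_clock() takes a period in seconds, not a frequency:
```python
from nmigen import Module, Signal
from nmigen.sim import Simulator

m = Module()
counter = Signal(8)
m.d.sync += counter.eq(counter + 1)   # stand-in design so a 'sync' domain exists

sim = Simulator(m)
sim.add_clock(1 / 16e6)   # period in seconds: 62.5 ns for a 16 MHz clock
# sim.add_clock(16e-6)    # mistake: that is a 16 us period, i.e. 62.5 kHz
```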
<whitequark>
awygle: what specifically is?
<awygle>
the late boundedness of ClockDomain. the fact that they're referred to by name, often before they exist, rather than by the object
<whitequark>
awygle: it's a bunch of different reasons. consider this, if they were eagerly bound, you would effectively have to pass "sync" everywhere
<whitequark>
and it would get a lot harder to use Reset/EnableInserter as well as renaming domains
<whitequark>
(you could perhaps get rid of domain renaming, but not the former two)
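A minimal sketch of what late binding buys (the Counter submodule is illustrative): the same code can be retargeted to another domain or wrapped with an extra reset without ever holding a ClockDomain object:
```python
from nmigen import Elaboratable, Module, Signal, DomainRenamer, ResetInserter

class Counter(Elaboratable):          # uses m.d.sync only, by name
    def __init__(self):
        self.count = Signal(8)
    def elaborate(self, platform):
        m = Module()
        m.d.sync += self.count.eq(self.count + 1)
        return m

m = Module()
m.submodules.a = Counter()                           # runs in "sync" as written
m.submodules.b = DomainRenamer("pix")(Counter())     # same code, retargeted to "pix"
soft_rst = Signal()
m.submodules.c = ResetInserter(soft_rst)(Counter())  # extra reset condition injected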
<whitequark>
lsneff: i would very much like to not deal with 4-state logic
<whitequark>
high-performance simulators are almost never 4-state
<whitequark>
also, all the varints in the header make it harder to parse and emit for no good reason
<lsneff>
Ah, fair enough, will change the header.
<lsneff>
You're sure about the 4-state?
<whitequark>
i think only actual arbitrary width values (ie variables) should be encoded with varint
<whitequark>
yes. 4-state is very rarely used in high-performance RTL simulations, you would only have Z at the boundary, and almost nothing does internal X
<whitequark>
verilator cannot do internal X, afaik
<whitequark>
cxxrtl might one day, but so far there's been fairly little interest in it
electronic_eel has quit [Ping timeout: 246 seconds]
<lsneff>
A significant amount of space can be saved using liberal amounts of varint. The vcd you sent me has 2000001 timestamps. 4 bytes for each one is 8 MB to start.
electronic_eel has joined #nmigen
<lsneff>
3 state is the same as 4 state in this case, as far as I can tell
<lsneff>
same space, same encoding
<whitequark>
lsneff: hang on, we'll get to that
<whitequark>
in general, i think the philosophical issue with your format is that it tries to be a better VCD. i do not want a better VCD. i want a format actually fit for purpose
<whitequark>
the header is just cribbed from VCD in spite of the fact that the VCD one is badly thought out
<whitequark>
i would use one field in the header: the amount of femtoseconds per integer timestep
<lsneff>
Ah, I gotcha, I was basically trying to just be vcd, but better.
<lsneff>
Let's rethink this then
<whitequark>
please do not do that. put vcd in the trash bin of history and start over
<whitequark>
that timescale thing causes no end of problems for me
<lsneff>
On the other hand, I still want to be able to parse vcds with this tool, so I have to come up with something close enough to have a shared internal representation
<whitequark>
it's a choice between timescale that is insufficiently fine and misrepresents time, or insufficiently coarse and trips up consumers that expect timestamps to be roughly sequential (ie without huge gaps), like pulseview
<lsneff>
I agree with the timestamp thing
<whitequark>
the only reason that timescale field looks like that is because that's how verilog thinks about time. you do not have to repeat mistakes of verilog
<lsneff>
Why did verilog do it that way?
<whitequark>
there might not be a particular reason (i've read the HOPL3 paper on verilog but i don't recall anything on timescales)
<whitequark>
generally, applying chesterton's fence to verilog is a grave mistake
<whitequark>
lots of things in it are just... like that
<whitequark>
anyway
<whitequark>
regarding varint, i think all of the non-repeating part of the format should use fixint
<whitequark>
varint timestamps are ok, especially if they are deltas and not absolute times
<whitequark>
(since as far as i can tell your format is not self-synchronizing, unlike vcd, deltas make more sense)
<lsneff>
The timesteps are deltas
<lsneff>
But this is good stuff, I'm going through and changing it
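A minimal sketch of LEB128-style varint encoding applied to delta timestamps (the encoding details are illustrative, not the final spec):
```python
def encode_varint(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_timestamps(timestamps):
    # store deltas, not absolute times: small steps stay one byte long
    prev = 0
    out = bytearray()
    for t in timestamps:
        out += encode_varint(t - prev)
        prev = t
    return bytes(out)
```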
<whitequark>
for signal ids, i would prefer unique rather than strictly sequential
<whitequark>
this makes it possible for me to use some internal not necessarily sequential identifier in vcd files, potentially saving a lot of lookups
<whitequark>
(e.g. index into the internal debug info table)
<whitequark>
(it's discontiguous because of aliases)
<lsneff>
Okay, sure, no problem
<whitequark>
the "signal" is the value that physically changes, right?
<whitequark>
and the "variable" is the HDL name?
<whitequark>
(the terms should probably be made less confusing, but for now i'm fine if you just confirm it)
<lsneff>
Yep, that's correct, I'm happy to bikeshed that
<lsneff>
Would wire make more sense?
<whitequark>
"wire" is ambiguous with wire/reg distinction
<whitequark>
i would probably just use "signal" and "signal name" (also "scope name")
<whitequark>
wait
<whitequark>
why do signals (in your current scheme
<lsneff>
Hmm, so that's a little confusing as well, since I included a name for each signal, and a name for each variable
<whitequark>
) have names at all?
<falteckz>
Can I expect Memory() to just work in the simulator, including initial values?
<falteckz>
Does the simulation have to "load" first?
<whitequark>
lsneff: i think it is important to separate storage locations (which are just varying values) and the HDL names assigned to them (which are just names)
<whitequark>
falteckz: it should just work
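A minimal sketch of Memory with initial values in pysim (assuming the nmigen.sim import path; names are illustrative):
```python
from nmigen import Elaboratable, Module, Memory, Signal
from nmigen.sim import Simulator

class Top(Elaboratable):
    def __init__(self):
        self.mem = Memory(width=8, depth=4, init=[0x10, 0x20, 0x30, 0x40])
        self.addr = Signal(2)
        self.data = Signal(8)

    def elaborate(self, platform):
        m = Module()
        m.submodules.rdport = rdport = self.mem.read_port()
        m.d.comb += [rdport.addr.eq(self.addr), self.data.eq(rdport.data)]
        return m

top = Top()
sim = Simulator(top)
sim.add_clock(1 / 16e6)

def proc():
    yield top.addr.eq(1)
    yield; yield                       # allow the synchronous read port to update
    assert (yield top.data) == 0x20    # initial contents are there, no load step

sim.add_sync_process(proc)
sim.run()
```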
<lsneff>
They don't have to, I just thought it might be useful. e.g. This variable is called ack, but it's a range into the bus4_config_blah_blah
<whitequark>
nononono
<whitequark>
please don't give storage locations any names, that would just make life complicated for both of us
<whitequark>
and everyone who uses the format
<lsneff>
Okay, I'll remove it
<lsneff>
I get why you don't want it
<whitequark>
you know how gdb works? you have addresses (offsets into stack or something) and then you have source level names
<lsneff>
Yep, I gotcha
<whitequark>
(hey, you know what would be really cool? if i could include a wasm snippet rematerializing all the variables i optimized out *at the time when you are viewing them*. right now i do this during VCD capture, but it would be way cooler if it was done like DWARF bytecode)
<lsneff>
Hmm, I'll think about that
<whitequark>
it's definitely not MVP stuff
<whitequark>
just make the header versioned/extensible so we can include it later
<lsneff>
Ah, a format version header is a good idea, hadn't crossed my mind
<whitequark>
(i would probably go for a single u32 header that doubles as a file signature)
<whitequark>
(maybe u64 to make it less ambiguous)
<lsneff>
I really should be more on top of this, I've done plenty of spec implementation work, guess I've never had to spec one out before
<lsneff>
A magic, you mean?
<whitequark>
magic that doubles as a file version, yeah
<whitequark>
no strong opinion on the exact detail though
<lsneff>
That's interesting, I have mixed feelings about making different versions completely different formats
<awygle>
always version number always always always
<awygle>
either you'll never use it which is fine or you'll hate yourself for not including it
<whitequark>
^
<lsneff>
right
<whitequark>
for a streaming format that has hardcoded producers, versioning is really fine
<whitequark>
more fine than extensibility imo
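A rough sketch of the header shape discussed above, assuming a u64 magic that doubles as the version plus the femtoseconds-per-timestep field (the magic constant and field layout are placeholders, not anything agreed on here):
```python
import struct

MAGIC = 0x00_31_76_77_6D_67_6E_00   # doubles as format version; made-up constant

def write_header(f, femtoseconds_per_step):
    # u64 magic/version + u64 femtoseconds per integer timestep, little-endian
    f.write(struct.pack("<QQ", MAGIC, femtoseconds_per_step))

def read_header(f):
    magic, fs_per_step = struct.unpack("<QQ", f.read(16))
    if magic != MAGIC:
        raise ValueError("unknown file signature/version")
    return fs_per_step
```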
<lsneff>
How about storage instead of signal, and variable as it is currently
<whitequark>
no objection
<lsneff>
So, you'd prefer binary or trinary logic?
<awygle>
DWARF is one of those systems that is pretty well designed so i just assume designs taking inspiration from it are also good lol
<whitequark>
awygle: i think you will be booed by more than one person who had to implement DWARF
<whitequark>
let's put it this way, i am being careful in taking inspiration from it
<awygle>
(i'ma let this convo die off before i reengage on the clock domains thing but i do have more to say about it lol)
<whitequark>
lsneff: i think you should have 4-state bit vectors separate from 2-state ones
<lsneff>
whitequark: Ah, okay, that makes sense to me
<awygle>
i feel about DWARF roughly like i feel about e.g. (e)BPF, in that there are probably more elegant ways to provide the functionality but providing it at all is basically a miracle
<whitequark>
that's conflating representation with storage
<whitequark>
`enum Type { TwoState(usize), FourState(usize), String }` that's more like it
<whitequark>
*presentation with storage, sorry
<whitequark>
and then [lsb:msb] range, as well as enum values and such, would be a part of the variable
<whitequark>
awygle: it would be a greater miracle if compilers actually emitted complete DWARF information...
<whitequark>
... you know, like cxxrtl does :p
<lsneff>
enum values? So, binary/quaternary/utf8 would be an interpretation setting, not a storage setting?
<awygle>
:p
<whitequark>
lsneff: enums are not utf8
<whitequark>
storage type would actually be 2-state/4-state/string because this is what a simulator physically has when it emits a value
<whitequark>
most simulators are 2-state, some of them (or some variables) will be 4-state, so sometimes you'll encounter 4-state as well
<whitequark>
in cxxrtl, 2-state is just value<>, 4-state will be something like xvalue<> that is a pair of value<>s internally
<whitequark>
meanwhile, presentation type would be something like:
<whitequark>
signal name, then for integers, [msb:lsb] range, signed/unsigned, which values correspond to which symbolic names for enums, and which base the integer should be interpreted in
<whitequark>
the base might be excessive, but i think the others are necessary
<lsneff>
Okay
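A sketch of the storage/presentation split as just described (field names are placeholders; enum and string presentation types are omitted here):
```python
from dataclasses import dataclass
from enum import Enum

class StorageType(Enum):        # what the simulator physically emits
    TWO_STATE  = 0              # e.g. cxxrtl value<>
    FOUR_STATE = 1              # e.g. a pair of value<>s (bits plus x/z mask)
    STRING     = 2

@dataclass
class Storage:                  # a varying value, identified only by an id
    id: int
    type: StorageType
    width: int                  # in bits; unused for STRING

@dataclass
class IntegerVariable:          # presentation: how an HDL-level name is shown
    name: str
    storage_ids: list           # storages concatenated to form the value
    msb: int
    lsb: int
    signed: bool = False
    base: int = 10              # display base (arguably excessive, per above)
```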
* awygle
gestures vaguely at folders labeled "CSS" and "database normalization"
<whitequark>
eh
<whitequark>
i'm pretty sure of how exactly i'd like the header, the storage format, and the value change format to look
<whitequark>
i'm less sure about presentation
futarisIRCcloud has joined #nmigen
<whitequark>
the only thing i'm really certain about presentation is that a single HDL-level name must be possible to represent as an arbitrary sequence of storages
<lsneff>
multiple storages concatenated?
<whitequark>
yeah
<whitequark>
it is necessary to be able to split storages to achieve maximum performance in a simulator like cxxrtl
PyroPeter_ has joined #nmigen
<whitequark>
so it follows that it must be possible to reconstruct the value from the debug info
<lsneff>
Makes sense, not a problem
<lsneff>
Can you say more about the wasm debugging info thing?
<whitequark>
i would personally be very fine if for an MVP you just had that, the ability to concatenate storages
<whitequark>
right, ok
<whitequark>
so you know how gdb tells you $1 = <value optimized out> sometimes?
<lsneff>
Yep, definitely
<whitequark>
cxxrtl also optimizes out values sometimes (a lot of the time actually)
<whitequark>
but what it also does is it emits an additional function that recomputes all of the optimized-out values from the persistent state
<lsneff>
Ah, very interesting
<whitequark>
which a cxxrtl debug info consumer can call to rematerialize all of those
<whitequark>
the VCD waveform writer currently calls it on each step
PyroPeter has quit [Ping timeout: 256 seconds]
PyroPeter_ is now known as PyroPeter
<lsneff>
I see what you're getting at
<lsneff>
Yeah, I think that's a solid idea
<whitequark>
in theory, if i emitted that function into wasm, and emitted the wasm into the waveform dump, and added some correspondence between storages and wasm inputs/outputs... waveform dumping could be greatly sped up
<lsneff>
Made more difficult by the lack of interface types
<whitequark>
i think i could basically give you a sequence of storage IDs
<whitequark>
you would lay them out in a memory, then the function will read storage IDs that it needs, and write back the ones it computes
<whitequark>
the interface would be basically an array of u32 and a bunch of indexes into it
<whitequark>
the other thing i eventually want a good waveform viewer to do is to support a sort of client/server model
<whitequark>
ie: the viewer tells the model the range it wants to render (with an optional stride), the model does its best to reconstruct that range from the saved state by loading checkpoints and simulating
<whitequark>
basically, if you are mipmapping anyway, why even compute the values you're about to average out?
<whitequark>
i want it to be possible for a 4 billion point trace of a SoC that boots Linux to be compressed to like a few hundred MB with checkpoint/restore
<whitequark>
combined with rematerialization, of course
<whitequark>
(in case of the client/server model you don't need wasm, you can just ask the model directly)
<whitequark>
(so it might make more sense to focus on that as it gives a greater benefit)
<lsneff>
The model, as in the cxxsim instance itself?
<whitequark>
yeah
<whitequark>
while you're running the model, it checkpoints say every 100000 cycles. then once you ask for cycles 150000..170000, it loads a checkpoint, runs for a while without recording anything, then dumps waveforms for 20k requested cycles only
<whitequark>
if you just want to render an overview of a very large trace this can be extremely efficient because waveform dumping, especially full dumps, is an order of magnitude slowdown if not more
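A conceptual sketch of the checkpoint/replay idea (the save_state/load_state/step calls are hypothetical, not an existing cxxrtl API):
```python
CHECKPOINT_INTERVAL = 100_000

def record(model, total_cycles):
    checkpoints = {}
    for cycle in range(total_cycles):
        if cycle % CHECKPOINT_INTERVAL == 0:
            checkpoints[cycle] = model.save_state()   # snapshot only, no waveforms
        model.step()
    return checkpoints

def replay(model, checkpoints, start, end, dump):
    # restore the nearest checkpoint at or before `start`, run silently up to
    # `start`, then dump waveforms only for the requested window
    base = (start // CHECKPOINT_INTERVAL) * CHECKPOINT_INTERVAL
    model.load_state(checkpoints[base])
    for cycle in range(base, start):
        model.step()
    for cycle in range(start, end):
        model.step()
        dump(model, cycle)
```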
Degi_ has joined #nmigen
Degi has quit [Ping timeout: 260 seconds]
Degi_ is now known as Degi
<lsneff>
That's clever, to collect a bunch of snapshots, and then display that, instead of mipmapping the whole thing down anyhow
<whitequark>
also you can snapshot eg one out of 100k cycles and then, when replaying, give one out of 1k cycles to the viewer
<whitequark>
since the majority of time is spent recording VCD, not simulating
<lsneff>
How much slower do you think running a cxxsim within wasm would be than natively?
<whitequark>
3-5 times slower, I measured
<lsneff>
Is that with wasm simd enabled?
<whitequark>
nope
<whitequark>
not sure if simd would help much
<whitequark>
native binaries (even with -march) barely use it
<lsneff>
Gosh, hmm, I wonder what's slowing it down so much
<lsneff>
Should enum be a separate interpretation type from integer?
<whitequark>
I think so yeah
<lsneff>
Good, that makes it easier in the spec
cr1901_modern1 has joined #nmigen
cr1901_modern has quit [Ping timeout: 272 seconds]
<falteckz>
Yay! Updating an RGB LED strip at 170 Hz
<whitequark>
lsneff: thanks, i like the text description a lot more
<lsneff>
I enjoyed putting together the ascii art, but it turned out to be hellish to actually read
<whitequark>
yep
<lsneff>
the text format is roughly similar to the way the wasm spec is laid out
<whitequark>
lsneff: i think you don't strictly need integer-variable-data.msb
<whitequark>
you only need the lsb offset, and then storage chunks give you the msb
<lsneff>
What if you want a range that's less than the size of a single storage?
<whitequark>
hmmm
<whitequark>
so lsb indexes into the concatenated storages, with the first storage starting at bit 0, right?
<whitequark>
what if i have a variable defined in verilog like `wire [7:3] x;` where the low 3 bits simply aren't?
<lsneff>
it bit-indexes into storage(s), so wire [7:3] x; would simply be msb=7, lsb=3
<lsneff>
Well, hmm
<lsneff>
I didn't think about a storage not starting at 0
<lsneff>
what's the use case for that anyhow
<whitequark>
i just explained!
<whitequark>
wire [7:3] would have a 5-bit storage
<lsneff>
Right, but why not wire [4:0]?
<lsneff>
I get that verilog supports that
<lsneff>
anyhow, doesn't matter
<whitequark>
suppose you have an address bus where every access is 4-byte aligned
<whitequark>
it'll be like wire addr[31:2];
<whitequark>
and you can still assign unshifted addresses to it
<lsneff>
Ooo, fancy. That's interesting
<lsneff>
Does nmigen support that too?
<whitequark>
nipe
<whitequark>
*nope
<whitequark>
it potentially could, but this would interact in complex ways with other stuff and i'm not sure if it makes a good addition to the language
<lsneff>
Okay, so that sounds to me like "x[7:3]" would be the name of the variable in that case, and it would alias a 5 bit storage
<lsneff>
lsb and msb would be 0 and 4
emeb_mac has quit [Quit: Leaving.]
<lsneff>
at least, that's how that would fit into the format as specified
<whitequark>
so you're taking a note out of vcd's book, right?
<whitequark>
this is kind of... i don't entirely like it, but i am not opposed to it either, since i am not writing vcd viewers mostly
<whitequark>
basically, you'll have to support things like "take the bit 6 out of a variable named x[7:3]"
<whitequark>
so there is a mini language hidden in the string that is the variable name, which you will need to parse
<lsneff>
Yeah, I get that you feel queasy about that
<whitequark>
and now consider memories
<whitequark>
where the ranges are stacked a few levels deep
<lsneff>
Okay, how about each storage has a least significant bit idx as well?
<whitequark>
that would be a perfect fit with cxxrtl's model
<lsneff>
👍I'll add it to the spec
<whitequark>
the case where storages would overlap i think you can make a hard error
Sid___ has left #nmigen [#nmigen]
<whitequark>
(maybe mark that one variable as illegal to keep the file viewable)
<whitequark>
quaternary enums aren't a thing
<whitequark>
that is meaningless in terms of synthesis semantics
<lsneff>
Ah, you're right there, doesn't make sense
<whitequark>
or not even synthesis, just meaningless
<whitequark>
one could argue for *don't cares* in enum specifications but i've never seen that supported by any other tool
<whitequark>
and i am strongly opposed to conflating don't cares with x and z
<lsneff>
Okay, hold up actually, overlapping storages?
<whitequark>
if you just have bit patterns for enum variants that works perfectly
<whitequark>
oh
<falteckz>
Is there a construct that can be treated like a list of elements (bytes) or a single unsigned value? Specifically [unsigned(8), unsigned(8), unsigned(8)] when indexed, but unsigned(24) when not?
<whitequark>
lsneff: well imagine a variable that consists of two 2-bit storages, with lsb 0 and lsb 1
<whitequark>
falteckz: have you seen .word_select() ?
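A minimal sketch of .word_select() for the question above:
```python
from nmigen import Module, Signal

m = Module()
rgb = Signal(24)                     # behaves as unsigned(24) when used whole
idx = Signal(range(3))
byte = Signal(8)

m.d.comb += byte.eq(rgb.word_select(idx, 8))    # rgb[idx*8 : idx*8 + 8]
m.d.sync += rgb.word_select(idx, 8).eq(0xFF)    # word_select() is assignable too
```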
<whitequark>
lsneff: i thought about this and i think this would have mattered with your previous design
<whitequark>
but with your current design, i think every variable can just come with a list of multiple storages
<whitequark>
there's no downside, they would be transparently concatenated when reading
<whitequark>
and everything on top would work with the bit vector
<whitequark>
that said
<whitequark>
it is extremely unlikely that any enum ends up being split
<lsneff>
Except that having multiple storages and each storage having a lsb that's possibly not zero is complicated
<whitequark>
but you can handle that just in one place, no?
<lsneff>
Yeah, I believe so
<whitequark>
if cxxrtl gets a TBD emitter, i suspect it will emit all integer variables as multistorage
<whitequark>
although i guess i could special-case it
<whitequark>
bottom line, i think integers should be handled uniformly at least
<whitequark>
i can see some use cases for multistorage strings
<whitequark>
though tenuous
<whitequark>
hm, let me think about it
<lsneff>
Multistorage makes everything more complex, since I can't hand around regular byte slices anymore
<whitequark>
ok, i see
<whitequark>
so how about: one type INTEGER, with >=1 storage-ids, and the case of 1 storage-id specially handled in the viewer itself
<whitequark>
(a case of 1 storage-id still can have nonzero storage-lsb, so it would not complicate things much, i think?)
<whitequark>
and then you have ENUM and STRING and stuff none of which support multistorage
<whitequark>
i would be totally fine with that.
<lsneff>
Okay, that works for me
<lsneff>
So, integers have 1 or more storages, and only the first one can have a non-zero lsb
<whitequark>
for enums, it would be very easy for me (i don't need to do anything) to give you the bit patterns that correspond to the storage-lsb
<whitequark>
ie you ignore storage-lsb for enums and just compare the numbers with the enum specification
<whitequark>
(ok, you'll need storage-lsb if the user asks for a numeric enum value)
<whitequark>
lsneff: re integers, i think that won't work
<whitequark>
because that makes storages tied to the variable
<whitequark>
basically, if every storage is self-contained, its lsb is the absolute lsb within some RTL signal
<whitequark>
suppose a RTL signal is composed out of 3 storages, for bits 7:5, 4:2, and 1:0
<whitequark>
suppose a variable aliases bits 7:2 of that signal
<whitequark>
it must be possible for it to only list the two higher order parts in storage-ids
<whitequark>
and suppose another variable aliases only bits 7:5
<whitequark>
it must be possible for it to only list one storage-id
<whitequark>
which means that the storage-lsb for them should be 5, 2, and 0
<whitequark>
makes sense?
<lsneff>
Ah, I understand that
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
<lsneff>
So, just make sure the storages [lsb:lsb+length] are contiguous with one another
<lsneff>
got it
<whitequark>
yep pretty much, i don't think variables with holes is something that makes sense to support
<whitequark>
just mark it as broken in the viewer
<whitequark>
or with overlaps
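A sketch of the storage-lsb scheme from the example above: a signal split into storages for bits 7:5, 4:2, and 1:0, with a contiguity check that also catches holes and overlaps (helper names are made up):
```python
storages = {
    1: {"lsb": 5, "width": 3},   # bits 7:5
    2: {"lsb": 2, "width": 3},   # bits 4:2
    3: {"lsb": 0, "width": 2},   # bits 1:0
}

def check_contiguous(storage_ids):
    # each storage must start where the previous one ended: no holes, no overlaps
    parts = sorted((storages[i]["lsb"], storages[i]["width"]) for i in storage_ids)
    for (lsb, width), (next_lsb, _) in zip(parts, parts[1:]):
        if lsb + width != next_lsb:
            raise ValueError("variable has holes or overlapping storages")

def read_variable(storage_ids, values):
    # values maps storage id -> current integer value of that storage
    check_contiguous(storage_ids)
    base = min(storages[i]["lsb"] for i in storage_ids)
    result = 0
    for i in storage_ids:
        result |= values[i] << (storages[i]["lsb"] - base)
    return result

# a variable aliasing bits 7:2 lists storage ids [2, 1]; bits 7:5 lists just [1]
```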
<lsneff>
👍
<whitequark>
if you implement the format we just agreed on (well the parts we agreed on), i will be able to make cxxrtl debug info *even more efficient*
<whitequark>
right now it can only represent full aliases
<whitequark>
s/will/may be/, i'd have to benchmark it first
<lsneff>
Awesome
<lsneff>
Yep, give me a sec to push the modified spec, and then I'll start implementing it tomorrow probably
<whitequark>
looked through the rest, seems essentially fine
<whitequark>
the only constraint is that a scope must be defined before it is used
<whitequark>
right now the VCD writer has to do some annoying bookkeeping that it really oughtn't do
<whitequark>
and it would get much worse if e.g. i tried to compose multiple cxxrtl models
<lsneff>
There won't be enough scopes and variables for extra space to matter, right?
<whitequark>
nope. think of it this way, that was written by a human (usually) and goes into UI for a human
<whitequark>
like a tree view
<whitequark>
that puts an inherent limit on the amount of scopes, i mean if you have a megabyte of variable names, that must have come from some gargantuan RTL model that will dwarf the header in size
<whitequark>
wait. waaait value-change-blocks is an unsized-vec?
<lsneff>
Yeah, I just didn't want you to have to declare its length beforehand
<lsneff>
Since you wouldn't necessarily know it
<whitequark>
oh nvm i misparsed that
<whitequark>
do we need initial-values?
<whitequark>
i guess we do
<lsneff>
I guess that's another vcd paradigm
<whitequark>
or... hm
<whitequark>
i think it would be more elegant to get rid of it
<whitequark>
the first value-change-block can serve that purpose
<whitequark>
the main consequence is that you must (in the viewer) handle the case where e.g. at time 0, the values are not yet known, because the simulation starts at nonzero time
<lsneff>
I guess since the timestamps are deltas, that makes sense
<whitequark>
starting a simulation at nonzero time is well-defined by the way
<whitequark>
you may have restored a checkpoint
<lsneff>
Ah
<lsneff>
I guess that's just something to deal with
<whitequark>
i would also strongly recommend adding a header to value-change-block
<whitequark>
there's $dumpon/$dumpoff in VCD which actually make sense for checkpoint/restore mipmapping as we discussed earlier
<whitequark>
and also i would like to add printf capability to cxxrtl
<whitequark>
which would need its own block in the dump
<whitequark>
assertion failures to be recorded as well
<lsneff>
you'd like to intersperse different kinds of blocks
<whitequark>
correct
<lsneff>
alright
<whitequark>
it might actually make sense to have timestep as a separate block
<whitequark>
so: advance time, record changed values, show printed stuff / failed assertions / etc in the future
<lsneff>
Having multiple value changes for a storage in a single timestep is a weird edge case
<whitequark>
yes, vcd viewers call this a "glitch"
<whitequark>
some of them in some cases detect it and complain, some of them swallow it
<whitequark>
the ability to represent it is actually useful
<whitequark>
it can happen with certain kinds of unusual combinatorial logic and aid debugging
<whitequark>
for MVP, "last write wins" is perfectly good semantics
<whitequark>
lsneff: ah, so re storages/scopes/variables this is not quite what i meant
<whitequark>
rather than three separate vecs, i would like to have a single vec where each entry is one of three variants
<lsneff>
Ah, I see
<whitequark>
essentially, i am trying to stream as much data as possible while keeping as little as possible in the state
<whitequark>
internally the cxxrtl vcd dumper currently uses the physical address of storage in memory as the key
<whitequark>
when it first encounters it, it emits a new storage, and each time it encounters it, it emits a variable as appropriate
<whitequark>
that said, i could live with three vecs if you'd rather have that
<whitequark>
still better than what vcd does, by a large margin
<lsneff>
The three variants are scope, variable, and... ?
<whitequark>
and storage
<lsneff>
Ah, okay
<lsneff>
I'm not a huge fan, but if you think it would simplify, happy to do it
<whitequark>
can you elaborate why you dislike it? maybe i agree
<whitequark>
mostly i suggest it because that would be very natural for a waveform dumper to emit during initialization
<whitequark>
but it's possible i'm missing something
<lsneff>
It just seems a little inefficient to me, but it's not really, especially since this is just a small part of the file
<whitequark>
ah i see, the enum discriminants
<lsneff>
So, are you going to know how many declarations there will be up front?
<whitequark>
that... is an excellent question
<whitequark>
and now that you asked it, i think i can clarify the decision i'm thinking about here
<whitequark>
gimme a sec
<lsneff>
sure
<whitequark>
i see two major options here. the first, in which i have to know the amount of declarations (aggregate or per-type) upfront, requires me to do that bookkeeping no matter what
<whitequark>
in that case, i think we can just keep it as three vectors, it doesn't make any difference.
<lsneff>
I can add an END_DECLARATIONS discriminant
<whitequark>
yes, but i want to extend that concept a bit
<whitequark>
basically, unify the discriminants of declarations and value changes. this would *greatly* simplify the code in cxxrtl that emits stuff, not only because it now needs zero bookkeeping, but also because there is no longer any constraint as to when the end user may be adding signals--you can start dumping a bunch of stuff, for example a memory, in the middle of a capture if you want
<whitequark>
but more than that, it would help nmigen's pysim even more
<whitequark>
because you can add Signal()s at runtime in Python processes, pysim does not and cannot know of every Signal() that needs to be dumped upfront
<lsneff>
Ah, hmm
<whitequark>
right now, if you do not inform the VCD dumper of every Python-only Signal upfront, you simply cannot dump them
<whitequark>
(well you sort of can but no one will do it because it's too annoying)
<whitequark>
if i could tell the viewer in the middle of a capture "oh and by the way here's a new signal" that would *massively* improve edge case experience
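A sketch of the unified-discriminant idea: declaration and capture blocks share one stream, so a producer can declare a signal it only just learned about in the middle of a capture (block names, numeric values, and the write_block helper are all placeholders):
```python
from enum import IntEnum

class BlockKind(IntEnum):
    SCOPE        = 0   # declaration blocks...
    STORAGE      = 1
    VARIABLE     = 2
    TIMESTEP     = 3   # ...interleaved with capture blocks
    VALUE_CHANGE = 4
    MESSAGE      = 5   # e.g. printf output / assertion failures, later

def emit_late_signal(stream, storage, variable, timestep_delta, value):
    # a producer declares a storage/variable it only just learned about,
    # then immediately records a value for it
    stream.write_block(BlockKind.STORAGE, storage)
    stream.write_block(BlockKind.VARIABLE, variable)
    stream.write_block(BlockKind.TIMESTEP, timestep_delta)
    stream.write_block(BlockKind.VALUE_CHANGE, value)
```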
<lsneff>
I want to be careful of how difficult it will be to implement efficiently as well
<whitequark>
yes, i understand
<whitequark>
which is why i offer two options
<whitequark>
the third reason why i think unified discriminants might make sense is because of the client/server architecture i mentioned earlier
<whitequark>
the streamed waveform dump file structure naturally maps to a socket
<whitequark>
it'd still need defining the other direction, but the data that flows to a viewer would have the exact same structure
<lsneff>
This + non-ordered storage ids means that I can't use the id to index an array of variable datas, and will probably have to use a hashmap instead
<lsneff>
Which may add a lot of overhead, but I'm not sure
<lsneff>
That's only when parsing though
<lsneff>
So, probably okay
<whitequark>
hmm
<lsneff>
Alright, I'll unify the enumerations
<whitequark>
i will happily ditch the non-ordered storage id requirement if i can get this
<whitequark>
the other simplifications make storage id tracking so much easier that it is no longer an issue
<whitequark>
make scope ids ordered as well, not a problem
<lsneff>
Will that complicate internal bookkeeping for you?
<whitequark>
marginally
<lsneff>
Don't worry about that now, and if it pops up for me during profiling we can think about it later
<whitequark>
ack
<whitequark>
let me know once you need some sample files--that is something i can provide easily :)
<lsneff>
Should a timestamp be required to be emitted for any value change blocks?
<whitequark>
it would be natural if it started at 0
<whitequark>
by construction it can only advance forward
<lsneff>
So, if the simulation started after zero, a timestep block would simply be emitted before any value changes
<whitequark>
i think so, yeah
<lsneff>
Sounds good
<whitequark>
this implies that the variables before the 1st timestep block have 'no value'
<whitequark>
gtkwave represents this as all-x even for like, 2-state variables
<lsneff>
I'm uncomfortable with that because it means more bookkeeping for me
<whitequark>
might make sense to treat it specially as just "no data available"
<whitequark>
hmm
<whitequark>
ah, you want the waveform viewer to simply not let you scroll before 0 or after the last block?
<whitequark>
that also works
<lsneff>
Well, no I don't think that's required
<lsneff>
poor ui
<whitequark>
wait, it wasn't gtkwave, it was dwfw
<lsneff>
It's 2:40am here, so I'm gonna head to bed. I'll start implementing this tomorrow or thursday
<whitequark>
cool!
Bertl_zZ is now known as Bertl
<falteckz>
I assume I can nest two FSMs, I assume I can nest the declaration of one FSM inside another during elaboration. I assume the default state for an FSM is the first State() defined. If an FSM is defined inside a parent FSM State, is the child FSM State reset to the default every time the parent State becomes active?
<whitequark>
no; it simply stops transitioning
<whitequark>
at the moment there is also the limitation that the inner FSM cannot cause a transition in the outer FSM directly
<whitequark>
this will be lifted
<falteckz>
So for the duration of the parent state, the child FSM must fully transition back to its initial state - or at least, transition to a safe reset state on the same clock as the parent FSM transitions out?
<falteckz>
i.e. child_next = "first_state", and parent_next = "get_outta_here" can happen on the same clock - and the child FSM will arrive at first state when the parent FSM returns to the nesting state in the future?
cr1901_modern1 has quit [Ping timeout: 264 seconds]
cr1901_modern has joined #nmigen
jeanthom has joined #nmigen
Bertl is now known as Bertl_oO
<whitequark>
yes, you can arrange them like that
<whitequark>
you just need an intermediate signal to express it
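A minimal sketch of that arrangement (module and state names are made up): the child FSM raises a flag, and the parent FSM uses it to leave its nesting state on the same edge:
```python
from nmigen import Elaboratable, Module, Signal

class Nested(Elaboratable):
    def elaborate(self, platform):
        m = Module()
        child_done = Signal()            # intermediate signal between the two FSMs

        with m.FSM(name="parent"):
            with m.State("IDLE"):
                m.next = "WORK"
            with m.State("WORK"):
                with m.FSM(name="child"):           # only transitions while parent is in WORK
                    with m.State("FIRST"):          # first State() is the reset state
                        m.next = "SECOND"
                    with m.State("SECOND"):
                        m.d.comb += child_done.eq(1)
                        m.next = "FIRST"            # child returns to its initial state...
                with m.If(child_done):
                    m.next = "DONE"                 # ...as the parent leaves WORK on the same edge
            with m.State("DONE"):
                pass
        return m
```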
Lofty has quit [Quit: Love you all~]
ZirconiumX has joined #nmigen
<falteckz>
I've noticed something I struggle with is knowing how much intermediate state I require to get a task done in hardware. It seems in a lot of cases the answer would be "add another flip-flop" but I have this plaguing thought that it's excessive and too complicated
<falteckz>
When dealing with constructs such as individual signals, rather than objects with many properties - it can look like a lot of data, but it's just that it's all declared flat
<whitequark>
yep, that's a thing
<whitequark>
i'm not sure what to say other than you'll get better intuition over time
<whitequark>
one useful thing is though: flops are generally not optimized by the toolchain. completely dead ones are normally removed but that's it without some special options
<whitequark>
if you put some piece of data in a flop, its meaning will not change, whereas for combinatorial networks, the optimizer might completely rewire them to something that looks unlike anything you wrote
<whitequark>
(unless you enable retiming, that is. *that* will change meaning of flops)
<falteckz>
Is that a suggestion towards encouraging combinatorial logic where appropriate?
<whitequark>
it is more of an observation
<whitequark>
flops are plentiful in FPGAs, there is at least one flop per LUT
<whitequark>
they are so plentiful that, save for the fact that they present an optimization barrier, using as many flops as possible can be a good idea
<falteckz>
You're right - I'm getting paranoid over a resource I have a lot of, and I guess the anxiety is really about the neatness of the code rather than resource utilization. Maybe I need to break it up a little more, but still use more flops
<falteckz>
Can I assign multiple .eq() to the same signal, in a single .sync += ? Will only the last one take place?