ianloic_ has quit [Read error: Connection reset by peer]
ianloic_ has joined #nmigen
lf_ has joined #nmigen
lf has quit [Ping timeout: 268 seconds]
Jay_jayjay has joined #nmigen
<falteckz>
Is there a function that knows the clock speed of a domain and will give me a number of periods for n time? Such that I could say - how many clock cycles for 100 milliseconds of 'sync' domain?
<falteckz>
I don't believe it's the responsibility of a submodule to know the clock speed, but I believe it's fair to ask the domain for it
<falteckz>
I guess getting the hertz of a domain is also sufficient
<falteckz>
Looks like ClockDomains are just wrappers around an IO port for a Clock - which makes sense, since an oscillator/clock would either be something global to the fabric or some IO
<_whitenotifier>
[YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://git.io/JLHqX
<agg>
ClockDomains don't necessarily have anything to do with an IO port, they can be fully internal
<whitequark>
falteckz: there isn't, and in general there cannot be
<whitequark>
since you can reconfigure a PLL at runtime
<whitequark>
for the default clock, you can use platform.default_clk_period
<whitequark>
(or default_clk_frequency)
<falteckz>
I think I want to avoid a PLL for now - will use platform properties.
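A minimal sketch of that approach (the Blinker class and its delay_cycles parameter are illustrative, not anything from this discussion): the submodule takes a plain cycle count, and the top-level build script derives it from platform.default_clk_frequency:
```python
from nmigen import Elaboratable, Module, Signal

class Blinker(Elaboratable):
    def __init__(self, delay_cycles):
        # the submodule never asks the domain for its frequency;
        # it just receives a cycle count
        self.delay_cycles = delay_cycles
        self.led = Signal()

    def elaborate(self, platform):
        m = Module()
        timer = Signal(range(self.delay_cycles + 1))
        with m.If(timer == self.delay_cycles):
            m.d.sync += [timer.eq(0), self.led.eq(~self.led)]
        with m.Else():
            m.d.sync += timer.eq(timer + 1)
        return m

# in the build script, where the platform is known:
#   cycles = int(platform.default_clk_frequency * 100e-3)  # cycles in 100 ms of 'sync'
#   top = Blinker(delay_cycles=cycles)
```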
<falteckz>
agg, by fully internal, it's still wired to some IO line internally - which is what I intended to say with "global". Unless I'm misunderstanding
<falteckz>
Unless you mean to say, a clock domain generated by logic - for which I shudder.
<agg>
or generated by a PLL
<agg>
or an internal oscillator primitive
<falteckz>
Right, yeah I intended to include those two options in the "something global"
<falteckz>
I perhaps used the wrong words
<agg>
no, fair enough
aquijoule_ has joined #nmigen
Bertl_oO is now known as Bertl_zZ
emeb has quit [Quit: Leaving.]
lkcl has quit [Ping timeout: 272 seconds]
lkcl has joined #nmigen
Jay_jayjay has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
<lsneff>
whitequark: maximum signal number?
<lsneff>
u32, u64?
<whitequark>
amount of signals? 4b is more than enough imo
<lsneff>
👍
<awygle>
i still think a specialization of ClockDomain that does know the frequency would be valuable
<falteckz>
Is there not a simulator platform that is passed to elaborate?
<whitequark>
awygle: everyone agrees it would be nice, but i'm not convinced it is possible to introduce in a way that does more good than harm
<whitequark>
falteckz: not yet
<awygle>
i understand
<whitequark>
awygle: if we got first-class PLL support, then a clock tree analysis could assign frequencies to clocks... but that can only happen after elaboration anyway
<whitequark>
so not actually useful to code that wants introspection
<whitequark>
if it was just a field that someone fills out with no guarantees backing it, then i think that passing frequencies around explicitly makes it more clear that there is no reason for this information to be correct, since, well, you are doing it yourself
<falteckz>
whitequark, perhaps that will work for me for now, passing the domain and the frequency to assume for that domain
<awygle>
why would a PLL propagating clock constraints have to wait until after elaboration?
<awygle>
(in the simple case)
<whitequark>
you've answered it yourself: because i am concerned not with the simple case but with the complex one
<falteckz>
*Shakes feature backlog at edge cases, angrily*
<whitequark>
so if the PLL is fed directly by an input pin with a constraint on it, sure
<awygle>
perhaps i wasn't clear when i said "a specialization of ClockDomain". i specifically intended that to mean "which only covers the simple cases, and which code can easily test for"
<whitequark>
oh
<whitequark>
ok, that's actually more reasonable
<whitequark>
i think the main problem with *that* plan is the fact that domains are late bound so it's hard to get at the actual ClockDomain object most of the time
<whitequark>
and passing it around is something we advise against
<whitequark>
so once we figure out something about that part, yes, that can probably be introduced
<awygle>
can you elaborate on why that is? i have personally found it confusing but i am sure that there's a good reason
<falteckz>
The period of 16MHz is 62.5 nanoseconds, right? So why is the simulated vcd showing me a clock period of 16,000 nanoseconds?
<whitequark>
sounds like period and frequency are mixed up somewhere
<falteckz>
Oh
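For reference, a minimal sketch of where such a mixup can creep in (assuming the nmigen.sim pysim API): Simulator.add_clock() takes a period in seconds, not a frequency:
```python
from nmigen import Module, Signal
from nmigen.sim import Simulator

m = Module()
counter = Signal(8)
m.d.sync += counter.eq(counter + 1)   # stand-in design so a 'sync' domain exists

sim = Simulator(m)
sim.add_clock(1 / 16e6)   # period in seconds: 62.5 ns for a 16 MHz clock
# sim.add_clock(16e-6)    # mistake: that is a 16 us period, i.e. 62.5 kHz
```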
<whitequark>
awygle: what specifically is?
<awygle>
the late boundedness of ClockDomain. the fact that they're referred to by name, often before they exist, rather than by the object
<whitequark>
awygle: it's a bunch of different reasons. consider this, if they were eagerly bound, you would effectively have to pass "sync" everywhere
<whitequark>
and it would get a lot harder to use Reset/EnableInserter as well as renaming domains
<whitequark>
(you could perhaps get rid of domain renaming, but not the former two)
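A minimal sketch of what late binding buys (the Counter submodule is illustrative): the same code can be retargeted to another domain or wrapped with an extra reset without ever holding a ClockDomain object:
```python
from nmigen import Elaboratable, Module, Signal, DomainRenamer, ResetInserter

class Counter(Elaboratable):          # uses m.d.sync only, by name
    def __init__(self):
        self.count = Signal(8)
    def elaborate(self, platform):
        m = Module()
        m.d.sync += self.count.eq(self.count + 1)
        return m

m = Module()
m.submodules.a = Counter()                           # runs in "sync" as written
m.submodules.b = DomainRenamer("pix")(Counter())     # same code, retargeted to "pix"
soft_rst = Signal()
m.submodules.c = ResetInserter(soft_rst)(Counter())  # extra reset condition injected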
<whitequark>
lsneff: i would very much like to not deal with 4-state logic
<whitequark>
high-performance simulators are almost never 4-state
<whitequark>
also, all the varints in the header make it harder to parse and emit for no good reason
<lsneff>
Ah, fair enough, will change the header.
<lsneff>
You're sure about the 4-state?
<whitequark>
i think only actual arbitrary width values (ie variables) should be encoded with varint
<whitequark>
yes. 4-state is very rarely used in high-performance RTL simulations, you would only have Z at the boundary, and almost nothing does internal X
<whitequark>
verilator cannot do internal X, afaik
<whitequark>
cxxrtl might one day, but so far there's been fairly little interest in it
electronic_eel has quit [Ping timeout: 246 seconds]
<lsneff>
A significant amount of space can be saved using liberal amounts of varint. The vcd you sent me has 2000001 timestamps. 4 bytes for each one is 8 MB to start.
electronic_eel has joined #nmigen
<lsneff>
3 state is the same as 4 state in this case, as far as I can tell
<lsneff>
same space, same encoding
<whitequark>
lsneff: hang on, we'll get to that
<whitequark>
in general, i think the philosophical issue with your format is that it tries to be a better VCD. i do not want a better VCD. i want a format actually fit for purpose
<whitequark>
the header is just cribbed from VCD in spite of the fact that the VCD one is badly thought out
<whitequark>
i would use one field in the header: the amount of femtoseconds per integer timestep
<lsneff>
Ah, I gotcha, I was basically trying to just be vcd, but better.
<lsneff>
Let's rethink this then
<whitequark>
please do not do that. put vcd in the trash bin of history and start over
<whitequark>
that timescale thing causes no end of problems for me
<lsneff>
On the other hand, I still want to be able to parse vcds with this tool, so I have to come up with something close enough to have a shared internal representation
<whitequark>
it's a choice between timescale that is insufficiently fine and misrepresents time, or insufficiently coarse and trips up consumers that expect timestamps to be roughly sequential (ie without huge gaps), like pulseview
<lsneff>
I agree with the timestamp thing
<whitequark>
the only reason that timescale field looks like that is because that's how verilog thinks about time. you do not have to repeat mistakes of verilog
<lsneff>
Why did verilog do it that way?
<whitequark>
there might not be a particular reason (i've read the HOPL3 paper on verilog but i don't recall anything on timescales)
<whitequark>
generally, applying chesterton's fence to verilog is a grave mistake
<whitequark>
lots of things in it are just... like that
<whitequark>
anyway
<whitequark>
regarding varint, i think all of the non-repeating part of the format should use fixint
<whitequark>
varint timestamps are ok, especially if they are deltas and not absolute times
<whitequark>
(since as far as i can tell your format is not self-synchronizing, unlike vcd, deltas make more sense)
<lsneff>
The timesteps are deltas
<lsneff>
But this is good stuff, I'm going through and changing it
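A minimal sketch of LEB128-style varint encoding applied to delta timestamps (the encoding details are illustrative, not the final spec):
```python
def encode_varint(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_timestamps(timestamps):
    # store deltas, not absolute times: small steps stay one byte long
    prev = 0
    out = bytearray()
    for t in timestamps:
        out += encode_varint(t - prev)
        prev = t
    return bytes(out)
```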
<whitequark>
for signal ids, i would prefer unique rather than strictly sequential
<whitequark>
this makes it possible for me to use some internal not necessarily sequential identifier in vcd files, potentially saving a lot of lookups
<whitequark>
(e.g. index into the internal debug info table)
<whitequark>
(it's discontiguous because of aliases)
<lsneff>
Okay, sure, no problem
<whitequark>
the "signal" is the value that physically changes, right?
<whitequark>
and the "variable" is the HDL name?
<whitequark>
(the terms should probably be made less confusing, but for now i'm fine if you just confirm it)
<lsneff>
Yep, that's correct, I'm happy to bikeshed that
<lsneff>
Would wire make more sense?
<whitequark>
"wire" is ambiguous with wire/reg distinction
<whitequark>
i would probably just use "signal" and "signal name" (also "scope name")
<whitequark>
wait
<whitequark>
why do signals (in your current scheme
<lsneff>
Hmm, so that's a little confusing as well, since I included a name for each signal, and a name for each variable
<whitequark>
) have names at all?
<falteckz>
Can I expect Memory() to just work in the simulator, including initial values?
<falteckz>
Does the simulation have to "load" first?
<whitequark>
lsneff: i think it is important to separate storage locations (which are just varying values) and the HDL names assigned to them (which are just names)
<whitequark>
falteckz: it should just work
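A minimal sketch of Memory with initial values in pysim (assuming the nmigen.sim import path; names are illustrative):
```python
from nmigen import Elaboratable, Module, Memory, Signal
from nmigen.sim import Simulator

class Top(Elaboratable):
    def __init__(self):
        self.mem = Memory(width=8, depth=4, init=[0x10, 0x20, 0x30, 0x40])
        self.addr = Signal(2)
        self.data = Signal(8)

    def elaborate(self, platform):
        m = Module()
        m.submodules.rdport = rdport = self.mem.read_port()
        m.d.comb += [rdport.addr.eq(self.addr), self.data.eq(rdport.data)]
        return m

top = Top()
sim = Simulator(top)
sim.add_clock(1 / 16e6)

def proc():
    yield top.addr.eq(1)
    yield; yield                       # allow the synchronous read port to update
    assert (yield top.data) == 0x20    # initial contents are there, no load step

sim.add_sync_process(proc)
sim.run()
```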
<lsneff>
They don't have to, I just thought it might be useful. e.g. This variable is called ack, but it's a range into the bus4_config_blah_blah
<whitequark>
nononono
<whitequark>
please don't give storage locations any names, that would just make life complicated for both of us
<whitequark>
and everyone who uses the format
<lsneff>
Okay, I'll remove it
<lsneff>
I get why you don't want it
<whitequark>
you know how gdb works? you have addresses (offsets into stack or something) and then you have source level names
<lsneff>
Yep, I gotcha
<whitequark>
(hey, you know what would be really cool? if i could include a wasm snippet rematerializing all the variables i optimized out *at the time when you are viewing them*. right now i do this during VCD capture, but it would be way cooler if it was done like DWARF bytecode)
<lsneff>
Hmm, I'll think about that
<whitequark>
it's definitely not MVP stuff
<whitequark>
just make the header versioned/extensible so we can include it later
<lsneff>
Ah, a format version header is a good idea, hadn't crossed my mind
<whitequark>
(i would probably go for a single u32 header that doubles as a file signature)
<whitequark>
(maybe u64 to make it less ambiguous)
<lsneff>
I really should be more on top of this, I've done plenty of spec implementation work, guess I've never had to spec one out before
<lsneff>
A magic, you mean?
<whitequark>
magic that doubles as a file version, yeah
<whitequark>
no strong opinion on the exact detail though
<lsneff>
That's interesting, I have mixed feelings about making different versions completely different formats
<awygle>
always version number always always always
<awygle>
either you'll never use it which is fine or you'll hate yourself for not including it
<whitequark>
^
<lsneff>
right
<whitequark>
for a streaming format that has hardcoded producers, versioning is really fine
<whitequark>
more fine than extensibility imo
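A rough sketch of the header shape discussed above, assuming a u64 magic that doubles as the version plus the femtoseconds-per-timestep field (the magic constant and field layout are placeholders, not anything agreed on here):
```python
import struct

MAGIC = 0x00_31_76_77_6D_67_6E_00   # doubles as format version; made-up constant

def write_header(f, femtoseconds_per_step):
    # u64 magic/version + u64 femtoseconds per integer timestep, little-endian
    f.write(struct.pack("<QQ", MAGIC, femtoseconds_per_step))

def read_header(f):
    magic, fs_per_step = struct.unpack("<QQ", f.read(16))
    if magic != MAGIC:
        raise ValueError("unknown file signature/version")
    return fs_per_step
```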
<lsneff>
How about storage instead of signal, and variable as it is currently
<whitequark>
no objection
<lsneff>
So, you'd prefer binary or trinary logic?
<awygle>
DWARF is one of those systems that is pretty well designed so i just assume designs taking inspiration from it are also good lol
<whitequark>
awygle: i think you will be booed by more than one person who had to implement DWARF
<whitequark>
let's put it this way, i am being careful in taking inspiration from it
<awygle>
(i'ma let this convo die off before i reengage on the clock domains thing but i do have more to say about it lol)
<whitequark>
lsneff: i think you should have 4-state bit vectors separate from 2-state ones
<lsneff>
whitequark: Ah, okay, that makes sense to me
<awygle>
i feel about DWARF roughly like i feel about e.g. (e)BPF, in that there are probably more elegant ways to provide the functionality but providing it at all is basically a miracle
<whitequark>
that's conflating representation with storage
<whitequark>
`enum Type { TwoState(usize), FourState(usize), String }` that's more like it
<whitequark>
*presentation with storage, sorry
<whitequark>
and then [lsb:msb] range, as well as enum values and such, would be a part of the variable
<whitequark>
awygle: it would be a greater miracle if compilers actually emitted complete DWARF information...
<whitequark>
... you know, like cxxrtl does :p
<lsneff>
enum values? So, binary/quaternary/utf8 would be an interpretation setting, not a storage setting?
<awygle>
:p
<whitequark>
lsneff: enums are not utf8
<whitequark>
storage type would actually be 2-state/4-state/string because this is what a simulator physically has when it emits a value
<whitequark>
most simulators are 2-state, some of them (or some variables) will be 4-state, so sometimes you'll encounter 4-state as well
<whitequark>
in cxxrtl, 2-state is just value<>, 4-state will be something like xvalue<> that is a pair of value<>s internally
<whitequark>
meanwhile, presentation type would be something like:
<whitequark>
signal name, then for integers, [msb:lsb] range, signed/unsigned, which values correspond to which symbolic names for enums, and which base the integer should be interpreted in
<whitequark>
the base might be excessive, but i think the others are necessary
<lsneff>
Okay
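A sketch of the storage/presentation split as just described (field names are placeholders; enum and string presentation types are omitted here):
```python
from dataclasses import dataclass
from enum import Enum

class StorageType(Enum):        # what the simulator physically emits
    TWO_STATE  = 0              # e.g. cxxrtl value<>
    FOUR_STATE = 1              # e.g. a pair of value<>s (bits plus x/z mask)
    STRING     = 2

@dataclass
class Storage:                  # a varying value, identified only by an id
    id: int
    type: StorageType
    width: int                  # in bits; unused for STRING

@dataclass
class IntegerVariable:          # presentation: how an HDL-level name is shown
    name: str
    storage_ids: list           # storages concatenated to form the value
    msb: int
    lsb: int
    signed: bool = False
    base: int = 10              # display base (arguably excessive, per above)
```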
* awygle
gestures vaguely at folders labeled "CSS" and "database normalization"
<whitequark>
eh
<whitequark>
i'm pretty sure of how exactly i'd like the header, the storage format, and the value change format to look
<whitequark>
i'm less sure about presentation
futarisIRCcloud has joined #nmigen
<whitequark>
the only thing i'm really certain about presentation is that a single HDL-level name must be possible to represent as an arbitrary sequence of storages
<lsneff>
multiple storages concatenated?
<whitequark>
yeah
<whitequark>
it is necessary to be able to split storages to achieve maximum performance in a simulator like cxxrtl
PyroPeter_ has joined #nmigen
<whitequark>
so it follows that it must be possible to reconstruct the value from the debug info
<lsneff>
Makes sense, not a problem
<lsneff>
Can you say more about the wasm debugging info thing?
<whitequark>
i would personally be very fine if for an MVP you just had that, the ability to concatenate storages
<whitequark>
right, ok
<whitequark>
so you know how gdb tells you $1 = <value optimized out> sometimes?
<lsneff>
Yep, definitely
<whitequark>
cxxrtl also optimizes out values sometimes (a lot of the time actually)
<whitequark>
but what it also does is it emits an additional function that recomputes all of the optimized-out values from the persistent state
<lsneff>
Ah, very interesting
<whitequark>
which a cxxrtl debug info consumer can call to rematerialize all of those
<whitequark>
the VCD waveform writer currently calls it on each step
PyroPeter has quit [Ping timeout: 256 seconds]
PyroPeter_ is now known as PyroPeter
<lsneff>
I see what you're getting at
<lsneff>
Yeah, I think that's a solid idea
<whitequark>
in theory, if i emitted that function into wasm, and emitted the wasm into the waveform dump, and added some correspondence between storages and wasm inputs/outputs... waveform dumping could be greatly sped up
<lsneff>
Made more difficult by the lack of interface types
<whitequark>
i think i could basically give you a sequence of storage IDs
<whitequark>
you would lay them out in a memory, then the function will read storage IDs that it needs, and write back the ones it computes
<whitequark>
the interface would be basically an array of u32 and a bunch of indexes into it
<whitequark>
the other thing i eventually want a good waveform viewer to do is to support a sort of client/server model
<whitequark>
ie: the viewer tells the model the range it wants to render (with an optional stride), the model does its best to reconstruct that range from the saved state by loading checkpoints and simulating
<whitequark>
basically, if you are mipmapping anyway, why even compute the values you're about to average out?
<whitequark>
i want it to be possible for a 4 billion point trace of a SoC that boots Linux to be compressed to like a few hundred MB with checkpoint/restore
<whitequark>
combined with rematerialization, of course
<whitequark>
(in case of the client/server model you don't need wasm, you can just ask the model directly)
<whitequark>
(so it might make more sense to focus on that as it gives a greater benefit)
<lsneff>
The model, as in the cxxsim instance itself?
<whitequark>
yeah
<whitequark>
while you're running the model, it checkpoints say every 100000 cycles. then once you ask for cycles 150000..170000, it loads a checkpoint, runs for a while without recording anything, then dumps waveforms for 20k requested cycles only
<whitequark>
if you just want to render an overview of a very large trace this can be extremely efficient because waveform dumping, especially full dumps, is an order of magnitude slowdown if not more
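A conceptual sketch of the checkpoint/replay idea (the save_state/load_state/step calls are hypothetical, not an existing cxxrtl API):
```python
CHECKPOINT_INTERVAL = 100_000

def record(model, total_cycles):
    checkpoints = {}
    for cycle in range(total_cycles):
        if cycle % CHECKPOINT_INTERVAL == 0:
            checkpoints[cycle] = model.save_state()   # snapshot only, no waveforms
        model.step()
    return checkpoints

def replay(model, checkpoints, start, end, dump):
    # restore the nearest checkpoint at or before `start`, run silently up to
    # `start`, then dump waveforms only for the requested window
    base = (start // CHECKPOINT_INTERVAL) * CHECKPOINT_INTERVAL
    model.load_state(checkpoints[base])
    for cycle in range(base, start):
        model.step()
    for cycle in range(start, end):
        model.step()
        dump(model, cycle)
```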
Degi_ has joined #nmigen
Degi has quit [Ping timeout: 260 seconds]
Degi_ is now known as Degi
<lsneff>
That's clever, to collect a bunch of snapshots, and then display that, instead of mipmapping the whole thing down anyhow
<whitequark>
also you can snapshot eg one out of 100k cycles and then, when replaying, give one out of 1k cycles to the viewer
<whitequark>
since the majority of time is spent recording VCD, not simulating
<lsneff>
How much slower do you think running a cxxsim within wasm would be than natively?
<whitequark>
3-5 times slower, I measured
<lsneff>
Is that with wasm simd enabled?
<whitequark>
nope
<whitequark>
not sure if simd would help much
<whitequark>
native binaries (even with -march) barely use it
<lsneff>
Gosh, hmm, I wonder what's slowing it down so much
<lsneff>
Should enum be a separate interpretation type from integer?
<whitequark>
I think so yeah
<lsneff>
Good, that makes it easier in the spec
cr1901_modern1 has joined #nmigen
cr1901_modern has quit [Ping timeout: 272 seconds]
<falteckz>
Yay! Updating an RGB LED strip at 170 Hz
<whitequark>
lsneff: thanks, i like the text description a lot more
<lsneff>
I enjoyed putting together the ascii art, but it turned out to be hellish to actually read
<whitequark>
yep
<lsneff>
the text format is roughly similar to the way the wasm spec is laid out
<whitequark>
lsneff: i think you don't strictly need integer-variable-data.msb
<whitequark>
you only need the lsb offset, and then storage chunks give you the msb
<lsneff>
What if you want a range that's less than the size of a single storage?
<whitequark>
hmmm
<whitequark>
so lsb indexes into the concatenated storages, with the first storage starting at bit 0, right?
<whitequark>
what if i have a variable defined in verilog like `wire [7:3] x;` where the low 3 bits simply aren't?
<lsneff>
it bit-indexes into storage(s), so wire [7:3] x; would simply be msb=7, lsb=3
<lsneff>
Well, hmm
<lsneff>
I didn't think about a storage not starting at 0
<lsneff>
what's the use case for that anyhow
<whitequark>
i just explained!
<whitequark>
wire [7:3] would have a 5-bit storage
<lsneff>
Right, but why not wire [4:0]?
<lsneff>
I get that verilog supports that
<lsneff>
anyhow, doesn't matter
<whitequark>
suppose you have an address bus where every access is 4-byte aligned
<whitequark>
it'll be like wire addr[31:2];
<whitequark>
and you can still assign unshifted addresses to it
<lsneff>
Ooo, fancy. That's interesting
<lsneff>
Does nmigen support that too?
<whitequark>
nipe
<whitequark>
*nope
<whitequark>
it potentially could, but this would interact in complex ways with other stuff and i'm not sure if it makes a good addition to the language
<lsneff>
Okay, so that sounds to me like "x[7:3]" would be the name of the variable in that case, and it would alias a 5 bit storage
<lsneff>
lsb and msb would be 0 and 4
emeb_mac has quit [Quit: Leaving.]
<lsneff>
at least, that's how that would fit into the format as specified
<whitequark>
so you're taking a note out of vcd's book, right?
<whitequark>
this is kind of... i don't entirely like it, but i am not opposed to it either, since i am not writing vcd viewers mostly
<whitequark>
basically, you'll have to support things like "take the bit 6 out of a variable named x[7:3]"
<whitequark>
so there is a mini language hidden in the string that is the variable name, which you will need to parse
<lsneff>
Yeah, I get that you feel queasy about that
<whitequark>
and now consider memories
<whitequark>
where the ranges are stacked a few levels deep
<lsneff>
Okay, how about each storage has a least significant bit idx as well?
<whitequark>
that would be a perfect fit with cxxrtl's model
<lsneff>
👍I'll add it to the spec
<whitequark>
the case where storages would overlap i think you can make a hard error
Sid___ has left #nmigen [#nmigen]
<whitequark>
(maybe mark that one variable as illegal to keep the file viewable)
<whitequark>
quaternary enums aren't a thing
<whitequark>
that is meaningless in terms of synthesis semantics
<lsneff>
Ah, you're right there, doesn't make sense
<whitequark>
or not even synthesis, just meaningless
<whitequark>
one could argue for *don't cares* in enum specifications but i've never seen that supported by any other tool
<whitequark>
and i am strongly opposed to conflating don't cares with x and z
<lsneff>
Okay, hold up actually, overlapping storages?
<whitequark>
if you just have bit patterns for enum variants that works perfectly
<whitequark>
oh
<falteckz>
Is there a construct that can be treated like a list of elements (bytes) or a single unsigned value? Specifically [unsigned(8), unsigned(8), unsigned(8)] when indexed, but unsigned(24) when not?
<whitequark>
lsneff: well imagine a variable that consists of two 2-bit storages, with lsb 0 and lsb 1
<whitequark>
falteckz: have you seen .word_select() ?
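A minimal sketch of .word_select() for the question above:
```python
from nmigen import Module, Signal

m = Module()
rgb = Signal(24)                     # behaves as unsigned(24) when used whole
idx = Signal(range(3))
byte = Signal(8)

m.d.comb += byte.eq(rgb.word_select(idx, 8))    # rgb[idx*8 : idx*8 + 8]
m.d.sync += rgb.word_select(idx, 8).eq(0xFF)    # word_select() is assignable too
```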
<whitequark>
lsneff: i thought about this and i think this would have mattered with your previous design
<whitequark>
but with your current design, i think every variable can just come with a list of multiple storages
<whitequark>
there's no downside, they would be transparently concatenated when reading
<whitequark>
and everything on top would work with the bit vector
<whitequark>
that said
<whitequark>
it is extremely unlikely that any enum ends up being split
<lsneff>
Except that having multiple storages and each storage having a lsb that's possibly not zero is complicated
<whitequark>
but you can handle that just in one place, no?
<lsneff>
Yeah, I believe so
<whitequark>
if cxxrtl gets a TBD emitter, i suspect it will emit all integer variables as multistorage
<whitequark>
although i guess i could special-case it
<whitequark>
bottom line, i think integers should be handled uniformly at least
<whitequark>
i can see some use cases for multistorage strings
<whitequark>
though tenuous
<whitequark>
hm, let me think about it
<lsneff>
Multistorage makes everything more complex, since I can't hand around regular byte slices anymore
<whitequark>
ok, i see
<whitequark>
so how about: one type INTEGER, with >=1 storage-ids, and the case of 1 storage-id specially handled in the viewer itself
<whitequark>
(a case of 1 storage-id still can have nonzero storage-lsb, so it would not complicate things much, i think?)
<whitequark>
and then you have ENUM and STRING and stuff none of which support multistorage
<whitequark>
i would be totally fine with that.
<lsneff>
Okay, that works for me
<lsneff>
So, integers have 1 or more storages, and only the first one can have a non-zero lsb
<whitequark>
for enums, it would be very easy for me (i don't need to do anything) to give you the bit patterns that correspond to the storage-lsb
<whitequark>
ie you ignore storage-lsb for enums and just compare the numbers with the enum specification
<whitequark>
(ok, you'll need storage-lsb if the user asks for a numeric enum value)
<whitequark>
lsneff: re integers, i think that won't work
<whitequark>
because that makes storages tied to the variable
<whitequark>
basically, if every storage is self-contained, its lsb is the absolute lsb within some RTL signal
<whitequark>
suppose a RTL signal is composed out of 3 storages, for bits 7:5, 4:2, and 1:0
<whitequark>
suppose a variable aliases bits 7:2 of that signal
<whitequark>
it must be possible for it to only list the two higher order parts in storage-ids
<whitequark>
and suppose another variable aliases only bits 7:5
<whitequark>
it must be possible for it to only list one storage-id
<whitequark>
which means that the storage-lsb for them should be 5, 2, and 0
<whitequark>
makes sense?
<lsneff>
Ah, I understand that
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
<lsneff>
So, just make sure the storages [lsb:lsb+length] are contiguous with one another
<lsneff>
got it
<whitequark>
yep pretty much, i don't think variables with holes is something that makes sense to support
<whitequark>
just mark it as broken in the viewer
<whitequark>
or with overlaps
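A sketch of the storage-lsb scheme from the example above: a signal split into storages for bits 7:5, 4:2, and 1:0, with a contiguity check that also catches holes and overlaps (helper names are made up):
```python
storages = {
    1: {"lsb": 5, "width": 3},   # bits 7:5
    2: {"lsb": 2, "width": 3},   # bits 4:2
    3: {"lsb": 0, "width": 2},   # bits 1:0
}

def check_contiguous(storage_ids):
    # each storage must start where the previous one ended: no holes, no overlaps
    parts = sorted((storages[i]["lsb"], storages[i]["width"]) for i in storage_ids)
    for (lsb, width), (next_lsb, _) in zip(parts, parts[1:]):
        if lsb + width != next_lsb:
            raise ValueError("variable has holes or overlapping storages")

def read_variable(storage_ids, values):
    # values maps storage id -> current integer value of that storage
    check_contiguous(storage_ids)
    base = min(storages[i]["lsb"] for i in storage_ids)
    result = 0
    for i in storage_ids:
        result |= values[i] << (storages[i]["lsb"] - base)
    return result

# a variable aliasing bits 7:2 lists storage ids [2, 1]; bits 7:5 lists just [1]
```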
<lsneff>
👍
<whitequark>
if you implement the format we just agreed on (well the parts we agreed on), i will be able to make cxxrtl debug info *even more efficient*
<whitequark>
right now it can only represent full aliases
<whitequark>
s/will/may be/, i'd have to benchmark it first
<lsneff>
Awesome
<lsneff>
Yep, give me a sec to push the modified spec, and then I'll start implementing it tomorrow probably
<whitequark>
looked through the rest, seems essentially fine
<whitequark>
the only constraint is that a scope must be defined before it is used
<whitequark>
right now the VCD writer has to do some annoying bookkeeping that it really oughtn't do
<whitequark>
and it would get much worse if e.g. i tried to compose multiple cxxrtl models
<lsneff>
There won't be enough scopes and variables for extra space to matter, right?
<whitequark>
nope. think of it this way, that was written by a human (usually) and goes into UI for a human
<whitequark>
like a tree view
<whitequark>
that puts an inherent limit on the amount of scopes, i mean if you have a megabyte of variable names, that must have come from some gargantuan RTL model that will dwarf the header in size
<whitequark>
wait. waaait value-change-blocks is an unsized-vec?
<lsneff>
Yeah, I just didn't want you to have to declare its length beforehand
<lsneff>
Since you wouldn't necessarily know it
<whitequark>
oh nvm i misparsed that
<whitequark>
do we need initial-values?
<whitequark>
i guess we do
<lsneff>
I guess that's another vcd paradigm
<whitequark>
or... hm
<whitequark>
i think it would be more elegant to get rid of it
<whitequark>
the first value-change-block can serve that purpose
<whitequark>
the main consequence is that you must (in the viewer) handle the case where e.g. at time 0, the values are not yet known, because the simulation starts at nonzero time
<lsneff>
I guess since the timestamps are deltas, that makes sense
<whitequark>
starting a simulation at nonzero time is well-defined by the way
<whitequark>
you may have restored a checkpoint
<lsneff>
Ah
<lsneff>
I guess that's just something to deal with
<whitequark>
i would also strongly recommend adding a header to value-change-block
<whitequark>
there's $dumpon/$dumpoff in VCD which actually make sense for checkpoint/restore mipmapping as we discussed earlier
<whitequark>
and also i would like to add printf capability to cxxrtl
<whitequark>
which would need its own block in the dump
<whitequark>
assertion failures to be recorded as well
<lsneff>
you'd like to intersperse different kinds of blocks
<whitequark>
correct
<lsneff>
alright
<whitequark>
it might actually make sense to have timestep as a separate block
<whitequark>
so: advance time, record changed values, show printed stuff / failed assertions / etc in the future
<lsneff>
Having multiple value changes for a storage in a single timestep is a weird edge case
<whitequark>
yes, vcd viewers call this a "glitch"
<whitequark>
some of them in some cases detect it and complain, some of them swallow it
<whitequark>
the ability to represent it is actually useful
<whitequark>
it can happen with certain kinds of unusual combinatorial logic and aid debugging
<whitequark>
for MVP, "last write wins" is perfectly good semantics
<whitequark>
lsneff: ah, so re storages/scopes/variables this is not quite what i meant
<whitequark>
rather than three separate vecs, i would like to have a single vec where each entry is one of three variants
<lsneff>
Ah, I see
<whitequark>
essentially, i am trying to stream as much data as possible while keeping as little as possible in the state
<whitequark>
internally the cxxrtl vcd dumper currently uses the physical address of storage in memory as the key
<whitequark>
when it first encounters it, it emits a new storage, and each time it encounters it, it emits a variable as appropriate
<whitequark>
that said, i could live with three vecs if you'd rather have that
<whitequark>
still better than what vcd does, by a large margin
<lsneff>
The three variants are scope, variable, and... ?
<whitequark>
and storage
<lsneff>
Ah, okay
<lsneff>
I'm not a huge fan, but if you think it would simplify, happy to do it
<whitequark>
can you elaborate why you dislike it? maybe i agree
<whitequark>
mostly i suggest it because that would be very natural for a waveform dumper to emit during initialization
<whitequark>
but it's possible i'm missing something
<lsneff>
It just seems a little inefficient to me, but it's not really, especially since this is just a small part of the file
<whitequark>
ah i see, the enum discriminants
<lsneff>
So, are you going to know how many declarations there will be up front?
<whitequark>
that... is an excellent question
<whitequark>
and now that you asked it, i think i can clarify the decision i'm thinking about here
<whitequark>
gimme a sec
<lsneff>
sure
<whitequark>
i see two major options here. the first, in which i have to know the amount of declarations (aggregate or per-type) upfront, requires me to do that bookkeeping no matter what
<whitequark>
in that case, i think we can just keep it as three vectors, it doesn't make any difference.
<lsneff>
I can add an END_DECLARATIONS discriminant
<whitequark>
yes, but i want to extend that concept a bit
<whitequark>
basically, unify the discriminants of declarations and value changes. this would *greatly* simplify the code in cxxrtl that emits stuff, not only because it now needs zero bookkeeping, but also because there is no longer any constraint as to when the end user may be adding signals--you can start dumping a bunch of stuff, for example a memory, in the middle of a capture if you want
<whitequark>
but more than that, it would help nmigen's pysim even more
<whitequark>
because you can add Signal()s at runtime in Python processes, pysim does not and cannot know of every Signal() that needs to be dumped upfront
<lsneff>
Ah, hmm
<whitequark>
right now, if you do not inform the VCD dumper of every Python-only Signal upfront, you simply cannot dump them
<whitequark>
(well you sort of can but no one will do it because it's too annoying)
<whitequark>
if i could tell the viewer in the middle of a capture "oh and by the way here's a new signal" that would *massively* improve edge case experience
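A sketch of the unified-discriminant idea: declaration and capture blocks share one stream, so a producer can declare a signal it only just learned about in the middle of a capture (block names, numeric values, and the write_block helper are all placeholders):
```python
from enum import IntEnum

class BlockKind(IntEnum):
    SCOPE        = 0   # declaration blocks...
    STORAGE      = 1
    VARIABLE     = 2
    TIMESTEP     = 3   # ...interleaved with capture blocks
    VALUE_CHANGE = 4
    MESSAGE      = 5   # e.g. printf output / assertion failures, later

def emit_late_signal(stream, storage, variable, timestep_delta, value):
    # a producer declares a storage/variable it only just learned about,
    # then immediately records a value for it
    stream.write_block(BlockKind.STORAGE, storage)
    stream.write_block(BlockKind.VARIABLE, variable)
    stream.write_block(BlockKind.TIMESTEP, timestep_delta)
    stream.write_block(BlockKind.VALUE_CHANGE, value)
```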
<lsneff>
I want to be careful of how difficult it will be to implement efficiently as well
<whitequark>
yes, i understand
<whitequark>
which is why i offer two options
<whitequark>
the third reason why i think unified discriminants might make sense is because of the client/server architecture i mentioned earlier
<whitequark>
the streamed waveform dump file structure naturally maps to a socket
<whitequark>
it'd still need defining the other direction, but the data that flows to a viewer would have the exact same structure
<lsneff>
This + non-ordered storage ids means that I can't use the id to index an array of variable datas, and will probably have to use a hashmap instead
<lsneff>
Which may add a lot of overhead, but I'm not sure
<lsneff>
That's only when parsing though
<lsneff>
So, probably okay
<whitequark>
hmm
<lsneff>
Alright, I'll unify the enumerations
<whitequark>
i will happily ditch the non-ordered storage id requirement if i can get this
<whitequark>
the other simplifications make storage id tracking so much easier that it is no longer an issue
<whitequark>
make scope ids ordered as well, not a problem
<lsneff>
Will that complicate internal bookkeeping for you?
<whitequark>
marginally
<lsneff>
Don't worry about that now, and if it pops up for me during profiling we can think about it later
<whitequark>
ack
<whitequark>
let me know once you need some sample files--that is something i can provide easily :)
<lsneff>
Should a timestamp be required to be emitted for any value change blocks?
<whitequark>
it would be natural if it started at 0
<whitequark>
by construction it can only advance forward
<lsneff>
So, if the simulation started after zero, a timestep block would simply be emitted before any value changes
<whitequark>
i think so, yeah
<lsneff>
Sounds good
<whitequark>
this implies that the variables before the 1st timestep block have 'no value'
<whitequark>
gtkwave represents this as all-x even for like, 2-state variables
<lsneff>
I'm uncomfortable with that because it means more bookkeeping for me
<whitequark>
might make sense to treat it specially as just "no data available"
<whitequark>
hmm
<whitequark>
ah, you want the waveform viewer to simply not let you scroll before 0 or after the last block?
<whitequark>
that also works
<lsneff>
Well, no I don't think that's required
<lsneff>
poor ui
<whitequark>
wait, it wasn't gtkwave, it was dwfw
<lsneff>
It's 2:40am here, so I'm gonna head to bed. I'll start implementing this tomorrow or thursday
<whitequark>
cool!
Bertl_zZ is now known as Bertl
<falteckz>
I assume I can nest two FSMs, I assume I can nest the declaration of one FSM inside another during elaboration. I assume the default state for an FSM is the first State() defined. If an FSM is defined inside a parent FSM State, is the child FSM State reset to the default every time the parent State becomes active?
<whitequark>
no; it simply stops transitioning
<whitequark>
at the moment there is also the limitation that the inner FSM cannot cause a transition in the outer FSM directly
<whitequark>
this will be lifted
<falteckz>
So for the duration of the parent state, the child FSM must fully transition back to its initial state - or at least, transition to a safe reset state on the same clock as the parent FSM transitions out?
<falteckz>
i.e. child_next = "first_state", and parent_next = "get_outta_here" can happen on the same clock - and the child FSM will arrive at first state when the parent FSM returns to the nesting state in the future?
cr1901_modern1 has quit [Ping timeout: 264 seconds]
cr1901_modern has joined #nmigen
jeanthom has joined #nmigen
Bertl is now known as Bertl_oO
<whitequark>
yes, you can arrange them like that
<whitequark>
you just need an intermediate signal to express it
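A minimal sketch of that arrangement (module and state names are made up): the child FSM raises a flag, and the parent FSM uses it to leave its nesting state on the same edge:
```python
from nmigen import Elaboratable, Module, Signal

class Nested(Elaboratable):
    def elaborate(self, platform):
        m = Module()
        child_done = Signal()            # intermediate signal between the two FSMs

        with m.FSM(name="parent"):
            with m.State("IDLE"):
                m.next = "WORK"
            with m.State("WORK"):
                with m.FSM(name="child"):           # only transitions while parent is in WORK
                    with m.State("FIRST"):          # first State() is the reset state
                        m.next = "SECOND"
                    with m.State("SECOND"):
                        m.d.comb += child_done.eq(1)
                        m.next = "FIRST"            # child returns to its initial state...
                with m.If(child_done):
                    m.next = "DONE"                 # ...as the parent leaves WORK on the same edge
            with m.State("DONE"):
                pass
        return m
```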
Lofty has quit [Quit: Love you all~]
ZirconiumX has joined #nmigen
<falteckz>
I've noticed something I struggle with is knowing how much intermediate state I require to get a task done in hardware. It seems in a lot of cases the answer would be "add another flip-flop" but I have this plaguing thought that it's excessive and too complicated
<falteckz>
When dealing with constructs such as individual signals, rather than objects with many properties - it can look like a lot of data, but it's just that it's all declared flat
<whitequark>
yep, that's a thing
<whitequark>
i'm not sure what to say other than you'll get better intuition over time
<whitequark>
one useful thing is though: flops are generally not optimized by the toolchain. completely dead ones are normally removed but that's it without some special options
<whitequark>
if you put some piece of data in a flop, its meaning will not change, whereas for combinatorial networks, the optimizer might completely rewire them to something that looks unlike anything you wrote
<whitequark>
(unless you enable retiming, that is. *that* will change meaning of flops)
<falteckz>
Is that a suggestion towards encouraging combinatorial logic where appropriate?
<whitequark>
it is more of an observation
<whitequark>
flops are plentiful in FPGAs, there is at least one flop per LUT
<whitequark>
they are so plentiful that, save for the fact that they present an optimization barrier, using as many flops as possible can be a good idea
<falteckz>
You're right - I'm getting paranoid over a resource I have a lot of, and I guess the anxiety is really about the neatness of the code rather than resource utilization. Maybe I need to break it up a little more, but still use more flops
<falteckz>
Can I assign multiple .eq() to the same signal, in a single .sync += ? Will only the last one take place?