<Sarayan>
I'm not sure how it can work, tbh. The common control on the 10/20 LUTs 40 FF can invert the clocks in addition to routing stuff, so it's not an entirely routing-only thing
<Lofty>
Remember, bels are for placement
<Sarayan>
yes and no
<Sarayan>
we don't have "FF that can invert its clock", we have "control group common to 40 FFs that can invert 3 clock lines"
<Sarayan>
so there's a hierarchical placement... thing that also does a change on the signal (e.g. inversion)
<Sarayan>
you can't just pretend the FFs can invert their clocks on input, because there's only 3 clock lines for 40 FFs
<Sarayan>
while the 3 clock lines limit can be seen as a pure placement thing, the optional inversion is where I don't know how things work out
<Lofty>
daveshah: does 1 BEL mean exactly 1 thing can be placed on it?
<daveshah>
Yes
<Lofty>
Ugh
<Lofty>
This means I have to do subdivision to impedance-match mistral and nextpnr
<daveshah>
So generally your BELs would be a small unit
<daveshah>
Yes, that's pretty much inevitable
<Lofty>
<Sarayan> while the 3 clock lines limit can be seen as a pure placement thing, the optional inversion is where I don't know how things work out
<Lofty>
So, here's my understanding: nextpnr would ask for a clock line to carry signal X, another for signal Y, and another for signal Z
<Lofty>
If it tries to add another one, you reject that as impossible
<daveshah>
Yeah that's how constraints work atm
<daveshah>
You basically have a couple of functions that return false if things aren't legal
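A minimal sketch of the kind of check described above, assuming hypothetical types (`ClockReq`, `ClockGroup`) that are not part of nextpnr or mistral. It models a control group with three clock lines and treats the optional inversion as part of what a line carries, which is an assumption about how the inversion could be folded into placement legality:

```cpp
#include <array>
#include <cassert>
#include <optional>

// Hypothetical: one clock requirement, a net id plus whether it is inverted.
struct ClockReq {
    int net;
    bool inverted;
    bool operator==(const ClockReq &o) const { return net == o.net && inverted == o.inverted; }
};

// Hypothetical model of the control group shared by the FFs of one LAB:
// three clock lines, each either free or carrying one (net, inversion) pair.
struct ClockGroup {
    std::array<std::optional<ClockReq>, 3> lines;

    // True if `req` can be served by a line already carrying it or by a
    // free line. Does not modify the group.
    bool canAccept(const ClockReq &req) const {
        for (const auto &l : lines)
            if (!l || *l == req)
                return true;
        return false;
    }

    // Claims a line for `req`; returns false if a fourth line would be needed.
    bool accept(const ClockReq &req) {
        assert(req.net >= 0); // assumed convention: net ids are non-negative
        for (const auto &l : lines)
            if (l && *l == req)
                return true; // already carried by an existing line
        for (auto &l : lines)
            if (!l) {
                l = req;
                return true;
            }
        return false;
    }
};
```

In this model the same net used both inverted and non-inverted takes two of the three lines, so a placement needing a fourth distinct pair is rejected.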
<Lofty>
Does the Z coordinate have to be strictly sequential?
<Lofty>
I suppose it does
<daveshah>
No it doesn't
<daveshah>
In fact, encoding information in different bit ranges might even be useful
<daveshah>
Have a look at nexus or nextpnr-xilinx for examples of that
<Lofty>
I was considering encoding hierarchy information in the Z coordinate
<daveshah>
Yes that's probably a good idea
<Sarayan>
the routing is (2 input clock lines + 2 input data lines) -> (2 selected clock lines) -> (3 clock lines built from one of the two selected + optional inverter)
<Lofty>
I feel like this might need a LAB diagram to think things through though
<Lofty>
Or I can do my usual approach of winging it :P
<Sarayan>
yeah, I still have drawing it in my todo list
<Lofty>
...Do you think I can get away with temporarily pretending that anything that isn't a LAB has zero BELs?
<daveshah>
Yes
<daveshah>
Well you probably want IO too
<Lofty>
True
<Lofty>
I guess for I/O you can get away with one pin = one BEL?
<daveshah>
yeah
<daveshah>
and no need for any registers or anything, just the basic IO buffer
<Lofty>
Sarayan: are the GPIOs on the tile grid?
<Lofty>
Because if they're not, then that's going to be a headache
<daveshah>
mm, nextpnr assumes everything is one homogeneous tilegrid
<Lofty>
Or else you need to resize the tilegrid
<daveshah>
yeah
<Lofty>
Actually, what am I thinking.
<Lofty>
The gpio_get_pos function takes in a pos_t
<Lofty>
Which is a grid coordinate
<Lofty>
So the answer is yes :P
<daveshah>
:)
<Lofty>
By the looks of it, an I/O tile corresponds to 2-4 I/O pins
<daveshah>
that sounds very typical
<Sarayan>
up to 4 yes
<Sarayan>
if it's less than four, which numbers are missing is close to random
<daveshah>
probably just depends on where they could squeeze pads in
<Lofty>
Is it better to just straight-up return 4, or to give an actually accurate number of bels?
<daveshah>
straight-up return 4
<daveshah>
at worst you'll allocate a tiny bit more memory than needed
<Lofty>
I was wondering if it would have algorithmic impacts like wasting time trying to place on a BEL that doesn't exist
<Sarayan>
most are 4, so a really tiny bit
<daveshah>
getGridDimZ doesn't affect placement
<daveshah>
it's only about allocating arrays of the right size
<Sarayan>
GPIOs are fixed placement in any case
<daveshah>
placement will only consider bels that actually exist; based on one of the other functions (I think it actually just goes over all the bels with getBels iirc)
<Sarayan>
nextpnr is not the one deciding which pin does what
<daveshah>
yeah that too
<daveshah>
occasionally in the early stages of a design you let it place IO if you just want a rough idea
<Sarayan>
sure, but not the usual case :-)
<daveshah>
indeed
<Sarayan>
Lofty, you have your new method
<Lofty>
Thanks Sarayan
<Lofty>
So you're using a vector to indicate zero-or-one, right?
<Sarayan>
can be more than one
<Lofty>
Aaaa
<Sarayan>
there can be only one inner block, but there can be multiple peripheral blocks
<Lofty>
"peripheral block"?
<Sarayan>
lab/mlab/m10k/dsp there can be only one and it will be alone though
<Sarayan>
fpll, cmux*, gpio, etc
<Lofty>
<Sarayan> lab/mlab/m10k/dsp there can be only one and it will be alone though <-- this is the bit I care about
<Sarayan>
if you look at the floor plan they're on the periphery, and they're configured by the pram
<Sarayan>
Yeah, I guessed that :-)
<Lofty>
Sure, but I'm pretty confident nextpnr will just walk the entire tilegrid
<Lofty>
So those periphery cases are ones I have to deal with
<Sarayan>
sure
<Sarayan>
there are things like the clock muxes you want handled by nextpnr too at some point
<Sarayan>
and the gpios are there of course
<Sarayan>
(there's not entirely enough information in the lib/docs to handle clock muxes yet, sorry)
<Lofty>
So, what, should I just go through the vector until I find something I recognise?
<Lofty>
Or do I use tile types instead? :P
<Lofty>
Okay, next on the list is getTilePipDimZ
<Sarayan>
hmmm, list of rnodes per tile is not something we have
<daveshah>
Almost certainly just a return 1
<daveshah>
pip Zs do not need to be unique
<Lofty>
Ah, that's useful to know
<daveshah>
They are only if you need to group pips at a finer level than tile (eg for some complex partial reconfiguration cases)
<Lofty>
I'm going to arbitrarily start with 60 BELs per LAB: 10 ALMs per LAB, 2 LUT outputs plus 4 flop outputs per ALM
<daveshah>
Yes that's how I'd split it up
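A small sketch of how the 60 BELs per LAB could be indexed. The counts come from the discussion; the ordering (LUT outputs first, then flops) and the helper names are assumptions for illustration only:

```cpp
#include <cassert>

// Counts taken from the discussion: 10 ALMs per LAB, each with
// 2 LUT outputs and 4 flops.
constexpr int kAlmsPerLab = 10;
constexpr int kLutsPerAlm = 2;
constexpr int kFfsPerAlm = 4;
constexpr int kLutsPerLab = kAlmsPerLab * kLutsPerAlm;                 // 20
constexpr int kBelsPerLab = kAlmsPerLab * (kLutsPerAlm + kFfsPerAlm);  // 60

static_assert(kBelsPerLab == 60, "10 * (2 + 4) BELs per LAB");

// Assumed ordering: LUT outputs occupy indices 0..19.
inline int lutIndex(int alm, int lut) {
    assert(alm >= 0 && alm < kAlmsPerLab && lut >= 0 && lut < kLutsPerAlm);
    return alm * kLutsPerAlm + lut;
}

// Assumed ordering: flops follow the LUT outputs and occupy indices 20..59.
inline int ffIndex(int alm, int ff) {
    assert(alm >= 0 && alm < kAlmsPerLab && ff >= 0 && ff < kFfsPerAlm);
    return kLutsPerLab + alm * kFfsPerAlm + ff;
}
```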
<Sarayan>
you're gonna have to add some virtual routing between those
<Sarayan>
not a problem
<Lofty>
Not necessarily, but anyway
<Sarayan>
half of the ff inputs do not go through the LUT
<Lofty>
Next: getBelByName
<Lofty>
...Can I just stub this?
<Sarayan>
what's a name?
<daveshah>
Not if you want to use the Python API or design load/save
<Lofty>
A string
<Sarayan>
so it's BLOCK.xxx.yyy?
<Lofty>
Presumably a human readable index
<Lofty>
Yeah
<Sarayan>
you can implement it in not many lines :-)
<Sarayan>
that gives you the block type enum from the string block name
<Lofty>
And then I also need a BelId system
<Lofty>
Which I'm guessing is kinda a pos_t plus the nextpnr Z coordinate?
<Sarayan>
if it can be a uint64_t you can use a pnode with no port (PNONE)
<Lofty>
Okay, that works
<daveshah>
I would suggest a pos_t plus Z-coordinate
<Lofty>
I'm just wondering about block types
<Lofty>
pos_t is uint16_t
<Sarayan>
pos_t is 14 bits in reality
<Lofty>
The Z is probably another 16 bits
<Sarayan>
not sure, you seem to be going up to 60 and no more
<Lofty>
Hmm
<Lofty>
Okay
<Sarayan>
the cmux will get you to 20 max
<Sarayan>
block type is 8-bits (and even that's overkill)
<Sarayan>
there are only 57 block types in reality
<Sarayan>
(that's including BNONE, the no-block value)
<Lofty>
Sarayan: So, would the Z coordinate also have to include the peripheral blocks as well?
<Lofty>
I suppose so
<Sarayan>
I guess, *but* you can do something like block_type | (index << 8)
<Sarayan>
so your 0-59 goes into index
<Sarayan>
and I guess given Z the string is going to be BLOCK.xxx.yyy.zzz
<Lofty>
Well, if block is implied in Z it's just gonna be XXX.YYY.ZZZ
<Lofty>
Or whatever
<Sarayan>
I recommend not to imply it
<Sarayan>
Ah
<Sarayan>
I mean
<Sarayan>
zzz for be index
<Sarayan>
zzz would be index
<Lofty>
...I'll just have a block_type_t in the BelId.
<Sarayan>
it's just way more readable to have MLAB.025.011.020 to mean the first FF of the mlab at (25, 11)
<Sarayan>
and debugging loves readability
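A sketch of formatting and parsing names in the BLOCK.xxx.yyy.zzz shape discussed above. The function names `belName` and `parseBelName` are hypothetical, and the block name is kept as a plain string rather than mapped to mistral's block type enum:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a bel name as BLOCK.xxx.yyy.zzz with zero-padded decimal fields.
// `block` is the block type name (e.g. "MLAB"), x and y are tile
// coordinates, and `index` is the bel index within the block.
std::string belName(const std::string &block, int x, int y, int index) {
    assert(x >= 0 && x < 1000 && y >= 0 && y < 1000 && index >= 0 && index < 1000);
    char buf[64];
    std::snprintf(buf, sizeof buf, ".%03d.%03d.%03d", x, y, index);
    return block + buf;
}

// Parses a name produced by belName(). Returns false if the block name is
// empty, a numeric field is missing, or extra text follows the last field.
bool parseBelName(const std::string &name, std::string &block, int &x, int &y, int &index) {
    const std::size_t dot = name.find('.');
    if (dot == std::string::npos || dot == 0)
        return false;
    int consumed = 0;
    if (std::sscanf(name.c_str() + dot, ".%3d.%3d.%3d%n", &x, &y, &index, &consumed) != 3)
        return false;
    if (dot + static_cast<std::size_t>(consumed) != name.size())
        return false;
    block = name.substr(0, dot);
    return true;
}
```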
<Lofty>
Mhm
<Sarayan>
and Z would be (20 << 8) | MLAB
<Lofty>
Z is 8 bits :P
<Sarayan>
it is?
<Lofty>
I'll make it 16
<Sarayan>
I'm surprised
<Sarayan>
ok, there's something I don't understand, is (x, y, z) unique or it's (bel type, x, y, z)?
<daveshah>
z must be unique for all bel types
<Sarayan>
ok
<Sarayan>
but z being (index << 8) | bel_type is not an issue
<Sarayan>
right?
<daveshah>
That's fine
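A sketch of the `(index << 8) | bel_type` packing agreed on above, assuming a 16-bit Z. The `BlockType` enum here is a hypothetical stand-in with made-up values, not mistral's real enum:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the block type enum; values are illustrative.
enum BlockType : uint8_t { BNONE = 0, LAB = 1, MLAB = 2 };

// Packs Z: low 8 bits hold the block type, upper 8 bits the bel index.
inline uint16_t packZ(BlockType type, unsigned index) {
    assert(index < 256); // the index must fit in the upper byte
    return static_cast<uint16_t>((index << 8) | type);
}

// Recovers the block type from the low byte of Z.
inline BlockType zBlockType(uint16_t z) { return static_cast<BlockType>(z & 0xff); }

// Recovers the bel index from the upper byte of Z.
inline unsigned zIndex(uint16_t z) { return static_cast<unsigned>(z >> 8); }
```

Because the block type occupies its own bit range, two different block types at the same (x, y) with the same index still get distinct Z values, which is what keeps (x, y, z) unique.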
<Lofty>
<Lofty> Does the Z coordinate have to be strictly sequential?
<Lofty>
<Lofty> I suppose it does
<Lofty>
<daveshah> No it doesn't
<Lofty>
<daveshah> In fact, encoding information in different bit ranges might even be useful
<Lofty>
Next problem
<Sarayan>
yeah, he's not contradicting himself, dammit ;-)
<Lofty>
The .c_str() method wants a BaseCtx
<Lofty>
How do I get one of those?
<daveshah>
Arch inherits from it
<daveshah>
IdStrings are stored in the BaseCtx
<Lofty>
So I just pass this?
<daveshah>
Yeah
<Sarayan>
I'm failing at seeing the difference between wires and pips
<Lofty>
I'm guessing wires are 1:1 and pips are 1:many
<Sarayan>
context: the cv routing network is just a bunch of muxes that takes a number of metal lines as inputs and drive one metal line with the selection
<Lofty>
I think basically all routing networks are like that
<daveshah>
A wire is a piece of metal
<daveshah>
A pip is a transistor or group of transistors
<daveshah>
(or a configuration of a group of transistors)
<Sarayan>
spartan 2 had groups of 6 mosfets that connected 4 metal lines in any way possible without amplification or directionality
<Sarayan>
no such thing here though
<Sarayan>
so the pips are the muxes, and the wires what connect the muxes and/or the bels?
<daveshah>
The pips are the mux settings
<daveshah>
A N-input mux will be N pips, from each possible input wire to the output wire
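To illustrate "an N-input mux is N pips", here is a sketch with hypothetical types (`WireId` as a plain integer, `Pip`, `Mux`); these are not nextpnr's or mistral's real data structures:

```cpp
#include <cassert>
#include <vector>

using WireId = int; // hypothetical: wires identified by plain integers

// One pip: a single selectable connection from a source wire to a
// destination wire, i.e. one setting of a mux.
struct Pip {
    WireId src;
    WireId dst;
};

// Hypothetical routing mux: several candidate input wires, one output wire.
struct Mux {
    std::vector<WireId> inputs;
    WireId output;
};

// Expands an N-input mux into N pips, one per possible input selection.
std::vector<Pip> muxToPips(const Mux &mux) {
    assert(!mux.inputs.empty()); // a mux needs at least one input
    std::vector<Pip> pips;
    pips.reserve(mux.inputs.size());
    for (WireId in : mux.inputs)
        pips.push_back(Pip{in, mux.output});
    return pips;
}
```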
<daveshah>
1
<Sarayan>
so bindWire is a static thing?
<Sarayan>
and damn, that means a PipId can't be 32 bits
<daveshah>
No bindWire and bindPip are called during routing to update the mapping between nets and wires/pips
<Sarayan>
what's a net?
<omnitechnomancer>
A wire can input to multiple possible pips
<Sarayan>
omnitechnomancer: ok, but there's no binding going on, or more precisely no unbinding. There's no way to disconnect the wire from the bel port
<omnitechnomancer>
I think the difference is that this is used for the wires connected directly to bels, while bind pip normally deals with wire binding itself?
<daveshah>
Yes that's correct
<Sarayan>
daveshah: beautiful
<omnitechnomancer>
<Sarayan "omnitechnomancer: ok, but there'"> the binding is for a net, nothing to do with the internals of the fpga
<omnitechnomancer>
its to say this wire is associated with this net now, you use it to represent that fixed connection
<Lofty>
Yeah, it's for a hash table
<Sarayan>
but saveshah says it's not static, the placement/routing can decide to unbind it
<Sarayan>
daveshah sorry
<Lofty>
Mapping a virtual net to a physical wire in the FPGA
<Lofty>
Or unmapping it
<daveshah>
The mapping can change because you could change which net connects to a cell pin
<daveshah>
Or which bel the cell is placed on
<omnitechnomancer>
for outputs when you place the bel in a location you will need to bind the output wires to the output nets of that bel
<Sarayan>
ahhhhhhhhhhhhhhhhhhhhhhh, it's an internal routing algorithm thing... which I don't get why the arch-specific code should know about, but that's a different story
<omnitechnomancer>
the arch-specific code needs to explain which wires are attached to which bels
<Sarayan>
yes, there's getBelPinWire for that
<Sarayan>
and friends
<Sarayan>
that's why I was lost, the information is there
<Sarayan>
(well, I'm still kinda lost, but heh)
<omnitechnomancer>
hmmm
<daveshah>
So the idea is, when updating net->wire mappings, use bindWire for the start wire (which is driven by a bel pin) and bindPip for all downstream wires (which are driven by pips)
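The bookkeeping described here can be sketched as follows. This is a heavily simplified, hypothetical model (a single wire-to-net map, nets identified by name) and not nextpnr's actual `bindWire`/`bindPip` signatures or state:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

using WireId = int; // hypothetical: wires identified by plain integers

struct Pip {
    WireId src;
    WireId dst;
};

// Hypothetical stand-in for the binding state: which net owns each wire.
struct Bindings {
    std::unordered_map<WireId, std::string> wireToNet;

    // Start wire: driven directly by a bel pin, so no pip is involved.
    void bindWire(WireId wire, const std::string &net) {
        assert(!wireToNet.count(wire));
        wireToNet[wire] = net;
    }

    // Downstream wire: reached through a pip whose source wire must
    // already be bound to the same net.
    void bindPip(const Pip &pip, const std::string &net) {
        assert(wireToNet.count(pip.src) && wireToNet.at(pip.src) == net);
        assert(!wireToNet.count(pip.dst));
        wireToNet[pip.dst] = net;
    }

    // Removes the binding again, e.g. when the router rips up a net.
    void unbindWire(WireId wire) { wireToNet.erase(wire); }
};

// Records a routed net: bindWire for the start wire, bindPip for the rest.
void bindRoute(Bindings &b, const std::string &net, WireId source, const std::vector<Pip> &pips) {
    b.bindWire(source, net);
    for (const Pip &p : pips)
        b.bindPip(p, net);
}
```

The point of the sketch is that the bindings describe the router's current net assignment, not anything fixed in the FPGA, which is why they can be undone.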
<Sarayan>
wait, you need to *user* bindWire and friends, or you need to *implement* it when you add a new fpga type?
<omnitechnomancer>
is this a method because the generic code doesn't necessarily understand how net->wires is stored?
<Sarayan>
okuser -> use
<daveshah>
You need to implement it, by and large
<daveshah>
You might also use it for any arch specific routing passes like a custom global router
<Sarayan>
cat wants food, badly
<omnitechnomancer>
daveshah: I think the misunderstanding is why this is not a generic operation and otherwise driven from specifying the bel pin wires
<Sarayan>
pretty much
<Sarayan>
otoh, maybe Lofty gets it and I'll just see what is done :-)
<omnitechnomancer>
I think it's because you might need to do special things with WireIds and NetInfos which arch specific code gets to define aspects of so generic code might not know them
<Sarayan>
what kind of special things?
<Sarayan>
and now cat needs paets, badly
<Sarayan>
pets
<omnitechnomancer>
give cat pets
<Sarayan>
I am I am
<Sarayan>
purring is happening
<omnitechnomancer>
I am not sure how much this is mandatory to have the arch define, but it's one of the points where you could do extra things if you need
<omnitechnomancer>
ecp5 also updates the UI here (though a generic version could too)
<Sarayan>
can you not implement it/upcall some generic version?
<Sarayan>
I mean, I see nothing there that's a priori fpga-specific
<Sarayan>
maybe there's historical reasons, but I don't get them :-)
<daveshah>
Yeah there isn't really a very good reason
<daveshah>
nextpnr in its current state does leave too much up to the arch imo
<Sarayan>
sure seems so
<omnitechnomancer>
I think all of ecp5 ice40 and nexus have identical impls
<omnitechnomancer>
and gowin has an analogous impl, but lacking asserts
<omnitechnomancer>
so could probably be generic
<Sarayan>
that's a sign I'd say :-)
<omnitechnomancer>
I think dave was trying to be maximally flexible just in case
<omnitechnomancer>
and not make it something you configure with xml files :P
<Sarayan>
sure, but you can have flexibility through optional virtual functions for instance
<Sarayan>
And now the cat is asleep on me
<Sarayan>
life is so hard
<Lofty>
Pffft
<Sarayan>
snoring cat
<Lofty>
Sarayan: want me to push my nextpnr branch so you can look at laugh?
<Lofty>
s/at/and/
<Sarayan>
Don't worry, I won't laugh at you :-)
<Sarayan>
when nextpnr asks for a delay, that's minimal or maximal?
<Lofty>
cyclonev branch of my nextpnr fork
<Lofty>
There's the option for both
<Lofty>
But I think it normally works by maximum
<Sarayan>
feels like you really need the lab diagram at this point, and probably the cmux too
<Sarayan>
the thing is, I really need to cross-reference it with the timing model
<daveshah>
Lofty: leaving a few small comments
<Sarayan>
wtf, git fetch --all doesn't see any branch
<Sarayan>
I'm working directly on that, I don't even have a personal fork :-)
<daveshah>
nextpnr, not mistral
<Sarayan>
ahhhhhhhhhhhh
<Sarayan>
but of course
<Sarayan>
just to be annoying, the bottom FFs have two outputs
<Sarayan>
well, to be more precise, there are 3 outputs per block of two FFs, each goes to either a FF output or directly to the LUT output, the first to the first FF and the second and third to the second FF
<daveshah>
It sounds like a couple of extra pips could represent that routing
<Sarayan>
and the routes after the second and third are very different, the first is a local loopback that connects to roughly half of the inputs of the LABs in that block
<daveshah>
Plus a validity check during placement to make sure you don't oversubscribe outputs and end up with an untouchable conflict
<daveshah>
*unrouteable
<daveshah>
autocorrect
<Sarayan>
quartus calls it LO (local), I call it LD (local dispatch)
<Sarayan>
the really funky thing is that only one in the two FFs of a pair can be connected to the output of the LUT, the other gets one of the inputs of the LAB
<Sarayan>
that's going to be hard to represent, that "optional crossing"
<daveshah>
This all sounds like fairly typical placement validity rules tbh
<Sarayan>
can't that end up with the planner bouncing against rules it can't predict "what about now? No. And now? Still no"
<daveshah>
It's not ideal for very heavily utilised designs and that is something I'd like to improve one day
<daveshah>
But below about 90% utilisation it tends to converge on a legal placement pretty well
<Sarayan>
ahhhh no it's not crossed
<Sarayan>
there's a line, "sload", that tells whether to get combout or not
<daveshah>
oh, so it can be used as a dynamic load if you don't tie that signal to a constant?
<Sarayan>
well, tbh, I don't understand entirely everything and I hope the timing model is going to help
<Sarayan>
using mistral::CycloneV; does not exactly do what you think it does
<Sarayan>
I suspect using namespace mistral; would be more appropriate
<Sarayan>
using mistral::CycloneV; means typedef mistral::CycloneV CycloneV; i.e. you create a new type
<daveshah>
you'd have to be careful about collisions with stuff in nextpnr tho
<Sarayan>
or, well, a new name for the type
<daveshah>
those header files are included by pretty much everything else
<Sarayan>
There's exactly one thing in the namespace mistral and that's the class CycloneV
<daveshah>
that's fine then
<Sarayan>
Hopefully there will be Cyclone10LP and others in the future, but it's *really* clean
<daveshah>
yeah, `using namespace mistral; ` is the correct solution providing you stick to that contract
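For reference, a sketch of the difference between the two forms, using an empty stand-in class instead of the real `mistral::CycloneV`. The enclosing namespace names are made up so both forms can sit in one file:

```cpp
#include <type_traits>

namespace mistral {
// Stand-in for the real class; only the name matters for this example.
class CycloneV {};
} // namespace mistral

namespace with_using_declaration {
// Using-declaration: brings exactly one name into this scope.
using mistral::CycloneV;
// It is the same type under an unqualified name, not a new type.
static_assert(std::is_same<CycloneV, mistral::CycloneV>::value, "same type, new name");
} // namespace with_using_declaration

namespace with_using_directive {
// Using-directive: makes every name in mistral visible to unqualified
// lookup here, including anything added to the namespace later.
using namespace mistral;
static_assert(std::is_same<CycloneV, mistral::CycloneV>::value, "same type again");
} // namespace with_using_directive
```

The directive is the one that can collide with nextpnr names if the namespace ever grows, which is the contract daveshah mentions.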
<Sarayan>
(inside CycloneV is a different story of course)
<Sarayan>
I was tempted to have a bare CycloneV without a namespace tbh
<Sarayan>
Ah, interesting, we have a different philosophy, I'd have done a using BelId = uint32_t;
<Sarayan>
that way I'm sure the compiler is not going to split it ever
<Sarayan>
You can only have one Arch at a time?
<daveshah>
as it stands yeah they are separate binaries
<daveshah>
but afaik NEXTPNR_NAMESPACE_BEGIN creates an arch-specific namespace so you could link them together if you really wanted to
<Sarayan>
I'd really want not to have everything named Arch :-)
<Sarayan>
One project at a time, I really need to finish the library, then I'll be able to help with nextpnr itself
<Sarayan>
if I go play with nextpnr now I'll never finish the lib and that would be bad
<daveshah>
there's always so much to do :)
<Sarayan>
yeah
<daveshah>
sounds like progress is going really well though
<Sarayan>
but making quartus is probably the thing I'm the most competent at among the group of us
<daveshah>
2021; year of the open FPGA
<Sarayan>
(that sentence is fucked, but you get the drift)
<Sarayan>
making quartus talk that is
<Lofty>
Fixed your nits, daveshah
<Lofty>
(and Sarayan's nit too)
<daveshah>
Thanks, makes life a lot easier reviewing it in small chunks
<Lofty>
I'm guessing for bindBel/unbindBel I can just copy/paste it?
<Lofty>
It doesn't seem very arch-specific
<daveshah>
Mostly
<daveshah>
For nextpnr-xilinx I do some dirty tracking to speed up validity checks
<daveshah>
But that's a minor performance optimisation at best
<Lofty>
I definitely hope a lot of this gets factored out in planar :P
<daveshah>
Yeah it all will be
<daveshah>
I just hope planar gets beyond a dream
<daveshah>
One day...
<Sarayan>
planar?
<daveshah>
riir nextpnr
<Sarayan>
riir?
<daveshah>
rewrite it in rust :)
<Lofty>
daveshah: there's the cxx crate :P
<daveshah>
that could be useful for mistral :)
<daveshah>
interop with nextpnr is quite low on my priority list; because coming up with a fresh set of APIs and data structures is more interesting imo
<Lofty>
Yeah, fair
<Sarayan>
there's as little code as possible in mistral
<daveshah>
I think planar will have a standard database format; so mistral would mainly be a build-time dep
<Sarayan>
there's a *ton* of generated stuff, but that's not an issue
<daveshah>
and perhaps used for bitgen; if you don't go for a text IL twixt
<Sarayan>
I think "standard database" is shooting yourself in the feet, but that's just me
<daveshah>
There will still be some arch-specific hooks
<daveshah>
but the idea is to try and see if there's a way of doing deduplication that applies to a reasonable range of arches, well enough
<daveshah>
this will also eliminate the need to rebuild everything for each arch
<daveshah>
as you can see, things are still at the stage of 'unknown unknowns' atm, not even known unknowns
<Sarayan>
heh
<daveshah>
any feedback you have from looking at nextpnr is always useful tho
<Sarayan>
Right now? A lot of stuff should be out of arch :-)
<daveshah>
yeah, makes sense
<Sarayan>
right now the arch api is a tad frightening
<daveshah>
everyone who looks at it says that, tbh
<daveshah>
i can't disagree
<Sarayan>
because it has to do more than describing the arch, it has to do fundamental managing of internal nextpnr structures
<daveshah>
mmm
<sorear>
I do wonder whether it's possible for a standard database to encode enough information to get close to generating a bitfile directly
<Sarayan>
I'm not saying that it's always bad to be able to pop the cover, so to say, but it shouldn't be a requirement
<daveshah>
probably, although your rule format isn't going to be far off turing complete in that case
<Sarayan>
sorear: there's all the framing that's going to be a pain to describe non-turingly
<daveshah>
and things like complex expressions depending on cell parameters and the like
<Sarayan>
plus there's funky things like the fact that the cv has actually *3* configuration ram blocks, with slightly different behaviours
<daveshah>
yeah
<daveshah>
the actual bitstream format will definitely not be in-scope for planar
<daveshah>
overall, planar will still have some kind of code/data mix like nextpnr
<daveshah>
just hopefully with slightly less code in most of the common cases
<sorear>
could farm out the more turing-y bits to another program if that helps, but the "another program" should be <1MB of code
<Sarayan>
xilinx work(s?ed?) with boolean expressions computed on the routing/state graph for every bit of the firmware
<Sarayan>
quartus works with muxes everywhere
<Sarayan>
and ram blocks
<daveshah>
there are sometimes places where you have to e.g. subtract one from a parameter
<daveshah>
for ECP5
<daveshah>
expressing that as a boolean expression would be a bit of a PITA
<Sarayan>
where the muxes/blocks are at positions in the config rams relative to an origin, or in a small number of cases absolute
<Sarayan>
plus about 10K optional inverters scattered in the cram