genii has quit [Remote host closed the connection]
AndresNavarro has joined ##openfpga
futarisIRCcloud has joined ##openfpga
rohitksingh_work has joined ##openfpga
Bike has quit [Quit: leaving]
AndresNavarro has quit [Quit: rcirc on GNU Emacs 25.2.1]
pie__ has joined ##openfpga
AndresNavarro has joined ##openfpga
pie___ has quit [Ping timeout: 245 seconds]
ayjay_t has quit [Read error: Connection reset by peer]
ayjay_t has joined ##openfpga
emeb_mac has quit [Quit: Leaving.]
m4ssi has joined ##openfpga
ayjay_t has quit [Read error: Connection reset by peer]
ayjay_t has joined ##openfpga
<RaYmAn>
if anyone has any ecp5 boards and some time, I could use some help testing out multiboot: https://github.com/SymbiFlow/prjtrellis/pull/68 (my go-to test is just two bitstreams that blink different LEDs)
Miyu has joined ##openfpga
<RaYmAn>
in my tests so far, it works as well as the official tool, but that means not on everything, so it works ok on 1 of the 2 boards I have
<RaYmAn>
wasn't sure if the database was updated after the merge
ayjay_t has quit [Read error: Connection reset by peer]
<RaYmAn>
it works great on my ecp5-evn, but the second bitstream doesn't work at all on another board I have. but that's no different from the lattice tools
ayjay_t has joined ##openfpga
rohitksingh_work has quit [Ping timeout: 250 seconds]
rohitksingh_work has joined ##openfpga
ayjay_t has quit [Read error: Connection reset by peer]
ayjay_t has joined ##openfpga
rohitksingh_wor1 has joined ##openfpga
rohitksingh_wor1 has quit [Client Quit]
mumptai has joined ##openfpga
rohitksingh_work has quit [Ping timeout: 245 seconds]
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
Asu has joined ##openfpga
ayjay_t has quit [Read error: Connection reset by peer]
ayjay_t has joined ##openfpga
indy has quit [Ping timeout: 250 seconds]
emily has quit [Remote host closed the connection]
emily has joined ##openfpga
AndresNavarro has quit [Ping timeout: 255 seconds]
Flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]
rohitksingh has joined ##openfpga
<whitequark>
daveshah: so
<whitequark>
i am thinking about a soft PLL on ice40
<whitequark>
using the LUT cascade output as a delay line
<whitequark>
and using the metastable state of DFFs to sample at a higher resolution than can be afforded by the delay line
<whitequark>
this hinges on the duration of metastable state being comparable to delay in the LUT
<daveshah>
interesting
<daveshah>
This is not something I really know anything about though
<daveshah>
iCE40 LUTs are fairly slow
<sxpert>
somehow they manage 60MHz or so
<tnt>
sxpert: you can get pipelined single lut layer logic at way higher than 60M even on a up5k.
<sxpert>
yeah, though you need many pipeline stages I suppose
<sxpert>
would you need to write this in a rather specific way ?
<tnt>
I always keep in mind the number of LUT layers between two FFs when writing logic.
<sxpert>
hmm,
<sxpert>
how does one count that ?
<tnt>
But of course depends on what you're doing. Like if I'm doing a DSP algo with no feedback, I can pretty much put a FF after each lut layer "for free" since the FF is there in the LCs anyway.
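A minimal sketch of the "FF after each LUT layer" idea tnt describes, assuming a feed-forward 16-input AND reduction. The module and signal names are hypothetical and not taken from anyone's design in the channel:

```verilog
// Hypothetical example: a 16-input AND split into two LUT4 layers,
// with a register after each layer (two cycles of latency in total).
module and16_pipelined (
    input  wire        clk,
    input  wire [15:0] d,
    output reg         q
);
    reg [3:0] stage1;  // registered outputs of the first LUT4 layer

    always @(posedge clk) begin
        // Layer 1: four 4-input ANDs, each small enough for one LUT4
        stage1[0] <= &d[3:0];
        stage1[1] <= &d[7:4];
        stage1[2] <= &d[11:8];
        stage1[3] <= &d[15:12];
        // Layer 2: one 4-input AND of the registered partial results
        q <= &stage1;
    end
endmodule
```

Between any two flip-flops there is only one 4-input function here, so the LUT depth per pipeline stage should be one, at the cost of an extra cycle of latency.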
<tnt>
sxpert: when I write logic, I just 'map it' in my head to LUT4.
<tnt>
I know what can fit in an iCE40 LC and what doesn't.
<sxpert>
I see
<tnt>
usually I always have a paper drawing of the exact logic I want. Then I write verilog that describes that.
<sxpert>
hah
<tnt>
Only time I went away from that recently was the decode logic for a custom soft core ... and that went pretty bad. I tried describing it behaviorally with verilog 'case' and let the synthesis figure it out. Result was awful. In the end, I ended up writing an external tool to convert that behavioral description into a huge truth table, then ran that through an external logic optimizer, then fed the basic logic equations to yosys, and that worked way better.
* sxpert
wonders how to optimise his stuff
<sxpert>
hmm
<whitequark>
tnt: flowmap produces optimal LUT depth for any given logic
<whitequark>
this is polynomial time for LUTs, unlike AIGs etc
<daveshah>
it produces optimal LUT depth for a given logic netlist
<sxpert>
if I understand tnt, yosys has issues with optimizing large 'case' statements ?
<whitequark>
right, that's what i said, no?
<daveshah>
afaik there are transformations to the netlist (like balancing) that reduce depth without changing function
<tnt>
whitequark: the main issue was that yosys was not taking good advantage of the many 'don't cares'
<whitequark>
ohh, i see what you mean
<sxpert>
hmm
<whitequark>
tnt: hmmm, that might be possible to improve
<tnt>
That was a very simple test case to illustrate what I meant.
<tnt>
whitequark: is your flowmap work merged in yosys btw ? Or do I have to use a branch ? Curious to try if it helps the mapping of the base logic equations.
<whitequark>
flowmap is upstream
<whitequark>
all of my code that is ready to use, anyway
m4ssi has quit [Ping timeout: 240 seconds]
m4ssi has joined ##openfpga
rohitksingh has quit [Remote host closed the connection]
rohitksingh has joined ##openfpga
<tnt>
whitequark: any 'howto' to integrate it in an ice40 flow ? It's not in the default synth_ice40 'macro' right ?
<whitequark>
it is
<whitequark>
wait, hm
<whitequark>
that part's not upstream it looks
<whitequark>
tnt: ok. stop just before map_gates in synth_ice40. run simplemap; flowmap -maxlut 4
emeb has joined ##openfpga
rohitksingh has quit [Ping timeout: 246 seconds]
bibor has quit [Quit: WeeChat 1.6]
bibor has joined ##openfpga
<somlo>
_florent_: thanks!
<tnt>
whitequark: tx, that works. Results just for that comb block are pretty similar to the default synth_ice40 in # LUTs (input verilog is just a bunch of logic equations expanding a 16 bit opcode into 58 control signals). Not sure if there is an easy way to measure the depth of generated logic.
<whitequark>
tnt: -debug will tell you
<whitequark>
for flowmap that is
<whitequark>
tnt: the main advantage of flowmap over abc is that it preserves all signal names
<tnt>
Oh wait, I just inserted simplemap; flowmap -maxlut 4 right before map_gates, but I left everything else in, so AFAICT it's still running abc.
<sxpert>
so you could run flowmap instead of ABC ?
<whitequark>
yes
<whitequark>
flowmap is a replacement for abc
<sxpert>
ah
<sxpert>
and it's supposedly better ?
<whitequark>
it produces less space efficient designs that are approximately as fast
<whitequark>
in fact i have a partial implementation...
<whitequark>
tnt: you do not need any abc commands at all
<tnt>
yeah, I just tried doing the final techmap after flow map (since it's only comb logic, no need for the rest).
<tnt>
it works, but it uses about twice as many LUTs.
genii has joined ##openfpga
<whitequark>
that's expected
<whitequark>
flowmap is not optimizing for LUT count at all
<sxpert>
could there be a later stage where those luts are factorized ?
<whitequark>
yes, FlowMap-Area
<ylamarre>
whitequark: That's useful, most of the time synthesizers over optimise logic, causing heavy logic congestion. It's nice being able to fine tune which parts get optimised and which might benefit from not doing so.
<whitequark>
ylamarre: even better
<whitequark>
flowmap-r has a tunable tradeoff between area and depth
<whitequark>
so you can trade one logic level for less area
<whitequark>
or two
<whitequark>
abc can't do this!
<ylamarre>
Since signal names are conserved, we don't have to fight the tools to put the constraints.
<tnt>
"over optimize" ... "mis optimize". My experience is they _think_ they do something good ... but end up doing something that's worse than just doing the obvious.
<sxpert>
also sounds easier to debug
<whitequark>
btw, flowmap is a very very early ancestor of the synthesis algorithm that's now in vivado
<whitequark>
the team that worked on flowmap was sponsored by, and produced results for, xilinx
<ylamarre>
whitequark: Ah, makes sense!
<whitequark>
and their algorithms are often improvements over their predecessors, something like "we took a superexponential algo and made it polynomial"
<whitequark>
or "we took a polynomial algo and reduced the constant by a few orders of magnitude"
<ylamarre>
I remember using such a feature in Vivado and it was deeply appreciated.
<whitequark>
in fact i think flowmap is the basis of *all* modern LUT synthesis
<whitequark>
indirectly
<ylamarre>
tnt: It's not so obvious to estimate routing congestion during synthesis. Especially in what I'd call "outlying designs".
<whitequark>
hm, not sure about abc
Richard_Simmons has joined ##openfpga
Bob_Dole has quit [Ping timeout: 264 seconds]
<ylamarre>
sxpert: A good way to estimate logic is to check for the "worst case". If well coded/thought out, there's not much to optimise (others should correct me here).
<ylamarre>
Especially on something like an ice40 where you don't have a lot of specialty logic.
<ylamarre>
If you have a wide comparison, you might be able to take advantage of the carry propagation logic, but otherwise it all goes in the LUTs.
<ylamarre>
IMO, if your design is going (and fitting) on a smallish FPGA, you have a good enough idea on how it maps...
<tnt>
That's what I love about the ice40, it's small enough I can pretty much fit it in my head :)
<whitequark>
this is what people say about c and microcontrollers too
<whitequark>
but i'm not very convinced
<whitequark>
sure, if i am doing an ALU that will be replicated 16 or 32 times, i will hand optimize it
<whitequark>
(but i still rely on yosys to infer correct carry chains and such)
<ylamarre>
It's more about estimating logic usage, than getting accurate results, knowing if you should add a layer of pipeline somewhere or does it still fit in your LUT.
<tnt>
Obviously you have to pick your battles. That's why the decode logic, I mostly did it using existing logic optimizers and I didn't go and hand code every LUT4.
<tnt>
But the ALU/execution unit, I knew it'd be the critical path and there, I went way lower to make sure things would be exactly as designed.
<sxpert>
am recoding my alu to what I've learned...
<sxpert>
ylamarre: I wonder if actual coding style has any bearing on the generated logic
<tnt>
depends what you mean by 'style' ... but yeah, in general, synthesis tool can infer different things depending how it's written.
<tnt>
getting them to reliably do what you want is an art form.
<ylamarre>
Now, o_bus_load_dp should just be assigned its logic...
<ylamarre>
It's not a reset thing...
<sxpert>
it's a control line to the bus controller
<ylamarre>
Now there are "2 schools of thought": reset condition at the beginning or at the end of your always block.
<ylamarre>
sxpert: My point still stands.
<daveshah>
One problem with putting the reset at the beginning and the rest of the logic in an `else` is that if you decide not to put reset on a signal (e.g. for datapaths), then !reset becomes a clock enable for that signal, which can waste routing/resources
<whitequark>
yep. this is why nmigen puts reset at the end of the block in an `if`
<ylamarre>
Exactly
<ylamarre>
I'm on two threads here, so I'm a little slow on the reply, sorry.
<ylamarre>
I'll finish why o_bus should just be its own line, while people can explain the reset thing.
<ylamarre>
Reset at the beginning is legacy style and its only advantage is you see your "common" reset at the beginning of your block.
<sxpert>
ylamarre: ok, just tried, o_bus_load_dp can't be a wire, as start_load_dp is on for a very short time
<ylamarre>
Not a wire, but you can/should/will put logic on your reg assignment line!
<sxpert>
ah, so put that as a similar logic in a always @(*) block ?
<ylamarre>
Give me a few minutes to trace this mess... I'll do it for this signal only, but it'll be a great explanation of what we mean by coding style drives/helps the tools...
<sxpert>
always @(*) begin
<sxpert>
if (start_load_dp) o_bus_load_dp = 1;
<ylamarre>
Then hopefully someone can put the explanation in a nice tweet post, thanking @gatin00b
<sxpert>
if (xfr_copy_done) o_bus_load_dp = 0;
<sxpert>
end
<sxpert>
this makes it work
<ylamarre>
sxpert: please, give me a few minutes... it'll be very nice when all compact and sexy.
<sxpert>
ok
* sxpert
waits ;)
<ylamarre>
Ok, there's just so much stuff to say it'll take more than a few minutes actually... so I'll go with something else that'll explain what I wanna convey here...
<ylamarre>
So one good practice I'd give you is to keep resets or presets buffered. Meaning they should come from a FF
<ylamarre>
So you have this module that is preset or reset at some point by a signal 'rst'
<sxpert>
something like "blah <= ~blah;" ?
* sxpert
missed something there
<ylamarre>
So you want something like (reset at the bottom style):
<ylamarre>
always @(posedge clk) begin
<ylamarre>
my_reg <= selector ? reg_b : reg_a;
<ylamarre>
if ( rst == 1'b1 ) begin
<ylamarre>
my_reg <= 1'b0;
<ylamarre>
end
<ylamarre>
end
<ylamarre>
In this case, reset was probably useless since it's a mux, but I didn't have a good example.
<ylamarre>
You need more code...
<ylamarre>
lemme expand a bit..
<ylamarre>
i'll include reg_a and reg_b and we'll have something to work with...
<ylamarre>
and it'll give us something to compare legacy reset and at-bottom reset.
<ylamarre>
always @(posedge clk) begin
<ylamarre>
reg_a <= reg_a + 4'b1;
<ylamarre>
my_reg <= selector ? reg_b : reg_a;
<ylamarre>
reg_b <= 4'hA;
<ylamarre>
if (rst == 1'b1) begin
<ylamarre>
reg_a <= 4'b0;
<ylamarre>
reg_b <= 4'h5;
<ylamarre>
end
<ylamarre>
end
<ylamarre>
always @(posedge clk) begin
<ylamarre>
if (rst == 1'b1) begin
<ylamarre>
reg_a <= 4'b0;
<ylamarre>
reg_b <= 4'h5;
<ylamarre>
end
<ylamarre>
else begin
<ylamarre>
reg_a <= reg_a + 4'b1;
<ylamarre>
reg_b <= 4'hA;
<ylamarre>
my_reg <= selector ? reg_b : reg_a; //"Wrong" as it introduces a CE
<ylamarre>
end
<ylamarre>
end
<ylamarre>
Ok, here we go...
<ylamarre>
So the first case shows what we mean by reset-at-bottom.
<ylamarre>
The top part of the always block is our logic and the bottom part is handling our control signal logic.
<ylamarre>
By control signal I mean: PRESET, RESET and CE (Clock Enable)
<ylamarre>
In the second case, rst has to go through an inverter and then to the CE pins of my_reg's FFs
<ylamarre>
This adds unnecessary logic and routing.
<tnt>
sxpert: you're not targeting the ice40 right ?
<ylamarre>
my_reg is not reset because it doesn't need to be: its value won't be used when reset is deasserted (somewhere else in the design)
<ylamarre>
It doesn't matter which FPGA is targeted, crap coding style is crap coding style ;)
<tnt>
ylamarre: I'm raising it because the arch of the ice40 makes async reset much preferable imho. For others (ecp/xilinx/...) not so much.
<ylamarre>
Notice, I also provide the length of all my signals. This ensures there are no surprises if register width should expand later on.
<ylamarre>
tnt: why?
<tnt>
ylamarre: in the ice40, sync reset is gated by CE. (so you need CE=1 for the RST line to work).
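A minimal sketch of the asynchronous-reset style tnt is recommending for the ice40, assuming a register that has both a clock enable and a reset. The module and port names are hypothetical:

```verilog
// Hypothetical register with clock enable and asynchronous reset.
// With an async reset, rst takes effect regardless of the state of ce.
module reg_ce_arst (
    input  wire       clk,
    input  wire       rst,   // asynchronous, active high
    input  wire       ce,    // clock enable
    input  wire [3:0] d,
    output reg  [3:0] q
);
    always @(posedge clk or posedge rst) begin
        if (rst)
            q <= 4'h0;       // reset does not wait for ce
        else if (ce)
            q <= d;
    end
endmodule
```

With a synchronous reset on the same register, per tnt's remark, the enable would also have to be high for the reset to take effect, so the reset would presumably have to be folded into the enable logic as well.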
<ylamarre>
Oh! good catch!
m4ssi has quit [Remote host closed the connection]
<ylamarre>
I heard something like that for some Intel parts, but never bothered with them so, I never really cared, but good to know thanks.
* ylamarre
has only ever really used Xilinx parts
<ylamarre>
We'll modify later to accommodate that.
<tnt>
I think sxpert is working on ECP5 IIRC ?
<tnt>
ylamarre: I worked on xilinx pretty exclusively as well until I started doing ice40 stuff ~ 1 y ago.
<tnt>
Still need to get into ECP5 ... got the hw, just no time yet to dive into it.
<ylamarre>
I think both...
<ylamarre>
sxpert: Ok, so in your code, alu_active is a reset and CE merged together... avoid this!
<ylamarre>
sxpert: You provide me with a good example here: wire [1:0] phase;
<ylamarre>
assign phase = i_clk_ph + 3;
<sxpert>
well, alu_active is indeed reset and module_enable
<ylamarre>
unsized verilog integer constants are (at least) 32 bits wide and signed
<sxpert>
so that should be "2'b11" and not "3" ?
<sxpert>
or maybe
<ylamarre>
If that's what you want...
<sxpert>
"- 2'b01" ?
<ylamarre>
Yes!
<ylamarre>
that's better.
<ylamarre>
'cause that's really the intent.
<sxpert>
right
<sxpert>
I thought the tool would cast the integer value to whatever the destination size is
<ylamarre>
Nope
<sxpert>
I see
<ylamarre>
It's in the standard somewhere
<ylamarre>
Don't assume!
<sxpert>
ok
<ylamarre>
I'd also favor "- 2'd1"
<ylamarre>
Sign extension and subtle bit truncation will bite you back...
<ylamarre>
That's guaranteed by Murphy!
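A minimal sketch of the explicitly sized form discussed above, assuming i_clk_ph is a 2-bit phase counter (its declaration is not shown in the log, so that width is an assumption):

```verilog
// Hypothetical wrapper around the line under discussion.
module phase_prev (
    input  wire [1:0] i_clk_ph,   // assumed to be 2 bits wide
    output wire [1:0] phase
);
    // Explicit 2-bit constant: every operand is 2 bits wide, so the
    // subtraction simply wraps modulo 4 and the intent
    // ("previous phase") is visible in the source.
    assign phase = i_clk_ph - 2'd1;
endmodule
```

For comparison, `i_clk_ph + 3` is evaluated at the width of the unsized constant and then truncated to the 2-bit target. The two low bits come out the same in this particular case, but the truncation is implicit and tends to show up as width warnings in lint tools.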
zem has quit [Ping timeout: 240 seconds]
<ylamarre>
also, you have unbuffered subtraction followed by comparison going through some other logic which eventually goes into reset and preset...
<ylamarre>
Your optimiser might save you, but I wouldn't rely on that.
<sxpert>
hmm
zem has joined ##openfpga
<ylamarre>
Reversing the logic, my understanding here is you have logic that is only active one fourth of the time depending on the phase...
<sxpert>
yeah
<sxpert>
phase_0 is prepare commands onto the bus
<sxpert>
phase_1 is when the other device on the bus does things
<sxpert>
phase_2 is when the instruction decoder happens
<sxpert>
phase_3 is when instruction execution starts or happens
<sxpert>
ylamarre: I have updated the thing, does it look any better ?
<ylamarre>
Ok, and with all the comparisons and all that, instead of "saving" some flops on the adders and messing up your decoding logic, you might want to have a shift register, a one-hot state machine if you will
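A minimal sketch of the one-hot shift register ylamarre suggests, assuming four phases and a synchronous reset written in the reset-at-bottom style from earlier in the discussion. Names are hypothetical and this is not sxpert's actual code:

```verilog
// Hypothetical four-phase one-hot generator: exactly one bit of
// `phase` is high at any time, and it rotates on every clock edge.
module phase_onehot (
    input  wire       clk,
    input  wire       rst,        // synchronous, active high
    output reg  [3:0] phase       // phase[n] is high during phase n
);
    always @(posedge clk) begin
        phase <= {phase[2:0], phase[3]};  // rotate left by one
        if (rst)
            phase <= 4'b0001;             // start in phase 0
    end
endmodule
```

Each phase then has its own registered enable bit (`phase[0]` through `phase[3]`), so downstream logic can test a single flop output instead of comparing a 2-bit counter against a constant.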
<ylamarre>
Code still looks the same on github...
<sxpert>
hadn't pushed, sorry ;)
<sxpert>
done
<ylamarre>
Not exactly what I'm trying to show you...
<sxpert>
ah
pointfree has quit [Excess Flood]
pointfree has joined ##openfpga
<ylamarre>
I'll rework the code on my way back home, I think it's a good example of how coding style "drives" the tool, but it's taking too much time and I need to work.
<sxpert>
ok, no problem
<sxpert>
I'll continue adding functionality, and rewrite according to the new set of rules
<azonenberg_work>
oh yay, my order of amber 3ml syringes from amazon is out for delivery
<azonenberg_work>
Now i can play with my new UV cure epoxy
<sxpert>
ylamarre: changed the clock generation to a shift register as you said
xdeller__ has quit [Read error: Connection reset by peer]
xdeller__ has joined ##openfpga
<sxpert>
azonenberg_work: the amber color is to prevent the thing from curing in the daylight ?
<azonenberg>
yeah
<azonenberg>
they're basically a LPF with a cutoff in the mid-yellow range
<azonenberg>
so green/blue/UV are blocked
<ylamarre>
azonenberg: Like those sleeping glasses?