<tpw_rules>
might have to port over this project i'm doing...
<qu1j0t3>
tpw_rules: do you even need dynamic allocation? rust might be overkill for such a small program?
<tpw_rules>
huh i wonder if it uses the same technique i do for my 'tasks': just bastardize the NVIC
<qu1j0t3>
lots of ways to allocate without a heap anyway
<tpw_rules>
no i don't want dynamic memory
<adamgreig>
rust works fine with no allocator
<tpw_rules>
it's a large program with lots of moving parts, and i agree with the concept of ditching C
<qu1j0t3>
so do I :)
<adamgreig>
rtfm does indeed use the nvic as a hardware scheduler
edmund20[m] has quit [Quit: removing from IRC because user idle on matrix for 30+ days]
<tpw_rules>
i was, if i had to migrate the program so dramatically, interested in a real rtos where tasks can be paused and resumed. at least with my system a higher priority task can't ever yield, it has to return from the interrupt because the stack is shared
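A minimal sketch of the run-to-completion behaviour described above, modelled in plain Python rather than on real hardware. The `Nvic` class, its methods and the three handlers are hypothetical names for illustration only; this is not how RTFM is implemented. One assumption to note: a larger number means higher priority here, whereas the real NVIC treats lower numbers as more urgent. Handlers are ordinary nested calls, so they all share one stack, and a running handler cannot yield to anything less urgent; it has to return first.

```python
class Nvic:
    """Toy model of a nested, priority-based interrupt controller."""

    def __init__(self):
        self.handlers = {}   # priority -> handler callable
        self.pending = set()  # priorities waiting to run
        self.active = []      # priorities currently running (the shared stack)
        self.trace = []       # record of what happened, in order

    def register(self, priority, handler):
        self.handlers[priority] = handler

    def pend(self, priority):
        """Mark an interrupt pending and run it now if it can preempt."""
        self.pending.add(priority)
        self._dispatch()

    def _dispatch(self):
        while self.pending:
            top = max(self.pending)
            current = self.active[-1] if self.active else -1
            if top <= current:
                # Not urgent enough to preempt: it waits until the
                # running handler returns.
                return
            self.pending.remove(top)
            self.active.append(top)
            self.trace.append(f"enter {top}")
            self.handlers[top](self)  # nested call on the same stack
            self.trace.append(f"exit {top}")
            self.active.pop()


def low(nvic):
    nvic.trace.append("low start")
    nvic.pend(3)  # preempts immediately
    nvic.trace.append("low end")


def high(nvic):
    nvic.pend(2)  # less urgent than us, so it is deferred
    nvic.trace.append("high body")


def mid(nvic):
    nvic.trace.append("mid body")


def demo():
    nvic = Nvic()
    nvic.register(1, low)
    nvic.register(2, mid)
    nvic.register(3, high)
    nvic.pend(1)
    return nvic.trace
```

In the trace returned by `demo()`, the priority-2 handler only starts after priority 3 has exited, and "low end" is recorded after both of them have finished.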
<adamgreig>
yea. there is interest in getting decent bindings to an rtos established, or writing a rust rtos, but neither really exists as yet
<tpw_rules>
but of course the problem with that is i have to spend memory :P
pointfree[m] has quit [Quit: removing from IRC because user idle on matrix for 30+ days]
emeb has left ##openfpga [##openfpga]
dj_pi has joined ##openfpga
dj_pi has quit [Ping timeout: 258 seconds]
unixb0y has quit [Ping timeout: 246 seconds]
unixb0y has joined ##openfpga
dj_pi has joined ##openfpga
pie_ has quit [Remote host closed the connection]
pie_ has joined ##openfpga
<cr1901_modern>
tpw_rules: https://github.com/cr1901/AT2XT Full-fledged Rust application for msp430 that fits in 2kB flash, 128 bytes (sic) of RAM, and doesn't even use half of that. No heap, only globals and stack.
<tpw_rules>
ooh
<cr1901_modern>
(to be clear- it uses 1800 bytes of ROM)
<qu1j0t3>
nice
mumptai_ has joined ##openfpga
<tpw_rules>
now the onus is on me to actually learn rust :P
<tpw_rules>
i've written precisely one thing in it: a custom data compressor
mumptai has quit [Ping timeout: 258 seconds]
ayjay_t has quit [Read error: Connection reset by peer]
ayjay_t has joined ##openfpga
_whitelogger has joined ##openfpga
ayjay_t has quit [Read error: Connection reset by peer]
ayjay_t has joined ##openfpga
Miyu has joined ##openfpga
dj_pi has quit [Ping timeout: 268 seconds]
Miyu has quit [Ping timeout: 250 seconds]
pie_ has quit [Remote host closed the connection]
pie_ has joined ##openfpga
_whitelogger_ has joined ##openfpga
_whitelogger has quit [Remote host closed the connection]
Bike has quit [Quit: Lost terminal]
_whitelogger has joined ##openfpga
rohitksingh has joined ##openfpga
_whitelogger has joined ##openfpga
jcreus has joined ##openfpga
<whitequark>
daveshah: :S
<whitequark>
this caching thing is a horrible pain
<whitequark>
absolutely nothing works properly
<whitequark>
i need a different approach i think
<swetland>
whitequark: I've managed to (mostly) repro your alsru in verilog (because I hate myself). don't quite have carry rigged up right yet, I think.
<whitequark>
swetland: but you could um
<whitequark>
just
<swetland>
63 cells vs 199 cells for the naive switch/case approach
<whitequark>
use migen to compile it?
<swetland>
that sounds entirely too easy (I should give that a look too)
<swetland>
realized the biggest contributor to my alu footprint is synth_ice40 doesn't infer the DSP block for multiply.
<whitequark>
yeah it's a TODO
<swetland>
easy enough to instantiate one for now
<daveshah>
whitequark: hrm. I suppose invalidating the entire input and output cone always should be safe but slow?
<daveshah>
swetland/whitequark: yes, DSP inference is planned. I'm currently thinking about a pass similar to memory_bram for it
<daveshah>
With some kind of text file config describing the arch's primitives
<daveshah>
This seems like quite a small ALU. What is the operation you want to map to DSP?
<swetland>
multiply
<swetland>
the cell counts above are for whitequark's super-tuned alsru vs a case statement of the same 10 ops
<daveshah>
What size multiply?
<daveshah>
I'm just curious because there will probably be a heuristic in the DSP mapper that will need tuning
<swetland>
mine is chewing up 692 cells
<swetland>
a 16x16 multiply is 317 LUT4 + 8 CARRY w/ synth_ice40
<daveshah>
Oh I see, I thought it was included in the count earlier
<daveshah>
Well a 16x16 would definitely map to a DSP
<daveshah>
I think the threshold will probably be somewhere between 4x4 and 8x8
<swetland>
4x4 is 13 LUT, 8x8 is 69 LUT, 2 CARRY
<daveshah>
Yeah, 8x8 seems like a reasonable threshold, there should probably also be a rule to pick the largest 8 operations to map
<daveshah>
It's a tradeoff because of the extra delay a DSP adds
* swetland
nods
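A rough sketch of the kind of heuristic being discussed, not the planned Yosys pass. The `Mul` type and `pick_dsp_candidates` are hypothetical names; the 8-bit threshold and the limit of 8 operations come from the messages above, and treating an exactly 8x8 multiply as eligible is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mul:
    """A multiply operation found in the design (hypothetical record)."""
    name: str
    a_bits: int
    b_bits: int


def pick_dsp_candidates(muls, min_bits=8, dsp_count=8):
    """Split multiplies into those mapped to DSP blocks and those left in LUTs.

    A multiply is eligible when both operands are at least `min_bits` wide.
    Of the eligible ones, the `dsp_count` largest (by operand width product)
    go to DSPs; everything else stays in LUTs.
    """
    eligible = [m for m in muls if min(m.a_bits, m.b_bits) >= min_bits]
    eligible.sort(key=lambda m: m.a_bits * m.b_bits, reverse=True)
    to_dsp = eligible[:dsp_count]
    chosen = {m.name for m in to_dsp}
    to_luts = [m for m in muls if m.name not in chosen]
    return to_dsp, to_luts


# Example using the three sizes quoted in the discussion.
example = [Mul("m16", 16, 16), Mul("m4", 4, 4), Mul("m8", 8, 8)]
dsp, luts = pick_dsp_candidates(example)
```

With the defaults, the 16x16 and 8x8 multiplies are selected for DSPs and the 4x4 stays in LUTs. A real mapper would also have to weigh the extra DSP delay mentioned above, which this sketch ignores.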
<tnt>
Do you know the delay of a non-registered DSP off the top of your head ? (ballpark)
<swetland>
I expect that, like for a lot of fancy custom blocks, instantiation and application-specific configuration will give the best results at the end of the day (as it supports various registers for pipelining, etc)
<swetland>
but it's certainly nice having inference make use of such blocks when there's a clear benefit
<daveshah>
tnt: up to 8ns
<daveshah>
Plus the routing delay of reaching them
<daveshah>
But this is lower if you only use the LSBs
<daveshah>
eg 3.3ns for using only 8 output bits
<daveshah>
We do want some Yosys support for folding pipelines and fused multiply add into the DSP
<daveshah>
But some fancier stuff like the accumulator mode and carry stuff probably won't be inferred
<swetland>
yeah it's got a lot of knobs
<daveshah>
err I don't think you've looked at the ECP5 DSP then :P
<tnt>
daveshah: huh ... you would know better than me :p
<tnt>
I might have gotten that wrong.
<daveshah>
I know that 35 is the dedicated PLL input
<tnt>
yeah, that I know for sure too.
<tnt>
Oh damnit ... I knew that 2 months ago. Need to recheck the sources now.
<tnt>
yeah, no you're right. The padin stuff is the same whether the clock comes from the PLL output or from the pad at that site.
<daveshah>
because the PLLs are glued (only metaphorically I hope) into the input path
<tnt>
ok, updated.
<sorear>
we'll have an answer on that soon enough
<daveshah>
I wouldn't be surprised if the PLLs and surrounding routing looked quite different to other parts
<daveshah>
*other parts of the die
<tnt>
he posted a couple teaser pics but I didn't see the full set yet anywhere ?
<daveshah>
they very much feel like something bought in and added on
<daveshah>
yeah, just teasers till
<daveshah>
*still
* tnt
updated the gist ... should be correct now :p
edmund has quit [Remote host closed the connection]
edmund has joined ##openfpga
<daveshah>
tnt: Looks good
<daveshah>
might want to mention GB_IOs 4 & 5 are for HFOSC & LFOSC respectively
<daveshah>
I think GB_IO 3 does exist in the uwg30 package
<tnt>
Yeah this is sg48, pin mapping for uwg30 will be different. This is just a quick reference for the icebreaker mostly.
<tnt>
I'll add HFOSC / LFOSC, good idea.
edmund has quit [Read error: Connection reset by peer]
edmund has joined ##openfpga
rohitksingh has quit [Ping timeout: 250 seconds]
rohitksingh has joined ##openfpga
emily has joined ##openfpga
Flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]
Bike has joined ##openfpga
emily has quit [Quit: leaving]
rohitksingh has quit [Ping timeout: 244 seconds]
Miyu has joined ##openfpga
rohitksingh has joined ##openfpga
kuldeep has joined ##openfpga
kuldeep_ has joined ##openfpga
kuldeep_ has quit [Remote host closed the connection]
kuldeep_ has joined ##openfpga
kuldeep has quit [Remote host closed the connection]
kuldeep_ is now known as kuldeep
ayjay_t has quit [Read error: Connection reset by peer]
ayjay_t has joined ##openfpga
kuldeep_ has joined ##openfpga
kuldeep has quit [Read error: Connection reset by peer]
kuldeep_ has quit [Remote host closed the connection]
kuldeep has joined ##openfpga
Richard_Simmons has joined ##openfpga
Bob_Dole has quit [Ping timeout: 250 seconds]
rohitksingh has quit [Ping timeout: 246 seconds]
X-Scale has quit [Ping timeout: 240 seconds]
X-Scale has joined ##openfpga
mumptai has joined ##openfpga
pie_ has quit [Remote host closed the connection]
pie_ has joined ##openfpga
Miyu has quit [Ping timeout: 258 seconds]
Miyu has joined ##openfpga
Miyu has quit [Ping timeout: 244 seconds]
cr1901_modern has quit [Ping timeout: 240 seconds]
<tnt>
Damn, it's like the tools are reading my mind ... and then doing the opposite of what I want just to mess with me. Each time I want a signal to be a mux, it creates a Clock Enable ... and when I want a clock enable, it creates a mux.
<whitequark>
daveshah: okay, i have depth relaxation working
dj_pi has joined ##openfpga
<whitequark>
mostly, anyway
<whitequark>
it sometimes arrives at a solution with negative slack
<whitequark>
but that should be easy to fix
<whitequark>
the code is a mess though :S
<daveshah>
Well, probably less of a mess than abc :/
<whitequark>
daveshah: i can clean it up
<whitequark>
it's mostly that i'm in way more pain than usual for the last several days and it starts to interfere with my ability to refactor
<daveshah>
Sorry to hear that :(
<whitequark>
it's kind of annoying because it doesn't interfere with my ability to design algorithms yet
<whitequark>
but they don't work well because the individual operations are flaky
<whitequark>
i'm not sure exactly how pain relates to either of those
<tnt>
(in glorious 8 colors because ... 3 bit hdmi pmod)
<RaYmAn>
impressive!
<davidc__>
that's pretty impressive. External parallel to HDMI generator?
<tnt>
yes, tfp410 on the pmod to do the tmds encoding.
<tnt>
with a 147 MHz pixel clock
<zkms>
nice
<q3k>
147MHz? huh
<tnt>
Well there is only the clock running at that frequency ... the rest is running at half of that and I have a 2 pixel wide pipeline :p
<tnt>
Then DDR reg to push that out.
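A small behavioural model of the arrangement described above, assuming the pipeline produces two pixels per half-rate clock cycle and a DDR output register sends one on each clock edge. The function names are hypothetical and this is plain Python, not the actual HDL.

```python
PIXEL_CLOCK_MHZ = 147.0
# The pixel pipeline runs at half the pixel clock and handles two pixels per cycle.
FABRIC_CLOCK_MHZ = PIXEL_CLOCK_MHZ / 2


def pack_pairs(pixels):
    """Group a pixel stream into (first, second) pairs, one pair per fabric cycle."""
    if len(pixels) % 2 != 0:
        raise ValueError("pixel count must be even for a 2-pixel-wide pipeline")
    return [(pixels[i], pixels[i + 1]) for i in range(0, len(pixels), 2)]


def ddr_output(pairs):
    """Model a DDR output register: first value on the rising edge,
    second value on the falling edge of the same clock cycle."""
    out = []
    for on_rise, on_fall in pairs:
        out.append(on_rise)
        out.append(on_fall)
    return out


# Eight 3-bit pixels (one bit each for R, G and B gives the 8 colours).
pixels = [0, 1, 2, 3, 4, 5, 6, 7]
pairs = pack_pairs(pixels)
restored = ddr_output(pairs)
```

Eight pixels take four fabric cycles, and the DDR stage restores the original order at the full pixel rate.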
<q3k>
oh
<q3k>
right
<q3k>
... the ice40 has no hardware ddr gearboxes, right?
<daveshah>
It does
<daveshah>
Only DDR though
<q3k>
oh just setting up SB_IO correctly gives you DDR out?
<tnt>
yes.
<q3k>
i guess that works fine for simple source-synchronous stuff
* q3k
makes mental note
<adamgreig>
ddr input too, handy for oversampling stuff
<daveshah>
I've used the DDR input to do MIPI CSI too
mumptai has quit [Read error: Connection reset by peer]
mumptai has joined ##openfpga
<adamgreig>
can yosys/npnr infer ddr io at all?
<whitequark>
no
<adamgreig>
will it move a register on the io into the sb_io?
<whitequark>
actually i don't think yosys can do any ddr inference
<whitequark>
aiui its primitives do not represent ddr
<whitequark>
CLK_POLARITY can be 0 or 1
<swetland>
do any tools do that? seems like one of those "gotta tell it exactly what you want" situations in general
<whitequark>
you can trigger on posedge clk or negedge clk
<whitequark>
in theory
<whitequark>
to infer ddr
<adamgreig>
yea you could imagine writing some verilog that described it
<whitequark>
seems fragile
<adamgreig>
just not sure what the tools would do
<adamgreig>
i imagine make an inverted clock and feed another flipflop that locally or something horrible :P
<whitequark>
ew, no
<whitequark>
that would definitely not work
<tnt>
usually for my IO I don't trust inference as far as I can throw it.
<swetland>
my gut is when you start interacting with specialized hw blocks it's best to instantiate directly with all the specific knobs and dials, even if it does mean having vendor/arch-specific conditionals
<mumptai>
just use the falling edge
<whitequark>
yeah what swetland said
<q3k>
i've never seen DDR inference
<tnt>
+1
<whitequark>
I/O is particularly bad
<q3k>
yes
<whitequark>
even hi-z breaks all the time
<adamgreig>
sure, that's what i've always done, just wondered
<q3k>
also doing more advanced modeling like clock delays and stuff in verilog... nah
<adamgreig>
does nextpnr know that you can register-or-not the sb_io? will it create a clocked+registered input automatically or do you have to use sb_io to get anything beyond just a raw direct in/out?
* swetland
tries to figure out what the hell set off all his smoke detectors. my sense of smell is crap. I don't see any sign of smoke/burning/etc. everything finally reset. not the optimal way to wake up
<tnt>
adamgreig: you need to instantiate.
<daveshah>
adamgreig: no, not at the moment
<adamgreig>
ok, good to know
<adamgreig>
can probably save a few flipflops then :P
<tnt>
atm I don't think it even does clock analysis properly depending on the SB_IO config.
<daveshah>
It's slightly tricky because you need to make sure you don't pack different clocks or CEs into two IO that then share a tile
<tnt>
hehe, yeah, my design currently works only on PMOD 1A of the icebreaker because of that :p
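The packing constraint daveshah mentions can be sketched as a simple check. This is only an illustration: the `(name, tile, clock, clock_enable)` tuple layout, the tile coordinates and the function name are made up, and nextpnr's real data structures look different.

```python
from collections import defaultdict


def find_tile_conflicts(ios):
    """Return tiles whose registered IOs disagree on clock or clock enable.

    `ios` is an iterable of (name, tile, clock, clock_enable) tuples.
    IOs that share a tile must use the same clock and the same CE.
    """
    by_tile = defaultdict(list)
    for name, tile, clock, clock_enable in ios:
        by_tile[tile].append((name, clock, clock_enable))

    conflicts = []
    for tile, members in sorted(by_tile.items()):
        controls = {(clock, ce) for _, clock, ce in members}
        if len(controls) > 1:
            conflicts.append((tile, sorted(name for name, _, _ in members)))
    return conflicts


# Hypothetical placement: "a" and "b" share a tile but use different CEs.
placement = [
    ("a", (0, 1), "clk", "ce0"),
    ("b", (0, 1), "clk", "ce1"),
    ("c", (0, 2), "clk", "ce0"),
    ("d", (0, 2), "clk", "ce0"),
]
conflicts = find_tile_conflicts(placement)
```

Here only tile `(0, 1)` is reported, because its two IOs want different clock enables.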