<kbeckmann>
i know i'm not doing a lot of operations to generate the image, but it still blows my mind a bit that we can run this in actual real time and not even drop frames at 60fps.
jeanthom has joined #nmigen
q3k has quit [Ping timeout: 240 seconds]
<kbeckmann>
i did some profiling and i am spending most time in mul_uu. it could be cool to have native implementations of these, but i realize that will require a substantial effort.
<kbeckmann>
this made a significant improvement in mul_uu(): if (BitsY <= 64 && BitsA <= 32 && BitsB <= 32) { return value<BitsY>{a.data[0] * b.data[0]}; }
<kbeckmann>
(60 fps in 1280x720 :o )
<kbeckmann>
the code above is not correct, need to cast the values to uint64_t first. but you get the idea.
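A minimal sketch of why the cast matters, using plain integers rather than the real CXXRTL types. The function names `mul_truncated` and `mul_widened` are made up for illustration, and the sketch assumes the operands are stored as 32-bit unsigned chunks as the snippet above suggests.

```cpp
#include <cstdint>

// Multiplying two 32-bit unsigned values yields a 32-bit result,
// so the upper half of the product is silently discarded.
uint32_t mul_truncated(uint32_t a, uint32_t b) {
    return a * b;
}

// Widening one operand first makes the multiplication happen in 64 bits,
// which is enough to hold the full product of two 32-bit values.
uint64_t mul_widened(uint32_t a, uint32_t b) {
    return static_cast<uint64_t>(a) * b;
}
```

For example, with both operands equal to 0x10000 the truncated version returns 0, while the widened version returns 0x100000000.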
q3k has joined #nmigen
Asu has joined #nmigen
Asuu has joined #nmigen
Asu has quit [Ping timeout: 260 seconds]
jeanthom has quit [Ping timeout: 260 seconds]
Asu has joined #nmigen
Asuu has quit [Ping timeout: 256 seconds]
Asu has quit [Ping timeout: 246 seconds]
Asu has joined #nmigen
Asuu has joined #nmigen
Asu has quit [Ping timeout: 258 seconds]
FFY00 has quit [Ping timeout: 260 seconds]
FFY00 has joined #nmigen
jeanthom has joined #nmigen
jeanthom has quit [Ping timeout: 265 seconds]
jeanthom has joined #nmigen
jeanthom has quit [Ping timeout: 260 seconds]
<whitequark>
kbeckmann: yeah, that change would benefit the hardware greatly, too
<whitequark>
strength reduction basically
<kbeckmann>
ah yes, i guess i was thinking i was doing software again and relying on a compiler to optimize that.
<Sarayan>
wq: You mean benefit the hardware by using specialized resources when they exist?
<whitequark>
by not requiring any specialized resources
<whitequark>
instead of x*[1,-1], use [x,-x]
<whitequark>
so no multiplier needed
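The strength reduction described above can be sketched in software terms as follows. This is only an illustration: the function names are hypothetical, and it assumes the coefficient is only ever +1 or -1, as in the `x*[1,-1]` example.

```cpp
#include <cstdint>

// Multiplier-based form: scale x by a coefficient that is only ever +1 or -1.
int32_t scale_with_mul(int32_t x, bool negate) {
    int32_t coeff = negate ? -1 : 1;
    return x * coeff;
}

// Strength-reduced form: choose between x and -x directly.
// The multiplication is replaced by a negation and a selection.
int32_t scale_with_select(int32_t x, bool negate) {
    return negate ? -x : x;
}
```

Both functions return the same result for any x whose negation is representable, but the second one contains no multiplication at all.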
kernelmethod has joined #nmigen
<_whitenotifier-f>
[nmigen] BracketMaster opened issue #403: Records don't work in Assert - https://git.io/JfQsY
jeanthom has joined #nmigen
<_whitenotifier-f>
[nmigen/nmigen-yosys] whitequark pushed 1 commit to master [+0/-0/±2] https://git.io/JfQnn
<_whitenotifier-f>
[nmigen/nmigen-yosys] whitequark 32c69e1 - Update to WASI SDK 11.0.
Asuu has quit [Quit: Konversation terminated!]
<_whitenotifier-f>
[nmigen/nmigen-yosys] whitequark pushed 1 commit to master [+0/-0/±3] https://git.io/JfQcG
<_whitenotifier-f>
[nmigen/nmigen-yosys] whitequark c5768f4 - Add CXXRTL backend to the package.
<_whitenotifier-f>
[nmigen/nmigen-yosys] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/JfQc4
<_whitenotifier-f>
[nmigen/nmigen-yosys] ... and 5 more commits.
<kbeckmann>
oh that's great! i tried to make a proper fix but gave up after i realized that i'm not that familiar with c++ templates. got stuck where i wanted to support >32 bits - the returned value would require two elements in the initialization which is not generic and thus failed to build for <=32 bit multiplications.
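One possible way around the "two elements in the initialization" problem is to compute the product in a `uint64_t` and then fill the result chunk by chunk in a loop, so the same code builds for one-chunk and two-chunk results. The sketch below assumes a simplified stand-in for `value<Bits>` (little-endian 32-bit chunks in `data`) and a hypothetical function name `mul_uu_small`; it is not the real CXXRTL implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for a fixed-width value: little-endian 32-bit chunks.
template <size_t Bits>
struct value {
    static constexpr size_t chunks = (Bits + 31) / 32;
    uint32_t data[chunks] = {};
};

// Hypothetical fast path for unsigned multiplication when both operands fit
// in a single chunk and the result fits in at most two chunks.
template <size_t BitsY, size_t BitsA, size_t BitsB>
value<BitsY> mul_uu_small(const value<BitsA> &a, const value<BitsB> &b) {
    static_assert(BitsY >= 1 && BitsY <= 64, "result must fit in 64 bits");
    static_assert(BitsA >= 1 && BitsA <= 32, "operand a must fit in one chunk");
    static_assert(BitsB >= 1 && BitsB <= 32, "operand b must fit in one chunk");

    // Widen before multiplying so the product is not truncated to 32 bits.
    uint64_t product = static_cast<uint64_t>(a.data[0]) * b.data[0];

    // Drop any bits above BitsY so the unused MSBs of the top chunk stay zero.
    product &= ~static_cast<uint64_t>(0) >> (64 - BitsY);

    // Fill one or two chunks with the same loop; no per-width initializer needed.
    value<BitsY> y;
    for (size_t i = 0; i < value<BitsY>::chunks; i++)
        y.data[i] = static_cast<uint32_t>(product >> (32 * i));
    return y;
}
```

With this shape, `mul_uu_small<64, 32, 32>` and `mul_uu_small<24, 16, 16>` both instantiate from the same template body.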
<whitequark>
I was going to just skip >32 bits lol
<whitequark>
can you show me your attempt? I might be able to help you with that
<whitequark>
which would actually be preferable because I'm currently busy working on nmigen.back.cxxsim
<kbeckmann>
for <=32 bits, i basically just did if (BitsY <= 32) { return value<BitsY>{a.data[0] * b.data[0]}; }
<whitequark>
ahhh I see
<kbeckmann>
overflow might get a bit nasty
<whitequark>
yes that's actually not correct
<kbeckmann>
mm especially if the operands are larger than 16
<kbeckmann>
oh i also noticed a funky thing. it seems it's possible to create a value<24> foo{0x11223344u}; where data contains the 8 "extra" MSB
<whitequark>
yes
<kbeckmann>
i guess i'm violating some contract here
<whitequark>
and it's illegal
<whitequark>
yep
<kbeckmann>
alright
<whitequark>
creating values directly isn't the preferred way
<whitequark>
the problem is that the preferred way doesn't exist yet
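As an illustration of the contract being discussed (unused MSBs of the top chunk must be zero), here is a sketch of a masking constructor helper. Both the simplified `value<Bits>` struct and the helper name `make_value` are hypothetical; they are not the preferred way mentioned above, which does not exist yet.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for a fixed-width value: little-endian 32-bit chunks.
template <size_t Bits>
struct value {
    static constexpr size_t chunks = (Bits + 31) / 32;
    uint32_t data[chunks] = {};
};

// Hypothetical helper: build a single-chunk value from a raw integer,
// clearing every bit above Bits so the stored chunk honors the declared width.
template <size_t Bits>
value<Bits> make_value(uint32_t raw) {
    static_assert(Bits >= 1 && Bits <= 32, "sketch handles one chunk only");
    value<Bits> v;
    v.data[0] = raw & (0xFFFFFFFFu >> (32 - Bits));
    return v;
}
```

With this helper, `make_value<24>(0x11223344u)` stores 0x223344, so the 8 extra MSBs never reach `data`.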