sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
<GitHub105> [misoc] sbourdeauducq pushed 1 new commit to master: https://git.io/vKNJ2
<GitHub105> misoc/master d3c6735 Sebastien Bourdeauducq: targets/kc705: use vivado by default
<GitHub154> [artiq] sbourdeauducq pushed 1 new commit to master: https://git.io/vKNJa
<GitHub154> artiq/master b5e52e9 Sebastien Bourdeauducq: runtime: fix unused variable warning
<bb-m-labs> build #115 of misoc is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/misoc/builds/115
<bb-m-labs> build #569 of artiq-kc705-nist_clock is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-kc705-nist_clock/builds/569
<bb-m-labs> build #282 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/282
<bb-m-labs> build #844 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/844
sb0 has quit [Quit: Leaving]
<bb-m-labs> build #570 of artiq-kc705-nist_clock is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-kc705-nist_clock/builds/570
<bb-m-labs> build #283 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/283
<bb-m-labs> build #845 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/845
sb0 has joined #m-labs
ssk1328 has quit [Quit: Connection closed for inactivity]
rohitksingh_work has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
<cr1901_modern> rjo: What was your rationale for coupling the wishbone interface to a "pending" signal: https://github.com/m-labs/artiq/blob/master/artiq/gateware/spi.py#L324-L344 I was trying to move all the non-bus-related logic into a separate class, but I can't actually do that b/c of the pending signal. >>
<cr1901_modern> Just "putting the pending signal into a submodule" will not work, b/c it will change the order that the pending signal's conditions will be tested in an always@ block. Instead of If(bus.ack) being last, If(spi.start) will be last, which changes the synthesized code. I
<cr1901_modern> 'm about to just leave the wishbone core alone, and make a separate CSR class, and call it a day.
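[editor's note] The ordering concern can be sketched in plain Python. In a Migen sync block, as in a Verilog always block, a later assignment to the same signal overrides an earlier one, so the order in which the conditions are tested decides which one wins when both are true in the same cycle. The values each branch assigns to `pending` below are assumptions for illustration, not taken from spi.py.

```python
def next_pending(pending, rules):
    """Apply (condition, value) rules in order; the last true condition wins."""
    for cond, value in rules:
        if cond:
            pending = value
    return pending


# Hypothetical cycle in which both events fire at once.
start, ack = True, True

# Order with If(bus.ack) tested last (assumed to set pending).
ack_last = next_pending(False, [(start, False), (ack, True)])

# Order with If(spi.start) tested last (assumed to clear pending).
start_last = next_pending(False, [(ack, True), (start, False)])

print(ack_last, start_last)
```

The same two rules give different next-state values depending only on their order, which is why moving one of them into a submodule is not a neutral refactoring.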
<sb0> okay, remote TTLs are working
<sb0> :)
<sb0> clock synchronization/reproducible latency looks good too (as far as my scope can tell)
ssk1328 has joined #m-labs
fengling has joined #m-labs
<rjo> cr1901_modern: the double buffer is part of the wishbone layer. thus pending as well (whether there is a write in that buffer).
<rjo> cr1901_modern: why do you need to touch the wb core?
flaviusb has joined #m-labs
<sb0> whitequark, now we just need the si5324.
<sb0> whitequark, why not use the misoc csr generator for the wishbone interface?
<sb0> whitequark, there is a (tiny) race condition on xfer.idle.
<sb0> if you do it fast, xfer = ...; while(!(xfer & idle)); will break.
<sb0> unlikely to happen with a lm32 cpu driving it, but I like to keep those things clean.
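[editor's note] A toy model of one way such a race could arise. The log does not describe the mechanism, so this assumes the idle flag only drops a cycle after the write that starts the transfer; `XferModel`, `start_latency` and `duration` are hypothetical names and values.

```python
class XferModel:
    """Toy device: idle drops `start_latency` cycles after a write (assumed)."""

    def __init__(self, start_latency=1, duration=8):
        self.start_latency = start_latency
        self.duration = duration
        self.age = None  # cycles since the last write, None if never written

    def write(self):
        self.age = 0

    def tick(self, cycles=1):
        if self.age is not None:
            self.age += cycles

    @property
    def idle(self):
        if self.age is None:
            return True
        end = self.start_latency + self.duration
        return not (self.start_latency <= self.age < end)


def poll_sees_busy(cycles_between_write_and_read):
    """True if the first status read after a write observes the busy state."""
    dev = XferModel()
    dev.write()
    dev.tick(cycles_between_write_and_read)
    return not dev.idle


print(poll_sees_busy(0), poll_sees_busy(2))
```

In this model a read issued in the same cycle as the write still sees the stale idle flag, so a `while(!(xfer & idle));` loop would exit before the transfer has started. A slower CPU that reads a couple of cycles later never observes the stale value.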
ssk1328 has left #m-labs [#m-labs]
<sb0> rjo, in spi, "bus.dat_r.eq(Array([data_read, xfer.raw_bits(), config.raw_bits()])[bus.adr])" can be moved from comb to sync to improve timing
<sb0> and there should be a way to tell when the transfer is done
<sb0> ah, you're blocking
<rjo> sb0: i am pretty sure balancing will do that for me.
fengling has quit [Ping timeout: 240 seconds]
fengling has joined #m-labs
<sb0> rjo, it can't
<sb0> balancing doesn't change the number of registers
<sb0> buses often have large routing delays
<rjo> sb0: i just tried eight complete two-tone RTIO channels of 1.25 GHz on a kc705 (with full ARTIQ included as well) and it gets to 57% of the LUTs...
<rjo> ah. ok. interesting.
<rjo> not this one. but yes.
<sb0> synthesizers aren't smart enough to determine that the data is actually used only once cycle later. you need to add that register yourself.
<sb0> *one
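[editor's note] A minimal cycle-level sketch of what moving the read mux from comb to sync changes: the selected value is latched into a register and becomes visible one cycle later, which removes the mux from the combinatorial path to the bus. The source values are hypothetical stand-ins for data_read, xfer and config.

```python
def simulate(addresses, sources, registered):
    """Return the dat_r value visible during each cycle."""
    seen = []
    dat_r_reg = 0  # reset value of the added register
    for adr in addresses:
        mux_out = sources[adr]
        if registered:
            seen.append(dat_r_reg)  # value latched on the previous clock edge
            dat_r_reg = mux_out     # latched at the end of this cycle
        else:
            seen.append(mux_out)    # combinatorial: visible in the same cycle
    return seen


sources = [0xAA, 0xBB, 0xCC]  # hypothetical register contents
comb = simulate([0, 1, 2, 2], sources, registered=False)
sync = simulate([0, 1, 2, 2], sources, registered=True)
print(comb, sync)
```

The registered version is the combinatorial one delayed by a cycle, which is harmless when the data is only consumed a cycle later anyway.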
<rjo> one question i couldn't answer myself: are we typically under register resource pressure or are registers basically free since every output of a slice also contains a register?
<sb0> registers are basically free
<sb0> or at least very cheap
<sb0> so use them generously
<rjo> interestingly though, almost all LUT-FF pairs have one of them unused.
<sb0> it tells you somewhere in the synthesis report if there are any "pure" registers (with the LUT being a pass-through)
<sb0> the few times I looked at that on milkymist-soc/misoc, that number was very low
<sb0> really?
<sb0> is it more often the LUT or the register?
<sb0> resource utilization looks good! did you optimize it?
<rjo> LUT-FF pairs: 87628, LUT-FF pairs with one unused LUT: 82498, LUT-FF pairs with one unused Flip Flop: 84071
<rjo> i remember seeing a "rout-through" number but that should mean something else.
<sb0> hmm, that's not like what I remember
<rjo> *route-
<sb0> this number is quite surprising
<rjo> there might be another type of register as well. slice-lut usage is 57%, slice-register usage is 27%
<sb0> maybe vivado now has some strategy to move the FF onto the routing path under low FPGA utilization to improve timing?
<sb0> rjo, are you including the spline interpolator? those shouldn't use a lot of resources though...
<sb0> how come it takes so few resources? iirc your original estimate was higher
<rjo> these are zeroth-order spline interpolators. the full ones will add more.
<rjo> i'll give it a spin with the full thing. i also need to check that it actually is the full thing and vivado doesn't cull it into triviality.
<rjo> i think this is roughly in-line with the original estimate given the difference in design and the zeroth-order interpolators.
<rjo> sb0: soon we will have to redo the RTIO cpu-side register layout to allow for wide events (maybe 160-256 data bits)
<sb0> is there any gateware problem with that? I think misoc should already handle that
<sb0> the software won't though
<rjo> sb0: in my original mail a channel pair contained two of these expensive NCOs, using 14% LUTs per pair, the current one has one expensive NCO per channel, using 57%/8 per channel.
<sb0> how did you remove half the expensive NCOs?
<rjo> if i generate wider-than-64-bit registers, it silently drops them.
<sb0> hm, that's a bug then, but that shouldn't be hard to fix
<rjo> using that other new formula that i jotted down a while ago to generate the two tones.
<sb0> should we try to get a 64-bit CPU?
<sb0> which? and are the channels independent, or IQ paired?
<rjo> mail from jun-14
<rjo> independent plus optionally iq cross-fed.
<rjo> i don't know about the 64 bit cpu ;) that's not my expertise.
<sb0> I can't find it. was I Cc'd?
<sb0> it's not in the list archives either
<rjo> yes. CANb+zoGYJ2QBPoYE_BxtW3mKOamR48_xXHYE9_Z6LMy0Cd=qng@mail.gmail.com
<rjo> message-id
<sb0> mh, nothing
<rjo> it was in that longer but efficient exchange.
<rjo> not on the list.
<sb0> can you resend it to me?
<sb0> or possibly, on the list.
<sb0> by the way, another Joe idea is he wants to heat the DAC chips to temperature-stabilize them
<sb0> he intends to do that with a 100mW resistor - just near the chip that dissipates 2100mW. I'm quite skeptical about this thing.
<rjo> hahahahaha. good one.
<sb0> one of his main questions was whether the FPGA would be capable of PWM control of that resistor. sigh.
<rjo> i have done that for a photodetector circuit a few years ago (actually a hobbs design).
<rjo> but it is a major pain for a few reasons.
<sb0> "one of the nice features of ultrascale is they can do PWM"
<rjo> if he wants something like that, he should look into conduction-cooled hardware for aviation and military. that stuff doesn't need stupid air around it and all the parts are heat-sunk together.
<rjo> working in vacuum is so much better.
<sb0> do you just neglect radiated power?
<rjo> even pwm is noisy. delta-sigma is so much nicer! i will point him there.
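[editor's note] A small sketch of the PWM versus delta-sigma comparison, assuming a first-order accumulator-overflow modulator (the log does not say which order or structure rjo has in mind). Both produce the same average duty cycle, but the delta-sigma output spreads its pulses out, so its error sits at higher frequencies that are easier to filter.

```python
def pwm(duty_num, period, n):
    """Counter-compare PWM: high for the first duty_num ticks of each period."""
    return [1 if (i % period) < duty_num else 0 for i in range(n)]


def delta_sigma(duty_num, period, n):
    """First-order delta-sigma: accumulate the input, emit 1 on overflow."""
    acc, out = 0, []
    for _ in range(n):
        acc += duty_num
        if acc >= period:
            acc -= period
            out.append(1)
        else:
            out.append(0)
    return out


def longest_run(bits):
    """Length of the longest run of identical consecutive values."""
    best = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


pw = pwm(3, 16, 64)          # 3/16 duty cycle, values chosen for illustration
ds = delta_sigma(3, 16, 64)
print(sum(pw), sum(ds), longest_run(pw), longest_run(ds))
```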
<sb0> powers of four in equations sound a bit nasty
<rjo> radiated power is not that much if the temperature differential is small.
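[editor's note] The "powers of four" refer to the Stefan-Boltzmann law. A short worked linearization shows why the radiated power stays small for a small temperature differential. The symbols are assumptions for illustration: emissivity \varepsilon, surface area A, ambient temperature T_0, and differential \Delta T = T - T_0.

```latex
P_{\mathrm{rad}} = \varepsilon \sigma A \left( T^4 - T_0^4 \right),
\qquad T = T_0 + \Delta T,\quad \Delta T \ll T_0
\;\Longrightarrow\;
T^4 - T_0^4 \approx 4 T_0^3 \,\Delta T,
\qquad
P_{\mathrm{rad}} \approx 4 \varepsilon \sigma A \, T_0^3 \,\Delta T .
```

To first order the net radiated power is linear in \Delta T, so it vanishes as the differential goes to zero.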
<sb0> I would actually look into *removing* heat from the DAC chip, not adding to it
<sb0> 2W into that small package probably makes it pretty hot already
<rjo> mithro: wouldn't it be great to have one of your GSoC guys working on integrating the vivado/ise output highlighter code into migen? ;)
<mithro> rjo: ?
<rjo> yes. and even worse. because that thing emits 2W, it (obviously) needs a low resistance path to a huge thermal mass sink (air or al or cu). then that 100 mW resistor can't change temperature at all.
<rjo> mithro: i was just reverse-i-searching through my terminal and remembered you saying that you had a colorizer.
<rjo> sb0: he should have somebody do a design study of the tempco, likely fluctuations in temp, and possible remedies. he really enjoys procrastinating on these details.
<sb0> rjo, yeah, I told him to ask creotech.
<sb0> and not only is he procrastinating, but he wants rushed results
<rjo> mithro: yes. cleaned up and polished a bit and integrated into migen by default. that really adds value to vivado's brain-dump/b text stream.
<mithro> rjo: Actually - I think we lost colorizing the Xilinx output at some point in the move to misoc - https://github.com/timvideos/HDMI2USB-jahanzeb-firmware/blob/master/tools/colormake.pl
<mithro> rjo: That script is pretty hacky :) - I haven't done anything for vivado either
fengling has quit [Ping timeout: 240 seconds]
<rjo> mithro: i remember looking at some python tool that made these generic colorization things really simple and smooth.
<mithro> rjo: Yeah I remember something like that too
fengling has joined #m-labs
mumptai has quit [Remote host closed the connection]
<rjo> sb0: ~62% LUT usage with high-order spline interpolators (still modulo checking that everything works)
<sb0> rjo, another thing - Joe wanted to reduce the speed of the backplane links to 5Gbps. wasn't the plan to run them at >10Gbps so all EMI is in a frequency range where it doesn't do damage?
<rjo> sb0: the spectrum of those links always extends down to "almost" DC.
<sb0> how?
<rjo> shannon for example ;)
<rjo> slide 9 for example.
<rjo> that "1" in relative frequency is that 10 GHz or 5 GHz or 1.25 GHz...
<rjo> so the trick will be (to whiten the spectrum for clock noise EMI) to do randomized AKR on idle.
<rjo> slide 12
<rjo> or be lucky and not have any of those comb teeth within any PLL bandwidth of any relevant clock.
<sb0> hm, I would have thought running disparity control did a better job than that at cutting low frequencies
<rjo> but basically to transmit X Gbps on a X GHz link you need all the bandwidth from close to 0 up to X (plus nyquist images).
<sb0> for 8b10b it's X Gbps on a X*1.25GHz link
<rjo> yes.
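[editor's note] The 8b10b overhead as arithmetic: every 8 payload bits are sent as 10 line bits, so the line rate is 10/8 = 1.25 times the payload rate. The function names are made up for this sketch.

```python
def line_rate_gbaud(payload_gbps):
    """Line rate needed to carry a given payload rate over 8b10b."""
    return payload_gbps * 10 / 8


def payload_gbps(line_rate):
    """Payload rate available on an 8b10b link with the given line rate."""
    return line_rate * 8 / 10


print(line_rate_gbaud(10.0))  # 10 Gbps of payload
print(payload_gbps(5.0))      # a 5 Gbaud line
```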
<rjo> but there is lots of aliasing from high frequency components into the low frequency components.
<rjo> so even if disparity control can "clear" out those 250 MHz at the bottom, you would get nyquist images down there from the fact that everything has sharp edges.
<rjo> from that presentation you are 20 dB below peak spectral density at ~0.01 line rate.
<sb0> I don't get it
<sb0> isn't the spectrum of a square wave f, 3f, 5f, etc.?
<sb0> what does this have to do with the sharpness of the edges?
<rjo> 8b10b only guarantees that there is no DC component. you can write down very long valid 10b sequences that are periodic. and their period is then directly a low frequency component.
<sb0> and how do you get aliasing if you're not sampling?
<sb0> yes, but this has nothing to do with aliasing or edge sharpness, right?
<rjo> you are sampling.
<rjo> there are two reasons why there are things so low.
<rjo> one is that 8b10b only guarantees 0 power at DC.
<rjo> and the other is that if you have a frequency component at f-y (y being small, f being the line rate) you will also have one at y.
<sb0> hm, right.
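[editor's note] Both of rjo's reasons can be checked numerically with a plain DFT of one period of a repeating bit pattern, taking one sample per bit. The 10-bit pattern below is a hypothetical DC-balanced example and is not claimed to be a valid 8b10b code sequence. Bin k of the DFT corresponds to the frequency k/N times the line rate, where N is the pattern length in bits.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform of a short sequence."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, x in enumerate(samples))
        for k in range(n)
    ]


# Hypothetical repeating pattern: five ones, five zeros, as +1/-1 levels.
pattern = [1] * 5 + [-1] * 5
mags = [abs(c) for c in dft(pattern)]

for k, m in enumerate(mags):
    print(f"{k / len(pattern):.1f} x line rate: {m:.3f}")
```

Bin 0 is zero (no DC), yet there is a strong line at 0.1 times the line rate, and a longer periodic sequence would put one lower still. The magnitudes at bins k and N-k are equal, which is the "component at f-y implies one at y" statement for a signal sampled at the line rate f.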
<rjo> and for EMI and clock noise immunity, the smooth wide background noise is harmless because pll bandwidths are small. it is always problematic if the noise is peaking and one of the peaks is near the clock (pretty much guaranteed for commensurate frequencies). "data" is harmless and "idle" is dangerous...
rohitksingh_wor1 has joined #m-labs
rohitksingh_work has quit [Ping timeout: 264 seconds]
<rjo> sb0: the drtio demo looks good! do you plan on doing a round trip test (even if it is not in the contract)?
<sb0> I can have a look afterwards. but this is slightly annoying with the channel combination of the xilinx transceivers that I mentioned earlier...
<sb0> maybe we can patch this problem with some migen-based abstraction
<sb0> some sort of "transceiver manager" module that provides saner abstraction than the xilinx mess
<sb0> maybe the transceiver manager can be a library of its own that provides transceivers for various fpga families
<sb0> bare transceivers without encoding, elastic buffers, channel bonding etc.
<sb0> those can be put into fabric libraries
<sb0> _florent_, have you touched altera and lattice transceivers?
sb0 has quit [Quit: Leaving]
<larsc> altera has what I think they call superfunctions, which have a similar level of abstraction if not even worse than the xilinx ones
<larsc> It might be that they have more low level primitives as well, but I haven't seen those
<larsc> megafunction is what they are called
<cr1901_modern> rjo: Wanted to create a higher level interface than SPIMachine to attach a bus (Wishbone, CSR) to. It's an idea that's not panning out.
<rjo> cr1901_modern: yet another layer, between SPIMachine and the (wishbone) SPIMaster? i would not try that.
<cr1901_modern> rjo: Ack. I'm aborting this train of thought.
<cr1901_modern> It's not working like I thought it would; your SPIMaster is tied to the wishbone bus to increase performance, and decoupling it won't gain much (and might even make things worse)
<cr1901_modern> rjo: Incidentally, while it's not possible to double buffer on a CSR-based core (no ack signal), LM32's branch prediction should reduce the penalty for having to poll the status register.
<rjo> cr1901_modern: good that we use mor1kx ;)
<rjo> which reminds me. there were commits on branch prediction.
<rjo> whitequark: there are some branch predictor changes in mor1kx between us and their master. are you interested in them? they might affect #485 and similar things.
rohitksingh_wor1 has quit [Read error: Connection reset by peer]
rohitksingh has joined #m-labs
sandeepkr has quit [Quit: Leaving]
sandeepkr has joined #m-labs
<_florent_> sb0: no, I haven't touched altera or lattice transceivers myself; a colleague did for lattice transceivers. they are probably simpler than Xilinx's ones but they are limited to 3Gbps on ECP3 and 5Gbps on ECP5
rohitksingh has quit [Ping timeout: 258 seconds]
rohitksingh has joined #m-labs
rohitksingh has quit [Ping timeout: 265 seconds]
rohitksingh has joined #m-labs
FabM has quit [Quit: ChatZilla 0.9.92 [Firefox 47.0/20160604131506]]
<whitequark> sb0: (csr generator) I didn't know about that one; copied from gateware.spi
<whitequark> (race condition) ack, but not quite sure how to fix it. I'll look at it, perhaps
<whitequark> (csr initializer) ack, I will use that
<whitequark> rjo: unclear. #485 could be almost anything.
<rjo> whitequark: sure. one has to start somewhere.
rohitksingh has quit [Quit: Leaving.]
cr1901_modern has quit [Read error: Connection reset by peer]
cr1901_modern has joined #m-labs
mindrunner has quit [Remote host closed the connection]
mindrunner has joined #m-labs
tmbinc__ has quit [Ping timeout: 260 seconds]