sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
sb0 has quit [Quit: Leaving]
sandeepkr has quit [Read error: Connection reset by peer]
sandeepkr has joined #m-labs
fengling has joined #m-labs
stekern_ is now known as stekern
rohitksingh_work has joined #m-labs
MiW has quit [Ping timeout: 244 seconds]
sb0 has joined #m-labs
FabM has joined #m-labs
FabM has quit [Remote host closed the connection]
FabM has joined #m-labs
MiW has joined #m-labs
rohitksingh_work has quit [Quit: Leaving.]
rohitksingh_work has joined #m-labs
<sb0> rjo, what do you think of this? https://world.taobao.com/item/38355814512.htm
<rjo> sb0: do you want to know what it is or do you want to use it for something?
<sb0> I only have a vague idea what it is... what can it be used for?
<rjo> i am pretty sure that the "collimator" in there is a wrong translation. from the looks of it this seems to be just a fiber-coupled fabry-perot filter. use it to clean up a laser or to add/drop channels on CWDM.
<sb0> add channels?
<rjo> an FP reflects pretty much everything it doesn't pass. you add a circulator and then you have an add-drop-mixer.
<rjo> mixing not in the RF sense. more like "muxer"
<rjo> larsc: could i bounce a few jesd/clocking questions off of you?
<sb0> rjo, can telecom FPs be used at any wavelength, or do they use things like dichroic mirrors?
<rjo> they most likely only work at telco wavelengths. certainly the Q is terrible at other wavelengths, other meaning > 100 nm away.
<rjo> but i have e.g. seen an SFP module designed for 1300 nm receive fine at 1050.
<larsc> rjo: sure
<sb0> rjo, if I build this http://www.repairfaq.org/sam/manuals/sfpiins1.htm I suppose I can use it to check if ECDL tuning is working?
<rjo> sb0: yes. that's a standard alignment tool. you are building one of these: https://www.thorlabs.de/newgrouppage9.cfm?objectgroup_id=859
<rjo> sb0: it won't get you the absolute frequency though.
<rjo> larsc: ok. let me start with laying down what we have/want.
<rjo> larsc: we have a distributed real-time input-output architecture. we have a "master" device that is the authoritative source of a 125 MHz clock and a synchronization signal that marks e.g. a certain T=0 clock interval. we can replicate that clock and the synchronization very well somewhere else.
<rjo> let's say that "somewhere else" is a MCH in uTCA language (if you know that ecosystem).
<rjo> if you don't know it: it's basically a rack with blades and the MCH is the star center node of the rack.
<rjo> and the rack has ample tools to get good fast clocks and synchronization signals to the various blades. on the blades there is an fpga and an ad9154 jesd204b dac.
<larsc> ok
<rjo> larsc: sorry. still setting up the story.
<sb0> there are so many french words in this domain. étendue, finesse, étalon
<whitequark> the étalé space of a sheaf
<whitequark> (I have no idea what that means. I've been wondering for years)
<rjo> summarizing so far: we have a clock and a time in the rack. that clock and the time are replicated to all the blades as well. the rack controller (mch) could distribute another "sample" or "dac" clock (e.g. 2 GHz) to the blades. and maybe even a SYSREF signal.
<rjo> call that first clock/time signal pair (it is actually embedded and reconstructed from a serial datastream but that doesn't matter) the DRTIO clock (125 MHz, all nicely deterministic latency and so on).
<rjo> call the second pair "the DAC clock". i guess that would be deviceclock and SYSREF in JESD language.
<rjo> everything is derived from a central clock and therefore phase synchronous. the only unknown is the alignment of SYSREF w.r.t. the RTIO clock.
<larsc> if you have a good reference clock and a synchronization event you can probably get away with generating the deviceclock and sysref locally
<larsc> your problem will be the sync signal
<rjo> we would do JESD204B SC 1 between the DACs and their FPGAs and then also time-stamp a SYSREF edge on the RTIO clock and use that to re-shuffle the DAC samples in the FPGAs to.
<rjo> in general i would think that i can't generate SYSREF from an FPGA because that would not meet the timing spec w.r.t the DAC deviceclock.
<sb0> rjo, have you considered using another GTX as a DTC to generate the sync signals?
<larsc> rjo: is this existing hardware or are you designing something new
<larsc> ?
<rjo> sb0: SYSREF is very slow. 15 MHz is the number i remember. maybe that works but maybe the preemphasis stuff etc all messes with the slow clock....
<sb0> I think there will be no problem with that
<sb0> as I understand it, preemphasis only alters the signal on a short time scale (<UI)
<sb0> it's just a pre-driver and a post-driver, that are enabled shortly before/after the main driver and modulate the current
<rjo> larsc: part of it exists (the uTCA rack and the DRTIO mechanism and the DRTIO transceivers etc). other parts we would like to prototype using eval boards (KC705 or KCU105 and AD9154-FMC-EBZ). and finally a big part is hardware that we are designing.
<larsc> hm
<sb0> so GTX are general-purpose IOs at 12GHz...
<larsc> I probably would have used a local clock chip for sysref and deviceclock generation
<larsc> that makes it a lot easier to meet all the requirements
<rjo> larsc: finally my questions: does it look like we could implement something like a custom jesd204b-style mcs in gateware on the fpgas even if we don't have a common source of sysref?
<rjo> larsc: but ultimately we need mcs between all the blades.
<sb0> if we hook up everything to GTXs, we can probably deal with a lot of things in gateware
<rjo> sb0: but the big question is still: to what reference do we timestamp?
<rjo> sb0: we have our timestamp reference on DRTIO clock only so far.
<sb0> wouldn't the DACs be clocked at a multiple of the DRTIO clock?
<rjo> sb0: what we could look at is running the serial side of the GTX with the fast DAC clock and the slow side with RTIO.
<rjo> then we get a timestamp with DAC-sample granularity/resolution in RTIO domain.
<sb0> the serial side of the GTX has to go through the QPLL or CPLL
<rjo> sb0: does that have fixed and adjustable phase input-to-output?
<sb0> I don't think you can have a pass-through there. so you'd need to divide and remultiply with the Q/CPLL
<rjo> larsc: or rephrasing the question: is there an obvious way to do mcs and det-lat between FPGA-DAC blades given that we are "almost there" with our RTIO clock?
<sb0> fixed I think so, but let me see if I find info about this...
<larsc> rjo: you'll need sysref to align the DACs, and depending on how fast your deviceclock is you might run into timing issues if sysref and deviceclock are not source synchronous
<sb0> rjo, you definitely cannot disable the Q/CPLL. the GTX clock input takes 700MHz max
<larsc> are you planning on using the PLL inside the DAC?
<sb0> rjo, I don't find any info about configuring the Q/CPLL skew. but assuming it is constant doesn't sound crazy.
<rjo> larsc: (sysref alignment) yes. that's what i fear.
<rjo> larsc: (pll in dac) no.
<rjo> it does sound like the workaround to not having "proper" jesd204b sc1 mcs would be to timestamp sysref in the fpga somehow and then shuffle the samples.
<rjo> but it needs to be timestamped to deviceclock (2 ghz) resolution and referenced to rtio clock.
<rjo> so what we could do is a start-stop counter in the gtx (free running) and then start on rtio clock and stop on sysref.
<sb0> if we need to timestamp stuff, GTXs can also be used as TDCs
<larsc> I don't understand how timestamping in the fabric works for the transmit path
<sb0> if we use ultrascale, we can perhaps use the SERDES at that rate with some hackery too
<sb0> hm, or maybe not
<larsc> your non-det delay will be between FPGA output and DAC output
<rjo> larsc: but that we solve with jesd204b sc1
<rjo> larsc: completely without mcs and the standard way.
<larsc> ok, the timestamping is used to align the incoming datastream to the reference?
<rjo> but now we still have to align that machinery with respect to our DRTIO clock. maybe "aligning" is the wrong word. but just timestamping and then correcting for the delay.
<rjo> yes
<sb0> but since it's a clock, we can scan an IDELAY as well
<rjo> sb0: i guess that a start-stop-counter with two gtx would be fine.
<larsc> ok, so we can assume that at the input to the GTX all samples across all FPGAs are aligned, right?
<sb0> start-stop-counter?
<rjo> sb0: just two TDC and take the difference. then you can ignore the QPLL phase.
<rjo> larsc: i guess. but let me try to define that.
<sb0> ok. yes, if we have spare GTXs
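A minimal Python sketch of the two-TDC idea above, assuming both TDCs are clocked from the same PLL so that its unknown phase offsets both readings equally. The function name, the picosecond units and the 500 ps unit interval are illustrative assumptions, not taken from any GTX documentation.

```python
def tdc_difference(t_start_ps, t_stop_ps, pll_phase_ps, ui_ps=500):
    """Timestamp two edges with two TDCs and return their distance in UI.

    Both readings carry the same unknown PLL phase offset (assumption),
    so the offset drops out when the two codes are subtracted.
    """
    start_code = (t_start_ps + pll_phase_ps) // ui_ps  # TDC on the RTIO edge
    stop_code = (t_stop_ps + pll_phase_ps) // ui_ps    # TDC on the SYSREF edge
    return stop_code - start_code


# Edges 7 UI apart read as 7 regardless of the PLL phase.
for phase in range(0, 2000, 37):
    assert tdc_difference(0, 3500, phase) == 7
```

For edges that are not a whole number of UI apart, the result can differ by one count depending on the phase, which is the usual quantization of a counter-based measurement.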
<sb0> delay scanning is what is done for ddr3 write leveling btw
<sb0> this works on regular ios
<rjo> larsc: at every RTIO clock cycle (8ns) i have 8 samples (data clock into the dac is 1 GHz, but the dac does 2x interpolation).
<rjo> the RTIO clock is our preferred time reference when talking about this.
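The numbers quoted above can be checked with a few lines of Python. This is only the arithmetic from the conversation (125 MHz RTIO clock, 1 GHz data rate into the DAC, 2x interpolation); the variable names are illustrative.

```python
from fractions import Fraction

RTIO_CLOCK_HZ = 125_000_000       # DRTIO clock from the conversation
DAC_DATA_RATE_HZ = 1_000_000_000  # sample rate into the DAC
INTERPOLATION = 2                 # interpolation factor inside the DAC

rtio_period_ns = Fraction(10**9, RTIO_CLOCK_HZ)              # length of one RTIO cycle
samples_per_rtio_cycle = DAC_DATA_RATE_HZ // RTIO_CLOCK_HZ   # input samples per RTIO cycle
dac_clock_hz = DAC_DATA_RATE_HZ * INTERPOLATION              # DAC clock after interpolation
dac_cycles_per_rtio_cycle = dac_clock_hz // RTIO_CLOCK_HZ    # DAC clock cycles per RTIO cycle

print(rtio_period_ns, samples_per_rtio_cycle, dac_clock_hz, dac_cycles_per_rtio_cycle)
```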
<larsc> where and how is the DAC clock generated?
<rjo> larsc: yes. good question. we would either a) distribute the DAC clock to the blades (but then we would also want to distribute SYSREF) or b) we would just generate the DAC clock/deviceclock and SYSREF on every blade.
<rjo> ... generate it from the RTIO clock, that is.
<larsc> if you can generate them both and make sure they are source synchronous that would solve all issues I guess
<rjo> afaics we can not guarantee that the RTIO clock meets setup/hold w.r.t. the dac clock no matter what we do. so there would be a CDC between them. but they are exactly an integer frequency ratio.
<rjo> larsc: are you referring to a)?
<larsc> b)
<sb0> rjo, why can't we guarantee that?
<sb0> too much jitter?
<rjo> sb0: that would also need to be source-synchronous. then the RTIO clock and 2 GHz and SYSREF would need to come from a common source.
<rjo> sb0: we can do that on a single blade. yes.
<sb0> distributing just the RTIO clock and generating SYSREF/DAC clock on the blades sounds fine imo
<rjo> sb0: but remember that this is now systematically excluding the "external low noise clocking" idea that has been passed around.
<sb0> unless there are concerns about clock multiplication on the blades having high phase noise
<sb0> yes
<rjo> larsc: hmm. ok. i suspect i need to do a bit more thinking here. thanks a lot.
<sb0> well, if we let the user give the DAC clock directly, then the FPGAs have to do some magic
<rjo> sb0: if we do external clocking (feed 2 GHz into the blade through the front), then divide by X to generate sysref on the board and divide by Y to generate an "output-side" RTIO clock, we still need a CDC between the data input side RTIO and the output side RTIO.
<rjo> which is fine.
<sb0> mh, does that CDC have constant latency?
<sb0> CDC generally does not
<sb0> in that case, how do we sync RTIO timestamps?
<rjo> right. that doesn't work. we would still need to reset the X and Y dividers based on the input RTIO clock.
<sb0> yes, that too
<rjo> and then it's pretty much the same as doing a CDC between the RTIO output domain and the JESD framing domain.
<sb0> yes, and I suggest keeping the JESD problems in JESD-land
<rjo> it's not really a jesd problem.
<sb0> well, external DAC clocking problems
<rjo> it's also not external. the samples need to be shuffled.
<rjo> ;)
fengling has quit [Ping timeout: 240 seconds]
<rjo> i am still trying to understand whether there is a generic way (using all the tools that we have, clocking, synchronization, sample shuffling, timestamping) to do det-lat and mcs for the case of timestamped samples in a slow clock domain to a high-speed DAC..
<sb0> can't we reprocess the external DAC clock with some low-phase-noise device to align it with the internal DAC clock generated from the RTIO clock?
<sb0> that will also deal with the problem of users feeding each blade with a different 2GHz phase
<larsc> I don't think the DAC clock matters that much, it's sysref what is important
<larsc> on the data path you have multiple CDCs anyway
<sb0> how much phase noise are we talking about? aren't there DLLs or PLLs we can use?
<rjo> larsc: for noise reasons it's the dac clock. some people say they can do a low noise clock much better outside the confines of the blade. that's why they want to feed it in from the outside. but the problem is that that clock is obviously without sysref and without alignment to our rtio "superclock".
<larsc> hm, right
<larsc> some of the JESD devices have a sysref setup and hold monitor, but it looks like the one you are using does not
<rjo> larsc: afaict that only "moves" the problem around.
<rjo> phase drifts are not so much of a problem since all clocks and signals must be commonly sourced anyway.
<sb0> rjo, if they do that, would they be responsible for having the same phase at each blade?
<rjo> sb0: yes. but still.
<rjo> what i would like to do is the following:
<rjo> a) in the case that we generate dacclock and sysref from rtio clock internally (using a low-noise oscillator and a clock distribution chip), i'd like to be able to either a1) measure the phase of sysref w.r.t. rtio clock or a2) reset the sysref divider w.r.t rtio clock
<rjo> b) in the case where clock comes in externally, use the same clock distribution and division chip (but not the oscillator), and do the same thing again, leaving the responsibility to phase delay the external clock so that it meets rtio-to-dac_clock setup/hold
<sb0> who would do the phase delay? the user?
<sb0> we can give the user phase indications
<sb0> sounds good
<rjo> for the a1) rtio-to-sysref phase measurement we would need to be sample-accurate (2 ghz) that means either a dual gtx tdc or something else.
<sb0> can't the fpga generate sysref?
<sb0> for periodic signals, you can use a regular IO and a scanning IDELAY
<rjo> for the a2) resetting we need to be able to generate very fast edges to meet the window for dac_clock.
<sb0> well, resetting sounds as hard as generating it directly
<rjo> sb0: that fpga-generates sysref is what larsc suggests not to do because it leads to the misalignment problems.
<sb0> instead of resetting, what is easy is telling the divider to shift by e.g. having a pulse which lasts one DAC period longer. but that requires a smart divider.
<sb0> Kintex-7 IDELAY has taps down to 39ps average
<sb0> that's two orders of magnitude better than what we need...
<sb0> er
<sb0> one
mumptai has joined #m-labs
<sb0> the smart divider and the idelay techniques do not require another GTX. others do I think.
<larsc> so how does that work in detail?
<sb0> idelay scanning?
<larsc> yes
<larsc> what is connected where?
<rjo> larsc: ad9154 does support 'one-shot then monitor sync' (SYNCMODE=0x9). is that what you were referring to?
<sb0> a clock with an unknown phase is sampled via IDELAY + FF by a clock of a known phase of the same frequency
<larsc> rjo: I don't think so, this sounds more like it checks that sysref does not get out of phase in relation to the lmfc
<sb0> you wiggle the IDELAY to align the two, the value of the IDELAY then gives you the phase difference
<sb0> it's similar to how DDR3 write leveling works
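The IDELAY scan described above can be modelled in a few lines of Python. This is a behavioural sketch only, not gateware: the 39 ps tap is the figure mentioned in the conversation, while the 32 taps, the 1000 ps clock period, the ideal square wave and the function names are assumptions chosen so that one full period fits into the scan range.

```python
def sample(phase_ps, delay_ps, period_ps):
    """Level seen by the reference-clock flip-flop at t = 0.

    Rising edges of the delayed clock sit at phase + delay + k * period,
    and the clock is high for the first half of each period.
    """
    return 1 if (-(phase_ps + delay_ps)) % period_ps < period_ps / 2 else 0


def scan_idelay(phase_ps, period_ps=1000, tap_ps=39, n_taps=32):
    """Sweep the delay taps and return the first tap where the sampled
    level drops from 1 to 0, i.e. where the delayed rising edge has just
    moved past the sampling edge. Returns None if no such tap exists."""
    levels = [sample(phase_ps, tap * tap_ps, period_ps) for tap in range(n_taps)]
    for tap in range(1, n_taps):
        if levels[tap - 1] == 1 and levels[tap] == 0:
            return tap
    return None


tap = scan_idelay(300)
print(tap, tap * 39)  # tap index and the delay it corresponds to in ps
```

In this model the delay at the returned tap plus the unknown phase equals one clock period to within one tap, so the tap value gives the phase difference with tap resolution.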
<larsc> I don't know how DDR3 write leveling works
<larsc> sb0: what is the max clock rate this would work for?
<sb0> if you use the SERDES, over a GHz
<sb0> well, Gbps, so over 500MHz
<larsc> sb0: with the IDELAY there seems to be quite a bit of a difference between minimum delay and maximum delay for a specific tap setting
<larsc> up to 300ps
<sb0> where did you see that?
<larsc> looking at the timing report how it affects hold and setup of the signal that goes through it
<sb0> hm
<larsc> although that was with the low-performance setting
<sb0> do you know clock dividers that nicely allow adjusting the phase of their divided clock?
<rjo> larsc: (syncmode=0x9) isn't that the same? it tells you whether sysref is out-of phase w.r.t. lmfc with dac_clock granularity. (that's not violating s/h timing but it's close)
<sb0> and I don't really understand your concern about generating SYSREF with another GTX
<rjo> sb0: in the worst case (2.4 GHz dac_clock) you need to hit a 200 ps window with the sysref edge.
<sb0> yes? the GTX seems to be the right tool for that, as it provides UI down to 83ps
<larsc> rjo: it might help to detect the effect of a setup or hold violation
<rjo> how do you know that you hit it?
<rjo> larsc: ack. but it doesn't prove that you are meeting requirements.
<sb0> you still have that problem with the other techniques
<sb0> I suppose one will have to check the DAC outputs with a good scope
<rjo> worse. temperature changes might invalidate your efforts because there is so much different logic between the two branches (dac_clock and sysref)
<sb0> can't we use one of those DACs that has SYSREF monitoring features?
<rjo> you could in principle use that syncmode=0x9 to find a lower limit where that error counter starts increasing and an upper limit where it starts increasing and then choose the middle. that's the same as for ddr3
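The "find both limits and choose the middle" step can be sketched in Python as follows. The list of error counts is hypothetical input: how the error counter is actually read from the DAC for each delay setting is not shown and is left as an assumption.

```python
def center_of_passing_window(error_counts):
    """Return the middle index of the longest run of zero-error settings,
    or None if every setting reported errors."""
    best_start, best_len = None, 0
    start = None
    # A non-zero sentinel at the end closes a run that reaches the last setting.
    for i, errors in enumerate(list(error_counts) + [1]):
        if errors == 0:
            if start is None:
                start = i
        elif start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2


# Hypothetical scan: errors at both ends, clean window in between.
print(center_of_passing_window([5, 3, 0, 0, 0, 0, 0, 2, 9]))
```

Picking the centre of the widest clean window leaves the largest margin on both sides, which is the same reasoning used for DDR3 read/write calibration windows.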
<larsc> It's tricky, I tried something like that last week
<rjo> but still. it's nasty. i have no idea how frequently we need to retime all this. there is so much different logic between the two paths.
<larsc> I could not really get it to report errors
<rjo> larsc: ?
<larsc> see if sysref setup/hold is met based on whether the sampled signal is fully periodic
<larsc> on the fpga side though
<larsc> I've now built a solution using IDELAYs, I hope that works better
<rjo> larsc: is that with "continuous sync mode"?
<rjo> larsc: so that the sample shuffling constantly tracks lmfc and lmfc is constantly reset by sysref?
fengling has joined #m-labs
<larsc> is what with?
<rjo> larsc: that "using periodicity to check for s/h on sysref".
<larsc> what I did was continuous sync mode and then check that once the lmfc is active sysref is always perfectly aligned to it (with respect to the device clock)
<rjo> by looking at the periodicity of ADC samples?
<larsc> in the fabric
<rjo> yes.
<larsc> sorry, I can't follow
<larsc> our thoughts are not aligned ;)
<rjo> all good. i think i understand what you did.
rohitksingh_work has quit [Read error: Connection reset by peer]
<rjo> larsc: what actually happens to the missing/superfluous samples if an lmfc realignment happens? dropping/repeating/stuffing zeros?
fengling has quit [Ping timeout: 240 seconds]
<rjo> (this is in the fpga fabric on an adc setup).
<larsc> the standard says something about repeat the previous frame on error
<larsc> but I don't think all implementations do this
<larsc> it should really be application specific
<larsc> the easiest is to report an error and shut everything down and let the application reinit things
<rjo> ok. but i guess dropping/repeating would be generically ok. also since lmfc realignments are not expected to happen at all (in some sense).
<sb0> I have other ideas
<sb0> what about not doing any JESD synchronization and feeding the DAC output back into the FPGA for e.g. IDELAY scanning or DDMTD?
<sb0> if AD messed up that chip like they messed the 9914, it will lose sync over time, though
<rjo> it's also lane alignment.
<rjo> you need to get every dac to the fpga and do it over and over again
<sb0> yes, use some rf switch
rohitksingh has joined #m-labs
<rjo> tagging sysref edges the same way seems to be significantly simpler.
<sb0> how do you ensure sysref setup/hold?
<rjo> it is source synchronous.
<sb0> yes, but how do you tune its delay wrt the dac clock?
<sb0> you still need to do that.
<rjo> you don't. it is source synchronous.
<rjo> you make it with a clock distribution chip the usual way
<rjo> then it meets s/h
<rjo> then you measure sysref-to-rtio to better than 1 dac_clock and you shuffle the samples accordingly.
<rjo> you can do the measurement with a) two gtx or b) idelay scanning or c) ddmtd if you really want to.
<sb0> so that delay should be zero, is that what you are saying?
<rjo> no. let that delay be X samples. then you delay/rearrange the samples in a frame to come out at the right time by effectively delaying them by M - X. where M is the number of samples in a multiframe.
<sb0> what I mean is: the sysref edge is aligned with a dac clock edge.
<rjo> or better (M-X) % 16
<rjo> yes. it is. by construction.
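The sample shuffling described above, delaying by (M - X) % 16, can be sketched in Python. The modulus 16 is taken as-is from the message; the function names, the zero fill and the multiframe length of 32 in the example are illustrative assumptions.

```python
def shuffle_delay(measured_offset, multiframe_len, window=16):
    """Delay in samples that compensates a measured sysref-to-rtio offset X,
    computed as (M - X) % window."""
    return (multiframe_len - measured_offset) % window


def delay_stream(samples, delay, fill=0):
    """Delay a sample stream by prepending `delay` fill samples."""
    return [fill] * delay + list(samples)


M = 32  # hypothetical number of samples per multiframe
for x in range(M):
    # Measured offset plus correction always lands on the same position
    # modulo the window, whatever X turned out to be.
    assert (x + shuffle_delay(x, M)) % 16 == M % 16

print(shuffle_delay(5, M))                # correction for X = 5
print(delay_stream([1, 2, 3], 2))         # stream delayed by two samples
```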
<rjo> this is by the way a setup that we can already test and implement on kc705+ad9154-fmc-ebz
<sb0> okay, we get sysref from that card?
<sb0> I reckon DDMTD gives the highest precision, which may come useful if we want to give phase indications to the user, maybe
<rjo> yeah. but there are a bunch of implementation details with ddmtd.
<rjo> we get sysref already to do jesd204b sc1
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
<rjo> you don't really need high precision. you only need to locate it within a dac clock period.
acathla has quit [Quit: Coyote finally caught me]
acathla has joined #m-labs
<rjo> _florent_: do i see that right that we would shoot for mode 2 (table 44 page 46)? (or mode 4 with two channels).
<rjo> it seems to me that one would generically feed a jesd204b core with multiframes. then all the behind-the-scenes stuff is in the core.
<rjo> alternatively one would feed it with (multiple) frames plus a strobe to indicate start-of-multiframe.
<rjo> larsc: ^^ does that make sense?
<larsc> difficult to say, even if you align the input in multiframes, as soon as you have an underflow things will be out of sync
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
<_florent_> rjo: yes with the KC705 I was going to use mode 2 (4lanes / 4 converters)
<larsc> interesting result of the idelay s/h tester, there is a region where it always works, followed by a region where it fails 50%, followed by a region where it always fails. but there is pretty much nothing else, only a single tap value where it fails more randomly and also dependent on the temperature
<larsc> I would have expected more of a gradient
fengling has joined #m-labs
<rjo> larsc: interesting. but i would not expect much randomness (and the corresponding gradient).
fengling has quit [Ping timeout: 240 seconds]
<rjo> apart from actual jitter i would expect all randomness to be suppressed by double or triple registering that thing
<rjo> i guess you could actually interpret this as a pure jitter measurement.
<larsc> there is an almost 1ns window where it is pretty much 50/50 whether the delayed signal sees the same edge on the same clock cycle
<larsc> and only on the edges of those windows there is one tap where things are not quite equal and depend on the temperature
<larsc> http://metafoo.de/sysref_monitor.png 0 is good, f is bad
<larsc> and the 6 and the c change based on temp
fengling has joined #m-labs
<rjo> larsc: are you just scanning the phase of the sysref output of the clock distribution chip?
<rjo> larsc: the 0x8 is a bit weird yes. but could be some dynamic effect since you are hitting the edge _every_ cycle.
fengling has quit [Ping timeout: 240 seconds]
<rjo> larsc: (about the multiframe/frame interface) but where do you get underflows?
<rjo> larsc: how fast is the deviceclock?
<larsc> rjo: ah, right, I'm looking at both edges, the 0x8 probably means one of them always is good and the other always is bad
<rjo> larsc: does the sync mechanism react to rising and falling sysref?
<larsc> sync should only react to rising according to the spec
<larsc> you can get away with non 50/50 duty cycle
<rjo> and what you are scanning there is actually an idelay on sysref before it goes into the jesd204b core?
<larsc> yes
<rjo> and speed of sysref/deviceclock_adc are?
<larsc> I have two idelays, one at a fixed offset of 15 (half the maximum delay) the other cycling from 0 to 31
<larsc> and then compare the output of the two when an edge is detected
<larsc> deviceclock is 250MHz
<larsc> sysref 7.???
<larsc> 7.125 or something
<rjo> ah.
<rjo> just 32 taps? aren't there 255 on spartan6?
<rjo> no. forget what i said.
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
sandeepkr has quit [Ping timeout: 240 seconds]
rohitksingh has quit [Quit: Leaving.]
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
acathla` has joined #m-labs
acathla has quit [Read error: Connection reset by peer]
acathla` is now known as acathla
fengling has joined #m-labs
acathla has quit [Changing host]
acathla has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
<rjo> wow. the spec says that there is space for 16 SFP on a full-size double-wide AMC front panel...
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
mumptai has quit [Quit: Verlassend]
kuldeep has quit [Ping timeout: 244 seconds]
kuldeep has joined #m-labs
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]