sb0_ changed the topic of #m-labs to: https://m-labs.hk :: Logs http://irclog.whitequark.org/m-labs
_rht has quit []
_rht has joined #m-labs
_rht has quit [Client Quit]
X-Scale has quit [Ping timeout: 240 seconds]
_rht has joined #m-labs
bluebugs has quit [Quit: ZNC 1.7.1 - https://znc.in]
cedric has joined #m-labs
cedric has quit [Changing host]
cedric has joined #m-labs
_whitelogger has joined #m-labs
X-Scale has joined #m-labs
<sb0> hartytp: it's not really glitches; when 3.3V fails it does not recover until a power cycle, and the board is completely dysfunctional when that happens (no JTAG etc.)
<sb0> I don't think this is related to the hmc830 problem, but the latter can still be some silly hw problem...
<sb0> so, I tried running "conda create -n foo package.tar.bz2" to check if it would install dependencies automatically, but of course, it simply crashed instead
<sb0> the workaround is to create the environment first, then install a non-noarch package in it, then "conda install package.tar.bz2"
<sb0> typical conda behavior
<sb0> and of course, then it doesn't even look at the dependencies. no error, no warning, simply the package gets thrown into the environment.
<sb0> workaround: conda install -c file://<package_directory> <package_name>
<sb0> .....
<sb0> "--no-deps Do not install, update, remove, or change dependencies. This WILL lead to broken environments and inconsistent behavior. Use at your own risk."
<sb0> so it warns you about it, but then does it silently anyway
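The local-channel workaround above can be sketched as a small command builder. This is only an illustration: the helper name `local_channel_install` and the example directory and package name are hypothetical, and it assumes the directory is an absolute POSIX path that is already laid out as a conda channel.

```python
from urllib.parse import quote


def local_channel_install(package_directory, package_name):
    """Build argv for `conda install -c file://<package_directory> <package_name>`.

    Hypothetical helper: it only assembles the command, it does not run conda.
    """
    if not package_directory.startswith("/"):
        raise ValueError("package_directory must be an absolute POSIX path")
    # file:// followed by an absolute path gives the usual file:///... form
    channel = "file://" + quote(package_directory)
    return ["conda", "install", "-c", channel, package_name]


# Example values are placeholders, not paths taken from the log.
print(local_channel_install("/home/user/conda-bld", "artiq"))
```

Going through a channel rather than passing the tarball directly is what makes conda consult the package's dependency list, according to the behaviour described above.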
_rht has quit [Quit: Connection closed for inactivity]
<sb0> now "conda create -n artiq-kc705-nist_clock", with a fresh conda install, has been "solving environment" for 10 minutes at 100% CPU usage, and still not done. how do people use this crap?
<sb0> workaround: "conda create -n xxx artiq-kc705-nist_clock python=3.5", activate, then conda install ...
<sb0> maybe i should update the docs...
<sb0> bb-m-labs: force build artiq
<bb-m-labs> build forced [ETA 58m20s]
<bb-m-labs> I'll give a shout when the build finishes
<sb0> power-cycled the kasli and reconnected USB, flashing problem is gone
<bb-m-labs> build #2305 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2305
<bb-m-labs> build #2306 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2306
<bb-m-labs> build #2852 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2852
proteusguy has quit [Ping timeout: 272 seconds]
<whitequark> adamgreig: either is fine
<_whitenotifier-c> [m-labs/nmigen] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/fhgKX
<_whitenotifier-c> [m-labs/nmigen] whitequark 2c80f35 - lib.fifo: fix typo in AsyncFIFO documentation.
<whitequark> sb0: conda people could and should have used aspcud, but instead they wrote their own shitty dependency solver
<_whitenotifier-c> [nmigen] Success. The Travis CI build passed - https://travis-ci.org/m-labs/nmigen/builds/482733767?utm_source=github_status&utm_medium=notification
<_whitenotifier-c> [nmigen] Success. 83.25% remains the same compared to e33580c - https://codecov.io/gh/m-labs/nmigen/commit/2c80f35de46d59024c4524b7cee1ba030db9f086
<_whitenotifier-c> [nmigen] Success. Coverage not affected when comparing e33580c...2c80f35 - https://codecov.io/gh/m-labs/nmigen/commit/2c80f35de46d59024c4524b7cee1ba030db9f086
<whitequark> sb0: I think I found a case where the migen simulator behavior is clearly troublesome
<whitequark> let's say you are trying to read from a FIFO. right now there is a FIFOInterface.read() in nmigen that does the following:
<whitequark> yield self.re.eq(1)
<whitequark> yield
<whitequark> value = (yield self.dout)
<whitequark> yield self.re.eq(0)
<whitequark> that's fine and well, except let's say you (a) call this in a loop, (b) want to check for readability
<whitequark> and you cannot do this. you have one of the two options: 1) spending 2 cycles per read instead of 1, 2) only knowing whether the FIFO (was) readable after advancing the timeline
<whitequark> now, this does not particularly hurt with a FIFO, because re is &-d with readable
<whitequark> but I think not being able to implement this pattern is clearly a deficiency
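The trade-off described above can be reproduced with a toy model. This is not migen's or nmigen's simulator: `ToyFifo`, `run`, and the tuple commands are made up for illustration. It assumes a first-word-fall-through FIFO, that testbench writes only become visible to the FIFO at the next clock edge, and that testbench reads see the state left by the previous edge.

```python
from collections import deque


class ToyFifo:
    """Toy FWFT FIFO: pops on a clock edge where re is high and data is present."""

    def __init__(self, items):
        self.queue = deque(items)
        self.re = 0  # read enable as currently seen by the FIFO

    @property
    def readable(self):
        return int(bool(self.queue))

    @property
    def dout(self):
        return self.queue[0] if self.queue else None


def run(fifo, process):
    """Drive a generator testbench; return the number of clock edges it used."""
    pending_re, cycles, reply = None, 0, None
    while True:
        try:
            cmd = process.send(reply)
        except StopIteration:
            return cycles
        reply = None
        if cmd is None:  # bare `yield`: one clock edge
            if fifo.re and fifo.queue:
                fifo.queue.popleft()
            if pending_re is not None:  # testbench write lands after the edge
                fifo.re, pending_re = pending_re, None
            cycles += 1
        elif cmd[0] == "set":  # stands in for `yield self.re.eq(value)`
            pending_re = cmd[1]
        else:  # stands in for `(yield self.readable)` or `(yield self.dout)`
            reply = getattr(fifo, cmd[1])


def read_checked(out):
    """Option 1: extra edge per read so that `readable` is up to date."""
    while (yield ("get", "readable")):
        yield ("set", 1)
        yield
        out.append((yield ("get", "dout")))
        yield ("set", 0)
        yield  # FIFO pops here; only now does `readable` reflect the read


def read_stale_check(out):
    """One edge per read, but `readable` is checked before the pop happened."""
    while (yield ("get", "readable")):
        yield ("set", 1)
        yield
        out.append((yield ("get", "dout")))
        yield ("set", 0)
```

With three items queued, `read_checked` takes 6 edges and returns the three items, while `read_stale_check` takes 4 edges but performs one bogus extra read (a `None` in this model) because the readability check lags the pop by one edge.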
<GitHub-m-labs> [artiq] sbourdeauducq pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/217493523199aea8649c7f91a63b0bbe3173d645
<GitHub-m-labs> artiq/master 2174935 Sebastien Bourdeauducq: nix: update package descriptions
<sb0> _florent_: still won't work with copper sfp cable between sayma and kasli
<sb0> and strangely enough I cannot reproduce the results with the SFP loopback, it now receives garbage data as well
_whitelogger has joined #m-labs
<sb0> more exactly: the PMA loopback (i_LOOPBACK=0b010) still works, but the physical hardware loopback is fucked
<bb-m-labs> build #2307 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2307
<sb0> same behavior on another sayma board ...
<sb0> rjo: the "PLL lock timeout" kasli_tester/ad9910 bug is reproducible on kasli-1
<bb-m-labs> build #2308 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2308
futarisIRCcloud has joined #m-labs
m4ssi has joined #m-labs
proteusguy has joined #m-labs
<rjo> sb0: the buildbot didn't reproduce it before and i couldn't reproduce it yesterday morning. i saw it the first time seconds before the flashing failed.
<sb0> rjo: ok, it's easily reproduced right now
<bb-m-labs> build #2853 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2853 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>
<rjo> sb0: no. works fine when i try it.
cjbe has joined #m-labs
<cjbe> sb0 / rjo: I also saw these PLL lock timeouts. Reverting the 3 "configurable refclk divider" commits solved it.
proteusguy has quit [Ping timeout: 245 seconds]
<cjbe> There seemed to be some nasty persistent state - we ran on these commits for a while, and the lock timeout only appeared after powercycling the Kasli
<cjbe> I suspect this is due to the refclk input divider reset bit (cfr3) doing something funny / undocumented
<rjo> this was my sequence master: bad, a467b8f8 (before that change): good, 385916a9: good, 40187d19: good, master: good.
<rjo> i also noticed twice that some sequence of register writes around that input divider led to no output at all when there should have been one. but that was fixed by doing a different sequence.
<rjo> sb0: could you cycle that crate?
<cjbe> So I tried power cycling the Kasli, then doing the CPLD / DDS channel init. I saw PLL lock timeouts on yesterday's master, then reverted 2bea5e3..40187d1 and saw no lock timeouts. I repeated this a couple of times swapping between working and non-working
<GitHub-m-labs> [artiq] jordens pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/91e375ce6acd6b626b1040abefd1af7c1cdb9d63
<GitHub-m-labs> artiq/master 91e375c Robert Jördens: ad9910: don't reset the input divide-by-two...
<rjo> sb0: i can't reproduce it. but i reverted a change hypothetically related. if you see it, could you bisect it, probably with power cycling?
<sb0> rjo: cycled
<GitHub-m-labs> [artiq] sbourdeauducq pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/9ee5fea88d4d08f41f1cfe19bee88e58b2ac602d
<GitHub-m-labs> artiq/master 9ee5fea Sebastien Bourdeauducq: kasli: support optional SATA port for DRTIO
<bb-m-labs> build #2309 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2309
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
proteusguy has joined #m-labs
<bb-m-labs> build #2310 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2310
<bb-m-labs> build #1002 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/1002
<bb-m-labs> build #2854 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2854
<bb-m-labs> build #2311 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2311
proteusguy has quit [Ping timeout: 250 seconds]
<bb-m-labs> build #2312 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2312
<bb-m-labs> build #1003 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/1003
<bb-m-labs> build #2855 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2855
<GitHub-m-labs> [artiq] sbourdeauducq pushed 1 new commit to release-4: https://github.com/m-labs/artiq/commit/139e87de3669bf95f535e0777982734a64f43289
<GitHub-m-labs> artiq/release-4 139e87d Sebastien Bourdeauducq: firmware: fix compilation error with more than 1 Grabber
<GitHub-m-labs> [artiq] sbourdeauducq pushed 2 new commits to master: https://github.com/m-labs/artiq/compare/9ee5fea88d4d...a0eba5b09ba2
<GitHub-m-labs> artiq/master 2e3555d Sebastien Bourdeauducq: firmware: fix compilation error with more than 1 Grabber
<GitHub-m-labs> artiq/master a0eba5b Sebastien Bourdeauducq: satman: support Grabber
<GitHub-m-labs> [artiq] jordens pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/b692981c8e3bbeb48b70846fa8f2a0fb97285bd7
<GitHub-m-labs> artiq/master b692981 Robert Jördens: ad9910: add note about red front panel led...
<bb-m-labs> build #2313 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2313
<bb-m-labs> build #2314 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2314
rohitksingh has joined #m-labs
X-Scale has quit [Read error: Connection reset by peer]
<bb-m-labs> build #1004 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/1004
<bb-m-labs> build #2856 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2856
X-Scale has joined #m-labs
proteusguy has joined #m-labs
rohitksingh has quit [Ping timeout: 246 seconds]
rohitksingh has joined #m-labs
hartytp_ has joined #m-labs
<bb-m-labs> build #2315 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2315
<hartytp_> sb0, rjo: initial notes on Sayma clocking https://hastebin.com/uxahecatey.coffeescript
<hartytp_> if there is anything else you need then let me know
<bb-m-labs> build #2316 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2316
<bb-m-labs> build #1005 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/1005
<bb-m-labs> build #2857 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2857
<rjo> hartytp_: ok.
<rjo> hartytp_: 1. i'd make sure that the loop filter can be made passive if there are issues (design, supply, noise, etc) with the active solution.
<rjo> hartytp_: are you sure that "using the dac to measure sysref" works? does it tell you which deviceclk cycle and/or setup-hold violations?
<hartytp> 1. yes, absolutely
<rjo> hartytp_: where is your sketch with the kc705 digital-wr design?
<rjo> hartytp: aren't there two retiming stages?
<hartytp> may even go for passive by default to allow me to prototype it on Sayma v2.0 (depends on how much time I have to test before v1.0 goes to production)
<hartytp> 2. are you sure that "using the dac to measure sysref" works? does it tell you which deviceclk cycle and/or setup-hold violations?
<hartytp> I think so, but need to test
<hartytp> one-shot-then monitor allows us to synchronise to an edge of the DAC clock, and then see how much we need to move sysref by to hit the next edge
<hartytp> in the case of non-deterministic delays between sysref and dac clock that are a fraction of the dac clock, we'd see the sysref edge moving
<hartytp> anyway, yes, needs testing, but I can do that fairly quickly once we've moved sayma to kernels
<hartytp> two retiming stages?
<hartytp> for sysref?
<hartytp> not in my current plan
<sb0> hartytp_: it's not using oneshot and monitor right now
<hartytp> no, it's not
<hartytp> but if you look back over my notes from testing sync, I couldn't get the current code to work, so I changed it
<sb0> why didn't it work?
<hartytp> I'd have to look back over the github issue
<hartytp> for the actual digital design, message weida (or I can do it on your behalf if easier)
<hartytp> nb using the shared I2C bus makes his code more complex than it needs to be
<sb0> "relatively "easy" problem since we have an ~4ns period" did you mean 8ns?
<hartytp> oops, yes
<hartytp> :)
<hartytp> thanks
<hartytp> sb0: re the DAC sysref detection, maybe I should remove the one-shot-then-monitor from that doc
<hartytp> point is just that if we can make sysref detection work reliably then we can use it to reset the pll. if not then the pll phase isn't an issue anyway
<hartytp> how we implement the dac sync needs more testing
<rjo> hartytp: and was there a sketch of the custom_sync design as well?
<hartytp> rjo: yes, but that's a little out of date now
<hartytp> if it would help, I can draw up a new BD based on the doc I sent
<hartytp> but, there wouldn't be anything in it that's not in the text
<rjo> where is the out of date version? now that you claim there is only one retiming stage i am confused.
<hartytp> github
<hartytp> heading off to meeting now, but can dig it out later
<rjo> ok. found it. i just misread which fanout this is clocked from.
<key2_> whitequark: what is the way you chose to do the .connect in nmigen ?
<key2_> for EPs
<whitequark> .connect? EPs?
<whitequark> I don't understand the question
<key2_> when in migen you do self.comb += source.connect(sink)
<key2_> how do you do that with nmigen ?
rohitksingh has quit [Ping timeout: 246 seconds]
rohitksingh has joined #m-labs
<sb0> hartytp: using it to detect the PLL phase seems a bit fragile. lots of moving parts with uncertainties...
<hartytp> rjo: okay, good. If you still want an updated BD let me know
<hartytp> sb0: yes. AFAICT, it should work, but it does make a complex system more complex
<hartytp> the other option is to add an extra DFF and delay line.
<hartytp> sample the DAC clock from the ref clk
<hartytp> or add a programmable divider with reset. also needs DFF + delay line
<hartytp> but then, we're kind of back to using an HMC7043 but supplying it with a SYSREF signal to do the sync that way
<hartytp> also not simple and needs testing
<hartytp> using the DAC as a phase detector has the advantage of being easy to prototype with current hw
<hartytp> well, I can also easily prototype using a DFF + delay line to synchronise the PLL.
<hartytp> clock DFF from ref clk, input is DAC clock after passing through delay line, output to FPGA. sweep delay and look at the signal at the fpga
<hartytp> not sure if one can scrap the DFF + delay line and do this directly in the FPGA, a la SiPhaser
<rjo> hartytp: ultimately, what was the reason to scrap the hmc7043?
<hartytp> rjo: the way we were using it seemed to hit a lot of issues, and didn't seem to be in line with the intended use case, for which it was correctly tested
<hartytp> i.e. using the FPGA to sample the output phase on one channel and then shifting all channels to do the alignment
<hartytp> AFAICT we would have had better luck if we'd supplied it with a proper SYSREF signal
<hartytp> the way they do in all the app notes
<rjo> RFSYNCIN?
<hartytp> that could be generated using the DFF + delay line (I assume the FPGA has too much timing variation and jitter to do this directly)
<hartytp> yes, RFSYNCIN, which needs to meet S/H from the DAC clock
<rjo> trying to understand this further, JESD204B synchronization is achieved independent of where the SYSREF edge is. it just needs to be deterministic at both source and sink. what was the reason to actively rephase clocks?
<hartytp> if that was the plan then I'm okay with it. however, the thinking was that if we need to use a DFF + delay line to produce the RFSYNCIN then why not just supply that directly to the DAC and cut out the HMC7043
<hartytp> needs to meet S/H to avoid ambiguity
<hartytp> and needs to reproducibly hit the same edge over PVT so want to be near the middle of the window
<hartytp> and low enough jitter
<hartytp> if we were confident that we could produce an output directly from the FPGA that would pick the same DAC clock cycle each time over PVT then we wouldn't need to retime. When we discussed it, the conclusion was that this seemed like a stretch so better to add some cheap retiming hw
<hartytp> DFF + delay line isn't much overhead in terms of money/firmware/board space/power consumption so seemed easier to add it than worry about timing
<rjo> i am not following what you are referring to now. let's split this in two questions. (a) why can't we just use the standard passive SYSREF jesd204b design/why do we need to actively phase SYSREF? (b) how do we implement active SYSREF phasing?
<rjo> let's just talk about (a)
<hartytp> so, my understanding of this is that we want to pick out an edge of the ref clk that corresponds to our t=0
<hartytp> so the DAC output has deterministic latency w.r.t. that clock edge
<hartytp> so, for example, assume the DAC clock has a deterministic phase w.r.t. the ref clock
<rjo> hmm. that's not how i have come to understand jesd204b over the years.
<hartytp> ok
<hartytp> go on..
<hartytp> essentially, my understanding is that SYSREF picks out the edge of the DAC clock that the first sample is released on
<rjo> it's more like: the clock tree generates sysref. sysref marks clock deviceclk cycles on both data sink and data source. the source generates special data at a sample in fixed sample-cycle relation to the special deviceclk cycle. the source aligns its buffer accordingly.
<rjo> what you just said is correct. but the source side also places its marker on the data.
<rjo> that in turn makes the scheme independent of the "absolute" sysref phase.
<hartytp> so, if SYSREF moves by 1 period of the DAC clock then the timing of the first output sample will move by 1 DAC clock period w.r.t. the ref clk
<rjo> the only problem for us may be that the fanout now needs to generate the deviceclk and that should be in phase to the RTIO clock.
<hartytp> aah, yes, I see what you mean
<rjo> but this may actually have a solution.
<hartytp> yes, I think that if the FPGA were clocked from the 7043 then things would be much easier
<rjo> so in fact one could argue that all we want to do is to drive rfsyncin with the rtio clock (modulo frequency).
<rjo> that is if we stick with the hmc7043
<hartytp> rjo: well, RFSYNCIN still needs to have a tunable delay to ensure it can meet S/H at 2GHz
<rjo> again. i am not lobbying. just trying to understand.
<hartytp> rjo: sure, it helps to talk it through
<hartytp> and it also needs to have jitter <<200ps (min DAC clock period less 200ps S/H time, which is my memory of the HMC7043 requirements)
<hartytp> and the drift needs to be at a similar level
<rjo> yes RFSYNC needs a tunable delay if that can't be achieved passively.
<hartytp> I don't see how you'd achieve it passively, since I'm not sure one can model things accurately enough
<rjo> well. you have the rtio clock nearby just upstream of the hmc830.
<hartytp> anyway, it seems easier and safer to add a tunable delay than to rely on modelling to match clocks at that level (after all, this is precisely why the HMC7043 has analog phase shifters)
<rjo> ack. the diff phase through the hmc830 and straight to the rfsyncin needs to be matched.
<hartytp> so...
<hartytp> you take the RTIO clock and phase shift it and feed it to the HMC7043
<hartytp> then re phase shift it and feed that to the DACs
<hartytp> that's not very different to just taking the RTIO clock, phase shifting it and feeding that to the DACs, which is essentially what my scheme is
<rjo> why the "re phase shift it and feed that to the DACs "? that's what the hmc7043 does.
<hartytp> yes, sorry, not clear. that part was what the HMC7043 is doing
<hartytp> my point is that you still need the external delay line and fanouts that we have now. It's a very similar concept.
<hartytp> you're just using the HMC7043 to provide an extra delay line
<hartytp> so you either have one delay line + 1 HMC7043
<hartytp> or two delay lines
<rjo> the only addition over sayma_v1 is the delay line between the hmc830 input and rfsyncin. (plus maybe the fanout upstream).
<hartytp> yes, I think that's correct
<rjo> i am using the hmc7043 to provide fanout and delay lines.
<hartytp> yes. You're using a single-channel delay line + HMC7043. I'm using a two channel delay line.
<hartytp> (well, "I" and "you" isn't quite right, just these are two ideas)
<hartytp> this was my original plan
<hartytp> I moved away from it because it felt to me like it would be simpler to just use the extra delay line.
<rjo> ok. another avenue would be to use some better phase detection scheme to measure the hmc7043 fpga deviceclk phase (w.r.t. rtio). and then just properly align samples based on that measurement. no active shifting needed.
<hartytp> I don't follow
<hartytp> you mean delay the samples in software by adding time offsets?
<hartytp> fwiw, if we had more data converters on the board, the HMC7043 would be a no brainer. For something with only 2 DACs, it's less clear that it provides a nicer path than a couple of discrete components (certainly more registers to program and more potential for undocumented features that take a long time to reverse engineer and understand)
<hartytp> rjo: hmmmm...
<hartytp> I'm not sure that feeding the HMC7043 with the phase shifted rtio clock quite works
<hartytp> IIRC the JESD framing means that it all works on a sub-multiple of the RTIO clock.
<rjo> modulo frequency. yes.
<hartytp> so, one has to produce a divided down RTIO clock, and ensure that the divided clock is correctly synchronised
<rjo> ok. it may require the same retiming and delay from fpga to rfsyncin.
<hartytp> but that division is easy to do in the FPGA
<hartytp> right, so in either case, we have FPGA -> DFF -> delay line
<hartytp> 1 option is to have DFF -> 2ch delay line -> DACs
<hartytp> 2nd option is to have DFF->delay line -> HMC7043 -> DACs
<hartytp> so, ignoring the PLL output divider I think it's hard to say that there is much difference between the two. Personally, I feel that buying a delay line with 2 channels is a simpler solution than adding all the complexity of the HMC7043 (look how many pages the register map runs to)!
<rjo> what i mean with the other avenue is that in order to compensate for the fpga deviceclk phase (out of the hmc7043, without being able to slip the entire tree), you can just do that when you place the marker sample in reaction to sysref alignment (by shifting that sample).
<hartytp> hmmm...I *think* that should work, but not really my area.
<rjo> i.e. add synthetic delay in the buffer that compensates the hmc7043-to-rtio phase that you measured.
<hartytp> it's not obvious to me that it's simpler than the other options we're considering (and still seems to be quite a non-standard way of approaching the problem)
<rjo> in terms of clocking it is actually more standard. you'd strictly follow jesd204b sc1.
<hartytp> and requires a really good way of measuring the phase of the 2GHz clock at the FPGA, so probably still a DFF + delay line
<rjo> only in your datapath before that there would be a compensation to the phase you measured.
<rjo> yes. absolutely. accurate timing is not going to magically disappear. it's just about choosing *where* to guarantee it.
<rjo> ok. in short: i don't see any big problems with the custom_sync approach. i consider the alternatives to be of comparable complexity and risk.
<rjo> but it was very helpful for me to talk through the alternative concepts.
<rjo> oh. maybe support independent sysref outputs from the fpga to the two dacs (if that's not already the plan). i see no harm.
<rjo> ah. there might be a trivial way to measure the phase of the hmc7043 divider tree: just generate 1/n and 1/(n+1) divided outputs and feed them to the fpga. easy to sample, gives you the typical n fold timing factor.
<rjo> and this could even be tested and implemented on sayma_v1
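One reading of the 1/n versus 1/(n+1) idea above, as an idealised numeric sketch. The function names, the square-wave model, and the `skew` parameter are assumptions for illustration; time is counted in periods of the undivided clock, and AC coupling, jitter, and metastability are not modelled.

```python
def sample_divided_clock(n, divider_phase, skew=0.3):
    """Sample a /n square wave on the rising edges of a /(n+1) clock.

    divider_phase: offset of the /n divider, in undivided clock cycles.
    skew: sampling-instant error at the FPGA, as a fraction of one undivided cycle.
    Returns n samples, i.e. one full beat period.
    """
    samples = []
    for j in range(n):
        t = j * (n + 1) + skew         # j-th rising edge of the /(n+1) clock
        pos = (t - divider_phase) % n  # position inside the /n period
        samples.append(1 if pos < n / 2 else 0)
    return samples


def beat_phase(samples):
    """Index of the 0 -> 1 transition within one (cyclic) beat period."""
    for j in range(len(samples)):
        if samples[j - 1] == 0 and samples[j] == 1:
            return j
    raise ValueError("no rising transition found")


if __name__ == "__main__":
    for phase in range(8):
        print(phase, sample_divided_clock(8, phase), beat_phase(sample_divided_clock(8, phase)))
```

In this model, moving the divider by one undivided cycle moves the sampled pattern by one whole sample, which is n+1 undivided periods in real time, so a sub-cycle sampling skew does not change the recovered divider phase.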
<hartytp> rjo: there is more than one way of doing this that I would consider perfectly acceptable. The approach I've pushed for is based on my personal biases and pushes the complexity into the part of the system that I feel most confident debugging
<hartytp> I completely agree with everything you've said.
<hartytp> If someone else wants to take this over and use a different approach I'm also fine with that :)
<hartytp> so long as we end up with the right result
<rjo> with this measuring the hmc7043 phase would be almost trivial. even right now.
<hartytp> "oh. maybe support independent sysref outputs from the fpga to the two dacs (if that's not already the plan). i see no harm." we considered that
<rjo> hartytp: what part of the hmc7043 delay slip thing didn't work again?
<hartytp> I have no objections to doing it. It's not currently implemented to save a DFF/FPGA output, but that's easy to change
<rjo> IIRC there were a bunch of RTM signals left.
<hartytp> "ah. there might be a trivial way to measure the phase of the hmc7043 divider tree: just generate 1/n and 1/(n+1) divided outputs and feed them to the fpga. easy to sample, gives you the typical n fold timing factor."
<hartytp> interesting idea. DDMTD kind of thing
<rjo> ok. sorry. no time. but i suspect the beat note style measurement of the hmc7043 phase could be very interesting.
<hartytp> yes, but still needs testing and is non-trivial to implement
<hartytp> basically, the multi-cycle slip was doing something weird that we were struggling to debug. Seemed to not be monotonic or something
<hartytp> but, it was hard to probe all the signals needed to debug.
<hartytp> essentially, sb0 tried it for a while, didn't understand what was happening, so we decided to try a different approach that was simpler
<hartytp> tl;dr I suspect that any of these approaches can be made to work with enough time, and I don't much care which one we use. Let's just pick one and actually get it to work. The method I'm using seems as good as any other, but I'm more comfortable debugging it. So, if I'm doing the development then it seems like the right choice. If someone else takes over then they can take a different tac
<hartytp> rjo: unless we decide to take a different approach (should be agreed asap), my plan would be to use the current sayma hw to demonstrate full DAC sync (including HMC830 sync) at 600MHz
<hartytp> I can do that with the HW I have using the DFF + delay line
<hartytp> the only thing stopping me right now is that I'd like to port Sayma init to kernels first since it makes things much easier
rohitksingh has quit [Remote host closed the connection]
hartytp has quit [Quit: Page closed]
m4ssi has quit [Remote host closed the connection]
proteusguy has quit [Ping timeout: 245 seconds]
mumptai has joined #m-labs
proteusguy has joined #m-labs
hartytp has joined #m-labs
<hartytp> rjo: in hindsight, before designing Sayma v1.0 we should have tested out the HMC7043 part of the design using an eval board. e.g. to test the multi-cycle slip when applied to multiple channels, by looking on a fast scope
<hartytp> it might be that if someone did that (e.g. drive it from Kasli) then they could figure out why it doesn't work and make the code work
<hartytp> although, we'd also need to check that using the FPGA as a phase detector still works at higher clock frequencies and across VT/build-build variations
<hartytp> If someone gets all that working then I'd be happy to keep using the 7043. But, it's not a quick job or clearly simpler than any other approach on the table
hartytp has quit [Quit: Page closed]
<rjo> hartytp_: right. since you are implementing it you need to decide. if i were at it i would invest a day into this style of measuring the 7043 phase (i consider it extremely likely to work well) and another couple days of testing the data-centric phase compensation or the hmc7043 slip based phase compensation.
m4ssi has joined #m-labs
<rjo> the risk i see is getting the hmc7043 slip mechanics to work (or the data-centric delays). one advantage is that i can't see anything missing to try this on sayma_v1. and there are also options to combine the two approaches (data delay plus coarse digital delay) to completely bypass the hmc7043 slip stuff.
<rjo> hartytp_: imo the big omission was to not think the synchronization concept through. i.e. the need and complexity to align the jesd clocking tree to rtio (or measure and compensate) was not noticed. failure to communicate between the clocking tree designers, the jesd designers, and the rtio designers.
<rjo> this 1/n vs 1/(n+1) scheme is more like a fractional pll than dmtd (at least to me).
<rjo> and afaict it is pretty much immune to PVT in the fpga. precisely because of the n resolution gain.
<rjo> there are also a lot of questions i have about the past attempts to use the hmc7043 slip stuff and things that look suspicious to me. but they are not relevant if the hmc7043 is gone. anyway. this is just my input. it's your decision.
hartytp has joined #m-labs
<rjo> the only limit to the resolution gain is the ac coupling cutoff.
<hartytp> "this is just my input. it's your decision. "
<hartytp> ack. I suspect that any decisions I make will end up impacting you and SB (arguing that we can treat elements of a complex design like Sayma in isolation is arguably part of what led to the issues we have had)
<hartytp> so am giving you a chance for input. But, I'm happy to make an executive call. Talking it through has been helpful
<hartytp> taking a step back though. Let's look at what would need to be done for a complete demo of the sync scheme I'm thinking of (no HMC7043, DFF + delay line, either use the HMC830 in fundamental mode or, if we can't get the DAC to work at max clock rate, use the DAC as a phase detector and reset the PLL)
<hartytp> 1. Verify the "analog" performance/device imperfections in the HMC830. This needs to be done for any approach we take since, as I found with the ADF PLL, it's easy to get unexpected phase shifts at the >100ps level which could kill sync at 2GHz
<hartytp> 2. I've already verified that I can generate a stable, low jitter SYSREF at the DAC, and use it to measure the SYSREF v DAC clock edges at 2GHz. This timing was stable so I think it's now low risk (and I have no reason to think that the HMC7043 or any other IC would be more or less stable)
<hartytp> So, modulo the odd DAC behaviour I found, that part is done for my scheme
<hartytp> I couldn't get this to synchronise, but all the data I have said this was some issue with the DAC (either misconfiguration or a silicon bug) so will likely affect any sync scheme we adopt
<hartytp> 3. Check that I can synchronise the HMC830 PLL reliably at 600MHz clock rate. Use that to demo synchronisation at 600MHz DAC clock.
<hartytp> this is a very good thing to do anyway, since it removes the risk of issues like the DAC not being able to sync at max clock rate (cf ad9914)
<hartytp> (3) can be done relatively easily with the hardware I currently have. Quicker if we port the Sayma init to kernels (and seems nicer in the long run since we can store sync params in the device db rather than flash). But I can do this in rust if necessary
<hartytp> If that works, then I can't see any reason why the HMC7043 approach is nicer. About the same amount of hardware, cost, power. Similar levels of complexity.
<hartytp> also, currently the HMC7043 is disconnected on my RTM (cut traces) to insert the DFF, so it would take a bit more work for me to test it.
<hartytp> not a strong preference though.
<hartytp> the only other point I'd make is to reiterate that during the testing I've done I've lost a lot of time reverse engineering complex ICs like the HMC7043. They are generally beautiful, but do have bugs (the HMC7043 has a few nasty ones if you look over the ADF forum), and the documentation isn't great
<hartytp> and, they have a lot of features we don't need.
<hartytp> empirically, stripping this back to a few discrete components makes it faster to develop the systems. At least, that's been my experience. YMMV
hartytp has quit [Ping timeout: 256 seconds]
m4ssi has quit [Remote host closed the connection]
mumptai has quit [Remote host closed the connection]
Gurty has quit [Ping timeout: 264 seconds]
Gurty has joined #m-labs
Gurty has joined #m-labs
Gurty has quit [Changing host]