#m-labs on 2019-01-22 — irc logs at freenode.irclog.whitequark.org

2018-12-09 21:21 sb0_ changed the topic of #m-labs to: https://m-labs.hk :: Logs http://irclog.whitequark.org/m-labs

00:04 _rht has quit []

00:05 _rht has joined #m-labs

00:05 _rht has quit [Client Quit]

00:05 X-Scale has quit [Ping timeout: 240 seconds]

00:06 _rht has joined #m-labs

00:40 bluebugs has quit [Quit: ZNC 1.7.1 - https://znc.in]

00:40 cedric has joined #m-labs

00:40 cedric has quit [Changing host]

00:40 cedric has joined #m-labs

01:06 _whitelogger has joined #m-labs

01:33 X-Scale has joined #m-labs

01:53 <sb0> hartytp: it's not really glitches; when 3.3V fails it does not recover until a power cycle, and the board is completely dysfunctional when that happens (no JTAG etc.)

01:54 <sb0> I don't think this is related to the hmc830 problem, but the latter can still be some silly hw problem...

02:00 <sb0> so, I tried running "conda create -n foo package.tar.bz2" to check if it would install dependencies automatically, but of course, it simply crashed instead

02:04 <sb0> the workaround is to create the environment first, then install a non-noarch package in it, then "conda install package.tar.bz2"

02:04 <sb0> typical conda behavior

02:06 <sb0> and of course, then it doesn't even look at the dependencies. no error, no warning, simply the package gets thrown into the environment.

02:12 <sb0> workaround: conda install -c file://<package_directory> <package_name>

02:12 <sb0> .....

02:15 <sb0> "--no-deps Do not install, update, remove, or change dependencies. This WILL lead to broken environments and inconsistent behavior. Use at your own risk."

02:15 <sb0> so it warns you about it, but then does it silently anyway

02:16 _rht has quit [Quit: Connection closed for inactivity]

02:29 <sb0> now "conda create -n artiq-kc705-nist_clock", with a fresh conda install, has been "solving environment" for 10 minutes at 100% CPU usage, and still not done. how do people use this crap?

02:36 <sb0> workaround: "conda create -n xxx artiq-kc705-nist_clock python=3.5", activate, then conda install ...

02:37 <sb0> maybe i should update the docs...

04:19 <sb0> bb-m-labs: force build artiq

04:19 <bb-m-labs> build forced [ETA 58m20s]

04:19 <bb-m-labs> I'll give a shout when the build finishes

04:19 <sb0> power-cycled the kasli and reconnected USB, flashing problem is gone

04:50 <bb-m-labs> build #2305 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2305

05:06 <bb-m-labs> build #2306 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2306

05:09 <bb-m-labs> build #2852 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2852

05:39 proteusguy has quit [Ping timeout: 272 seconds]

05:45 <whitequark> adamgreig: either is fine

05:47 <_whitenotifier-c> [m-labs/nmigen] whitequark pushed 1 commit to master [+0/-0/±1] https://git.io/fhgKX

05:47 <_whitenotifier-c> [m-labs/nmigen] whitequark 2c80f35 - lib.fifo: fix typo in AsyncFIFO documentation.

05:48 <whitequark> sb0: conda people could and should have used aspcud, but instead they wrote their own shitty dependency solver

05:50 <_whitenotifier-c> [nmigen] Success. The Travis CI build passed - https://travis-ci.org/m-labs/nmigen/builds/482733767?utm_source=github_status&utm_medium=notification

05:50 <_whitenotifier-c> [nmigen] Success. 83.25% remains the same compared to e33580c - https://codecov.io/gh/m-labs/nmigen/commit/2c80f35de46d59024c4524b7cee1ba030db9f086

05:50 <_whitenotifier-c> [nmigen] Success. Coverage not affected when comparing e33580c...2c80f35 - https://codecov.io/gh/m-labs/nmigen/commit/2c80f35de46d59024c4524b7cee1ba030db9f086

06:55 <whitequark> sb0: I think I found a case where the migen simulator behavior is clearly troublesome

06:57 <whitequark> let's say you are trying to read from a FIFO. right now there is a FIFOInterface.read() in nmigen that does the following:

06:57 <whitequark> yield self.re.eq(1)

06:57 <whitequark> yield

06:57 <whitequark> value = (yield self.dout)

06:57 <whitequark> yield self.re.eq(0)

06:57 <whitequark> that's fine and well, except let's say you (a) call this in a loop, (b) want to check for readability

06:59 <whitequark> and you cannot do this. you have one of the two options: 1) spending 2 cycles per read instead of 1, 2) only knowing whether the FIFO (was) readable after advancing the timeline

07:00 <whitequark> now, this does not particularly hurt with a FIFO, because re is &-d with readable

07:00 <whitequark> but I think not being able to implement this pattern is clearly a deficiency

07:07 <GitHub-m-labs> [artiq] sbourdeauducq pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/217493523199aea8649c7f91a63b0bbe3173d645

07:07 <GitHub-m-labs> artiq/master 2174935 Sebastien Bourdeauducq: nix: update package descriptions

07:09 <sb0> _florent_: still won't work with copper sfp cable between sayma and kasli

07:09 <sb0> and strangely enough I cannot reproduce the results with the SFP loopback, it now receives garbage data as well

07:24 _whitelogger has joined #m-labs

07:28 <sb0> more exactly: the PMA loopback (i_LOOPBACK=0b010) still works, but the physical hardware loopback is fucked

07:37 <bb-m-labs> build #2307 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2307

07:42 <sb0> same behavior on another sayma board ...

07:47 <sb0> rjo: the "PLL lock timeout" kasli_tester/ad9910 bug is reproducible on kasli-1

07:53 <bb-m-labs> build #2308 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2308

07:55 futarisIRCcloud has joined #m-labs

08:30 m4ssi has joined #m-labs

08:39 proteusguy has joined #m-labs

08:39 <rjo> sb0: the buildbot didn't reproduce it before and i couldn't reproduce it yesterday morning. i saw it the first time seconds before the flashing failed.

08:40 <sb0> rjo: ok, it's easily reproduced right now

08:42 <bb-m-labs> build #2853 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2853 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>

09:04 <rjo> sb0: no. works fine when i try it.

09:05 <rjo> https://hastebin.com/motitihivu.rb

09:08 cjbe has joined #m-labs

09:09 <cjbe> sb0 / rjo: I also saw these PLL lock timeouts. Reverting the 3 "configurable refclk divider" commits solved it.

09:10 proteusguy has quit [Ping timeout: 245 seconds]

09:10 <cjbe> There seemed to be some nasty persistent state - we ran on these commits for a while, and the lock timeout only appeared after powercycling the Kasli

09:11 <cjbe> I suspect this is due to the refclk input divider reset bit (cfr3) doing something funny / undocumented

09:14 <rjo> this was my sequence master: bad, a467b8f8 (before that change): good, 385916a9: good, 40187d19: good, master: good.

09:15 <rjo> i also noticed twice that some sequence of register writes around that input divider led to no output at all while it should. but that was fixed by doing a different sequence.

09:16 <rjo> sb0: could you cycle that crate?

09:19 <cjbe> So I tried power cycling the Kasli, then doing the CPLD / DDS channel init. I saw PLL lock timeouts on yesterday's master, then reverted 2bea5e3..40187d1 and saw no lock timeouts. I repeated this a couple of times swapping between working and non-working

09:38 <GitHub-m-labs> [artiq] jordens pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/91e375ce6acd6b626b1040abefd1af7c1cdb9d63

09:38 <GitHub-m-labs> artiq/master 91e375c Robert Jördens: ad9910: don't reset the input divide-by-two...

09:40 <rjo> sb0: i can't reproduce it. but i reverted a change hypothetically related. if you see it, could you bisect it, probably with power cycling?

10:05 <sb0> rjo: cycled

10:07 <GitHub-m-labs> [artiq] sbourdeauducq pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/9ee5fea88d4d08f41f1cfe19bee88e58b2ac602d

10:07 <GitHub-m-labs> artiq/master 9ee5fea Sebastien Bourdeauducq: kasli: support optional SATA port for DRTIO

10:07 <bb-m-labs> build #2309 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2309

10:15 futarisIRCcloud has quit [Quit: Connection closed for inactivity]

10:18 proteusguy has joined #m-labs

10:23 <bb-m-labs> build #2310 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2310

10:36 <bb-m-labs> build #1002 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/1002

10:36 <bb-m-labs> build #2854 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2854

11:07 <bb-m-labs> build #2311 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2311

11:16 proteusguy has quit [Ping timeout: 250 seconds]

11:23 <bb-m-labs> build #2312 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2312

11:35 <bb-m-labs> build #1003 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/1003

11:35 <bb-m-labs> build #2855 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2855

11:47 <GitHub-m-labs> [artiq] sbourdeauducq pushed 1 new commit to release-4: https://github.com/m-labs/artiq/commit/139e87de3669bf95f535e0777982734a64f43289

11:47 <GitHub-m-labs> artiq/release-4 139e87d Sebastien Bourdeauducq: firmware: fix compilation error with more than 1 Grabber

11:48 <GitHub-m-labs> [artiq] sbourdeauducq pushed 2 new commits to master: https://github.com/m-labs/artiq/compare/9ee5fea88d4d...a0eba5b09ba2

11:48 <GitHub-m-labs> artiq/master 2e3555d Sebastien Bourdeauducq: firmware: fix compilation error with more than 1 Grabber

11:48 <GitHub-m-labs> artiq/master a0eba5b Sebastien Bourdeauducq: satman: support Grabber

11:50 <GitHub-m-labs> [artiq] jordens pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/b692981c8e3bbeb48b70846fa8f2a0fb97285bd7

11:50 <GitHub-m-labs> artiq/master b692981 Robert Jördens: ad9910: add note about red front panel led...

12:18 <bb-m-labs> build #2313 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2313

12:32 <bb-m-labs> build #2314 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2314

12:38 rohitksingh has joined #m-labs

12:43 X-Scale has quit [Read error: Connection reset by peer]

12:44 <bb-m-labs> build #1004 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/1004

12:44 <bb-m-labs> build #2856 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2856

12:45 X-Scale has joined #m-labs

12:55 proteusguy has joined #m-labs

12:57 rohitksingh has quit [Ping timeout: 246 seconds]

12:58 rohitksingh has joined #m-labs

13:02 hartytp_ has joined #m-labs

13:14 <bb-m-labs> build #2315 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2315

13:18 <hartytp_> sb0, rjo: initial notes on Sayma clocking https://hastebin.com/uxahecatey.coffeescript

13:18 <hartytp_> if there is anything else you need then let me know

13:30 <bb-m-labs> build #2316 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/2316

13:42 <bb-m-labs> build #1005 of artiq-win64-test is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/1005

13:42 <bb-m-labs> build #2857 of artiq is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2857

13:50 <rjo> hartytp_: ok.

13:51 <rjo> hartytp_: 1. i'd make sure that the loop filter can be made passive if there are issues (design, supply, noise, etc) with the active solution.

13:54 <rjo> hartytp_: are you sure that "using the dac to measure sysref" works? does it tell you which deviceclk cycle and/or setup-hold violations?

13:56 <hartytp> 1. yes, absolutely

13:56 <rjo> hartytp_: where is your sketch with the kc705 digital-wr design?

13:57 <rjo> hartytp: aren't there two retiming stages?

13:57 <hartytp> may even go for passive by default to allow me to prototype it on Sayma v2.0 (depends on how much time I have to test before v1.0 goes to production)

13:57 <hartytp> 2. are you sure that "using the dac to measure sysref" works? does it tell you which deviceclk cycle and/or setup-hold violations?

13:57 <hartytp> I think so, but need to test

13:58 <hartytp> one-shot-then monitor allows us to synchronise to an edge of the DAC clock, and then see how much we need to move sysref by to hit the next edge

13:58 <hartytp> in the case of non-deterministic delays between sysref and dac clock that are a fraction of the dac clock, we'd see the sysref edge moving

13:59 <hartytp> anyway, yes, needs testing, but I can do that fairly quickly once we've moved sayma to kernels

13:59 <hartytp> two retiming stages?

13:59 <hartytp> for sysref?

13:59 <hartytp> not in my current plan

13:59 <sb0> hartytp_: it's not using oneshot and monitor right now

13:59 <hartytp> no, it's not

14:00 <hartytp> but if you look back over my notes from testing sync, I couldn't get the current code to work, so I changed it

14:00 <sb0> why didn't it work?

14:00 <hartytp> I'd have to look back over the github issue

14:00 <hartytp> rjo: for wr do you mean this: https://github.com/sinara-hw/meta/issues/15#issuecomment-454840325

14:01 <hartytp> for the actual digital design, message weida (or I can do it on your behalf if easier)

14:01 <hartytp> nb using the shared I2C bus makes his code more complex than it needs to be

14:01 <sb0> "relatively "easy" problem since we have an ~4ns period" did you mean 8ns?

14:02 <hartytp> oops, yes

14:02 <hartytp> :)

14:02 <hartytp> thanks

14:02 <hartytp> sb0: re the DAC sysref detection, maybe I should remove the one-shot-then-monitor from that doc

14:03 <hartytp> point is just that if we can make sysref detection work reliably then we can use it to reset the pll. if not then the pll phase isn't an issue anyway

14:03 <hartytp> how we implement the dac sync needs more testing

14:04 <rjo> hartytp: and was there a sketch of the sustom_sync design as well?

14:05 <rjo> *custom

14:05 <hartytp> rjo: yes, but that's a little out of date now

14:06 <hartytp> if it would help, I can draw up a new BD based on the doc I sent

14:06 <hartytp> but, there wouldn't be anything in it that's not in the text

14:09 <rjo> where is the out of date version? now that you claim there is only one retiming stage i am confused.

14:11 <hartytp> github

14:11 <hartytp> heading off to meeting now, but can dig it out later

14:14 <rjo> ok. found it. i just misread which fanout this is clocked from.

14:14 <key2_> whitequark: what is the way you chose to do the .connect in nmigen ?

14:15 <key2_> for EPs

14:21 <whitequark> .connect? EPs?

14:21 <whitequark> I don't understand the question

14:26 <key2_> when in migen you do self.comb += source.connect(sink)

14:26 <key2_> how do you do that with nmigen ?

14:34 rohitksingh has quit [Ping timeout: 246 seconds]

15:05 rohitksingh has joined #m-labs

15:26 <sb0> hartytp: using it to detect the PLL phase seems a bit fragile. lots of moving parts with uncertainties...

15:43 <hartytp> rjo: okay, good. If you still want an updated BD let me know

15:43 <hartytp> sb0: yes. AFAICT, it should work, but it does make a complex system more complex

15:44 <hartytp> the other option is to add an extra DFF and delay line.

15:44 <hartytp> sample the DAC clock from the ref clk

15:45 <hartytp> or add a programmable divider with reset. also needs DFF + delay line

15:46 <hartytp> but then, we're kind of back to using an HMC7043 but supplying it with a SYSREF signal to do the sync that way

15:46 <hartytp> also not simple and needs testing

15:46 <hartytp> using the DAC as a phase detector has the advantage of being easy to prototype with current hw

15:51 <hartytp> well, I can also easily prototype using a DFF + delay line to synchronise the PLL.

15:52 <hartytp> clock DFF from ref clk, input is DAC clock after passing through delay line, output to FPGA. sweep delay and look at the signal at the fpga

15:53 <hartytp> not sure if one can scrap the DFF + delay line and do this directly in the FPGA, a la SiPhaser

15:53 <rjo> hartytp: ultimately, what was the reason to scrap the hmc7043?

15:54 <hartytp> rjo: the way we were using it seemed to hit a lot of issues, and didn't seem to be in line with the intended use case, for which it was correctly tested

15:55 <hartytp> i.e. using the FPGA to sample the output phase on one channel and then shifting all channels to do the alignment

15:55 <hartytp> AFAICT we would have had better luck if we'd supplied it with a proper SYSREF signal

15:55 <hartytp> the way they do in all the app notes

15:55 <rjo> RFSYNCIN?

15:55 <hartytp> that could be generated using the DFF + delay line (I assume the FPGA has too much timing variation and jitter to do this directly)

15:56 <hartytp> yes, RFSYNCIN, which needs to meet S/H from the DAC clock

15:57 <rjo> trying to understand this further, JESD204B synchronization is achieved independent of where the SYSREF edge is. it just needs to be deterministic at both source and sink. what was the reason to actively rephase clocks?

15:57 <hartytp> if that was the plan then I'm okay with it. however, the thinking was that if we need to use a DFF + delay line to produce the RFSYNCIN then why not just supply that directly to the DAC and cut out the HMC7043

15:58 <hartytp> needs to meet S/H to avoid ambiguity

15:58 <hartytp> and needs to reproducibly hit the same edge over PVT so want to be near the middle of the window

15:58 <hartytp> and low enough jitter

15:59 <hartytp> if we were confident that we could produce an output directly from the FPGA that would pick the same DAC clock cycle each time over PVT then we wouldn't need to retime. When we discussed it, the conclusion was that this seemed like a stretch so better to add some cheap retiming hw

16:00 <hartytp> DFF + delay line isn't much overhead in terms of money/firmware/board space/power consumption so seemed easier to add it than worry about timing

16:01 <rjo> i am not following what you are referring to now. let's split this in two questions. (a) why can't we just use the standard passive SYSREF jesd204b design/why do we need to actively phase SYSREF? (b) how do we implement active SYSREF phasing?

16:01 <rjo> let's just talk about (a)

16:03 <hartytp> so, my understanding of this is that we want to pick out an edge of the ref clk that corresponds to our t=0

16:03 <hartytp> so the DAC output has deterministic latency w.r.t. that clock edge

16:04 <hartytp> so, for example, assume the DAC clock has a deterministic phase w.r.t. the ref clock

16:04 <rjo> hmm. that's not how i have come to understand jesd204b over the years.

16:04 <hartytp> ok

16:04 <hartytp> go on..

16:05 <hartytp> essentially, my understanding is that SYSREF picks out the edge of the DAC clock that the first sample is released on

16:05 <rjo> it's more like: the clock tree generates sysref. sysref marks clock deviceclk cycles on both data sink and data source. the source generates special data at a sample in fixed sample-cycle relation to the special deviceclk cycle. the source aligns its buffer accordingly.

16:06 <rjo> what you just said is correct. but the source side also places its marker on the data.

16:07 <rjo> that in turn makes the scheme independent of the "absolute" sysref phase.

16:07 <hartytp> so, if SYSREF moves by 1 period of the DAC clock then the timing of the first output sample will move by 1 DAC clock period w.r.t. the ref clk

16:07 <rjo> the only problem for us may be that the fanout now needs to generate the deviceclk and that should be in phase to the RTIO clock.

16:08 <hartytp> aah, yes, I see what you mean

16:08 <rjo> but this may actually have a solution.

16:08 <hartytp> yes, I think that if the FPGA were clocked from the 7043 then things would be much easier

16:09 <rjo> so in fact one could argue that all we want to do is to drive rfsyncin with the rtio clock (modulo frequency).

16:10 <rjo> that is if we stick with the hmc7043

16:10 <hartytp> rjo: well, RFSYNCIN still needs to have a tunable delay to ensure it can meet S/H at 2GHz

16:10 <rjo> again. i am not lobbying. just trying to understand.

16:10 <hartytp> rjo: sure, it helps to talk it through

16:12 <hartytp> and it also needs to have jitter <<200ps (min DAC clock period less 200ps S/H time, which is my memory of the HMC7043 requirements)

16:12 <hartytp> and the drift needs to be at a similar level
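
A small worked calculation of the timing budget mentioned above, as a sketch only. It assumes the 2GHz clock rate and the 200ps S/H figure quoted in the discussion (the latter is given from memory there, not checked against a datasheet); the function name is hypothetical.

```python
def sync_window_ps(clock_hz, setup_hold_ps):
    """Time left in one clock period once the setup/hold window is subtracted."""
    period_ps = 1e12 / clock_hz          # one clock period in picoseconds
    return period_ps - setup_hold_ps     # margin available for jitter plus drift


# 2GHz clock, 200ps S/H window (both values taken from the chat above)
print(sync_window_ps(2e9, 200))  # → 300.0
```

Jitter and drift together have to stay well inside this margin for the sync edge to land on the same clock cycle every time.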

16:12 <rjo> yes RFSYNC needs a tunable delay if that can't be achieved passively.

16:12 <hartytp> I don't see how you'd achieve it passively, since I'm not sure one can model things accurately enough

16:12 <rjo> well. you have the rtio clock nearby just upstream of the hmc830.

16:13 <hartytp> anyway, it seems easier and safer to add a tunable delay than to rely on modelling to match clocks at that level (after all, this is precisely why the HMC7043 has analog phase shifters)

16:13 <rjo> ack. the diff phase through the hmc830 and straight to the rfsyncin needs to be matched.

16:14 <hartytp> so...

16:14 <hartytp> you take the RTIO clock and phase shift it and feed it to the HMC7043

16:14 <hartytp> then re phase shift it and feed that to the DACs

16:14 <hartytp> that's not very different to just taking the RTIO clock, phase shifting it and feeding that to the DACs, which is essentially what my scheme is

16:15 <rjo> why the "re phase shift it and feed that to the DACs "? that's what the hmc7043 does.

16:15 <hartytp> yes, sorry, not clear. that part was what the HMC7043 is doing

16:16 <hartytp> my point is that you still need the external delay line and fanouts that we have now. It's a very similar concept.

16:16 <hartytp> you're just using the HMC7043 to provide an extra delay line

16:16 <hartytp> so you either have one delay line + 1 HMC7043

16:16 <hartytp> or two delay lines

16:16 <rjo> the only addition over sayma_v1 is the delay line between the hmc830 input and rfsyncin. (plus maybe the fanout upstream).

16:17 <hartytp> yes, I think that's correct

16:17 <rjo> i am using the hmc7043 to provide fanout and delay lines.

16:17 <hartytp> yes. You're using a single-channel delay line + HMC7043. I'm using a two channel delay line.

16:18 <hartytp> (well, "I" and "you" isn't quite right, just these are two ideas)

16:18 <hartytp> this was my original plan

16:19 <hartytp> I moved away from it because it felt to me like it would be simpler to just use the extra delay line.

16:19 <rjo> ok. another avenue would be to use some better phase detection scheme to measure the hmc7043 fpga deviceclk phase (w.r.t. rtio). and then just properly align samples based on that measurement. no active shifting needed.

16:19 <hartytp> I don't follow

16:19 <hartytp> you mean delay the samples in software by adding time offsets?

16:21 <hartytp> fwiw, if we had more data converters on the board, the HMC7043 would be a no brainer. For something with only 2 DACs, it's less clear that it provides a nicer path than a couple of discrete components (certainly more registers to program and more potential for undocumented features that take a long time to reverse engineer and understand)

16:22 <hartytp> rjo: hmmmm...

16:22 <hartytp> I'm not sure that feeding the HMC7043 with the phase shifted rtio clock quite works

16:22 <hartytp> IIRC the JESD framing means that it all works on a sub-multiple of the RTIO clock.

16:22 <rjo> modulo frequency. yes.

16:23 <hartytp> so, one has to produce a divided down RTIO clock, and ensure that the divided clock is correctly synchronised

16:23 <rjo> ok. it may require the same retiming and delay from fpga to rfsyncin.

16:23 <hartytp> but that division is easy to do in the FPGA

16:23 <hartytp> right, so in either case, we have FPGA -> DFF -> delay line

16:24 <hartytp> 1 option is to have DFF -> 2ch delay line -> DACs

16:24 <hartytp> 2nd option is to have DFF->delay line -> HMC7043 -> DACs

16:25 <hartytp> so, ignoring the PLL output divider I think it's hard to say that there is much difference between the two. Personally, I feel that buying a delay line with 2 channels is a simpler solution than adding all the complexity of the HMC7043 (look how many pages the register map runs to)!

16:25 <rjo> what i mean with the other avenue is that in order to compensate for the fpga deviceclk phase (out of the hmc7043, without being able to slip the entire tree), you can just do that when you place the marker sample in reaction to sysref alignment (by shifting that sample).

16:26 <hartytp> hmmm...I *think* that should work, but not really my area.

16:26 <rjo> i.e. add synthetic delay in the buffer that compensates the hmc7043-to-rtio phase that you measured.
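
A minimal sketch of the "synthetic delay in the buffer" idea described above, under stated assumptions: the measured phase is expressed as a non-negative time offset, the correction is applied as a whole number of sample periods, and zero-padding stands in for whatever the real datapath would do. The function name and the example numbers are hypothetical, and the sign convention depends on how the phase is defined.

```python
def compensate(samples, measured_offset_ps, sample_period_ps):
    """Delay `samples` by the whole number of sample periods nearest the offset.

    Returns the delayed sample list and the residual offset (in ps) that
    cannot be removed by a whole-sample shift.
    """
    if measured_offset_ps < 0:
        raise ValueError("this sketch only handles non-negative offsets")
    whole = round(measured_offset_ps / sample_period_ps)   # whole samples to delay
    residual_ps = measured_offset_ps - whole * sample_period_ps
    delayed = [0] * whole + list(samples)                  # zero-pad at the front
    return delayed, residual_ps


delayed, residual = compensate([1, 2, 3], measured_offset_ps=1600, sample_period_ps=500)
print(delayed, residual)
```

The residual is the part that would still need an analog delay or a sufficiently accurate phase measurement, which is the point made in the replies that follow.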

16:27 <hartytp> it's not obvious to me that it's simpler than the other options we're considering (and still seems to be quite a non-standard way of approaching the problem)

16:27 <rjo> in terms of clocking it is actually more standard. you'd strictly follow jesd204b sc1.

16:27 <hartytp> and requires a really good way of measuring the phase of the 2GHz clock at the FPGA, so probably still a DFF + delay line

16:28 <rjo> only in your datapath before that there would be a compensation to the phase you measured.

16:28 <rjo> yes. absolutely. accurate timing is not going to magically disappear. it's just about choosing *where* to guarantee it.

16:33 <rjo> ok. in short: i don't see any big problems with the custom_sync approach. i consider the alternatives to be of comparable complexity and risk.

16:33 <rjo> but it was very helpful for me to talk through the alternative concepts.

16:34 <rjo> oh. maybe support independent sysref outputs from the fpga to the two dacs (if that's not already the plan). i see no harm.

16:37 <rjo> ah. there might be a trivial way to measure the phase of the hmc7043 divider tree: just generate 1/n and 1/(n+1) divided outputs and feed them to the fpga. easy to sample, gives you the typical n fold timing factor.

16:37 <rjo> and this could even be tested and implemented on sayma_v1

16:37 <hartytp> rjo: there is more than one way of doing this that I would consider perfectly acceptable. The approach I've pushed for is based on my personal biases and pushes the complexity into the part of the system that I feel most confident debugging

16:38 <hartytp> I completely agree with everything you've said.

16:38 <hartytp> If someone else wants to take this over and use a different approach I'm also fine with that :)

16:38 <hartytp> so long as we end up with the right result

16:38 <rjo> with this measuring the hmc7043 phase would be almost trivial. even right now.

16:39 <hartytp> "oh. maybe support independent sysref outputs from the fpga to the two dacs (if that's not already the plan). i see no harm." we considered that

16:39 <rjo> hartytp: what part of the hmc7043 delay slip thing didn't work again?

16:39 <hartytp> I have no objections to doing it. It's not currently implemented to save a DFF/FPGA output, but that's easy to change

16:39 <rjo> IIRC there were a bunch of RTM signals left.

16:39 <hartytp> "ah. there might be a trivial way to measure the phase of the hmc7043 divider tree: just generate 1/n and 1/(n+1) divided outputs and feed them to the fpga. easy to sample, gives you the typical n fold timing factor."

16:40 <hartytp> interesting idea. DDMTD kind of thing

16:40 <rjo> ok. sorry. no time. but i suspect the beat note style measurement of the hmc7043 phase could be very interesting.

16:40 <hartytp> yes, but still needs testing and is non-trivial to implement

16:41 <hartytp> basically, the multi-cycle slip was doing something weird that we were struggling to debug. Seemed to not be monotonic or something

16:41 <hartytp> but, it was hard to probe all the signals needed to debug.

16:42 <hartytp> essentially, sb0 tried it for a while, didn't understand what was happening, so we decided to try a different approach that was simpler

16:43 <hartytp> tl;dr I suspect that any of these approaches can be made to work with enough time, and I don't much care which one we use. Let's just pick one and actually get it to work. The method I'm using seems as good as any other, but I'm more comfortable debugging it. So, if I'm doing the development then it seems like the right choice. If someone else takes over then they can take a different tac

16:47 <hartytp> rjo: unless we decide to take a different approach (should be agreed asap). My plan would be to use the current sayma hw to demonstrate full DAC sync (including HMC830 sync) at 600MHz

16:48 <hartytp> I can do that with the HW I have using the DFF + delay line

16:48 <hartytp> the only thing stopping me right now is that I'd like to port Sayma init to kernels first since it makes things much easier

16:57 rohitksingh has quit [Remote host closed the connection]

17:29 hartytp has quit [Quit: Page closed]

17:43 m4ssi has quit [Remote host closed the connection]

17:46 proteusguy has quit [Ping timeout: 245 seconds]

17:49 mumptai has joined #m-labs

17:49 proteusguy has joined #m-labs

17:56 hartytp has joined #m-labs

17:57 <hartytp> rjo: in hindsight, before designing Sayma v1.0 we should have tested out the HMC7043 part of the design using an eval board. e.g. to test the multi-cycle slip when applied to multiple channels, by looking on a fast scope

17:58 <hartytp> it might be that if someone did that (e.g. drive it from Kasli) then they could figure out why it doesn't work and make the code work

17:59 <hartytp> although, we'd also need to check that using the FPGA as a phase detector still works at higher clock frequencies and across VT/build-build variations

17:59 <hartytp> If someone gets all that working then I'd be happy to keep using the 7043. But, it's not a quick job or clearly simpler than any other approach on the table

18:21 hartytp has quit [Quit: Page closed]

19:54 <rjo> hartytp_: right. since you are implementing it you need to decide. if i were at it i would invest a day into this style of measuring the 7043 phase (i consider it extremely likely to work well) and another couple days of testing the data-centric phase compensation or the hmc7043 slip based phase compensation.

19:55 m4ssi has joined #m-labs

20:08 <rjo> the risk i see is getting the hmc7043 slip mechanics to work (or the data-centric delays). one advantage is that i can't see anything missing to try this on sayma_v1. and there are also options to combine the two approaches (data delay plus coarse digital delay) to completely bypass the hmc7043 slip stuff.

20:21 <rjo> hartytp_: imo the big omission was to not think the synchronization concept through. i.e. the need and complexity to align the jesd clocking tree to rtio (or measure and compensate) was not noticed. failure to communicate between the clocking tree designers, the jesd designers, and the rtio designers.

20:23 <rjo> this 1/n vs 1/(n+1) scheme is more like a fractional pll than dmtd (at least to me).

20:24 <rjo> and afaict it is pretty much immune to PVT in the fpga. precisely because of the n resolution gain.
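
A rough illustration of the resolution gain mentioned above, assuming the 1/n vs 1/(n+1) scheme amounts to sampling a periodic signal with a clock whose period is (n+1)/n signal periods, so that successive samples walk through the signal in steps of 1/n of a period. This is a toy model with an ideal square wave and no jitter, not the Sayma/ARTIQ gateware; all function names are hypothetical.

```python
from fractions import Fraction


def sampled_phases(n):
    """Phase (in signal periods, 0 <= phase < 1) seen at each of n samples.

    Assumption: the sampling period is (n + 1) / n signal periods, so each
    sample advances the observed phase by exactly 1 / n of a period.
    """
    step = Fraction(n + 1, n)
    return [(k * step) % 1 for k in range(n)]


def is_high(phase, edge):
    """Ideal 50% duty square wave whose rising edge sits at phase `edge`."""
    return (phase - edge) % 1 < Fraction(1, 2)


def estimate_edge(n, edge):
    """Return the first sampled phase at which the signal has gone high.

    The estimate lags the true edge by less than 1 / n of a period, which is
    the resolution gain of n over a single raw sample.
    """
    phases = sampled_phases(n)
    levels = [is_high(p, edge) for p in phases]
    for k in range(n):
        # levels[k - 1] wraps around for k == 0, matching the cyclic phase.
        if levels[k] and not levels[k - 1]:
            return phases[k]
    return None


if __name__ == "__main__":
    true_edge = Fraction(3, 10)
    for n in (4, 16, 64):
        est = estimate_edge(n, true_edge)
        print(n, est, float((est - true_edge) % 1))
```

In this model the timing of the sampling flip-flop only needs to be stable, not fast: the resolution comes from the 1/n phase step rather than from the FPGA's own delays, which is one way to read the PVT remark above.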

20:27 <rjo> there are also a lot of questions i have about the past attempts to use the hmc7043 slip stuff and things that look suspicious to me. but they are not relevant if the hmc7043 is gone. anyway. this is just my input. it's your decision.

20:30 hartytp has joined #m-labs

20:31 <rjo> the only limit to the resolution gain is the ac coupling cutoff.

20:32 <hartytp> "this is just my input. it's your decision. "

20:33 <hartytp> ack. I suspect that any decisions I make will end up impacting you and SB (arguing that we can treat elements of a complex design like Sayma in isolation is arguably part of what led to the issues we have had)

20:34 <hartytp> so am giving you a chance for input. But, I'm happy to make an executive call. Talking it through has been helpful

20:35 <hartytp> taking a step back though. Let's look at what would need to be done for a complete demo of the sync scheme I'm thinking of (no HMC7043, DFF + delay line, either use the HMC830 in fundamental mode or, if we can't get the DAC to work at max clock rate, use the DAC as a phase detector and reset the PLL)

20:37 <hartytp> 1. Verify the "analog" performance/device imperfections in the HMC830. This needs to be done for any approach we take since, as I found with the ADF PLL, it's easy to get unexpected phase shifts at the >100ps level which could kill sync at 2GHz

20:38 <hartytp> 2. I've already verified that I can generate a stable, low jitter SYSREF at the DAC, and use it to measure the SYSREF v DAC clock edges at 2GHz. This timing was stable so I think it's now low risk (and I have no reason to think that the HMC7043 or any other IC would be more or less stable)

20:38 <hartytp> So, modulo the odd DAC behaviour I found, that part is done for my scheme

20:39 <hartytp> I couldn't get this to synchronise, but all the data I have said this was some issue with the DAC (either misconfiguration or a silicon bug) so will likely affect any sync scheme we adopt

20:40 <hartytp> 3. Check that I can synchronise the HMC830 PLL reliably at 600MHz clock rate. Use that to demo synchronisation at 600MHz DAC clock.

20:40 <hartytp> this is a very good thing to do anyway, since it removes the risk of issues like the DAC not being able to sync at max clock rate (cf ad9914)

20:41 <hartytp> (3) can be done relatively easily with the hardware I currently have. Quicker if we port the Sayma init to kernels (and seems nicer in the long run since we can store sync params in the device db rather than flash). But I can do this in rust if necessary

20:42 <hartytp> If that works, then I can't see any reason why the HMC7043 approach is nicer. About the same amount of hardware, cost, power. Similar levels of complexity.

20:43 <hartytp> also, currently the HMC7043 is disconnected on my RTM (cut traces) to insert the DFF, so it would take a bit more work for me to test it.

20:43 <hartytp> not a strong preference though.

20:44 <hartytp> the only other point I'd make is to reiterate that during the testing I've done I've lost a lot of time reverse engineering complex ICs like the HMC7043. They are generally beautiful, but do have bugs (the HMC7043 has a few nasty ones if you look over the ADF forum), and the documentation isn't great

20:45 <hartytp> and, they have a lot of features we don't need.

20:45 <hartytp> empirically, stripping this back to a few discrete components makes it faster to develop the systems. At least, that's been my experience. YMMV

21:17 hartytp has quit [Ping timeout: 256 seconds]

21:44 m4ssi has quit [Remote host closed the connection]

22:05 mumptai has quit [Remote host closed the connection]

23:48 Gurty has quit [Ping timeout: 264 seconds]

23:50 Gurty has joined #m-labs

23:50 Gurty has quit [Changing host]