<GitHub65>
[pdq] jonaskeller opened pull request #24: Fix wavesynth program transfer via SPI (m-labs/pdq#20) (master...master) https://github.com/m-labs/pdq/pull/24
<GitHub-m-labs>
[artiq] sbourdeauducq commented on issue #1065: With the DRTIO master that has the SAWG, this insanity manifests itself by breaking the RTM FPGA loading, with the error "Did not exit INIT after releasing PROGRAM". If this symptom is reproducible, that would be something that is less of a PITA to zero in on than the very complicated crash-kernel. https://github.com/m-labs/artiq/issues/1065#issuecomment-39962
<sb0_>
of course I cannot reproduce the RTM FPGA loading failure ...
<GitHub-m-labs>
[artiq] sbourdeauducq commented on issue #1065: I did get one memory corruption event reported while running the crash-kernel, by running the CRC in a loop (no 1s delay) and with 1/10 the size, so that it runs faster and it increases the chance of catching an error before the board freezes.... https://github.com/m-labs/artiq/issues/1065#issuecomment-399637216
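(A minimal sketch of the "run the CRC in a loop" idea from the comment above, written as host-side Python rather than an ARTIQ kernel. The name `first_mismatch`, the buffer size, and the iteration count are assumptions for illustration only; the actual crash-kernel code is not shown in this log.)

```python
import zlib


def first_mismatch(buf, reference, iterations):
    """Re-compute the CRC of buf repeatedly, with no delay between passes.

    Returns the index of the first pass whose CRC differs from the
    reference value, or None if every pass matches.
    """
    for i in range(iterations):
        if zlib.crc32(buf) != reference:
            return i
    return None


# Hypothetical test buffer; a smaller buffer makes each pass faster,
# so more passes complete before anything else goes wrong.
buf = bytearray(b"\xa5" * 4096)
reference = zlib.crc32(buf)

# With intact memory, no pass should report a mismatch.
clean_result = first_mismatch(buf, reference, 100)

# Simulate a single-bit corruption to show that the loop catches it.
buf[10] ^= 0x01
corrupt_result = first_mismatch(buf, reference, 100)
```

Shrinking the buffer and dropping the delay trades coverage per pass for more passes per second, which is the trade-off the comment describes.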
<sb0_>
hartytp_, found any good way of crashing the board without the crash-kernel?
sb0 has joined #m-labs
sb0 has quit [Client Quit]
<GitHub-m-labs>
[artiq] hartytp commented on issue #1080: Okay. I'll try that and your blinker next to see if we can find some issue with a simpler logic block that we can focus on instead of debugging complex jesd/memory issues. https://github.com/m-labs/artiq/issues/1080#issuecomment-399643322
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: Okay. Next steps imho are to move the sawg onto a separate CD with reset controlled by a kernel csr. Will then see if enabling/disabling it during the kernel fixes the issue.... https://github.com/m-labs/artiq/issues/1065#issuecomment-399644005
<sb0_>
so far, it would seem we have two ways to reproducibly trigger sayma insanity: the crash-kernel, and #998
<sb0_>
hartytp_, how do you want to clock SAWG/JESD without the RTM? the Si5324 output is not routable to all the JESD transceivers (thank Xilinx for that...)
<GitHub-m-labs>
[artiq] sbourdeauducq commented on issue #998: Reverting commit 83428961ad3dd74c5aa58859da035630c2fc06cf (done on 84b3d9ecc604e3fd30c3b2095b6f336eef3d11c2) makes the bug disappear on the master (and stops the crash-kernel from crashing). So, it really looks like this and #1065 are linked. https://github.com/m-labs/artiq/issues/998#issuecomment-399646597
<sb0_>
hartytp_, maybe if you use the drtio master, you can keep rtio/sawg clocked from the si5324 and jesd clocked from the hmc7043
<sb0_>
there will be elastic buffer overruns and DAC data corruption etc. but it shouldn't matter
<sb0_>
the #998 repro should be independent from that
<sb0_>
let me try that actually
<sb0_>
basically revert 8b3c12e6ebcf3e72f2a54f9cfea76475cdf73c6d (though it needs to be done manually)
hartytp__ has joined #m-labs
<hartytp__>
sb0: I don't know why you have any confidence that the Vadatech board will be easier to get up and running with ARTIQ/SAWG
<hartytp__>
that seems unlikely to me
<sb0_>
because all artiq boards work correctly except sayma?
<hartytp__>
well, Sayma works just fine without SAWG
<hartytp__>
we haven't put 8 channels of SAWG on anything else
<sb0_>
pipistrello, papilio pro, kasli and its variants, kc705 without sawg, kc705 with sawg
<sb0_>
none of those has been the royal PITA that sayma is
<hartytp__>
I'm still of the opinion that the level of moaning I've heard about it is far out of proportion to the level of issues
<hartytp__>
e.g. turns out that all serwb issues were due to you not bothering to use the correct IO standard
<hartytp__>
oops
<hartytp__>
how much of my life did the lack of even a basic code review from you cost me?
<hartytp__>
so, let's focus on getting this to work and cut the crap?
<hartytp__>
I had a go at moving SAWG into a separate CD that's controlled by kernels
<hartytp__>
will be interesting to see if the crash kernel still crashes with the SAWG in reset state
<hartytp__>
(let me know if you see anything obviously wrong with that code)
<hartytp__>
otherwise, I'll look at the blinker
<hartytp__>
tbh, I'm not sure how to test without the RTM
<sb0_>
not all serwb issues were due to that problem, and _florent_ also tried to debug it for a long time without finding out about the io standard issue (which was his mistake in the first place)
<hartytp__>
sb0: your subcontractor = your issue
<sb0_>
and of course, when the root cause is identified, everything seems "simple"
<hartytp__>
at least, that's the way i see it
<hartytp__>
sure, but my point is that I'm still not convinced that these issues are all to do with Sayma
<hartytp__>
rather than, say, issues with project management/reviews
<sb0_>
yes, but I'm saying that you cannot say it's "basic" or "simple"
<sb0_>
a "simple" and "basic" capacitor problem could also cause the crashes we're seeing right now ...
<hartytp__>
well, not using the right IO standards is about as "simple" as HDL issues get
<hartytp__>
anyway, this isn't how I plan to spend my weekend
<hartytp__>
arguing about this
<hartytp__>
point is just that I don't find your comments about Sayma helpful in moving us towards a good outcome
<hartytp__>
for anyone
<hartytp__>
...
<hartytp__>
moving on
<hartytp__>
if you see something wrong with that commit then let me know, otherwise I'll try playing around with a Kernel
<hartytp__>
re: removing the rtm to minimize the example
<hartytp__>
I can add a 150MHz JESD clock by soldering coax onto the AMC to RTM connector
<hartytp__>
but, how much does that help us? The JESD obviously can't start up without an RTM
<hartytp__>
still, I guess that's what I'll do
<hartytp__>
make the runtime not crash if the JESD link doesn't start up
<hartytp__>
s/crash/panic
<hartytp__>
and try with no RTM
<hartytp__>
(oh, and, yes, a broken capacitor could cause this issue, but Greg has looked at all power supplies several times and found nothing suspicious)
<sb0_>
it's not clear to me whether he tried it with the SAWG running, nor if his board crashes at all
<hartytp__>
maybe there is something daft like some FPGA pin that isn't connected correctly
<sb0_>
yes, it could be that too
<hartytp__>
so we should get another design review for the AMC
<hartytp__>
I can't do that
<hartytp__>
we can ask greg to take another look
<hartytp__>
you know any hardware guys who would mind having a look?
<hartytp__>
I think he did try with sawg, but if you don't know then why not ask him on GitHub? No point wondering when he's generally very fast at responding to questions
<hartytp__>
marmelada: while this may well not be a hw issue, I think we do need to consider that
<hartytp__>
cf. how many issues we had with the ethernet chip because of a floating pin
<hartytp__>
do you know of anyone (e.g. creotech) that could do a full design review of the AMC?
<hartytp__>
all the boring stuff like digging through the xilinx guides on capacitors, clocking recommendations, magic pins that have to have something done to them to make things work
<hartytp__>
the works
<hartytp__>
well, not a full design review, but at least the parts around the FPGA/RAM
<hartytp__>
and particularly anything ultrascale-related
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: okay, I modified my code to move the SAWG onto a separate CD, whose RESET can be controlled by Kernels. Running the "crash kernel" with the SAWG in reset does not crash. https://github.com/m-labs/artiq/issues/1065#issuecomment-399654163
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: Haven't looked. I'm seeing what happens if I enable the SAWG in the Kernel. If you have time, can you try running my code and see if you get the same result (no crash with SAWG disabled)?... https://github.com/m-labs/artiq/issues/1065#issuecomment-399656141
<GitHub-m-labs>
[artiq] hartytp commented on issue #1080: One data point here: running with the SAWG held in reset, I don't see the "crash kernel" crash. But, I do see a bunch of errors during init (JESD PRBS, can't determine SYSREF margin at FPGA) https://github.com/m-labs/artiq/issues/1080#issuecomment-399656302
<GitHub-m-labs>
[artiq] hartytp commented on issue #1080: Note to self: try this with a no-sawg build. It would be interesting to see if there is a difference between no SAWG and SAWG in reset. If there is, then this seems much more like a vivado issue than a hardware issue. https://github.com/m-labs/artiq/issues/1080#issuecomment-399656378
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: @gkasprow The SAWG still runs even when there are JESD errors. Look at the logs I've posted, they all have JESD errors but, so long as the board boots up fully and you can run the kernel, I don't think that's a problem. https://github.com/m-labs/artiq/issues/1065#issuecomment-399659751
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: @gkasprow While this might well be a code issue, I think it is worth doing a complete design review of Sayma AMC. Is there anyone else (e.g. Creotech) who can lend us a fresh pair of eyes? Looking at the usual things like checking Xilinx decoupling requirements, checking for any pins with special requirements, comparing our schematic to relevant ultra-scale eval boards,
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: @sbourdeauducq what would it take to get the SAWG running on a kintex ultrascale eval board? I would feel *much* more confident that this is likely a HW issue if you could show that a design of comparable complexity runs correctly on an ultrascale eval board.... https://github.com/m-labs/artiq/issues/1065#issuecomment-399660481
<GitHub61>
[smoltcp] dlrobertson commented on pull request #236 200ec19: This is `O(3n)`, right? Is there a way we could make this faster? What if we used a `ManagedMap` instead of a `ManagedSlice`? https://github.com/m-labs/smoltcp/pull/236#discussion_r197614219
<GitHub180>
[smoltcp] dlrobertson commented on pull request #236 200ec19: :bike: :house: : `ipv4-fragmentation`? Will it ever make sense to have a single generic `ip-fragmentation` feature? Is there a reason to keep two separate features for ipv4 and ipv6? https://github.com/m-labs/smoltcp/pull/236#discussion_r197613823