<sb0>
the crasher kernel still crashes sayma without the LOC. but it seems it again prints garbage on the UART instead of freezing.
<sb0>
_florent_, is the RTM FPGA supposed to be completely reset during link init??
<sb0>
there are still intermittent serwb init failures that seem to depend on what happened to the AMC before the boot
<sb0>
or it reboots again. and I'm not sure if that's significant, but the memory scan looks worse after a crash-induced reboot.
<sb0>
_florent_, also, why does it sometimes freeze in "wishbone test"? afaik the core is supposed to do a bus error when there is a problem, not stall transactions indefinitely
<sb0>
and this doesn't seem to be due to memory corruption
<_florent_>
sb0: yes the rtm is fully reset during link init. It seems that after a crash, the hardware is not working as well as before the crash (sdram scan, someone also reported serwb errors IIRC), but I don't understand why
<_florent_>
sb0: for the wishbone freeze, i can add a timeout if we don't receive response to read and generate an error
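[editor's sketch] The timeout idea above can be illustrated in software, assuming a polling model. This is not the serwb gateware; `BusError`, `read_with_timeout`, `poll_ack` and the default of 1024 cycles are hypothetical names and values chosen for illustration.

```python
class BusError(Exception):
    """Raised when a read gets no response within the timeout."""


def read_with_timeout(poll_ack, timeout_cycles=1024):
    """Wait for a read response, giving up after timeout_cycles polls.

    poll_ack is a hypothetical callable: it returns the read data once
    the remote side has responded, or None while the response is pending.
    """
    for _ in range(timeout_cycles):
        data = poll_ack()
        if data is not None:
            return data
    # Report an error instead of stalling the transaction indefinitely.
    raise BusError(f"no response after {timeout_cycles} cycles")
```

With this shape, a link that never answers surfaces as an error the caller can handle, rather than a freeze in the bus test.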
<sb0>
so, it seems the DACs are also trashing the FPGA
<sb0>
or the JESD core, the transceivers, or anything that gets initialized in board_artiq::ad9154::init
<sb0>
maybe it's a power integrity issue?
<rjo>
sb0: when i was reviewing the rtm platform def, i also started the amc but didn't get far. one thing that may be useful to check is whether the lvds inputs (especially clocks on non-gt inputs) have termination. probably not the cause of the current problems but still worthwhile.
<sb0>
those reboot loop results seem well reproducible, at least when I'm running them on the HK board
rohitksingh has joined #m-labs
<_florent_>
sb0: i think you sent the same diff two times in #1065
<GitHub-m-labs>
[artiq] enjoy-digital commented on issue #1065: @sbourdeauducq: in your first case, the HMC830 and HMC7043 are initialized, but the buffers on the FPGA inputs are still disabled. (We enable them when initializing the DAC).... https://github.com/m-labs/artiq/issues/1065#issuecomment-397538869
<GitHub-m-labs>
[artiq] enjoy-digital commented on issue #1065: @sbourdeauducq: so let's remove the buffers (or add a separate control for them) and redo your first test. If that's still working fine, then it's related to the JESD/DACs. If not, then to the HMC830/HMC7043. https://github.com/m-labs/artiq/issues/1065#issuecomment-397539922
<GitHub-m-labs>
[artiq] enjoy-digital commented on issue #1065: @sbourdeauducq: so let's remove the enables on the buffers (or add a separate control for them) and redo your first test. If that's still working fine, then it's related to the JESD/DACs. If not, then to the HMC830/HMC7043. https://github.com/m-labs/artiq/issues/1065#issuecomment-397539922
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: @sbourdeauducq interesting! The conclusion here is that the HMC7043 is still interfering with the FPGA, at least after recovering from a crash -- even with the RESET connected and pulled to 3V3.... https://github.com/m-labs/artiq/issues/1065#issuecomment-397610851
cr1901_modern has joined #m-labs
<GitHub-m-labs>
[artiq] sbourdeauducq commented on issue #1065: Simply apply the patch above - then it doesn't boot at all and fails memtest! No need to have a prior crash, this also happens right after a power cycle. https://github.com/m-labs/artiq/issues/1065#issuecomment-397611190
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: Also, are you sure that you have the HMC7043 rework done correctly, and that you're holding the chip in reset mode during boot? Can you try removing the AC coupling caps that connect the HMC7043 to the two FPGAs on your board? https://github.com/m-labs/artiq/issues/1065#issuecomment-397638448
<rjo>
hartytp: i don't think 1.9Vpp into an unterminated LVDS 1.8V HP bank biased to 0.9 V is fine.
<hartytp>
hmmm...
<hartytp>
1.9Vpp
<hartytp>
yes, I'm used to thinking about it as 800mV ish as a single-ended signal
<rjo>
hartytp: have you tested that? i don't think on the clocks is the problem here. if this turns out to make the sysref scan bad or make sync fail, then we can revisit.
<hartytp>
no, it's just something I noticed when looking over the code after some of the SC1 issues, and was curious about
<rjo>
and i don't think the 200r bias resistors will hurt lvds swing (given that we run it in high perf mode at 750 mVpp).
<hartytp>
rjo: well, if that is a problem, I think we need to remove those resistors on the HMC7043 outputs
<hartytp>
rjo: am I being daft here, or is your argument about the LVPECL not correct?
<hartytp>
1.8Vpp is the differential signal
<hartytp>
the input swing across each input is 800mVpp
<hartytp>
which should be fine
<hartytp>
or am I missing something?
<rjo>
hmc7043 table 6 i am reading 1.9Vpp at 1 GHz. assuming that's with the 200R output, that would swing to 0.9+0.95V=1.85V which is high given the input.
<rjo>
and vin_max=vcco+0.2v=2v is pretty close considering that there might well be overshoots.
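[editor's sketch] The headroom arithmetic in the two messages above, written out. All figures are the ones quoted in the chat; treating the 1.9 Vpp as a swing centred on the 0.9 V bias at a single pin is the reading under discussion, not a confirmed datasheet interpretation.

```python
# Values quoted in the discussion (volts).
V_BIAS = 0.9            # bias point of the input
V_SWING_PP = 1.9        # quoted swing, peak-to-peak
VCCO = 1.8              # HP bank supply
V_IN_MAX = VCCO + 0.2   # quoted maximum input voltage

# Assumption: the swing is symmetric around the bias voltage.
v_peak = V_BIAS + V_SWING_PP / 2
headroom = V_IN_MAX - v_peak

print(f"peak = {v_peak:.2f} V, limit = {V_IN_MAX:.2f} V, headroom = {headroom:.2f} V")
```

Under that reading only about 0.15 V remains below the limit, which is why overshoot is a concern.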
<rjo>
it seems completely unnecessary to drive that into the fpga fabric.
<rjo>
and obviously if there is no input termination on the LVDS inputs, all bets are probably off.
<hartytp>
table 7
<hartytp>
well, yes, the lack of termination is obviously not at all good
<rjo>
figure 6. sorry.
<hartytp>
since it's a doubly unterminated line
<hartytp>
rjo: that's differential
<hartytp>
i.e. the difference between the p and n outputs
<hartytp>
so each of those outputs only swings by 800mV (which is standard for LVPECL)
<rjo>
the driver side is terminated.
<rjo>
850 mV.
<hartytp>
sure
<rjo>
but it's not standard for a LVDS receiver at all.
<hartytp>
no, it's not standard, but I wouldn't have thought it would do any harm...
<hartytp>
(particularly not after some loss in transmission lines)
<rjo>
well. the datasheet has 600 mV vdiff max.
<hartytp>
okay
<hartytp>
well, in that case you're 100% right :)
<hartytp>
there could be some diodes between the inputs
<hartytp>
or something like that
<hartytp>
well, we're clearly exceeding that by quite some margin, which I can well imagine causing issues
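[editor's sketch] The single-ended versus differential bookkeeping from this exchange, assuming ideal antiphase legs, no transmission-line loss, the 0.9 V bias as common mode, and that the 600 mV limit applies to |Vp - Vn|. The function name `diff_from_leg` is hypothetical; the 0.80 V and 0.85 V leg swings are the figures mentioned in the chat.

```python
V_COMMON = 0.9      # common-mode (bias) voltage, V
V_IN_MAX = 2.0      # single-ended limit quoted in the chat, V
V_IDIFF_MAX = 0.6   # differential limit quoted in the chat, V


def diff_from_leg(v_leg_pp, v_common=V_COMMON):
    """Return (differential pp, peak |Vp - Vn|, per-leg peak voltage).

    Each leg swings v_leg_pp around v_common, in antiphase with the other.
    """
    v_diff_pp = 2 * v_leg_pp          # differential runs from -v_leg_pp to +v_leg_pp
    v_diff_peak = v_leg_pp            # largest |Vp - Vn|
    v_leg_peak = v_common + v_leg_pp / 2
    return v_diff_pp, v_diff_peak, v_leg_peak


for leg_pp in (0.80, 0.85):
    diff_pp, diff_peak, leg_peak = diff_from_leg(leg_pp)
    print(f"leg {leg_pp:.2f} Vpp: diff {diff_pp:.2f} Vpp, "
          f"|Vdiff| peak {diff_peak:.2f} V (over limit: {diff_peak > V_IDIFF_MAX}), "
          f"leg peak {leg_peak:.3f} V (over limit: {leg_peak > V_IN_MAX})")
```

Under these assumptions the per-pin voltage stays well inside the single-ended limit, while the differential amplitude is above 600 mV for both leg swings.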
<rjo>
but apart from being potentially harmful it seems to be unneeded as there is no indication that there is a SI issue on sysref.
<rjo>
they talk about certain cases where higher vdiff is tolerated but it seems pointless to explore that.
<hartytp>
ack. I don't think it was something we put much thought into
<sb0>
the artix-7 fpga has clamp diodes to I/O bank VCC and ground
<sb0>
and they are permanently connected, unlike in some other fpga families
<hartytp>
well, in that case I would only expect a max se swing limit and not a max vdiff limit
<hartytp>
but, who knows
<hartytp>
anyway, as rjo says, LVPECL is probably over the top for those signals, so let's stick with LVDS and probably remove the bias resistors in the next revision
<rjo>
750mV vodiff pp in LVDS (high power as currently) is still massive and more than 2*vidiff typ. i don't think the 100R each are going to hurt that signal.
<hartytp>
sb0 anything else you want me to look at
<GitHub-m-labs>
[artiq] hartytp commented on issue #1065: hmm...after a reboot, the kernel ran (same output on UART), 5 minutes later, no crash afaict. Re loading the AMC FPGA with `artiq_flash -t sayma ... start` mem test looks good, and the Kernel runs again with the same output. https://github.com/m-labs/artiq/issues/1065#issuecomment-397668228
<hartytp>
as I said though, it might be worth checking that you really can disable the HMC7043 after your rework by holding reset high
<GitHub65>
[smoltcp] podhrmic closed pull request #237: Fix MTU settings so fragmented packets can be received (master...proper_mtu_handling) https://github.com/m-labs/smoltcp/pull/237
<GitHub132>
[smoltcp] podhrmic commented on issue #236: @pothos It turns out that counting in the ethernet header is strictly speaking needed only for UDP packets. With MTU of 1500, linux sends ethernet frames that are 1514 bytes long. For TCP packets, they are only 1500 bytes long (including the header). You can try this with a wireshark and see for yourself.... https://github.com/m-labs/smoltcp/pull/236#issuecomment-39770