lekernel changed the topic of #milkymist to: Mixxeo, Migen, Milkymist-ng & other Milkymist projects :: Logs: http://en.qi-hardware.com/mmlogs :: Mixxeo preorder lists.milkymist.org/pipermail/devel-milkymist.org/2013-May/003344.html
antgreen has quit [Remote host closed the connection]
antgreen has joined #milkymist
Hawk777 has quit [Quit: Coyote finally caught me]
Hawk777 has joined #milkymist
antgreen has quit [Read error: Operation timed out]
antgreen has joined #milkymist
bhamilton has joined #milkymist
bhamilton has quit [Quit: Leaving.]
bhamilton has joined #milkymist
lekernel has joined #milkymist
proppy has quit [Ping timeout: 246 seconds]
bhamilton has quit [Quit: Leaving.]
bhamilton has joined #milkymist
proppy has joined #milkymist
<GitHub95>
[mibuild] sbourdeauducq pushed 1 new commit to master: http://git.io/YFzgVQ
<GitHub95>
mibuild/master 548f268 Sebastien Bourdeauducq: platform/rhino: rename ismm data out signal to locked
Alarm_ has joined #milkymist
lekernel has quit [Ping timeout: 264 seconds]
Alarm_ has quit [Quit: ChatZilla 0.9.90 [Firefox 21.0/20130511120803]]
lekernel has joined #milkymist
antgreen has quit [Ping timeout: 264 seconds]
antgreen has joined #milkymist
bhamilton has quit [Quit: Leaving.]
<wpwrak>
(pattern degeneration) if it's a problem with phase differences, knowing how the correct pattern gets distorted would allow you to quantify the phase error, which in turn may (*) allow picking the most stable value
<wpwrak>
(*) assuming you can adjust the phase
<lekernel>
the clock/phase relationship is already adjusted using oversampling with edge detection
<lekernel>
the system samples at 2x the data rate and adjusts the data input delay to make the middle extra sample have 50% of 1's and 50% of 0's at each transition
<lekernel>
you have to do this sort of thing because the skew is spec'd to many bit times in HDMI/DVI
<lekernel>
and can be different for each channel
bhamilton has joined #milkymist
<wpwrak>
maybe this algorithm gets thrown off sometimes ?
<wpwrak>
it may also not be phase directly but bit width. e.g., a "1" may be longer or shorter than a "0". so if you take random samples from a perfectly balanced stream, you should actually see more of one type than of the other
<wpwrak>
and if you try to force your detection to yield 50/50, you'd basically have to hop around the phase
<lekernel>
no it doesn't - it's under software control, only runs at 2Hz and prints results on the serial console
<wpwrak>
one way to find out if this is the case would be by disabling phase adjustment and just collecting samples at the bit clock. then see what ratio you get. if you're close to 50/50, then there's no issue. if it's more like 30/70, then there may be an issue
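A rough numeric illustration of the ratio wpwrak describes, under assumptions that are not in the log: every run of ones is lengthened by a fixed amount, every run of zeros is shortened by the same amount, and the line is sampled at uniformly random times. The function name and parameters are hypothetical.

```python
def fraction_high(run_bits: float, stretch: float) -> float:
    """Fraction of time the line is high in a balanced stream.

    run_bits: average run length, in bit times (1.0 for 1010...).
    stretch:  how much each run of ones is lengthened, in bit times;
              each run of zeros is shortened by the same amount.
    """
    if not -run_bits < stretch < run_bits:
        raise ValueError("stretch must be smaller than the run length")
    # One run of ones plus one run of zeros always lasts 2 * run_bits.
    return (run_bits + stretch) / (2 * run_bits)


# No distortion: samples split evenly between ones and zeros.
print(fraction_high(1.0, 0.0))
# Ones stretched by 0.4 bit times: roughly the 70/30 split mentioned above.
print(fraction_high(1.0, 0.4))
```

In this simplified model a measured ratio r on an alternating pattern would correspond to a stretch of 2r - 1 bit times.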
<lekernel>
the delay only oscillates by a few dozen ps
<wpwrak>
hmm, so you have the delay (ps) for fine-tuning plus a larger offset for the skew (multi-bit, so in the many ns range)
<lekernel>
yes
<lekernel>
fine tuning is done to recover bits
<lekernel>
then there is character synchronization to recover words
<wpwrak>
ah, i see
<lekernel>
and finally inter-channel deskew to make the words match on all 3 channels (done by exploiting the fact that hsync/vsync happen exactly at the same time on the 3 channels)
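A minimal sketch of the deskew idea, assuming each channel reports the word index at which it saw the same blanking event. The channel names, the index values and the function name are made up for illustration; the real gateware is not shown in this log.

```python
def deskew_delays(sync_index: dict) -> dict:
    """Return the extra delay, in words, to apply to each channel.

    sync_index maps a channel name to the word index at which that channel
    observed the common sync event. The latest channel gets no extra delay;
    the others are delayed until they line up with it.
    """
    latest = max(sync_index.values())
    return {ch: latest - idx for ch, idx in sync_index.items()}


# Hypothetical arrival indices for the three TMDS data channels.
arrival = {"ch0": 12, "ch1": 14, "ch2": 11}
delays = deskew_delays(arrival)
# After applying the delays, every channel presents the event at one index.
aligned = {ch: arrival[ch] + delays[ch] for ch in arrival}
print(delays, aligned)
```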
<wpwrak>
okay, that sounds reasonable. the 50/50 fine-tuning could still settle on being very close to an edge
<lekernel>
character synchronization is done by checking for all 9 possible shifts of the concatenation of the last 2 words, and if one of those shifts keeps yielding the same valid control word then it gets selected
<wpwrak>
"the same" control word ? the same as what ?
<lekernel>
during a blank period the control words are repeated. it checks for this.
<wpwrak>
okay
<lekernel>
this avoids false positives since control words have more transitions than normal data
<wpwrak>
sounds good. tricky protocol :)
<wpwrak>
but the fine-tuning still seems suspicious. if you have edges at times 0 and 1, it could settle, say, on 0.1, and never realize it's at risk of getting thrown off by even a slight disturbance, right ?
<lekernel>
ok so the algo works like this:
<lekernel>
at transitions (and only at transitions) it checks for the middle sample
<lekernel>
if you have a transition like
<lekernel>
0 ---[0]---> 1 then you are probably sampling too early, since the middle sample is still in the 0 region
<wpwrak>
bit in: ...abc... then if (a != b) sample(c); ?
<wpwrak>
oh, i see that way
<lekernel>
0 ---[1]---> 1 then you are probably sampling too late, since the middle sample has arrived in the 1 region
<lekernel>
and same for 1->0 transitions
<lekernel>
then you increment/decrement a counter when it's early/late
<wpwrak>
okay, you're sampling the edge. good. that should be precise then. it also answers my next question, how you know which way to tune :)
<lekernel>
and if counter reaches a certain absolute value, then you adjust the input data delay by some dozen ps
<lekernel>
the way it's done is: the gateware signals the software when the counter has reached that point (and freezes the counter), then the software has to change the delay and reset the counter
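The early/late scheme described above can be modelled in software roughly as follows. This is a sketch, not the actual gateware: the class and function names are hypothetical, the threshold value is an assumption (the log gives none), and the sign of the delay correction is an assumption too.

```python
from dataclasses import dataclass


@dataclass
class PhaseDetector:
    """Model of the early/late counter fed by 2x oversampled data."""
    threshold: int = 8    # assumed value, not taken from the log
    counter: int = 0
    frozen: bool = False  # set once |counter| reaches the threshold

    def observe(self, prev_bit: int, mid_sample: int, next_bit: int) -> None:
        # Only transitions carry phase information; a frozen counter
        # waits for software to service it.
        if self.frozen or prev_bit == next_bit:
            return
        if mid_sample == prev_bit:
            self.counter += 1  # middle sample still in the old bit: early
        else:
            self.counter -= 1  # middle sample already in the new bit: late
        if abs(self.counter) >= self.threshold:
            self.frozen = True


def software_adjust(det: PhaseDetector, delay_taps: int) -> int:
    """Software side: move the data delay by one tap, then reset the counter.

    Assumed direction: sampling early is corrected by reducing the data
    input delay, sampling late by increasing it.
    """
    if not det.frozen:
        return delay_taps
    delay_taps += -1 if det.counter > 0 else 1
    det.counter = 0
    det.frozen = False
    return delay_taps
```

Because 0->1 and 1->0 transitions feed the same counter, an early verdict on one edge type and a late verdict on the other cancel out, which is the behaviour discussed for non-50/50 duty cycles.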
<wpwrak>
001 and 110 are both treated as the same "early" ? or do you distinguish 0->1 and 1->0 transitions
<lekernel>
detected as the same early
<wpwrak>
so non-50/50 duty cycles should cancel each other out if the threshold is large enough. good.
<lekernel>
and the delay resulting from this algo is smaller on the shortened pair
<lekernel>
s/smaller/longer
<wpwrak>
can you run three counters, one at -1 delay, one at 0 (i.e., like the one you have now), the third at +1 delay ? that would allow you to tell whether the delay adjustment conclusion was correct
<lekernel>
no, there's only one delay unit per io pin
<wpwrak>
darn
<lekernel>
but that thing really seems to work... there's little oscillation
<lekernel>
only by a couple taps of the delay unit, which is a couple dozen ps
<lekernel>
and adjusting the delay does not corrupt the data going through the delay unit
<wpwrak>
so you have good edges but bad bits ? hmm ...
<lekernel>
remember the erroneous words happen in bursts
<lekernel>
probably much shorter than the 500ms period of the software phase adjustments
<lekernel>
so they may cause a shift by one tap, but you won't notice it from the rest of the noise
<wpwrak>
so you know how phase/delay adjustments are related to error bursts ? (in the time of occurrence)
<lekernel>
no, I don't
<lekernel>
the system only adjusts by one tap (dozen ps) maximum every 500ms
<lekernel>
a short error burst happening between two phase adjustments will be lost in the noise
<lekernel>
note that during many phase adjustment periods, WER=0
<lekernel>
the error bursts happen maybe every 2s or so
<wpwrak>
how often does the bit skew adjuster run ?
<lekernel>
every 500ms
<lekernel>
and it only does correction by one tap maximum
<lekernel>
every time
<lekernel>
it's only meant to compensate for temperature and voltage drift
<wpwrak>
i mean the one that matches shifted patterns
<lekernel>
there is of course a faster calibration when the port is plugged
<lekernel>
the character synchronization? it's hardware and it's on at all times
<wpwrak>
yeah, seems that the fine-tuning is off the hook for now
<wpwrak>
(char sync) you said it only runs in blank periods. that's between lines or between frames ?
<lekernel>
both
<larsc>
once it is synced it shouldn't get out of sync again, right?
<wpwrak>
is the source allowed to drop/insert bits ? i.e., can the skew "jump" ?
<wpwrak>
there are two types of errors i see: 1) incorrect hsync timing ("late"), and 2) stray error bursts in the pixel data
<lekernel>
the character synchronization switches every time there are 8 *consecutive* appearances of one of the 4 possible control words at the *same* shift position
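A software sketch of the character synchronization rule, under stated assumptions: 10-bit TMDS words, the four control tokens from the DVI specification, the first-received bit stored in the least significant bit, and one repeat counter per shift. The class name is hypothetical. The sketch scans bit offsets 0 to 9, that is the unshifted position plus nine shifted ones.

```python
# The four TMDS control tokens (bit order within the word is an assumption).
CONTROL_TOKENS = {0b1101010100, 0b0010101011, 0b0101010100, 0b1010101011}
WORD_BITS = 10
REQUIRED_REPEATS = 8  # "8 consecutive appearances" from the log


class CharSync:
    def __init__(self) -> None:
        self.prev_word = 0
        self.shift = 0                    # currently selected shift
        self.last = [None] * WORD_BITS    # last token seen at each shift
        self.count = [0] * WORD_BITS      # consecutive repeats at each shift

    def push(self, raw_word: int) -> int:
        """Feed one raw 10-bit word; return the word at the selected shift."""
        # Concatenate the last two raw words, older word in the low bits.
        window = (raw_word << WORD_BITS) | self.prev_word
        self.prev_word = raw_word
        for s in range(WORD_BITS):
            word = (window >> s) & 0x3FF
            if word in CONTROL_TOKENS and word == self.last[s]:
                self.count[s] += 1
            elif word in CONTROL_TOKENS:
                self.last[s], self.count[s] = word, 1
            else:
                self.last[s], self.count[s] = None, 0
            # Switch only after enough identical tokens at the same shift.
            if self.count[s] >= REQUIRED_REPEATS:
                self.shift = s
        return (window >> self.shift) & 0x3FF
```

Requiring repeats of the same token at the same shift means an isolated token-like pattern in pixel data resets before it can change the selection.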
<wpwrak>
so that suggests synchronization is normally not lost for very long. if a desync event can happen only on horizontal retrace, that would be incompatible with that being the culprint
<wpwrak>
s/print/prit/
<lekernel>
note that the pictures are 100% perfect and WER=0 when there is no DCM_CLKGEN ...
<wpwrak>
and that only happens between lines or frames
<lekernel>
I think the DCM may intermittently lose lock and output a broken clock
<wpwrak>
but then the fine-tuning ought to run wild
<lekernel>
? why
<lekernel>
no
<wpwrak>
if the clock goes off
<wpwrak>
well, unless the clock just glitches
<lekernel>
if it goes wrong for less than 500ms you'd only see a shift by one tap
<lekernel>
then software resets the counter, and if the clock was OK for the next 500ms the tap simply goes back at next phase adj
<lekernel>
and there's also some oscillation by 1-3 taps at all times, so it's really going to be lost in the noise
<wpwrak>
dvisampler0_pix is affected by the DCM ?
<lekernel>
all pix* clocks are from DCM -> PLL
<lekernel>
or just PLL now
<wpwrak>
how long is the bit period, in taps ?
<lekernel>
lots
<lekernel>
more than 50
<lekernel>
+/- the nice 300% PVT Xilinx has half-spec'd on the IODELAY
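A back-of-the-envelope check of how far the software loop can move the sampling point, using the figures quoted in this discussion: one tap at most per 500 ms adjustment, bursts roughly every 2 s, and a bit period of more than 50 taps. Taking exactly 50 taps is an assumption (the lower bound), and the PVT spread is ignored.

```python
ADJUST_PERIOD_S = 0.5      # software phase adjustment period (from the log)
MAX_TAPS_PER_ADJUST = 1    # at most one tap per adjustment (from the log)
BIT_PERIOD_TAPS = 50       # assumed lower bound; the log says "more than 50"
BURST_INTERVAL_S = 2.0     # "maybe every 2s or so"


def max_taps_moved(seconds: float) -> int:
    """Upper bound on delay change, in taps, over the given time span."""
    return int(seconds / ADJUST_PERIOD_S) * MAX_TAPS_PER_ADJUST


taps_between_bursts = max_taps_moved(BURST_INTERVAL_S)
fraction_of_bit = taps_between_bursts / BIT_PERIOD_TAPS
seconds_for_half_bit = (BIT_PERIOD_TAPS / 2) / MAX_TAPS_PER_ADJUST * ADJUST_PERIOD_S

print(taps_between_bursts, fraction_of_bit, seconds_for_half_bit)
```

With these numbers the loop can shift by at most 4 taps between bursts, under a tenth of a bit period, and would need more than 12 s to walk half a bit.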
<wpwrak>
hmm, if the DCM-PLL combo jitters, i should have seen this