#milkymist on 2013-05-30 — irc logs at freenode.irclog.whitequark.org

2013-05-16 16:04 lekernel changed the topic of #milkymist to: Mixxeo, Migen, Milkymist-ng & other Milkymist projects :: Logs: http://en.qi-hardware.com/mmlogs :: Mixxeo preorder lists.milkymist.org/pipermail/devel-milkymist.org/2013-May/003344.html

01:01 antgreen has quit [Remote host closed the connection]

01:02 antgreen has joined #milkymist

01:04 Hawk777 has quit [Quit: Coyote finally caught me]

01:06 Hawk777 has joined #milkymist

04:21 antgreen has quit [Read error: Operation timed out]

04:31 antgreen has joined #milkymist

07:41 bhamilton has joined #milkymist

07:51 bhamilton has quit [Quit: Leaving.]

08:04 bhamilton has joined #milkymist

08:05 lekernel has joined #milkymist

08:06 proppy has quit [Ping timeout: 246 seconds]

08:17 bhamilton has quit [Quit: Leaving.]

08:19 bhamilton has joined #milkymist

08:21 proppy has joined #milkymist

09:08 <GitHub95> [mibuild] sbourdeauducq pushed 1 new commit to master: http://git.io/YFzgVQ

09:08 <GitHub95> mibuild/master 548f268 Sebastien Bourdeauducq: platform/rhino: rename ismm data out signal to locked

09:12 Alarm_ has joined #milkymist

09:38 lekernel has quit [Ping timeout: 264 seconds]

09:42 Alarm_ has quit [Quit: ChatZilla 0.9.90 [Firefox 21.0/20130511120803]]

09:51 lekernel has joined #milkymist

10:09 antgreen has quit [Ping timeout: 264 seconds]

10:24 antgreen has joined #milkymist

13:31 bhamilton has quit [Quit: Leaving.]

14:24 <wpwrak> (pattern degeneration) if it's a problem with phase differences, knowing how the correct pattern gets distorted would allow quantifying the phase error, which in turn may (*) allow picking the most stable value

14:24 <wpwrak> (*) assuming you can adjust the phase

14:37 <lekernel> the clock/phase relationship is already adjusted using oversampling with edge detection

14:38 <lekernel> the system samples at 2x the data rate and adjusts the data input delay to make the middle extra sample have 50% of 1's and 50% of 0's at each transition

14:39 <lekernel> you have to do this sort of thing because the skew is spec'd to many bit times in HDMI/DVI

14:39 <lekernel> and can be different for each channel

15:00 bhamilton has joined #milkymist

15:18 <wpwrak> maybe this algorithm gets thrown off sometimes ?

15:19 <wpwrak> it may also not be phase directly but bit width. e.g., a "1" may be longer or shorter than a "0". so if you take random samples from a perfectly balanced stream, you should actually see more of one type than of the other

15:20 <wpwrak> and if you try to force your detection to yield 50/50, you'd basically have to hop around the phase

15:22 <lekernel> no it doesn't - it's under software control, only runs at 2Hz and prints results on the serial console

15:22 <wpwrak> one way to find out if this is the case would be by disabling phase adjustment and just collecting samples at the bit clock. then see what ratio you get. if you're close to 50/50, then there's no issue. if it's more like 30/70, then there may be an issue

15:22 <lekernel> the delay only oscillates by a few dozen ps

15:24 <wpwrak> hmm, so you have the delay (ps) for fine-tuning plus a larger offset for the skew (multi-bit, so in the many ns range)

15:24 <lekernel> yes

15:25 <lekernel> fine tuning is done to recover bits

15:25 <lekernel> then there is character synchronization to recover words

15:25 <wpwrak> ah, i see

15:26 <lekernel> and finally inter-channel deskew to make the words match on all 3 channels (done by exploiting the fact that hsync/vsync happen exactly at the same time on the 3 channels)

15:27 <wpwrak> okay, that sounds reasonable. the 50/50 fine-tuning could still settle on being very close to an edge

15:28 <lekernel> character synchronization is done by checking for all 9 possible shifts of the concatenation of the last 2 words, and if one of those shifts keeps yielding the same valid control word then it gets selected

15:29 <wpwrak> "the same" control word ? the same as what ?

15:29 <lekernel> during a blank period the control words are repeated. it checks for this.

15:30 <wpwrak> okay

15:30 <lekernel> this avoids false positives since control words have more transitions than normal data

15:30 <wpwrak> sounds good. tricky protocol :)

15:32 <wpwrak> but the fine-tuning still seems suspicious. if you have edges at times 0 and 1, it could settle, say, on 0.1, and never realize it's at risk of getting thrown off by even a slight disturbance, right ?

15:33 <lekernel> ok so the algo works like this:

15:33 <lekernel> at transitions (and only at transitions) it checks for the middle sample

15:33 <lekernel> if you have a transition like

15:34 <lekernel> 0 ---[0]---> 1 then you are probably sampling too early, since the middle sample is still in the 0 region

15:34 <wpwrak> bit in: ...abc... then if (a != b) sample(c); ?

15:34 <wpwrak> oh, i see that way

15:34 <lekernel> 0 ---[1]---> 1 then you are probably sampling too late, since the middle sample has arrived in the 1 region

15:35 <lekernel> and same for 1->0 transitions

15:35 <lekernel> then you increment/decrement a counter when it's early/late

15:35 <wpwrak> okay, you're sampling the edge. good. that should be precise then. it also answers my next question, how you know which way to tune :)

15:36 <lekernel> and if counter reaches a certain absolute value, then you adjust the input data delay by some dozen ps

15:37 <lekernel> the way it's done is: the gateware signals the software when the counter has reached that point (and freezes the counter), then the software has to change the delay and reset the counter

15:37 <wpwrak> 001 and 110 are both treated as the same "early" ? or do you distinguish 0->1 and 1->0 transitions

15:37 <lekernel> detected as the same early

15:38 <wpwrak> so non-50/50 duty cycles should cancel each other out if the threshold is large enough. good.
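[editor's note] The early/late detection described above can be sketched in software as follows. This is only an illustrative model, not the actual gateware: the names `classify_transition` and `PhaseDetector`, the threshold value, and the sign convention (early counts up, late counts down) are assumptions made for the example.

```python
def classify_transition(prev_bit, mid_sample, next_bit):
    """Classify one 2x-oversampled transition.

    Returns +1 for "early" (middle sample still equals the old bit),
    -1 for "late" (middle sample already equals the new bit),
    and 0 when there is no transition to look at.
    The sign convention is an assumption of this sketch.
    """
    if prev_bit == next_bit:
        return 0  # no edge, nothing can be learned
    return +1 if mid_sample == prev_bit else -1


class PhaseDetector:
    """Accumulates early/late votes and freezes once a threshold is hit."""

    def __init__(self, threshold=64):  # threshold value is hypothetical
        self.threshold = threshold
        self.counter = 0
        self.frozen = False

    def feed(self, prev_bit, mid_sample, next_bit):
        if self.frozen:
            return  # wait for software to adjust the delay and reset
        self.counter += classify_transition(prev_bit, mid_sample, next_bit)
        if abs(self.counter) >= self.threshold:
            self.frozen = True

    def reset(self):
        self.counter = 0
        self.frozen = False
```

Because 0->1 and 1->0 edges are classified the same way, a duty-cycle asymmetry that makes rising edges look early and falling edges look late (or the reverse) contributes votes of opposite sign, which tend to cancel before the threshold is reached.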

15:39 <lekernel> I've actually tested that thing with a cable like this: https://pbs.twimg.com/media/BF4h83xCUAAI_Jb.jpg:large

15:39 <wpwrak> ;-))

15:39 <lekernel> and the delay resulting from this algo is smaller on the shortened pair

15:40 <lekernel> s/smaller/longer

15:43 <wpwrak> can you run three counters, one at -1 delay, one at 0 (i.e., like the one you have now), the third at +1 delay ? that would allow you to tell whether the delay adjustment conclusion was correct

15:44 <lekernel> no, there's only one delay unit per io pin

15:44 <wpwrak> darn

15:44 <lekernel> but that thing really seems to work... there's little oscillation

15:45 <lekernel> only by a couple taps of the delay unit, which is a couple dozen ps

15:45 <lekernel> and adjusting the delay does not corrupt the data going through the delay unit

15:45 <wpwrak> so you have good edges but bad bits ? hmm ...

15:46 <lekernel> remember the erroneous words happen in bursts

15:46 <lekernel> probably much shorter than the 500ms period of the software phase adjustments

15:47 <lekernel> so they may cause a shift by one tap, but you won't notice it from the rest of the noise

15:47 <wpwrak> so you know how phase/delay adjustments are related to error bursts ? (in the time of occurrence)

15:47 <lekernel> no, I don't

15:48 <lekernel> the system only adjusts by one tap (dozen ps) maximum every 500ms

15:48 <lekernel> a short error burst happening between two phase adjustments will be lost in the noise

15:49 <lekernel> note that during many phase adjustment periods, WER=0

15:49 <lekernel> the error bursts happen maybe every 2s or so

15:50 <wpwrak> how often does the bit skew adjuster run ?

15:50 <lekernel> every 500ms

15:50 <lekernel> and it only does correction by one tap maximum

15:50 <lekernel> every time

15:50 <lekernel> it's only meant to compensate for temperature and voltage drift
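[editor's note] The slow tracking loop described here (one pass every 500ms, at most one tap per pass) might look roughly like the sketch below. `FakeChannel` is a hypothetical stand-in for the per-channel gateware registers, and the mapping from early/late flags to tap direction is an assumption: it depends on how the hardware defines "early" and "late".

```python
class FakeChannel:
    """Hypothetical model of one channel's delay and phase-detector flags."""

    def __init__(self):
        self.delay_taps = 0      # current input delay setting, in taps
        self.too_early = False   # set when the vote counter saturated "early"
        self.too_late = False    # set when the vote counter saturated "late"

    def reset_counter(self):
        # Clearing the flags models resetting (unfreezing) the vote counter.
        self.too_early = False
        self.too_late = False


def phase_adjust_step(ch):
    """Run one pass of the tracking loop; returns the tap change (-1, 0, +1)."""
    step = 0
    if ch.too_early:
        step = +1   # direction is an assumption of this sketch
    elif ch.too_late:
        step = -1
    if step:
        ch.delay_taps += step   # never more than one tap per pass
        ch.reset_counter()
    return step
```

In a real system this function would be called about twice per second for each channel. A short clock disturbance can then move the delay by at most one tap before the next pass has a chance to move it back.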

15:50 <wpwrak> i mean the one that matches shifted patterns

15:50 <lekernel> there is of course a faster calibration when the port is plugged

15:51 <lekernel> the character synchronization? it's hardware and it's on at all times

15:51 <wpwrak> yeah, seems that the fine-tuning is off the hook for now

15:52 <wpwrak> (char sync) you said it only runs in blank periods. that's between lines or between frames ?

15:52 <lekernel> both

15:54 <larsc> once it is synced it shouldn't get out of sync again, right?

15:54 <wpwrak> is the source allowed to drop/insert bits ? i.e., can the skew "jump" ?

15:56 <wpwrak> there are two types of errors i see: 1) incorrect hsync timing ("late"), and 2) stray error bursts in the pixel data

15:56 <lekernel> the character synchronization switches every time there are 8 *consecutive* appearances of one of the 4 possible control words at the *same* shift position
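[editor's note] A software model of this character synchronization could look like the sketch below. It assumes 10-bit TMDS characters, the four standard control characters (the bit order used here is an assumption), and a scan over offsets 0 to 9 of the two concatenated words. The class and function names are hypothetical, and the real logic is gateware, not Python.

```python
WORD_BITS = 10
WORD_MASK = (1 << WORD_BITS) - 1

# The four TMDS control characters (bit order is an assumption of this sketch).
CONTROL_TOKENS = (0b1101010100, 0b0010101011, 0b0101010100, 0b1010101011)


def find_control(prev_word, cur_word):
    """Return the first shift at which a control token appears, or None."""
    raw = prev_word | (cur_word << WORD_BITS)  # 20-bit sliding window source
    for shift in range(WORD_BITS):
        if (raw >> shift) & WORD_MASK in CONTROL_TOKENS:
            return shift
    return None


class CharSync:
    """Selects a word alignment after `required` consecutive control hits."""

    def __init__(self, required=8):
        self.required = required
        self.prev_word = 0
        self.candidate = None  # shift seen on the most recent hits
        self.count = 0         # consecutive hits at that shift
        self.word_sel = 0      # currently selected alignment

    def feed(self, word):
        pos = find_control(self.prev_word, word)
        self.prev_word = word
        if pos is not None and pos == self.candidate:
            self.count += 1
        else:
            self.candidate = pos
            self.count = 1 if pos is not None else 0
        if self.count >= self.required:
            self.word_sel = self.candidate
        return self.word_sel
```

Requiring several consecutive hits at the same position means that pixel data which happens to resemble a control character once does not change the alignment.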

15:57 <wpwrak> so that suggests synchronization is normally not lost for very long. if a desync event can happen only on horizontal retrace, that would be incompatible with that being the culprint

15:57 <wpwrak> s/print/prit/

15:57 <lekernel> note that the pictures are 100% perfect and WER=0 when there is no DCM_CLKGEN ...

15:57 <wpwrak> and that only happens between lines or frames

15:58 <lekernel> I think the DCM may intermittently lose lock and output a broken clock

15:58 <wpwrak> but then the fine-tuning ought to run wild

15:58 <lekernel> ? why

15:58 <lekernel> no

15:58 <wpwrak> if the clock goes off

15:59 <wpwrak> well, unless the clock just glitches

15:59 <lekernel> if it goes wrong for less than 500ms you'd only see a shift by one tap

15:59 <lekernel> then software resets the counter, and if the clock was OK for the next 500ms the tap simply goes back at next phase adj

16:00 <lekernel> and there's also some oscillation by 1-3 taps at all times, so it's really going to be lost in the noise

16:01 <wpwrak> dvisampler0_pix is affected by the DCM ?

16:01 <lekernel> all pix* clocks are from DCM -> PLL

16:01 <lekernel> or just PLL now

16:01 <wpwrak> how long is the bit period, in taps ?

16:01 <lekernel> lots

16:02 <lekernel> more than 50

16:02 <lekernel> +/- the nice 300% PVT Xilinx has half-spec'd on the IODELAY

16:02 <wpwrak> hmm, if the DCM-PLL combo jitters, i should have seen this

16:03 <wpwrak> i.e., here: http://downloads.qi-hardware.com/people/werner/ming/hdmi-si/fpga-2pix-hdmi-clk-ok.png

16:03 <lekernel> it could be some sort of internal FPGA crosstalk too... which could explain the problems when the SDRAM is read on the other board

16:03 <lekernel> I wonder if the CLKGEN removal fixes that too

16:03 <lekernel> will try...

16:04 <wpwrak> (50 taps) okay, so +/- 3 shouldn't matter

16:08 <wpwrak> if the clock briefly goes out of phase, a comparison between good and bad patterns would still show that

16:26 bhamilton has quit [Quit: Leaving.]

16:31 <lekernel> let me just do that if I run into problems with the PLL alone... I feel I could spend a lifetime investigating every slowtan6 quirk...

16:32 <lekernel> and memory controller issues are causing more damage atm

16:47 <GitHub136> [migen] sbourdeauducq pushed 2 new commits to master: http://git.io/qWvPNQ

16:47 <GitHub136> migen/master ebbd5eb Sebastien Bourdeauducq: bus/csr/SRAM: better handling of writes to memories larger than the CSR width

16:47 <GitHub136> migen/master f0b0942 Sebastien Bourdeauducq: bitreverse: fhdl/tools -> genlib/misc

16:49 <GitHub58> [migen] sbourdeauducq pushed 1 new commit to master: http://git.io/zds_Ag

16:49 <GitHub58> migen/master cebfe78 Sebastien Bourdeauducq: genlib/misc: fix import

18:02 aeris has quit [Read error: Connection reset by peer]

18:03 aeris has joined #milkymist

18:50 Alarm_ has joined #milkymist

18:59 Hawk777 has quit [Quit: Coyote finally caught me]

19:00 Hawk777 has joined #milkymist

19:12 <lekernel> Fallenou, what's wrong with turning the MMU off when entering kernel mode?

19:42 <GitHub161> [milkymist-ng] sbourdeauducq pushed 4 new commits to master: http://git.io/4QGM2g

19:42 <GitHub161> milkymist-ng/master 084aa64 Sebastien Bourdeauducq: dvisampler/clocking: remove DCM_CLKGEN

19:42 <GitHub161> milkymist-ng/master 30f5ef8 Sebastien Bourdeauducq: software/videomixer: remove unneeded DCM resets

19:42 <GitHub161> milkymist-ng/master 6d71e09 Sebastien Bourdeauducq: cif: move to milkymist folder

19:56 Alarm_ has quit [Quit: ChatZilla 0.9.90 [Firefox 21.0/20130511120803]]

20:36 antgreen has quit [Quit: Leaving]

20:41 antgreen has joined #milkymist

20:51 <Fallenou> lekernel: I don't know exactly, I've just been told it's not the good thing to do, but I keep digging :)

20:51 <Fallenou> until I can see the big picture

20:51 <Fallenou> maybe that was rubbish

20:52 <Fallenou> going to read the last two answers tomorrow

20:52 <Fallenou> gn8

21:09 lekernel has quit [Quit: Leaving]

21:09 mumptai has joined #milkymist

22:16 antgreen_ has joined #milkymist

22:17 mumptai has quit [Quit: Verlassend]

22:19 antgreen has quit [Ping timeout: 252 seconds]

22:33 bkero has quit [Ping timeout: 256 seconds]

22:33 ftoad has quit [Remote host closed the connection]

22:34 ftoad has joined #milkymist

22:38 ftoad has quit [Ping timeout: 276 seconds]

22:39 ftoad has joined #milkymist

22:44 ftoad has quit [Ping timeout: 248 seconds]

22:44 ftoad has joined #milkymist

22:45 antgreen_ has quit [Ping timeout: 248 seconds]

23:01 antgreen has joined #milkymist

23:11 bkero has joined #milkymist