#milkymist on 2013-04-12 — irc logs at freenode.irclog.whitequark.org

2013-02-02 10:11 lekernel changed the topic of #milkymist to: Milkymist One, Migen, Milkymist SoC & Flickernoise :: Logs: http://en.qi-hardware.com/mmlogs

00:06 jimmythehorn has quit [Quit: jimmythehorn]

00:34 dvdk has joined #milkymist

00:35 dvdk has quit [Remote host closed the connection]

00:43 kristianpaul has quit [Read error: Operation timed out]

00:46 kristianpaul has joined #milkymist

01:25 wpwrak has quit [Ping timeout: 258 seconds]

03:38 wpwrak has joined #milkymist

04:03 antgreen has joined #milkymist

04:37 <wpwrak> hmm, why can't i slice a Cat object ? :-(

04:39 <wpwrak> and indexing doesn't go so well either. migen accepts it but the verilog it produces doesn't find the approval of the HDL parser: {mmc_dat[1], mmc_clk, mmc_cmd, mmc_dat[2]}[0] <= dbg[0];

06:27 <larsc> I think I have a patch for that somewhere

06:37 Martoni has joined #milkymist

07:07 <larsc> wpwrak: http://pastebin.com/ULgHckVY

07:26 <larsc> I wonder could we instead of triggering on rising edge trigger on the falling edge, which should be more or less straight?

07:27 <larsc> nah I guess not

07:27 <larsc> since data is only valid between rising and falling edge

07:28 lekernel has joined #milkymist

07:30 <larsc> although you could delay sda

07:30 <lekernel> wpwrak, you should also remove the 100pF from your board if not already done

07:53 <GitHub75> [mibuild] sbourdeauducq pushed 1 new commit to master: http://git.io/GyU8Dw

07:53 <GitHub75> mibuild/master 59d64e9 Werner Almesberger: mibuild: define memory card pins of the Milkymist One platorm...

08:14 bhamilton has joined #milkymist

08:17 bhamilton has left #milkymist [#milkymist]

08:23 bhamilton has joined #milkymist

08:37 <larsc> don't you hate it when you get a timing error by 0.001 ns?

08:39 <lekernel> just ignore... it's not like xilinx's model is that accurate anyway

08:39 <lekernel> sometimes it tells you timing is met when it actually isn't

08:41 <larsc> I tried, it didn't work

08:41 <larsc> well or my logic is wrong

08:42 <lekernel> try a shot of freeze spray... makes the fpga go faster :)

08:42 <larsc> hehe

08:54 bhamilton has left #milkymist [#milkymist]

09:22 bhamilton has joined #milkymist

09:22 bhamilton has left #milkymist [#milkymist]

09:32 bhamilton1 has joined #milkymist

09:42 bhamilton1 has left #milkymist [#milkymist]

09:51 <larsc> ah the joys of p&r, added some more logic now the timings are met and everything works

09:53 <lekernel> sounds like my usual experience with the xilinx software. and have this problem in the middle of horrid, time-sinking bug-hunting for maximum frustration.

09:57 <lekernel> this is actually one of the problems that should be fixed for good with FPGAs

09:57 <lekernel> that and a good in-chip logic analyzer (which would not require complete recompilation)

10:06 Martoni_ has joined #milkymist

10:08 Martoni has quit [Ping timeout: 248 seconds]

10:08 Martoni_ is now known as Martoni

10:20 <larsc> yes

10:25 <wpwrak> lekernel: 100 pF ? where ?

10:26 <wpwrak> oh .. on top of some of the resistors ?

10:26 <lekernel> wpwrak, on top of the pull-ups

10:26 <lekernel> yes

10:28 <wpwrak> very clean work. i had noticed that those "caps" were a bit tall but didn't realize they were stacks

10:28 <wpwrak> so all four of them can go, i guess ?

10:32 <wpwrak> lekernel: btw, mibuild doesn't have a setup.py, even though the tutorial suggests it does ("by running their respective setuptools script")

10:38 <wpwrak> err yes, all four, obviously. morning caffeine is beginning to kick in :)

10:39 * wpwrak is beginning to wonder if brain size and such aren't overrated and the true secret behind human intelligence is the use of stimulants ...

10:46 <lekernel> yes, remove all 4 caps

10:46 <lekernel> I sent you some replacement 4.7K resistors if removing the whole stack is easier for you

10:47 <lekernel> I've been called "crazy" more than once for stacking 0402 components, but it's actually not that hard, you should try ;)

10:50 bhamilton has joined #milkymist

10:53 <lekernel> what I find hard, however, is using just the right amount of solder like professionals in assembly factories do... so the solders look beautiful and equal instead of the blobs that I make

10:53 <wpwrak> lekernel: btw, good idea about adding some support under the board. i started to worry as well. now there's a piece of acrylic under it

10:54 <wpwrak> removing the caps does indeed help

10:55 <wpwrak> http://downloads.qi-hardware.com/people/werner/ming/edid/scl-faster-no-100pF.png

10:55 <lekernel> but still a glitch, right?

10:55 <wpwrak> now it's getting a little painful to still find a glitch. about 1:10-1:12, i'd say

10:55 <wpwrak> yes

10:55 <lekernel> interesting behaviour...

10:56 <lekernel> I wouldn't expect a steady ramp to produce so many problems

10:56 <wpwrak> with my stronger pull-up and the caps removal, we dropped the glitchiness by about an order of magnitude

10:56 <lekernel> nice catch btw! thanks Werner

10:56 <lekernel> and there are no glitches when there is no video signal?

10:57 <wpwrak> ah, lemme try that

11:00 <wpwrak> 0 in 50 tries (only a number of very short ones from crosstalk in my crappy LA)

11:01 <lekernel> so it sounds like some sort of internal FPGA crosstalk when the IO voltage stays in the forbidden zone

11:01 <lekernel> good to know

11:03 <wpwrak> doesn't have to be in the FPGA. it's more likely on the unshielded and very long paths you have between board and FPGA

11:04 <lekernel> wouldn't that show on the scope?

11:04 <wpwrak> maybe if it was faster :)

11:05 <wpwrak> also my probe setup isn't optimal. long loops between ground and signal. for best results, i'd have to expose some ground next to the signal and hold the probe to it. but even then the capacitative probe may not catch it.

11:06 <wpwrak> i have resistive probes, with nearly infinite bandwidth, but they're a bit messy to deploy

11:07 <wpwrak> my rigol has a sample rate of 200 MSa/s with multiple channels, 400 MSa/s with one. so anything that's faster than 5-10 ns is just invisible

11:08 <wpwrak> and to properly catch a HF glitch, i'd need even more samples. so we're in the > 20 ns domain. almost at the pixel clock
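
The sample-rate figures quoted above can be turned into sample spacing with a one-line calculation. This is only a sketch of the arithmetic; the helper name is invented for illustration.

```python
def sample_period_ns(rate_msa_per_s):
    """Time between two consecutive scope samples, in nanoseconds."""
    return 1e3 / rate_msa_per_s

for rate in (200, 400):
    print(f"{rate} MSa/s -> one sample every {sample_period_ns(rate):g} ns")
```

A pulse has to span several of these sample periods before its shape becomes recognisable, which is why the usable limit ends up well above the raw 2.5 to 5 ns spacing.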

11:08 <wpwrak> maybe i should try the lottery. a good 500 MHz scope has been on my wish list for a very long time ;-)

11:09 <lekernel> how do you plan to implement digital deglitching?

11:09 <lekernel> downsample?

11:10 <wpwrak> maybe just sum over the scl buffer. may be heavy on the fpga but should give us an idea of what works

11:10 <lekernel> make sda + scl go through a flip-flop, and only pulse its clock-enable pin once every 2**n cycles

11:11 <lekernel> only takes a n-bit counter and 2 flip-flops

11:11 <wpwrak> lars proposed a counter for a minimum stable time. not sure which is simpler

11:11 <lekernel> downsampling should do the trick with very little resources I think

11:11 <wpwrak> we can optimize later :) let's first see if the problem can be killed

11:12 <larsc> my proposal was basically a digital schmitt trigger
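
One possible reading of the "counter for a minimum stable time" idea, modelled in plain Python rather than Migen. The function name, the `min_stable` parameter and the sample data are assumptions made for this sketch; it is not the code larsc had in mind.

```python
def debounce(samples, min_stable):
    """Change the output only after the input has differed from it
    for `min_stable` consecutive samples."""
    out = samples[0]
    count = 0
    result = []
    for s in samples:
        if s != out:
            count += 1
            if count >= min_stable:
                out = s      # input was stable long enough: accept it
                count = 0
        else:
            count = 0        # input fell back: restart the stability timer
        result.append(out)
    return result

noisy = [0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1]
print(debounce(noisy, 3))  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

Short excursions in either direction are swallowed, at the cost of delaying every real edge by `min_stable` samples.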

11:12 <lekernel> if the downsampled period is longer than the duration of glitching, it should fix it

11:26 <wpwrak> i set a duration trigger and either my scope is missing a lot of things or the glitches are fairly infrequent

11:27 <wpwrak> i also noticed that DDC often doesn't recover even after xrandr --off

11:28 <wpwrak> an FPGA reset cures that. so maybe something else gets messed up when there are too many problems

11:31 <wpwrak> let's see if i can get a mugshot without pixel data

11:34 <wpwrak> (getting out of DDC hell) and sometimes i have to unplug. so the problem may well be on the PC side

11:51 <wpwrak> lekernel: btw, shouldn't things like this work ? self.comb += dbg_pads[0].eq(dbg[0])

11:52 <wpwrak> this works: self.comb += dbg_pads.eq(dbg[0:3] + 8*scl_i)

11:53 <wpwrak> larsc: thanks for the "len" patch ! seems to work, but then i run into the bad verilog migen generates for it :-(

11:53 <larsc> meh

11:53 <lekernel> dbg_pads[0].eq(dbg[0]) doesn't work?

11:54 <wpwrak> scope doesn't seem to see trouble without pixel data. had it searching for something like half an hour now

11:54 <lekernel> slices are incompletely implemented, ENOTIME etc.

11:54 <lekernel> but that case should work

11:55 <wpwrak> i created them with this: dbg_pads = Cat(*(mmc.dat[2], mmc.cmd, mmc.clk, mmc.dat[1]))

11:55 <lekernel> ah, yes, slicing Cat isn't implemented (because it's not implemented in Verilog either and I'm tired of having to write workarounds for shit like that)

11:56 <lekernel> but it should be ...

11:56 <wpwrak> migen produces something a human would understand: {mmc_dat[1], mmc_clk, mmc_cmd, mmc_dat[2]}[0] <= dbg[0];

11:56 <wpwrak> alas, the verilog parser doesn't

11:56 <lekernel> yeh, I know

11:56 <lekernel> verilog crap

11:57 <lekernel> that case is rare enough that I found it unworthy of my time
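
Since Verilog does not accept an index on a concatenation, a generator would have to resolve the index to the underlying signal before emitting code. The following is a hypothetical plain-Python sketch of that lookup, not Migen's implementation; `resolve_bit` and the (name, width) representation are invented for illustration.

```python
def resolve_bit(parts, index):
    """Map a bit index of an LSB-first concatenation to (name, bit offset)."""
    if index < 0:
        raise IndexError("negative indices are not handled in this sketch")
    for name, width in parts:
        if index < width:
            return name, index
        index -= width
    raise IndexError("bit index outside the concatenation")

# Same order as Cat(mmc.dat[2], mmc.cmd, mmc.clk, mmc.dat[1]); each part is 1 bit.
dbg_pads = [("mmc_dat[2]", 1), ("mmc_cmd", 1), ("mmc_clk", 1), ("mmc_dat[1]", 1)]
print(resolve_bit(dbg_pads, 0))  # ('mmc_dat[2]', 0)
```

With the target resolved, the assignment could be written against `mmc_dat[2]` directly instead of against an indexed `{...}` expression.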

11:57 <wpwrak> reminds me of early C++ pre-compilers, which took C++ and generated C from it. every once in a while, you had to fish a problem from the generated C. with C++ mangled names and all, not the nice heuristics migen uses.

11:58 <wpwrak> well, if you have something that's an aggregate, this case of slicing seems to be the most sane way to use bits of it. luckily, the algorithmic approach works as well.

11:59 <wpwrak> actually, why can't these things just be like arrays ? instead of Cat(*(A, B, C)) have [A, B, C] ?

12:00 <lekernel> because you can, however, assign to Cat()

12:00 <lekernel> Cat(...).eq() works

12:00 <lekernel> and you can't make that work with a pure Python list

12:01 <wpwrak> hmm. no way to override the assignment and add a test for array-of-signals ?

12:02 <larsc> hm?

12:02 <lekernel> I don't think you can easily override the built-in types

12:02 <lekernel> otherwise I'd have fixed the dictionary and set non-determinism problem too

12:04 <wpwrak> while trying to turn python into a domain-specific language for my TMC tools, i found that there are surprisingly many things you can override. but yes, could be that this one is tougher. would have to explore a bit.

12:05 <wpwrak> (surprisingly many) e.g., you can have variable reads with complex side-effects

12:05 <lekernel> >>> list.x = 8

12:05 <lekernel> Traceback (most recent call last):

12:05 <lekernel> File "<stdin>", line 1, in <module>

12:05 <lekernel> TypeError: can't set attributes of built-in/extension type 'list'

12:05 <larsc> you can overwrite __builtins__.list with your own wrapper though

12:06 <larsc> but that doesn't work for [] and friends
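
The limitation discussed here can be reproduced in a few lines. `SignalList` and its `eq` method are hypothetical names used only for this sketch; the point is that a subclass can carry extra behaviour while a `[...]` literal still yields a plain built-in list.

```python
class SignalList(list):
    """Hypothetical list subclass that carries an extra method."""
    def eq(self, values):
        # Pair each element with the value it should receive.
        return list(zip(self, values))

# Attributes cannot be added to the built-in type itself...
try:
    list.x = 8
    patched = True
except TypeError:
    patched = False

# ...but a subclass can carry new behaviour.
pads = SignalList(["a", "b"])
pairs = pads.eq([1, 0])

# A literal still creates a plain list, so it never gains eq().
literal_has_eq = hasattr(["a", "b"], "eq")
print(patched, pairs, literal_has_eq)  # False [('a', 1), ('b', 0)] False
```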

12:07 <wpwrak> what if you override __setattr__ ? would that still go to "list" instead of your wrapper ?

12:08 <larsc> __setattr__ of what?

12:08 <lekernel> I'm also a bit hesitant to alter the behaviour of all possibly third-party modules that run in conjunction with the migen stuff

12:08 <wpwrak> if your list wrapper

12:08 <wpwrak> coward ;-)

12:09 <wpwrak> s/if/of/

12:09 <lekernel> being able to use python libraries in test benches is a plus and I don't want to break it

12:09 <wpwrak> looking at my old code, there i had my own class. so i didn't have to fight built-ins

12:10 <lekernel> which is exactly what Cat() is

12:10 <lekernel> with the added semantics that it should represent bit-concatenated values

12:17 <wpwrak> appears that you need ruby for such dirty things. i'm beginning to understand why whitequark loves it so much. create a parallel universe, almost identical to ours, just with a few almost unnoticeable twists

12:18 <lekernel> I suppose you have seen the "wat?" video?

12:19 <lekernel> http://www.youtube.com/watch?v=kXEgk1Hdze0

12:24 <wpwrak> oh dear ;-)

12:25 * wpwrak hurries back to the safety of C :)

12:36 <wpwrak> lekernel: your delay line with "FIXME: understand what is really going on here" ... was that an attempt to escape the glitches ?

12:37 <larsc> it fixed some brokenness

12:37 <lekernel> yes, you should be able to remove it

14:12 <wpwrak> hmm, if assigning from a "variable", wouldn't it be more intuitive if that yielded a blocking assignment, no matter what the target is ?

14:13 <wpwrak> a bit like having "volatile" anywhere in an assignment in C

14:16 <lekernel> no, you need to stop using blocking assignments at some point

14:16 <lekernel> what do you need variables for?

14:17 <wpwrak> for summing the buffer - because i'm too lazy to find out how to assemble a big sum operation :)

14:18 <lekernel> actually a better API for variables would be to associate them to some non-variable backing signal

14:18 <wpwrak> so no, i don't really "need" it. just using it now for convenience.

14:18 <lekernel> but again, I'm only using variables in a couple places, so ...

15:03 <wpwrak> think you'll like this: http://downloads.qi-hardware.com/people/werner/ming/edid/debounce-thresh-7-count-overview.png

15:03 <wpwrak> mail with details is on the way

15:05 <larsc> wpwrak: the problem is, that this won't work if the glitch happens on the 7th bit

15:06 <larsc> after the 7th

15:06 <larsc> e.g. 0111111101111...

15:06 <larsc> imo having an upper and a lower threshold is better
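
A plain-Python sketch of why two thresholds help when the sum over a sample buffer hovers around the switching point. The window size, the thresholds and the input are made up for illustration and are not taken from the actual design.

```python
def sum_filter(samples, window, rise, fall):
    """Threshold the sum of the last `window` samples.
    Output goes high at sum >= rise and low again at sum <= fall."""
    history = [0] * window
    state = 0
    out = []
    for s in samples:
        history = history[1:] + [s]
        total = sum(history)
        if state == 0 and total >= rise:
            state = 1
        elif state == 1 and total <= fall:
            state = 0
        out.append(state)
    return out

def rising_edges(bits):
    return sum(1 for a, b in zip(bits, bits[1:]) if (a, b) == (0, 1))

noisy = [0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1]
single = sum_filter(noisy, 4, rise=3, fall=2)      # one threshold: 3 or more is high
hysteresis = sum_filter(noisy, 4, rise=3, fall=1)  # separate upper and lower threshold
print(rising_edges(single), rising_edges(hysteresis))  # 2 1
```

With a single threshold the sum dips from 3 to 2 and back, which produces a second, spurious rising edge; with the lower threshold set to 1 the output stays high.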

15:07 <lekernel> just oversample

15:07 <lekernel> s/oversample/downsample

15:07 <lekernel> it's just a flip flop and a counter ...

15:07 <lekernel> I'm actually surprised your design met timing

15:08 <wpwrak> yeah, it ain't pretty ;-)

15:08 <wpwrak> maybe the synthesizer figured out what it actually does and got rid of all the unnecessary stops

15:09 <larsc> lekernel: how does that downsampler work?

15:10 <wpwrak> and yes, you're right, i'd still need a hysteresis for this to be stable.

15:11 <lekernel> just load the flip flop once in 2**n cycles

15:11 <larsc> but why does that work? wouldn't it sample the value that's present at the D input at that point?

15:12 <lekernel> if clock_period*2**n > total duration of glitches, then you can't have two consecutive samples in the glitched region

15:12 <wpwrak> you probably have to reset the counter, too

15:13 <lekernel> on a low to high transition, the second sample will always be right. the first one can be high or low, and can end up in the middle of glitches

15:13 <wpwrak> hmm, sounds right. why does it feel wrong ?

15:14 <lekernel> so this adds a clock_period*2**n jitter, which is a lot, but I2C should be slow enough that it won't be a problem
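
The argument above can be checked with a small plain-Python model (not Migen code). The function names and the sample data are invented for this sketch; it assumes the glitching only happens around the edge and lasts fewer samples than the sampling period.

```python
def downsample(samples, n):
    """Take the input once every 2**n samples and hold it in between."""
    period = 2 ** n
    held = samples[0]
    out = []
    for i, s in enumerate(samples):
        if i % period == 0:  # clock-enable pulse from an n-bit counter
            held = s
        out.append(held)
    return out

def rising_edges(bits):
    return sum(1 for a, b in zip(bits, bits[1:]) if (a, b) == (0, 1))

# A 0 -> 1 transition with 5 samples of glitching (made-up data).
raw = [0] * 10 + [1, 0, 1, 0, 1] + [1] * 17
clean = downsample(raw, 3)  # period of 8 samples > 5 glitchy samples
print(rising_edges(raw), rising_edges(clean))  # 3 1
```

If a sample point happens to land on a glitch, the held value may still be the old level, but the next sample point is already past the glitching, so the edge is only delayed by up to one period and never counted twice.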

15:14 <wpwrak> probably just poor intuition :)

15:14 <wpwrak> yeah, I2C is slowness incarnate

15:15 <wpwrak> and it's fairly unconcerned with timing. actually, lemme check that ...

15:16 <lekernel> downsampling also works for debouncing switches... you sample them at around 100Hz, and done

15:18 <wpwrak> yeah, you have half a clock cycle for things like ACK. plenty of time.

15:20 <lekernel> wpwrak, little trick: Cat(carry, counter).eq(counter + 1) works

15:20 <lekernel> but the other way around

15:20 <lekernel> Cat(counter, carry).eq(counter + 1)
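
What this assignment does can be modelled in plain Python, assuming the first argument of Cat() holds the least significant bits, as the corrected order above implies. `split_lsb_first` and the 4-bit counter width are assumptions for this sketch.

```python
def split_lsb_first(value, widths):
    """Distribute an integer over fields listed LSB-first."""
    fields = []
    for width in widths:
        fields.append(value & ((1 << width) - 1))
        value >>= width
    return fields

WIDTH = 4  # assumed counter width for this example
counter = 0b1111
# Model of Cat(counter, carry).eq(counter + 1):
# the low 4 bits land in counter, the overflow bit lands in carry.
counter, carry = split_lsb_first(counter + 1, [WIDTH, 1])
print(counter, carry)  # 0 1
```

The carry bit is set exactly when the counter wraps, so it can serve as the once-every-2**n enable pulse without a separate comparison.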

15:22 <wpwrak> neat :)

15:25 <wpwrak> 32 should be sufficient. the 20 cycles delay line was ~460 us. half a clock cycle takes about 12 us. the glitch events seem to take around 100 ns.

15:25 <wpwrak> s/460 us/460 ns/

15:26 <lekernel> clock frequency is 83MHz on milkymist-ng and 50MHz on the EDID tester

15:28 <wpwrak> seems about right then. 400-650 ns.
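
The window lengths follow directly from the cycle count and the two clock frequencies mentioned above. A minimal sketch of that arithmetic, with an invented helper name:

```python
def window_ns(cycles, clock_mhz):
    """Duration of `cycles` clock periods, in nanoseconds."""
    return cycles * 1e3 / clock_mhz

for clock_mhz in (83, 50):
    print(f"32 cycles at {clock_mhz} MHz: {window_ns(32, clock_mhz):.0f} ns")
```

This gives roughly 386 ns at 83 MHz and 640 ns at 50 MHz, in line with the 400-650 ns estimate.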

15:29 <lekernel> maybe take something like twice the time the voltage is in the forbidden zone, to be sure?

15:30 <lekernel> unless that becomes non-negligible compared to the nominal 100kHz of I2C

15:30 <wpwrak> ah, and i think my sum deglitcher should actually be stable even if the upset is on the 7th or 8th bit, as long as the glitching interval isn't larger than the delay buffer (if it is, you have noise entering and leaving, and the sum can dance around the threshold)

15:31 <wpwrak> yeah, 1 us ought to work, too. that's longer than the very worst-case rise time seen so far

15:31 <wpwrak> of course, i don't know what happens with other equipment. e.g., my HDMI cable is relatively short.

15:32 <lekernel> yeh, so let's take some safety margin

16:09 <wpwrak> seems to work quite well. mail is coming.

16:16 <larsc> I still don't see why this will work. You sample the signal every n cycles, if the glitch happens at exact that moment you'll still see the glitch

16:16 <lekernel> yes, but not at the next sample

16:17 <lekernel> so all the glitches do now is increase jitter, not make counters etc. go too fast

16:17 Martoni has quit [Quit: ChatZilla 0.9.90 [Firefox 20.0/20130329043827]]

16:17 <wpwrak> larsc: if you sample the glitch, then this simply delays you by one sample period

16:18 <lekernel> larsc, there are only glitches during a transition. maybe that's what you did not understand?

16:18 <larsc> maybe

16:18 <wpwrak> larsc: since the signal only glitches at an edge, you're guaranteed to have a glitch-free signal the next time

16:18 <lekernel> when the signal is hard-0 or hard-1, there are no glitches

16:18 <wpwrak> stereo :)

16:18 <larsc> ok, I understand now

16:18 <larsc> I hope

16:19 <larsc> for the glitch to still manifest it basically has to happen at x+n, and x+3*n

16:19 <wpwrak> works also with the pull-up gone. as expected.

16:19 <larsc> well x and x+2*n

16:20 <lekernel> yes but it won't, because the sample period is low enough that signal has reached a stable state by then

16:20 <larsc> yes

16:20 <larsc> that's what I was missing in my mental image

16:21 <lekernel> s/low/long

16:23 <larsc> but it's not exactly downsampling, is it?

16:23 <wpwrak> yeah, it's just sampling

16:24 <lekernel> well, we are sampling first at the system clock frequency

16:25 <lekernel> with a double-FF synchronizer

16:25 <lekernel> and then only consider 1 out of 64 of those samples

16:28 <wpwrak> nice example for how discarding information improves its quality :)

16:29 ohama has quit [Ping timeout: 255 seconds]

16:31 <lekernel> http://arstechnica.com/science/2013/03/flash-memory-issue-forces-curiosity-rover-into-safe-mode/ why can't I help thinking about the RTEMS filesystem API when reading news like this ...

16:32 <wpwrak> yeah, we found interesting bugs there ...

16:33 <wpwrak> ... i still find the message queue crash rather baffling. let's just hope RTEMS only goes into satellites who can peacefully explode in space at a safe distance from everyone, not into, say, nuclear power plant control ...

16:37 ohama has joined #milkymist

16:37 <larsc> or nuclear missile silo control

16:38 <wpwrak> yeah, silo safety protocol

16:39 <wpwrak> like in "twilight's last gleaming": "they can't open the doors". and right then they open :)

16:53 jimmythehorn has joined #milkymist

17:34 bhamilton has quit [Quit: Leaving.]

18:18 <davidc__> wpwrak: In my day job I work on SCADA/smartgrid/etc security. Let's just say people sleep better at night before they see the source of some of those devices.

18:20 <wpwrak> you mean, it's enough if you wake up five times per night, drenched in sweat ?

18:34 <lekernel> wpwrak, excellent job with the i2c! can you just send a patch against milkymist-ng?

19:25 <wpwrak> thanks ! :) need to do some shopping first, then i'll make the patch. also have to check some more systems, see if they have any interesting new perversions. all of them linux, though, so the level of perversity should be limited

20:08 <wpwrak> channel B will also want looking at. there, we have the HDMI clock right next to SCL. if anything can go wrong, it ought to be there :)

20:18 <wpwrak> lekernel: do you have any use for the debugging stuff ?

20:19 <lekernel> no

20:20 <lekernel> or well, hopefully not ;) will pull it from your repos if I end up needing it ...

20:20 <wpwrak> heh ;-)

20:23 <wpwrak> it's kinda disappointing, once the debug code is cleaned out, there's just a few lines left

20:23 <wpwrak> now let's see if i got them right ...

20:39 <wpwrak> patch is on its way

20:56 <lekernel> wpwrak, do both ports work for you?

20:56 <lekernel> port A is fine, port B isn't... but it could be a different problem, I never got port B to work

20:57 <lekernel> (like poor solders)

21:03 <wpwrak> i'll give it a try in a bit. food first :)

21:04 <wpwrak> did it fail at the I2C/DDC/EDID level ?

21:05 <lekernel> no detection at all by xrandr

21:06 <wpwrak> nice. if it looks so bad, it must be something simple :-)

21:08 <GitHub30> [milkymist-ng] sbourdeauducq pushed 1 new commit to master: http://git.io/tsz4ng

21:08 <GitHub30> milkymist-ng/master 7a6e564 Werner Almesberger: edid.py: sample SCL only every 64 clock cycles, to avoid bouncing...

21:23 <mw1> hi

21:24 <mw1> Fallenou: how is netbsd doing?

21:24 mw1 is now known as mwalle

22:00 lekernel has quit [Quit: Leaving]

22:09 qi-bot has quit [Ping timeout: 245 seconds]

22:14 <wpwrak> connectivity on the board looks good. pin definitions in m1.py look correct. ergo it must work. let's see if it does ...

22:17 <wpwrak> but .. nothing happens indeed

22:22 <wpwrak> is it the same on all boards ? or do some work ?

22:25 <wpwrak> bah. now it works

22:29 <wpwrak> sorry, no problem in sight :) just had to disconnect and reconnect

22:30 qi-bot has joined #milkymist

22:30 <wpwrak> maybe a bit more glitches than on CH A, but that's all. and of course, they all get eaten by the downsampling

22:31 <wpwrak> also the pixel clock blinking is like on CH A

22:32 <wpwrak> maybe you ran into the PC giving up for good. i've seen that happen a few times. only a HDMI disconnect cured that.

23:15 kristianpaul has quit [Remote host closed the connection]