#milkymist on 2011-08-26 — irc logs at freenode.irclog.whitequark.org

00:41 <wpwrak> wolfspraul: good news: M1 arrived ! and it seems to behave :)

00:44 <wolfspraul> nice!

00:44 <wolfspraul> of course it behaves

00:45 <wolfspraul> or you want to claim that we ship out untested goods? :-)

00:46 <wpwrak> hehe :)

00:47 <wpwrak> hmm, those bottoms ... i feel a strange urge to just mount a block of aluminium and mill a monolithic one

00:48 <wolfspraul> bottoms?

00:48 <wolfspraul> oh buttons?

00:49 <wpwrak> er yes, buttons. wakeup not quite complete yet :)

00:49 <wolfspraul> wakeup? wow

00:49 <wolfspraul> my morning coffee just ready, one sec... (picking up from stove) :-)

00:50 <wolfspraul> how can it be both my morning and your morning at the same time? strange :-)

00:50 <wolfspraul> about test results: Adam finished 37 boards now, all fine

00:50 <wolfspraul> only 0x4D stepped out of the line

00:50 <wolfspraul> another 9 to go

00:53 <wpwrak> wolfspraul: with fedex syncing me to day time and some nasty toothache the last two night keeping me from sleeping (fixed today - dentists are amazingly efficient nowadays), my pattern is even crazier than ever :)

00:54 <wpwrak> (boards) nice !

00:56 <wpwrak> roh: basic button shape ~3 mm + 0.7 mm shaft followed by ~1.1 mm disc, thickness about 4.8 mm in total ?

00:57 <wpwrak> roh: shaft diameter ... duh .. 300 mil ? disc diameter 1.2 mm ?

01:02 <wpwrak> looks at a 100 x 150 x 5 mm Al plate from some misguided experiments at thermal distribution

01:24 <wpwrak> well, fun for later. for now, buttons aren't convenient to have anyway

01:30 <wpwrak> grmbl. X crash.

01:39 <wpwrak> now .. a 12 V wall wart for the camera. hmm ...

01:45 <kristianpaul> same from wrt should work..

01:46 <wpwrak> hmm, no wrt supply at hand

01:47 <wpwrak> must be hiding

01:47 <wpwrak> i'll just try 9 V

02:15 <wpwrak> hmm, video brightness seems to be quite hard to set

02:16 <wpwrak> at least at night. maybe it's better with daylight

02:21 <kristianpaul> already tried to increase the brightness with flickernoise?

02:22 <wpwrak> yeah. but the range in which i get useful images is incredibly narrow

02:22 <kristianpaul> and thats ccd, case cmos was not that good

02:22 <kristianpaul> zoom range?

02:23 <wpwrak> zoom ?

02:23 <wpwrak> it's the standard M1 cam. ain't no zoom :)

02:23 <kristianpaul> no?

02:23 <kristianpaul> i mean you can focus..

02:23 <kristianpaul> ah yeah, i forgot :)

02:25 <wpwrak> (focus) hmm, i can unscrew the lens. but that doesn't look like focus

02:26 <kristianpaul> personally i felt confident at no more than 3 meters far from camera

02:30 <roh> wpwrak: it should be 8mm diameter button caps

02:31 <wpwrak> "explosive minds" may look even cooler with the direction reversed. now it's more like imploding :)

02:31 <roh> the spacer (about 7.9mm diam) should be 0.5mm thick and the inner end cap (12mm diam) should be 1mm thick

02:31 <wpwrak> kristianpaul: oh, i'm about 30-50 cm from the cam :)

02:31 <roh> the button cap has (should have) the same thickness as the sidewall

02:32 <wpwrak> so the spacer should have a smaller diameter than the cap ?

02:32 <kristianpaul> wpwrak: wow, too near

02:33 <kristianpaul> wpwrak: about to make video chat with lekernel ;)?

02:33 <wpwrak> space constraints :)

02:34 <kristianpaul> (glup)

02:35 <wpwrak> short cable, limited length of arms, and unwillingness to rise to touch anything :)

02:37 <kristianpaul> (short cable) oh, well i bought a 3m cable ;-D

02:37 <kristianpaul> also a black cloth :)

02:39 <roh> wpwrak: that difference is only to make it easier to glue without it standing over and hindering it from sliding in completely

02:42 <wpwrak> roh: aah, i see. nice.

02:42 <wpwrak> kristianpaul: (cloth) to hide behind ? :)

02:43 <kristianpaul> lol

02:43 <kristianpaul> no, i wanted to see if quality around video in effects improv

02:43 <kristianpaul> e

02:52 <wpwrak> and, did it ?

03:00 <kristianpaul> not for my own like

03:01 <kristianpaul> but i hold some comments to avoid recall the topic of this channel :)

03:03 <wpwrak> lekernel: idea for future improvement: if no local display of some sort is added, maybe have a LED next to each input. turn it on if the patch is using that channel (video, audio, etc.). blink it if the patch is using the channel but the signal doesn't look right (e.g., no sync, too much black / too much white, etc.)

03:05 <wpwrak> lekernel: regarding recompiling patches, does it actually need to do this for each setup change ? i.e., do the patches depend on setup items ? if not, you could just have a flag in RAM which patches you've already compiled. should be much easier to implement than a persistent cache that survives power cycling.

03:05 <wpwrak> oh, and if you go multi-core, you could just compile patches in the background, while rendering ;-))

03:23 <kristianpaul> ;-)

03:33 <stekern> wpwrak: the problem with the flag approach is; how do you know that the patch hasn't changed?

03:35 <wpwrak> stekern: clear the flag when you overwrite/edit a patch

03:35 <stekern> of course the 'flag' could be some crc/hash of the source, that might solve it

03:36 <wpwrak> yes, or do a hash if you want to get fancy :)

03:36 <stekern> wpwrak: yes, but what if the patch has been modified externally

03:37 <stekern> (admittedly, I am not too familiar with how things work, is it possible that it would be externally modified?)

03:39 <wpwrak> i don't think without you noticing. i.e., you'd still have to transfer it.

03:44 <stekern> well, in that case, the flag approach might work

03:47 <stekern> if cpu time need for calculating hash vs compiling patch is about the same, then there's no point with that

03:47 <stekern> *needed

03:48 <wpwrak> yeah. no idea how they compare

03:48 <stekern> me neither ;)

03:49 <wpwrak> just noticed that the M1 spends quite a bit of time compiling patches, even if all i do is go to the camera settings

03:49 <stekern> I'm in larval stage, at the point where I've got the toolchain compiled and tested to run flickernoise in qemu

04:05 <kristianpaul> thats good !

04:05 <kristianpaul> since yday i started trying to port the debian memtester package, i got it to compile, but only after dirty commenting out of mmu related code..

04:06 <kristianpaul> also some posix functions that rtems disliked (mlock and related..)

04:07 <kristianpaul> i didn't test it yet, i still need to hardcode some memory lengths..

04:08 <kristianpaul> may be you can take a look to the code, i really hackishm now... but it compiles ! ;)

04:08 <kristianpaul> s/i/is

04:11 <wpwrak> kristianpaul: hmm, porting a memtester that tries to defeat virtual memory to a MMU-less system, and having to defeat the VM-dependent feature the program uses, somehow sounds wrong to me ;-)

04:11 <kristianpaul> humm

04:12 <kristianpaul> wpwrak: what you suguest for a memtest/stres test?

04:13 <roh> something simple and small which runs completely from sram and tests the complete dram?

04:13 <roh> output/input via serial

04:13 <kristianpaul> also according to changelog mmap was for adding the feature of testing specific physical regions of memory

04:13 <kristianpaul> good point roh

04:14 <kristianpaul> run from sram

04:14 <kristianpaul> this is the code i found http://pyropus.ca/software/memtester/

04:17 <kristianpaul> my main concern about dram in M1 is possible corruption, but that is just a *guess* as i never understood well the DMA problem with first minimac core

04:37 <stekern> kristianpaul: as roh said, start out with something simple, like just writing all '0's and all '1's and see if they read back ok, do walking '0'/'1's and see if they read back ok

04:41 <stekern> if those simple tests passes, then you can start looking into more complex algorithms

04:41 <stekern> if they don't pass, you might have saved yourself some trouble :)

04:41 <kristianpaul> good plan ;)

04:42 <stekern> (but perhaps got yourself into the trouble figuring out why they don't pass)
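
The simple tests stekern describes could be sketched roughly as follows. This is only a model in Python, with a list standing in for DRAM; on the M1 it would more likely be C running from SRAM, as roh suggests. The 32-bit word width, the `read`/`write` callables and all function names are assumptions made for illustration.

```python
# Sketch only: memory is accessed through write(addr, value) and read(addr),
# so the same logic can be pointed at a plain list or at a faulty model.
WORD_BITS = 32                      # assumed word width
MASK = (1 << WORD_BITS) - 1

def fill_test(write, read, n_words, pattern):
    """Write one pattern to every word, then read everything back.
    Returns a list of (address, expected, got) mismatches."""
    for addr in range(n_words):
        write(addr, pattern)
    errors = []
    for addr in range(n_words):
        got = read(addr)
        if got != pattern:
            errors.append((addr, pattern, got))
    return errors

def walking_test(write, read, addr, invert=False):
    """Walk a single 1 (or a single 0 if invert is True) through one word."""
    errors = []
    for bit in range(WORD_BITS):
        pattern = 1 << bit
        if invert:
            pattern = ~pattern & MASK
        write(addr, pattern)
        got = read(addr)
        if got != pattern:
            errors.append((addr, pattern, got))
    return errors

def simple_memtest(write, read, n_words):
    """All zeros, all ones, then walking 1s and walking 0s on every word."""
    errors = []
    errors += fill_test(write, read, n_words, 0)
    errors += fill_test(write, read, n_words, MASK)
    for addr in range(n_words):
        errors += walking_test(write, read, addr)
        errors += walking_test(write, read, addr, invert=True)
    return errors
```

With a healthy model such as `mem = [0] * 64`, calling `simple_memtest(mem.__setitem__, mem.__getitem__, len(mem))` returns an empty list; a stuck bit shows up as entries naming the affected address.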

09:02 <lekernel> I think hashing a patch will be much faster than compiling it

09:03 <lekernel> and yes patches can be modified externally, via FTP, shell, file manager, etc.

09:03 <lekernel> afaik there's no "file modified" notification API in RTEMS like there is in Linux, and given how badly the RTEMS filesystem is designed I'd rather not touch it
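
Since patches can change behind the renderer's back, the hash variant stekern and wpwrak discussed could look something like the sketch below: compiled patches stay in RAM, keyed by a digest of their source, so an externally edited patch simply misses the cache. This is not how flickernoise works; `PatchCache` and `compile_fn` are hypothetical names, and SHA-256 is used only because it ships with Python (a CRC would probably be the cheaper choice on the M1).

```python
import hashlib

class PatchCache:
    """Keep compiled patches in RAM, keyed by a hash of their source text."""

    def __init__(self, compile_fn):
        self._compile = compile_fn   # hypothetical: source bytes -> compiled patch
        self._cache = {}             # digest -> compiled patch
        self.compile_count = 0       # how often the slow path actually ran

    def get(self, source: bytes):
        digest = hashlib.sha256(source).hexdigest()
        if digest not in self._cache:
            # Unknown or modified source: compile once and remember the result.
            self._cache[digest] = self._compile(source)
            self.compile_count += 1
        return self._cache[digest]
```

Whether this pays off depends on lekernel's expectation that hashing is much cheaper than compiling, which the sketch does not measure.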

09:05 <lekernel> kristianpaul, I have done tons of SDRAM tests, check the archives

09:12 <wolfspraul> wpwrak: you up?

09:13 <lekernel> wolfspraul, hi

09:13 <wolfspraul> hi

09:13 <lekernel> any prospect regarding when the first boards are shipped?

09:13 <wolfspraul> http://en.qi-hardware.com/wiki/Milkymist_One_run_3_schedule#Test_Results

09:14 <lekernel> yes, there seems to be fully working ones. but how about packaging them and selling them?

09:17 <lekernel> sorry to be insistent, but so many things are depending on that...

09:17 <wolfspraul> it doesn't worry you that 2 boards that worked perfectly stopped working after a little bit of rendering?

09:18 <wolfspraul> it's your brand. you think we can ship products that are known to fail after a few times rendering?

09:19 <wolfspraul> here's the plan: Adam is currently dumping the nor of those two

09:19 <wolfspraul> overall the test results look really good now

09:19 <wolfspraul> but I would like to have at least a theory for what happened on those 2 boards

09:19 <wolfspraul> can you rule out that the bad reset ic we chose causes nor corruption on power down?

09:20 <lekernel> maybe it's just the same thing that happened to the video chips on the RC1 boards Adam reworked and sent me

09:21 <wolfspraul> I have an idea

09:21 <wolfspraul> why don't we just erase all trace of those two boards, 0x4C and 0x7C, from the production and testing plans, and sell the rest as if everything was always perfect?

09:22 <wolfspraul> :-)

09:22 <wolfspraul> kidding...

09:22 <wolfspraul> when Adam is here we ask him about the solder he used

09:22 <wolfspraul> how would that explain a board that first works and then fails?

09:22 <wolfspraul> a whisker - where? which chip?

09:22 <wolfspraul> and it shows up after a few render cycles?

09:23 <wolfspraul> are you trying to find a theory that can explain what we find, or are you trying to find a theory that will allow you to sell the remaining boards with a straight face?

09:23 <wolfspraul> so the best would be if we can come up with a quick test to identify boards that will later fail

09:23 <wolfspraul> the worst would be if we find that the wrong reset ic we have causes nor corruptions

09:24 <wolfspraul> we can also close our eyes really hard and just sell the stuff even though we cannot produce it at a consistent quality

09:24 <wolfspraul> I think that's suicidal for the Milkymist brand in the long run though.

09:24 <wolfspraul> aw: hey Adam :-)

09:24 <wolfspraul> congratulations on finishing the reworks of another 47 boards!

09:25 <lekernel> perfectionism is suicidal too, because you can't get anything done in the end

09:25 <wolfspraul> we have a question for you: which solder are you using for the reworks?

09:25 <aw> wpwrak, have you settled down on your board? ;-)

09:25 <wolfspraul> lekernel: oh that's why I'm asking you, it's your brand. Please think about the test results carefully.

09:25 <wolfspraul> I am every bit aware that perfect is the enemy of good.

09:25 <wolfspraul> but boards that fail after successful rendering worry me, that's all.

09:26 <wolfspraul> in the companies I've worked so far (all Western brand companies), something like this would not ship.

09:26 <wolfspraul> a Chinese company would long have started shipping, of course

09:26 <aw> lead soldering to be used while reworks

09:26 <wolfspraul> lekernel: does that settle your whisker theory?

09:27 <lekernel> aw, and what solder did you use for the two video chips you reworked on rc1?

09:27 <wolfspraul> aw: have you dumped some nor partitions from 0x4C and 0x7C ?

09:27 <lekernel> the ones that failed

09:28 <aw> lekernel, the same lead soldering of currently one i used, it's reel. same as while in rc1

09:28 <aw> 0x4c: http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/bitstream/0x4c-standby1.bit/

09:28 <wolfspraul> aw: seems that Werner is sleeping

09:28 <rejon> yahyah, last night at sharism presents beijing, we projected the milkymist entire time, froze at least 3 times

09:29 <rejon> needed full reboots

09:29 <aw> 0x7c: http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/bitstream/0x7c-standby1.bit/

09:29 <rejon> kind of embarrasing, hope we can make them better

09:29 <lekernel> rejon, do you have L19 shorted?

09:29 <rejon> it was wolfgangs

09:29 <rejon> m1

09:30 <rejon> me wolfgang and xiangfu there

09:30 <wolfspraul> not shorted I think (forgot)

09:30 <aw> xiangfu, can we start to update urjtag now?

09:30 <xiangfu> aw, yes. sure.

09:30 <aw> xiangfu, lead me in instructions, tks. ;-)

09:31 <aw> rejon, hi ;-)

09:31 <rejon> hi

09:32 <xiangfu> wolfspraul, the m1 do connect one camera. maybe that is the reason (L19 shorted)

09:32 <xiangfu> aw, goto your urjtag.git folder and run 'git pull'

09:33 <lekernel> 4c/7c: there is some corruption in those two bitstreams

09:33 <lekernel> are the flash partitions locked?

09:33 <xiangfu> lekernel, no. we are update aw's urjtag now.

09:34 <wolfspraul> ah I'm just downloading. corruptions, hmm.

09:34 <aw> i've never used 'lockflash' while rc3 until now.

09:34 <wolfspraul> ok, so xiangfu and adam will try to work out how to lock the rescue partitions

09:35 <wolfspraul> lekernel: what is your theory on where those corruptions come from?

09:35 <lekernel> http://pastebin.com/iaZvxSmQ

09:36 <wolfspraul> one word zeroed

09:36 <lekernel> maybe wrong power-down ramps, as Werner suggested

09:37 <wolfspraul> what is the chance in your opinion that the power-down ramps cause this zero word?

09:37 <xiangfu> 4c: is only one bit. but 7c is more.

09:37 <lekernel> if that's the case, locking certainly would help restrict the incidence of the problem to the unlocked partitions, which means the board would always be able to boot in rescue mode

09:37 <wolfspraul> from 7 to 0 is 3 bits, no?

09:38 <wolfspraul> it may even make it go away entire if this mostly affects small addresses (IF)

09:38 <xiangfu> oh. sorry. yes 3bits. :(

09:38 <lekernel> and it seems the standby bitstream is affected more often (since when the board fails, it's usually no reconfiguration at all and not other issues) so there's some chance locking would make the problem disappear entirely

09:39 <wolfspraul> I'm willing to accept all sorts of theories and support the product, but I need to do it with a straight face, i.e. after doing my best to understand and mitigate the problem.

09:39 <wolfspraul> ok, so let's get the locking done first

09:40 <wolfspraul> could the writing of zeroes also come from a software bug?

09:41 <lekernel> yes, maybe

09:41 <wolfspraul> that'd be best for me :-)

09:41 <lekernel> actually the whole flash corruption could come from software bugs

09:42 <wolfspraul> until Werner is back and requests some tests, Adam & Xiangfu will get the locking setup

09:42 <wolfspraul> then we lock the rescue partitions on some boards (let's say 10), and do some render cycling

09:42 <wolfspraul> xiangfu: are you on this with Adam?

09:43 <wolfspraul> also make a short script that will let adam lock the partitions of an existing board without reflashing...

09:43 <wolfspraul> lock-only.sh

09:43 <xiangfu> wolfspraul, yes. we are talking

09:43 <wolfspraul> great, thanks

09:43 <xiangfu> wolfspraul, (lock-only.sh) sound good.

09:44 <wolfspraul> lekernel: I think so far all zero words I've seen are at low addresses

09:44 <wolfspraul> even within the 640 KB standby bitstream

09:44 <wolfspraul> but I haven't paid close attention to all cases we had, Werner knows them all

09:47 <wolfspraul> 0x7C is not an entire word. offset 0x1EC: from 44 0C -> 40 00

09:47 <wolfspraul> one bit remaining :-)

09:47 <wolfspraul> a low offset again
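
The bit counting above can be checked mechanically. The sketch below compares the bytes quoted in the log and counts how many bits went from 1 to 0 and how many from 0 to 1; the function name is made up for this example. NOR flash programming can only clear bits (an erase sets them back to 1), so a change consisting purely of cleared bits would be consistent with stray program operations, though it proves nothing by itself.

```python
def cleared_and_set(before: bytes, after: bytes):
    """Return (bits that went 1->0, bits that went 0->1)."""
    if len(before) != len(after):
        raise ValueError("length mismatch")
    cleared = sum(bin(b & ~a & 0xFF).count("1") for b, a in zip(before, after))
    newly_set = sum(bin(~b & a & 0xFF).count("1") for b, a in zip(before, after))
    return cleared, newly_set

# 0x7C, offset 0x1EC: 44 0C -> 40 00
print(cleared_and_set(bytes([0x44, 0x0C]), bytes([0x40, 0x00])))  # (3, 0)
# the "7 to 0" case
print(cleared_and_set(bytes([0x07]), bytes([0x00])))              # (3, 0)
```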

09:48 <wolfspraul> if it's a power ramp down problem, why would only small addresses be affected?

09:49 <wolfspraul> is it easier for a wire to be 0 than to be 1?

09:53 <lekernel> it's also interesting to notice that the two corruption events occurred at different addresses but with very similar content

09:53 <lekernel> hmm, no, actually the whole beginning of the bitstream contains an almost periodic pattern

09:56 <wolfspraul> in the second case one bit remains

09:56 <wolfspraul> 7c: offset 0x1EC from 44 0C -> 40 00

09:56 <wolfspraul> well, all I've seen was at low addresses

09:56 <wolfspraul> so if we are lucky, a locking of the standby bitstream will make the problem go away entirely

09:56 <wolfspraul> although if it's power ramp-down cause, who knows maybe the locking will not work? :-)

09:57 <wolfspraul> lekernel: if it's a ramp-down problem, is there a theory that suggests that low addresses are more likely to get hit than higher ones?

09:57 <wolfspraul> is it more likely that an address line is 0 than 1?

09:57 <lekernel> the power ramp down theory is that underpowering the FPGA while the flash is still running causes the FPGA to put out incorrect signals that are interpreted as valid writes by the flash

09:58 <wolfspraul> it sounds very far fetched to me

09:58 <lekernel> locking makes the accepted write sequence a lot more complex

09:58 <wolfspraul> but I know too little about the signals between fpga and nor and how likely this is to happen

09:58 <wolfspraul> well ok. we definitely try locking.

09:58 <wolfspraul> eventually luck has to be with us

09:58 <lekernel> he

09:59 <lekernel> that flash does receive write commands

09:59 <wolfspraul> if it's a software bug, that's also ok

09:59 <wolfspraul> eventually we'll hunt it down, or at least defuse it first with locking etc.

09:59 <lekernel> the 3.3V supply is correct, so if the flash gets written, then it has received a proper write command

09:59 <lekernel> unless the flash chips are counterfeit/crappy, but you do not think this is true

09:59 <wolfspraul> sounds pretty unlikely to me in an uncontrolled ramp-down

09:59 <wolfspraul> no no

10:00 <wolfspraul> get your mind off of that, that's a mental trap

10:00 <lekernel> what can cause write commands are:

10:01 <lekernel> * incorrect signals during power up/down - the reset IC was supposed to prevent that by holding the reset during those events. it does it during power up, but the power down case is less clear as Werner pointed

10:01 <lekernel> * software bugs

10:01 <lekernel> * FPGA configuration system going mad

10:03 <lekernel> by making the accepted write sequence way more complex, locking would probably rid us of the symptoms of any of those problems

10:03 <wolfspraul> if in addition for whatever reason this happens only on low addresses, we are all set

10:04 <wolfspraul> our users will never experience the downside of the bandaid we use to keep the product working -> perfect solution

10:04 <wolfspraul> if it also happens on higher addresses, we may still decide to ship, because the event is rare and will 'only' trigger the need for a web update

10:05 <wolfspraul> (assuming the rescue path and web update actually work, which I assume now)

10:05 <wolfspraul> basically in 470 render cycles (30 seconds each), we had this happen twice

10:06 <wolfspraul> the numbers are a little low, but it seems to be in about 1 out of 200 render cycles

10:06 <wolfspraul> [numbers low] I mean our statistical data is limited to really say 1/200

10:06 <wolfspraul> but something like that
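
For reference, the point estimate behind "about 1 out of 200" works out as below. With only two events the uncertainty is large, as wolfspraul notes, so this is a rough figure and nothing more.

```python
failures, cycles = 2, 470           # numbers quoted in the log
rate = failures / cycles
print(f"{rate:.4f} failures per cycle, about 1 in {round(1 / rate)}")
```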

10:08 <wolfspraul> xiangfu: can you also update Adam's flterm to the latest version?

10:08 <wolfspraul> let's just get both flterm and urjtag updated

10:08 <xiangfu> wolfspraul, yes. already done that.

10:08 <wolfspraul> perfect

10:08 <aw> xiangfu, thanks for your instructions, now my jtag is new

10:08 <wolfspraul> xiangfu: everything updated?

10:08 <xiangfu> wolfspraul, we just done update. now I finish the small lock_only.sh

10:08 <wolfspraul> wow, great

10:08 <wolfspraul> ok good

10:08 <wolfspraul> aw: here is what I propose

10:09 <wolfspraul> 1. xiangfu writes a little lock_only.sh script that you can use to lock the partitions of already flashed good boards

10:09 <wolfspraul> 2. I think we can reflash 0x4C and 0x7C and see whether they boot again

10:09 <wolfspraul> 3. we pick 10 boards, 0x4C and 0x7C and 8 others, and run lock_only.sh on them

10:10 <wolfspraul> 4. then we do 10 render cycles on those 10 boards

10:10 <GitHub173> [milkymist] sbourdeauducq pushed 1 new commit to master: http://git.io/2_lHRQ

10:10 <GitHub173> [milkymist/master] flterm: add check if c is 0x00 - Xiangfu Liu

10:10 <xiangfu> thanks lekernel

10:10 <wolfspraul> well, that's only 100 render cycles, so maybe not enough

10:10 <wolfspraul> aw: do you think we should reflash 0x4C and 0x7C ?

10:11 <wolfspraul> until Werner is back, I have no reason for any measurements now. just want to reflash them (including locking)

10:11 <wolfspraul> should we do that?

10:11 <aw> wolfspraul, yes, i think before lock flash, we can reflash 0x4c and 0x7c firstly

10:11 <wolfspraul> yes, let's reflash both and see whether they boot to render

10:11 <wolfspraul> first step

10:12 <aw> BUT, we're doing a no-bigger data base even 10-times power-cycle. my question is:

10:13 <wolfspraul> maybe we should buy a programmable power supply :-)

10:13 <wolfspraul> then we still have a problem how to press the middle button automatically

10:13 <wolfspraul> we don't have this now

10:13 <aw> if after this 10 boards with 10 times through lock flash function, say NO err happens, but can we trust us and say this step is safe?

10:13 <wolfspraul> good question

10:13 <wolfspraul> from your tests, it seems we need about 200 cycles for 1 failure

10:14 <wolfspraul> but let's do step by step, not speculate too much

10:14 <aw> yupp...

10:14 <wolfspraul> let's reflash 4C and 7C and see whether they boot

10:14 <xiangfu> aw, let me test first. ..

10:14 <wolfspraul> then we lock

10:14 <aw> i quite don't think that we should pick 10 boards firstly

10:14 <wolfspraul> then we think :-)

10:14 <wolfspraul> agree

10:14 <wolfspraul> first step: reflash 4C and 7C, see whether they boot

10:15 <aw> how about we just use 0x4c and 0x7c to do individually 100-times tests after reflash and lock?

10:15 <wolfspraul> yes, why not. good idea.

10:15 <aw> that's total 200 times
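
aw's question about how much a clean run proves can be made a little more concrete. Assuming each render cycle fails independently with the 1/200 probability guessed above (both are assumptions, not measurements), the chance of seeing no failure at all even if nothing were fixed is:

```python
def p_no_failure(p_fail: float, cycles: int) -> float:
    """Probability of zero failures in `cycles` independent trials."""
    return (1.0 - p_fail) ** cycles

for n in (100, 200, 600):
    print(n, round(p_no_failure(1 / 200, n), 3))
```

Under these assumptions, 100 clean cycles would happen by luck roughly 61% of the time and 200 clean cycles roughly 37% of the time, so a clean 200-cycle run is encouraging but far from conclusive.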

10:15 <wolfspraul> but let's reflash first and see whether they boot :-)

10:15 <aw> yup

10:15 <wolfspraul> I have seen too many surprises, don't want to speculate too much.

10:15 <GitHub130> [scripts] xiangfu pushed 2 new commits to master: http://git.io/V6b2WA

10:15 <GitHub130> [scripts/master] compile-lm32-rtems: add clean-rtems for easy re-build rtems - Xiangfu Liu

10:15 <GitHub130> [scripts/master] scripts: lockflash only script file - Xiangfu Liu

10:15 <aw> sorry that we do this firstly even if Werner say later we were wrong

10:15 <wolfspraul> then we just speculate speculate, and then the test results don't come out as expected -> time wasted speculating :-)

10:15 <aw> ;-)

10:16 <wolfspraul> well

10:16 <wolfspraul> we should move forward

10:16 <wolfspraul> it cannot be so totally wrong :-)

10:16 <wolfspraul> btw, I am online for about 1h, then I need to go to some club opening to demo m1

10:16 <wolfspraul> so if I'm offline later, just fyi

10:16 <aw> i meant that missed some good chance to find...well

10:17 <wolfspraul> no I don't think so

10:17 <wolfspraul> really - no worries

10:17 <xiangfu> aw, BTW: you can put this file 'http://downloads.qi-hardware.com/people/xiangfu/tmp/72-qi-hardware.rules' under your '/etc/udev/rules.d' and change the GROUP to 'adam' then you don't needs 'sudo' on nanonote and milkymist one.

10:17 <aw> okay...once xiangfu send me that. I'll do it

10:17 <wolfspraul> I think you can reflash already, 4C and 7C

10:17 <xiangfu> aw, 'wget https://raw.github.com/milkymist/scripts/master/scripts/lockflash_only_m1_rc3.sh'

10:17 <wolfspraul> xiangfu: from now on, Adam should always automatically lock after flashing

10:18 <wolfspraul> so Adam's reflash_m1.sh should have the locking commands enabled by default

10:18 <xiangfu> put this file under your '2011-07-13/for-rc3' same folder of 'reflash_m1.sh'

10:19 <wolfspraul> xiangfu: does Adam's reflash_m1.sh always lock by default now?

10:19 <GitHub86> [scripts] xiangfu pushed 1 new commit to master: http://git.io/Ey0mog

10:19 <GitHub86> [scripts/master] scripts: reflash_m1_rc3.sh bump version and enable lockflash - Xiangfu Liu

10:19 <xiangfu> wolfspraul, not yet.

10:20 <wolfspraul> please let's enable locking by default

10:20 <wolfspraul> we move full power to locking now, always lock

10:21 <xiangfu> aw, after you download lockflash_only_m1_rc3.sh, you can update your reflash_m1.sh by download this file: https://raw.github.com/milkymist/scripts/master/scripts/reflash_m1_rc3.sh

10:21 <xiangfu> aw, and overwrite your local version

10:21 <wolfspraul> xiangfu: I even think the reflash_m1.sh original should enable locking by default

10:22 <wolfspraul> it almost becomes part of the m1 design/architecture :-)

10:22 <wolfspraul> locking only the standby and rescue partitions, but that should be enabled by default

10:22 <aw> xiangfu, okay

10:22 <wolfspraul> imho

10:22 <xiangfu> wolfspraul, yes. agree.

10:25 <aw> xiangfu, the difference between 'lockflash_only_m1_rc3.sh' and 'reflash_m1_rc3.sh' is just one for lock the other is for reflash too?

10:26 <xiangfu> aw, have you update your local version reflash_m1.sh?

10:27 <aw> not yet...change now...my one line cmd is that with log function you gave me before. ;-)

10:28 <aw> xiangfu, i.e.: ./reflash_m1_rc3.sh $1 $2 2>&1 | tee -a log/urjtag_$2.log

10:29 <xiangfu> aw, ok

10:29 <xiangfu> you better delete old reflash_m1.sh . for don't confuse.

10:29 <wolfspraul> disconnected

10:30 <wolfspraul> xiangfu: why do we have a separate reflash_m1_rc3.sh ? can we have just one m1 reflash script?

10:30 <xiangfu> wolfspraul, no. it's just the name in my repo.

10:30 <aw> wolfspraul, no need though

10:31 <aw> wolfspraul, sometimes is managed on my site i think...

10:31 <aw> xiangfu, btw, i rename log file name as: ./reflash_m1_rc3.sh $1 $2 2>&1 | tee -a log/urjtag_lock_$2.log

10:33 <aw> alright..now to reflash/lock those two.

10:34 <wolfspraul> yes good

10:35 <wolfspraul> xiangfu: name? don't understand. well. the name says _rc3 and that is hopefully temporary. there should be only one m1 reflash script.

10:35 <wolfspraul> if we need multiple variants, there should be options (command line parameters)

10:36 <wolfspraul> I didn't even look inside the script, just saying from the name - this will cause confusion, guaranteed.

10:36 <xiangfu> wolfspraul, yes. I know. just don't have time merge them. we have 'snapshots' 'updates' different URL and different way to generate bios.bin file.

10:37 <wolfspraul> so there should be only 1 script

10:37 <wolfspraul> the script should have a version number right at the beginning in some variable, maybe just the date it was last edited

10:37 <wolfspraul> so when someone has the script locally, they can quickly check whether they have the latest version

10:37 <xiangfu> maybe I can do that this weekend :)

10:37 <wolfspraul> ok

10:37 <xiangfu> wolfspraul, (version) yes. should be already in adam's log file

10:38 <wpwrak> good morning ! :) catching up and replenishing my caffeine store

10:39 <wolfspraul> well I'm sure there are reasons for the different scripts, it's all work.

10:39 <wolfspraul> just remember to fix it at some point (merge) - this will GUARANTEED create confusions

10:39 <wolfspraul> even among ourselves :-)

10:39 <wolfspraul> you will see :-)

10:39 <wolfspraul> so if we don't merge them, we pay the price in a different way

10:39 <wolfspraul> but sort it in with your other priorities, you have overview...

10:39 <wolfspraul> he

10:39 <wolfspraul> I'm already with the first evening beer :-)

10:39 <wolfspraul> gotta get ready for the club opening...

10:40 <wolfspraul> wpwrak: have you seen any nor corruptions at higher addresses?

10:41 <wolfspraul> (after you caught up...)

10:45 <aw> 0x4c reflash and lock okay, 0x7c is not...wait..upload log...

10:45 <xiangfu> after cleanup the reflash_m1.sh will send email to list. I am already lazy on this task :)

10:46 <wolfspraul> "7c is not" - bah

10:46 <wolfspraul> :-)

10:50 <wolfspraul> wpwrak: what's your take on the new 4C and 7C findings?

10:50 <wolfspraul> curious about the log update and why 7C did not reflash...

10:50 <wolfspraul> we are hoping that locking will safely eliminate this problem

10:51 <aw> 0x4c: http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/log/urjtag_lock_4C.log

10:51 <aw> 0x7c: http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/log/urjtag_lock_7C.log

10:53 <wolfspraul> oh, unknown stepping

10:53 <aw> 0x7c: i was thought my usb cable not contacted well , so i reflash twice. :) but first time seems already stands for.

10:53 <wolfspraul> no no

10:53 <aw> unknown stepping again?

10:54 <aw> oah..yes, just saw it

10:55 <aw> so i need to edit again?: Added "0011<tab>xc6slx45<tab>3" in (/usr/local/share/urjtag/xilinx/xc6slx45/STEPPINGS) file.

10:56 <wolfspraul> it's just the stepping, from the urjtag update probably

10:56 <wolfspraul> we need to update the 011 stepping into one file, remember?

10:56 <wolfspraul> xiangfu: can you get that patch sent upstream?

10:56 <wolfspraul> which file was it again? (searching mmlogs...)

10:56 <wolfspraul> probably overwritten from the update

10:56 <wolfspraul> you have to edit /usr/local/share/urjtag/xilinx/xc6slx45/STEPPINGS

10:56 <wolfspraul> xiangfu: we need to get this sent upstream

10:56 <wolfspraul> yes I think so

10:56 <wolfspraul> edit it, then try to flash/lock again
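
Until the stepping is accepted upstream, the manual edit is easy to forget after each urjtag update. One way to make it repeatable is a small helper that appends the line only if it is missing, sketched below. The helper name is made up; the path and the line are the ones quoted in the log, and writing under `/usr/local/share` will normally need root.

```python
from pathlib import Path

def ensure_line(path, line: str) -> bool:
    """Append `line` to the file unless an identical line already exists.
    Returns True if the file was modified."""
    p = Path(path)
    text = p.read_text() if p.exists() else ""
    if line in text.splitlines():
        return False
    # Make sure the new entry starts on its own line.
    sep = "" if text == "" or text.endswith("\n") else "\n"
    with p.open("a") as f:
        f.write(sep + line + "\n")
    return True

# Intended use (not run here):
# ensure_line("/usr/local/share/urjtag/xilinx/xc6slx45/STEPPINGS",
#             "0011\txc6slx45\t3")
```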

10:56 <aw> hm...i see

10:57 <xiangfu> wolfspraul, (patch upsteram) ok.

10:58 <aw> good...reflashing... ;-)

10:59 <wolfspraul> otherwise we need to remember to edit the file, which you see we can easily forget...

10:59 <wolfspraul> so the time is better spent getting this 1-line patch upstream

11:03 <xiangfu> lekernel, what is the 'Added "0011<tab>xc6slx45<tab>3" ' do exactly. I am writing a commit log

11:03 <wolfspraul> added another Spartan-6 stepping

11:03 <lekernel> just make the latest xilinx silicon stepping recognized

11:04 <aw> done...let's test 100 times of boot to rendering or just reconfiguration only to verify if 'lockflash' really work?

11:04 <wolfspraul> wait

11:05 <wolfspraul> so now both 4C and 7C are reflashed, and they both can render?

11:05 <wolfspraul> let's first boot to render once, so we know they work

11:05 <wolfspraul> if that's the case, yes, I agree. let's do 100 cycles with each.

11:05 <wolfspraul> actually 100 may not be enough, our data suggests more like 200 each. phew.

11:06 <wpwrak> 0x4c: http://pastebin.com/9QDs7B4D

11:06 <wolfspraul> sorry about that, we have no automation now!

11:06 <wolfspraul> how long does this take?

11:06 <aw> 7C rendered

11:06 <wolfspraul> 90 seconds each test

11:06 <wolfspraul> 45 seconds boot, 30 seconds render, some more for the cycling

11:06 <wpwrak> 0x7c: http://pastebin.com/w4mCvbRT

11:06 <wolfspraul> 90*200=18,000 seconds = 4-5 hours

11:06 <wolfspraul> argh
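
The time estimate is easy to recompute for other cycle counts. The 90 seconds per cycle is wolfspraul's figure from above; the function name is just for this example.

```python
def total_hours(seconds_per_cycle: int, cycles: int) -> float:
    """Wall-clock hours for a number of manual power cycles."""
    return seconds_per_cycle * cycles / 3600

print(total_hours(90, 200))   # 5.0
print(total_hours(90, 100))   # 2.5
```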

11:07 <wpwrak> both have a single-word corruption. so a reflash should fix them.

11:07 <xiangfu> UrJtag 0011<tab>xc6slx45<tab>3 sent out

11:07 <wolfspraul> aw: let's do 100 each first

11:07 <wolfspraul> sorry that we don't have this better automated right now

11:07 <aw> 4C rendered

11:08 <wolfspraul> wpwrak: do you have any other ideas? do you agree with the approach to reflash 4c/7C (already done), and then 100 thirty-second render cycles on each?

11:10 <xiangfu> aw, you have to unplug the power cable for reboot right?

11:10 <aw> xiangfu, yes

11:11 <wolfspraul> xiangfu: some automation thoughts

11:11 <xiangfu> aw, ok. there is a command can reboot m1 in 'flterm' but anyway we can not use that command in our case

11:11 <wolfspraul> first - we are not sure which exact sequence triggers the problem

11:11 <wolfspraul> for example whether a soft-reboot is enough

11:11 <wolfspraul> so to be safe, we do a cold power cycling right now (unplug dc jack)

11:11 <wolfspraul> simply because that's how we always tested so far

11:11 <wpwrak> a bias towards small numbers is common in real life. so that may not mean much. particularly if it's a sw bug :)

11:12 <xiangfu> wolfspraul, yes.

11:12 <wolfspraul> we don't really have any comparison data for cutting power at the mains, or for soft reboots

11:12 <aw> i think that no way that I have to simulate a real power on and off action. ;-)

11:12 <wolfspraul> aw: we know too little now

11:12 <wolfspraul> and we just want to start selling :-)

11:12 <wolfspraul> so it's difficult

11:12 <wolfspraul> we need your help in manual testing, because that's how we tested so far

11:12 <wolfspraul> and we cannot get a better automation understood and setup fast

11:12 <wolfspraul> xiangfu: the next problem is the middle button, which needs to be pressed

11:13 <wolfspraul> in the future we would use programmable power supplies, but they can only simulate certain types of power cycling

11:13 <wolfspraul> they cannot simulate the user unplugging the DC jack with his hands (potentially even causing effects simply from touching the metal...)

11:13 <aw> wolfspraul, ha...sorry that you would misunderstand my last sentence. sorry. i meant that I have to manually power on and off to simulate. ;-)

11:13 <wolfspraul> and we will always run into the middle button press as well

11:14 <wolfspraul> well, we will try to improve some of those things, but it will take time

11:14 <aw> no complain at all. ;-)

11:14 <wolfspraul> wpwrak: ok, so you are good with the 2*100 cycles test?

11:14 <wpwrak> (making the sequence more complex) the unlocking would also be an uncommon code path. so if it's a sw bug of just using the wrong address somewhere, you'd never hit this

11:16 <wolfspraul> let's just see what we get

11:16 <wolfspraul> then we move from there

11:16 <wolfspraul> I gotta run to the club...

11:16 <wolfspraul> aw: see you tomorrow or Monday. I think we are close :-)

11:16 <wolfspraul> thanks for all the hard work!

11:16 <wolfspraul> l8... will read the backlog...

11:16 <wolfspraul> good luck!

11:16 <aw> wait...so

11:16 <aw> so agree to test 200 times?

11:16 <wolfspraul> I do

11:16 <aw> wpwrak, agreed?

11:17 <wolfspraul> then you just follow what Werner agrees with too :-)

11:17 <aw> he...okay ;-) you go firstly. ;-)

11:17 <wolfspraul> plus you will probably need dinner at some time first :-) and it's Friday evening!

11:17 <wolfspraul> we are close I think

11:17 <wolfspraul> maybe the locking is the final nail

11:17 <wolfspraul> I certainly hope so

11:17 <wpwrak> reflashing 0x4c and 0x7c sounds good to me. the single-word corruption we've already seen a few times doesn't look related to what happens in 0x3c/0x77. which is good news. it means that no new boards have joined the "something very very wrong but we don't quite know what" cluster.

11:17 <wolfspraul> but we have to see the real data, what can we do

11:18 <wolfspraul> wpwrak: yes

11:18 <wolfspraul> and the addresses are all small, even in the 640 kb block we look at

11:18 <wolfspraul> so there's a good chance wherever this comes from, it will never hit anything past the standby bitstream

11:18 <wolfspraul> all wishful thinking of course...

11:18 <aw> alright...so after dinner. I'll go for test 200 times.

11:18 <wolfspraul> great

11:18 <wolfspraul> and I will read the backlog later :-)

11:18 <wolfspraul> aw: THANK YOU!

11:19 <wolfspraul> thanks so much for the great energy and passion

11:19 <wolfspraul> almost there!

11:19 <aw> alright...no problems, i ought to.

11:19 <wpwrak> the single NOR word corruption cluster may be: 1) fixed 100% by locking (unlikely, imho); 2) fixable in the field; 3) not fixable in the field but with a not too hard recovery path, so people can work around the issue; 4) point to a NOR defect (unlikely, imho)

11:20 <wolfspraul> I think locking stands a good chance

11:20 <wolfspraul> unless the problem just bypasses locking entirely

11:20 <wpwrak> (programmable power supply, middle button) i'm on it .. :) http://projects.qi-hardware.com/index.php/p/wernermisc/source/tree/master/labsw/

11:20 <wolfspraul> what do you mean with "fixable in the field"?

11:21 <wolfspraul> anyway gotta run

11:21 <wolfspraul> backlog

11:21 <wolfspraul> l8

11:21 <lekernel> wpwrak, how can locking not fix the standby bitstream problem?

11:27 <wpwrak> if it's a bad NOR cell (and not a rogue write), it may still lose data later

11:27 <wpwrak> also, we may hit other addresses, which could still render the M1 unusable (that is, without human intervention)

11:28 <wpwrak> i'm thinking of the VJ at club scenario: you plug it in and it doesn't start flickernoise, or comes up with a friendly message telling you to fix your bitstream or whatever. the crowd cheers, the VJ gets nervous :)

11:29 <lekernel> rendering is possible in rescue mode

11:29 <wpwrak> now, if we can properly protect standby and recovery, which i hope and expect we can, it's not insanely difficult to bring the system back to life after such a mishap

11:30 <wpwrak> you could still lack new FN features, or your patches themselves may get corrupted

11:31 <wpwrak> but yes, we can make recovery from NOR trouble relatively benign, even if it's not possible to prevent it from occurring in the first place

11:32 <wpwrak> also, the users could simply be instructed to plan to have a few minutes before the show to deal with any potential NOR problem. plus, don't power cycle during the show.

11:32 <wpwrak> not nice, but it would reduce the impact of the issue further

11:33 <wpwrak> now, for testing what's really going on. i'd suggest to do the current power cycling test at least 1000 times and until the corruption has happened at least 10 times, i.e., whichever comes last.

11:34 <lekernel> can you name a single technology device those days that has none of such problems?

11:34 <wpwrak> each time a corruption is found, record the location of the corruption and fix, then continue

11:34 <lekernel> even those overrated apple macbooks suffer display problems because of poor BGA soldering

11:35 <lekernel> even with all the money and resources apple has, they failed to fix it in the first place

11:35 <scrts2> power cycling at least 1000 times... :D

11:35 <wpwrak> after this, do the same test, but with a soft reset. that avoids the power drop. if the corruptions magically go away, we know it's a power up/down issue. if they don't, it's software, FPGA logic, NOR itself, EMI, etc.
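
The stop rule wpwrak proposes (at least 1000 cycles and at least 10 corruptions, whichever comes last, logging and fixing each corruption) could be sketched as below. Everything here is hypothetical scaffolding: `cycle` stands for whatever performs one power cycle plus check on a real board, and `make_fake_board` is only a stand-in so the loop can be run without hardware.

```python
from typing import Callable, List, Optional, Tuple


def run_campaign(
    cycle: Callable[[], Optional[int]],
    min_cycles: int = 1000,
    min_corruptions: int = 10,
    max_cycles: int = 1_000_000,
) -> Tuple[int, List[Tuple[int, int]]]:
    """Run until BOTH minimums are met ("whichever comes last").

    cycle() does one power cycle and returns the corrupted NOR address,
    or None if the board came up clean. max_cycles is a safety cap added
    here so the loop ends even if no corruption ever shows up.
    """
    log: List[Tuple[int, int]] = []
    n = 0
    while (n < min_cycles or len(log) < min_corruptions) and n < max_cycles:
        n += 1
        addr = cycle()
        if addr is not None:
            log.append((n, addr))  # record where it happened
            # a real rig would reflash ("fix") here before continuing
    return n, log


def make_fake_board(period: int) -> Callable[[], Optional[int]]:
    """Hypothetical board that corrupts one word every `period` cycles."""
    state = {"n": 0}

    def cycle() -> Optional[int]:
        state["n"] += 1
        return 0x1000 if state["n"] % period == 0 else None

    return cycle


if __name__ == "__main__":
    cycles, log = run_campaign(make_fake_board(200))
    print(cycles, len(log))
```

The same loop works unchanged for the soft-reset comparison: only the `cycle` callable differs, which keeps the two runs comparable.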

11:35 <scrts2> I wonder who wouldn't bother doing this

11:35 <lekernel> so i'm more than willing to accept a little incidence of NOR corruption in unlocked partitions here

11:36 <wpwrak> scrts: you're saying aw will run screaming to the other end of taiwan when he reads this ? ;-)

11:37 <lekernel> of course we should fix it, but we should balance it against the massive delays a perfect solution would cause

11:37 <wpwrak> lekernel: some products do in fact much worse, e.g., recently, it was in the news that Intel SSDs are losing data quite predictably. they fixed one path via a firmware upgrade and are still guessing about another one. that much about the power of big corps :)

11:38 <wpwrak> my hope is that it won't take all that long

11:38 <wpwrak> if it's a general problem, each of us should be able to reproduce it

11:39 <wpwrak> so the question is simply who manages to automate the test first :)

11:39 <wpwrak> btw, any magic key combination to switch rendering to 1024x768 ?

11:45 <wpwrak> btw2, it may be cool to have some patch that has a camera reaction in an augmented reality way. e.g., show the camera input; overlay it with white blocks in some area; sample the camera image "behind" these white blocks; if there's a sudden brightness/color change of a large number of pixels, let the block "explode"

11:46 <wpwrak> that may motivate people to experiment with interactive effects, which i think could be very cool. alas, if you don't show the way in a simple example as the one i've described, it will take much much longer before someone gets motivated enough to try.

11:46 <scrts2> I did not read the problem, but I suppose the device hangs up?

11:57 <lekernel> nah, rendering in 1024x768 is only supported on git head with the demo firmware (not FN)

11:58 <lekernel> it's slow too (~7-12 fps)

11:59 <lekernel> and buggy

12:13 <wpwrak> lekernel: (1024x768) :-( any hope to be able to get it to work ? your earlier experiments sounded encouraging

12:14 <lekernel> maybe by doubling the SDRAM frequency

12:15 <wpwrak> ah, and can midi control adjust audio sensitivity and maybe camera brightness ? these two often seem to need some tweaking

12:15 <wpwrak> (double sdram) sounds scary :)

12:15 <lekernel> there are already Fx keys to adjust camera brightness and contrast

12:15 <wpwrak> oh, cool

12:16 <lekernel> (sdram) yeah, i'll probably feel motivated to do that if/when this project becomes popular

12:16 <wpwrak> (sdram) nice :)

12:19 <wpwrak> btw, i think a tutorial mode would be nice. the current default of going as quickly as possible into "show" mode doesn't really seem to fit what most people will expect. e.g., first you want to explore, getting all the feedback and guidance you can. only once you're familiar with the system, you'd turn off those things. of course, someone would have to program this ...

12:19 <wpwrak> (at least it's not scary verilog ;-)

12:38 <wpwrak> aw: how did the 100 cycles go ? ;)

12:38 <wpwrak> or was it 200 ? :)

12:39 <aw> wpwrak, hi sorry, i just started .;-)

12:40 <kristianpaul> lekernel: cross talk?

12:41 <lekernel> yes

12:50 <adamw_> 0x4c: 10th power-cycle pass

12:50 <wpwrak> "Shift" by Geiss is really cool

12:52 <adamw_> wpwrak, i bought a relay card with Christopher in om to do tons of tests via auto tests with programmable power supply and multimeters (GPIB)

12:53 <adamw_> with that way can verify many things. ;-)

12:59 <wpwrak> adamw_: you still have them ?

13:00 <wpwrak> adamw_: oh, and what multimeter do you have ?

13:04 <wpwrak> heh, conduirebourre ;-) best camera effect, i think

13:04 <adamw_> at that time we used Keithley 2303 and Agilent 34401A, 16 channels relay card through GPIB

13:05 <wpwrak> adamw_: and what do you have now ?

13:05 <adamw_> wpwrak, now i have 34401A

13:05 <adamw_> no programmable power supply. :(

13:05 <wpwrak> ah, okay. do you have GPIB to the PC ?

13:06 <adamw_> need buy one. ;-)

13:06 <adamw_> so you want me to capture NOR corruption as it happens while auto measure current. ;-)

13:08 <adamw_> well...hope we don't do this. then solve, but as a lab site with auto equipments is good. ;-)

13:08 <wpwrak> naw, just thinking ahead

13:08 <wpwrak> yes, automation is good. very good :)

13:08 <adamw_> we probably will go for this auto... ;-)

13:09 <adamw_> 20th

13:11 <adamw_> i even do think 100 times is not enough. ;-) you know that we can't five up any reasons caused especially that it's not a probability distribution.

13:12 <wpwrak> yeah, my guess would be more like 1000

13:12 <adamw_> the single NOR word corruption cluster may be: 1) fixed 100% by locking (unlikely, imho); 2) fixable in the field; 3) not fixable in the field but with a not too hard recovery path, so people can work around the issue; 4) point to a NOR defect (unlikely, imho)

13:12 <adamw_> you just posted those four candidates. ;-)

13:14 <adamw_> s/five/give

13:15 <wpwrak> yup. by the way, do you run the CRC check or just see if standby loads ?

13:15 <adamw_> process of boot to rendering with power-cycle

13:15 <adamw_> NO CRC check

13:16 <adamw_> that'd be long period...;-)

13:16 <wpwrak> heh ;-)

13:17 <adamw_> wpwrak, btw, how do you think that boards were failed in CRC test?

13:18 <adamw_> wpwrak, since one board I caught it and re-performed CRC test without power off then just pass, how to explain this?

13:19 <adamw_> that was 0x85: got "flickernoise.fbi(rescue)(CRC)CRC failed(expected aa12a56a, got b0c6b06d)" and "splash.raw CRC failed(expected 978f860c, got 33d3152a)" while using test program 10. keep performing CRC test again, then pass without power-cycle. 11. rendering and CRC test pass
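
A failure that goes away on a re-read without a power cycle, as on 0x85, points at the read path rather than the stored data. A minimal sketch of that distinction follows. It assumes a plain CRC-32 as computed by Python's `zlib.crc32`; the algorithm the actual test program uses is not stated in this log, and `read_image` is a hypothetical callable that returns the image bytes as read from NOR.

```python
import zlib
from typing import Callable


def crc32_of(data: bytes) -> int:
    """CRC-32 of data as an unsigned 32-bit value."""
    return zlib.crc32(data) & 0xFFFFFFFF


def classify(read_image: Callable[[], bytes], expected_crc: int, attempts: int = 3) -> str:
    """Read the image several times and compare each CRC to the expected one.

    "ok"         : every read matched
    "transient"  : some reads matched, some did not (read path suspect)
    "persistent" : no read matched (stored data suspect)
    """
    results = [crc32_of(read_image()) == expected_crc for _ in range(attempts)]
    if all(results):
        return "ok"
    if any(results):
        return "transient"
    return "persistent"


if __name__ == "__main__":
    good = b"123456789"
    # simulate a board whose first read is bad and later reads are fine
    reads = iter([b"12345678X", good, good])
    print(classify(lambda: next(reads), crc32_of(good)))
```

In these terms 0x85 would be a "transient" case, which fits the NOR bus explanation better than a corrupted cell.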

13:24 <adamw_> 30th

13:37 <wpwrak> hmm, 0x85 sounds like one of those NOR bus problem boards then

13:37 <wpwrak> may be similar to 0x3c and 0x77. or maybe the NOR bus problems (without the "pulses" on PROGRAM_B) are something else

13:38 <adamw_> 40th

13:43 <wpwrak> this is a touch one

13:44 <wpwrak> s/touch/tough/

13:51 <wpwrak> lekernel: your USB stack can't be all that bad - it managed to find the first device (the keyboard) in this little mess: http://pastebin.com/p1ymfXL7

13:52 <lekernel> what is sad here is you need to go through all that crap just to receive stupid keystrokes

13:52 <lekernel> die USB, die

13:53 <wpwrak> lekernel: alas, it didn't find the mouse. otherwise, this little gem would work 100%: http://blog.brightpointuk.co.uk/riitek-rii-mini-wireless-keyboard-mouse-laser-pointer-combo

13:53 <lekernel> and it wouldn't even tell you the keyboard layout

13:53 <wpwrak> lekernel: it'll outlive both of us ;-)

13:53 <wpwrak> telling the keyboard layout would spoil the sense of mystery and adventure ;-)

14:18 <adamw_> 70th

15:10 <adamw_> 0x4c: 100th boot to rendering done: http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/4C-lock-results

15:11 <adamw_> i gotta go and 0x7c will be the next one. cool. ;-)

15:16 <wpwrak> grr, vanished

15:17 <wpwrak> would have been nice to get a CRC check at the end

15:57 <wolfspraul> wpwrak: ok, 100 tests on 0x4C succeeded - good sign

15:57 <wolfspraul> until I see evidence against it, I am assuming/hoping the locking fixes the bug ;-)

16:10 <lekernel> it rather fixes the symptom, but that's good enough for now

17:12 <wpwrak> ah, that was with locking ?

17:25 <lekernel> er... hopefully

17:26 <wpwrak> yeah ;-)

19:24 <wpwrak> hmm, there seems to be another issue with external connections. connected line in to my stereo (had used the battery-powered kaossilator before). then it stopped responding to audio. even when i connected back to the kaos.

19:25 <lekernel> this totally sucks

19:27 <wpwrak> power-cycled. everything okay again. connected stereo again. M1 froze (wouldn't get to the desktop with a mouse click)

19:27 <lekernel> there's another FB between analog and digital ground, maybe that's the same problem as on the video in

19:27 <wpwrak> power-cycled. still no reaction to the stereo.

19:27 <wpwrak> hehe ;-)

19:27 <lekernel> FYI, audio chip failure when rendering would freeze the software

19:28 <wpwrak> went back to kaossilator. audio dead. power-cycling ...

19:29 <lekernel> those run3 boards are the worst disaster that ever happened in this project

19:29 <wpwrak> one issue that quite clearly exists in M1 is that it combines a lot of different grounds. and you can't quite know at what potential they are.

19:30 <wpwrak> well, i think it's also seeing more intensive testing now. so it's normal that more critters come out. we turn more stones ;-)

19:30 <wpwrak> audio back to normal after power-cycling

19:31 <lekernel> phew...

19:31 <wpwrak> at least it seems i can paralyze audio quite reliably :)

19:31 <lekernel> try shorting L3 ...

19:32 <wpwrak> i'm kinda curious what exactly my stereo sends out there

19:34 <lekernel> the wm9707 datasheet says avss/dvss voltage should be max +/- 0.3V

19:34 <lekernel> it could easily be exceeded by transients across L3 ...

19:34 <lekernel> yay, smells like even more rework delays

19:35 <lekernel> (and of course, the problem never manifested itself with the lm4550 nor on my wm9707 test board ...)

19:36 <wpwrak> maybe your signal sources have better/different grounding

19:38 <wpwrak> i'm also a little suspicious about DMX. those expensive USB-DMX dongles all seem to have galvanic isolation. that's probably not just because it sounds cool ...

19:40 <wpwrak> and DMX seems a particularly good candidate for potential differences because the devices will be far away from the DJ desk, probably connecting to very different points in the mains wiring

19:40 <wpwrak> (well, that's my layman's suspicion. i didn't know DMX even existed before i saw it in the M1 schematics, so maybe i'm all wrong :)

19:41 <wpwrak> anyway, let's see what's up with the audio

19:43 <lekernel> I haven't had any DMX issue so far, but it seems to be a persistent and inconvenient pattern that all problems happen on other people's boards

19:53 <lekernel> otoh they make expensive DMX isolation devices http://www.fullcompass.com/product/303310.html

19:53 <lekernel> which suggests there are also non isolated devices out there

19:55 <wpwrak> grmbl. all i see of my audio signals is some 100 Hz noise. very weird.

19:58 <wpwrak> (also from the kaossilator. something's clearly wrong with my measurement ... let's try a different cable)

19:58 <lekernel> I'd bet this is the same weird problem there was with the video input...

19:59 <lekernel> it makes sense that it doesn't happen from the battery powered device but happens from the mains powered one

20:02 <wpwrak> different cable seems to work better (or maybe it was setting the probe to X1 - dunno how that output driver works)

20:02 <wpwrak> kaossilator is indeed ~+/- 0.3 V

20:09 <wpwrak> about 1.3 Vpp on "tape out" on my stereo. and it's not active when playing from line in. that much about the pass through i wanted to try.

20:09 <wpwrak> what does the audio chip spec say about 1.3 Vpp ? deadly ? or just clipping ?

20:11 <wpwrak> hmm, well beyond the absolute maximum ratings

20:13 <lekernel> ?

20:13 <wpwrak> ah, but you have 1:2 divider

20:13 <lekernel> you have DC?

20:13 <lekernel> i'm talking about digital to analog ground potentials

20:13 <lekernel> across L3

20:13 <wpwrak> naw, shouldn't have DC. in any case, you're blocking DC.

20:14 <lekernel> not the voltage between the ground of your cable and its signal

20:14 <wpwrak> at the moment i'm checking the audio signal that comes out of my system. if it's acceptable for the M1.

20:14 <lekernel> it's also the voltage between the ground of your cable and the digital ground of the M1

20:14 <lekernel> it can develop across L3

20:14 <wpwrak> looks good. so ground is the next step.

20:14 <lekernel> and the maximum voltage is +/- 0.3V

20:14 <wpwrak> yes yes, i see that it looks quite like the video

20:15 <wpwrak> 1.3 Vpp measures is probably okay. that's with a few mV of noise, and you have a 1:2 divider. so it'll be around 0.6 Vpp

20:16 <wpwrak> s/measures/measured/

20:17 <wpwrak> wonders what "normal" line in/out levels really are

20:19 <wpwrak> i'm running a scope calibration to get rid of the little DC offset it shows

20:19 <wpwrak> afk for a bit

21:22 <wpwrak> and ~30-60 min more afk fun, and then i'll be back to the M1

21:31 <kristianpaul> why rc3 worst run? more hardware more issues pop, that is not a hidden secret i guess

22:15 <wpwrak> back

22:15 <wpwrak> lekernel: you'll like this: the ~1.2 Vpp you have on line in may be insufficient: http://en.wikipedia.org/wiki/Line_level

22:17 <lekernel> just crank up the volume, this is a totally trivial issue

22:18 <lekernel> kristianpaul, easy to say for you

22:18 <wpwrak> lekernel: even more so if we consider that the "normal" level wolfson consider seems to be around only +/- 100 mV, so 0.2 Vpp

22:18 <wpwrak> lekernel: no, i mean the voltage the M1 input is designed to handle

22:19 <wpwrak> lekernel: the codec does up to 0.6 Vpp (absolute maximum ratings), you have an 1:2 divider, so you get 1.2 Vpp for the input signal

22:19 <wpwrak> lekernel: (probably already with distortions, etc., but that may not matter so much)

22:21 <wpwrak> lekernel: but it seems that "LINE" levels you may encounter can go up to about 1.8-2.2 Vpp, particularly with "professional" equipment

22:22 <wpwrak> my sony, with ~1.3 Vpp would be high for a consumer electronics device, but still well below "professional equipment"
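
The headroom argument above is simple divider arithmetic and can be sketched as follows. The figures are taken at face value from the chat (0.6 Vpp limit at the codec pin, 1:2 input divider, 1.3 Vpp measured on the Sony, up to about 2.2 Vpp for "professional" line level); the function names are made up for illustration, and the 0.6 Vpp limit itself is called into question later in this log.

```python
def pin_vpp(input_vpp: float, divider_ratio: float = 2.0) -> float:
    """Peak-to-peak voltage at the codec pin after a 1:divider_ratio divider."""
    return input_vpp / divider_ratio


def max_input_vpp(pin_limit_vpp: float = 0.6, divider_ratio: float = 2.0) -> float:
    """Largest line-in swing that keeps the codec pin within pin_limit_vpp."""
    return pin_limit_vpp * divider_ratio


if __name__ == "__main__":
    limit = max_input_vpp()
    for name, vpp in [("sony tape out", 1.3), ("professional line level", 2.2)]:
        at_pin = pin_vpp(vpp)
        verdict = "within" if vpp <= limit else "over"
        print(f"{name}: {vpp} Vpp in, {at_pin:.2f} Vpp at pin, {verdict} the {limit:.1f} Vpp input limit")
```

Under these assumptions the 1.3 Vpp source already lands slightly above the 1.2 Vpp input limit, which is why the reading of the datasheet limit matters.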

22:42 <wpwrak> hmm, new hypothesis: the data sheet is simply wrong ;-)

22:42 <wpwrak> and the absolute maximum rating is in truth AVss-0.3 V to AVdd+0.3 V

22:43 <wpwrak> in which case everything is nice and well

22:44 <wpwrak> adam will be disappointed that we failed to create yet another rework item for him ;-) L3 is still on, though. let's see about it ...

22:57 <lekernel> I was talking about "Difference DVSS to AVSS"

22:57 <lekernel> which is also +/- 0.3V

22:58 <wpwrak> ah, i see. yes, that doesn't agree with L3.

23:01 <wpwrak> heh, i see L19 also has a history of being made eliminated ;-) (huge solder blob)

23:23 <wpwrak> reworked. works like a charm

23:25 <wpwrak> doing a few unplug/plug cycles

23:26 <wpwrak> solid as a rock

23:26 <wpwrak> what's funny is that the stereo makes noise when i connect the stereo:line-out to m1:line-in. some interesting things must be passing over that ground.

23:50 <kristianpaul> okay never mind my easy comments i regret now