#milkymist on 2011-08-03 — irc logs at freenode.irclog.whitequark.org

03:19 <kristianpaul> http://downloads.qi-hardware.com/people/kristianpaul/wtf%20accum%20int.png

03:19 <kristianpaul> http://downloads.qi-hardware.com/people/kristianpaul/good%20accum%20int.png

03:20 <kristianpaul> in theory is same signal coming from a core

03:20 <kristianpaul> but different pin

03:20 <kristianpaul> in the fpga

03:20 <kristianpaul> this behavior is familiar to you lekernel ?

03:59 <aw> xiangfu, http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/4E-results

04:00 <aw> xiangfu, i noticed that's usb B port having 'run time err'. do you know exactly how it means and what's its purpose? so that i can easily know what part/chip i need to check. ;-)

04:02 <xiangfu> you can see " Control transfer failed:" that mean M1 send 'control message' to usb. but m1 can not get any reply.

04:03 <aw> xiangfu, alright, tks. ;-)

04:03 <xiangfu> the USB core is software, I saw the CRC test on image are OK.

04:05 <aw> oah~yes, so that means i only need to see if those connections between U17 (Universal Serial Bus Transceiver) port B and fpga.

04:05 <aw> so far i can only microscope U17's all pins though. :)

04:06 <xiangfu> from the log the USB-A have problem.

04:06 <aw> hmm...no

04:06 <xiangfu> first you can try switch the USB device on USB-A and USB-B

04:06 <aw> i tried to switch them

04:07 <aw> when i switched, it shows: USB: HC: Device disconnect on port B

04:07 <aw> USB: HC: Low speed device on port A

04:07 <aw> USB: HC: VID: 0E6A, PID: 6001

04:07 <aw> USB: HC: Found keyboard

04:09 <aw> port A didn't show 'run time err', only shows up when I plugged keyboard in port B.

04:09 <xiangfu> ok

04:10 <aw> sorry that it's 'RX timeout error'

04:10 <xiangfu> " so that means i only need to see if those connections between U17 (Universal Serial Bus Transceiver) port B and fpga." yes.

05:17 <aw> 0x4e and 0x72: the audio codec circuit doesn't function because L1's pads are not soldered well. The root cause is that the footprint design from rc1 is 0603, and currently we use a 0402 zero-ohm part; it took a larger-quantity run for us to hit this error: http://en.qi-hardware.com/wiki/File:M1_rc3_0x4e_L1-1.png

05:17 <aw> so let's use 0402 footprint for L1 in rc4 then we fix this. ;-)

05:42 <aw> 0x4e: the U17's TP25 impedance is about 240 ohm which should be 17M ohm (I measured/compared to U16).

05:44 <aw> this is good condition that if I still get failure after resoldering, then I can take apart U17, then if the impedance of TP25 to gnd is still low, this definitely means that ball soldering under fpga is bad.

06:27 <aw> good news and also bad news: this is caused by the micrel part, the impedance of TP25 is at M ohm level after I took off U17. :(
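
The elimination step aw describes in the three messages above can be written down as a small decision function. This is only a sketch of the reasoning in the log, not a tool used in the test process. The function name and the `LOW_OHMS` threshold are assumptions made for illustration; the log gives only the two readings (about 240 ohm on the bad board, about 17M ohm on the U16 reference).

```python
# Hypothetical threshold separating a "short-like" reading from a healthy one.
# The log only mentions ~240 ohm (suspect) and ~17 Mohm (reference on U16).
LOW_OHMS = 10_000.0


def diagnose_tp25(before_removal_ohms, after_removal_ohms):
    """Classify a low TP25-to-GND reading by re-measuring with U17 removed.

    Follows the reasoning in the log: if the reading stays low with U17
    off the board, the soldering under the FPGA is suspected; if it
    recovers, U17 itself is suspected.
    """
    if before_removal_ohms >= LOW_OHMS:
        return "no fault at TP25"
    if after_removal_ohms < LOW_OHMS:
        return "suspect FPGA ball soldering"
    return "suspect U17 (USB transceiver)"


# The case from the log: low with U17 fitted, M ohm level with U17 removed.
print(diagnose_tp25(240.0, 17e6))  # prints "suspect U17 (USB transceiver)"
```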

06:35 <wolfspraul> I think that's good news. I wouldn't want to find any number of bga ball problems under the fpga. remember there are still pins we don't test, such as the expansion header.

06:35 <wolfspraul> if it's isolated on some external part, we fix it and done :-)

07:04 <roh> hey guys.. how are the boards doing? sounds bad

07:04 <roh> just finished packing the second shipment

07:07 <wolfspraul> not bad

07:07 <wolfspraul> well, if you look from the SMT date, and you imagine how it could be, then it's bad :-)

07:08 <wolfspraul> but if in the end we have 60 or 70 or even more 100% perfect and super tested boards, I don't consider the run bad. Maybe a little 'tough' ;-)

07:08 <roh> wolfspraul: heh... well.. its much more complex than the 20 boards we hacked together some evening on the weekend

07:08 <wolfspraul> we are making mistakes, but man I just don't know how to work on something as complex as this without making mistakes.

07:08 <wolfspraul> if someone can, step up and do it

07:09 <roh> even there we needed lots of manual work to get them at least powered up properly (lots of bridges after soldering)

07:09 <wolfspraul> so the fake TI parts were bad, yeah. shiny new replacement parts from Mouser will arrive on Friday...

07:09 <wolfspraul> now some Micrel failures (it seems), well, just replace

07:09 <wolfspraul> small discoveries like the 0402 part on a 0603 footprint :-)

07:09 <roh> micrel?

07:10 <roh> some socket?

07:10 <wolfspraul> no a usb phy

07:10 <roh> i see.. defective part?

07:10 <wolfspraul> hard to say sometimes

07:11 <wolfspraul> http://en.qi-hardware.com/wiki/Milkymist_One_run_3_schedule#Classifications_of_Failure

07:11 <wolfspraul> then we have discoveries like the now longer USB cable causing problems with JTAG flashing

07:11 <wolfspraul> never noticed before

07:11 <wolfspraul> or Adam's testing VGA connector wearing out from all the connect cycles

07:13 <wolfspraul> roh: look here. 0402 part on 0603 footprint :-) http://en.qi-hardware.com/wiki/File:M1_rc3_0x4e_L1-2.png

07:13 <wolfspraul> cute

07:15 <wolfspraul> here's my current rc3 bottom line, as of today:

07:15 <wolfspraul> 1) the process from SMT date to having plenty of 100% good boards is slower than it should be

07:16 <roh> hm. weird.. the length of the usb shouldnt matter at all..given within regular limits

07:16 <wolfspraul> 2) we are making many good discoveries that will help us on rc4 and future products

07:16 <roh> maybe the cable is very thin and there is some ground loop?

07:16 <wolfspraul> 3) we are still on track to having an acceptable yield for a 3rd run, after a lot of work. Let's say 60 or 70 100% good boards.

07:16 <roh> thus it gets too noisy to use or so (high R is common for thin, cheap cables)

07:16 <wolfspraul> that's the current status

07:16 <wolfspraul> but don't tell Adam now, I think he has 4 boards in 100% pass condition right now :-)

07:17 <wolfspraul> not 70

07:17 <roh> hrhr

07:17 <wolfspraul> well yeah, "should be"

07:17 <wolfspraul> but that doesn't help

07:17 <roh> i see (lists).. nice progress

07:17 <wolfspraul> Adam has switched to a shorter cable to avoid running into too much noise

07:17 <wolfspraul> and we need to consider switching the longer one, bad choice

07:17 <roh> maybe somebody could think up a nice sw to help manage such logistics and tracking tasks

07:18 <wolfspraul> so the run is not like a Christmas present, indeed

07:18 <wolfspraul> but other than that, well, I think in the end we have plenty of sellable boards

07:18 <roh> special workflow management.. maybe even to help with sourcing/stock management

07:19 <wolfspraul> I think doing that manually is most effective right now. Adam has a spreadsheet and it helps keep things documented and tracked.

07:19 <roh> need something similar here.. quite a mess of 'stuff' we got here in the lab

07:19 <roh> wolfspraul: sure. i was just thinking that exactly that works for one person, but not if you let 2 or more work in parallel ;)

07:21 <wolfspraul> I stand corrected. 3 in 'available' state right now.

07:21 <wolfspraul> but that's because the Mouser parts that are arriving on Friday hold up 30-40 most likely passing right away.

07:21 <wolfspraul> we'll get there

07:26 <wolfspraul> roh: also need to keep in mind, our test regime is much stricter than in rc2, in a number of areas. for example adam is doing 10 power cycles now, on each board, and render 30 seconds each time.

07:27 <wolfspraul> maybe we will also add a 1h rendering test for each board, we'll see

07:27 <wolfspraul> so of course, the more you test, the more you find

07:27 <wolfspraul> eventually there's always something unexpected happening, and then the board falls out of the 100% pass category right away

07:28 <wolfspraul> having 90 boards in one place is a great opportunity to fix bugs on the hardware/software boundary

07:28 <wolfspraul> which may otherwise be dismissed as 'whatever' once the boards are in the field

07:29 <wolfspraul> I just realized that we are not yet testing the expansion header :-)

07:30 <wolfspraul> (but no worries, we won't add it. the expansion header is officially untested...)

07:36 <roh> ;)

07:37 <roh> well.. some things you cant test without a rig to press buttons and play/receive signals and automated procedures

07:37 <wolfspraul> I think the test setup is pretty good now, of course it will improve more as we learn

07:38 <roh> after all.. if you already tested 50 boards and find a new error class you have to start over and retest the 'done' 50 again (well.. only that one test, but that would be a optimisation already)

07:38 <wolfspraul> yes correct, that can happen

07:39 <wolfspraul> oh, I also forgot the little scare at the beginning with the reset ic circuit

07:39 <wolfspraul> that required 2 days of debugging and a rework on all boards

07:39 <wolfspraul> rc4 can only get better :-)

07:39 <roh> i havent understood why yet.. did nobody test the new circuit before doing the gerber?

07:39 <wolfspraul> (which we will say until the day rc4 boards come back from the smt shop :-))

07:39 <wolfspraul> we tested it, 1000 times even

07:39 <wolfspraul> but the test was wrong

07:40 <wolfspraul> so we fooled ourselves successfully, 1000 times in a row

07:41 <wolfspraul> it's always easy to look back and say "oh, so stupid". But at the time you don't notice. it slips.

07:41 <wolfspraul> and in hardware you cannot just rebuild, reboot

07:41 <wolfspraul> you will eat through the mistakes you made in the past, one by one

07:41 <roh> heh

07:42 <wolfspraul> our list of improvements for rc4 grows, I think that's a good sign

07:42 <wolfspraul> we are still fighting back

07:47 <roh> yeah. battle is not lost. just annoyingly long

07:47 <wolfspraul> the jury is still out on which part is the last one

07:48 <wolfspraul> amazingly enough the various delays cancel each other out

07:48 <wolfspraul> but if your first package arrives early next week, it's not going to be the cases

07:49 <wolfspraul> there were some nasty delays with the boxes as well, they had to be redone once or twice

07:49 <wolfspraul> I think we can expect them in Taipei early next week as well

07:50 <wolfspraul> so... next week is the big week. if all goes perfect then we finally have _everything_ in one place :-)

08:02 <roh> wolfspraul: :) nice that you can stay so optimistic

08:03 <wolfspraul> roh: the first package still shows a very early tracking status. not even at "parcel center of origin" yet

08:03 <wolfspraul> if all goes well it's in Taipei early next week I think

08:03 <wolfspraul> otherwise later in the week :-)

08:05 <wolfspraul> optimistic, well. we chose this path. Let's make it a full product, a great and polished starting place whether you want to take it into a hacking or performance direction.

08:06 <wolfspraul> that's a very different path from "let's make a hacker board", and we go through that now

08:08 <wolfspraul> roh: you stayed optimistic when gluing the buttons too, right? :-)

08:08 <wolfspraul> or fatalistic?

08:12 <aw> xiangfu, is there possible that we can use test image to help on testing for records into flash rom for 10 times when I press middle btn to boot up and some where that gui can let me save after rendering 30 seconds?

08:13 <xiangfu> aw, sorry. what you mean on "records into flash rom for 10 times"?

08:14 <aw> xiangfu, Rendering - Boot-to-Rendering step and keeps rendering at least 30 seconds 10 times. i.e. power on -> reconfiguration -> press middle button -> boot up -> rendering (30 seconds) -> power off, be noticed that power-cycle (from power off to power on) is roughly 3 ~ 5 seconds.
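
The bookkeeping aw is asking for (how many boot-to-render cycles a board has completed) could also be kept on the host side rather than in the test image. Below is a minimal sketch of that idea. The `BoardRecord` class and its methods are hypothetical and do not exist in the Milkymist test software; only the figures of 10 cycles and 30 seconds come from the log.

```python
from dataclasses import dataclass, field

REQUIRED_CYCLES = 10   # boot-to-render cycles per board, from the log
RENDER_SECONDS = 30    # minimum rendering time per cycle, from the log


@dataclass
class BoardRecord:
    """Hypothetical per-board tally, filled in by the operator after each cycle."""
    board_id: str
    results: list = field(default_factory=list)  # seconds rendered in each cycle

    def record_cycle(self, rendered_seconds):
        self.results.append(rendered_seconds)

    def first_failure(self):
        """Return the 1-based number of the first cycle that rendered too briefly."""
        for n, secs in enumerate(self.results, start=1):
            if secs < RENDER_SECONDS:
                return n
        return None

    def passed(self):
        """A board passes only after all required cycles, with none cut short."""
        return len(self.results) >= REQUIRED_CYCLES and self.first_failure() is None


# Illustrative data only: eight good cycles, then one that stops rendering early.
rec = BoardRecord("example-board")
for _ in range(8):
    rec.record_cycle(30)
rec.record_cycle(0)
print(rec.first_failure(), rec.passed())  # prints "9 False"
```

Keeping the tally outside the board leaves the cold power cycle itself untouched, so nothing has to be written to flash between cycles.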

08:16 <wolfspraul> the main reason we are doing those 10 boot-to-render cycles was to verify fix2

08:16 <aw> there's another extended header, can we let fpga to detect one pin of it so that s/w can record how many times I tested successfully?

08:16 <wolfspraul> but I do think you found 1 or 2 boards where you ran into a problem after X cycles, right?

08:17 <wolfspraul> aw: no that's all too difficult I think. let's keep our test software simple.

08:17 <wolfspraul> rather reduce the number of power cycles to 5

08:17 <aw> hmm..too bad though. alright

08:18 <wolfspraul> aw: you found some boards that failed after X cycles, right?

08:18 <aw> wolfspraul, yes

08:19 <aw> I'll back to see if my rework was bad. so replacing a diode or c238 220pF to test again.

08:19 <wolfspraul> like 5C

08:19 <wolfspraul> very strange

08:19 <wolfspraul> failed on 9th power cycle

08:20 <aw> yup..I'll back to check that one. ;-)

08:21 <wolfspraul> also 0x63 0x85

08:22 <wolfspraul> strange stuff

08:30 <xiangfu> aw, if we want 'Rendering' we have to do it in Flickernoise.

08:32 <wolfspraul> and we don't want to switch from a cold power cycle to a software reset either

08:33 <wolfspraul> the test should first test cold power cycle, software reset is a different test (and not important now imho)

08:48 <lekernel> I hope you will make that stupid supplier pay for those crappy TI parts they sold

08:48 <lekernel> including the massive time wastage and delays they caused

08:49 <wolfspraul> no, because it's not the suppliers fault

08:49 <wolfspraul> but we will not buy those parts from there anymore (all suppliers are in the wiki btw)

08:49 <wolfspraul> I do not know a magic way to avoid this kind of problem, it's nothing to be worked up about.

08:50 <wolfspraul> you cannot just try to 'play safe', there is no such path and then you cannot produce anything anymore

08:51 <wolfspraul> but we learn, another thing that will be improved in rc4

08:51 <wolfspraul> I personally visited that supplier several times, it's not their fault. They were mislead about this as well.

08:52 <wolfspraul> misled

08:55 <wolfspraul> I think the sourcing lesson learnt here is this: If a part is easily and even cheaper available outside of China, then by all means don't buy from inside China :-)

08:55 <wolfspraul> that's all

08:55 <wolfspraul> we didn't apply this simple rule for the rc3 sourcing, but we will for rc4, guaranteed

08:56 <wolfspraul> every time so far when a problem popped up with a part we sourced inside China, a quick digikey lookup showed that we actually paid more on the Chinese spot market than on digikey!

08:57 <wolfspraul> one day I will try to find out the reasons why that is so, but in the meantime we just need to source smarter, and buy such parts (even cheaper!) from digikey/arrow/mouser/etc.

08:58 <wolfspraul> Adam has already moved to other troublemakers now...

08:58 <Alarm> Is it possible to learn how to program the FPGA with the card Milkymist?

08:59 <wolfspraul> Alarm: don't understand what you mean. You mean programming the fpga (bitstream)?

09:00 <wolfspraul> "Milkymist" is the name of the SoC (system on a chip), the lowest level of what is programmable on Milkymist One (that's the name of the video synthesizer built on top of the Milkymist SoC)

09:00 <wolfspraul> you can program the Milkymist SoC in a language called Verilog, the microkernel (RTEMS) and application (Flickernoise) can be programmed in C

09:01 <wolfspraul> does that answer your question?

09:09 <Alarm> I rephrase my question: "is it possible to create a function with verilog for beginners?

09:11 <wolfspraul> you could probably move functions into the fpga, yes. but I'm not aware of a good 'verilog hello world' example in conjunction with Milkymist

09:12 <wolfspraul> kristianpaul went through some realizations as part of his GPS stack, so he may have some good starting points or code snippets at hand

09:25 <Alarm> I found the realization of kristianpaul: http://en.qi-hardware.com/wiki/GPS_Free_Stack

09:25 <wolfspraul> that's the whole project

09:25 <wolfspraul> but when he's online later he may point you to some small starting snippets to dive into the Milkymist SoC

09:26 <wolfspraul> unless lekernel has an idea for you, of course

09:26 <wolfspraul> I agree there should be starting points for people, the whole SoC is probably overwhelming. I think about 40,000 lines of Verilog code.

09:27 <wolfspraul> Alarm: but if you are interested in that, you most likely still want to download the sources and start reading :-) Reading is king in that case.

09:27 <wolfspraul> eventually it will dawn on you where and how to start putting your own stuff in

09:36 <Alarm> :) A small project of a few line of code verilog please me well

09:49 <lekernel> Alarm, https://github.com/sbourdeauducq/paltest

09:58 <Alarm> :) Thank you, I will test :)

10:00 <Fallenou> oh you connect CVBS to the red connector ?

10:00 <Fallenou> I thought you would have to connect to the green one

10:00 <Fallenou> (README of paltest)

10:06 <Alarm> CVBS is yellow on the tv?

10:13 <Fallenou> I have an Analog Device ADV7403 video input chip here, and I have to connect CVBS to the green input to make it work (composite)

10:26 <Alarm> i guess it must install the Xilinx tools?

10:26 <Fallenou> Alarm: you have to install ISE Webpack

10:26 <Fallenou> to get Xst

10:26 <Fallenou> you won't have to use theur GUI in theory, you can use the makefiles

10:26 <Fallenou> their*

10:32 <Alarm> ok thank

10:37 <lekernel> Fallenou, this repository is about CVBS _output_

10:37 <lekernel> through the VGA connector

11:17 <kristianpaul> paltest looks interesting to look at Alarm

11:17 <kristianpaul> yeah, i started printing all datasheets, also core documentation and reading

11:17 <kristianpaul> no matter if you understand at first

11:17 <kristianpaul> brain catch later for sure, but read read !

11:18 <kristianpaul> easy way is CSR bus, a good starting point for _very_ basic stuff

11:18 <kristianpaul> once you learn about it, taking time to look at the wishbone specs is very important

11:19 <kristianpaul> and of course in parallel study verilog and digital design and computer architecture is not bad

11:19 <kristianpaul> actually thats a book from MIT i think

11:20 <kristianpaul> but there are plenty of good free sources in the internet

11:20 <kristianpaul> and please, take a lot of care of clock domains... i lost a lot of time by not doing it..

11:20 <kristianpaul> Fallenou: hi

11:21 <kristianpaul> Fallenou: do you have a rough idea of how to handle mico32 interrupts?

11:21 <kristianpaul> (note, i didn't look at milkymist demo yet)

11:23 <kristianpaul> bbl

11:33 <lekernel> report integer'image(to_integer(unsigned(count)));

11:33 <lekernel> vhdl die

12:27 <Alarm> kristianpaul: Thank you for your counsel

13:43 <Fallenou> kristianpaul: nop, but they are re designing it here (milkymist)

13:47 <xiangfu> Hi how I try openwrt in milkymist. just copy http://fidelio.qi-hardware.com/~xiangfu/compile-log/openwrt-milkymist.minimal-08032011-0953/simpleImage.milkymist_one to boot.bin then 'netboot'?

13:48 <Fallenou> kristianpaul: it's in the lm32 arch pdf anyway

14:53 <Alarm> Is there a difference in performance between VHDL and verilog?

15:05 <lekernel> no

16:33 <lekernel> xiangfu, yes, plus cmdline.txt and initrd.bin

16:40 <larsc> actually you don't need cmdline.txt anymore

16:42 <larsc> xiangfu: you also need to change the size of the rootfs to 4MB

18:30 <kristianpaul> Fallenou: (pdf) yeah, just finished reading it :)

18:32 <kristianpaul> what i need is to wipeout the memcard and see if the bug about no boot.bin is gone :)

19:52 <mwalle> larsc: from what i understand on the lengthy discussion about the gcc bug, the only real solution is to do the syscall with hand written assembly code

19:56 <mwalle> lekernel: after pinging the patch again, i got one acked-by by gerd hoffmann

19:56 <mwalle> but still no merge by anthony, but i havent seen any mail from him today

19:56 <mwalle> so he might be ooo

19:58 <larsc> mwalle: but our workaround works, or doesn't it?

20:00 <mwalle> by chance

20:00 <mwalle> for now it seems to work, but there seems to be no connection between the register constraint and the actual inline asm

20:03 <mwalle> larsc: i would chance anything for now, there are many difference proposals. so as long as it works.. :)

20:04 <mwalle> s/would/wouldnt

20:17 <larsc> mwalle: do you want to apply your irqflags patch?

20:21 <mwalle> if you agree

20:22 <larsc> have you seen my reply?

20:22 <mwalle> yes

20:23 <larsc> the only situation i can think of where behaviour changes is, for example if we have two spinlocks A and B and the following sequence lock A, lock B, unlock B, unlock A

20:24 <GitHub70> [linux-milkymist] mwalle pushed 1 new commit to master: https://github.com/milkymist/linux-milkymist/commit/00a52b0c1baac629016d18cf9d6da81199ac8f50

20:24 <GitHub70> [linux-milkymist/master] lm32: simplify irq handling even more - Michael Walle

20:24 <mwalle> larsc: but only on smp?

20:24 <larsc> but that should result in undefined behaviour anyway

20:25 <larsc> uhm, i meant lock A, lock B, unlock A, unlock B

20:26 <larsc> will clear EIE and BIE

20:28 <mwalle> this would break other architectures too, afaik, microblaze restores its whole status register
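
The crossed unlock order larsc describes can be illustrated with a toy model of saving and restoring an interrupt-enable flag. This is a sketch under loud assumptions: `ToyCpu`, `irq_save` and `irq_restore` are made-up names, the model has a single enable flag and does not represent the lm32 EIE and BIE bits, and it is not the code from the irqflags patch. It only shows why restoring saved flags out of nesting order gives a surprising result on any architecture that restores a saved state.

```python
class ToyCpu:
    """Toy model of a CPU with a single interrupt-enable flag (an assumption)."""

    def __init__(self):
        self.irq_enabled = True

    def irq_save(self):
        """Return the current flag, then disable interrupts."""
        flags = self.irq_enabled
        self.irq_enabled = False
        return flags

    def irq_restore(self, flags):
        """Write a previously saved flag back."""
        self.irq_enabled = flags


def run(unlock_order):
    """Take lock A then lock B, release in the given order, and trace the flag."""
    cpu = ToyCpu()
    flags_a = cpu.irq_save()   # lock A: saves "enabled"
    flags_b = cpu.irq_save()   # lock B: saves "disabled"
    trace = []
    for name in unlock_order:
        cpu.irq_restore(flags_a if name == "A" else flags_b)
        trace.append((name, cpu.irq_enabled))
    return trace


nested = run(["B", "A"])    # unlock B, unlock A
crossed = run(["A", "B"])   # unlock A, unlock B

print(nested)   # prints [('B', False), ('A', True)]
print(crossed)  # prints [('A', True), ('B', False)]
```

In the crossed order the flag is re-enabled while lock B is still held, and interrupts end up disabled after both locks are released.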

20:29 <mwalle> larsc: btw does this have any consequences on the generic atomic instructions

20:30 <larsc> yes, they are optimal now

20:31 <larsc> kernel size in general shrunk quite a bit

21:44 <mwalle> lekernel: framebuffer is working for me

21:45 <mwalle> did you select CONFIG_FRAMEBUFFER_CONSOLE ?

21:46 <lekernel> yes, it was working last time I tried, but I remember someone reporting problems on IRC

21:46 <lekernel> might be only a misunderstanding

21:47 <mwalle> kk

22:09 <mwalle> nice, /dev/mem wont work for address 0..

22:11 <mwalle> larsc: my busybox/kernel seems to be not very stable

22:11 <mwalle> atm the kernel hangs in do_wait_thread, http://paste.debian.net/125026/

22:16 <larsc> can you send me your rootfs?

22:18 <larsc> i'm still busy with work atm. but i'll try to take a look at it later or tomorrow

22:19 <mwalle> http://www.walle.cc/mmone/initramfs.gz

22:19 <mwalle> its a initramfs ;)

22:20 <mwalle> http://paste.debian.net/125027/ << heres another backtrace with an error i see more often

22:20 <mwalle> writes to 0x1cc and 0x144

22:27 <mwalle> seem like the current is NULL

22:28 <larsc> hm, some check missing user_mode() check?

22:29 <larsc> hm, some missing user_mode() check?