<kristianpaul>
wpwrak: (ram corruption) well, not to alarm, but i remember the first ethernet core, which used DMA, had some issues with memory in RTEMS. dunno if it was just something tricky in software, but lekernel just decided to re-implement it as a FIFO using ram from the fpga
<kristianpaul>
and yes, there are other DMA capable cores, but mostly for the video/audio part, so i wonder if ram corruption can be noticed easily..
<kristianpaul>
anyway, i always thought that, just dropping my comments
<kristianpaul>
also fyi, the only flash corruption i had (the m1 never loaded either bitstream or bios) was back when i didn't know urjtag had a command to reconfigure the fpga. so now i can avoid that manual unplug/plug of DC power..
<wpwrak>
okay, so you just stopped power cycling ?
<kristianpaul>
sure, some time ago
<kristianpaul>
at least not so often as i reflash several times at weekend..
<wpwrak>
maybe, if you were to start power cycling again, you could also reproduce NOR corruption
<kristianpaul>
10 times, right? :)
<kristianpaul>
xiangfu: do you have a script to read the nor? just to be in sync, i can try that power cycle as you did
<kristianpaul>
and confirm same issue in rc2
<kristianpaul>
i guess i need flash flickernoise and render :)
<kristianpaul>
started getting mad about the makefile not being happy with its current location..
<aw_>
time to rework 13 pcs usb transceivers after lunch. i updated wiki results.
<kristianpaul>
  i: Mking : standby.fpg : bios-rescue.bin(CRC)CRC passed (got e4cng : splash-rescue.rawsed (got e8ff824f)
<kristianpaul>
  Checking : flickernoise.fbi(rescue)(CRC)CRC )
<kristianpaul>
  Checking : bios.bin(CRC)CRC passed (4)
<kristianpaul>
  Checking : splash.rawCRC passed (got a2m PASSED  ]mages
<kristianpaul>
Images CRC: Checking : standby.fpg8905) Checking : soc-rescue.fpgChecking : bios-rescue.bin(CRC)ed (got aa12a56a) Checking : soc.fpg3a31e737) Checking : bios.bin(CRC)CRC passed (got 86e23684) Checking : splash.rawlow, or hit ENTER to run all tests:
<kristianpaul>
ok, sorry messy now
<kristianpaul>
PASSED
<kristianpaul>
(CRC)
<wolfspraul>
kristianpaul: the key is to try to get your board into the flash/d2 dimly lit/cannot reconfigure problems
<wolfspraul>
then it gets interesting :-)
<kristianpaul>
(dimly) oh now, i hope not ;)
<kristianpaul>
s/now/now
<kristianpaul>
not*
<kristianpaul>
btw what is adam's laptop OS and libusb version?
<kristianpaul>
is it Fedora 15? :-D
<wolfspraul>
don't know
<kristianpaul>
:/
<kristianpaul>
worth asking, as ubuntu/fedora like to follow upstream a lot... well, libusb..
<kristianpaul>
just in case
<xiangfu>
adam uses ubuntu
<kristianpaul>
7.04? :)
<wpwrak>
wolfspraul: (can't reconfigure condition) if the corruption strikes at a random place, then you'd have a 50x higher probability to find it if you check the entire NOR after each cycle, instead of waiting for that little 640 kB partition with the bitstream to get hit ;)
<wpwrak>
oh wait. it's 1.5 MB. well, so 20x :)
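A rough sketch of the arithmetic behind the 50x and 20x figures. The total NOR size of 32 MiB is an assumption (the log never states it; it is inferred from the 50x ratio), and `detection_gain` is a hypothetical helper name. It only models a single corruption landing at a uniformly random address.

```python
# Assumption: the NOR flash is 32 MiB in total (not stated in the log).
NOR_BYTES = 32 * 1024 * 1024


def detection_gain(total_bytes: int, watched_bytes: int) -> float:
    """Ratio of the chance that one uniformly placed corruption lands
    anywhere in the flash to the chance it lands in the watched partition."""
    if watched_bytes <= 0 or watched_bytes > total_bytes:
        raise ValueError("watched region must be non-empty and fit in the flash")
    return total_bytes / watched_bytes


# First estimate: a 640 kB bitstream partition.
gain_640k = detection_gain(NOR_BYTES, 640 * 1024)
# Corrected estimate: a 1.5 MB bitstream partition.
gain_1m5 = detection_gain(NOR_BYTES, 1536 * 1024)

print(f"640 kB partition: about {gain_640k:.1f}x")
print(f"1.5 MB partition: about {gain_1m5:.1f}x")
```

Under that assumption the two ratios come out close to the rounded 50x and 20x quoted above.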
<wolfspraul>
checking the entire nor is too slow now, via jtag
<wolfspraul>
4.5 hours I think
<wolfspraul>
a lot of power cycles in those 4.5 hours
<wpwrak>
the production test sw runs locally, no ?
<wolfspraul>
or maybe run the test software again?
<wpwrak>
yup. or maybe FN can do it, too
<wolfspraul>
that's a good idea! between each power cycle, add a run of the test sw, loaded via serial
<wolfspraul>
fn?
<wpwrak>
flickernoise
<wpwrak>
is there something like a shell ?
<wolfspraul>
yes I think so - rtems shell
<wolfspraul>
add crc checks there, and/or in the fn gui
<wpwrak>
then you could probably run a NOR checker from there
<wpwrak>
yup
<wolfspraul>
another good idea
<wolfspraul>
a stream of good ideas
<wolfspraul>
now only Xiangfu needs to type faster
<wpwrak>
they're kinda easy ;-)
<wolfspraul>
yes definitely, also adding it in the fn gui is a good idea
<wolfspraul>
could come in handy once the units are in the field
<wpwrak>
that means, i should get them patented. the more obvious the idea, the more likely someone else will have it too :)
<wolfspraul>
did you see the anti-Apple boot patent?
<wolfspraul>
ok I take notes - crc checks for all partitions in rtems shell and fn gui
<wolfspraul>
will do
<wolfspraul>
flickernoise could check on every boot, but then what? beep?
<wpwrak>
if the CRC doesn't cover the whole partition, you will have some means to know the partition size. then you could just check that the rest is 0xffff
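A minimal sketch of that "rest is erased" check, assuming the partition contents are already available as a byte string and that the used length is known from the image header or the CRC coverage. `first_non_erased` is a hypothetical name, not something from the flickernoise or BIOS sources.

```python
from typing import Optional


def first_non_erased(partition: bytes, used_length: int) -> Optional[int]:
    """Return the offset of the first byte past `used_length` that is not
    0xFF (the erased state of NOR flash), or None if the tail is clean."""
    if used_length < 0 or used_length > len(partition):
        raise ValueError("used_length must lie within the partition")
    for offset in range(used_length, len(partition)):
        if partition[offset] != 0xFF:
            return offset
    return None


# Example: 4 bytes of image data followed by 12 erased bytes.
clean = b"\x12\x34\x56\x78" + b"\xff" * 12
print(first_non_erased(clean, 4))
```

A single cleared bit in the tail (for example 0xFF turning into 0xFE) would be reported with its offset, which is the kind of detail needed to analyze one corruption event.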
<wolfspraul>
I think it should not fall back to the GUI and display an error message, that could do more harm than good
<wpwrak>
maybe the BIOS could ?
<wolfspraul>
no beeper
<wpwrak>
boot into recovery
<wolfspraul>
hmm
<wolfspraul>
dangerous
<wpwrak>
unless that one is broken too. if yes, beeo noisily for a few times before continuing anyway :)
<wpwrak>
beeP
<wolfspraul>
yes definitely continue
<wolfspraul>
otherwise we may cause more harm than good too easily
<wolfspraul>
anyway, let's add checks slowly and where they make sense. good idea! rtems shell first, and fn gui
<wpwrak>
(continue) yeah. not every corruption affects operation.
<wolfspraul>
and in the meantime, adam could do the power cycle tests with running the test image in between
<wpwrak>
ah, and mwalle mentioned that gdb has a faster NOR download than jtag. so that may be an option too
<wolfspraul>
but only after the full round is completed and we are zooming in on this, not now, otherwise we may confuse ourselves
<wolfspraul>
yes, all has been added to xiangfu's todo list :-)
<wpwrak>
maybe that's the quickest approach. then analyze the data on the PC. would also allow just erasing all the other partitions and never running RTEMS.
<wpwrak>
(and thus narrowing the set of things the M1 does between cycles. if the problem suddenly disappears, then we can suspect it was one of the skipped things)
<wolfspraul>
ah you mean you want to try to reproduce it with power cycles alone?
<wolfspraul>
so just power cycle 100 times (2 seconds in between), then crc check
<wpwrak>
maybe check every ten
<wpwrak>
if you examine the whole NOR, the corruption should show up quickly. and once it happens, you want to analyze the single event. if you get multiple corruption events between checks, things get blurry again
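The "cycle, then check every ten" idea could be sketched like this. `power_cycle` and `read_nor` are hypothetical callables standing in for whatever actually switches the supply and dumps the flash (relay, JTAG, gdb); nothing here talks to real hardware.

```python
from typing import Callable, List, Optional, Tuple


def run_power_cycle_test(
    power_cycle: Callable[[], None],
    read_nor: Callable[[], bytes],
    total_cycles: int = 100,
    check_every: int = 10,
) -> Optional[Tuple[int, List[int]]]:
    """Power cycle repeatedly and compare the NOR against a baseline.

    Returns (cycle_number, differing_offsets) at the first check that finds
    a difference, so a single event can be analyzed before more pile up.
    Returns None if all cycles complete without a difference.
    """
    baseline = read_nor()  # taken once, before any cycling
    for cycle in range(1, total_cycles + 1):
        power_cycle()
        if cycle % check_every != 0:
            continue
        snapshot = read_nor()
        diffs = [i for i, (a, b) in enumerate(zip(baseline, snapshot)) if a != b]
        if len(snapshot) != len(baseline):
            # A short or long read is itself suspicious; flag the boundary.
            diffs.append(min(len(snapshot), len(baseline)))
        if diffs:
            return cycle, diffs
    return None
```

A smaller `check_every` costs more read time per cycle but makes it less likely that two corruption events are merged into one snapshot.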
<wolfspraul>
ok but then setting a baseline of 100% erase is not bad
<wpwrak>
yup, then it would help
<wolfspraul>
and checking the entire area outside of the written partitions for anything non-1
<wpwrak>
i think you still need the "BIOS"
<wpwrak>
yup
<wolfspraul>
k gotta run, things are moving
<wolfspraul>
Adam will need at least another day or two for more testing and fixing. and in parallel we improve the test tools, erasing, crc checks, etc.
<GitHub20>
[scripts] xiangfu pushed 2 new commits to master: http://bit.ly/rkFUwp
<GitHub20>
[scripts/master] build: match latest development - Xiangfu Liu
<GitHub20>
[scripts/master] move all shell script file to one folder - Xiangfu Liu
<methril_work>
congratulations for the qemu lm32 support!! :)
<methril_work>
now it's official
<Fallenou>
yep :)
<kristianpaul>
9.25 hrs render now crash :)
<kristianpaul>
s/now/NOT
<Fallenou>
sometimes spelling makes a difference :)
<xiangfu>
kristianpaul, have you connected the jtag/serial to your PC?
<xiangfu>
kristianpaul, oh. NOT
<xiangfu>
;)
<kristianpaul>
lol , yes all fine
<kristianpaul>
but jtag is connected ;)
<xiangfu>
kristianpaul, just finished the crc command in flickernoise. but my system has a problem opening a terminal.
<kristianpaul>
busy port?
<lekernel>
i've run mine for a full week without crashing
<lekernel>
intermittent bugs that only hit once in a dozen days with serious consequences are the most pesky ... :(
<xiangfu>
a full week, that is a long time
<xiangfu>
lekernel, can I put this 'crc' command on branch 'stable_1.0', then cherry-pick it to 'master' later?
<lekernel>
what is that crc command?
<xiangfu>
lekernel, just 'crc address length' then output the result
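A sketch of what a `crc address length` style command computes, modeled on a byte string instead of real memory. The use of the standard CRC-32 from Python's `zlib` is an assumption; the log does not say which CRC variant the flickernoise command uses, and `crc_command` is a hypothetical name.

```python
import zlib


def crc_command(memory: bytes, address: int, length: int) -> int:
    """Return the CRC-32 of `length` bytes of `memory` starting at `address`.

    Assumption: standard CRC-32 as implemented by zlib; the real command
    may use a different polynomial or initial value.
    """
    if address < 0 or length < 0 or address + length > len(memory):
        raise ValueError("requested range lies outside the memory image")
    return zlib.crc32(memory[address:address + length]) & 0xFFFFFFFF


# Example: checksum 9 bytes starting at offset 2 of a small fake image.
image = b"\xff\xff123456789\xff\xff"
print(f"{crc_command(image, 2, 9):08x}")
```

Running the same range check on every partition after each power-on would give the per-image pass/fail result that the automatic test needs.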
<xiangfu>
very simple.
<xiangfu>
but Werner thinks we can set up an automatic test: like power-on, crc test all images, power-off, then do it again and again, until we hit the NOR flash bug :)
<lekernel>
just commit it to master
<lekernel>
or use Werner's test :)
<lekernel>
faster and does not clutter the flickernoise code base with spurious features
<lekernel>
actually, that crc command might belong in RTEMS instead
<xiangfu>
oh. yes. let me push to flickernoise first :)
<lekernel>
nah, add it to RTEMS
<GitHub122>
[flickernoise] xiangfu pushed 1 new commit to master: http://bit.ly/o6LrU0
<GitHub122>
[flickernoise/master] new command: crc address length - Xiangfu Liu
<GitHub198>
[flickernoise] xiangfu pushed 1 new commit to stable_1.0: http://bit.ly/pBuahO
<GitHub198>
[flickernoise/stable_1.0] new command: crc address length - Xiangfu Liu
<xiangfu>
ok. got it, reboot my system first. I only have one working terminal now :(
<lekernel>
(use the branch mmstaging for the rtems repository)
<lekernel>
4. submit a patch into the RTEMS PR system with the CRC command so it gets merged
<xiangfu>
ok
<xiangfu>
then I will work on the lock command in urjtag.
<xiangfu>
I am thinking of adding a command like "flashlock ADD BLOCK_COUNT"
<lekernel>
it's not supported already?
<lekernel>
surprising ...
<xiangfu>
not supported in urjtag.
<xiangfu>
have to reboot, will read the IRC log later
<lekernel>
bye!
<wpwrak>
(werner's test) hmm, i really have to build that little USB relay switch ...
<wpwrak>
at openmoko, i once made some parallel port to transistor switch, but that was for lower currents. i mention this not because it was a great technical achievement, but because, for some unfathomable reason, nobody else had that idea. pressing buttons gets kinda boring after the 1000th time or so ...
<kristianpaul>
(transistor switch) lovely :)
<mwalle>
wpwrak: mh is a software write protection persistent?
<mwalle>
(nor flashes)
<wpwrak>
yes, it survives power cycling
<wpwrak>
at least the one that's documented. the chip seems to have additional, undocumented protection modes
<mwalle>
mh ok
<mwalle>
lekernel: do i need ise 13.2 to synthesize the newest mm? or is 13.1 sufficient?
<lekernel>
git head needs 13.2 because it enables the block ram initialization workaround in bitgen
<lekernel>
if you disable this option, 13.1 should work (and the block ram initialization bug doesn't seem to affect the fpgas we have anyway)
<mwalle>
could you point me to the commit/file i have to look at?
<mwalle>
"For the LM32 evaluation board there is basic support; the Milkymist SoC is fully supported, including video rendering."
<mwalle>
lol golem.de
<lekernel>
is it just me, or do you need a user account to _read_ messages from the ubuntu forum?