<aw_> updated: applied xiangfu's new reflash_m1.sh, now..rendering...:-) data partitions exist there. :-)
<aw_> thanks xiangfu!
<wolfspraul> nice
<aw_> note that this script enables the 'verify' setting, although that takes too long when we want to speed things up.
<aw_> but xiangfu and sebastien were discussing whether to do the 'verify' tasks in the test program to reduce time. this is a good idea from sebastien. :-)
<aw_> so well...i can enable or disable it while testing an rc3 board, when i need to speed up repeated tests on the same board to investigate. :-)
<aw_> so the 'verify' setting is good for reflashing a NEW, fully mounted board. :-)
<aw_> now re-testing in the test program to see if everything still works well too.
<nixfreak> what would be a good way to have an embedded device rip, transcode, and play video, but with very fast operation
<nixfreak> but then execute very fast especially ripping and transcoding
<wolfspraul> I don't understand what you wrote about verify...
<wolfspraul> nixfreak: transcode in which way?
<wolfspraul> sounds like the Milkymist One could do it, no? video-in, transcode, vga-out? but depending on what you want to do exactly a lot of programming may be needed :-)
<aw_> wolfspraul, just forwarded their discussions in email. :-) you will know.
<kristianpaul> wolfspraul: i thought the same, that's why i pointed nixfreak to join the channel and confirm it himself :)
<aw_> new reflash_m1.sh script: http://pastebin.com/tstnwy9E
<wolfspraul> nixfreak: get a Milkymist One, start hacking :-)
<wolfspraul> aw_: remember the audio_white_noise ogv you created yesterday? Will you upload it into the wiki or should I do it?
<wolfspraul> we should collect the various testing documents and materials, and that's a nice one...
<nixfreak> basically wanna create a device that can rip optical medium and then encode to say x264 with a very simple UI
<wolfspraul> can you be more precise? what do you mean with 'rip optical medium'?
<wolfspraul> what do you want to do with the encoded x264 stream? save somewhere? stream over the internet?
<nixfreak> for consumers that are sick of dvd players but also want to archive the video
<nixfreak> and be able to view the video in what ever resolution the tv / monitor is set to
<aw_> wolfspraul, hi you can, pls. actually i wanted to make a page with those two ogv files to show how different they are. :-)
<wolfspraul> is there a wiki page already that collects rc3 tests?
<wolfspraul> I will add it there
<aw_> i am working others... :-)
<aw_> second
<wolfspraul> nixfreak: sounds possible, but I still don't fully understand what you want. You want to buy such a device? You want to manufacture one? You want to hack Milkymist One to be such a device?
<wolfspraul> or you just want to tell us that you think you have a cool idea? :-)
<aw_> wolfspraul, you could create a subsection under http://en.qi-hardware.com/wiki/Milkymist_One_run_3_schedule
<wolfspraul> I'll check
<aw_> then later I link them all. :-)
<wolfspraul> ah yes, I'll just throw it in there for now
<wolfspraul> let's not create too many pages, rather a few and longer pages
<nixfreak> I was thinking of using an FPGA to design/create this project, then was told to try this channel
<wolfspraul> yes you are definitely in the right place
<wolfspraul> but what do you want to do now?
<nixfreak> I guess ask some advice what would be good way to implement this
<wolfspraul> what do you mean with 'implement'?
<wolfspraul> one way to set this up would be just to take a notebook and some software, no?
<nixfreak> right
<wolfspraul> are you trying to do this for yourself at your home, and then you are done. or do you want to learn about FPGA hacking? or are you trying to manufacture and sell a dedicated embedded device to perform this function?
<nixfreak> but in the end it should be embedded and small as possible
<wolfspraul> 'should'?
<wolfspraul> :-)
<wolfspraul> can you just get to the bottom line
<nixfreak> yes learn and create
<wolfspraul> many things 'should' be this or that way
<wolfspraul> tell kristianpaul :-)
<wolfspraul> GPS 'should' work by now, no?
<wolfspraul> what is your background?
<wolfspraul> student?
<wolfspraul> learning? what?
<nixfreak> not a student ,self taught on many things
<wolfspraul> have you done FPGA and Verilog programming before?
<nixfreak> no
<wolfspraul> how much time do you want to invest in this?
<nixfreak> as much as it takes
<wolfspraul> it sounds like get a notebook, set it up, and enjoy :-)
<wolfspraul> or... get a Milkymist One and start hacking
<wolfspraul> but those two paths are very different
<nixfreak> I like hacking code to learn
<wolfspraul> Milkymist One costs 499 USD + shipping, you can buy it in a few weeks, if you like
<wolfspraul> but then you will need _a lot_ of hacking, months, maybe years
<wolfspraul> just saying...
<nixfreak> did say I was going to create this over night (:
<nixfreak> didn't
<wolfspraul> why do you want to embard on this project?
<wolfspraul> embark
<nixfreak> cause I would like to see a cheap price for media implementation especially for older folk that get frustrated with complex controls
<kristianpaul> (GPS) yes it will
<wolfspraul> "cheap price", ok. slowly you are telling us more about your motivations.
<wolfspraul> cheap price will require big investment
<wolfspraul> you can make such a device for 20 USD, I'm sure. but only if someone invests millions of USD or more.
<wolfspraul> that's why most users tend to use standardized mass-market hardware, because even though it is overpowered, it is still cheaper and quicker to apply to a particular problem.
<wolfspraul> at least someone has invested billions of USD to drive performance up in those devices
<wolfspraul> so if you look for "cheap price", that's a big question you have to answer first. cheap for whom? for yourself? for the world? who invests? why? why not use existing devices? etc.
<nixfreak> cheap as in $100.00
<wolfspraul> so far my feeling is milkymist one is not the right thing for you, but I could be wrong. and of course I will happily sell you one :-)
<wolfspraul> yes, someone has to invest millions of USD
<wolfspraul> 100 USD retail price itself is no problem
<nixfreak> just looking at my options
<wolfspraul> sure
<wolfspraul> and I try to give you free consulting :-)
<wolfspraul> if it's just for yourself, get a notebook and try to set it up that way. problem solved.
<nixfreak> i appreciate it
<wolfspraul> if you are serious about learning Verilog and FPGA hacking, fine, get a Milkymist One and start
<wolfspraul> the 500 USD will be nothing compared to the thousands of hours you will sink into that over the next years
<nixfreak> yep I understand
<aw_> lunch time here, back soon.
<wolfspraul> if you are hoping that you somehow can contribute to such a device on the market for 100 USD, you still need to find financially strong partners at some point that invest millions in this, and don't mind doing that (i.e. they have no better alternative for their money, called 'opportunity costs')
<wolfspraul> investments in technology are hard because there is such tough competition
<wolfspraul> so, given what I currently understand about you, I would think you should categorize that option under "not very likely to actually happen"
<wolfspraul> you need a very thorough market understanding (sales trends etc) to make such an investment decision
<wolfspraul> unless you have a lot of industry experience already, and speak as an insider (which you are clearly not :-)), I'd say don't do it, forget it, it's not an option
<wolfspraul> you will never get that 100 USD device unless Samsung or LG or whoever one day decide to make one, for whatever reason
<nixfreak> only if I don't try
<wolfspraul> :-)
<nixfreak> have a good one thx for the advice
<wolfspraul> oops
<wolfspraul> too much realidad maybe. I was just looking up a nice Wikipedia link...
<wolfspraul> there is something about hardware that attracts people to do stupid things.
<wolfspraul> I'm wondering which percentage of sparkfun sales (for example) ends up in failed projects.
<wolfspraul> 'failed' is hard to define probably, maybe the learning experience from the failure is all that matters
<wolfspraul> kristianpaul: this was the link I wanted to send him :-) http://en.wikipedia.org/wiki/Sysiphus
<wolfspraul> kristianpaul: sorry in case I wasn't friendly enough to this guy, since you brought him here...
<kristianpaul> no no, i just pointed to milkymist because he asked about video related stuff with fpgas in #fpga
<wolfspraul> I doubt he will buy an m1, I doubt he will ever contribute 1 line of code anywhere, and I doubt he will ever get that 100 USD x264 encoder made. Now he can prove me wrong...
<kristianpaul> :-)
<wolfspraul> I know people hate it when someone says "you will never" :-)
<wolfspraul> he he. being nasty.
<kristianpaul> realidad, is nasty anyway
<wolfspraul> I think it's fun. don't fight mother nature.
<wolfspraul> without reality problems, thinking would be easy, we could all just get drunk and dream.
<wolfspraul> but once reality hits, and you still want your idea to come true, that's hard
<wolfspraul> 1000 people want to build an airplane, 1 makes one that can actually fly. no?
<kristianpaul> yes
<wolfspraul> the whole snapshots is gone, maybe they cleaned up
<wolfspraul> 'snapshots' is not a good name
<wolfspraul> we need one name, and a clear testing and release process
<wolfspraul> so we don't push out broken updates to users
<wolfspraul> looks good
<wolfspraul> what's next?
<aw> btw we can cancel the test about Press [0] in test program, since the new remote control doesn't have [0] push key. I'll let xiangfu know this.
<aw> next to capture full screen
<wolfspraul> no '0' on remote?
<aw> the key at the upper left corner of the remote control sends hex code '0'
<kristianpaul> arghh, second time something happen with rtems, and all get stucj in I: Booting...
<kristianpaul> stuck*
<wolfspraul> I don't think in the test we need to press that many buttons anyway.
<aw> so i used my old remote to push '0'
<wolfspraul> two or three is enough. If those work I cannot imagine what other things we test, unless we press each button on the remote (then we test the remote), but that's a waste imo.
<wolfspraul> ok, we need to fix that '0' test then, something is wrong there
<aw> yeah...we can reduce it to 5 keys. well... i pressed them quickly though. not a big deal.
<wolfspraul> but it's still stupid and unfocused
<wolfspraul> two or three is enough, then we know the numbers are correctly getting across and we test the wires, transceiver, even rc-5 implementation in the fpga
<wolfspraul> if we want to test the remote control itself (which I think we don't need to), then we need to press each button on the remote
<wolfspraul> anyway, small detail
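For readers unfamiliar with RC-5: a standard RC-5 frame is 14 bits (two start bits, a toggle bit, 5 address bits, 6 command bits), so a handful of key presses already exercises every field. The sketch below is not the autotest-m1 code; it only shows how a decoded frame could be split into fields and checked against a short list of keys. The names `decode_rc5`, `ir_test_passed` and `REQUIRED_KEYS`, and the assumption that digit keys map directly to command codes 1, 3 and 5, are hypothetical.

```python
from typing import Iterable, NamedTuple


class Rc5Frame(NamedTuple):
    toggle: int
    address: int
    command: int


def decode_rc5(frame: int) -> Rc5Frame:
    """Split a 14-bit RC-5 frame (bit 13 = first transmitted bit) into fields."""
    if not 0 <= frame < (1 << 14):
        raise ValueError("RC-5 frames are 14 bits wide")
    if (frame >> 13) & 1 != 1:
        raise ValueError("first start bit must be 1")
    return Rc5Frame(
        toggle=(frame >> 11) & 1,      # flips on every new key press
        address=(frame >> 6) & 0x1F,   # 5-bit device address
        command=frame & 0x3F,          # 6-bit key code
    )


# Hypothetical: the three keys a reduced IR test might require.
REQUIRED_KEYS = {1, 3, 5}


def ir_test_passed(frames: Iterable[int]) -> bool:
    """True once every required key code has been seen at least once."""
    seen = {decode_rc5(f).command for f in frames}
    return REQUIRED_KEYS <= seen


print(decode_rc5(0x3001))
print(ir_test_passed([0x3001, 0x3803, 0x3005]))
```

Each received frame passes through the receiver, the wiring and the decoder, which is why two or three distinct keys are enough to cover the signal path without testing the remote itself.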
<wolfspraul> ok, full screen test now. and then D16, the one I'm waiting for :-)
<kristianpaul> lekernel: the right way to sync a core with a lower clock than the m1 soc over wishbone is to re-use the wishbone handshake but synchronize the control signals, right? like on page 5 of this pdf http://www.edn.com/file/17561-310388.pdf?force=true
<aw> do i need to always type IP address/Netmask/Gateway/DNS ?
<aw> btw, i entered m1 by 'ftp' now. :-)
<aw> background is my living room. :-)
<wolfspraul> screenshot-02 reminds me that it would be cool if we had better brightness defaults, or auto-brightness...
<wolfspraul> but I like that we have brightness and contrast adjustment on the keyboard now, I think I saw that somewhere. maybe remote next...
<wolfspraul> aw: I uploaded the audio noise videos, see http://en.qi-hardware.com/wiki/Milkymist_One_run_3_schedule#Audio_Noise
<wolfspraul> under the rc3 video, I document it was measured between C21 and R18. Was that how you measured the rc2 video as well? I have no information about that under the RC2 video. If you don't remember where you measured, no big deal... I think the key point is documented.
<aw> wolfspraul, yeah...good video in one column. tks.
<wolfspraul> one row
<wolfspraul> do you remember the test points for the rc2 video?
<aw> yeah..i should set my brightness firstly...
<aw> you can write the same
<aw> the audio signal is alternating, so it doesn't matter which C21 pad is used. :-)
<aw> yes, good: i just checked, the video I took on rc2 was measured the same way, on C21. :-)
<aw> i modified. :-)
<wolfspraul> ok great
<GitHub171> [extras-m1] yizhangsh pushed 1 new commit to master: http://bit.ly/ozVVMY
<GitHub171> [extras-m1/master] modified box die-cutting files and added size markers - Yi Zhang
<aw> soldered three wires to TP36 (PROGRAM_B), TP35 (DONE) and TP37 (RP#) respectively, to get ready to scope.
<aw_> now I uploaded two waveforms without D16:
<aw_> channels: the file names say which is which.
<aw_> next steps: I am going to scope with D16. moments...
<aw_> i found it: yes. the diode D16 was reversed when I was in the factory. I must have been nervous and not calm yesterday. hehe...stupid adam & smart werner. yes, there's exactly one bar on the diode's body, marking the cathode. and I also trained the factory to read the footprint's bar mark as the anode. that's how we got this result.
<aw_> but...the bad thing now is: I tested 10 times, and i also got 5 random NO-reconfigurations. now going to scope again though.
<aw_> didn't keep calm yesterday. :(
<wolfspraul> so wait. now you are saying D16 'works', but now you still get the boot error it was supposed to fix?
<wolfspraul> so it doesn't work?
<aw_> sorry that again , i changed my words about NO-reconfigurations. keep calm again.
<wolfspraul> huh?
<wolfspraul> I'm calm, just trying to understand :-)
<wolfspraul> for importance - the "no-reconfiguration" bug is a critical bug. Basically it means that m1 will not boot, right?
<wolfspraul> that is very important to be properly fixed, 100% of the time. So when you see this bug, that's a serious issue.
<aw_> now: D16 soldered correctly. D2 ON slightly when power-on and OFF.
<wolfspraul> yes but that's wrong
<aw_> no no...wait ..let me finish my words. :-)
<wolfspraul> ok :-)
<wolfspraul> that's one of the most important bugs fixed with rc3... (together with audio noise)
<aw_> 1. D16 soldered correctly. D2 ON slightly when power-on and OFF.
<aw_> 2. I powered-on ten times, D2 ON slightly and goes OFF well.
<wolfspraul> I don't understand
<wolfspraul> can we focus on product behavior, either correct product behavior, or incorrect product behavior
<wolfspraul> do you see incorrect product behavior?
<aw_> 3. among those ten times, I pressed SW2 to try to boot and enter gui, then 5 / 10 times D2 didn't ON.
<wolfspraul> that's a very serious bug
<wolfspraul> :-)
<wolfspraul> so D16 has the right polarity now, but you say that basically the reset IC + diode does not fix the boot bug?
<aw_> let me scope finished again firstly. :-)
<wolfspraul> ok but focus on understanding product behavior: 1) correct 2) incorrect
<wolfspraul> otherwise we collect data points that are meaningless
<wolfspraul> it sounds like you have D16 back on now, and you believe it's correct (polarity), but the product behavior is still INCORRECT
<wolfspraul> that is, the boot bug is not fixed
<xiangfu> upload them there: http://milkymist.org/updates/2011-07-13/for-rc3/, include the new IR test. only needs 1,3,5
<wolfspraul> but with our rc2 when we tested the reset IC+diode, the bug was fixed. What is the difference now?
<wolfspraul> that's my thinking...
<aw_> wait...later, from the new waveforms, I can tell if the "forward voltage" of D16 is too high or too marginal.
<wolfspraul> from your #1-#3 above, you seem to say that in 5/10 times, you cannot boot even though the D2 is not lit dimly before?
<wolfspraul> from before I only know that basically whenever the D2 was dimly lit after power-on, it would not boot. but whenever it was clearly off, it would boot.
<wolfspraul> if your description is accurate and I understand it correct, we have a third case now?
<wolfspraul> D2 goes off entirely, but m1 still doesn't boot?
<GitHub109> [autotest-m1] xiangfu pushed 1 new commit to master: http://bit.ly/rcNar1
<GitHub109> [autotest-m1/master] tests_ir: reduce test buttons to 3 - Xiangfu Liu
<wolfspraul> aw_: to guide your work, focus on correct/incorrect behavior of rc3. rc3 must always boot into the GUI after power-on and button press, 100% of the time.
<wolfspraul> not 99% or anything, 100% of the time
<wolfspraul> so the moment you have an rc3 not booting once, something is wrong
<wolfspraul> so yesterday in the factory it didn't boot at all because the D16 polarity was wrong? will D16 have the correct polarity on the other 89 boards?
<aw_> xiangfu, thanks.
<aw_> first, a short summary from yesterday until now: (note that the reset ic is always there on rc3)
<aw_> 1. without D16, m1 rc3 booted up successfully more than 20 times
<aw_> 1.1 but not with intensive power cycling
<aw_> 2. without D16, after I scoped with a probe attached to PROGRAM_B, it still booted up 10 times.
<aw_> 3. with D16 (soldered), and no scope probe attached to PROGRAM_B, it failed to boot up 5 / 10 times, so i still
<aw_> 4. with D16, scoped with a probe attached to PROGRAM_B again, it still booted up until my 26th power up, then D2 stayed ON.
<aw_> now m1 can't reconfigure. :(
<aw_> item 3 reminded me that last time I reconfigured successfully 1000 times, but forgot to also test booting up 1000 times.
<aw_> we indeed didn't boot up 1000 times. now I have no idea...later I'll see if m1 can be restored.
<aw_> i 'feel' that the equivalent capacitance of the scope's probe (connected to PROGRAM_B) can easily push m1 into the NO-reconfiguration state. But not proven.
<lekernel> if the initial configuration works, this is certainly something which is fixable in software
<lekernel> so that's extremely pesky, but not a big worry
<wolfspraul> ok that's messy, we need to clear this up
<wolfspraul> if we are unsure whether the way we tested this on rc2 actually was good, then we need to repeat the test on the rc2 board, including full bootup
<wolfspraul> maybe not 1000 times, but 100 or 200 should be enough
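A rough way to judge how much a run of N clean boots actually proves: if boots are treated as independent trials (an assumption, since power-cycle timing and temperature may correlate them), then zero failures in N attempts bounds the true failure rate at 1 - (1 - confidence)^(1/N), roughly 3/N at 95% confidence. The function name below is illustrative only.

```python
def failure_rate_upper_bound(n_boots: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-boot failure probability after n_boots
    consecutive successes, assuming independent boots.

    Solves (1 - p)**n_boots = 1 - confidence for p.
    """
    if n_boots <= 0:
        raise ValueError("need at least one boot attempt")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_boots)


for n in (10, 100, 200, 1000):
    bound = failure_rate_upper_bound(n)
    print(f"{n:5d} clean boots -> failure rate below {bound:.2%} (95% confidence)")
```

Under these assumptions, 100 to 200 clean boots only rule out failure rates above a few percent, while a single failed boot is enough to show the "100% of the time" requirement is not met.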
<wolfspraul> then I'm surprised that I guess rc3 booted fine without D16. that makes me wonder why we added D16 in the first place and whether the problem is fixed if we simply remove it.
<wolfspraul> which is something we can also verify on the rc2 board, after the full bootup test, by removing the diode there...
<wolfspraul> after that, we need to stop testing irrelevant cases on rc3, we should only focus on testing the one and only design we plan to manufacture and sell
<wolfspraul> that means we don't need to find out that the behavior is different when we keep the scope connected, because our users will not keep a scope connected :-)
<wolfspraul> but if we don't even know which design we actually think will fix the bug, then that introduces some uncertainty that makes us keep the scope connected all the time, I guess
<wolfspraul> finally, it's surprising that the rc3 board is now in a 'dead' state, hopefully we can find out which state this exactly is and recover
<wolfspraul> aw_: does this all make sense?
<aw_> wolfspraul, yeah...i stop and slow a bit. need to think steps on rc2 test again.
<wolfspraul> good. and also see my other points - why do we need D16 at all? why did rc3 boot without D16? stop testing irrelevant cases. how to recover rc3 board now.
<wolfspraul> seems those are all valid questions, at least to me... and without answering them we are just producing something we don't really understand.
<wpwrak_> aw_: (D16  reversed) that was an easy guess ;-)
<lekernel> if it's the same problem as I had twice, reflash
<aw_> wpwrak_, yeah :(
<aw_> lekernel, what did you mean about last sentence? :-)
<lekernel> to recover your board that no longer boots, reflash it
<wolfspraul> lekernel: why did we add D16 in the first place? we believe the circuit is incorrect without it? How come Adam had no booting problems while he had D16 removed?
<lekernel> a funny thing I noticed is that reflashing only the standby bitstream doesn't seem to help
<aw_> lekernel, yeah
<lekernel> I suspect some weird behaviour of the xilinx silicon
<lekernel> try reflashing standby only first, to confirm
<lekernel> then if it still doesn't boot (D2 dimly lit) reflash the rest
<lekernel> I'll use altera next time, I'm tired of this kind of things
<wpwrak_> wolfspraul: documenting behaviour with and without scope can be quite relevant, particularly while hunting for a problem that's not fully understood yet
<wolfspraul> I'm not against it, but we will run into Altera specific bugs for sure.
<lekernel> i've heard a lot less negative comments and errata entries about any altera chip than about the spartan6...
<wolfspraul> wpwrak_: yes, that's why we first need to define what we are working on here. do we have doubts about the design/schematic? or the particular rc3 board under test?
<wpwrak_> wolfspraul: does adam have enough boards yet to even make this distinction ?
<wolfspraul> (that's unrelated to this particular run here) I am open minded about Altera, we just need to keep in mind that between Altera and Xilinx, over time the 'leadership position' will probably bounce back and forth between them. So we may end up with bad timing and always regret our switch later...
<wolfspraul> another thing to consider is that once we sold a board, we have to support it for good, in software updates. So our build and test/release process will get more difficult, if we build up a trail of chip changes in our product history.
<lekernel> with D16 present and with the correct polarity, does the initial configuration always work? (i.e. D2 is totally off)
<wolfspraul> other than that I'm cool about Altera, it proves the portability point, may lower barriers for some contributors, and gets us better chips
<wpwrak_> lekernel: the art of reading errata ;-) here's one i came across recently: http://www.cypress.com/?docID=27429  lovely items, particularly the very last one (#14) should warm the heart of any connoisseur of USB. and the "we won't fix any of these" isn't very encouraging either. particularly in a chip that has been on the market for something like 8 years. (context: i heard of that one in a discussion on how to bring USB host to the ben)
<wolfspraul> lekernel: I think adam's tests are inconclusive to answer that
<lekernel> that's the one and only thing to test about the reset IC and D16
<wolfspraul> yes I know
<wolfspraul> like I said - lots of irrelevant tests
<wolfspraul> without D16, with probe connected
<wolfspraul> but what would help is if we build upon solid assumptions, for example it turns out when we did the test on rc2, we forgot to actually boot
<wolfspraul> you say even if there is a problem there it can be fixed in software, but that's speculative at this point
<lekernel> then there might be other layers of peskiness that randomly and rarely corrupt the flash, prevent booting after pushbutton press, etc. but those are highly unlikely to be related to the reset IC
<wolfspraul> ok, but let's collect hard data points, then we are not swimming in an ocean of uncertainty
<wolfspraul> maybe it just becomes a little lake of uncertainty :-)
<wolfspraul> lekernel: did you see my question above - why did we put D16 in in the first place? why did rc3 boot fine without it?
<lekernel> see mailing list archives
<lekernel> it's to make sure the flash is in reset at power up and doesn't register wrong commands
<lekernel> and all boards boot fine without it in 99% of the cases ...
<Vaati_> oh whats this?
<wolfspraul> lekernel: you mean boot fine without the entire reset_ic+diode fix?
<lekernel> yes
<wolfspraul> totally wrong.
<wolfspraul> you have seen how many boards?
<wolfspraul> 2?
<wolfspraul> I saw the other 38, each one of them
<wolfspraul> if you want I can make a little survey of the few active users :-)
<wolfspraul> I have this problem all the time
<wolfspraul> xiangfu has it, Jon has it
<wolfspraul> to turn on my m1, I always need to try to plug the power supply in several times, it's normal for me
<lekernel> yes, I have it as well, but it's still rare - on my board at least
<wolfspraul> fine but please don't say "all boards fine in 99%" if you actually have such little visibility
<wolfspraul> just trust me as your manufacturer telling you that it's a serious problem and I'm very happy and optimistic that we will fully fix it in rc3
<lekernel> ok, then some boards boot fine without it in 99% of the cases ...
<wpwrak_> Vaati_: at the moment, it's a bunch of people trying to figure out what's wrong with the ~90 boards that are just going through SMT :) boards with FPGA, video, and such.
<wolfspraul> maybe you got a lucky 2 of the 40 :-)
<wolfspraul> if there is any value I can provide, it's solid testing across the entire run
<Vaati_> hmmm  milkymist sounds familiar -- I think I came across it once while googling for some stuff
<Vaati_> but I havent really ever looked into it
<wolfspraul> kristianpaul: do you have this problem sometimes? that you need to plug in the power of your m1 multiple times before you can boot?
<Vaati_> oh wow
<wpwrak_> wolfspraul: it's not luck. the guy who's in the best position to fix a problem always gets the boards that don't have it. one of the many corollaries of murphy's law :)
<wpwrak_> wolfspraul: isn't reliable boot more the domain of the reset chip ? D16 sounds more like preventing flash corruption (?)
<lekernel> one of the problems we can have if the flash reset isn't asserted at power up is that the flash registers a command and sends status info instead of data
<lekernel> this would make the fpga unable to read its bitstream, so no configuration, d2 dimly lit, etc.
<lekernel> d16 is here to assert the flash reset at power up
<lekernel> is that clear?
<wolfspraul> wpwrak_: I am pretty sure we are very close to fixing this bug once and for all. But I guess the bug doesn't want to go without some drama... a nasty bug indeed.
<wolfspraul> Adam's testing was a little unfortunate/unfocused, but in the next round we'll narrow it down and then it's done, I'm sure.
<wolfspraul> adam has only 1 rc3 right now, the only one that is fully soldered
<wpwrak_> lekernel: okay, makes sense. is this also possibly connected to the obscure flash corruption problem you had a while ago ? or is that one already off the table
<lekernel> yes, if the flash happens to register a write or erase command at power up (unlikely but who knows...), that might well corrupt it.
<lekernel> d16 is meant to prevent that as well ...
<wolfspraul> ok so we should focus on getting things to work with D16, because we believe D16 is what fixes the underlying bug
<wolfspraul> together with the reset ic
<lekernel> yes
<lekernel> we also connected the reset IC to PROGRAM_B to make sure the FPGA doesn't attempt to read the flash while it's still in reset
<wolfspraul> maybe Adam's next quick test should be the old rc2 board, unmodified (with reset+diode), and test the complete bootup, 100 times or so
<lekernel> that's all this reset circuit is about.
<lekernel> do you get it now?
<lekernel> I have explained it several times already :p
<wpwrak_> lekernel: perfect. now the question is just why it doesn't do that. pity you used a diode and not a 74xxx1G logic gate. "be wary of diodes" is one of the lessons we learned in the early GTA02 days at openmoko. there we also had such a reset "mixer" with a diode that caused all sorts of fun. (there, the problem was the reverse current)
<lekernel> reverse current? mh
<lekernel> impedance is some 10k there
<lekernel> yeah, 10K
<lekernel> the diode pulls down a 10K pull-up resistor to 3.3V
<wpwrak_> lekernel: the diode we had back then was a grotesquely ill-fitting choice. so the chance that you fell into that trap is small :)
<wpwrak_> lemme find the schematics ....
<lekernel> and on the other side it's 4.7K
<wpwrak_> gaah. too many pictures on that page :)
<lekernel> leakage current should make a totally negligible voltage drop across 4.7K
<lekernel> unless that's a very, very bad diode
<wpwrak_> PROGRAM_B_2 some sort of a nWAIT signal ?
<lekernel> then forward voltage is 0.33V at 0.1A for that diode
<lekernel> wpwrak_, no, it's like a reset signal that clears FPGA configuration and prevents configuration attempts when asserted
<lekernel> and for the flash, any voltage below 0.6V is treated as logic low... so no problem there either
<lekernel> pfff ...
<lekernel> there's a note in the flash datasheet that says "Sampled, not 100% tested." for the logic low voltage specification, whatever that means
<wolfspraul> that means there is a small change that they will later withdraw from claiming this feature
<wolfspraul> either the documentation hasn't been updated saying that it was 100% tested, or the feature is not reliable and should be removed from documentation
<wpwrak_> hmm, FLASH_RESET_N comes from the FPGA and goes back into it again (via PROGRAM_B_2)
<wolfspraul> s/small change/small chance/
<lekernel> wpwrak_, no, the diode D16 blocks that path
<lekernel> so when the FPGA asserts the flash reset, it only resets the flash (and not itself)
<lekernel> both are active low signals
<wpwrak_> ah yes, got it.
<wpwrak_> is the FLASH_RESET_N output open-drain ? (IO_L48N_MIDQ9_1)
<lekernel> before FPGA configuration it is high impedance with weak pull up resistor (inside the fpga)
<wpwrak_> wolfspraul: (sampled and other weasel words) data sheets usually keep a lot of things rather vague. nothing new there at all :)
<lekernel> after FPGA configuration it is push pull... can be made open drain, but shouldn't matter
<lekernel> theoretically, we should be able to remove R60, since it is already present in the FPGA
<wpwrak_> i'm not sure if i'm reading the A4809 data sheet correctly, but the output current seems to be extremely low
<wpwrak_> well, in some cases. now looking at the graphs on page 7, and things look fairly normal there (i.e. several mA instead of uA)
<lekernel> ah, indeed...
<lekernel> that could well be the problem
<lekernel> yeah 0.05mA
<lekernel> that has to pull low a parallel combination of 4.7k and 10k resistor (neglecting the diode impedance)
<wpwrak_> at low Vdd, though
<lekernel> we get low Vdd during the power ramp that is causing our headaches
<wpwrak_> ah, i see. yes, then that would be bad
<lekernel> that's an equivalent resistor of roughly 3.2k... 0.05mA drops 160mV (!) there
<lekernel> ok we need larger resistor values
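The arithmetic behind the "roughly 3.2k" and "160mV" figures can be reproduced in a few lines. This is only a sketch of the numbers quoted in the discussion; like lekernel's estimate it neglects the diode impedance, and the variable names are illustrative.

```python
def parallel(*resistors_ohm: float) -> float:
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)


R_A_OHM = 4.7e3       # 4.7k resistor mentioned in the discussion
R_B_OHM = 10e3        # 10k pull-up mentioned in the discussion
I_SINK_A = 0.05e-3    # reset IC output current at low Vdd, as quoted

r_eq = parallel(R_A_OHM, R_B_OHM)
v_drop = I_SINK_A * r_eq  # Ohm's law: voltage developed across r_eq

print(f"equivalent resistance: {r_eq:.0f} ohm")
print(f"voltage across it at 0.05 mA: {v_drop * 1e3:.0f} mV")
```

With only 0.05 mA available, the voltage developed across the combined pull-ups stays near 160 mV, which is why larger resistor values come up next.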
<lekernel> though the fpga's built-in resistor might cause us some trouble... argh
<lekernel> or, there's a P-channel version of the reset IC, with 2mA output
<wpwrak_> would be good to see the signals on a scope. let's hope adam's has at least three working channels :)
<lekernel> maybe when Vdd is low enough the flash wouldn't register any command at all anyway
<lekernel> and if we are lucky we could get away by removing R60 and using a higher R30 value
<lekernel> from the reset IC datasheet: "The value of R0 need to be selected in different application, typical value is 470KΩ"
<lekernel> we are 100 times below that
<lekernel> meh
<lekernel> nice catch ...
<wpwrak_> oopsie. 2 orders of magnitude off isn't so nice
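To put the "100 times below" remark in numbers: holding a node low against a pull-up of resistance R to Vdd takes a sink current of about Vdd / R. The sketch below compares the 4.7k value from the discussion with the 470K datasheet value against the quoted 0.05 mA. It is a simplification under stated assumptions: it ignores the diode drop and the FPGA's internal pull-ups, it assumes the 4.7k resistor is the one being compared to the datasheet's R0, and the 0.05 mA figure was quoted for low Vdd only, so the 3.3 V rows are indicative at best.

```python
def sink_current_needed(vdd_volts: float, r_pullup_ohm: float) -> float:
    """Current (A) a driver must sink to hold a node near 0 V
    against a pull-up resistor tied to vdd_volts."""
    return vdd_volts / r_pullup_ohm


R_PRESENT_OHM = 4.7e3      # value on the board, per the discussion
R_DATASHEET_OHM = 470e3    # typical value from the reset IC datasheet
I_SINK_LOW_VDD_A = 0.05e-3 # quoted output current at low Vdd

ratio = R_DATASHEET_OHM / R_PRESENT_OHM
print(f"datasheet value is {ratio:.0f}x the present value")

for vdd in (1.2, 3.3):
    for r in (R_PRESENT_OHM, R_DATASHEET_OHM):
        needed = sink_current_needed(vdd, r)
        verdict = "within" if needed <= I_SINK_LOW_VDD_A else "exceeds"
        print(f"Vdd={vdd} V, R={r / 1e3:.1f}k: needs {needed * 1e6:.1f} uA "
              f"({verdict} 50 uA)")
```

In this simplified view the 470K pull-up needs only a few microamps, while the 4.7k one needs far more than the reset IC was quoted as able to sink.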
<lekernel> the FPGA pull up resistor sources 200 to 500 uamp according to datasheet
<lekernel> there's still a chance we could get those boards to work :)
<lekernel> remove R60, use a large R30... might do the trick
<wpwrak_> worst case, replace U24
<lekernel> assuming we could find a footprint compatible replacement with the right characteristics
<wpwrak_> the resistor swapping game is a bit complicated by having a diode there. for rc4, you may want to consider some 74xxx1G logic gate. they're nice and clean :)
<lekernel> there's also the option of cutting the FPGA trace to insert a diode to block the on-chip pull-up, but that would be messy
<wpwrak_> oh there's tons of them ...
<lekernel> the logic gate may not work when the power supply voltage is too low
<wpwrak_> you can pick one that goes pretty low. lower than a D+R circuit :)
<lekernel> won't they have output current problems too?
<lekernel> also, the FPGA pull up resistor only gives 12 to 100 uamp at 1.2V
<lekernel> there's a good chance this could work, heh
<lekernel> aw, remove R60 and use 470K for R30
<lekernel> this should hopefully fix all power up issues
<lekernel> (see IRC log)
<aw> sounds like good news. :-) okay...let me read it all completely first. :-)
<wpwrak_> lekernel: not so quick ... what's the input leakage current of the FPGA ? (PROGRAM_B_2, to be precise)
<wpwrak_> (let's hope they did something sane there. not like samsung ...)
<lekernel> oh, it has a pull up too
<lekernel> so you can remove R30
<wpwrak_> for similar chips, on digi-key, search for reset, then pick category "PMIC - Supervisors", then type "Simple Reset/Power-On Reset", then narrow down by package and voltage
<wpwrak_> kewl. it's getting simpler all the time :)
<lekernel> it's the same pull up as the other pins +/- 20 uamp
<wpwrak_> and the NOR has some input leakage as well
<lekernel> aw, so just remove R30 + R60
<lekernel> NOR has 1 uamp leakage
<wpwrak_> good. that won't cause problems. neither up nor down.
<aw> lekernel, one second.. i read very slowly, sorry. not completely finished reading yet. :-)
<wpwrak_> lekernel: and for single gates, the 74AUP1G family is quite nice. works from 0.8 V to 3.6 V
<wolfspraul> lekernel: the R30/R60 change is in addition to keeping the original reset ic + D16 diode, right?
<lekernel> yes
<wolfspraul> aw: one more thing we are sure about now (lekernel and wpwrak_ please speak up if I'm wrong): we do not need to test any case without D16
<wolfspraul> D16 is an essential part of the circuit we have in mind, removing it makes no sense at all. we do not need to test that.
<wolfspraul> always keep D16 there (in correct polarity of course)
<aw> good catch on the analysis of the current driven between the fpga's internal pull-up and the reset ic's output, as well as the parallel equivalent resistance (R60//R30). nice analysis!
<wolfspraul> if the R30/R60 removal works now (for rc3), do we still want another (cleaner?) solution for rc4, or can we keep this solution?
<aw> first, do i just reflash the standby image to restore it after I replace R30 (470K) and remove R60?
<wpwrak_> wolfspraul: (keeping D16) yes, you want to keep D16 or the rc2 gremlins come back.
<lekernel> aw, yes, you can test the reflashing first
<aw> or just replace R30 and remove R60 to directly check if rc3 works well again?
<wpwrak_> wolfspraul: (rc4) i would recommend considering a 74AUP1G08 or 09 instead of the diode-based "wired and". that would provide a cleaner barrier than the diode does.
<lekernel> 1) reflash standby image, check if it works
<lekernel> 2) if it didn't work reflash all the rest
<lekernel> 3) remove R30/60 and check it works _reliably_, i.e. power cycle a few hundred times
<wolfspraul> and boot
<wpwrak_> wolfspraul: (rc4) of course, if things work perfectly now, you may not want to take the risk.
<wolfspraul> and put D16 back in, and no scope
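lekernel's step 3 asks for a few hundred power cycles. A rough sketch of why that order of magnitude is reasonable, assuming each power-on fails independently with a fixed probability (an assumption; the function name and the example failure rates are hypothetical):

```python
import math


def cycles_needed(failure_rate, confidence=0.95):
    """Smallest number of power cycles n such that at least one failure
    is observed with the given confidence, i.e.
    1 - (1 - failure_rate)**n >= confidence.
    Assumes independent trials with a constant failure rate."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# Hypothetical failure rates, for illustration only.
for rate in (0.30, 0.05, 0.01):
    print(f"failure rate {rate:.0%}: {cycles_needed(rate)} cycles")
```

Under these assumptions, a failure that shows up once in a hundred power-ons needs about 300 clean cycles before "it works reliably" is a defensible claim, while a 30% failure rate is exposed within roughly ten attempts.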
<kristianpaul> wolfspraul: (cant boot rtems) is not too often, but so far happened two times, at least that i'm aware of
<wolfspraul> kristianpaul: ok so from your feeling that's in 5% of power on attempts? (that it won't power on)
<wolfspraul> or 10% or 1%? (just roughly)
<wolfspraul> in my board maybe 30%, I have a feeling it's a bit higher when the board is warm/hot
<wolfspraul> I'm not worried if I go somewhere that I won't be able to power it up, but I'm fully prepared that I may have to replug the power a few times. With that in mind it's bearable for me.
<aw> lekernel, wait. you wrote 'remove R30/60'. but i read back above is to 'remove R60 and use 470K for R30'. :-)
<lekernel> no, remove R30 and R60, do not put 470K back in
<lekernel> we figured out later that R30 is already included in the FPGA
<aw> okay, got it. the fpga's internal pull-up takes over R30's role as an equivalent resistance.
<wpwrak_> lekernel: oh, speaking of the diode's reverse current: it should be about 2 uA at 25 C, 2 mA at 100 C (fig. 4). so on a hot day, you could in fact replace it with a 0R ;-)
<lekernel> mh, crap
<wpwrak_> lekernel: since you already have a wired-AND, even without the diode, it may not be much trouble
<aw> lekernel, i just modified xiangfu's script, let me know if it's okay for reflashing standby only >> http://pastebin.com/J0DPhHxk
<lekernel> yeah should be ok
<aw> alright
<kristianpaul> wolfspraul: not power on, that's a different issue. i meant it loads the bitstream but won't load rtems.. or stays loading, as if the flash were corrupted,
<wolfspraul> how about power on?
<kristianpaul> it's hard to tell, a guess would be unfair, but yes i remember having this issue, just not in the last month..
<wolfspraul> ok, makes sense
<kristianpaul> maybe not, since jtag leaves me in a state in which the M1 is electrically powered on, so a power-on is not needed, as the fpga just gets into a standby state after i flash something new on it
<wolfspraul> I think every board will show it, percentage I am not clear (and never cared much because whether it's 1% or 80%, it needed to be fixed fully anyway)
<kristianpaul> i think there are two issues here, for making product behavior incorrect
<wolfspraul> yes
<wolfspraul> correct
<wolfspraul> we hope both are 100% fixed in rc3 :-)
<kristianpaul> the electrical power-on problem, that you're trying to fix by delaying the fpga
<wolfspraul> but please keep describing
<kristianpaul> and the booting in to rtems problem that also happen !
<kristianpaul> and for end user will be just a booting issue as well
<wolfspraul> from now on Adam will test with a complete boot all the way to rendering
<kristianpaul> yes, please,
<kristianpaul> i'm sorry i didn't describe the rtems booting issue before, so far i thought it was a normal error percentage after flashing the m1 more than 10 times a day...
<kristianpaul> or maybe because of a partial reflash of the nor? so if you don't reflash the whole thing it may lead to corruption somewhere?..
<kristianpaul> well, just a few guesses
<aw> step 1):  good, now m1 reconfigure normally, let me see if boot up. :-)
<aw> yeah...can't boot up now...go to remove R30/R60. :-)
<wolfspraul> kristianpaul: there may be multiple bugs, that's why we need to calmly fix them one by one
<wolfspraul> otherwise we are a remote group of people, communication is never perfect, and we are all confused by different reports and different things we mean when we report something
<wolfspraul> for example in lekernel's list earlier he wrote about 'check that it works', but now Adam is writing about "cannot boot". Do they mean the same thing? works == boots? not sure.
<wolfspraul> let's see what Adam finds next :-)
<wolfspraul> with 'boot' I mean all the way to gui or rendering
<wolfspraul> but others may mean different things, or even different things depending on context
<wolfspraul> what do we have? power-on -> fpga reconfigure (?) -> D2 goes off -> middle button -> D2 goes on -> boot -> gui/render
<wolfspraul> right or wrong?
<wolfspraul> kristianpaul: the first step after power-on is called 'reconfigure'?
<wolfspraul> I'm a bit confused about the 're' since it's the first thing after power-on...
<aw> when i say 'reconfigure' i mean D2 is dimly lit for a short time and then goes OFF; when i say 'boot up', yes, i mean all the way to the gui, with D2/D3 both ON. :-)
<wolfspraul> ok so we use roughly the same terminology
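The shared terminology can be pinned down as an ordered list of stages. This is only a sketch of the sequence wolfspraul and aw describe; the enum, its member names, and the `classify` helper are hypothetical.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Power-up sequence as described in the chat, in order."""
    POWER_ON = 0
    RECONFIGURE = 1    # D2 dimly lit for a short time
    STANDBY = 2        # D2 goes off
    MIDDLE_BUTTON = 3  # SW2 pressed
    D2_ON = 4
    BOOT = 5
    GUI_RENDER = 6     # D2 and D3 both on


def classify(last_stage_reached):
    """Map the furthest stage a test run reached to the words used here."""
    if last_stage_reached >= Stage.GUI_RENDER:
        return "booted"
    if last_stage_reached >= Stage.STANDBY:
        return "reconfigured, not booted"
    return "reconfigure failed"


print(classify(Stage.D2_ON))
```

Reporting the furthest stage reached, rather than "works" or "doesn't boot", avoids the ambiguity wolfspraul points out between lekernel's and Adam's wording.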
<wolfspraul> wpwrak_: will R30/R60 impact anything after reconfigure?
<wolfspraul> my understanding was that any impact of the R30/R60 change is for reconfigure itself, not after it
<kristianpaul> yeah, well reconfigure is okay, at least you sort of know the bitstream is loaded :)
<aw> okay..removed. let's check if reconfigure normally first after power up.
<kristianpaul> lekernel: the standby bitstream reconfigures the fpga with the soc bitstream after the middle button is pressed, right?
<aw> good on the reconfiguration stage, but D2/D3 only stay ON while I'm pressing SW2. then both D2/D3 are OFF. :(
<aw> then both LEDs go OFF.
<wolfspraul> aw: did you reflash only the standby image, or everything?
<aw> i stop now. :-)
<wolfspraul> I think you should reflash everything.
<aw> i have to reflash everything?
<wolfspraul> sure I would do that
<aw> hmm..second.
<wolfspraul> no point in trying to fix 3 bugs at once
<wolfspraul> get everything back to the best state we can imagine now
<wolfspraul> and then test. first power-on, then middle-button, then boot, then render.
<aw> reflashing...
<aw> i set 'NOVERIFY="noverify"' , so speed up. :-)
<wolfspraul> after reflash, unplug the power from the board entirely, so we start from a known state
<aw> reflash done. power off > power on > can reconfigure > press SW2 > D2 doesn't stay ON....boot not successful
<aw> yes, i unplugged the adapter power.
<wolfspraul> interesting
<wolfspraul> :-)
<wolfspraul> D16 is there and with correct polarity?
<aw> yes, soldered there
<aw> polarity is correct.
<wolfspraul> so that's exactly the same as you reported at the beginning
<wolfspraul> but this may not be related to R30/R60 anyway, because it's a problem after reconfigure
<wolfspraul> it's strange though that booting doesn't work because of the diode? maybe it loaded a corrupt bitstream?
<wolfspraul> if the bitstream is correct, then I wouldn't know how the diode could affect the chances of booting
<wolfspraul> we need to wait for lekernel for more input and ideas :-)
<aw> yes
<wpwrak_> (r30/r60 after reconfigure) dunno. i wouldn't think it should, but then i don't know these things too well
<wpwrak_> i wonder if the flash reset output could glitch during configuration
<wolfspraul> maybe looking at the serial console could give us clues?
<aw> wpwrak_, but indeed your discussion above with lekernel was good and technically reasonable...
<wpwrak_> aw: you should be able to see glitches on FLASH_RESET_N by probing TP37, with a falling or rising edge trigger
<wolfspraul> aw: can you press the middle button multiple times?
<wpwrak_> aw: (see glitches) that is, if there are any :)
<wolfspraul> basically now the D2 goes off, and pressing the middle button does nothing, right?
<aw> wait...
<aw> D2 turns ON for a short time when I press the middle button (SW2), then D2 goes OFF
<wolfspraul> maybe it actually boots? how long do you wait?
<wolfspraul> ah no, I think D2 should stay on during the boot, and finally D3 will go on as well
<wolfspraul> aw: when you press the middle button again (second time), will D2 go on again? or stay off
<aw> D2 will always go ON then OFF after I press SW2.
<aw> it's not right. :-)
<wolfspraul> hmm
<wolfspraul> but it does that every time
<wolfspraul> I think that means it's still running
<wolfspraul> ok, last test
<aw> wpwrak_, yeah..maybe D16's forward voltage doesn't give a low enough level.
<wolfspraul> try to power cycle, reconfigure, press middle button
<wolfspraul> say 5 times, should be enough
<wolfspraul> I want to see whether it ever boots to gui/rendering
<aw> hmm? no. i stop. :-) it should not be like this. :-)
<aw> bad adam. :-)
<wolfspraul> ok
<wolfspraul> but I think the behavior sounds stable now
<wolfspraul> of course we still don't have a solution
<aw> i need to look for glitches later, like werner said. :-)
<wolfspraul> ok
<wolfspraul> maybe also compare to our earlier rc2 results?
<wolfspraul> we tested this circuit before and found it working, or our test back then was completely wrong?
<aw> TP37 should have something to discover. :-)
<wolfspraul> then there is werners 74AUP1G08/09 idea, I don't know how hard it is to try that. sounds like that is not an option for rc3.
<wolfspraul> or a different D16 diode with other specs?
<lekernel> aw, you completely reflashed your board?
<wolfspraul> yes
<wolfspraul> what he sees now is that it seems to reconfigure fine (D2 goes off), but then when pressing the middle-button D2 will go on briefly then back off
<wolfspraul> repeatedly, so if he presses the middle button again, D2 will come on again briefly and go off again
<lekernel> yes, it should do that
<aw> wolfspraul, i feel the discussion between werner and lekernel is good. i just don't know whether, once a corruption has occurred, the flash's reset pin internals or the fpga itself never recover any more. don't know
<lekernel> then turn hard on
<wolfspraul> no, that doesn't happen
<wolfspraul> it stays off
<lekernel> ok, leakage current of the diode causing problems i'd guess
<wolfspraul> aw: if the circuit is stable, there should be no corruptions ever, I'm sure.
<wolfspraul> so we don't need to worry that much how to recover from a corrupted nor, I think. because that just won't happen in normal use.
<lekernel> aw, can you try with a 470k R30?
<aw> i can try reflashing all image again to see. second
<lekernel> no, don't waste time on reflashing
<aw> before I try a 470K R30, let me try again. :-)
<lekernel> do you have anything on the serial console btw?
<aw> hmm.. i haven't used the serial console in a long time. please let me know the baud rate setting.
<aw> sorry about that, long time no use. :-)
<wolfspraul> got disconnected
<lekernel> 115200
<aw> flow control? 8 data bits, 1 stop bit?
<lekernel> no flow control
<aw> okay, no parity?
<lekernel> no
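The console settings exchanged above can be written down once so they are not asked for again. This is a plain settings record, not a driver; the class and field names are hypothetical, and the 8 data bits and 1 stop bit come from aw's question, which lekernel did not contradict, so treat them as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SerialSettings:
    """Serial console parameters as discussed in the chat."""
    baud: int = 115200          # confirmed by lekernel
    data_bits: int = 8          # assumed from aw's question
    parity: str = "N"           # "no" parity, confirmed by lekernel
    stop_bits: int = 1          # assumed from aw's question
    flow_control: bool = False  # "no flow control", confirmed by lekernel

    def summary(self):
        flow = "flow control" if self.flow_control else "no flow control"
        return f"{self.baud} {self.data_bits}{self.parity}{self.stop_bits}, {flow}"


M1_CONSOLE = SerialSettings()
print(M1_CONSOLE.summary())
```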
<lekernel> have you soldered r30 already?
<lekernel> serial console is unimportant
<aw> i can do it at the same time if you want to see it. :-)
<aw> moment...
<wpwrak_> lekernel: btw, could FLASH_RESET_N glitch between the start of (re)configuration and when the system should start running ? or is there maybe even an intentional flash reset at the end of configuration ?
<wpwrak_> lekernel: also, are there any checksums or such in configuration ? i.e., when configuration completes, do we have any knowledge of whether things were loaded correctly ?
<aw> lekernel, can reconfigure but no boot after 470K R30.
<wolfspraul> aw: I think this is a stable condition now, not bad.
<wolfspraul> here is my current understanding:
<aw> mmm
<wolfspraul> our best bet circuit right now is with reset_ic, diode, both R30 and R60 removed
<wolfspraul> but in this case, for some reason after reconfiguration (successful reconfiguration?), the m1 doesn't boot
<wolfspraul> from here we could go in a number of directions - we could go back to the rc2 and understand why it worked there
<wolfspraul> you could dig into suspected glitches
<wolfspraul> we could come up with changes beyond r30/r60
<wolfspraul> we can go back and remove d16 (which makes it boot), but probably that will expose us to the old reconfigure problem
<wpwrak_> aw: when will you get more boards ?
<wolfspraul> we could see whether the serial console holds any clues
<wolfspraul> or lekernel or wpwrak_ could come up with any other idea :-)
<aw> guess i need to go there to solder myself to get 2 ~ 3 pcs next Monday
<wolfspraul> which way to go?
<wolfspraul> aw: I think you should definitely go back to the old rc2 you had with reset ic + diode, and see whether that one fully boots
<aw> mrt or scooter
<wpwrak_> wolfspraul: first, i'd like to understand a little better how all the signals are supposed to behave
<wolfspraul> no no, not 'which way to go to the factory' :-)
<wpwrak_> hehe ;-))
<wolfspraul> which way to go in our analysis & fix
<aw> ;-O
<wolfspraul> I definitely want to know whether the old rc2+reset_ic+diode boots or not
<wolfspraul> that was the basis for our design decision, but we may have overlooked several issues there, I guess
<wolfspraul> then we should also connect the serial console on the rc3 adam has now, just to see if we are lucky and anything comes up there
<lekernel> aw, record flash reset and fpga program_b with a 2-channel scope, at 1) power up 2) boot time. use 1:10 probes if you are worried about capacitance.
<lekernel> wpwrak_, there is a flash reset after configuration with the soc bitstream
<lekernel> and after each soft reboot
<wpwrak_> lekernel: at such low currents, i'd be VERY worried about capacitance :)
<aw> lekernel, alright...I'll do this tomorrow morning and post the links here.
<wpwrak_> lekernel: (reset) perfect
<wpwrak_> lekernel: does PROGRAM_B_2 do anything after the configuration ?
<lekernel> no, it should not
<wpwrak_> lekernel: also good. will it become push-pull after configuration ?
<lekernel> program_b is a dedicated input with no change when the fpga is configured
<lekernel> no this has nothing to do here
<wpwrak_> lekernel: (program_b_2 input only) hmm, then the reverse leakage current shouldn't matter
<lekernel> wpwrak_, what can happen is that when the soc boots, it resets the flash, then the leakage diode current pulls program_b low and clears the fpga
<lekernel> aw, scope traces first
<lekernel> then we will know
<lekernel> instead of just guessing ...
<wpwrak_> agrees. scope next :)
<wpwrak_> lekernel: (clear the fpga) so there are no checksums or such in configuration ?
<lekernel> wpwrak_, what does it have to do with checksums?
<lekernel> when program_b is pulsed, the fpga is cleared, period
<wpwrak_> oh, like that. i see.
<lekernel> this has absolutely nothing to do with checksums
<aw> alright...let me measure now...then i sleep. :-) :-)
<lekernel> aw, how many channels does your scope have?
<aw> so before I scope, should i remove 470K R30?
<lekernel> can you measure flash_reset_n, program_b and 3v3 at the same time?
<aw> two only
<lekernel> no, leave it there
<wpwrak_> lekernel: yes, then we would have a reverse current scenario indeed
<lekernel> ok then skip 3v3
<kristianpaul> why serial console is unimportant?, it could tell if flickernoise loaded or not
<wpwrak_> wolfspraul: when you come across some money, get adam a 4 channel scope :)
<wolfspraul> the other 2 channels are in Buga :-)
<kristianpaul> ;-)
<wpwrak_> wolfspraul: ideally, one with lots of memory. alas, the good ones are expensive. 10+ kUSD
<wpwrak_> kristianpaul: seems that lekernel is pretty sure there's nothing alive in the fpga. if the reverse current scenario is what's happening, that would indeed be the case.
<kristianpaul> okay, so... let's assume that :)
<wpwrak_> hmm, 26 C in taipei. leakage should be around 2-4 uA then. still sub-critical, at least in theory. one thing to keep in mind: if it suddenly starts to work, that may be because it gets colder for the next ~4-5 hours.
<wolfspraul> what if m1 runs at a nightclub in 45 degree Bangalore?
<wolfspraul> (just joking... we don't need to discuss this now :-))
<wpwrak_> wolfspraul: you should let an M1 roast out in the sun for a bit and then see how well it works ;)
<aw> hi, something interesting happened again. once my probe is connected to the TP, the m1 boots up!
<wpwrak_> i was afraid something like that would happen ...
<wpwrak_> aw: can you identify if a single probe already makes a difference ?
<aw> from what i see on flash reset and program_b, they are both synchronized on the same rising pulse @ 2.5ms :-)
<wpwrak_> aw: note: please try a few times. i'd say at least 5 times, better 10. this can easily be a statistical problem now, and we'll need a reasonably large sample size.
<aw> yeah...yes..let me try whether it's the program_b or the flash reset pin that caused that..second. :-)
<wpwrak_> (same pulse) @2.5 ms = the pulse duration ? or the time after powering up ?
<wpwrak_> if both show a relatively fast pulse, that would be the diode's evil work
<aw> the time after powering up. they both are the same. :-)
<wpwrak_> how long is the pulse ?
<wpwrak_> maybe just post a screenshot
<aw> please see this first, CH1 is program_b
<lekernel> aw, are you using a 10:1 probe?
<aw> after the reset ic's delay time (~= 200ms), program_b goes from LOW to HIGH.
<wpwrak_> kristianpaul: do you have a script that downloads a screenshot from the scope and puts it on downloads.qi-hardware.com/people ? might be useful for adam
<lekernel> aw, why is there a glitch on PROGRAM_B in the beginning?
<wpwrak_> aw: ah, you're already there. great :)
<kristianpaul> wpwrak_: nope i dont, i just visit by web
<lekernel> the first pulse
<lekernel> where you put the cursor btw
<aw> wpwrak_, DONE (CH2) is initialized after power on and then goes LOW... then once the fpga reconfiguration is done, the DONE pin goes high to signal that it's done. :-)
<wpwrak_> kristianpaul: ah, he's got a TDS1012. that one may not even have ethernet. i thought he had a scope similar to yours
<aw> lekernel, yeah..moment...let me change. ;-)
<wpwrak_> interesting ... 200 ms load time (to DONE), and 200 ms reset delay by the reset chip
<aw> yes
<wpwrak_> lekernel: do PROGRAM_B_2 is edge-triggered, not level-triggered ?
<wpwrak_> s/do/so/
<lekernel> it's level triggered
<aw> do you want me to use level triggering? guess 200ms, they should be the same.
<wpwrak_> then i don't understand what i'm seeing. looks as if PROGRAM_B_2 was low all the time, thus constantly resetting, yet the FPGA thinks it finishes configuration (indicated by DONE)
<wpwrak_> something doesn't compute :)
<wpwrak_> aw: (level triggered) that was about how the FPGA works, not your scope setup
<lekernel> aw, on your scope, the first is PROGRAM_B and the other one DONE?
<aw> lekernel, yes. sorry about that... now that i changed to 1:10, it can't boot any more. :(
<lekernel> aw, ok, no, this is good!
<aw> lekernel, yes, CH1 is program_b, CH2 is DONE pin.
<lekernel> at least the measurement is not making that ultrapesky bug disappear
<lekernel> ok, then wpwrak_ is right, DONE shouldn't go high when PROGRAM_B is low
<lekernel> wtf
<wpwrak_> aw: just to check, can you please tell us the TP numbers you connected to ?
<aw> wpwrak_, program_b(TP36), DONE(TP35), flash Reset(TP37)
<wpwrak_> so CH1 is on TP36 and CH2 is on TP35 ? trigger is ... on CH1
<wpwrak_> ah, no D16 !
<wpwrak_> now, please do all this again, but with D16 in place :)
<aw> wpwrak_, exactly
<aw> yes, no D16.
<lekernel> no more tests without d16 please
<aw> i show this just let you know program_b & done relationship.
<aw> yeah...moments
<wpwrak_> lekernel: of course, even without D16, the pattern looks weird, with PROGRAM_B_2 low and things still (apparently) completing
<lekernel> rhaa... apparently the s6 needs INIT_B to be driven low as well to delay configuration
<lekernel> the amount of peskiness that lies in this configuration process is incredible
<wpwrak_> lekernel: so the whole contraption around D16 won't work ?
<lekernel> it can fail if the fpga attempts to read the flash while it's still in reset
<lekernel> aw, add a second diode between the output of the reset IC and INIT_B (accessible through R157)
<wpwrak_> searches for INIT_B ...
<lekernel> oh, and it'd be much better if we could get a reset IC with more current sink capabilities and/or diodes with less leakage current
<wpwrak_> lekernel: reset ic should be no problem. for the diodes, you'd likely have to trade Vf for Irev. the only "clean" way out of this is probably a 1G gate
<lekernel> btw, that init_b vs. program_b discovery explains why adding a capacitor (instead of the reset ic) didn't work either...
<lekernel> it's amazing the time we spend on small issues like that
<lekernel> and extremely frustrating
<lekernel> aw, all further tests must be done with a diode to INIT_B
<wpwrak_> lekernel: 74AUP1G have nice features like an input leakage of max. +/- 0.75 uA, and even at Vcc = 1.1 V, they can sink some 1.1 mA. they're nice chips to know. they lack the coolness of just solving a tricky problem with a few passive elements, but they do a lot to make things more predictable.
<lekernel> as your scope trace shows, holding PROGRAM_B low does nothing ...
<wpwrak_> (trace) good. same result as without D16. so D16 may be off the hook for now.
<lekernel> wpwrak_, if that stupid reset IC was able to sink more than a ridiculous 500 uamp, the diodes (with additional external pull ups) would work just fine
<lekernel> s/500/50
<wpwrak_> lekernel: you could still run into trouble with the reverse current
<lekernel> on a ~5K impedance, the reverse current wouldn't be able to cause much trouble, would it?
<lekernel> aw, so 1st one is program_b and the other flash reset?
<lekernel> or is it the other way around?
<aw> yes
<lekernel> ok
<aw> scoped after power up
<wpwrak_> lekernel: okay, with a 5 k pull-up, you can probably kill it :)
<aw> so you can use program_b as reference base
<wpwrak_> aw: btw, it may be good to set the scope's acquisition to peak detect
<lekernel> aw, now one problem. program_b does nothing. we must use init_b instead
<lekernel> and another problem - init_b becomes active after configuration, so we must use another diode
<aw> ?
<lekernel> wpwrak_, I also have heard horror stories about FPGAs not configuring correctly because of rise time too slow on their external control signals. I don't know if Xilinx improved this, but because of that I'm for small pullup values.
<wpwrak_> lekernel: still haven't found INIT_B or R157 :-( on which sheet are they ?
<lekernel> with the fpga schematics, on bank 2
<lekernel> at the top right of bank 2
<aw> well...i get to sleep though, guys ;-)
<wolfspraul> sure, 'night
<lekernel> wpwrak_, the xilinx reference designs have 4.7k external pullups on program_b and init_b
<wpwrak_> aah, got it. thanks ! nicely hidden :)
<lekernel> gn8
<aw> let me know if there's something i can help with tomorrow. cu
<kristianpaul> n8
<lekernel> I'm for trying to keep those pullups, and having a reset IC that has some serious current sink capability
<wpwrak_> sounds reasonable
<lekernel> any reset ic you could recommend off the top of your head?
<lekernel> it's in sot-23
<wpwrak_> you'll need at least 1 mA, right ?
<wpwrak_> let's first check if the one you have isn't good enough after all
<lekernel> so, this thing is going to drive: flash reset (10k), program_b (4.7k) unless we manage to cut the trace when reworking the board and init_b (4.7k)
<wpwrak_> i haven't used reset chips yet, so there's none i "like from experience". but there's a ton of choice at digi-key. so i'm not worried about finding something decent, if necessary
<wpwrak_> if i interpret the A4809 data sheet correctly, the sink current is very low at low Vdd, but gets reasonable when Vdd increases
<lekernel> that's 1.9k equivalent resistance, which needs at least 1.7mA at 3.3V
<wpwrak_> okay, with tolerances that's 2 mA absolute minimum
<lekernel> yeah and we must also account the fpga internal pullups btw
<lekernel> those will add 0.2 to 0.5 mA/pin at 3.3V
<lekernel> so worst case 1.5mA total
<wpwrak_> 4 mA then
<lekernel> which brings that minimum current for the reset ic to 3.5mA
<lekernel> or 4 if you want extra safety
<wpwrak_> i totally love extra safety ;-)
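The sink-current budget worked out above can be reproduced with a small sketch. The helper names are hypothetical; the numbers are the ones from the chat: external pull-ups of 10k (flash reset), 4.7k (program_b) and 4.7k (init_b) at 3.3 V, plus a worst case of 0.5 mA per pin from the FPGA's internal pull-ups on three pins.

```python
def parallel(*resistors_ohm):
    """Equivalent resistance (ohms) of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)


def sink_budget_ma(vdd, external_pullups_ohm, internal_ma_per_pin, pins):
    """Return (external_mA, internal_mA) the reset IC has to sink
    to hold all lines low at the given supply voltage."""
    external = vdd / parallel(*external_pullups_ohm) * 1000.0
    internal = internal_ma_per_pin * pins
    return external, internal


r_eq = parallel(10_000, 4_700, 4_700)
external_ma, internal_ma = sink_budget_ma(3.3, (10_000, 4_700, 4_700), 0.5, 3)

print(f"equivalent pull-up: {r_eq:.0f} ohm")
print(f"external: {external_ma:.2f} mA, internal worst case: {internal_ma:.2f} mA")
print(f"total before rounding: {external_ma + internal_ma:.2f} mA")
```

The chat rounds the external part up to 2 mA for tolerances before adding the 1.5 mA internal worst case, which gives the 3.5 mA minimum, or 4 mA with extra margin.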
<lekernel> worst case we will use a relay instead lol
<wpwrak_> i'm not sure we really need to worry about the early ramp. as long as we get a proper reset when we approach 3.3 V, we should still be fine, right ? maybe with the possible exception of freak flash corruption
<wpwrak_> (relay) naw, they bounce :)
<wpwrak_> the flash corruption scenario would be that we have Vdd high enough for the FPGA to try to talk to the flash, and Vdd high enough for the flash to be able to change its content, and Vdd too low for the reset chip to pull enough
<lekernel> yeah, the output current of our reset ic approaches 10mA
<lekernel> though what the heck is VDS?
<wpwrak_> figure 10 ?
<wpwrak_> VDS is the voltage on the output
<wpwrak_> oh wait. relative to Vdd :)
<wpwrak_> err no, to ground
<wpwrak_> page 11, figure 3
<wpwrak_> hmm. still looks suckish.
<lekernel> I'm looking at this atm: http://www.technorise.ne.jp/doc/ait/A4809-v10.pdf
<lekernel> ah, page 12 :)
<wpwrak_> i have rev 1.3 (found by google)
<wpwrak_> for Vdd = 3.3 V, the closest approximation seems to be figure 12 on page 7 (of rev 1.3): Vdd = 3.0 V
<wpwrak_> if we assume Vds = 0.5 V (you said Vil(max) = 0.6 V, right ?), then we get about ... 8 mA
<lekernel> Vds should be lower than that
<lekernel> we have to take into account the drop of the diode
<wpwrak_> urgh. right.
<lekernel> also 0.6V is for the flash, I'm trying to find what the FPGA needs
<lekernel> should be somewhere in that
<lekernel> for PROGRAM_B and INIT_B
<wpwrak_> hmm, what diode current shall we assume ? 2-3 mA ?
<wpwrak_> let's say 3 mA. then Vf should be < 200 mV
<wpwrak_> so we get Vds = 0.3 V
<lekernel> have you found the maximum low level voltage for the fpga?
<lekernel> i'm still searching through this big datasheet
<wpwrak_> no, i hope you're quicker than me ;-)
<wpwrak_> besides, i don't think FPGA.Vil(max) will be higher than NOR.Vil(max)-200 mV. but i'm sure you'll set me straight if my guess wasn't right ;)
<lekernel> wpwrak_, INIT_B will need to be connected through a diode
<wpwrak_> okay, then it matters. darn.
<lekernel> ok, Vil is either 0.8V or 0.7V... can't figure out, but the flash is the limiting factor anyway
<lekernel> those are the numbers for the LVCMOS33 and LVCMOS25 I/O standards
<lekernel> so let's assume Vds = 0.3V
<wpwrak_> good. i was just about to ask which of the gazillion I/O standards it was ;-))
<lekernel> I'm not actually sure
<wpwrak_> okay. 0.3 V ... A4809 data sheet rev 1.3, page 7
<lekernel> those are for the configurable I/O pins, not for the fixed function pins
<lekernel> but it'd make sense to assume the fixed function pins use either LVCMOS33 or LVCMOS25
<wpwrak_> figure 12 says that, with VDD = 3.0 V, we can expect something like 5-6 mA
<lekernel> no, we are using Detector Threshold=2.7V
<lekernel> so figure 10 (not 12) is relevant
<wpwrak_> fig. 10 only goes up to VDD = 2.0 V
<wpwrak_> for VDD = 2.0 V, fig 10 and fig 12 are similar. so i would assume the characteristics of the output transistor are comparable
<lekernel> maybe...
<wpwrak_> i.e., i'm currently looking at the performance in the 200 ms after crossing the threshold
<wpwrak_> so i think the chip should be barely adequate. you don't have a lot of headroom, but it should be sufficient.
<lekernel> if we used 10k pullups (instead of 4.7k) the minimum current for the reset IC would be 2.5mA ... this should give more margin on the reset IC side
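A quick comparison of the two pull-up choices, as a sketch under stated assumptions: the flash reset pull-up stays at 10k, the PROGRAM_B and INIT_B pull-ups share one value, and the FPGA's internal pull-ups add a worst case of 0.5 mA on each of three pins. The function name is hypothetical.

```python
def parallel(*resistors_ohm):
    """Equivalent resistance (ohms) of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)


def required_sink_ma(pullup_ohm, vdd=3.3, flash_pullup_ohm=10_000,
                     internal_ma_per_pin=0.5, pins=3):
    """Worst-case current (mA) the reset IC must sink at vdd."""
    external_ma = vdd / parallel(flash_pullup_ohm, pullup_ohm, pullup_ohm) * 1000.0
    return external_ma + internal_ma_per_pin * pins


for value in (4_700, 10_000):
    print(f"{value} ohm pull-ups: {required_sink_ma(value):.2f} mA")
```

With 10k pull-ups the requirement comes out close to the 2.5 mA lekernel quotes, about 0.7 mA less than with 4.7k, at the cost of slower rise times on the control signals.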
<lekernel> so what would you think of:
<lekernel> 1) we keep the current reset IC
<wpwrak_> right now we have 4.7 k plus 10 k, so it's already a bit friendlier
<lekernel> 2) we use a 10k pullup on INIT_B and PROGRAM_B (instead of 4.7k)
<wpwrak_> ah, you mean R157, okay
<lekernel> 3) we add a diode between the output of the reset IC and INIT_B
<lekernel> 4) we remove R60 (the pullup on the flash reset) since diode leakage and fast rise time do not matter here
<wpwrak_> what are the functions of PROGRAM_B_2 and INIT_B ? you said INIT_B becomes an output after configuration ?
<lekernel> yes, it becomes an open drain output
<lekernel> which might pull low
<wpwrak_> under what conditions does it pull low ?
<lekernel> Before the Mode pins are sampled, INIT_B is an
<lekernel> input that can be held Low to delay configuration.
<lekernel> After the Mode pins are sampled, INIT_B is an
<lekernel> open-drain active-Low output indicating whether
<lekernel> a CRC error occurred during configuration:
<lekernel> 0 = CRC error
<lekernel> 1 = No CRC error
<wpwrak_> aah !
<lekernel> also this doesn't say what happens while the configuration takes place. are there glitches on INIT_B?
<wpwrak_> what would happen if you connected INIT_B to PROGRAM_B_2 ?
<lekernel> I don't really want to know :) diode is safer, no?
<wpwrak_> ;-))
<lekernel> plus it'd hold the flash in reset
<lekernel> and I'm not sure we could later deassert INIT_B e.g. from JTAG
<lekernel> so not using the diode looks like murphy bait
<Vaati_> what chip are you discussing?  is it something in the milkymist itself ?
<lekernel> Vaati_, fpga, flash and reset ic
<Vaati_> lekernel: whats the manufacturer of the fpga
<Vaati_> ?
<lekernel> and yes, those are giving us inordinate amounts of trouble to get the milkymist devices to work reliably
<kristianpaul> xilinx
<Vaati_> ah
<kristianpaul> Vaati_: you can see http://en.qi-hardware.com/wiki/Milkymist_One_RC3_BOM
<wpwrak_> (JTAG) yeah, that's the big question. PROGRAM_B_2 = INIT_B looks vaguely useful for recovering from glitches causing a CRC error. but of course, if you then can't fix the flash via jtag, that would suck more than anything else.
<lekernel> wpwrak_, for rework, I think it should be easy to use a non-SMD diode between the output of the reset IC and the R157 pins
<lekernel> wpwrak_, I have read somewhere (it seems) the FPGA should already contain logic to retry configuration after failed CRCs
<lekernel> I don't really want to mess with it
<lekernel> wpwrak_, what would you think of keeping R60 and adding a capacitor between flash reset and ground?
<lekernel> or not keeping R60 and still having the capacitor, which would charge through the fpga
<lekernel> it should keep reset low during the very early stages of power up, before the reset IC takes over
<wpwrak_> (retry logic) very good !
<lekernel> but at the same time it should be small enough not to delay the flash too much, otherwise the fpga will attempt to read from it when it's not ready ...
<wpwrak_> (no mess with it) yeah, feels unsafe
<lekernel> but maybe that's overkill and causes more problems than it solves
<wpwrak_> let's check the diode .. at 25 C, Irev would be about 2.5 uA. at 100 C more like 2.5 mA. now, how to interpolate ?
<wpwrak_> the cap would also make the reset voltage crawl very slowly from a clean low to a clean high.
<lekernel> yeah, let's try without first
<lekernel> and without R60
<roh>
<lekernel> roh, ?
<wolfspraul> so we have a new plan?
<wpwrak_> hmm, for Irev, we'd mainly work against R30 = 10 kOhm. to stay on the safe side of Vih(min), we shouldn't drop more than 100 uA
<wolfspraul> I am curious about one thing - why did we not notice this in our rc2 tests? I mean the need for those additional improvements.
<wpwrak_> that's 40 x the T = 25 C and 1/25 x the T = 100 C value. tricky.
<wpwrak_> goes looking for a Irev vs. T curve
<lekernel> wolfspraul, I guess because a consequence of Murphy's law was that the flash we used for the test was fast to come out of reset and the FPGA was slow to begin reading from it
<wolfspraul> ok
<wolfspraul> just want to make sure we have at least a theory for everything we see :-)
<lekernel> or... did we use the exact same reset IC?
<lekernel> or something with less delay?
<wolfspraul> I don't want to do too much history digging, I just asked to make sure our new theories are still in line with old discoveries.
<wolfspraul> but it seems you are not worried about that, so whatever we saw in rc2 is still in sync with the new realizations
<wolfspraul> that's good then!
<wolfspraul> so reset ic stays, D16 stays, and now a few more things on top
<wpwrak_> (T vs. Irev) we're probably good up to 75 C (according to The Circuit Designer's Companion, giving 1N4148 characteristics)
<lekernel> 1N4148 is a PN pure silicon junction, does such a curve also apply to a Schottky junction?
<wpwrak_> lekernel: i checked the schottky section and he didn't warn of any perversions there
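[Editor's sketch, not part of the log] The "how to interpolate ?" question can be illustrated numerically. The sketch below uses only the figures quoted in the discussion (2.5 uA at 25 C, 2.5 mA at 100 C, R30 = 10 kOhm, a 100 uA budget) and assumes, as a simplification, that log10(Irev) rises linearly with temperature between the two quoted points. That model is an assumption, not the curve from the book wpwrak_ consulted; it puts the 100 uA crossing at roughly 65 C, somewhat more pessimistic than the ~75 C read off the published curve.

```python
import math

# Figures quoted in the discussion for the D16 diode and its pull-up.
I_REV_25C = 2.5e-6   # A, reverse current at 25 C
I_REV_100C = 2.5e-3  # A, reverse current at 100 C
R30 = 10e3           # ohm, pull-up the leakage works against
I_BUDGET = 100e-6    # A, leakage limit quoted for staying above Vih(min)

# Assumed model: log10(Irev) is linear in temperature between the two points.
DECADES_PER_DEG = math.log10(I_REV_100C / I_REV_25C) / (100.0 - 25.0)

def irev_at(temp_c):
    """Interpolated reverse current (A) at temp_c under the assumed model."""
    return I_REV_25C * 10 ** (DECADES_PER_DEG * (temp_c - 25.0))

def temp_for(current_a):
    """Temperature (C) at which the interpolated Irev reaches current_a."""
    return 25.0 + math.log10(current_a / I_REV_25C) / DECADES_PER_DEG

def pullup_drop(current_a, resistance_ohm=R30):
    """Voltage (V) lost across the pull-up when current_a flows through it."""
    return current_a * resistance_ohm

if __name__ == "__main__":
    print("drop across R30 at the budget:", pullup_drop(I_BUDGET), "V")
    print("temperature where Irev hits the budget:", temp_for(I_BUDGET), "C")
```

The 100 uA budget corresponds to a 1 V drop across the 10 kOhm pull-up, which is where the "40 x" and "1/25 x" ratios in the discussion come from.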
<wolfspraul> ok I try to summarize, see whether I followed the discussion correctly
<wolfspraul> 1) 10k pullup on init_b and program_b (instead of 4.7k now)
<wolfspraul> 2) add diode between output of reset_ic and init_b
<wolfspraul> 3) remove r60
<lekernel> (2) with negative terminal towards the reset IC
<wolfspraul> what about R30?
<wpwrak_> (list) that's what i have too
<lekernel> R30 is the pullup on PROGRAM_B, so you already mentioned what should be done to it
<wpwrak_> R30 needs to stay. else D16.Irev may cause FLASH_RESET_N to PROGRAM_B_2 contamination
<wolfspraul> ok
<lekernel> wolfspraul, ok for your 3 points. do you mail Adam and the list?
<wpwrak_> (stay) changed a little, to 10 kOhm. but not higher. and not removed.
<wolfspraul> Adam will check the backlog
<wpwrak_> better write a mail with the final verdict :)
<wolfspraul> lekernel: you mentioned earlier that it is frustrating and depressing we spend so much time on these little things
<wolfspraul> but in my experience it is totally normal, not much at all, and not a hopeless case or anything
<wolfspraul> I'm not making gloom or doom predictions, but these things pop up, and they need to be addressed. it should not be frustrating.
<wpwrak_> plus, i wouldn't call ~1 day "so much time" ;-)
<wolfspraul> it's a complex board and nearly impossible to get the hundreds (maybe thousands) of details right immediately
<wolfspraul> no not at all
<wpwrak_> wolfspraul: we've seen worse, haven't we ? (-:C
<wolfspraul> I'm just speaking from my experience and comparing with successful and failed projects.
<wolfspraul> rc1 set the bar very high, it was an _excellent_ first shot
<wolfspraul> from there on it continued very well, with rc2 (I believe 0 regressions from rc1), now rc3 (again so far it seems no regressions, and many improvements)
<wpwrak_> (complex board) indeed. it's a little PC. i also like that the resolution makes sense. it's not just a shot in the dark.
<wolfspraul> I still do expect some more issues to pop up on rc3, I have to say
<wolfspraul> I'm not saying this to annoy people or to be the wise guy doing nothing.
<wolfspraul> it's just realistic to expect that, next week when we test all 90 boards
<wolfspraul> there will be something
<wolfspraul> :-)
<wolfspraul> so... with the latest round of very good ideas for the bootup problem, let's see what Adam reports tomorrow
<wpwrak_> have things like DMX and MIDI seen any major testing yet ?
<wolfspraul> latest round
<wolfspraul> 'major' is hard to define
<wolfspraul> Adam did a number of electrical tests, got some very long cables, got loopback cables, etc.
<wolfspraul> Sebastien has been using DMX for performances
<wolfspraul> 'major' as in hundreds of people having used it with hundreds of devices - no
<wpwrak_> (performances) okay, very good
<wolfspraul> 'major' as in we tried to do a good job internally - yes
<wolfspraul> so yes, there could be surprises in midi and dmx, but right now I'd say what we have is not bad
<wolfspraul> we need more customer feedback before committing resources on tracking something specific down
<wpwrak_> one customer issue you'll likely hit is MIDI-over-USB. this seems to be quite popular these days, also with people attaching things to their iGadgets.
<wolfspraul> don't mention USB in the presence of Sebastien
<wpwrak_> ;-)
<wolfspraul> realistically those things have to wait
<wolfspraul> I'm just being realistic
<wolfspraul> I will throw myself behind marketing and selling the product on what it can do today, its strengths
<wolfspraul> and then we invest every penny back to make it better
<wolfspraul> so realistically - do not expect midi over usb to work in the next few months
<wolfspraul> Sebastien will correct me if I'm wrong...
<wpwrak_> so what's the plan when those things come up ? just an excuse ? an ETA for a solution ? a work-around ?
<wolfspraul> that's why I keep jtag-serial in every unit, I am hoping to attract some serious new contributors, at least I will try
<wolfspraul> we are collecting feature challenges here http://en.qi-hardware.com/wiki/Milkymist_One_marketing#Feature_Challenges
<wolfspraul> I can add midi over usb :-)
<wpwrak_> (plan) ah, or hope for someone else to come up with an answer :)
<wolfspraul> no it's fine, I think we need to communicate effectively
<roh> has a plan. barbecue. (actually i am getting pulled away to one. bbl)
<wolfspraul> there is no point in getting stuck on things that don't work
<wolfspraul> so I'm very frank in the "does not work list"
<wolfspraul> instead, I want to focus on what works
<wolfspraul> roh: enjoy barbecue!
<wpwrak_> i think something along the lines of "we don't support MIDI-over-USB yet, but you could use the following low-cost/widely-available true MIDI keyboard/whatever ..." would help
<wpwrak_> at least it would remove a blocker for people who are serious
<wolfspraul> refresh the wiki :-)
<wpwrak_> *grin*
<wolfspraul> my main concern on a run like rc3 are regressions
<wolfspraul> and it seems so far we have zero, which is how it should be but which is also great
<wolfspraul> I'm not so worried whether all our improvements are a hit, or whether we discover new problems
<wolfspraul> this is my success gauge
<wolfspraul> when a project starts to accumulate regressions, then it's really serious
<wpwrak_> ;-)
<wpwrak_> yeah. that's a sign of loss of control.
<wolfspraul> at least then I may quickly not know how to continue, because obviously the foundations of the engineering work are flawed somehow
<wolfspraul> wpwrak_: thanks a lot for lending us a helping hand on the bootup problems! very appreciated!
<wolfspraul> I'm anxious to see Adam's new reports tomorrow :-)
<wpwrak_> you're welcome. always fun to stick my nose into something tricky :)
<wolfspraul> unfortunately Adam cannot follow at this speed on understanding the thought process and reasoning for the changes, but that's OK
<wolfspraul> so he will just try it all tomorrow and give us new input data points...
<wolfspraul> we are a team after all, so the most important for me again is that he is in an excellent position to do a good testing job for the 89 boards that will soon flood his apartment
<wolfspraul> that's going to be a mess :-)
<wpwrak_> let's hope the inputs are all along the lines of "it works now" :) else, it's back to the drawing board
<wpwrak_> (89 boards) yeah, i don't envy him ;-)
<wolfspraul> Jon is in Taipei soon, and I suggested to Adam I send him for help
<wolfspraul> ha!
<wolfspraul> that was kindly rejected :-)
<wpwrak_> and tuxbrain can probably share the sentiment :)
<wpwrak_> hehe ;-)))
<wolfspraul> which I can understand
<wolfspraul> I wouldn't want to send myself for help either
<wpwrak_> who rejected ? adam or rejon ?
<wolfspraul> at 40 it was still OK, but with 90 boards that's going to be quite some stress
<wolfspraul> no Adam
<wolfspraul> because of course Jon will not be a great help
<wolfspraul> just sitting around piles of stuff. then Adam has two problems. the boards & Jon.
<wpwrak_> yeah. by the time he'd be properly trained, most of the boards would be tested already
<wolfspraul> 90 is really tough already. if this is all successful and sells well, and the next run is 160, then we may have to rethink his home office setup.
<wpwrak_> he knows his M1 reasonably well, though. also managed to solve all his flashing issues at fisl. all he really needed from me was my mouse ;-)
<wolfspraul> but step by step, now is this run of 90/80 first
<wolfspraul> I don't know exactly when Jon arrives in Taipei
<wolfspraul> if it is after Adam has most of the chaos under control, the visit may still make sense
<wolfspraul> it depends on how many reworks are needed, and how the tests are going
<wolfspraul> the problem is not to think through the process if everything goes smooth
<wpwrak_> he could finally get that L19 (?) rework, too
<wolfspraul> the problem is to think through the process if there are massive problems with the first 20 boards
<wolfspraul> :-)
<wolfspraul> and different problems, so there is one pile here, one pile there, etc.
<wolfspraul> :-)
<wolfspraul> so when things go bad, that's when you know whether your testing setup was robust or not
<wolfspraul> I spare everyone the stories from the famous Openmoko production lines...
<wolfspraul> :-)
<wpwrak_> does adam live alone ? or will he have to declare some restricted areas ? :)
<wolfspraul> don't know exactly, last time he had a sub-tenant renting a room, his apartment is quite big actually
<wolfspraul> there are enough options
<wolfspraul> let's hope things go smooth
<wolfspraul> I'm sure Adam hopes so too :-)
<wpwrak_> oh, openmoko production. so much fun. first, the suspense ... when will they produce ? have they already ? who knows the results ?
<wolfspraul> otherwise his apartment will quickly turn into a moon landscape
<wpwrak_> then the sherlock-holmes-like discovery of what exactly happened
<wolfspraul> this rework so far still sounds manageable, let's hope it works
<wolfspraul> I mean the new diode & pullup
<wolfspraul> if he can completely verify it tomorrow he may even tell the factory to do it on the other 89 boards
<wpwrak_> then the chaotic struggle with somehow patching up the bugs. some with sequels and sequels of sequels. remember how long it took to finally find out what LED and transistor configuration our freerunners have ? was it half a year ? a year ? longer ? :)
<wolfspraul> so that's an important thing actually
<wolfspraul> well the project was out of control
<wolfspraul> that's what I try to avoid here
<wolfspraul> once you are in that situation you can only pray that one day you will be back in control
<wolfspraul> not fun
<wolfspraul> so, back to us. in a perfect world adam can confirm the final solution tomorrow, and enlist the smt factory's help in the rework on the other 89 boards.
<wolfspraul> that would be ideal
<wolfspraul> if we are not that fast, the boards go back to his place and then most likely he will manually rework whatever we eventually come up with
<wolfspraul> unless the rework is so difficult that it would better be done at the factory, which means ship back and forth, etc.
<wpwrak_> (diode & pullup) so far, everything very civilized, yes. we had one wrong try, but then sebastien found INIT_B, and we now have a good consistent theory for the next try. and all fairly quickly and with swift, useful feedback. not days of puzzlement.
<wolfspraul> those kids at the factory are also unbeatably fast and precise, of course. another thing to consider.
<wolfspraul> so getting this settled tomorrow would be awesome
<wpwrak_> dunno how hard it is to find a diode. if adam has some of the D16, an option would be to just add that, plus a wire
<wpwrak_> else, he needs to find some suitable replacement on short notice (i.e., go to the electronics mall and just buy a few diodes). shouldn't be horribly difficult, but needs review to make sure they don't add funny surprises.
<wolfspraul> sure. Taipei is not Shenzhen, but it should be no problem.
<wpwrak_> (funny surprises) like that monster we had in GTA02. i think you were already there when that happened, weren't you ? the XXXL diode for the USB reset.
<wolfspraul> don't remember
<wpwrak_> (taipei) i think i could even find the shops ;-)
<wpwrak_> wolfspraul: (gta02 diode) it was a similar circuit to the one we have in M1: the system reset (shared by various components) was fed with a diode to the USB side, to make sure the pull-up was disabled while the CPU was in reset.
<wpwrak_> wolfspraul: now, the diode was something to behold. it was also a schottky. but designed for high-current use. i think it could handle something like 10 A. what we really needed was maybe 1 mA or even less. that power diode was HUGE. plus, it has a completely insane reverse current. several mA even under favourable conditions.
<wpwrak_> wolfspraul: so what happened was that the system reset was pulled down via the reverse current going through this diode and the board never made it out of reset. that is, unless you remove the monster.
<wpwrak_> wolfspraul: so a lot of finicky rework was done. this story has a sequel. we then redesigned the whole mess. i proposed a solution with a pair of 74xxx1G gates.
<wpwrak_> wolfspraul: that solution worked beautifully. shortly thereafter, i moved to HXD8. there, we found the need for something similar (not entirely sure if it was really necessary or whether we just thought it was). so we just copied the solution from GTA02, which we already knew was good. so far, so good.
<wpwrak_> wolfspraul: then the day of making our first prototype run after the death of fiwin and the repatriation of HXD8 to FIC+Openmoko approached. a few days before, i thought i'd ask our hw team for the BOM. not quite sure why ... maybe i needed some information, or maybe i just thought i hadn't heard enough mentioning of the BOM.
<kristianpaul> ah, Dash Express, now i see from where some ideas came from :-)
<wpwrak_> wolfspraul: the answer was "just a moment". then the team developed a level of activity reminiscent of a termite hill being peed upon.
<wpwrak_> wolfspraul: after politely inquiring after a while how things were going, i got the truth: so far, no BOM had existed. but there were bits and pieces scattered all over the place. much later that night, i got a first draft of a consolidated bom.
<wpwrak_> wolfspraul: i think i also asked then whether we had the parts :) well, to make a long story short, soon thereafter, the help of FIC sourcing was enlisted, to get us the things we didn't have yet
<wolfspraul> now we know what a well run organization we have today...
<wpwrak_> wolfspraul: we then had two parts of the BOM - the one where we had the parts or were certain we'd have them, and the one with the parts that were still unknown
<wpwrak_> wolfspraul: then, i think less than a week before SMT, the bomb dropped.
<wpwrak_> wolfspraul: there were several components with a lead time measured in weeks if not months. they included some filters, the LCMs, and ... those 74xxx1G chips.
<wpwrak_> wolfspraul: now, back when i had designed that reset circuit, rookie that i was, i had used digi-key stock as my guide for what are "safe" components (in terms of sourcing)
<wpwrak_> wolfspraul: so i found it a little surprising that these things would all of a sudden be so terribly hard to get. had i made a grave mistake ?
<wpwrak_> wolfspraul: well, i went back to my cubicle, chased the bugs away (FIC HQ was crawling with vermin), and checked at digi-key. lo and behold, they had thousands of these chips in stock. i also found some of the other "impossible" items, or very similar replacements.
<wolfspraul> you can discard that sourcing input (as you know by now)
<wpwrak_> wolfspraul: in the end, i got sean's credit card and did a bit of shopping. two days later we had those items with a supposed lead time of months. we also got the lcms. from mouser, at the cost of a small car. all on liane's credit card :)
<wolfspraul> it was more due to incompetencies within the company and parent company, not a real outside problem
<wolfspraul> digikey is a good guide, first of all
<wolfspraul> but sourcing is a long story, as you know modules are very tough, some things like LCM are tough, rf baseband chips, etc. etc.
<wolfspraul> I feel pretty good now about the simple 'availability' part of sourcing, I am more worried about iqc nowadays (incoming quality control)
<wpwrak_> wolfspraul: yeah, eventually i found out what had happened: FIC sourcing had been specifically ordered not to go to digi-key, because they were too expensive, but to try to get things from the official distributors, if possible as free samples
<wpwrak_> wolfspraul: that little stunt with bypassing FIC sourcing apparently caused some bad blood inside FIC :) so we weren't supposed to repeat this
<wpwrak_> wolfspraul: but sourcing also got a bit more agile afterwards :)
<wolfspraul> calling it a day, 3 am here
<wolfspraul> I'm getting old, I guess :-)
<wpwrak_> wolfspraul: (lcm) that was the Sony PSP display. so there were actually lots of them at distributors. i don't quite remember what had happened to our supply there - either we forgot to order or the supplier had let us down. hence the (super-expensive) order from mouser
<wpwrak_> wolfspraul: untroubled dreams then ! ;-)
<wolfspraul> I will digest on diodes and forward voltages
<wolfspraul> n8
<kristianpaul> hey, this project has the same spartan6 and also a flash prom, ah
<kristianpaul> flash PROM for multiboot FPGA po
<kristianpaul> well, SPI flash..
<kristianpaul> afaik, i don't have altium to check schematics