http://www.theregister.co.uk/2009/04/14/nasa_reboot_over_easter_weekend/
Moderator adding: the JPL press release is http://marsrovers.jpl.nasa.gov/newsroom/pressreleases/20090413a.html.
One hopes that Spirit isn't seeing some wear and tear from the computer doing very many reboots at the start of its mission. Didn't Spirit just have some computer issues before the latest software upload?
And how does this affect Oppy? Does it stand down to see if a common software bug could affect it?
Brian
> One hopes that Spirit isn't seeing some wear and tear from the computer doing very many reboots at the start of its mission.
Huh? I can't see how reboots could cause wear and tear on the computer. If anything it's the opposite: wear and tear on the computer causing reboots.
> Didn't Spirit just have some computer issues before the latest software upload?
See here: http://marsrovers.jpl.nasa.gov/mission/status_spiritAll.html#sol1797
Edited:
> And how does this affect Oppy? Does it stand down to see if a common software bug could affect it?
Just checked today's imaging plan for Opportunity and it has all signs of a driving sol.
I don't think the early boots would have done permanent damage to the computer. Brian, shouldn't you be looking on the bright side of life?
"Brian, shouldn't you be looking on the bright side of life? "
Good one, Ted
Phil
I don't think reboots should affect much, but Flash memory does degrade with use. It takes a while, but we are running into fairly large data volumes over the lifetime of the rovers. I'm pretty sure that the type of Flash memory used in the MERs is good for around 100k write cycles per cell, but five years with a few tens of GB of data throughput in the relatively harsh environment of the Martian surface might be enough to start causing more frequent transient errors if there were any significant "hotspot" on the Flash drive getting a lot more write activity than average. However, I suspect that if this were the root cause, Opportunity would be more likely to exhibit the problem, as I'm pretty sure she has delivered more data. And given the use of deep sleep mode, any wear related to the boot process should also hit Opportunity sooner than Spirit, since the former has made much more use of it than Spirit, IIRC.
Here's hoping it was just some freak occurrence of cosmic ray hits.
If the program is merely reading from those cells, it's not an issue. Only writing wears them. So you could use flash as instruction memory that you might update a few times in a mission, and you can use it as a storage repository for photos. Even if you filled the flash every sol, we're not at 2000 yet. What you cannot use it as is RAM: a scratchpad for doing calculations.
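To put rough numbers on the write-cycle argument, here is a back-of-envelope sketch in Python. It assumes the ~100k-cycle endurance figure quoted earlier in the thread (not a confirmed MER specification) and treats wear as a simple count of writes per cell; the function names are made up for illustration.

```python
# Endurance figure quoted in the thread; an assumption, not a confirmed MER spec.
RATED_CYCLES = 100_000


def fraction_of_endurance_used(sols_elapsed, writes_per_cell_per_sol,
                               rated_cycles=RATED_CYCLES):
    """Share of the rated write cycles a cell has consumed so far."""
    return sols_elapsed * writes_per_cell_per_sol / rated_cycles


def sols_until_rated_limit(writes_per_cell_per_sol, rated_cycles=RATED_CYCLES):
    """Sols before a cell reaches its rated write-cycle count."""
    return rated_cycles / writes_per_cell_per_sol


# Whole flash rewritten once per sol, evenly spread, after roughly 2000 sols.
print(fraction_of_endurance_used(2000, 1))

# A hypothetical "hotspot" rewritten 50 times per sol.
print(sols_until_rated_limit(50))
```

With even wear, about 2000 sols of one full rewrite per sol uses only a small fraction of the rated cycles. A block rewritten dozens of times per sol, on the other hand, would reach the rated figure within the mission's lifetime so far, which is why the hotspot scenario is the one worth worrying about.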
Yes, the ECC is there (if it's there) to correct errors in the memory word. For 128 bits, you might write 16 extra syndrome bits that algorithmically would allow you to correct a single bit in the 128 that is wrong. To my knowledge, the ECC isn't there to correct the hard errors that come with exceeding write cycling; it's there to correct for errors that just happen on occasion, in fantastically mind-boggling, flash-specific ways. But it would help cover up hard errors.
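For anyone curious how a handful of extra bits can locate and fix a flipped bit, here is a minimal sketch of a textbook Hamming code in Python. This is only an illustration of the general idea; nothing in the thread says which ECC scheme, if any, the rovers' flash actually uses, and the function names are invented here. In this textbook construction, 128 data bits need 8 check bits for single-bit correction, so a figure like 16 would correspond to some other layout or a stronger code.

```python
def min_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def hamming_encode(data):
    """Encode a list of 0/1 data bits; check bits sit at power-of-two positions."""
    r = min_check_bits(len(data))
    n = len(data) + r
    code = [0] * (n + 1)  # index 0 unused so positions are 1-based
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):  # not a power of two, so it holds a data bit
            code[pos] = next(bits)
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]
        code[p] = parity  # makes the parity of this group even
    return code[1:]


def hamming_correct(codeword):
    """Return (corrected codeword, 1-based error position or 0 if clean)."""
    syndrome = 0
    for pos, bit in enumerate(codeword, start=1):
        if bit:
            syndrome ^= pos  # XOR of the positions of all set bits
    fixed = list(codeword)
    if 0 < syndrome <= len(fixed):
        fixed[syndrome - 1] ^= 1  # flip the bit the syndrome points at
    return fixed, syndrome


# Usage: protect a 128-bit word, flip one bit, and recover it.
word = [1, 0, 1, 1, 0, 0, 1, 0] * 16
stored = hamming_encode(word)
corrupted = list(stored)
corrupted[57] ^= 1  # simulate a single-bit upset at position 58
repaired, where = hamming_correct(corrupted)
print(min_check_bits(128), where, repaired == stored)
```

The syndrome is just the XOR of the positions of all set bits: it comes out as zero for a clean word and as the position of the bad bit after a single flip. Two flipped bits would fool this basic version, which is why practical schemes add at least one more parity bit to detect double errors.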
To guarantee 100K cycles, you have to bear in mind that, yes, you might be making this guarantee for over 16 billion cells on a 16Gb chip. So if your guarantee for your typical statistical cell meets that to even 10 sigma--or whatever one in a billion cells not meeting the spec would mean--you're still going to get fails on that chip. What they do to spec 100K would be a combination of test (throw out entire bad chips), redundant cells and repair (find the bad bits and fix them... how you find suspect bits without destroying a chip-- top secret), and the aforementioned ECC if your process engineers can't totally solve this particular problem. And yeah, you might still get a cell in an iPod somewhere that goes bad before its time, but the stats guys are trying really hard to ensure that that is extremely rare by eliminating the tail of the distribution. My point was that the actual center of the distribution is still going to be somewhere far far above 100K to make this guarantee.
Just delivering a memory chip that works from Day 0 is a similar game of stats... even if your process engineers deliver a process where only one in a million cells is failing a spec, every single 1Gb chip would have on average 1000 bad cells! So after manufacturing, there is a lot of test to be done to fix things and eliminate those fliers. At the same time, there are 900 million cells that greatly exceed the spec.
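The arithmetic behind that "1000 bad cells" figure is easy to check. The sketch below uses the numbers from the post (one-in-a-million failure rate, 1Gb taken as 10^9 cells) and assumes cells fail independently, which is a simplification; the function names are made up for illustration.

```python
import math


def expected_bad_cells(cells, p_fail):
    """Mean number of out-of-spec cells if each fails independently."""
    return cells * p_fail


def prob_defect_free(cells, p_fail):
    """Chance that every cell on the chip is in spec: (1 - p)^n."""
    # log1p keeps precision when p_fail is tiny.
    return math.exp(cells * math.log1p(-p_fail))


cells_per_chip = 10 ** 9  # "1Gb" treated as a billion cells
p_fail = 1e-6             # one cell in a million out of spec

print(expected_bad_cells(cells_per_chip, p_fail))
print(prob_defect_free(cells_per_chip, p_fail))
```

The second number is so small that it underflows to zero in floating point, which is the point of the post: essentially no chip comes out of the fab perfect, so testing, redundancy and repair are what make a shippable part.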
A few details on the Spirit anomalies in http://marsrovers.jpl.nasa.gov/mission/status_spiritAll.html#sol1872
Guarantee is not a term you'd use for a Martian rover. It's a business term. The chips would probably be rated with something like a mean time between failures, or a mean time to failure. Age can be a factor, since almost all mechanical failures are from thermal cycles. And of course you have random failures that are as likely on day one as on day five thousand.
I'd guess they could map out physical bit failures in memory, but don't really know if that was included.
Travi, thanks. Had to read that twice before I got it, but makes sense.
Speaking of reboots - the raw image pipeline just flushed
...still alive then?
No commanded remote sensing, but downlinking of older data is occurring on the PCDB
Well, to look on the bright side: it looks like these kinds of delays mean the rovers have time to send back plenty of old navigation images.
Speaking of delays....
http://www.theonion.com/content/news/nasa_embarks_on_epic_delay?utm_source=a-section
The news is... http://www.jpl.nasa.gov/news/news.cfm?release=2009-071 (Via http://twitter.com/jetlab)
Hmm indeed. Hope we're not heading into the land of complex/unusual failure modes that systems advanced in years too often enter. (The last years of both the F-4 & the C-141 were often quite bizarre in this regard...)
Emily posted updated information in the Planetary Society blog: http://www.planetary.org/blog/article/00001916/
Speaking of failing brains, I read that 3 times before I parsed anything but "Emily Post updated the information..."