X-Message-Number: 5558
Date: Wed, 10 Jan 1996 08:20:38 -0800 (PST)
From: Joseph Strout
Subject: Re: Data Storage

Again, many thanks for your comments on my archiving suggestion.

First, I'd like to respond to some comments from Edgar Swank (#5554):

> There is at least
> one commercial service that does this for $40 per CDROM, although that
> would not include data preparation.

This is a good start, but it is not quite enough. A service like this
might serve as a subcontractor for another organization, though. The
characteristics we need (in addition to just creating CD-ROMs or
whatever) are:

1. Safe storage: if we all keep our own CD-ROMs, they will probably get
   lost when we die (family members may not recognize their
   significance, etc.). Or our house could burn down, they might get
   stolen, etc. Better to have the company keep them in a safe -- or
   better yet, two safes, separated by a few thousand miles.

2. Long-term storage: Ideally, you could choose to (1) pay a monthly
   fee, or (2) pay a bigger up-front sum, for which they'll store your
   data "forever". (If the service is associated with a cryonics
   arrangement, of course, the latter would be implied and probably
   covered by other fees.)

3. Active maintenance: the biggest losses of data are not through media
   degradation, but through obsolescence -- we (as a society) forget
   how to read old formats. NASA has miles of tape that nothing can
   read anymore (though, as I recall, they're attempting to decipher &
   upgrade them). This can only be avoided through (1) storing a
   complete description of the format in the same safe as the media,
   or (2) copying to new media whenever the old becomes obsolete. An
   interesting article on this problem appeared in Scientific American
   recently: Rothenberg, J. Ensuring the longevity of digital
   documents. Jan. 1995, vol. 272, (no. 1):24-9.

4. Cryonics company involvement: I want my cryonics provider to at
   least know where that data is, and how they can get it when they
   need it.
   Better still would be for them to actually keep it for you, as CI
   does with some info now.

As you point out,

> OTOH, CDROM recorders for use with PC's now start in price around
> $1000, so anyone with a PC system already and especially if he already
> has a CDROM recorder for other purposes might want to assist on this.

...so rather than pay the above company forty bucks a disc, an alert
cryonics provider might just invest in their own recorder.

Then in message #5556, Perry Metzger writes:

> 1) Once you get to a small enough level, "physical" and "chemical" are
> the same thing.

Yes, but the CD-ROM pits aren't on a level that small. I don't have the
figures, but I have the impression that they're at least a few microns
wide and deep, which is MUCH bigger than a molecular scale. Can anyone
give us the exact numbers?

> 2) CD-ROMs ... decay very nicely. Among other
> things, degradation of the plastic and glues that surround the
> pitted metal surface occur, as well as fun things like
> photodegradation of the surface itself.

Photodegradation is greatly reduced by keeping them in the dark. =)
(This is a problem to which film is prone too, I might point out.)

> Some early CDs sold at the beginning of the CD era have
> become useless because of glues decaying or opaquing.

I think you're assuming that the data is gone when you can't put the
disc in your home CD reader and read it. In fact, however, the data is
almost certainly there; you'd just need a microscope to see it, and more
sophisticated automation to read it. It would take a long time for the
pits to disappear completely, I think.

> 3) Properly developed and fixed b&w photographic negatives have a
> *demonstrated* lifetime of at least six or seven decades, and in
> some cases a century or more.
> 4) I have, in the past, routinely used thirty and forty year old
> microfiche with no noticeable decay other than that from use.

This is not comparable, because you're not attempting to read this data
digitally.
If you did, they would not have demonstrated nearly this lifetime.
Instead, you look at them visually, as *analog* image data, which your
brain (being very good at such things) can interpret despite a great
deal of fading and noise. For digital data, fading and noise mean "loss"
to an ordinary reader. So you're not judging the two by the same
criteria at all.

Don't get me wrong: microfilm is not a bad idea either. But I'd be more
comfortable with digital storage, since we can expect it to be lossless
(even after repeated copying), which is not true for the analog storage
(esp. images) normally used on film. Also, there is no standard format
for storing sounds on film.

> The whole question is always "how much data do you want to store, and
> how much are you willing to let it degrade".

Add, "How much are you willing to pay." This is why I didn't suggest
chiseled stone or engraved metal, though I share your respect for these
media.

> I recommend going back to the future. Specially
> printed text in special OCR fonts on low acid paper. Special bar codes
> that are REALLY BIG and trivial to write scanners for.

I like this idea... in fact, rather than bar code, one could develop a
robust and simple dot code which would encode any binary data equally
well. (Indeed, such a system was in use once; computer magazines printed
these dot patterns in the sidebar, and if you had a reader, you could
read the code directly into your computer.) As long as the format was
clearly defined and relatively carved in stone, this would do.

> Unfortunately, it seems that people want to store lots of "bulky"
> information -- video, pictures, vast and bulky records, etc.

Yes, I think this is important. At 72 dots per inch (a resolution kept
this low to reduce the effect of degradation), we fit about 50K on a
page. It would take at least a couple of file drawers to hold all my
archival data this way.

> Magnetic media have a known bad track record on this. Mag tapes
> recorded in the 1950s have in many cases decayed to uselessness.
> This is a well known problem. Paper isn't dense enough.

Agreed. Magnetic changes are even more volatile than chemical ones.

> 1) Film, microfilm, and microfiche, and you accept the problems of
> analog media.

Agreed.

> 2) Put the stuff on multiple redundant machine readable storage media,
> use fiendish and expensive error correcting codes that would not
> normally be used, and read and re-record the information onto the
> most survivable known archival media every couple of years.

Yes, but I don't think this is as hard as you make it sound. A simple
checksum per block of data would suffice to detect errors. Keep two
copies of the data, and when an error is detected (due to a checksum
mismatch), make a fresh copy from the good one. Change your media type
every ten years or so, when a new format becomes standard.

> Who knows if anyone will have good enough data to reconstruct the
> recording formats on other types of media, anyway...

That's why you need to define the format explicitly, and include all the
technical detail needed to build a reader. (And how would you store this
information? On high-quality paper!) Of course, if we actively maintain
the data, this won't be necessary.

,------------------------------------------------------------------.
|    Joseph J. Strout           Department of Neuroscience, UCSD   |
|                http://www-acs.ucsd.edu/~jstrout/                 |
`------------------------------------------------------------------'
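
Addendum: the checksum-and-two-copies scheme described above (one
checksum per block, two copies, rebuild from whichever copy is still
good) can be sketched roughly as follows. This is only an illustration
under stated assumptions: the 4096-byte block size, the choice of CRC-32
as the "simple checksum", and the function names split_blocks,
block_checksums and repair are all hypothetical, not part of any
existing archiving service or format.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size in bytes; any fixed size would do


def split_blocks(data, size=BLOCK_SIZE):
    """Cut the archive into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_checksums(blocks):
    """One CRC-32 per block, computed when the archive is first written."""
    return [zlib.crc32(block) for block in blocks]


def repair(copy_a, copy_b, checksums):
    """Build a fresh copy, taking each block from a copy that still
    matches its stored checksum.

    Raises ValueError if the inputs disagree in length or if some block
    is damaged in both copies (a simple checksum can detect that case
    but cannot fix it).
    """
    if not (len(copy_a) == len(copy_b) == len(checksums)):
        raise ValueError("copies and checksum list differ in length")
    fresh = []
    for index, (a, b, expected) in enumerate(zip(copy_a, copy_b, checksums)):
        if zlib.crc32(a) == expected:
            fresh.append(a)
        elif zlib.crc32(b) == expected:
            fresh.append(b)
        else:
            raise ValueError(f"block {index} is damaged in both copies")
    return fresh


if __name__ == "__main__":
    original = bytes(range(256)) * 64          # 16384 bytes, four blocks
    blocks = split_blocks(original)
    sums = block_checksums(blocks)

    copy_a, copy_b = list(blocks), list(blocks)
    copy_a[1] = b"\x00" * BLOCK_SIZE           # simulate damage to copy A
    copy_b[3] = copy_b[3][:-1] + b"\x00"       # and different damage to copy B

    restored = b"".join(repair(copy_a, copy_b, sums))
    print("restored intact:", restored == original)
```

As long as the two copies are not damaged in the same block, the fresh
copy comes out identical to the original; if both copies of a block have
gone bad, the sketch stops and says so rather than guessing.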