Data Degradation, The Silent Killer of Your Data

[Screenshot: Silent Corruption Detector]

Data rot is the scourge of modern-day computing. When I think of the ~2 million files on my iMac, from photos and videos to school reports dating back to the 90s, it’s disturbing to think that those virtual memories could become corrupt at any time without any warning.

What’s a moderately concerned and bored person to do, when they’re not so concerned or bored as to take on the challenge of grafting a third-party filesystem, and its unknown bugs, onto OS X? They hack together a poor man’s version, of course! Thus, I present to you my hack, Silent Corruption Detector. This is a simple little Ruby script that generates a hash of each of your files and stores them in a SQLite database. Check out the screenshot above for an example of what the output looks like and how to run the script.
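For illustration, here is a minimal Python sketch of the same idea. It is not the actual Ruby script. The choice of SHA-256, the `files` table layout, and the names `hash_file` and `record_hashes` are all assumptions made for the example.

```python
import hashlib
import os
import sqlite3
import tempfile


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_hashes(root, db_path):
    """Walk `root` and store one (path, mtime, sha256) row per file.

    The table layout is hypothetical, not the original script's schema.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, mtime REAL, sha256 TEXT)"
    )
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                (path, os.path.getmtime(path), hash_file(path)),
            )
    conn.commit()
    return conn


if __name__ == "__main__":
    # Demo on a throwaway directory with an in-memory database.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "report.txt"), "wb") as handle:
            handle.write(b"school report from the 90s")
        conn = record_hashes(root, ":memory:")
        for row in conn.execute("SELECT path, sha256 FROM files"):
            print(row)
        conn.close()
```

Reading in chunks keeps memory use flat even for large video files. In real use the database file should live outside the tree being scanned, otherwise it gets hashed along with everything else.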

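The basic principle is that I run this script about once a week, which is well within my backup rotation schedule. A file whose hash no longer matches the stored one, without my having changed it, is a candidate for corruption. That gives me time to recover any silently corrupted data from a good backup before it’s rotated.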
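One way such a weekly check could separate silent corruption from ordinary edits is to flag files whose hash changed while their modification time did not. That heuristic is an assumption of this Python sketch, not something the original script is confirmed to do, and the names `snapshot` and `suspicious_files` are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a (small) file."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def snapshot(root):
    """Map every file path under `root` to (mtime in ns, sha256)."""
    result = {}
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            result[path] = (os.stat(path).st_mtime_ns, sha256_of(path))
    return result


def suspicious_files(previous, current):
    """Paths whose content hash changed although the mtime did not."""
    return sorted(
        path
        for path, (mtime, digest) in current.items()
        if path in previous
        and previous[path][0] == mtime
        and previous[path][1] != digest
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        photo = os.path.join(root, "photo.raw")
        notes = os.path.join(root, "notes.txt")
        for path in (photo, notes):
            with open(path, "wb") as handle:
                handle.write(b"original contents")
        before = snapshot(root)

        # Simulate bit rot: flip one bit, then put the old timestamps back.
        stat = os.stat(photo)
        with open(photo, "rb") as handle:
            data = bytearray(handle.read())
        data[0] ^= 0x01
        with open(photo, "wb") as handle:
            handle.write(data)
        os.utime(photo, ns=(stat.st_atime_ns, stat.st_mtime_ns))

        # Simulate a normal edit: both the content and the mtime change.
        with open(notes, "wb") as handle:
            handle.write(b"edited on purpose")
        stat = os.stat(notes)
        os.utime(notes, ns=(stat.st_atime_ns, before[notes][0] + 10_000_000_000))

        # Only photo.raw should be reported.
        print(suspicious_files(before, snapshot(root)))
```

The heuristic is not airtight, since a program can rewrite a file and preserve its timestamp, but it keeps the weekly report short enough to review by hand.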

The odds of any given bit flipping are almost unbelievably small, much like a lottery whose odds of winning are so low that it’s not worth entering. But unlike a lottery, where a single ticket is no great expense, a single bit flip can mean the permanent loss of data.

What’s worse is that the error may not be found until several months or, more likely, years have passed. By then, it will be too late to recover. What about Time Machine, you ask? If your setup is like mine, Time Machine is only able to keep a few months of history before it starts rotating. The window of opportunity to go back and retrieve an uncorrupted version is slim at best. The only saving grace would be the use of an eternal archive system such as bup combined with parity archives, where every change back to the beginning of time is tracked. But these are only band-aids on the symptoms of the problem.

That problem is that we don’t have perfect hardware yet. Our hard drives, Blu-rays, DVDs, and such are not infallible and everlasting. So, as engineers, we do the best that we can to work around those limitations. The ideal place to do this is at the filesystem layer, where it is an integral feature of the library housing our data. I say ideal place because the filesystem provides an extra layer of security on top of the checksumming that a hard drive controller should already be doing, and provides the protection without a complex, advanced RAID configuration with error correction.
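To make the filesystem-layer idea concrete, here is a toy Python sketch of per-block checksumming: every block is stored alongside a checksum, and every read verifies it. The 4096-byte block size, the use of CRC-32, and the function names are assumptions for illustration only. A real checksumming filesystem is far more involved, and this sketch only detects damage; it does not repair it.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def write_blocks(data):
    """Split `data` into blocks, each stored with its CRC-32 checksum."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [(block, zlib.crc32(block)) for block in blocks]


def read_blocks(stored):
    """Reassemble the data, refusing to return a block that fails its check."""
    out = bytearray()
    for index, (block, checksum) in enumerate(stored):
        if zlib.crc32(block) != checksum:
            raise IOError(f"checksum mismatch in block {index}")
        out += block
    return bytes(out)


if __name__ == "__main__":
    stored = write_blocks(b"family photo bytes " * 1000)
    assert read_blocks(stored).startswith(b"family photo")

    # Flip a single bit in the second block, as bit rot might.
    block, checksum = stored[1]
    stored[1] = (bytes([block[0] ^ 0x01]) + block[1:], checksum)
    try:
        read_blocks(stored)
    except IOError as error:
        print("detected:", error)
```

The point of doing this on every read is that corruption is caught the moment the data is touched, rather than a week later by a scan or years later by a person.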

What’s interesting is that there seems to be a general feeling of “eh, the odds are still pretty low,” and so there’s not a lot of push by the majority to move to next-generation filesystems.

2 comments
  • Michael Foley Dec 17, 2015

    I definitely see the value of all this. Has happened to several movies I have had…they play fine for 45 minutes and checkerboard. Not to mention these high capacity drives (1 TB+) like to fail, have high fly writes…heck who knows if the disk even wrote the data correctly to begin with. So many factors that come into play. My problem, having 10 TB of storage is how the hell do you back that crap up? I mean like besides having another 10 TB of storage or shell out crap loads for cloud storage. The best solution I have come across, at least on the cheap and easy is mirror drives…RAID 1/10 something along those lines but that doesn’t stop data rot. Now this script that you run is that something you wrote yourself or is there a software package that you recommend for checking these?

    • Jon Stacey Dec 18, 2015

      It’s a script I wrote myself and a rough hack at that. It was spawned from the Time Machine Integrity script which I’m glad I looked at because Time Machine was broken for at least two major OS X releases. How that made it through QA I don’t know, but it seems that the general consensus is that the quality of software that Apple’s churning out has been declining in the last couple of years.

      Ideally, this sort of hack script should not be needed, and the protection would be included in the filesystem. However, for those of us living on an Apple platform with HFS+, we’re not able to partake, at least easily.
