Generally when a Solid State Drive (SSD) fails, you cannot easily recover data from it. However, it depends on how the drive fails.
It is generally thought that mechanical Hard Disk Drives (HDDs) are more reliable in the long run for sustained reads and writes, because an SSD can only handle a limited number of writes. However, SSDs are more resistant to shock damage because they contain no moving parts.
How Does an SSD Fail or Wear Out?
To understand how an SSD can fail, you need to know how an SSD works. SSDs use NAND flash, a type of non-volatile memory that retains data by trapping electrons inside nano-scale memory cells. A process called tunneling moves electrons in and out of the cells, but the back-and-forth traffic erodes the physical structure of the cell, leading to breaches that can render it useless.
Electrons also get stuck in the cell wall, where their associated negative charges complicate the process of reading and writing data. This accumulation of stray electrons eventually compromises the cell’s ability to retain data reliably—and to access it quickly. Three-bit TLC NAND differentiates between more values within the cell’s possible voltage range, making it more sensitive to electron build-up than two-bit MLC NAND.
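To see why TLC is more sensitive than MLC, it helps to count how many distinct charge levels a cell has to tell apart. The sketch below is a simplification: it assumes the voltage window is normalized to 1.0 and divided evenly between levels, which real NAND does not do exactly. The function names are made up for illustration.

```python
def voltage_levels(bits_per_cell: int) -> int:
    """Number of distinct charge states a cell must distinguish."""
    return 2 ** bits_per_cell


def relative_margin(bits_per_cell: int) -> float:
    """Share of a normalized voltage window available to each level.

    Assumption: the window is split evenly, which is only an approximation.
    """
    return 1.0 / voltage_levels(bits_per_cell)


if __name__ == "__main__":
    for name, bits in (("MLC", 2), ("TLC", 3)):
        print(name, voltage_levels(bits), "levels,",
              f"{relative_margin(bits):.3f} of the window per level")
```

With half the margin per level, the same amount of stray charge is more likely to push a TLC cell's reading into a neighboring level.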
Even with wear-leveling algorithms spreading writes evenly across the flash, all cells will eventually fail or become unfit for duty. When that happens, they’re retired and replaced with flash allocated from the SSD’s overprovisioned area. This spare NAND ensures that the drive’s user-accessible capacity is unaffected by the war of attrition ravaging its cells.
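The interplay between wear leveling and the spare area can be sketched with a toy model. This is not how a real SSD controller works: it assumes every block has the same fixed endurance, every write is the same size, and the controller always writes to the least-worn block. The function and parameter names are hypothetical.

```python
import heapq


def simulate_writes(active_blocks, spare_blocks, endurance, total_writes):
    """Return (active blocks left, spare blocks left) after total_writes.

    Toy model: each write goes to the least-worn active block. A block is
    retired once it reaches `endurance` writes and is replaced from the
    spare pool while spares remain.
    """
    wear = [0] * active_blocks      # min-heap of per-block write counts
    heapq.heapify(wear)
    spares = spare_blocks
    for _ in range(total_writes):
        if not wear:                # no usable blocks left
            break
        count = heapq.heappop(wear) + 1   # least-worn block takes the write
        if count < endurance:
            heapq.heappush(wear, count)
        elif spares > 0:
            spares -= 1
            heapq.heappush(wear, 0)       # retired block swapped for a spare
        # otherwise the block is retired with no replacement
    return len(wear), spares


if __name__ == "__main__":
    print(simulate_writes(4, 2, 3, 12))   # spares used up, capacity intact
    print(simulate_writes(4, 2, 3, 18))   # every block worn out
```

In this model the usable capacity stays constant until the spare pool is empty, and only then starts to shrink.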
HDDs Fail Differently from SSDs: How Do We Compare Reliability?
In theory, an HDD has an unlimited read/write lifetime, because writing or erasing does not damage its storage medium, whereas an SSD physically degrades its NAND flash with every erase/write cycle. Of course, HDDs can still fail mechanically, for example through motor failure or the read/write head crashing into the platter. The read/write head floats on a thin cushion of air and should never touch the spinning disks. Contact can happen if, for example, you shake the drive while it is reading or writing.
So it's not as simple as saying that one is more reliable than the other. HDDs and SSDs work in different ways, and they fail in different ways too. As usual, consumer-grade products are not checked for defects as stringently as enterprise-grade products. In other words, a small minority of consumers who buy an HDD or SSD will end up with a dud.
So for us to compare an HDD to an SSD, we need to find some common ground for comparison. How about overall endurance until failure, regardless of the type?
In terms of endurance, TechReport's testing shows that the majority of consumer-grade SSDs can endure more than 700TB of writes, with the best of them surviving an exceptional 2.4 petabytes or so. They also found that TLC SSDs generally had less endurance than their MLC counterparts.
- Corsair’s Neutron GTX 240GB wrote 1.1PB before dying
- Intel’s 335 Series 240GB wrote 700TB before shifting into read-only mode to protect the data
- Kingston’s HyperX 3K 240GB wrote 800TB before dying
- Samsung’s 840 Series 250GB wrote 900TB before dying
- Samsung’s 840 Pro 256GB wrote an astounding 2.4PB before dying
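To put these totals in perspective, you can convert them into the number of times each drive was completely overwritten. The sketch below takes its figures from the list above and assumes decimal units (1TB = 1000GB). The helper name is made up for illustration.

```python
# (drive, capacity in GB, total written in TB), taken from the list above
DRIVES = [
    ("Corsair Neutron GTX", 240, 1100),
    ("Intel 335 Series", 240, 700),
    ("Kingston HyperX 3K", 240, 800),
    ("Samsung 840 Series", 250, 900),
    ("Samsung 840 Pro", 256, 2400),
]


def full_drive_writes(capacity_gb, written_tb):
    """How many times the whole drive was overwritten (1TB = 1000GB)."""
    return written_tb * 1000 / capacity_gb


if __name__ == "__main__":
    for name, capacity_gb, written_tb in DRIVES:
        fills = full_drive_writes(capacity_gb, written_tb)
        print(f"{name}: about {fills:,.0f} full-drive writes")
```

Even the weakest drive in the list was overwritten in full a few thousand times before it stopped accepting writes.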
Compare that to Backblaze’s tests with their HDDs. Backblaze has kept up to 25,000 hard drives constantly online for the last four years. Every time a drive of theirs failed, they noted it down, and then slotted in a replacement. After four years, Backblaze has collected detailed data of the failure rates of Hard Disk Drives over the first four years of their life.
It seems that hard drives have three distinct failure “phases.” In the first phase, which lasts 1.5 years, hard drives have an annual failure rate of 5.1%. For the next 1.5 years, the annual failure rate drops to 1.4%. After three years, the failure rate explodes to 11.8% per year. In short, this means that around 92% of drives survive the first 18 months, and almost all of those go on to reach three years, leaving about 90% of the original drives still running.
Extrapolating from these figures, just under 80% of all hard drives will survive to their fourth anniversary. Backblaze doesn’t have figures beyond that, but its distinguished engineer, Brian Beach, speculates that the failure rate will probably stick to around 12% per year.
This means that 50% of hard drives will survive until their sixth birthday.
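The survival percentages can be roughly reproduced by compounding the annual failure rates phase by phase. The sketch below assumes each rate applies uniformly within its phase and that the 11.8% rate simply continues after year three. The phase table and function name are illustrative and are not Backblaze's own model.

```python
# (phase length in years, annual failure rate), from the figures quoted above
PHASES = [
    (1.5, 0.051),
    (1.5, 0.014),
    (float("inf"), 0.118),  # assumption: this rate continues indefinitely
]


def survival(years):
    """Fraction of drives still alive after `years`, compounding per phase."""
    remaining = years
    alive = 1.0
    for duration, annual_rate in PHASES:
        span = min(remaining, duration)
        alive *= (1.0 - annual_rate) ** span
        remaining -= span
        if remaining <= 0:
            break
    return alive


if __name__ == "__main__":
    for years in (1.5, 3, 4):
        print(f"{years} years: {survival(years):.1%} surviving")
```

Beyond year four, the result depends heavily on the assumed failure rate and on how it is compounded, so treat any longer-range figure as a rough estimate.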
But how long would an SSD last?
Although some people worry that SSDs have a limited number of writes, in practice that limit takes an extremely long time to reach under normal use: 700TB or more of data, in fact. Solid State Drives usually come with a three to five year warranty, and manufacturers typically assume that you will write 20GB-40GB of data per day. So to reach the 700TB limit, you would have to write 40GB of data every day for 17,500 days, or about 48 years. That doesn’t mean you can mistreat your drive, and it doesn’t mean SSDs won’t fail due to other issues, but if you’re worrying that your SSD will die because you’re using it too much, don’t.
But in terms of data security, evidence of flash wear appeared after 200TB of writes for TechReport’s Solid State Drives, when their Samsung 840 Series started logging reallocated sectors. As the only TLC candidate in the bunch, this drive was expected to show the first cracks. The 840 Series didn’t encounter actual problems until 300TB, when it failed a hash check during the setup for an unpowered data retention test. The drive went on to pass that test and continue writing, but it recorded a rash of uncorrectable errors around the same time. Uncorrectable errors can compromise data integrity and system stability, so I’d recommend taking drives out of service the moment they appear.
Recalculating with 300TB as the point at which data becomes compromised, an SSD like the Samsung 840 Series is theoretically reliable for about 20.5 years at 40GB of writes per day. Compare that to an HDD, which has a roughly 50% chance of failing within six years.
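If you want to plug in your own numbers, the arithmetic is a one-liner. This sketch assumes decimal units (1TB = 1000GB), a 365-day year, and a constant daily write volume. The function name is made up for illustration.

```python
def years_to_reach(limit_tb, gb_per_day, days_per_year=365.0):
    """Years of constant daily writes needed to reach a total of limit_tb."""
    days = limit_tb * 1000 / gb_per_day   # 1TB = 1000GB
    return days / days_per_year


if __name__ == "__main__":
    for limit_tb in (700, 300):
        for gb_per_day in (20, 40):
            years = years_to_reach(limit_tb, gb_per_day)
            print(f"{limit_tb}TB at {gb_per_day}GB/day: {years:.1f} years")
```

With 365-day years, 700TB at 40GB per day works out to roughly 48 years and 300TB to roughly 20.5 years. Halving the daily writes doubles both figures.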
Other things to consider
- Other sources indicate that SSDs tend to have a higher bit error rate than HDDs. In addition, those error rates increase with age, with usage having almost nothing to do with it. So, data integrity is an issue.
- If you write a lot of data to a drive 24/7, HDDs are generally more reliable. SSDs are pretty reliable nowadays for consumers and some server applications, and you would normally replace your SSD before you hit its write limit (assuming average use). Do note that SSDs lose data if left unpowered for prolonged periods of time (think a few years), and they can lose it faster if stored at abnormal temperatures. So, SSDs aren’t exactly ideal for prolonged cold storage.
So you can conclude that Solid State Drives, at least in mobile machines that experience a significant amount of shock, are much more reliable than Hard Disk Drives. But that doesn’t mean you shouldn’t follow the golden rule for keeping your data safe: Back It Up!
General Rule for Backups
In the consumer world, precise statistics, failure modes and recovery chances aren’t really the crucial parameters. What actually matters is that both kinds of drive can fail irrecoverably and without warning. If you want specifics, an SSD probably carries a higher risk of complete, irrecoverable failure, while an HDD can often be restored, at considerable expense.
So instead of relying on a single part not failing, you should use a reasonable backup scheme, like the ever-so-popular 3-2-1:
- At least 3 copies of data.
- On at least 2 different types of media.
- At least one off-site.
The first point protects you against the only moderately unlikely case of a problem hitting both the original and a copy: if one copy fails, you still have peace of mind because your data is still backed up.
The second point aims to protect you against a mass failure affecting a given type of storage. For example, a power surge might kill all the hard drives in a NAS while your disconnected external HDD copy stays fine.
The last point is basically disaster protection, so that your data is safe in case of fire, tornadoes, floods, earthquakes and so on.
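If you keep an inventory of where your copies live, checking it against 3-2-1 takes only a few lines. The sketch below is one way to express the three rules. The BackupCopy type, its fields and the example entries are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    medium: str      # e.g. "ssd", "hdd", "cloud", "tape"
    offsite: bool


def satisfies_3_2_1(copies):
    """True if there are 3+ copies, on 2+ media types, with 1+ off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.medium for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    plan = [
        BackupCopy("laptop", "ssd", offsite=False),
        BackupCopy("external drive", "hdd", offsite=False),
        BackupCopy("cloud storage", "cloud", offsite=True),
    ]
    print(satisfies_3_2_1(plan))
```

Note that the original counts as one of the three copies here, and that the check says nothing about how fresh each copy is.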
This may sound like overkill, and perhaps rightly so, but you have to weigh how much your data is worth to you.