Technical Reference
Why Rebuilding a Degraded RAID Destroys Data

When a drive fails in a RAID 5 array, the controller enters degraded mode: it can still serve data by calculating the missing drive's contribution from parity on every read. The standard response is to replace the failed drive and initiate a rebuild. During rebuild, the controller reads every sector on every surviving drive, recalculates the failed drive's data via XOR parity, and writes it to the replacement drive. This process is one of the most dangerous operations in storage management. On modern large-capacity drives, the probability of a second failure during rebuild is high enough that the rebuild itself frequently causes data loss.
A RAID 5 rebuild on a four-drive array of 16 TB consumer drives reads 48 TB across the surviving members. At the consumer URE rate of 1 error per 12.5 TB, the expected number of read errors during that rebuild is roughly 3.8. Each error on a RAID 5 array is an unrecoverable stripe. For IT administrators managing arrays larger than 24 TB of parity-protected data, forced rebuild of a degraded RAID 5 carries a statistically high probability of partial or total data loss. If the data matters, image the drives first and attempt recovery against the images; our RAID data recovery service handles this workflow in-lab.
What Is URE Probability During RAID Rebuild?
Consumer hard drives are rated for 1 Unrecoverable Read Error (URE) per 10^14 bits read, roughly 12.5 TB. A four-drive RAID 5 rebuild on 16 TB drives reads 48 TB across surviving members, producing an expected 3.8 UREs. Each URE during a degraded RAID 5 rebuild means the affected stripe cannot be reconstructed and its data is permanently lost.
Every hard drive has a specified URE rate. Consumer drives (WD Blue, Seagate Barracuda, Toshiba P300) are typically rated at 1 URE per 10^14 bits read, which equals approximately 12.5 TB. Enterprise drives (WD Ultrastar, Seagate Exos) are rated at 1 URE per 10^15 bits read (125 TB).
During a RAID 5 rebuild on a four-drive array with 16 TB drives, the controller must read every sector on the three surviving drives: 48 TB of total reads. With a consumer URE rate of 1 per 12.5 TB, the expected number of UREs across 48 TB is approximately 3.8. Each URE on a surviving drive during rebuild means the controller cannot reconstruct the corresponding stripe. That stripe's data is permanently lost.
| Drive Size | Array (4-drive RAID 5) | Rebuild Read Volume | Expected UREs (Consumer) | Expected UREs (Enterprise) |
|---|---|---|---|---|
| 4 TB | 12 TB usable | 12 TB | ~1.0 | ~0.1 |
| 8 TB | 24 TB usable | 24 TB | ~1.9 | ~0.2 |
| 16 TB | 48 TB usable | 48 TB | ~3.8 | ~0.4 |
| 20 TB | 60 TB usable | 60 TB | ~4.8 | ~0.5 |
The URE rate is a statistical specification, not a guaranteed threshold. A drive may encounter UREs well before reaching 12.5 TB of reads, or it may never encounter one. The rates above represent the manufacturer's warranty specification: the point at which encountering a read error is within expected behavior, not a defect.
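The table's arithmetic can be reproduced with a short sketch. The function names are illustrative, and the probability figure assumes read errors arrive independently at the rated average (a Poisson model), which is a simplification the drive specifications do not promise.

```python
import math

BITS_PER_TB = 8 * 10**12  # decimal terabytes, as drive capacities are quoted


def expected_ures(read_tb: float, bits_per_error: float) -> float:
    """Expected read errors when read_tb terabytes are read at the rated URE spec."""
    return read_tb * BITS_PER_TB / bits_per_error


def p_at_least_one_ure(read_tb: float, bits_per_error: float) -> float:
    """Chance of one or more UREs, ASSUMING independent errors (Poisson model)."""
    return 1.0 - math.exp(-expected_ures(read_tb, bits_per_error))


if __name__ == "__main__":
    # Four-drive RAID 5: a rebuild reads the three surviving members in full.
    for drive_tb in (4, 8, 16, 20):
        read_tb = 3 * drive_tb
        consumer = expected_ures(read_tb, 1e14)
        enterprise = expected_ures(read_tb, 1e15)
        print(f"{drive_tb:>2} TB drives: {read_tb} TB read, "
              f"consumer ~{consumer:.2f}, enterprise ~{enterprise:.2f}, "
              f"P(>=1 consumer URE) ~{p_at_least_one_ure(read_tb, 1e14):.2f}")
```

An expected count of 3.8 is not a probability. Under the independence assumption, it corresponds to roughly a 98% chance of hitting at least one URE during the 48 TB rebuild on consumer drives.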
How Rebuild Stress Causes Surviving Drive Failures
The URE math only accounts for read errors. RAID arrays are frequently built from drives purchased together, with matching model, firmware revision, and manufacturing batch. When one drive fails, the survivors share the same wear profile; the rebuild's sustained sequential full-surface reads push marginal drives past their failure threshold, causing a second failure and total data loss.
Surviving drives face sustained sequential reads across their entire capacity for hours or days, depending on drive size and rebuild priority settings. This is a workload pattern that most drives rarely experience during normal operation.
Drives in the same RAID array are often purchased together, installed at the same time, and exposed to identical thermal and vibration conditions. They are the same model, same firmware revision, and similar manufacturing batch. This shared history means their wear profiles are correlated. If one drive has worn to the point of failure, the others in the array are at a similar wear state. The sustained stress of a rebuild can push a marginal drive over the edge.
Common failure modes triggered by rebuild stress include:
- Head degradation. Read/write heads that are near end-of-life may fail under continuous full-surface reads. The rebuild forces the heads to sweep from outer to inner diameter continuously, which is higher stress than typical random workloads.
- Spindle motor bearing failure. Continuous operation for hours without idle periods accelerates bearing wear on drives that are already aging.
- Firmware timeout. If a drive hits a difficult sector and its internal error-recovery loop runs too long, the RAID controller declares the drive failed on timeout. Dell PERC controllers enforce an 8-second command timeout; consumer drives may spend 30 seconds or more on error recovery for a damaged sector.
How Does TLER Decide Whether a Rebuild Survives?
Desktop SATA drives ship without time-limited error recovery, so a weak sector triggers an internal retry loop that can hold the bus for 30 to 90 seconds, at or beyond the Linux SCSI block-layer timeout (30 seconds by default). RAID-rated drives ship with TLER (Western Digital), ERC (Seagate), or CCTL (Hitachi, Samsung) set to about 7 seconds, so the drive surrenders quickly and lets the controller fall back to parity instead of waiting on the head to finish reseeking.
Dell PERC and LSI MegaRAID controllers enforce an 8-second command timeout for physical disk queries. A consumer drive that stalls for 60 seconds during a rebuild read will be issued a bus reset, marked as a second failure, and dropped from the array. The underlying media is often still recoverable in a lab; the controller has already booted the drive out of the configuration. This is why mixing consumer drives into a hardware RAID 5 is the single most common cause of degraded-to-failed transitions during rebuild.
SCT-ERC support and value can be inspected with:
- smartctl -l scterc /dev/sdX reads the current TLER value in deciseconds.
- smartctl -l scterc,70,70 /dev/sdX sets a 7-second read and write recovery limit on drives that accept the command.
- echo 180 > /sys/block/sdX/device/timeout extends the Linux block-layer timeout to 180 seconds so the kernel waits for the drive's deep recovery cycle to complete instead of issuing a bus reset and ejecting the drive. This is the mdadm workaround for consumer drives with no TLER support; the default 30-second timeout is shorter than a non-TLER drive's internal retry window and causes premature ejections.
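The choice between the two settings can be sketched as a small helper. The function name and the `supports_scterc` flag are hypothetical; the helper only builds the command strings quoted above and does not run them, so whether a given drive accepts SCT-ERC still has to be checked by hand.

```python
def timeout_commands(device: str, supports_scterc: bool) -> list[str]:
    """Return the command to apply for one array member (illustrative only).

    device: kernel name such as "sdb" (an assumed example, not a real target).
    supports_scterc: whether `smartctl -l scterc` reported the feature.
    """
    if supports_scterc:
        # Drive gives up after 7 s (70 deciseconds) so the RAID layer can use parity.
        return [f"smartctl -l scterc,70,70 /dev/{device}"]
    # No TLER/ERC: let the kernel wait out the drive's long internal retries.
    return [f"echo 180 > /sys/block/{device}/device/timeout"]


if __name__ == "__main__":
    for name, has_erc in (("sdb", True), ("sdc", False)):
        for command in timeout_commands(name, has_erc):
            print(command)
```

The two branches pull in opposite directions on purpose: a drive that can limit its own recovery time is told to fail fast, while a drive that cannot is given a longer kernel deadline so it is not ejected mid-retry.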
Drive-Managed Shingled Magnetic Recording (DM-SMR) drives introduce a second timeout failure mode. When a rebuild fills the on-drive conventional cache zone, the drive pauses host I/O while it rewrites data into shingled zones. That pause routinely exceeds controller timeouts and causes the drive to be ejected mid-rebuild. The 2020 WD Red SMR incident documented this failure pattern in Synology and mdadm arrays, and ZFS resilver on SMR drives is known to abort for the same reason.
What Is the I/O Load During a RAID Rebuild?
During rebuild, the controller reads all surviving drives, computes XOR parity, and writes to the replacement drive simultaneously. Write performance drops 50-80% on arrays without battery-backed write cache. Rebuild times for a 16 TB drive in a busy production array can exceed 48 hours. Neither high nor low rebuild priority reduces URE risk or stress on surviving drives.
During rebuild, the array must continue serving production I/O while simultaneously reading all surviving drives and writing to the replacement drive. The controller is performing three tasks: reading source data from surviving drives, computing parity (recalculating the missing drive's contribution), and writing the rebuilt data to the new drive.
This creates a sustained I/O load across every drive in the array. On arrays without battery-backed write cache, write performance drops by 50-80% during rebuild. Rebuild times for a 16 TB drive in a busy production array can exceed 48 hours.
Every hour the array operates in degraded mode during rebuild is an hour where a single additional failure means complete data loss. Many RAID controllers allow setting rebuild priority (low, medium, high). High priority completes the rebuild faster but further degrades production performance. Low priority preserves performance but extends the vulnerability window. Neither option reduces the URE risk or the stress on surviving drives.
A forced rebuild on a degraded RAID 5 with large consumer drives is a gamble with your data.
RAID 5 was designed in an era of drives measured in gigabytes. With modern drives of 8 TB, 16 TB, or 20 TB, the rebuild read volume routinely exceeds the consumer URE threshold. RAID 6 or mirrored configurations (RAID 10) are the appropriate choices for large-capacity arrays. If a RAID 5 array with large drives has lost a drive and the data is irreplaceable, a lab recovery is safer than a forced rebuild.
Parity Recalculation Stress on Marginal Drives
For each stripe, the controller reads blocks from every surviving member, XORs them, and writes the result to the replacement drive. On a 16 TB replacement with a 64 KB chunk size, that is roughly 244 million chunk reconstructions. Every surviving drive runs at 100% sequential read throughout, pushing drives with weak heads or marginal preamps through their failure threshold.
The XOR parity math behind a RAID 5 rebuild is computationally cheap per stripe, but the volume of operations is substantial. For each stripe on the replacement drive, the controller must read the corresponding block from every surviving member, XOR them together, and write the result. On a 16 TB replacement drive with a 64 KB chunk size, that is roughly 244 million chunk reconstructions across the rebuild. Hardware RAID controllers (Dell PERC, LSI MegaRAID, Adaptec) offload this to a dedicated ASIC; software RAID (mdadm, ZFS, Storage Spaces) consumes host CPU for every parity computation.
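A minimal sketch of the per-stripe XOR step, using toy 4-byte blocks in place of 64 KB chunks. The block contents and helper name are illustrative; a real controller also rotates the parity position across stripes, which is omitted here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of a four-drive RAID 5: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The member holding data[1] fails. Rebuild XORs every surviving block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]

# Number of chunk reconstructions for a 16 TB replacement at 64 KB per chunk.
chunks = 16 * 10**12 // (64 * 1024)
print(f"{chunks:,} chunk reconstructions")
```

The reconstruction needs every surviving block in the stripe. If any one of them cannot be read, the XOR has no valid input for that position and the stripe cannot be rebuilt.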
The stress on surviving drives compounds the URE risk. During normal array operation, each member sees a random mix of reads, writes, and idle time. During rebuild, every surviving drive is pinned to 100% sequential read utilization for the full duration.
A drive with a pending sector reallocation, a weak head, or a marginal preamp will be pushed through its failure threshold by this workload. The controller then interprets that failure as a second drive loss, and the array transitions from degraded to failed mid-rebuild.
Parity recalculation also amplifies the cost of a single read error. In a healthy RAID 5, a URE on one drive is recoverable because parity plus the other members reconstruct the missing block. During a degraded rebuild, there is no redundancy left; the array is already down one drive.
Any URE on a surviving member during rebuild is a direct, unrecoverable data loss for the affected stripe. This is the mechanism behind the common outcome where a RAID rebuild fails at 60-90% completion: the rebuild proceeded successfully across most of the capacity, then encountered a URE and aborted.
How Does RAID 6 Reduce Rebuild Risk Compared to RAID 5?
RAID 6 adds a second independent parity block (P and Q, computed via Reed-Solomon rather than pure XOR) to every stripe, allowing the array to survive two concurrent drive failures. A URE on a surviving drive during a RAID 6 rebuild is recoverable from the second parity block, so the rebuild continues rather than aborting. For arrays using drives of 8 TB or larger, RAID 6 is the appropriate parity configuration.
The cost is additional write overhead during normal operation and one less usable drive of capacity. A four-drive RAID 6 array provides 50% usable capacity versus 75% for RAID 5.
| Attribute | RAID 5 | RAID 6 |
|---|---|---|
| Parity blocks per stripe | 1 (XOR) | 2 (P via XOR, Q via Reed-Solomon) |
| Drive failures tolerable | 1 | 2 |
| URE during rebuild | Stripe permanently lost | Recovered from second parity block |
| Usable capacity (4 drives) | 3 drives (75%) | 2 drives (50%) |
| Random write penalty | 4 I/O per write (read-modify-write) | 6 I/O per write (dual parity update) |
| Rebuild data-loss probability (16 TB consumer drives) | High (UREs expected across 48 TB of rebuild reads) | Low (second parity absorbs UREs) |
| Recommended maximum drive size | 4 TB (enterprise) or avoid for large capacity | 20 TB+ acceptable with enterprise drives |
The random write penalty is the reason RAID 5 persisted so long on small arrays: write amplification of 4 is already painful, and 6 is worse. On modern NVMe and SSD arrays the overhead is masked by device bandwidth. On spinning disks, RAID 6 is measurably slower for small random writes, which is why transaction-heavy workloads often use RAID 10 instead.
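The capacity and write-penalty rows can be checked with a few lines. The penalty values come from the table; the 600 raw IOPS figure is an assumed example (roughly four spinning disks at 150 IOPS each), not a measurement.

```python
WRITE_PENALTY = {"raid5": 4, "raid6": 6}  # back-end I/Os per small random write
PARITY_DRIVES = {"raid5": 1, "raid6": 2}  # drives' worth of capacity spent on parity


def usable_fraction(level: str, drives: int) -> float:
    """Share of raw capacity left for data at a given drive count."""
    return (drives - PARITY_DRIVES[level]) / drives


def effective_write_iops(level: str, raw_iops: float) -> float:
    """Small-random-write IOPS the host sees after the parity write penalty."""
    return raw_iops / WRITE_PENALTY[level]


if __name__ == "__main__":
    raw = 600  # assumed aggregate back-end IOPS, for illustration only
    for level in ("raid5", "raid6"):
        print(f"{level}: {usable_fraction(level, 4):.0%} usable on 4 drives, "
              f"~{effective_write_iops(level, raw):.0f} random write IOPS")
```

Under these assumptions the same four disks deliver about 150 small random writes per second as RAID 5 and about 100 as RAID 6, which is the gap the paragraph above describes.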
Why Are RAID 10 Rebuild Reads an Order of Magnitude Smaller?
A RAID 10 rebuild reads only from the surviving mirror partner of the failed drive, not from every member. Replacing one failed 16 TB drive in an eight-drive RAID 10 produces 16 TB of rebuild reads against a single drive. Replacing one failed 16 TB drive in an eight-drive RAID 5 produces 112 TB of rebuild reads across seven surviving members. At consumer URE rates that is the difference between roughly 1.3 expected UREs and roughly 9.0 expected UREs during the rebuild window.
RAID 10 also localizes the second-failure risk. The array survives any drive failure that is not the mirror partner of the rebuilding drive. In a typical eight-drive RAID 10, only one of the seven surviving drives is fatal to lose; a failure of any one of the other six leaves the array online. The trade-off is 50% usable capacity versus 87.5% for RAID 5 or 75% for RAID 6 at the same eight-drive count (75% and 50% respectively on four drives). For arrays of 8 TB or larger drives where rebuild time and second-failure exposure matter more than raw capacity per spindle, RAID 10 is the parity-free choice with the smallest exposure window.
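The rebuild-read comparison can be worked through in code. This is a sketch with illustrative names; it assumes two-way mirrors for RAID 10 and the consumer rating of one URE per 12.5 TB used throughout this article.

```python
CONSUMER_TB_PER_URE = 12.5  # 10^14 bits expressed in decimal terabytes


def rebuild_read_tb(level: str, drives: int, drive_tb: float) -> float:
    """Data read from surviving members to rebuild one failed drive."""
    if level == "raid10":
        return drive_tb                  # only the mirror partner is read
    if level == "raid5":
        return (drives - 1) * drive_tb   # every surviving member is read in full
    raise ValueError(f"unsupported level: {level}")


def fatal_second_failures(level: str, drives: int) -> int:
    """How many surviving drives would take the array down if lost next."""
    return 1 if level == "raid10" else drives - 1


if __name__ == "__main__":
    for level in ("raid10", "raid5"):
        read_tb = rebuild_read_tb(level, 8, 16)
        print(f"{level}: {read_tb:.0f} TB read, "
              f"~{read_tb / CONSUMER_TB_PER_URE:.2f} expected UREs, "
              f"{fatal_second_failures(level, 8)} of 7 survivors fatal to lose")
```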
How Does Re-Importing a Stale Drive Silently Overwrite Parity?
A drive that drops offline mid-array becomes a stale member: its data blocks are frozen at the timestamp of its failure, while the array continues to serve writes to the surviving drives. If that stale drive is later reinserted and the controller is told to "Import Foreign Configuration," the controller treats the stale block contents as authoritative, recalculates P (and Q on RAID 6) parity to match those stale blocks, and writes the new parity across every surviving member.
The result is on-disk parity that is mathematically consistent with stale data. A subsequent consistency check or patrol read reports zero errors because the math aligns. The filesystem inside the array has been silently corrupted on every stripe that received a host write during the degraded interval. Application-level file checksums, database page checksums, and ZFS checksums (if the array is exporting raw block storage to a ZFS host) will surface the corruption only when each affected block is accessed, often weeks after the import.
Dell PowerEdge RAID Controller user manuals explicitly warn that importing an improper foreign configuration can flush the active cache and result in data corruption; HP Smart Array advisories carry comparable cautions. The exact parity-recalculation step is not spelled out in OEM manuals but is the documented behavior of the SNIA-style metadata import path used by Broadcom LSI MegaRAID, Dell PERC, and Adaptec controllers, well characterized in independent data recovery engineering practice. The most common production triggers are a reboot where drive sled order is disturbed, a controller swap that imports the wrong metadata generation, and a hot-plug reseat of a previously failed drive performed in the hope that it will rejoin cleanly.
Once parity has been rewritten against stale data, there is no in-controller path to recovery: every surviving member now carries authoritative-looking but wrong parity. Block-level imaging of every drive in the array (including the stale one) and offline reconstruction against the images is the only path that preserves the post-degradation writes. The reconstruction must use the original parity sequence, which is why a lab RAID recovery with PC-3000 RAID or UFS Explorer is the recommended workflow once a foreign configuration import has been initiated.
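The effect can be illustrated with a toy single-stripe model. This is a simplified sketch, not controller firmware: it assumes one XOR parity block, 4-byte blocks, and a controller that recomputes parity from whatever is on the member disks at import time.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Healthy stripe: three data members and one parity block.
d0, d1, d2_old = b"\x11" * 4, b"\x22" * 4, b"\x33" * 4
parity = xor_blocks(d0, d1, d2_old)

# Member 2 drops offline; its platter still holds d2_old (now stale).
# The host then overwrites that logical block. In degraded mode the
# controller folds the new data into parity on the surviving members.
d2_new = b"\x44" * 4
parity = xor_blocks(d0, d1, d2_new)
assert xor_blocks(d0, d1, parity) == d2_new  # degraded reads return the new data

# The stale member is re-imported and parity is recalculated to match it.
parity_after_import = xor_blocks(d0, d1, d2_old)

# A consistency check XORs the whole stripe and expects all zero bytes.
consistency_ok = xor_blocks(d0, d1, d2_old, parity_after_import) == bytes(4)
host_now_reads = d2_old

print("consistency check passes:", consistency_ok)
print("host write survived:", host_now_reads == d2_new)
```

The stripe verifies cleanly, yet the host reads back the pre-failure contents. The only remaining trace of the newer write was the old parity block, which the import replaced.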
What Are the Safer Alternatives to Rebuilding a Degraded RAID?
Before initiating any rebuild, create sector-level images of every surviving drive. Rebuild attempts can then be performed against the copies using a virtual RAID tool such as PC-3000 RAID, R-Studio, or UFS Explorer. If a second drive fails during imaging, the sectors already read remain available. Touching production hardware before imaging is the most common cause of unrecoverable RAID data loss.
- Image all drives first. Before touching the array, create sector-level images of every surviving drive using a hardware imager (DeepSpar Disk Imager, PC-3000 Portable III) or GNU ddrescue with a mapfile. For ddrescue on a weak member, run the bulk pass with ddrescue -d -n -r0 /dev/sdX image.bin map.log first (-d bypasses the kernel cache via O_DIRECT, -n skips the scraping phase that pounds damaged surfaces, and the mapfile records the state of every sector). Run scraping and retry passes (for example -r3, without -n) only after the bulk image is secured. If the drive dies during imaging, the image keeps every sector that was already read and the mapfile records which ones, so the next attempt resumes against the same image instead of starting from zero.
- Virtual RAID assembly. Import the drive images into a RAID recovery tool (PC-3000 RAID, R-Studio, UFS Explorer) and reconstruct the array virtually. This avoids any physical stress on the original drives and allows multiple reconstruction attempts with different parameters.
- Do not initialize or resync. If the RAID controller prompts to "initialize" or "resync" the array after detecting a problem, do not proceed without understanding what the operation will do. Some controllers will overwrite RAID metadata or parity data during initialization, making the original data unrecoverable.
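The two-phase ddrescue approach described in the first item can be sketched as a command builder. The helper name and the example paths are hypothetical; the code only assembles argument lists and prints them, so nothing touches a drive until an operator reviews and runs the commands.

```python
import shlex


def ddrescue_passes(source: str, image: str, mapfile: str) -> list[list[str]]:
    """Build the bulk pass and the later retry pass for one member drive."""
    # Phase 1: direct reads, no scraping, no retries. Grab the easy sectors first.
    bulk = ["ddrescue", "-d", "-n", "-r0", source, image, mapfile]
    # Phase 2: same image and mapfile, scraping enabled, up to three retry passes.
    retry = ["ddrescue", "-d", "-r3", source, image, mapfile]
    return [bulk, retry]


if __name__ == "__main__":
    # Example paths are placeholders; substitute the real member and targets.
    for argv in ddrescue_passes("/dev/sdX", "image.bin", "map.log"):
        print(shlex.join(argv))
```

Both passes share one image and one mapfile, which is what lets the second pass skip everything the first pass already recovered.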
Frequently Asked Questions
What is a URE and why does it matter during RAID rebuild?
A URE (Unrecoverable Read Error) occurs when a drive cannot return a sector's data even after its internal error correction has run. Consumer drives are rated for 1 URE per 12.5 TB of reads. During rebuild, the controller reads every sector on every surviving drive. On large drives, the total read volume approaches or exceeds this threshold, making UREs statistically likely. Each URE during a RAID 5 rebuild means the corresponding stripe cannot be reconstructed.
Should I rebuild a degraded RAID array myself?
If you have verified backups, rebuilding is reasonable. If the data is irreplaceable, rebuilding a degraded RAID 5 is risky: the surviving drives share the same age and wear profile as the failed drive, and the rebuild subjects them to sustained stress. The safer approach is to image all drives individually first, then attempt a virtual rebuild from the images. If a rebuild has already failed, review the failure modes documented on our RAID rebuild failed page before making the next move.
What should I do if my Dell PERC or LSI MegaRAID controller shows a foreign configuration after a reboot?
Do not click Import Foreign Configuration or Force Online without first imaging every member drive. Dell PERC and LSI MegaRAID controllers write new metadata when a foreign config is imported or a virtual disk is forced online. If the controller reconstructs parity against a drive that was not the most recent member, parity will be written incorrectly across surviving drives. The array will appear healthy and mount, but file data on parity-touched stripes will be silently corrupted. Power the server off, pull the drives in labeled order, image each one, then test the import on the images with a virtual RAID tool such as PC-3000 RAID or UFS Explorer before touching production hardware.
Why does using consumer drives in a hardware RAID array make rebuilds fail?
Hardware RAID controllers (Dell PERC, LSI MegaRAID, HPE Smart Array, Adaptec) enforce an 8-second command timeout for physical disk queries. Desktop SATA drives ship without time-limited error recovery, so a weak sector triggers an internal retry loop that can hold the bus for 30 to 90 seconds. The controller interprets that stall as a drive failure, issues a bus reset, and drops the drive from the array. On a degraded RAID 5, the second drop ends the array. RAID-rated drives (WD Red Pro, WD Gold, Seagate IronWolf Pro, Toshiba N300) ship with TLER, ERC, or CCTL set to 7 seconds, which fits inside the controller window. Drives that support SCT-ERC can be inspected with smartctl -l scterc /dev/sdX and reprogrammed with smartctl -l scterc,70,70 /dev/sdX.
Does ZFS resilver or mdadm rebuild face the same URE risk as a hardware RAID 5?
Partially. ZFS resilver traverses the block pointer tree instead of reading entire drives sector by sector, so it touches only allocated blocks and verifies each one against the checksum stored in its parent block pointer. A URE on a surviving drive during resilver is detected and repaired from the second copy of metadata (ditto blocks) or from the redundancy tier (raidz2 or raidz3). ZFS on raidz1 is still vulnerable to a second drive failure during resilver, and ZFS on Drive-Managed SMR drives experiences the same write-cache-stall ejections that abort hardware-controller rebuilds. mdadm RAID 5 has the same URE failure mode as hardware RAID 5, but the write-intent bitmap shortens the resync window when a drive drops out transiently and is re-added without having been replaced. For drive sizes above 8 TB the safer parity choices on Linux are raidz2 or mdadm RAID 6.
What should I do the moment a drive fails and the RAID array goes degraded?
Stop writing to the array, then stop the array. Do not click rebuild, do not insert a replacement drive, and do not reboot the server to see if the drive comes back. Every host write to a degraded RAID 5 widens the gap between the surviving members and the failed drive, and every reboot risks the controller marking a second member as foreign or pending. Power the server down cleanly, label each drive with its bay position, pull all members including the failed one, and image each surviving drive sector-by-sector to a separate target disk using a hardware imager such as the DeepSpar Disk Imager or PC-3000 Portable III, or GNU ddrescue with a mapfile for software-only workflows. The failed drive is imaged last, and only if it still spins. Reconstruction is then attempted against the images using PC-3000 RAID, R-Studio, or UFS Explorer. Every operation against the original hardware is irreversible, so the imaging step preserves the option to retry with different parameters. Our RAID data recovery service handles this workflow end-to-end for arrays where the data is irreplaceable.
Why does Rossmann image surviving members to clones first and rebuild virtually instead of letting the controller rebuild onto a hot spare?
A controller-driven rebuild reads every sector of every surviving member to recalculate the missing payload and stream it to the replacement drive, and any URE or timeout-induced ejection of a second member that occurs mid-rebuild ends the array. The rebuild also pins the surviving members at 100% sequential read for hours or days on drives that may already be marginal. By imaging each surviving member to a clone first, the lab freezes the on-disk state at the moment of intake; subsequent reconstruction attempts run against the clones, not the originals. If a clone read fails on a marginal source drive, the imaging tool's mapfile records exactly which sectors were missed, and a second pass with a different head map, slower read speed, or donor-head rework on the source drive can fill the gaps without restarting from zero. The virtual rebuild itself runs in software (PC-3000 RAID, UFS Explorer, R-Studio) where stripe size, parity rotation, block order, and member sequence can be tested against the actual filesystem until directories and file checksums verify. None of that is possible once a controller rebuild has completed and normal writes have resumed against the array. This is the same general principle behind every imaging-first workflow in single-drive hard drive data recovery, applied at the array level.
How long does a RAID 5 rebuild on multi-terabyte drives actually take, and how does that compound URE risk?
On an idle four-drive RAID 5 of 16 TB enterprise SATA drives with a high rebuild priority, the controller reads roughly 48 TB across surviving members and writes 16 TB to the replacement. Because the members are read in parallel, the limit is the time to stream one 16 TB drive: at a sustained sequential rate of 180-200 MB/s per drive, the theoretical floor is around 22-25 hours. In practice, controller throttling, foreground I/O, and chunk-size overhead push real-world rebuilds on busy production arrays to 48-100+ hours. A 20 TB drive in an eight-bay array can take three to seven days. Every hour the array runs degraded is an hour where one URE on a surviving drive loses a stripe, and one further drive ejection ends the array. Lengthening the rebuild window also lengthens the thermal-stress window on the surviving members; drives that spent years at low duty cycles are pinned to 100% sequential read utilization for days, which is the workload that exposes pending sector reallocations and weak preamps. The duration alone is one of the reasons RAID 6 or RAID 10 are the appropriate redundancy choices for arrays of 8 TB or larger drives, independent of the URE probability math. If a rebuild has already aborted partway through, our RAID data recovery workflow recovers from images of the surviving members rather than retrying the controller rebuild.
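The best-case duration is simple division, sketched here with an illustrative function name. It assumes the members stream in parallel at a constant rate with no competing I/O, which real arrays rarely achieve.

```python
def rebuild_hours_floor(drive_tb: float, mb_per_s: float) -> float:
    """Hours to stream one drive end to end at a constant sequential rate."""
    seconds = drive_tb * 10**12 / (mb_per_s * 10**6)
    return seconds / 3600


if __name__ == "__main__":
    for rate in (200, 180):
        print(f"16 TB at {rate} MB/s: {rebuild_hours_floor(16, rate):.1f} hours")
    print(f"20 TB at 180 MB/s: {rebuild_hours_floor(20, 180):.1f} hours")
```

Halving the effective rate, which is a mild estimate for a busy production array, doubles these figures and takes a 16 TB rebuild to roughly two days.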