
mdadm Recovery

mdadm recovery separates sector imaging from array math. The single most destructive thing you can do is run mdadm --create --assume-clean on the original disks to "fix" an inactive array. That writes fresh superblocks; if the order, metadata version, or chunk size is off, the data_offset shifts and the filesystem turns to garbage. We clone every member, attach the images read-only with losetup -r, then use mdadm --examine and mdadm --assemble --readonly against the clones.
Written by Louis Rossmann
Founder & Chief Technician
Updated 2026-06-12

Send your drives to our lab and we recover the data in-house. All mdadm work is performed at 2410 San Antonio Street, Austin, TX 78705. There is no diagnostic fee, and under our "no data, no recovery fee" policy you pay nothing if we cannot reassemble your array and extract the filesystem. The Rossmann Group has been recovering data since 2008, and software RAID cases never leave the building or get outsourced.

What is mdadm, and why does corruption look like drive failure?

mdadm is the Linux Multiple Device administrator, the software RAID layer underneath almost every modern NAS and most Linux servers. It binds several block devices into one virtual array using a small binary superblock written to each member. The array geometry, chunk size, parity rotation, disk order, and data_offset all live in that superblock, not in any controller card.

That is the weak point. When mdadm cannot read a coherent set of superblocks, the array goes inactive, and mdadm either reports something like mdadm: no recogniseable superblock or refuses to assemble at all. Most people assume the platters died. They usually have not. A single bad sector in the metadata region, a dropped member, or a desynchronized events counter produces the same symptom as a dead array while the heads, motor, and user data area stay healthy.

Misreading that state is how data gets destroyed. People accept a firmware prompt to rebuild, move the disks into a new chassis that auto-initializes them, or follow a forum recipe to recreate the array. Each of those overwrites the only surviving map of where the data lives.

What happens if you run mdadm --create?

Running mdadm --create on an existing array is the leading cause of permanent software RAID data loss. Forums, StackOverflow threads, and even vendor support scripts routinely tell people to run mdadm --create ... --assume-clean with the original disks to force an inactive array back to life. The QNAP-style recipe looks like mdadm -CfR /dev/md1 --assume-clean -l 5 -n 5 -c 512 -e 1.0 followed by the member partitions.

--create writes a fresh superblock to every member. The original superblocks, with the true disk order, metadata version, chunk size, and parity rotation, are gone. If any of those parameters is even slightly off, the new superblock places data_offset at the wrong sector. The array assembles and mounts, but every read lands at the wrong byte offset, so the filesystem payload comes back as perfectly interlaced garbage.

It gets worse with two common layers. If LVM was on top, the new mdadm superblock overwrites the LVM PV header, severing the link to the volume group.

If the user then runs fsck against the garbled volume, the repair tool decides the inodes are corrupt and zeroes them out, which shreds the payload for good. "It worked for someone on a forum" only means that person guessed every parameter correctly. It is a coin flip you do not need to take.

The safe path never touches the originals. We read superblocks with mdadm --examine and assemble read-only with mdadm --assemble --readonly against clones. When on-disk metadata is too damaged for mdadm to parse at all, we reconstruct the array geometry virtually in software using Data Extractor Express RAID Edition, the genuine ACE Lab array tool that runs on our PC-3000 Express, working from the image files rather than the live members.

How is the mdadm superblock laid out on disk?

Recovery is only possible because the superblock has a fixed structure and the metadata version dictates exactly where it sits and where the data payload begins. Misdetecting the version shifts the payload, so the version is the first thing an engineer pins down.

Version | Metadata location | Payload start | Notes
0.90 | End of device | Byte 0 | Legacy. 2TB member limit, 28-device limit. Found on older Synology and Debian systems.
1.0 | 8–12KB from end | Byte 0 | Payload at offset 0, so a member can auto-mount as a standalone volume and desynchronize. Dangerous.
1.1 | Start of device | After superblock | Rare in modern production.
1.2 | 4KB from start | data_offset | Modern default. Payload starts after a calculated data_offset, typically 1MB-aligned.

Whatever the version, a valid superblock carries the array UUID, the chunk size and parity layout (left-symmetric is the RAID 5 default), the data_offset, the member role or device index, and the update time and events counter. Those last two are the synchronization metric. The events counter is what tells us which members are current and which dropped early.

Read the version wrong and you place the data payload in the wrong spot, which is exactly the trap mdadm --create falls into. A wiped or unreadable superblock is its own recovery path, detailed in our mdadm missing superblock recovery writeup.
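
To make the stakes of those fields concrete, here is a minimal Python sketch of how a logical byte on a RAID 5 array maps to a member and an on-disk offset. It assumes the left-symmetric rotation as we understand the Linux md driver to implement it; the function name locate_byte and the example numbers (four members, 1MB data_offset) are illustrative, not taken from any specific case.

```python
def locate_byte(logical_byte, n_disks, chunk_size, data_offset):
    """Map a logical array byte to (member index, byte offset on that member).

    Assumes RAID 5 with the left-symmetric layout: parity starts on the
    last member and moves one member to the left on every stripe, and the
    data chunks follow the parity chunk, wrapping around.
    """
    data_disks = n_disks - 1
    chunk_index, within_chunk = divmod(logical_byte, chunk_size)
    stripe, data_index = divmod(chunk_index, data_disks)
    parity_disk = data_disks - (stripe % n_disks)
    disk = (parity_disk + 1 + data_index) % n_disks
    return disk, data_offset + stripe * chunk_size + within_chunk


if __name__ == "__main__":
    MiB = 1024 * 1024
    target = 3 * 512 * 1024  # the same logical byte, read under two guesses
    print("512K chunk:", locate_byte(target, 4, 512 * 1024, MiB))
    print(" 64K chunk:", locate_byte(target, 4, 64 * 1024, MiB))
```

The two calls ask for the same logical byte and differ only in the assumed chunk size, yet they resolve to different members. That is the interlacing effect a wrong parameter produces across the whole volume.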

Does a missing superblock on one drive mean that drive is dead?

No. The mdadm superblock is a single 4KB metadata structure. On the modern 1.2 format it sits 4KB from the start of the member, and on the 1.0 and 0.90 formats it sits near the end. The striped user data lives in the multi-terabyte payload area after data_offset, physically separate from that metadata sector. A lone bad sector or a stray write inside the superblock region makes mdadm reject the member with no recogniseable superblock, but the data stripe on that drive is intact and fully readable. The drive is not dead. Only its metadata sector is corrupt.
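
As a rough illustration of how small and how precisely placed that metadata is, the Python sketch below computes where each metadata version keeps its superblock and checks an image for the md magic at those spots. The offset formulas and the magic value 0xa92b4efc reflect our reading of the md on-disk format and should be treated as assumptions to verify against the mdadm source; superblock_offset and candidate_versions are hypothetical helper names.

```python
MD_MAGIC_LE = bytes.fromhex("fc4e2ba9")  # 0xa92b4efc stored little-endian
VERSIONS = ("0.90", "1.0", "1.1", "1.2")


def superblock_offset(version, device_bytes):
    """Byte offset of the md superblock for a member of the given size."""
    if version == "0.90":
        # Last 64KB-aligned 64KB block of the device.
        return (device_bytes & ~(64 * 1024 - 1)) - 64 * 1024
    if version == "1.0":
        # At least 8KB from the end, rounded down to a 4KB boundary.
        return (device_bytes - 8 * 1024) & ~(4 * 1024 - 1)
    if version == "1.1":
        return 0
    if version == "1.2":
        return 4 * 1024
    raise ValueError(f"unknown metadata version: {version}")


def candidate_versions(image):
    """Return the versions whose expected location holds the md magic.

    `image` is any bytes-like object, for example an mmap of a clone.
    A magic hit is only a hint; a real check would also parse the
    version field and checksum inside the superblock.
    """
    size = len(image)
    hits = []
    for version in VERSIONS:
        off = superblock_offset(version, size)
        if off >= 0 and image[off:off + 4] == MD_MAGIC_LE:
            hits.append(version)
    return hits
```

Everything outside those few kilobytes is payload, which is why a damaged superblock says nothing about the health of the data stripe.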

The destructive reflex is to treat that member as failed and re-add it. Assemble the array writable, run mdadm --add on the rejected drive, or let the NAS firmware re-add it on its own, and the kernel treats it as a fresh spare. It immediately starts a recovery that recomputes the member's contents from the surviving drives and writes the result back over the intact stripe.

If the survivors are stale or the geometry is even slightly wrong, that resync overwrites good data with a bad reconstruction, and the original stripe is gone. This is why imaging is not optional and has to come before any assembly attempt. Every member, including the one with the missing superblock, is cloned first, then the superblock is rebuilt or the geometry carved against the clones with mdadm --assemble --readonly, never re-added on the originals.

How do you carve array geometry from the ext4 0xEF53 magic?

When the superblocks are missing or untrustworthy, the geometry can be rebuilt from the filesystem itself. An ext4 filesystem keeps its primary superblock at byte offset 1024 (0x400) from the start of the filesystem, and the magic number 0xEF53 (stored on disk as little-endian bytes 53 EF) sits at offset 0x38 inside that superblock. So the magic always lands 0x438 bytes into the logical ext4 payload.

That fixed relationship turns a hex search into a geometry proof. Finding that 0xEF53 magic at 0x100438 on a member, which is 1MB plus 0x438, shows that member is disk 0 and that the mdadm data_offset was exactly 1MB, provided the filesystem sits directly on the md device with no LVM or partition header in between. Locating fragments of the ext4 block-group descriptor tables across the other members then validates the chunk size and the disk order. We confirm all of that against the clones before any assembly is attempted, so the array math is verified, not guessed.
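
The search itself is simple enough to sketch in Python. The function below walks a clone image in 4KB steps and reports every candidate data_offset where the two magic bytes appear 0x438 bytes later. The name candidate_data_offsets, the step size, and the 64MB search limit are assumptions for illustration, and a two-byte match is only a lead: a real pass would also validate other superblock fields and rule out ext4 backup superblocks.

```python
EXT4_MAGIC = b"\x53\xef"      # 0xEF53 as stored on disk (little-endian)
EXT4_MAGIC_OFFSET = 0x438     # 1024-byte superblock offset + 0x38


def candidate_data_offsets(image_path, max_offset=64 * 1024 * 1024, step=4096):
    """Return data_offset guesses (in bytes) that put the ext4 magic in place.

    The image is opened read-only and never modified.
    """
    hits = []
    with open(image_path, "rb") as image:
        for data_offset in range(0, max_offset + 1, step):
            image.seek(data_offset + EXT4_MAGIC_OFFSET)
            if image.read(2) == EXT4_MAGIC:
                hits.append(data_offset)
    return hits
```

Run against the image of each member, a single hit at 0x100000 on one clone and none on the others is the pattern described above.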

What does an events counter mismatch mean, and is --force safe?

When a drive disconnects, the kernel drops it and keeps writing to the survivors. The dropped drive's events counter freezes while the active members keep counting up. After a reboot, mdadm sees the gap and refuses to assemble, because it cannot tell which members are coherent.

mdadm --assemble --force permits assembly of slightly desynchronized superblocks, and it has a place in recovery, but it is a sharp tool. Run against the wrong member, it feeds stale blocks into the parity math and corrupts the reconstruction.

We use it only read-only, only against clones, and only after the events counters across all members have been compared with mdadm --examine to identify and exclude the stale, lowest-event drive. On the original disks, in writable mode, with the stale member included, it is how a recoverable array becomes an unrecoverable one.
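
The comparison step can be sketched in a few lines of Python. This assumes the "Key : value" line format that mdadm --examine prints for 1.x metadata, including an Events line; the helper names are hypothetical and the sample text in the usage lines is synthetic, written for illustration rather than captured from a real array.

```python
def parse_examine(text):
    """Turn 'Key : value' lines from mdadm --examine output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def stale_members(examine_by_device):
    """Return devices whose events counter is below the highest one seen."""
    events = {
        device: int(parse_examine(text)["Events"])
        for device, text in examine_by_device.items()
    }
    newest = max(events.values())
    return sorted(device for device, count in events.items() if count < newest)


if __name__ == "__main__":
    # Synthetic sample output, not a real log.
    sample = {
        "/dev/loop0": "         Events : 48211\n    Device Role : Active device 0\n",
        "/dev/loop1": "         Events : 48211\n    Device Role : Active device 1\n",
        "/dev/loop2": "         Events : 47950\n    Device Role : Active device 2\n",
    }
    print(stale_members(sample))
```

The sketch only ranks members. Deciding whether a small gap is safe to force across is still a judgment made read-only, on clones.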

Why is RAID 5 rebuild not routine on a degraded array?

A RAID 5 rebuild has to read every sector of every surviving member to recompute the missing member's contents. Consumer SATA drives carry an Unrecoverable Read Error rate near 1 in 10^14 bits, which is roughly one URE per 12.5TB read. A degraded four-bay array of 16TB drives forces about 48TB of sequential reads during resync, so encountering a URE is statistically near-certain.
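
The arithmetic behind "near-certain" can be checked with a short Python sketch. It assumes every bit fails independently at exactly the quoted specification rate, which is a simplification of how real drives behave; the function name is ours.

```python
import math


def p_at_least_one_ure(bytes_read, ure_per_bit=1e-14):
    """Probability of hitting one or more UREs while reading bytes_read bytes.

    Uses log1p/expm1 so the tiny per-bit rate does not vanish in
    floating-point rounding.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    surviving_members = 3
    member_bytes = 16e12  # 16TB per drive
    print(p_at_least_one_ure(surviving_members * member_bytes))
```

For the 48TB read in the example, the result comes out close to 0.98 under these assumptions.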

Enterprise hardware controllers can puncture the stripe and keep going. Linux mdadm does not. It treats a URE during rebuild as a total drive failure, drops that member, and aborts the rebuild, which collapses the whole volume. That is why we never frame a degraded RAID 5 rebuild as a routine operation. RAID is availability, not a backup. Before any rebuild is even discussed, every member is imaged so the resync runs against clones where a URE costs nothing.

The hardware-controller side of this same problem, with PERC, Smart Array, and MegaRAID parity rotation, runs through our RAID data recovery workflow, since the on-disk geometry, not the card, is what we reconstruct.

Why do SMR drives get ejected mid-rebuild?

Drive-Managed Shingled Magnetic Recording drives stall during the sustained sequential writes a rebuild demands. SMR disks absorb random writes into a small Conventional Magnetic Recording cache, then flush that cache into the overlapping shingled tracks in the background. A resync floods the disk, the CMR cache fills, and the drive pauses host communication while it reshingles. That pause can run up to about two minutes.

The default Linux block-layer timeout at /sys/block/sdX/device/timeout is usually 30 seconds. When an SMR member stalls past that, the kernel decides the drive is dead, issues a bus reset, and drops it from the array mid-rebuild. Stack a second SMR member into the same resync and the array cascades into total collapse. Cloning the slow members first, with the DeepSpar Disk Imager handling the long latencies, takes that failure mode off the table.
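
A quick way to see which disks are exposed to that mismatch is to read the timeout files directly. The Python sketch below only reads sysfs and changes nothing; the function name short_timeouts and the 120-second stall figure (taken from the two-minute pause described above) are assumptions for illustration.

```python
from pathlib import Path


def short_timeouts(sys_block="/sys/block", expected_stall_s=120):
    """Return {device: timeout_seconds} for disks whose command timeout
    is shorter than the stall an SMR drive may need to reshingle."""
    flagged = {}
    for timeout_file in sorted(Path(sys_block).glob("*/device/timeout")):
        seconds = int(timeout_file.read_text().strip())
        if seconds < expected_stall_s:
            # .../<device>/device/timeout -> <device>
            flagged[timeout_file.parent.parent.name] = seconds
    return flagged


if __name__ == "__main__":
    for device, seconds in short_timeouts().items():
        print(f"{device}: {seconds}s timeout is shorter than a possible SMR stall")
```

Knowing the gap exists does not make a rebuild on the originals safe. It only explains why the member was ejected.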

Why is most NAS storage really a nested mdadm stack?

Most mdadm recovery cases do not come from bare-metal servers. They come from NAS appliances, where mdadm is layer one of a stack. The vendors do not use a proprietary format. They use mainline Linux mdadm, open-source LVM2, and standard filesystems. The complexity is the nesting, not secrecy.

Vendor / OS | Layer one | Volume layer | Filesystem
Synology SHR | mdadm (RAID 1/5/6) | LVM2 volume group | Btrfs or ext4
QNAP QTS | mdadm (RAID 1/5/6) | LVM2 thick and thin (dm-thin) | ext4
Asustor / TerraMaster | mdadm (RAID 1/5/6) | LVM2 volume group | ext4 or Btrfs

Synology SHR slices mixed-capacity drives into uniform partitions, builds multiple mdadm arrays across those slices, pools every md device into one LVM volume group, and formats the resulting logical volume with Btrfs or ext4. SHR is not proprietary. It is mdadm plus LVM plus Btrfs/ext4, and it reassembles on a vanilla Linux workstation with no Synology hardware present. QNAP QTS follows the same pattern with standard mdadm under LVM2 thick or dm-thin pools.

The consequence for recovery is strict ordering. The mdadm layer must be reassembled read-only first, because the LVM PV and the filesystem do not exist until the md device is presented. Trying to repair LVM or mount Btrfs while the underlying software RAID is still inactive is futile, because the block device it lives on has not been built yet.

That same mdadm-then-LVM ordering drives our Synology SHR recovery and our QNAP QTS recovery, and it is the reason these are NAS recoveries first and filesystem recoveries second. Once the array is up, the LVM step runs through our Linux LVM recovery process and the on-disk Btrfs through our Btrfs recovery process.

How do professional engineers recover an mdadm array?

Our workflow is strictly offline and read-only. Every command runs against forensic clones, never the original drives. Imaging and reconstruction are two separate stages.

  1. Write-blocked imaging: Every member is cloned sector-by-sector with ddrescue, the PC-3000 Portable III, or the DeepSpar Disk Imager behind a write blocker. On drives with bad sectors or SMR stalls, the imager's adaptive read parameters and long timeouts pull the metadata regions and the data area without killing the drive.
  2. Map to read-only loop devices: Each image is attached with losetup -r so every later operation is physically read-only against the clone.
  3. Read-only superblock inspection: We run mdadm --examine on each loop device to read the UUID, chunk size, parity layout, data_offset, device role, and events counter, and to identify any stale member.
  4. Geometry validation when metadata is gone: When superblocks are missing, we carve the ext4 0xEF53 magic and block-group descriptors across the images to prove disk order, chunk size, and data_offset before assembling anything.
  5. Software reconstruction: The geometry is reassembled virtually against the image files, with mdadm --assemble --readonly where the on-disk metadata is usable, or with Data Extractor Express RAID Edition on our PC-3000 Express when the metadata is too corrupt for mdadm to parse. The hardware images the drives; the software rebuilds the array. Neither ever writes to a member.
  6. LVM, filesystem, mount, export: For NAS stacks we recover the LVM layer next, then mount the ext4 or Btrfs filesystem read-only and copy the data to a separate drive you provide.
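
Steps 2, 3, and 5 can be outlined as a dry-run plan in Python. The sketch below only builds the command lines and prints them; it executes nothing. The image file names, the /dev/md127 target, and the assumption that /dev/loop0 onward are free are all hypothetical, and plan_readonly_assembly is our own helper name.

```python
def plan_readonly_assembly(images, md_device="/dev/md127"):
    """Build (not run) the read-only command sequence for a set of clones.

    Assumes /dev/loop0..N are unused on the workstation.
    """
    loops = [f"/dev/loop{i}" for i in range(len(images))]
    commands = []
    # Step 2: attach every clone read-only.
    for loop, image in zip(loops, images):
        commands.append(["losetup", "-r", loop, image])
    # Step 3: inspect each member's superblock.
    for loop in loops:
        commands.append(["mdadm", "--examine", loop])
    # Step 5: assemble read-only once the metadata has been reviewed.
    commands.append(["mdadm", "--assemble", "--readonly", md_device, *loops])
    return commands


if __name__ == "__main__":
    clones = ["member0.img", "member1.img", "member2.img"]  # hypothetical names
    for argv in plan_readonly_assembly(clones):
        print(" ".join(argv))
```

Printing the plan before running anything keeps a human review between inspection and assembly, which is where a stale member gets caught and excluded.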

All work is performed in-house in Austin, TX. We use named equipment: PC-3000 Portable III, PC-3000 Express, Data Extractor Express RAID Edition, DeepSpar Disk Imager, and a 0.02 micron ULPA-filtered clean bench. Single location, no franchises, no outsourcing. Founded in 2008.

What should you avoid before sending the drives?

Most permanent mdadm data loss comes from advice that treats a logical array problem as a hardware problem. The following patterns are the ones we routinely undo, or fail to undo when it is already too late.

Running mdadm --create to "rebuild" the array
mdadm --create --assume-clean writes fresh superblocks. One wrong parameter shifts data_offset and turns the payload to garbage, and a later fsck finishes the job. Never run it on original media.
Scanning the NAS live over SSH or its IP address
Tools that promise to recover a NAS without unplugging the drives force the kernel to serve heavy reads through a degraded abstraction and retry weak sectors without limit. That converts recoverable surfaces into head crashes. Power down and image first.
Letting the appliance or a new chassis auto-rebuild
Booting the drives in a replacement NAS or accepting a firmware repair prompt can auto-initialize or resync from corrupt state, overwriting the user data area. Image first, migrate never.
Forcing assembly with the stale member included
mdadm --assemble --force with the lowest-event drive still in the set feeds outdated blocks into parity. Identify and exclude the stale member first, read-only, against clones.

How much does mdadm recovery cost?

mdadm recovery is priced per drive: each member is imaged and analyzed individually, and the virtual array reassembly is performed in-house at no separate line item. Standard consumer NAS and server drives use our HDD pricing tiers:

  • Array reassembly and filesystem recovery: From $250
  • Firmware repair (drive unrecognized or wrong size): $600–$900
  • Head swap (clicking or not spinning): $1,200–$1,500

Helium-filled enterprise drives (10TB and larger Toshiba MG, WD Ultrastar, Seagate Exos), common in modern arrays, use helium-specific pricing: From $200 through $3,000–$4,500. A multi-bay NAS or server array is priced as the sum of the applicable per-drive tiers.

Rush service adds a $100 fee to move your case to the front of the queue. Donor drives are matching drives used for parts. Typical donor cost: $50–$150 for common drives, $200–$400 for rare or high-capacity models. We source the cheapest compatible donor available.

No diagnostic fees. No data, no recovery fee. If we cannot reassemble your array and extract the filesystem, you pay nothing for the recovery attempt.

How long does mdadm recovery take?

Imaging takes roughly 6 to 10 hours per 4TB member. A clean array with intact superblocks is usually reassembled in 1 to 2 business days. A NAS stack that needs mdadm reassembly, LVM metadata recovery, and hex carving of a lost geometry typically takes 2 to 4 business days. Drives with bad sectors, SMR stalls, or mechanical damage need slower bitwise imaging with PC-3000, which adds time.

Frequently asked questions

Can I run mdadm --create --assume-clean to recover my array?

No. mdadm --create writes fresh superblocks to the member drives. If the disk order, metadata version (1.2 versus 1.0), chunk size, or parity rotation is wrong, the data_offset shifts and the filesystem payload becomes interlaced garbage. If LVM sat on top, the new superblock overwrites the PV header, and a later fsck zeroes the inodes it thinks are corrupt. Use mdadm --examine and mdadm --assemble --readonly against clones instead.

Why did my RAID 5 rebuild fail and crash the volume?

A rebuild reads every sector of every surviving drive to recompute the missing member's contents. Consumer drives carry a URE rate near 1 in 10^14 bits, about one per 12.5TB. A degraded four-bay array of 16TB drives forces about 48TB of reads, so hitting a URE is near-certain. Linux mdadm treats that URE as a dead drive, aborts the rebuild, and crashes the volume. RAID is availability, not a backup.

How do I safely reassemble a degraded mdadm array?

Never work on the originals. Clone each member with ddrescue or PC-3000 Portable III, attach the images read-only with losetup -r, read the superblocks with mdadm --examine, then assemble with mdadm --assemble --readonly and mount read-only to verify before copying anything off.

Can I recover NAS data without the original enclosure?

Yes. Synology, QNAP, Asustor, and TerraMaster run standard Linux mdadm with LVM on top, not a proprietary format. The drives parse on a vanilla Linux workstation using read-only loop devices and standard mdadm and LVM tools. The chassis is not required.

What does an mdadm events counter mismatch mean?

The members dropped out of the array at different times. The drive with the lowest events count is stale and holds old data. Forcing assembly without excluding that stale member feeds outdated blocks into the parity math and corrupts the reconstruction. We identify it with mdadm --examine before doing anything else.

Is Synology SHR a proprietary RAID format?

No. SHR is mdadm aggregating mixed-size partitions into RAID 1, 5, or 6 sets, LVM pooling those md devices into a volume group, and Btrfs or ext4 on top. It reassembles on any Linux workstation without Synology hardware. The work is Linux storage expertise, not a closed-format secret. The same is true of QNAP, Asustor, and TerraMaster.

Data Recovery Standards & Verification

Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to maintain drive integrity. This approach allows us to serve clients nationwide with consistent technical standards.

Open-drive work is performed in a ULPA-filtered laminar-flow bench, validated to 0.02 µm particle count, verified using TSI P-Trak instrumentation.

Transparent History

Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.

Media Coverage

Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.

Aligned Incentives

Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.

We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.

See our clean bench validation data and particle test video

Ship us your drives. We'll reassemble the array.

Linux mdadm software RAID recovery with offline read-only tools. No data, no recovery fee. Free diagnosis. Austin, TX lab.

(512) 212-9111, Mon-Fri 10am-6pm CT
No diagnostic fee
No data, no fee
4.9 stars, 1,837+ reviews