Btrfs RAID Recovery & Filesystem Repair

Btrfs is a Copy-on-Write B-tree filesystem with up to four redundant superblock mirrors. Recovery means imaging every drive first, then using read-only tools like btrfs-find-root and btrfs restore to extract data. We never run btrfs check --repair, because repair overwrites historical generation tree roots and destroys recoverable versions of your data.

No Data, No Fee Guarantee | 2.49M+ Subscribers | 4.9 stars, 1,837+ Google Reviews | Established 2008 | Repairs on Video, Full Transparency

Written by

Founder & Chief Technician

Updated 2026-06-29

What Btrfs is and why it fails differently

Btrfs stores data in a B-tree structure where every block pointer carries a checksum. The on-disk format defines up to four fixed superblock locations: 0x10000 (64 KiB), 0x4000000 (64 MiB), 0x4000000000 (256 GiB), and 0x4000000000000 (1 PiB). Current kernels and btrfs-progs only read and write the first three.

Only those locations that fall within the device size are written, so a smaller device carries fewer than four. When a NAS reports a crashed Btrfs volume, one or more of these superblocks has a mismatched generation number, or a tree block fails its checksum.
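The size rule above can be sketched in a few lines of shell. This is a minimal illustration, not btrfs-progs code: the helper name sb_offsets_for_size is hypothetical, the offsets are the four listed above, and the 4096-byte superblock size is an assumption of the sketch.

```shell
#!/bin/sh
# Fixed superblock offsets from the text above, in bytes:
# 64 KiB, 64 MiB, 256 GiB, 1 PiB.
SB_OFFSETS="65536 67108864 274877906944 1125899906842624"
# Assumption for this sketch: one superblock copy occupies 4096 bytes.
SB_SIZE=4096

# Print each offset whose superblock copy fits entirely inside a device
# of the given size in bytes (hypothetical helper).
sb_offsets_for_size() {
    size_bytes=$1
    for off in $SB_OFFSETS; do
        if [ $((off + SB_SIZE)) -le "$size_bytes" ]; then
            echo "$off"
        fi
    done
}

# Example: a 4 TB image (4000000000000 bytes)
sb_offsets_for_size 4000000000000
```

For a 4 TB image the 1 PiB offset falls outside the device, so only the first three offsets are printed.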

Because Btrfs is Copy-on-Write, it never overwrites a block in place. A new version is written elsewhere and the parent pointer is updated. This is why a sudden power loss during a write can leave a tree block pointing to a location that was never fully committed.

The result is a parent transid verify failed error or csum failed in dmesg. The filesystem will not mount because the tree is internally inconsistent.

Why btrfs check --repair destroys data

The btrfs-progs tool btrfs check --repair attempts to rebuild broken trees by scanning the entire device and rewriting metadata. On a CoW filesystem, rewriting metadata means allocating new blocks and updating pointers.

The old blocks are not erased, but the generation tree roots that pointed to them are overwritten. If the repair logic guesses wrong, you lose the historical roots that might have contained intact data.

Every recovery lab that recommends btrfs check --repair as a first step is gambling with your only remaining copies of the metadata tree. We do not run it on original drives. We image the drive first, then run read-only analysis on the image.

How Btrfs generation trees preserve older metadata

Btrfs generation trees preserve older metadata because Copy-on-Write writes new tree blocks and advances a root pointer instead of overwriting the old tree in place. A crash can damage the newest transaction generation while earlier roots still describe intact extents, snapshots, and directory trees.

The useful recovery target is not always the newest root. We compare generation numbers from btrfs inspect-internal dump-super against candidate roots reported by btrfs-find-root, then test extraction from images only. If generation 812304 points into a torn write but 812289 still walks the directory tree, the older root is the safer source.

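# Print the superblock: generation, total bytes, and root tree address
btrfs inspect-internal dump-super /mnt/images/member-array.img
# Scan the image for candidate tree roots (bytenr, generation, level)
btrfs-find-root /mnt/images/member-array.img
# Extract from one chosen root to separate storage; the image is only read
btrfs restore -t <root_bytenr> /mnt/images/member-array.img /recovery/output/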

How btrfs-find-root surfaces candidate roots

btrfs-find-root does not choose the recovery root by itself. It scans the imaged filesystem for root tree blocks and reports candidate bytenr, level, and generation values. The technician has to match those candidates against the superblock generation, chunk tree, and snapshot tree.

On Synology SHR and ReadyNAS OS6, that Btrfs layer usually sits above mdadm. We assemble the member images read-only first, then run Btrfs tools against the virtual array image. Running mdadm --create or btrfs check --repair before this step rewrites metadata that the root search depends on.

Bytenr: The byte address of a candidate tree root inside the Btrfs address space. btrfs restore -t uses this value to walk that root without modifying the source image.
Generation: The transaction number attached to a tree block. A lower generation can be more useful when the newest generation was interrupted by power loss or a dropped NAS member drive.
Level: The tree depth of the candidate block. A valid root must have a level that matches the expected B-tree structure for the root tree being restored.
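One way to keep track of candidates is to list them newest first and try them in that order. The sketch below assumes the technician has already transcribed candidates into plain "bytenr generation level" lines. It does not parse raw btrfs-find-root output, the helper name rank_candidates is hypothetical, and the bytenr values are made up for illustration.

```shell
#!/bin/sh
# Read "bytenr generation level" lines on stdin (a hand-made list, not
# raw btrfs-find-root output) and print the candidates whose generation
# does not exceed the superblock generation, newest first.
rank_candidates() {
    super_gen=$1
    awk -v max="$super_gen" 'NF == 3 && $2 + 0 <= max + 0' | sort -k2,2nr
}

# Hypothetical candidates; the generations reuse the example numbers
# from the text, the bytenr values are invented.
printf '%s\n' \
    '30408704 812289 1' \
    '31522816 812304 1' \
    '29360128 812250 1' |
    rank_candidates 812304
```

Each line of the result can then be tried in turn with btrfs restore -t against the image, moving to the next older generation when a root fails to walk.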

Why mounting a failing Btrfs drive is not safe

The first mistake users make is mounting a failing Btrfs drive read-write to look at the files. Any read-write mount triggers log-tree replay, which writes updated pointers to disk. If the log references corrupted blocks, replay can push bad pointers into the live filesystem structure and turn a recoverable volume into an unmountable one.

The older mount -o recovery option (renamed to usebackuproot in newer kernels) goes further. It tells Btrfs to fall back to backup tree roots and rewrite the active root pointer if the newest root is unreadable. That is an in-place modification of the Copy-on-Write history we depend on for recovery, even when paired with ro.

For data recovery, the safe sequence is to image the device first with ddrescue or the PC-3000 Portable III, then inspect the image with read-only tools such as btrfs inspect-internal dump-super and btrfs-find-root. If mounting an image is necessary, use ro,nologreplay (spelled rescue=nologreplay on kernel 5.9 and later) on a copy, never the original drive.
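A small pre-flight check can catch an unsafe option string before it reaches mount. This is a sketch under stated assumptions: mount_opts_are_safe is a hypothetical helper, and it only inspects the option names discussed above.

```shell
#!/bin/sh
# Return 0 only if a comma-separated mount option string is read-only,
# skips log replay, and does not request backup-root fallback.
# Hypothetical helper; it checks strings and never calls mount.
mount_opts_are_safe() {
    opts=",$1,"
    # Reject anything that writes or rewrites root pointers.
    case $opts in
        *,rw,*|*,recovery,*|*,usebackuproot,*) return 1 ;;
    esac
    # Require an explicit read-only flag.
    case $opts in
        *,ro,*) ;;
        *) return 1 ;;
    esac
    # Require log replay to be disabled, in either spelling.
    case $opts in
        *,nologreplay,*|*,rescue=nologreplay,*) return 0 ;;
        *) return 1 ;;
    esac
}

if mount_opts_are_safe "ro,nologreplay"; then
    echo "options accepted"
fi
```

The check is deliberately strict: a bare degraded, which mounts read-write by default, is rejected because it lacks ro.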

Read-only forensic toolchain

Our workflow uses four read-only or image-based tools in sequence. We run these on copies of your drives, never the originals.

Imaging: Every drive is imaged with ddrescue or the PC-3000 Portable III before any filesystem access. A 4TB drive takes 6 to 10 hours at full speed.
Superblock inspection: btrfs inspect-internal dump-super -a /dev/image prints every superblock copy present on the image and reports generation, total bytes, and root tree addresses. Without -a, only the primary copy is shown.
Root search: btrfs-find-root /dev/image scans for historical tree roots with valid generation numbers. A Btrfs filesystem may contain dozens of older roots from before the corruption event.
Data extraction: btrfs restore -t <bytenr> /dev/image /output/ extracts files from a specific historical root. This is entirely read-only on the source image.
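The four steps above can be written down as a dry-run plan before anything touches a drive. The sketch below only prints commands and runs none of them. The helper name print_recovery_plan, the paths, and the ddrescue flags (-d for direct access, -r3 for three retry passes) are assumptions chosen for illustration, not the lab's fixed settings.

```shell
#!/bin/sh
# Print (do not run) the read-only sequence for one source device.
# src, img, and out are placeholders supplied by the caller.
print_recovery_plan() {
    src=$1; img=$2; out=$3
    # 1. Image the drive, keeping a map file so imaging can resume.
    echo "ddrescue -d -r3 $src $img $img.map"
    # 2. Inspect the superblock on the image.
    echo "btrfs inspect-internal dump-super $img"
    # 3. Search the image for historical tree roots.
    echo "btrfs-find-root $img"
    # 4. Extract from a chosen root; <bytenr> is filled in by hand.
    echo "btrfs restore -t <bytenr> $img $out"
}

print_recovery_plan /dev/sdX /mnt/images/member.img /recovery/output/
```

Printing the plan first makes it easy to confirm that every Btrfs command points at the image and that only ddrescue ever names the original device.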

Btrfs RAID profiles and the write hole

Btrfs supports RAID 0, 1, 10, 5, and 6 natively. RAID 1 and 10 are stable and widely used. RAID 5 and RAID 6 inside Btrfs are not production-ready.

The Btrfs RAID 5 implementation has a known write hole: if the system crashes during a parity update, the stripe is left partially updated. A subsequent scrub or read may use the wrong parity block to reconstruct data, silently corrupting it.

Synology, QNAP, and TrueNAS do not use Btrfs native RAID 5 for their primary storage pools. Synology SHR uses mdadm RAID 1/5/6 underneath Btrfs. QNAP uses mdadm or ZFS for the RAID layer, and TrueNAS uses ZFS; neither relies on Btrfs native RAID.

If you encounter a Btrfs RAID 5 array, it is likely a custom Linux installation, and the write hole is a real risk.

Consumer drives carry a worst-case specification of one unrecoverable read error per approximately 10^14 bits read, about 12.5TB. That figure is a warranty floor, not a schedule, so a 48TB rebuild does not guarantee a read error; it makes the probability of encountering at least one unrecoverable sector high. If Btrfs RAID 5 hits a URE during a degraded scrub it cannot reconstruct that stripe, and the data in it is lost.
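The "high probability, not a guarantee" point can be put in numbers. The sketch below assumes errors are independent and occur at exactly the worst-case rate of one per 10^14 bits, which real drives usually beat. It uses the Poisson approximation 1 - exp(-bits * rate), and ure_probability is a hypothetical helper name.

```shell
#!/bin/sh
# Probability of at least one unrecoverable read error while reading a
# given number of decimal terabytes, assuming independent errors at the
# worst-case rate of 1 per 1e14 bits (Poisson approximation).
ure_probability() {
    awk -v tb="$1" 'BEGIN {
        bits = tb * 8e12          # 1 TB = 1e12 bytes = 8e12 bits
        printf "%.3f\n", 1 - exp(-bits * 1e-14)
    }'
}

ure_probability 12.5   # one full specification interval
ure_probability 48     # the 48TB rebuild from the text
```

Under these assumptions, reading 12.5TB gives roughly a 0.63 chance of at least one error and a 48TB rebuild gives roughly 0.98, which is high but still not a certainty.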

Because the Btrfs RAID 5 write hole makes degraded rebuilds risky, when the data is irreplaceable and has no backup we image every member read-only before any rebuild and reconstruct from the images.

Which Btrfs B-tree failed and what it means for recovery

Btrfs is not one flat structure. It stores its metadata in a stack of separate B-trees, and the tree that failed decides whether your data comes back read-only or whether the whole array needs raw destriping.

Most labs treat a Btrfs crash as one undifferentiated event. It is not.

Damage to the file system tree is localized and usually recoverable by walking an older Copy-on-Write generation. Damage to the chunk tree or extent tree severs the logical-to-physical map for every drive in the pool at once.

Root Tree (tree of tree roots): Holds the root nodes for every other B-tree (extent, chunk, FS, checksum). Failure shows as failed to read tree root, open_ctree failed, or corrupt node: root=1. Because of CoW, older root tree nodes from earlier transactions usually survive, so btrfs-find-root can scan the raw image for an older generation and route around the damage read-only, no destriping needed.
Chunk Tree: Maps logical Btrfs addresses to physical device byte offsets, and tracks which physical drive holds each chunk in a multi-device array. Failure shows as read_block_for_search: logical addr mirror 1 failed, scan chunk headers error, or type mismatch with chunk. Damage here destroys the logical-to-physical map. btrfs rescue chunk-recover exists but can falsely abort on valid extent metadata, so severe damage forces full raw destriping.
Extent Tree: Tracks allocated space and reference counts for all data and metadata blocks. Failure shows as bad extent type mismatch with chunk or Couldn't setup extent tree, and the filesystem drops to read-only. btrfs check --init-extent-tree is dangerous; recovery means read-only extraction with btrfs restore or deep lab reconstruction.
Checksum (csum) Tree: Stores the checksums (CRC32C by default) Btrfs uses to detect bit rot. If the tree itself is corrupt, Btrfs rejects perfectly valid data on a checksum failure. Recovery bypasses validation to pull the raw file blocks out.
FS Tree / subvolume trees: The actual file and directory hierarchy: inodes, INODE_ITEM records, directory entries, and EXTENT_DATA references. Failure shows as corrupt leaf: invalid extent data backref or read time tree block corruption detected. btrfs restore can target an older FS tree root objectid and walk the directory tree cleanly without touching the array.
Log Tree: The journal for fsync operations. A damaged log tree blocks mounting. It can be cleared with btrfs rescue zero-log, which writes to disk and therefore only runs on a clone.
Device Tree: Holds the physical device mapping. Failure shows as Couldn't setup device tree. Severe damage stops the array from assembling at all and forces raw destriping.
Free-Space Tree: Caches free-space tracking. Failure shows as extent buffer leaks or mount failures, and is cleared with btrfs rescue clear-space-cache on a clone.
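As a triage aid, the error strings quoted in the list above can be mapped back to the tree they most likely implicate. This is a rough sketch: classify_btrfs_error is a hypothetical helper, it matches only the strings quoted above, and real logs often contain several of them at once.

```shell
#!/bin/sh
# Map one kernel or btrfs-progs error line to the tree it most likely
# implicates, using only the strings quoted in the list above.
classify_btrfs_error() {
    case $1 in
        *"failed to read tree root"*|*"corrupt node: root=1"*)
            echo "root tree" ;;
        *"Couldn't setup extent tree"*|*"bad extent"*)
            echo "extent tree" ;;
        *"scan chunk headers error"*|*"read_block_for_search"*)
            echo "chunk tree" ;;
        *"Couldn't setup device tree"*)
            echo "device tree" ;;
        *"invalid extent data backref"*|*"read time tree block corruption detected"*)
            echo "fs tree" ;;
        # open_ctree failed is a catch-all, so it is checked last.
        *"open_ctree failed"*)
            echo "root tree" ;;
        *)
            echo "unknown" ;;
    esac
}

classify_btrfs_error "BTRFS error (device md2): open_ctree failed"
```

The extent-tree pattern is tested before the chunk-tree one because "bad extent type mismatch with chunk" mentions both, and the list above files it under the extent tree.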

btrfs check --readonly and btrfs rescue: what is safe on the original

btrfs check --readonly is the default, non-destructive diagnostic. It sits at the very start of a read-only workflow, before any btrfs restore attempt, and it does not modify the device.

It reports on root items, extents, free-space caches, csum items, and root refs, printing the exact errors above (for example bad extent type mismatch with chunk). That output tells the technician which tree failed before anything writes a single byte.

The trap is escalating it to btrfs check --repair as a reflex. Repair can scramble block allocations and should never be the routine next step after a read-only check. The check tells you what is wrong; it is not a license to let the same tool guess at fixing it.

Two flags go further than ordinary repair and are worth naming so you do not reach for them. btrfs check --init-extent-tree rebuilds the extent tree from scratch, and btrfs check --init-csum-tree recreates the checksum tree from scratch.

The btrfs-progs documentation labels both as dangerous: it warns not to use --init-extent-tree unless you know what you are doing, and states plainly not to use --init-csum-tree blindly to fix checksum mismatch problems. Each one throws away the existing tree pointers and tries to rebuild them, so if the underlying data is already fragmented or corrupt, running them can permanently bar a later professional recovery. We never run them on original media.

The btrfs rescue subcommands all write to disk, so they only ever run on forensic clones or images, never original client media:

btrfs rescue super-recover overwrites corrupted superblocks with valid backup copies found at the fixed offsets.
btrfs rescue chunk-recover scans for chunk headers to rebuild the chunk tree.
btrfs rescue zero-log clears the log tree so the volume can mount.

Each of these is a write operation. Running one on the only copy of your data turns a recoverable failure into a permanent one, which is why we clone first and run them against the clone.
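The clone-first rule can be enforced mechanically with a guard that refuses anything except an image file. The sketch below is stricter than the rule in the text, since it also rejects a clone that lives on a block device. Both helper names are hypothetical, and the rescue command is printed, not executed.

```shell
#!/bin/sh
# Succeed only when the target is a regular file (an image file),
# never a block device such as an original member drive.
is_image_file() {
    [ -f "$1" ] && [ ! -b "$1" ]
}

# Hypothetical wrapper: refuse write-capable rescue subcommands on
# anything that is not an image file.
rescue_on_image() {
    subcmd=$1; target=$2
    if ! is_image_file "$target"; then
        echo "refusing: $target is not an image file" >&2
        return 1
    fi
    echo "btrfs rescue $subcmd $target"   # printed, not executed
}

# /dev/null is not a regular file, so this call is refused.
rescue_on_image zero-log /dev/null || echo "guard held"
```

A wrapper like this costs nothing and removes the most common failure mode, which is a tired technician typing the original device path by habit.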

When partial recovery works and when the array needs full destriping

Whether your files come back from an older generation root or whether the array has to be raw-destriped comes down to which trees survived. The split is clean once you know which tree failed.

Partial CoW recovery is possible when the extent tree and chunk tree are intact but the root tree or FS tree is damaged. Because Btrfs writes new blocks instead of overwriting old ones, an older valid generation root can still walk the FS tree. btrfs-find-root locates those older generations and btrfs restore extracts the files read-only, without repairing the array at all.

Full re-image plus raw destriping is forced when the chunk tree or extent tree is corrupted across members. The chunk tree owns the logical-to-physical translation layer; once that map is gone, the RAID members can no longer locate their own data blocks.

Btrfs native RAID 5 and 6 make this worse, because the write hole leaves stripes inconsistent and a damaged chunk allocation tree means the logical-to-physical mapping across member images is lost. At that point recovery means deep forensic destriping: imaging each member with the PC-3000 Portable III, then using Data Extractor Express RAID Edition to identify the physical parity blocks and stripe size manually, bypassing Btrfs native logic.

On Synology SHR this stays simpler than the marketing labs claim, because SHR is just mdadm plus LVM plus Btrfs: we clone the members first, assemble the underlying blocks, then begin Btrfs recovery on the assembled image.
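The split described in this section reduces to a two-input decision. The sketch below encodes only that rule; recovery_path is a hypothetical helper, and the "ok" and "damaged" labels are inputs the technician supplies after a read-only check.

```shell
#!/bin/sh
# Choose the recovery route from the state of the two mapping trees.
# Each argument is "ok" or "damaged" (labels chosen for this sketch).
recovery_path() {
    chunk_tree=$1; extent_tree=$2
    if [ "$chunk_tree" = ok ] && [ "$extent_tree" = ok ]; then
        # Mapping survives: walk an older root with btrfs restore.
        echo "older-generation restore"
    else
        # Logical-to-physical map is gone: destripe from member images.
        echo "raw destriping"
    fi
}

recovery_path ok ok
recovery_path damaged ok
```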

Subvolumes and snapshots

Btrfs subvolumes are independent filesystem namespaces within the same partition. A snapshot is a read-only or writable subvolume that shares data blocks with its parent through the CoW mechanism. When a Btrfs filesystem is corrupted, the snapshot tree is often still intact even if the default subvolume is not.

Recovery means identifying which subvolume IDs are still reachable from a valid root. The btrfs subvolume list -t command on a mounted image shows the subvolume tree. If the default subvolume is damaged, we can extract data from an older snapshot by pointing btrfs restore at the snapshot's root, either by subvolume ID with -r or by tree bytenr with -f.

Layered failures: Btrfs over mdadm

Most prosumer NAS devices run Btrfs on top of a software RAID layer. Synology SHR uses mdadm plus LVM plus Btrfs. Netgear ReadyNAS OS6 uses mdadm plus Btrfs directly.

In these setups, the Btrfs corruption is often a symptom of an underlying RAID problem, not the root cause. If you are staring at a crashed volume on a Synology or ReadyNAS, the box-specific steps live on our NAS-specific Btrfs corruption recovery page.

A degraded mdadm RAID 5 array that is missing one drive will still assemble in read-only mode with mdadm --assemble --readonly --force. Btrfs can then be mounted on the assembled array with mount -o ro,degraded.

If the array was accidentally rebuilt (a drive was removed and reinserted, and the NAS started a rebuild), the parity is wrong and Btrfs checksums will fail across the entire stripe. In that case, we reassemble the pre-rebuild mdadm geometry from drive images and extract Btrfs from the original state.

What happens to a Btrfs native RAID-1 array when a drive goes missing?

When a Btrfs native RAID-1 array loses a device, the array can still be read, but only along a narrow safe path: a strictly read-only degraded mount run against drive images, never the originals. The reason a missing device is more fragile on Btrfs than on a hardware mirror comes down to how Btrfs places its copies.

Btrfs native RAID-1 is not the same thing as mdadm RAID-1. mdadm RAID-1 mirrors every block onto every member, so a 4-drive mdadm mirror holds four identical copies. Btrfs native RAID-1 keeps exactly two copies of each chunk on two different devices, no matter how many devices are in the pool.

Add a fifth drive to a Btrfs RAID-1 array and you get more capacity, not more copies. If you want three or four copies you have to ask for them explicitly with the RAID-1C3 and RAID-1C4 profiles. A single-device Btrfs filesystem sits at the other end of that scale: it defaults its metadata to DUP, two copies on the same disk.
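The copy counts are easy to misremember, so here they are as a lookup. This is an illustrative sketch with hypothetical helper names; the counts for single, raid0, and raid10 are included for contrast and are not discussed above.

```shell
#!/bin/sh
# Number of copies Btrfs keeps of each chunk for a given profile,
# independent of how many devices are in the pool.
btrfs_copies() {
    case $1 in
        single|raid0) echo 1 ;;
        dup|raid1|raid10) echo 2 ;;
        raid1c3) echo 3 ;;
        raid1c4) echo 4 ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

# mdadm RAID-1 mirrors onto every member, so copies equal the drive count.
mdadm_raid1_copies() {
    echo "$1"
}

echo "btrfs raid1, 5 drives: $(btrfs_copies raid1) copies"
echo "mdadm raid1, 5 drives: $(mdadm_raid1_copies 5) copies"
```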

That two-copies-on-two-devices rule is also why a degraded Btrfs RAID-1 array is dangerous to mount carelessly. By documented kernel behavior, a degraded Btrfs RAID-1 with a missing or dropped member is reliably mountable read-write only once while degraded. Mount it read-write a second time and the kernel can refuse with a too many missing devices failure, stranding the array.

The general mechanism is that with one device gone, the chunk allocator can no longer satisfy the rule that each new chunk needs two copies on two separate devices, so anything written during that one read-write window lands without its normal redundancy and a later mount sees critical metadata it cannot trust. We do not lean on that internal detail; we lean on the documented once-only behavior, and we never spend that one mount on the original drives.

The safe extraction path is a read-only degraded mount on imaged members:

mount -o ro,degraded /dev/loopX /mnt/recovery

The ro flag is the part forum threads leave out. Plenty of posts tell you to run mount -o degraded and copy your files off, which is a read-write mount by default and spends your one safe mount writing single-profile chunks into a half-redundant array. Image every member first, then mount the images read-only.

Before relying on a mount, confirm what the array thinks it has. Three btrfs commands describe the device set without changing array metadata:

btrfs filesystem show: Lists the filesystem by UUID and flags a member that is registered but not present, so a missing device ID surfaces here before you mount anything.
btrfs device usage: Reports how block-group allocation is spread across members. It takes a mounted path, so it runs against the read-only mount of the images, not before it. A missing device shows as an allocation imbalance, which tells you which chunks lost a copy.
btrfs device scan: Registers the devices that are actually present by UUID so the kernel can pair them into the right filesystem. It reads identities; it does not write array metadata.

Common Btrfs error messages

parent transid verify failed: A child tree block has a generation number that does not match the one its parent expects. This usually means a torn write or a drive that acknowledged a write it did not complete. The filesystem will not mount.
csum failed: A data block or metadata block failed its checksum. In a RAID 1 or 10 setup, Btrfs can read the mirror copy. In a single-drive or RAID 0 setup, the block is lost unless a historical snapshot contains an older version.
open_ctree failed: The kernel could not open the root tree. This is a catch-all error that appears when superblock inspection or root tree traversal fails. The underlying cause is usually a parent transid verify failed at the root tree level.
block group has wrong amount of free space: The block group's accounting metadata does not match the actual free blocks. This happens after an unclean shutdown on a nearly full filesystem. Btrfs refuses to mount because it cannot guarantee allocation safety.

Pricing

Btrfs recovery is priced per drive, multiplied by the number of drives requiring imaging and analysis. Standard consumer NAS drives use our HDD pricing tiers:

File system recovery (logical): From $250
Firmware repair (unrecognized / wrong size): $600–$900
Head swap (clicking / beeping): $1,200–$1,500

Helium-filled enterprise drives (8TB and larger Toshiba MG, WD Ultrastar, Seagate Exos series) use helium-specific pricing, from $200 up to $3,000–$4,500. A 5-bay Synology with four standard drives and one helium drive would be priced as the sum of the applicable per-drive tiers, plus the array reconstruction fee ($400–$800).

Rush service adds a $100 fee to move to the front of the queue. Donor drives are matching drives used for parts. Typical donor cost: $50–$150 for common drives, $200–$400 for rare or high-capacity models. We source the cheapest compatible donor available.

No diagnostic fees. No data, no recovery fee. If we cannot extract your files, you pay nothing for the recovery attempt.

How we recover it

Our lab is at 2410 San Antonio Street, Austin, TX 78705. Nationwide service is mail-in. We do not have satellite locations or franchise partners.

Intake & imaging: Every drive is forensically imaged with ddrescue or the PC-3000 Portable III. We do not touch the original drives with repair tools.
RAID reassembly (if applicable): For mdadm-based arrays, we assemble the RAID on a Linux workstation using the original drive order and mdadm superblocks. For Btrfs native RAID, we map the chunk allocation tree to determine which drives hold which stripes.
Metadata analysis: We inspect the superblock copies and run btrfs-find-root to identify valid generation trees. If multiple historical roots exist, we test extraction from each to find the most complete dataset.
Data extraction: Files are extracted with btrfs restore to a separate storage array. Extracted data is verified by checksum where possible.
Return: Recovered data is returned on an external drive or via secure download. The original drives and images are retained for 30 days, then securely wiped.

All work is performed in-house. We use named equipment including the PC-3000 Portable III, PC-3000 Express, DeepSpar Disk Imager, and a 0.02 micron ULPA-filtered clean bench for mechanical work. Founded in 2008.

Frequently asked questions

Can I run btrfs check --repair on a corrupted Btrfs volume?

No. btrfs check --repair overwrites historical generation tree roots on a Copy-on-Write filesystem. This destroys older metadata versions that might contain intact data. The safe approach is read-only extraction with btrfs-find-root and btrfs restore.

Is mounting a failing Btrfs drive read-write safe?

No. Any read-write mount triggers log-tree replay, which writes to disk. On a degraded or corrupted Btrfs volume, any write risks updating pointers to bad blocks and making the filesystem unmountable. The recovery and usebackuproot options can rewrite the active root pointer even when combined with ro. Always image the drive first, then mount the image with ro,nologreplay if mounting is necessary.

What does parent transid verify failed mean?

It means a Btrfs tree block has a generation number that does not match its parent's expected transaction ID. This indicates a torn write, an interrupted scrub, or a drive that dropped writes. The filesystem will not mount until a valid root is found with btrfs-find-root.

Does Btrfs RAID 5 have a write hole?

Yes. Btrfs RAID 5 and RAID 6 are not production-ready and contain a known write hole. If a crash occurs during a parity update, the stripe becomes inconsistent. A subsequent scrub may corrupt data rather than repair it. Synology and most NAS vendors avoid Btrfs RAID 5 for this reason.

How do I recover a Synology SHR volume with Btrfs?

Synology SHR is standard Linux mdadm plus LVM plus Btrfs. It can be assembled on any Linux workstation with mdadm --assemble --readonly. The Btrfs layer is then accessible with standard btrfs tools. No proprietary hardware is required. If a recovery lab tells you SHR is a black box only their proprietary tool can read, walk away.

What is the safe way to extract data from a corrupted Btrfs filesystem?

Image every drive first with ddrescue or PC-3000 Portable III. Then run btrfs inspect-internal dump-super to inspect metadata, btrfs-find-root to locate a valid generation tree, and btrfs restore -t to extract files. Never run btrfs check --repair, mount the original drive read-write, or use the recovery or usebackuproot options.

Why do older Btrfs generation roots matter during recovery?

Older Btrfs generation roots matter because Copy-on-Write writes new tree blocks instead of overwriting old ones. If the newest root points into a torn write, an earlier generation can still describe intact extents, snapshots, and directory trees.

How does btrfs-find-root help without repairing the filesystem?

btrfs-find-root scans the imaged device for candidate root tree blocks and reports their bytenr and generation values. It does not repair the filesystem. The recovery step is choosing a readable root, then extracting files with btrfs restore -t to separate storage.

Can I just move the drives to a new NAS enclosure?

Only if the new enclosure uses the exact same RAID metadata format and drive order. Most NAS enclosures write their own configuration to the trailing sectors of each drive during initialization. Inserting your old drives into a new NAS often triggers an initialization that overwrites the mdadm or Btrfs superblocks, making recovery harder. Image the drives first, then experiment.

Why did my Btrfs RAID 1 volume crash if RAID 1 is supposed to be safe?

RAID 1 provides availability, not data protection. If both drives in a 2-drive RAID 1 have a corrupted Btrfs tree at the same logical offset, the filesystem has no good mirror to read. This can happen after a firmware bug, a simultaneous power event, or a bad RAM module that wrote incorrect data to both drives through the controller. The RAID layer did not fail; the filesystem layer above it did.

Are Btrfs snapshots a backup?

Snapshots are not a backup if they live on the same physical pool as the original data. A snapshot protects against accidental deletion or ransomware, but it does not protect against drive failure, controller corruption, or a fire in the server closet. A backup is a separate copy on separate hardware.

Do I need to send all drives if my NAS has a hot spare?

Send every drive that was part of the array at the time of failure, including the hot spare if it was ever activated. The Btrfs chunk tree and RAID geometry metadata are spread across all member drives. Missing one drive means missing part of the metadata tree, which can make the entire array unrecoverable.

How long does Btrfs recovery take?

Imaging takes 6 to 10 hours per 4TB drive. A 4-drive NAS typically takes 1 to 3 business days for analysis and extraction, assuming no mechanical failures. Drives with bad sectors or head degradation require additional time for bitwise imaging with the PC-3000.

Why does my Btrfs filesystem report open_ctree failed?

open_ctree failed points to corruption in the Btrfs root tree or superblock. Because Btrfs uses Copy-on-Write, older versions of the root tree usually still exist on the drive. A lab can locate a previous valid generation with btrfs-find-root and perform a read-only extraction without risking further damage to the original media.

Can btrfs check --repair fix a damaged extent tree?

No. btrfs check --repair should never be the routine fix. When the extent tree is damaged, repair blindly guesses block allocations, which often permanently destroys directory structures. We image the drives first and use read-only extraction instead of destructive in-place repair.

What happens if the Btrfs chunk tree is corrupted on a RAID array?

The chunk tree maps logical data to physical locations across your RAID members. If it is destroyed, the filesystem loses track of where your files live on the physical disks. Partial recovery tools fail at that point, and the array needs full raw destriping and manual reconstruction with lab equipment like Data Extractor Express RAID Edition.

Can I mount my degraded Btrfs RAID-1 to copy data off?

Only as a strictly read-only degraded mount, and only on drive images rather than your original disks. A degraded Btrfs RAID-1 with a missing device is documented as reliably mountable read-write just once, so a read-write mount spends that one chance and can leave the array refusing to mount again with a too many missing devices error. Image every member first, then run mount -o ro,degraded against the images.

Does Btrfs RAID-1 keep a copy on every drive?

No. Btrfs native RAID-1 keeps exactly two copies of each chunk on two different devices, regardless of how many devices are in the pool. That is different from mdadm RAID-1, which mirrors onto every member. If you want three or four copies on Btrfs you have to use the RAID-1C3 or RAID-1C4 profiles, and a single-device Btrfs defaults its metadata to two copies on the same disk (DUP).

Ship us your drives. We'll extract the data.

Btrfs recovery with read-only forensic tools. No data, no recovery fee. Free diagnosis. Austin, TX lab.

Call (512) 212-9111, Mon-Fri 10am-6pm CT