
Pool Recovery & RAIDZ Reconstruction

ZFS Data Recovery

We recover faulted ZFS pools by imaging every drive, parsing vdev labels and uberblocks, and importing the pool offline from cloned images. Covers TrueNAS, FreeNAS, Proxmox, Oracle Solaris, and Linux OpenZFS. Free evaluation. No data = no charge.

Written by Louis Rossmann, Founder & Chief Technician
Updated March 2026 · 16 min read

How ZFS Pools Fail and How We Reconstruct Them

ZFS stores all data and metadata in a Merkle tree rooted at the uberblock. Each pool contains one or more vdevs (mirror, RAIDZ1, RAIDZ2, or RAIDZ3), and each vdev distributes data across its member drives using variable-width stripes. When enough drives in a vdev fail that ZFS cannot reconstruct missing blocks from parity, the pool transitions to FAULTED and refuses import. Recovery requires imaging every drive in the pool, locating valid uberblocks in the 128-entry ring on each drive's ZFS labels, and force-importing the pool read-only from images to extract datasets and zvols.

ZFS checksums every block using fletcher4 by default, or SHA-256 when deduplication is enabled. This per-block verification catches silent corruption that traditional RAID controllers miss. The trade-off: when ZFS detects a checksum mismatch and cannot correct it from parity, it returns an I/O error instead of serving corrupt data. Scrub errors on a degraded pool signal that more blocks will become inaccessible if another drive fails.

ZFS On-Disk Architecture for Recovery Engineers

Targeted ZFS recovery requires understanding five on-disk structures that standard file-carving tools do not handle. Each structure has a specific role in the pool's self-describing metadata tree, and corruption at any level produces different symptoms.

Vdev Labels (L0, L1, L2, L3)

Every drive in a ZFS pool carries four label copies: L0 and L1 at the first 512 KB, L2 and L3 at the last 512 KB. Each label contains the pool GUID, vdev GUID, the full vdev tree encoded as an nvlist, and the uberblock ring (128 entries, 1 KB each). ZFS writes labels in a specific order to prevent all four from being corrupted by a single interrupted write. During recovery, we read all four labels from every drive image to find the most complete vdev tree and the highest-txg uberblock.
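As a quick orientation, the four label positions can be computed from the device size alone. A minimal sketch, assuming the standard 256 KB label size and end-of-device alignment to the label boundary:

```python
LABEL_SIZE = 256 * 1024  # each of the four vdev labels is 256 KB

def label_offsets(device_size):
    """Byte offsets of labels L0-L3 on a device of the given size.

    L0/L1 occupy the first 512 KB; L2/L3 sit in the last 512 KB,
    after rounding the device size down to the label boundary.
    """
    end = device_size - (device_size % LABEL_SIZE)  # align to 256 KB
    return {
        "L0": 0,
        "L1": LABEL_SIZE,
        "L2": end - 2 * LABEL_SIZE,
        "L3": end - LABEL_SIZE,
    }
```

This is why a wipe of only the start (or only the end) of a drive still leaves two recoverable label copies at the other end.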

Uberblock Ring and Transaction Groups

The uberblock is the root pointer for the entire pool. ZFS maintains a ring buffer of 128 uberblocks per label, written round-robin as each transaction group (TXG) commits. Each uberblock records the TXG number, a timestamp, and a block pointer to the Meta Object Set (MOS). The active uberblock is the one with the highest TXG that also has a valid checksum. When the latest TXG is corrupted, we walk backward through the ring to find an older, consistent state. The trade-off: rolling back to an earlier TXG means any data written after that TXG is lost. Because TXGs commit every few seconds under normal load, rolling back a handful of TXGs usually costs only seconds to minutes of data.
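The highest-valid-TXG selection the ring enables can be sketched as follows. This is a simplified illustration: it checks only each slot's magic number, whereas a real pass also verifies the embedded label checksum of every slot:

```python
import struct

UB_MAGIC = 0x00BAB10C  # uberblock magic ("oo-ba-bloc")
SLOT = 1024            # one uberblock slot per 1 KB (ashift=9 pools)

def best_uberblock(ring):
    """Pick the highest-TXG uberblock from a raw 128 KB ring buffer.

    Simplified sketch: a real recovery pass also validates the
    ZIO label checksum embedded in each 1 KB slot before trusting it.
    """
    best = None
    for i in range(len(ring) // SLOT):
        slot = ring[i * SLOT:(i + 1) * SLOT]
        # Leading fields of uberblock_phys_t: magic, version, txg,
        # guid_sum, timestamp (all 64-bit, pool-native endianness).
        magic, version, txg, guid_sum, timestamp = struct.unpack_from("<5Q", slot)
        if magic != UB_MAGIC:        # empty or corrupt slot
            continue
        if best is None or txg > best["txg"]:
            best = {"slot": i, "txg": txg, "version": version,
                    "timestamp": timestamp}
    return best
```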

DVA Pointers and the Block Pointer Tree

Every ZFS block pointer contains up to three Data Virtual Addresses (DVAs). Each DVA encodes the vdev ID, offset within the vdev, and the gang bit (indicating whether the block is a gang block split across multiple sub-blocks). The block pointer also stores the checksum of the target block, the compression algorithm, the logical and physical sizes, and the birth TXG. We use zdb -bbb on the imported pool image to traverse the full block pointer tree. When DVA pointers reference sectors on a failed drive, we reconstruct the missing data from RAIDZ parity (if within tolerance) or flag those blocks as unrecoverable.
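For illustration, decoding a single DVA from its two 64-bit on-disk words looks roughly like this. The bit layout follows the published OpenZFS on-disk format; this is a sketch, not production recovery code:

```python
def decode_dva(word0, word1):
    """Decode one 128-bit DVA from its two 64-bit words.

    Bit layout per the OpenZFS on-disk format:
      word0: [63:32] vdev id  [31:24] grid  [23:0] asize (512 B sectors)
      word1: [63] gang bit    [62:0] offset (512 B sectors)
    """
    return {
        "vdev":   (word0 >> 32) & 0xFFFFFFFF,
        "asize":  (word0 & 0xFFFFFF) << 9,         # sectors -> bytes
        "gang":   bool(word1 >> 63),
        "offset": (word1 & ((1 << 63) - 1)) << 9,  # sectors -> bytes
    }
```

The vdev ID in each DVA is what lets us map a missing block back to a specific failed drive during reconstruction.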

Dnode Objects and the Meta Object Set

The MOS is the top-level object set containing all pool-wide metadata: the dataset directory, the space map, the DDT (if dedup is enabled), and configuration objects. Each dataset within the pool has its own object set, and within that set, every file or directory is represented by a dnode. A dnode is a 512-byte structure that stores the object type, bonus data (such as the ZPL file attributes), and up to three block pointers for the object's data. When MOS corruption prevents normal import, we locate the MOS block pointer directly from the uberblock and traverse the dnode tree manually using zdb -dddd to enumerate datasets.

Space Maps and Free Space Tracking

ZFS tracks allocated and free space using space maps: on-disk logs of allocation and free events for each metaslab. Corrupted space maps do not lose user data, but they prevent ZFS from mounting the pool because it cannot determine which blocks are in use. Recovery involves bypassing the space map check during read-only import and rebuilding the space map from the block pointer tree. This is a metadata-only repair that does not modify user data blocks.

Raw Block Extraction via objset_phys_t

When catastrophic MOS corruption prevents any form of pool import, we fall back to raw block extraction. The objset_phys_t structure is the root of every dataset in ZFS; its metadnode leads to the array of dnode_phys_t entries, each holding block pointers (blkptr_t) to the actual file data. Using zdb -R pool vdev:offset:psize:d,lzjb,lsize, we read raw compressed data blocks directly from drive images, decompress them with the correct algorithm (LZ4, LZJB, or ZSTD), and reconstruct individual files by walking the dnode tree manually. This bypasses the entire ZFS import pipeline. It's the last-resort method for severely corrupted pools on custom-built NAS enclosures where no valid uberblock survives on any drive.

RAIDZ Parity Distribution and Rebuild Constraints

RAIDZ differs from traditional RAID 5/6 in a fundamental way: it uses variable-width stripes. Each logical block is distributed across a stripe whose width depends on the block's size and the number of drives in the vdev. This eliminates the RAID write hole without requiring a battery-backed cache, but it means recovery tools designed for fixed-stripe RAID arrays cannot parse RAIDZ data.

RAIDZ Level | Parity Drives | Fault Tolerance | Recovery When Exceeded
RAIDZ1 | 1 per stripe | 1 drive per vdev | Blocks on 2+ failed drives are unrecoverable from parity alone. Partial recovery based on which blocks landed on which drives.
RAIDZ2 | 2 per stripe | 2 drives per vdev | Blocks spanning 3+ failed drives lost. RAIDZ2 is the most common production configuration and offers a better recovery margin than RAIDZ1.
RAIDZ3 | 3 per stripe | 3 drives per vdev | Rarely exceeds tolerance in practice. Typically seen in large vdevs (8+ drives) where the probability of three simultaneous failures is non-trivial during resilver.
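The variable stripe width described above is why per-block geometry matters during reconstruction. A sketch of the allocated-size calculation, modeled on the OpenZFS vdev_raidz_asize() logic (simplified; ignores dRAID and gang blocks):

```python
def raidz_asize(psize, ndrives, nparity, ashift=12):
    """Allocated size of one block on a RAIDZ vdev, in bytes.

    Models vdev_raidz_asize(): the stripe width varies with block
    size, and the result is padded to a multiple of (nparity + 1)
    sectors so no unusably small gap is left behind.
    """
    sector = 1 << ashift
    d = (psize + sector - 1) // sector                    # data sectors
    rows = (d + (ndrives - nparity) - 1) // (ndrives - nparity)
    total = d + nparity * rows                            # data + parity
    pad = (nparity + 1) - (total % (nparity + 1))         # skip sectors
    if pad != (nparity + 1):
        total += pad
    return total * sector
```

Note how a single 4 KB block on a 6-wide RAIDZ2 vdev still consumes three sectors (one data, two parity), which is why per-block geometry, not a fixed stripe map, drives the reconstruction.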

For traditional hardware RAID arrays (Dell PERC, HP SmartArray, LSI MegaRAID), see our RAID data recovery service. RAIDZ is software RAID managed by ZFS and uses a different on-disk layout than controller-based arrays.

Hardware-Assisted RAIDZ Reconstruction with PC-3000

Software-only tools (ReclaiMe, UFS Explorer, DiskInternals) parse ZFS pools by reading vdev labels through the operating system's block device layer. When a drive has hardware-level read timeouts, firmware lockouts, or degraded heads, the OS cannot deliver clean sectors and the software tool stalls or returns garbage data.

We bypass the OS entirely using PC-3000 Data Extractor RAID Edition, which images each drive through a direct hardware interface (SAS or SATA) with configurable read timeout, head map, and sector retry parameters. Once all drives are imaged, the RAID Edition's autodetection module identifies the RAIDZ level (1, 2, or 3), block size, and stripe shift from the raw sector data. The "RAID Member Statistics" analytical method calculates the variable-width stripe distribution by comparing block entropy patterns across drive images, which is necessary because RAIDZ stripes vary in width per-block rather than using the fixed stripe width that traditional hardware RAID controllers impose. This hardware-assisted block shift identification is the difference between a successful and a failed TrueNAS data recovery when the software stack can't read the drives.

ZFS Pool Failure Scenarios We Recover

Faulted Pool After Drive Failures

The most common scenario. RAIDZ1 pools fault when two drives in the same vdev fail. RAIDZ2 pools fault on three failures. We image all drives (including failed ones) and reconstruct what parity can provide. Board-level repair on electrically failed drives can restore one member and bring the pool back within tolerance.

Failed Resilver

Resilvering writes parity data to all surviving members. If a surviving drive develops bad sectors during the resilver, ZFS cannot complete the rebuild. A mid-resilver failure is dangerous because parity is partially recalculated: some stripes reflect the old layout, others reflect the new. We handle this by imaging all drives and reconciling both parity states offline.

Corrupted Uberblock or MOS

Power loss during a TXG commit can corrupt the active uberblock or the MOS it points to. ZFS normally recovers by falling back to a previous TXG, but if multiple TXGs are affected (e.g., UPS failure during a long scrub), manual uberblock selection is needed. We use zpool import -T to target a specific TXG, or parse the uberblock ring manually with zdb -lu to find the last consistent state.

ZIL / SLOG Device Failure

The ZFS Intent Log (ZIL) records synchronous write transactions. If a dedicated SLOG device (typically a fast NVMe SSD) fails, any uncommitted synchronous writes are lost. For pools where the SLOG failed during active database writes or NFS/iSCSI operations, we image both the SLOG device and the pool drives. If the SLOG contains recoverable log records, we replay them into the pool. If the SLOG is physically dead, the pool imports without those pending writes.

NVMe SLOG Firmware Lockout

TrueNAS and Proxmox administrators frequently assign Phison PS5012-E12 or PS5016-E16 based NVMe drives as dedicated SLOG devices for synchronous write caching. A sudden power loss can corrupt the Flash Translation Layer on these controllers, locking the drive to a 0-byte capacity state where the BIOS no longer detects it. The uncommitted ZIL transactions on that SLOG are lost to the operating system. We use PC-3000 Portable III to issue NVMe Vendor Specific Commands that force the controller into diagnostic mode, reconstruct the corrupted FTL mapping tables, and image the drive contents through the controller to recover uncommitted log records for replay into the pool. The same Phison controller failure modes affect consumer SSD data recovery outside the ZFS context.

Silicon Motion SM2262EN / SM2259XT Caching Drive Failures

ZFS administrators frequently assign consumer NVMe drives with Silicon Motion SM2262EN controllers (ADATA SX8200 Pro, HP EX950) or SATA drives with DRAM-less SM2259XT controllers as L2ARC or SLOG devices. A power loss can corrupt the Flash Translation Layer on these controllers, locking the drive into a 0-byte or diagnostic 2MB capacity state where the BIOS no longer detects usable storage. Standard ZFS tooling cannot access the drive in this state. We use the PC-3000 SSD utility to issue vendor-specific commands that force the controller into diagnostic mode, rebuild the corrupted FTL mapping tables from the NAND spare area, and image the drive contents to recover uncommitted ZIL transactions for replay into the pool. The same SM2259XT controller failures affect consumer SSDs outside the ZFS context.

ZIL Replay and Log-Write Block (LWB) Extraction

Synchronous writes in ZFS are stored as intent transactions (itx records) packaged into Log-Write Blocks (LWBs) within the ZFS Intent Log. When a dedicated SLOG device fails mid-commit, uncommitted LWBs are stranded on the dead NVMe drive. Unlike hardware RAID arrays with battery-backed caches, the ZIL relies entirely on the SLOG device committing itx records safely to persistent storage. We use PC-3000 Portable III to reconstruct the corrupted FTL on the failed SLOG (typically Phison E12 or SM2262EN controllers), image the raw NVMe namespace, and extract the uncommitted LWBs. The isolated itx records are then replayed into the pool's transaction groups to recover pending database writes and NFS/iSCSI operations that would otherwise be lost.

Accidental Pool Destruction

Running zpool destroy wipes labels from each drive. Running zpool labelclear does the same to individual drives. If no new data has been written afterward, the uberblocks and block pointer tree remain on disk at their original offsets. We scan for the characteristic uberblock magic number (0x00bab10c) and pool GUID to locate and reconstruct from them.
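A sketch of that magic-number scan, scanning at 1 KB slot boundaries in both byte orders since pools record their native endianness. This is a hypothetical helper; a real scan also validates the surrounding uberblock fields and label checksums before trusting a hit:

```python
import struct

UB_MAGIC = 0x00BAB10C

def scan_for_uberblocks(image, step=1024):
    """Find candidate uberblocks in a raw drive image.

    Checks each 1 KB slot boundary for the uberblock magic in both
    little- and big-endian byte order and returns (offset, txg) pairs.
    """
    hits = []
    for off in range(0, len(image) - 24, step):
        for fmt in ("<", ">"):                     # both byte orders
            magic, _version, txg = struct.unpack_from(fmt + "3Q", image, off)
            if magic == UB_MAGIC:
                hits.append((off, txg))
                break
    return hits
```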

Dedup Table Corruption

ZFS deduplication stores a Dedup Table (DDT) that maps block checksums to their physical locations. The DDT consumes RAM (about 320 bytes per entry) and is backed by on-disk log-structured storage. When the DDT becomes corrupted (common when pools run low on memory under heavy dedup loads), files referencing deduplicated blocks cannot be resolved. We rebuild the DDT by scanning the full block pointer tree and reconstructing the checksum-to-DVA mappings.
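The RAM arithmetic behind that failure mode is straightforward. A worst-case estimate, assuming every record is unique and the commonly cited ~320 bytes per DDT entry:

```python
def ddt_ram_bytes(pool_bytes, recordsize=128 * 1024, entry_bytes=320):
    """Rough worst-case RAM footprint of a fully populated dedup table.

    Assumes every record is unique at ~320 bytes per DDT entry;
    real tables are smaller by the achieved dedup ratio.
    """
    entries = pool_bytes // recordsize
    return entries * entry_bytes

# 10 TiB of 128 KB records -> ~84 million entries -> ~25 GiB of RAM
```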

dRAID Vdev Failure (OpenZFS 2.1+)

Distributed RAID (dRAID), introduced in OpenZFS 2.1 and available in TrueNAS SCALE 23.10+, replaces traditional RAIDZ hot spares with distributed spare space across all pool members. dRAID uses precomputed permutation maps to shuffle parity across drives, enabling faster sequential resilvering. Consumer recovery software (Klennet, UFS Explorer, ReclaiMe) cannot parse dRAID because the permutation map defines a non-standard block routing that doesn't match fixed-width RAIDZ striping. We use PC-3000 Data Extractor RAID Edition to manually define the tabular matrix configuration matching the dRAID permutation, then reconstruct the distributed spare blocks when a sequential resilver fails due to cascaded hardware degradation across same-batch drives.

Persistent L2ARC Header Corruption

OpenZFS 2.0+ added persistent L2ARC, allowing cache tables to survive reboots by writing header metadata to the cache device. If the NVMe cache drive suffers a controller lockout or partial firmware corruption, the corrupted L2ARC header triggers a kernel panic or infinite hang during zpool import. The pool data is intact on the main vdevs; only the cache metadata is damaged. We bypass the panic by setting l2arc_rebuild_enabled=0 in the ZFS module parameters before import, which instructs OpenZFS to skip the persistent cache rebuild and mount the pool read-only for extraction.

ZFS Pool Recovery When Software Fails

When all drives are physically healthy but the pool refuses to import, the failure is in ZFS metadata, not hardware. These software-level failures require forensic analysis of the on-disk structures rather than clean bench work.

Corrupted Uberblock Ring

Each drive in the pool stores 128 uberblocks in a ring buffer within each of its four ZFS labels. If a power loss corrupts the active uberblock and the next several entries in the ring, ZFS cannot locate a valid root pointer to the Meta Object Set. We parse all 512 uberblock slots (128 per label × 4 labels) across every drive image using zdb -lu to find the highest transaction group with a valid checksum. If no valid uberblock survives on a given drive, we cross-reference the rings on other pool members, which are written independently and may be intact.

Destroyed Spacemap

Spacemaps log allocation and free events for each metaslab. Corruption typically occurs when a pool runs at 95%+ capacity and a write fails mid-commit. The pool refuses to mount because ZFS cannot verify which blocks are allocated. User data remains on disk at its original offsets. We import the pool read-only with spacemap validation bypassed and rebuild the allocation table from the block pointer tree, which is an independent metadata structure that records every live block's location.
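The replay that rebuilds allocation state can be sketched as a log reduction. Simplified: real space map entries are packed 64-bit words rather than tuples, but the append-only reduction ZFS performs at metaslab load is the same:

```python
def replay_spacemap(entries):
    """Replay a space map's ALLOC/FREE log into a set of live extents.

    entries: ("A" | "F", offset, length) tuples, oldest first.
    An "A" record marks an extent allocated; a matching "F" record
    later in the log frees it again.
    """
    allocated = set()
    for op, offset, length in entries:
        extent = (offset, length)
        if op == "A":
            allocated.add(extent)
        elif op == "F":
            allocated.discard(extent)
    return allocated
```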

Vdev GUID Mismatch

Every vdev and pool has a unique GUID recorded in the ZFS labels. If a drive is replaced and the resilver aborts partway through, the new drive carries a different GUID than the pool expects. OpenZFS refuses the import because the vdev tree no longer matches the topology stored in the uberblock. This is common on TrueNAS SCALE systems where a failed resilver leaves the replacement drive with an incomplete label. ZFS identifies drives by on-disk GUIDs, not by OS device names, so the mismatch is a metadata conflict. We reconstruct the correct topology from historical labels and reimport with the original GUIDs. Unlike hardware RAID recovery, where the controller stores configuration on a dedicated ROM chip, ZFS distributes its configuration across all member drives, so the topology can always be rebuilt from the surviving labels.

OpenZFS 2.2 Block Cloning Corruption

OpenZFS 2.2 introduced block cloning via the copy_file_range(2) system call (used by coreutils 9.x and cp --reflink). A concurrency bug in the DMU offset reporting causes chunks of cloned files to be silently replaced by zeroes. Standard zpool scrub won't catch it because the checksum matches the zero-filled block that ZFS committed. This affects TrueNAS SCALE, Proxmox, and any Linux system running OpenZFS 2.2.0 through 2.2.2. During forensic extraction of affected pools, we set zfs_dmu_offset_next_sync=0 to bypass the timing window and extract uncorrupted blocks from historical transaction groups where the original data predates the cloning operation. Recovery from this bug requires comparing block birth TXGs against the cloning timestamp to identify which files contain zeroed chunks.

Platform-Specific ZFS Recovery Notes

ZFS runs on four major platforms. The on-disk format is cross-platform compatible (a pool created on Solaris can be imported on Linux), but encryption layers, feature flags, and bootloader integration differ across platforms.

TrueNAS / FreeNAS

TrueNAS CORE (FreeBSD) uses GELI disk-level encryption. TrueNAS SCALE (Debian Linux) uses ZFS native encryption at the dataset level. Both require key material for encrypted pools. See our dedicated TrueNAS recovery page for GELI-specific workflows. GELI keys are stored on the boot pool; if the boot drive is lost and no backup exists, the encrypted pool is unrecoverable.

QNAP QuTS hero

Newer enterprise QNAP NAS devices (TVS-x72XT, TVS-hx74, TS-x73A series) run the QuTS hero operating system, which replaces QTS's traditional ext4/mdadm stack with a full ZFS implementation. QNAP adds proprietary volume management wrappers and SSD caching layers on top of ZFS. We image the individual drives, bypass the QNAP hardware interface, and parse the QuTS hero ZFS pool offline using standard OpenZFS tooling. For other QNAP models running standard QTS with mdadm, see our NAS data recovery service.

Proxmox VE

Proxmox uses OpenZFS on Linux for VM storage. Pools typically store qcow2 disk images or zvols used as raw block devices by KVM/QEMU. Recovery involves importing the pool from images and extracting the guest VM disk files, then mounting the guest filesystem (NTFS, ext4, XFS) to verify the VM data. See Proxmox recovery for Ceph-related failures.

Oracle Solaris

Solaris is the original ZFS platform and may use older pool versions (pre-feature flags, pool version 28 or earlier). Older Solaris pools lack features like LZ4 compression and large dnode support. Recovery is straightforward if the pool version is identified correctly; we import on a matching OpenZFS version or use Solaris-native tools when feature flags are incompatible.

Linux OpenZFS

Ubuntu, Debian, Fedora, and Arch all support OpenZFS through the ZFS on Linux (ZoL) kernel module. Common in custom-built NAS servers and Proxmox hosts. Linux OpenZFS supports ZFS native encryption (dataset-level AES-256-GCM). Pool recovery is identical to other platforms once drives are imaged; the key difference is that Linux systems sometimes mix ZFS and mdadm (e.g., mdadm mirror for boot, ZFS pool for data), which requires handling both metadata formats.

Recovering VMware VMFS Datastores from ZFS iSCSI Zvols

TrueNAS Enterprise and custom ZFS servers frequently export zvols as iSCSI targets consumed by VMware ESXi hosts as VMFS datastores. ZFS treats each zvol as a raw block device and has no awareness of the VMFS structures or .vmdk files inside it.

When the underlying ZFS pool faults, the ESXi host loses access to the datastore and all VMs on it go offline. Recovery requires reconstructing the ZFS pool from drive images, importing it read-only, and extracting the raw zvol as a binary image. We then parse the VMFS volume header from the zvol image to locate the file descriptor table and extract individual .vmdk flat extents without requiring the original ESXi hypervisor. Each extracted VM disk is mounted independently to verify guest filesystem integrity (NTFS, ext4, XFS). For ESXi-specific failure modes outside the ZFS layer, see our VMware ESXi recovery service.

On-Disk Format Differences Across ZFS Implementations

Pool version 28 is the last universally interoperable format across all ZFS implementations. Modern pools use feature flags instead of version numbers, and these flags diverge between OpenZFS on Linux, FreeBSD, and legacy Solaris.

Feature Flag | ZFS-on-Linux | FreeBSD ZFS | Solaris (pre-OpenZFS) | Recovery Impact
large_dnode | Supported (ZoL 0.7+) | Supported (FreeBSD 12+) | Not supported | Pools with large_dnode cannot be imported on Solaris. Recovery environment must run OpenZFS 0.7+.
spacemap_v2 | Supported (ZoL 0.8+) | Supported (FreeBSD 13+) | Not supported | Older ZoL versions (0.7.x) cannot read spacemap_v2 pools. Importing on a mismatched version produces a ZFS-8000-A5 error.
allocation_classes | Supported (ZoL 0.8+) | Supported (FreeBSD 13+) | Not supported | Pools using special allocation classes (metadata vdevs) require a recovery environment that supports this feature.
Native encryption | Dataset-level AES-256-GCM | GELI (disk-level) or native | Oracle proprietary | Encryption type determines key handling. GELI keys live on the boot pool; native encryption keys are per-dataset. Losing the key means the data is unrecoverable regardless of pool health.

During recovery, we match the import environment's OpenZFS version to the pool's feature flags. Attempting to import a pool on an older OpenZFS version that lacks a required feature flag produces a ZFS-8000-A5 error and refuses the import entirely. We maintain recovery workstations running multiple OpenZFS versions to handle pools from NAS enclosures and Proxmox hosts running different kernel versions.

LSI HBA Firmware Crashes and ZFS Label Destruction

ZFS requires SAS/SATA Host Bus Adapters flashed to IT (Initiator Target) mode for direct disk access. Broadcom/LSI controllers (9211-8i, 9300-8i, 9400-8i) are the standard in enterprise ZFS deployments. If the HBA firmware reverts to IR (Integrated RAID) mode during a power event, the controller writes DDF RAID metadata at the end of each attached drive, destroying ZFS labels L2 and L3.

Administrators can verify their firmware mode by running sas2flash -list (for SAS2 controllers) or sas3flash -list (for SAS3). If the output shows IR firmware where IT was expected, the labels at the end of each drive have been overwritten by DDF metadata. We recover these pools by calculating the exact byte offset of the IR-mode metadata overlay and parsing the surviving L0 and L1 labels at the beginning of each drive to reconstruct the vdev topology. Because ZFS stores redundant label copies at both ends of every disk, a single-end overwrite is recoverable if the drives are not subsequently reformatted. For pools where the HBA also managed a hardware RAID array, the recovery becomes a dual-layer operation: reconstruct the hardware RAID geometry first, then parse ZFS structures from the reconstructed logical volume.
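A sketch of the overlap check we run first. The 32 MB DDF region size here is an assumed figure for illustration; the actual reserved size varies by controller and firmware:

```python
LABEL = 256 * 1024  # one vdev label

def labels_hit_by_ddf(disk_size, ddf_region=32 * 1024 * 1024):
    """Which ZFS labels an end-of-disk DDF metadata write overlaps.

    DDF anchors its header at the last sector and reserves a region
    at the end of the drive. L2/L3 live in the last 512 KB, so they
    fall inside that region; L0/L1 at the start of the drive survive.
    """
    ddf_start = disk_size - ddf_region
    labels = {"L0": 0, "L1": LABEL,
              "L2": disk_size - 2 * LABEL, "L3": disk_size - LABEL}
    return [name for name, off in labels.items() if off + LABEL > ddf_start]
```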

Emergency Steps for a Failed ZFS Pool Import

If your ZFS pool will not import, follow these steps before running any destructive commands. Each step is diagnostic only and does not write to the pool.

  1. Check kernel I/O errors. Run dmesg | grep -i "error\|fault\|reset" to identify which drives reported hardware-level errors before ZFS faulted the pool. SCSI sense codes or ATA timeout messages point to the specific failing drive.
  2. Attempt a standard import without the -f flag. Run zpool import (no arguments) to list all visible pools and their state. If the pool appears as UNAVAIL or DEGRADED, note which vdevs are missing. Do not use -f at this stage.
  3. Verify ZFS labels on each drive. Run zdb -l /dev/sdX on each pool member to confirm label presence and read the vdev GUID, pool GUID, and highest transaction group number. Drives missing all four labels were either wiped or belong to a different pool.
  4. Power down if mechanical failure is suspected. Clicking, grinding, or repeated spin-up/spin-down cycles indicate physical drive failure requiring clean bench work. Continued operation risks platter scoring. Do not attempt to offline/online the drive or force a resilver. Ship the drives to a no-fix-no-fee recovery lab for imaging under controlled conditions.

Commands That Destroy ZFS Recovery Options

The following commands, commonly suggested in forum posts, will reduce or eliminate recovery chances if run on a failing pool. If a pool is degraded or faulted, executing these commands permanently overwrites the historical transaction groups required for offline forensic reconstruction. Power down the system instead.

zpool import -f on a faulted pool

Forces import of a pool that ZFS has refused. This writes new TXGs to the pool, overwriting the metadata ZFS needs for self-consistency checks. If the pool is faulted due to drive failures, the forced import will record that the failed drives are absent, and any subsequent export-reimport cycle will reference the damaged state rather than the pre-failure state.

How Forced Imports Destroy the Uberblock Ring

ZFS uses a copy-on-write model: every zpool import -f allocates new objset_phys_t metadata trees and commits a new Transaction Group (TXG) to every surviving drive. The 128-entry uberblock ring is a circular buffer; each new TXG overwrites the oldest entry. A forced import on a degraded pool permanently records the missing drives as absent in the new TXG baseline. Once that TXG is written, rolling back to the pre-failure topology becomes impossible because the uberblock entry that referenced the original vdev tree has been overwritten. If the forced import triggers additional write activity (scrub commands, dataset mounts, ZIL replays), multiple uberblock entries are consumed in rapid succession, shrinking the recovery window from 128 TXGs to as few as a handful. This is why we image every drive before attempting any import variant.
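The shrinking window can be modeled directly. A toy simulation, not ZFS code:

```python
def surviving_txgs(ring_txgs, new_writes):
    """Model how post-failure writes consume the 128-slot uberblock ring.

    ring_txgs: TXG numbers currently in the ring, oldest first.
    Each newly committed TXG overwrites the oldest slot, so every
    write after a forced import removes one historical state we
    could otherwise roll back to.
    """
    ring = list(ring_txgs)
    next_txg = max(ring) + 1
    for _ in range(new_writes):
        ring.pop(0)              # oldest historical state lost
        ring.append(next_txg)
        next_txg += 1
    return ring
```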

zpool clear followed by resilver on a degraded pool

Clearing errors and resilvering writes parity data across all surviving drives. If any surviving drive has developing bad sectors (common with same-batch drives of the same age), the resilver can trigger that drive to fail, pushing the pool past its parity tolerance. We see this cascade failure regularly.

zfs_max_missing_tvds tunable

This kernel tunable allows ZFS to import a pool with missing top-level vdevs. Setting it to a non-zero value and importing writes new TXGs that permanently record the missing vdevs as absent. If you then add the missing vdevs back, ZFS treats them as foreign devices and will not reattach them. The original pool topology is overwritten. This tunable is a last-resort forensic tool, not a recovery shortcut.

Our ZFS Recovery Methodology

1. Drive Imaging with PC-3000

Every drive in the pool is imaged through PC-3000 with write-blocking. SAS drives (common in TrueNAS Enterprise and Solaris servers) are imaged via SAS HBAs in IT mode. For drives with bad sectors, we capture healthy regions first using sector maps, then retry damaged areas with aggressive read parameters. Drives with mechanical failures (clicking, motor seizure) receive clean bench work before imaging: head swaps, motor transplants, or platter stabilization, all performed under 0.02 µm ULPA filtration.

2. Vdev Label Analysis

We read all four ZFS labels from each drive image using zdb -l. The labels contain the pool name, pool GUID, vdev GUID, vdev tree (encoded as an nvlist), and the 128-entry uberblock ring. By comparing labels across all drive images, we reconstruct the complete vdev topology even when some drives have corrupted labels. The vdev tree tells us which drives belong to which vdev, whether each vdev is a mirror or RAIDZ, and the ashift (sector size alignment, typically 9 for 512-byte or 12 for 4K-native drives).

3. Uberblock Selection and TXG Rollback

The uberblock ring on each drive contains the last 128 transaction groups. We examine each uberblock using zdb -lu to find the highest TXG with a valid checksum. If the latest TXG is corrupted, we roll back to an earlier state using zpool import -T [txg]. The data loss from TXG rollback is limited to writes that occurred between the target TXG and the failed TXG. For most failures triggered by drive loss rather than active corruption, the rollback window is seconds.
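The rollback cost is simple arithmetic, assuming the default zfs_txg_timeout of 5 seconds; pools under heavy write load commit more often, so the real window can be shorter:

```python
def rollback_window_seconds(current_txg, target_txg, txg_interval=5):
    """Approximate data-loss window for a TXG rollback.

    OpenZFS commits a transaction group roughly every
    zfs_txg_timeout seconds (default 5 under light load), so
    rolling back k TXGs discards about 5k seconds of writes.
    """
    return (current_txg - target_txg) * txg_interval

# rolling back 6 TXGs -> about 30 seconds of writes lost
```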

4. Offline Pool Import and Dataset Extraction

The pool is imported read-only from the drive images using loopback devices on a dedicated recovery workstation. We verify the pool status, check for data errors using zpool status -v, and extract individual datasets, zvols, and snapshots. For zvols used as VM storage (Proxmox, bhyve), we mount the guest filesystem to verify the VM data is intact. Snapshots are preserved; if the live dataset has corruption but a recent snapshot is clean, we recover from the snapshot.

ZFS Recovery Pricing

Same transparent model as our RAID recovery pricing: per-drive imaging based on each drive's condition, plus a $400-$800 pool reconstruction fee covering vdev analysis, pool import, and dataset extraction. No data recovered = no charge.

Service Tier | Price Range (Per Drive) | Description
Logical / Firmware Imaging | $250-$900 | Firmware module damage, SMART threshold failures, or filesystem corruption on individual pool members.
Mechanical (Head Swap / Motor) | $1,200-$1,500 (50% deposit) | Donor parts consumed during transplant. SAS drives (common in enterprise ZFS servers) require SAS-specific donors.
ZFS Pool Reconstruction | $400-$800 per pool | Vdev reconstruction, uberblock analysis, pool import, and dataset/zvol extraction. Includes ZFS native decryption or GELI decryption if key material is provided.

No Data = No Charge: If we recover nothing from your ZFS pool, you owe $0. Free evaluation, no obligation.

Before sending drives: export your encryption key (GELI recovery key for TrueNAS CORE, ZFS encryption passphrase for TrueNAS SCALE or Linux). Note the pool name and vdev layout from zpool status if the system still boots.

Lab Location and Mail-In

All ZFS recovery work is performed in-house at our Austin lab: 2410 San Antonio Street, Austin, TX 78705. Walk-in evaluations are available Monday - Friday, 10 AM - 6 PM CT. For clients outside Austin, we accept mail-in shipments from all 50 states. Ship drives in anti-static bags with foam padding. Label each drive with its slot number from the original system if possible.

Data Recovery Standards & Verification

Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to make sure your hard drive is handled safely and properly. This approach allows us to serve clients nationwide with consistent technical standards.

Open-drive work is performed in a ULPA-filtered laminar-flow bench, validated to 0.02 µm particle count, verified using TSI P-Trak instrumentation.

Transparent History

Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.

Media Coverage

Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.

Aligned Incentives

Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.


Louis Rossmann

Louis Rossmann's well-trained staff review our lab protocols to ensure technical accuracy and honest service. Since 2008, his focus has been on clear technical communication and accurate diagnostics rather than sales-driven explanations.

We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.

See our clean bench validation data and particle test video

ZFS Recovery: Common Questions

My ZFS pool shows FAULTED and zpool import fails. Can you recover the data?
Yes. A FAULTED pool means more drives have failed than the vdev's parity can absorb, so ZFS refuses to import rather than serve inconsistent data. We image all drives, including the failed ones, reconstruct vdev geometry from the four label copies stored at the start and end of each member drive, and force-import the pool from images to extract datasets.
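The label positions are fixed by the on-disk format, so they can be located on a raw drive image without importing the pool. A minimal sketch of the offset math (the alignment step approximates the kernel's vdev_label_offset() behavior; this is an illustration, not our production tooling):

```python
LABEL_SIZE = 256 * 1024  # each of the four ZFS labels is 256 KB

def zfs_label_offsets(device_size: int) -> list[int]:
    """Byte offsets of the four label copies (L0, L1, L2, L3).
    L0/L1 sit in the first 512 KB; L2/L3 in the last 512 KB,
    aligned down to the 256 KB label size."""
    aligned_end = (device_size // LABEL_SIZE) * LABEL_SIZE
    return [0, LABEL_SIZE,
            aligned_end - 2 * LABEL_SIZE, aligned_end - LABEL_SIZE]
```

Reading 256 KB at each of these offsets from a cloned image yields four candidate labels; comparing their embedded nvlists shows which copies survived an interrupted write.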
I ran zpool import -f and it made things worse. Is recovery still possible?
Usually yes. A forced import writes new transaction groups to the pool, which can overwrite metadata ZFS needs for self-repair. The severity depends on how much write activity occurred after the forced import. We image the drives in their current state and attempt recovery from historical transaction groups that predate the forced import.
Can you recover a ZFS pool after zpool destroy or zpool labelclear?
If no new data has been written to the drives after the destroy command, the uberblocks and metadata trees are still on disk. We scan for historical uberblocks at known offsets and reconstruct the pool from the most recent valid transaction group.
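The uberblock ring mentioned above can be scanned directly from any label read out of a drive image. A hedged sketch (assumes a little-endian pool member and the default 1 KB uberblock slot; pools created with ashift > 10 space entries at 2^ashift bytes instead):

```python
import struct

UB_MAGIC = 0x00bab10c        # uberblock magic number
UB_RING_OFFSET = 128 * 1024  # ring begins 128 KB into each 256 KB label
UB_SLOT = 1024               # default slot size

def best_uberblock(label: bytes):
    """Scan the 128-entry ring in one label and return (txg, slot)
    of the valid uberblock with the highest transaction group, or None.
    First three u64 fields of an uberblock: magic, version, txg."""
    best = None
    for slot in range(128):
        off = UB_RING_OFFSET + slot * UB_SLOT
        magic, _version, txg = struct.unpack_from("<QQQ", label, off)
        if magic == UB_MAGIC and (best is None or txg > best[0]):
            best = (txg, slot)
    return best
```

For a destroyed pool, we deliberately pick older ring entries than the newest one, walking back transaction groups until we find a tree whose checksums verify.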
Does it matter if I use RAIDZ1, RAIDZ2, or RAIDZ3?
The RAIDZ level determines how many drives can fail before the pool faults. RAIDZ1 tolerates one failure per vdev, RAIDZ2 tolerates two, RAIDZ3 tolerates three. Recovery complexity increases when failures exceed these thresholds because we must reconstruct data without parity assistance for the extra failed drives.
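For the single-parity case, the reconstruction math reduces to XOR: the one missing column of a stripe is the XOR of all surviving data and parity columns. A toy illustration only; real RAIDZ stripes are variable-width with parity interleaved per stripe, and RAIDZ2/RAIDZ3 add Galois-field parity columns that need more than plain XOR:

```python
def xor_reconstruct(surviving: list[bytes]) -> bytes:
    """XOR all surviving columns of a single-parity stripe together;
    the result is the one missing column (data or parity alike)."""
    out = bytearray(len(surviving[0]))
    for col in surviving:
        for i, b in enumerate(col):
            out[i] ^= b
    return bytes(out)

# parity = d0 ^ d1 ^ d2; if d1 is lost, d0 ^ d2 ^ parity rebuilds it
d0, d1, d2 = b"\x01\x02", b"\x0f\x0f", b"\xf0\x00"
parity = xor_reconstruct([d0, d1, d2])
assert xor_reconstruct([d0, d2, parity]) == d1
```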
My ZFS pool uses deduplication and the DDT is corrupted. Can you recover files?
DDT corruption is one of the harder ZFS recovery scenarios. The dedup table maps block references to their physical locations on disk. If the DDT is damaged, files that reference deduplicated blocks cannot be resolved through normal import. We reconstruct the DDT from the block pointer tree by scanning every dnode in the pool and rebuilding the reference map.
How is ZFS recovery priced?
Per-drive imaging based on each drive's condition ($250-$900 per drive), plus a $400-$800 pool reconstruction fee covering vdev analysis, pool import, and dataset extraction. If we recover nothing, you pay $0.
Why does TrueNAS SCALE refuse to import my pool after a drive replacement?
TrueNAS SCALE pools transition to UNAVAIL if a resilver aborts mid-transaction group or if a replacement drive's ZFS label is overwritten, causing a vdev GUID mismatch. ZFS identifies drives by on-disk GUIDs in the vdev labels, not by OS device names, so the mismatch is a metadata conflict, not a Linux vs. FreeBSD enumeration issue. Recovery requires imaging all drives, locating historical vdev labels that predate the mismatch, and reconstructing the pool offline.
Can you recover a TrueNAS ZFS pool built on top of a Dell PERC hardware RAID?
Yes, but it is a dual-layer recovery. ZFS on top of hardware RAID (Dell PERC, HP SmartArray, LSI MegaRAID) prevents ZFS from seeing individual disks, disabling self-healing and SMART monitoring. When the hardware RAID degrades and causes a kernel panic, we first reconstruct the hardware RAID block geometry using PC-3000 to virtualize the logical unit, then parse the ZFS uberblocks and datasets from that reconstructed volume. This is more complex than standard ZFS recovery because the stripe width, parity layout, and block alignment of the hardware RAID must be resolved before ZFS metadata becomes readable.
Can data be recovered if the ZFS spacemap is corrupted?
Yes. Spacemaps track allocated and free blocks for each metaslab. Corruption prevents normal pool mounting because ZFS cannot determine which blocks are in use, but user data blocks remain intact on disk. We bypass the spacemap check during a read-only import and extract datasets directly from the block pointer tree.
Is it safe to run zpool import -F on a degraded or unmountable pool?
The -F (rewind) flag forces ZFS to discard the most recent transaction groups until it finds a consistent state. This irreversibly destroys the most recent writes (typically 5 to 30 seconds of data, depending on the TXG sync interval and I/O load), and the discarded TXGs cannot be recovered. We image all drives first, then use read-only TXG rollback (zpool import -T) on cloned images rather than executing destructive rewind commands on live hardware. If you have already run -F, recovery from pre-rewind TXGs depends on whether the rewind overwrote the uberblock ring entries we need.
Why does ZFS data recovery cost more than standard single-drive logical recovery?
ZFS recovery is a multi-stage process. A degraded RAIDZ2 pool on 8 SAS drives requires imaging every individual drive through PC-3000 with SAS HBAs, then mathematically reconstructing variable-width RAIDZ stripes across those images before user data can be extracted. The $400-$800 pool reconstruction fee covers vdev analysis, uberblock selection, and dataset extraction. Per-drive imaging fees ($250-$900 each) depend on whether each drive needs firmware repair or clean bench head swaps. A single NTFS drive needs one image and one filesystem scan; a ZFS pool needs N images plus stripe reconstruction.
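As a worked example of the fee structure above (figures from this page; the function and the job mix are purely illustrative):

```python
def zfs_recovery_estimate(per_drive_fees, pool_fee):
    """Total = sum of per-drive imaging fees plus one pool-level fee."""
    return sum(per_drive_fees) + pool_fee

# hypothetical 8-drive RAIDZ2 job: six logical images at $250,
# two head swaps at $1,200, plus a $600 pool reconstruction fee
total = zfs_recovery_estimate([250] * 6 + [1200] * 2, 600)  # → 4500
```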
What does it mean when zpool import hangs instead of returning an error?
If zpool import fails instantly with an I/O error or UNAVAIL status, ZFS read the vdev labels but determined the pool lacks enough drives to meet the RAIDZ parity threshold. If the command hangs indefinitely, the Linux kernel is retrying failed read operations on a drive with degraded read/write heads or severe bad sectors. A hanging import is a physical warning: the drive is still powered on and the heads may be contacting the platter surface with each retry. Power down immediately. Don't wait for the command to time out. Ship the drives to a lab for imaging under controlled conditions.
How do you recover a pool if the ZIL (SLOG) drive fails during a synchronous write?
If a dedicated SLOG drive dies while holding uncommitted intent-log (ZIL) blocks, attempting zpool import often causes a kernel panic or a blocked task error ("task z_wr_iss blocked for more than 122 seconds"). We don't run aggressive filesystem checks on the pool drives. Instead, we set zil_replay_disable=1 in the ZFS module parameters (/sys/module/zfs/parameters/zil_replay_disable) before import. This tells OpenZFS to discard the intent log rather than replaying it, sacrificing the last few seconds of synchronous writes but allowing the rest of the pool to mount safely for extraction.
Can you recover data if a TrueNAS CORE to SCALE migration destroys the pool?
Yes. Migrations from TrueNAS CORE (FreeBSD) to SCALE (Debian Linux) fail when GELI disk-level encryption wrappers aren't decrypted before the OS transition. The OpenZFS Linux module can't unpack FreeBSD GELI-encrypted vdev labels, so the pool appears empty or refuses import entirely. We reconstruct a FreeBSD recovery environment, supply the geli.key from the original CORE boot pool, decrypt the vdev labels, and extract datasets to a staging server. If the boot pool is also lost, recovery depends on whether a backup of the GELI key file exists.

Ready to recover your ZFS pool?

Free evaluation. No data = no charge. Mail-in from anywhere in the U.S.

(512) 212-9111Mon-Fri 10am-6pm CT
No diagnostic fee
No data, no fee
Free return shipping
4.9 stars, 1,837+ reviews