Pool Recovery & RAIDZ Reconstruction
ZFS Data Recovery
We recover faulted ZFS pools by imaging every drive, parsing vdev labels and uberblocks, and importing the pool offline from cloned images. Covers TrueNAS, FreeNAS, Proxmox, Oracle Solaris, and Linux OpenZFS. Free evaluation. No data = no charge.

How ZFS Pools Fail and How We Reconstruct Them
ZFS stores all data and metadata in a Merkle tree rooted at the uberblock. Each pool contains one or more vdevs (mirror, RAIDZ1, RAIDZ2, or RAIDZ3), and each vdev distributes data across its member drives using variable-width stripes. When enough drives in a vdev fail that ZFS cannot reconstruct missing blocks from parity, the pool transitions to FAULTED and refuses import. Recovery requires imaging every drive in the pool, locating valid uberblocks in the 128-entry ring on each drive's ZFS labels, and force-importing the pool read-only from images to extract datasets and zvols.
ZFS checksums every block, using fletcher4 by default and SHA-256 where deduplication requires a collision-resistant hash. This per-block verification catches silent corruption that traditional RAID controllers miss. The trade-off: when ZFS detects a checksum mismatch and cannot correct it from parity, it returns an I/O error instead of serving corrupt data. Scrub errors on a degraded pool signal that more blocks will become inaccessible if another drive fails.
ZFS On-Disk Architecture for Recovery Engineers
Targeted ZFS recovery requires understanding five on-disk structures that standard file-carving tools do not handle. Each structure has a specific role in the pool's self-describing metadata tree, and corruption at any level produces different symptoms.
Vdev Labels (L0, L1, L2, L3)
Every drive in a ZFS pool carries four label copies: L0 and L1 at the first 512 KB, L2 and L3 at the last 512 KB. Each label contains the pool GUID, vdev GUID, the full vdev tree encoded as an nvlist, and the uberblock ring (128 entries, 1 KB each). ZFS writes labels in a specific order to prevent all four from being corrupted by a single interrupted write. During recovery, we read all four labels from every drive image to find the most complete vdev tree and the highest-txg uberblock.
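The label geometry described above can be sketched in a few lines. This is a minimal illustration of where a recovery scan looks for L0-L3 on a drive image; the 256 KiB label size and the align-down behavior for the trailing labels follow the OpenZFS on-disk layout.

```python
# Sketch: compute the byte offsets of the four ZFS vdev labels (L0-L3)
# for a given device size. Each label is 256 KiB; L0/L1 sit at the start
# of the device and L2/L3 at the end, aligned down to a label boundary.

LABEL_SIZE = 256 * 1024  # 256 KiB per label

def vdev_label_offsets(device_size: int) -> list[int]:
    """Return [L0, L1, L2, L3] byte offsets for a device of device_size bytes."""
    # ZFS aligns the trailing labels to a label-size boundary, so odd-sized
    # devices leave a small unused tail after L3.
    end = (device_size // LABEL_SIZE) * LABEL_SIZE
    return [0, LABEL_SIZE, end - 2 * LABEL_SIZE, end - LABEL_SIZE]

# Example: a 1 MiB device puts L2/L3 in the final 512 KiB
offsets = vdev_label_offsets(1024 * 1024)
```

During imaging, these four offsets are read from every drive image so labels can be cross-compared even when one end of a disk is damaged.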
Uberblock Ring and Transaction Groups
The uberblock is the root pointer for the entire pool. ZFS maintains a ring buffer of 128 uberblocks per label, written round-robin as each transaction group (TXG) commits. Each uberblock records the TXG number, a timestamp, and a block pointer to the Meta Object Set (MOS). The active uberblock is the one with the highest TXG that also has a valid checksum. When the latest TXG is corrupted, we walk backward through the ring to find an older, consistent state. The trade-off: rolling back to an earlier TXG means any data written after that TXG is lost. Since TXGs commit every few seconds under normal load, rolling back even ten TXGs typically costs only seconds to minutes of data.
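The selection and rollback logic above reduces to a simple rule over the ring entries. This sketch models each parsed entry as a dict with a `txg` number and a `valid` flag (magic plus checksum verified); the field names are stand-ins for the on-disk uberblock_t, not its real layout.

```python
# Sketch of active-uberblock selection over a parsed 128-entry ring.

def select_active_uberblock(entries):
    """entries: list of dicts with 'txg' (int) and 'valid' (bool, i.e.
    magic == 0x00bab10c and the checksum verified)."""
    candidates = [e for e in entries if e["valid"]]
    if not candidates:
        return None  # no consistent root pointer -- check other pool members
    return max(candidates, key=lambda e: e["txg"])

def rollback_candidates(entries, bad_txg):
    """Older consistent states to try when the newest TXG is corrupted,
    best (highest TXG) first."""
    older = [e for e in entries if e["valid"] and e["txg"] < bad_txg]
    return sorted(older, key=lambda e: e["txg"], reverse=True)

ring = [{"txg": 100, "valid": False},
        {"txg": 99, "valid": True},
        {"txg": 98, "valid": True}]
active = select_active_uberblock(ring)  # skips the corrupted TXG 100
```

The same walk-backward order is what a manual `zpool import -T` attempt follows: try the newest valid TXG first, then step down the ring.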
DVA Pointers and the Block Pointer Tree
Every ZFS block pointer contains up to three Data Virtual Addresses (DVAs). Each DVA encodes the vdev ID, offset within the vdev, and the gang bit (indicating whether the block is a gang block split across multiple sub-blocks). The block pointer also stores the checksum of the target block, the compression algorithm, the logical and physical sizes, and the birth TXG. We use zdb -bbb on the imported pool image to traverse the full block pointer tree. When DVA pointers reference sectors on a failed drive, we reconstruct the missing data from RAIDZ parity (if within tolerance) or flag those blocks as unrecoverable.
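A hedged sketch of the field extraction just described, following the OpenZFS dva_t bit layout (vdev/GRID/ASIZE in word 0; gang bit and offset in word 1). Sizes and offsets are stored in 512-byte sectors, and a 4 MiB region for the front labels and boot block precedes DVA offset zero; the example values are hypothetical.

```python
# Sketch: decode the two 64-bit words of a ZFS DVA into named fields.

SPA_MINBLOCKSHIFT = 9            # DVA units are 512-byte sectors
VDEV_LABEL_START_SIZE = 4 << 20  # L0+L1 labels plus boot block (4 MiB)

def decode_dva(word0: int, word1: int) -> dict:
    return {
        "vdev":  (word0 >> 32) & 0xFFFFFFFF,
        "grid":  (word0 >> 24) & 0xFF,
        "asize": (word0 & 0xFFFFFF) << SPA_MINBLOCKSHIFT,   # allocated bytes
        "gang":  bool(word1 >> 63),                          # gang-block flag
        # byte offset on the vdev, past the front labels and boot block
        "offset": ((word1 & ((1 << 63) - 1)) << SPA_MINBLOCKSHIFT)
                  + VDEV_LABEL_START_SIZE,
    }

# Hypothetical DVA: vdev 0, 128 KiB allocated, data at sector 2048
d = decode_dva(128 * 1024 >> SPA_MINBLOCKSHIFT, 2048)
```

This is the translation step that turns a block pointer found in metadata into a concrete read position on a specific drive image.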
Dnode Objects and the Meta Object Set
The MOS is the top-level object set containing all pool-wide metadata: the dataset directory, the space map, the DDT (if dedup is enabled), and configuration objects. Each dataset within the pool has its own object set, and within that set, every file or directory is represented by a dnode. A dnode is a 512-byte structure that stores the object type, bonus data (such as the ZPL file attributes), and up to three block pointers for the object's data. When MOS corruption prevents normal import, we locate the MOS block pointer directly from the uberblock and traverse the dnode tree manually using zdb -dddd to enumerate datasets.
Space Maps and Free Space Tracking
ZFS tracks allocated and free space using space maps: on-disk logs of allocation and free events for each metaslab. Corrupted space maps do not lose user data, but they prevent ZFS from mounting the pool because it cannot determine which blocks are in use. Recovery involves bypassing the space map check during read-only import and rebuilding the space map from the block pointer tree. This is a metadata-only repair that does not modify user data blocks.
Raw Block Extraction via objset_phys_t
When catastrophic MOS corruption prevents any form of pool import, we fall back to raw block extraction. The objset_phys_t structure is the root of every dataset in ZFS; it contains an array of dnode_phys_t entries, each holding block pointers (blkptr_t) to the actual file data. Using zdb -R pool vdev:offset:psize:d, we read raw compressed data blocks directly from drive images, decompress them with the correct algorithm (LZ4, LZJB, or ZSTD), and reconstruct individual files by walking the dnode tree manually. This bypasses the entire ZFS import pipeline. It's the last-resort method for severely corrupted pools on custom-built NAS enclosures where no valid uberblock survives on any drive.
RAIDZ Parity Distribution and Rebuild Constraints
RAIDZ differs from traditional RAID 5/6 in a fundamental way: it uses variable-width stripes. Each logical block is distributed across a stripe whose width depends on the block's size and the number of drives in the vdev. This eliminates the RAID write hole without requiring a battery-backed cache, but it means recovery tools designed for fixed-stripe RAID arrays cannot parse RAIDZ data.
| RAIDZ Level | Parity Drives | Fault Tolerance | Recovery When Exceeded |
|---|---|---|---|
| RAIDZ1 | 1 per stripe | 1 drive per vdev | Blocks on 2+ failed drives are unrecoverable from parity alone. Partial recovery based on which blocks landed on which drives. |
| RAIDZ2 | 2 per stripe | 2 drives per vdev | Blocks spanning 3+ failed drives lost. RAIDZ2 is the most common production configuration and offers a better recovery margin than RAIDZ1. |
| RAIDZ3 | 3 per stripe | 3 drives per vdev | Rarely exceeds tolerance in practice. Typically seen in large vdevs (8+ drives) where the probability of three simultaneous failures is non-trivial during resilver. |
For traditional hardware RAID arrays (Dell PERC, HP SmartArray, LSI MegaRAID), see our RAID data recovery service. RAIDZ is software RAID managed by ZFS and uses a different on-disk layout than controller-based arrays.
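The parity arithmetic behind the table above is worth seeing concretely. RAIDZ1's P parity is the XOR of the data columns in a stripe, so losing one column is solvable and losing two is not. This is a minimal illustration of the arithmetic only; real RAIDZ stripes are variable-width and laid out per-block, not in fixed columns.

```python
# Minimal single-parity (XOR) reconstruction, the principle RAIDZ1 uses.

def xor_parity(columns: list[bytes]) -> bytes:
    """XOR equal-length byte columns together."""
    out = bytearray(len(columns[0]))
    for col in columns:
        for i, b in enumerate(col):
            out[i] ^= b
    return bytes(out)

# Stripe of three data columns plus computed parity
d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
p = xor_parity([d0, d1, d2])

# One data column lost: XOR of parity with the survivors restores it
recovered_d1 = xor_parity([d0, d2, p])
assert recovered_d1 == d1

# Two columns lost exceeds RAIDZ1's tolerance: one XOR equation cannot
# determine two unknowns, which is why those blocks are unrecoverable
# from parity alone.
```

RAIDZ2 and RAIDZ3 add Reed-Solomon-style Q and R syndromes on top of P, which is what buys the extra one or two drives of tolerance per vdev.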
Hardware-Assisted RAIDZ Reconstruction with PC-3000
Software-only tools (ReclaiMe, UFS Explorer, DiskInternals) parse ZFS pools by reading vdev labels through the operating system's block device layer. When a drive has hardware-level read timeouts, firmware lockouts, or degraded heads, the OS cannot deliver clean sectors and the software tool stalls or returns garbage data.
We bypass the OS entirely using PC-3000 Data Extractor RAID Edition, which images each drive through a direct hardware interface (SAS or SATA) with configurable read timeout, head map, and sector retry parameters. Once all drives are imaged, the RAID Edition's autodetection module identifies the RAIDZ level (1, 2, or 3), block size, and stripe shift from the raw sector data. The "RAID Member Statistics" analytical method calculates the variable-width stripe distribution by comparing block entropy patterns across drive images, which is necessary because RAIDZ stripes vary in width per-block rather than using the fixed stripe width that traditional hardware RAID controllers impose. This hardware-assisted block shift identification is the difference between a successful and a failed TrueNAS data recovery when the software stack can't read the drives.
ZFS Pool Failure Scenarios We Recover
Faulted Pool After Drive Failures
The most common scenario. RAIDZ1 pools fault when two drives in the same vdev fail. RAIDZ2 pools fault on three failures. We image all drives (including failed ones) and reconstruct what parity can provide. Board-level repair on electrically failed drives can restore one member and bring the pool back within tolerance.
Failed Resilver
Resilvering reads every allocated block from the surviving members to reconstruct the replacement drive. If a surviving drive develops bad sectors during the resilver, ZFS cannot complete the rebuild. A mid-resilver failure is dangerous because the replacement drive holds only a partial reconstruction: some regions are rebuilt, others are not. We handle this by imaging all drives and reconciling both states offline.
Corrupted Uberblock or MOS
Power loss during a TXG commit can corrupt the active uberblock or the MOS it points to. ZFS normally recovers by falling back to a previous TXG, but if multiple TXGs are affected (e.g., UPS failure during a long scrub), manual uberblock selection is needed. We use zpool import -T to target a specific TXG, or parse the uberblock ring manually with zdb -lu to find the last consistent state.
ZIL / SLOG Device Failure
The ZFS Intent Log (ZIL) records synchronous write transactions. If a dedicated SLOG device (typically a fast NVMe SSD) fails, any uncommitted synchronous writes are lost. For pools where the SLOG failed during active database writes or NFS/iSCSI operations, we image both the SLOG device and the pool drives. If the SLOG contains recoverable log records, we replay them into the pool. If the SLOG is physically dead, the pool imports without those pending writes.
NVMe SLOG Firmware Lockout
TrueNAS and Proxmox administrators frequently assign Phison PS5012-E12 or PS5016-E16 based NVMe drives as dedicated SLOG devices for synchronous write caching. A sudden power loss can corrupt the Flash Translation Layer on these controllers, locking the drive to a 0-byte capacity state where the BIOS no longer detects it. The uncommitted ZIL transactions on that SLOG are lost to the operating system. We use PC-3000 Portable III to issue NVMe Vendor Specific Commands that force the controller into diagnostic mode, reconstruct the corrupted FTL mapping tables, and image the drive contents through the controller to recover uncommitted log records for replay into the pool. The same Phison controller failure modes affect consumer SSD data recovery outside the ZFS context.
Silicon Motion SM2262EN / SM2259XT Caching Drive Failures
ZFS administrators frequently assign consumer NVMe drives with Silicon Motion SM2262EN controllers (ADATA SX8200 Pro, HP EX950) or SATA drives with DRAM-less SM2259XT controllers as L2ARC or SLOG devices. A power loss can corrupt the Flash Translation Layer on these controllers, locking the drive into a 0-byte or diagnostic 2MB capacity state where the BIOS no longer detects usable storage. Standard ZFS tooling cannot access the drive in this state. We use the PC-3000 SSD utility to issue vendor-specific commands that force the controller into diagnostic mode, rebuild the corrupted FTL mapping tables from the NAND spare area, and image the drive contents to recover uncommitted ZIL transactions for replay into the pool. The same SM2259XT controller failures affect consumer SSDs outside the ZFS context.
ZIL Replay and Log-Write Block (LWB) Extraction
Synchronous writes in ZFS are stored as intent transactions (itx records) packaged into Log-Write Blocks (LWBs) within the ZFS Intent Log. When a dedicated SLOG device fails mid-commit, uncommitted LWBs are stranded on the dead NVMe drive. Unlike hardware RAID arrays with battery-backed caches, the ZIL relies entirely on the SLOG device committing itx records safely to persistent storage. We use PC-3000 Portable III to reconstruct the corrupted FTL on the failed SLOG (typically Phison E12 or SM2262EN controllers), image the raw NVMe namespace, and extract the uncommitted LWBs. The isolated itx records are then replayed into the pool's transaction groups to recover pending database writes and NFS/iSCSI operations that would otherwise be lost.
Accidental Pool Destruction
Running zpool destroy wipes labels from each drive. Running zpool labelclear does the same to individual drives. If no new data has been written afterward, the uberblocks and block pointer tree remain on disk at their original offsets. We scan for the characteristic uberblock magic number (0x00bab10c) and pool GUID to locate and reconstruct from them.
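The magic-number scan described above can be sketched directly. ZFS writes the uberblock magic in native byte order, so a recovery scan checks both endiannesses at aligned positions; this sketch runs over an in-memory buffer standing in for a drive image.

```python
# Sketch: scan a raw image for the uberblock magic 0x00bab10c.

import struct

UB_MAGIC = 0x00BAB10C

def scan_for_uberblocks(image: bytes, step: int = 1024):
    """Yield (offset, endianness) for every candidate uberblock slot.
    step=1024 matches the minimum uberblock slot size in the ring."""
    le = struct.pack("<Q", UB_MAGIC)
    be = struct.pack(">Q", UB_MAGIC)
    for off in range(0, len(image) - 8 + 1, step):
        window = image[off:off + 8]
        if window == le:
            yield off, "little"
        elif window == be:
            yield off, "big"

# Synthetic 8 KiB image with one little-endian slot at offset 2048
img = bytearray(8192)
img[2048:2056] = struct.pack("<Q", UB_MAGIC)
hits = list(scan_for_uberblocks(bytes(img)))
```

On a real image the hits cluster at the four label offsets; hits elsewhere usually indicate relocated or stale labels from a previous pool, which the pool GUID check filters out.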
Dedup Table Corruption
ZFS deduplication stores a Dedup Table (DDT) that maps block checksums to their physical locations. The DDT consumes RAM (about 320 bytes per entry) and is backed by on-disk log-structured storage. When the DDT becomes corrupted (common when pools run low on memory under heavy dedup loads), files referencing deduplicated blocks cannot be resolved. We rebuild the DDT by scanning the full block pointer tree and reconstructing the checksum-to-DVA mappings.
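The rebuild is possible because the DDT is derivable data: every live block pointer already carries its checksum and DVAs, so a full BP-tree traversal recreates the mapping. A simplified sketch, with checksums and DVAs as plain tuples rather than their on-disk forms:

```python
# Sketch: recreate the checksum -> locations mapping a corrupted DDT held,
# from block pointers gathered by a full BP-tree walk (what zdb -bbb
# enumerates).

from collections import defaultdict

def rebuild_ddt(block_pointers):
    """block_pointers: iterable of (checksum, dva) tuples."""
    ddt = defaultdict(list)
    for checksum, dva in block_pointers:
        # Duplicate references to the same physical block collapse
        # into a single DDT entry, exactly as dedup intended.
        if dva not in ddt[checksum]:
            ddt[checksum].append(dva)
    return dict(ddt)

bps = [("sha256:aa", (0, 4096)),   # two files referencing one block
       ("sha256:aa", (0, 4096)),
       ("sha256:bb", (1, 8192))]
ddt = rebuild_ddt(bps)
```

The real rebuild is the same idea at scale: the expensive part is the full-pool traversal, not the table construction.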
dRAID Vdev Failure (OpenZFS 2.1+)
Distributed RAID (dRAID), introduced in OpenZFS 2.1 and available in TrueNAS SCALE 23.10+, replaces traditional RAIDZ hot spares with distributed spare space across all pool members. dRAID uses precomputed permutation maps to shuffle parity across drives, enabling faster sequential resilvering. Consumer recovery software (Klennet, UFS Explorer, ReclaiMe) cannot parse dRAID because the permutation map defines a non-standard block routing that doesn't match fixed-width RAIDZ striping. We use PC-3000 Data Extractor RAID Edition to manually define the tabular matrix configuration matching the dRAID permutation, then reconstruct the distributed spare blocks when a sequential resilver fails due to cascaded hardware degradation across same-batch drives.
Persistent L2ARC Header Corruption
OpenZFS 2.0+ added persistent L2ARC, allowing cache tables to survive reboots by writing header metadata to the cache device. If the NVMe cache drive suffers a controller lockout or partial firmware corruption, the corrupted L2ARC header triggers a kernel panic or infinite hang during zpool import. The pool data is intact on the main vdevs; only the cache metadata is damaged. We bypass the panic by setting l2arc_rebuild_enabled=0 in the ZFS module parameters before import, which instructs OpenZFS to skip the persistent cache rebuild and mount the pool read-only for extraction.
ZFS Pool Recovery When Software Fails
When all drives are physically healthy but the pool refuses to import, the failure is in ZFS metadata, not hardware. These software-level failures require forensic analysis of the on-disk structures rather than clean bench work.
Corrupted Uberblock Ring
Each drive in the pool stores 128 uberblocks in a ring buffer across its four ZFS labels. If a power loss corrupts the active uberblock and the next several entries in the ring, ZFS cannot locate a valid root pointer to the Meta Object Set. We parse all 512 uberblock slots (128 per label × 4 labels) across every drive image using zdb -lu to find the highest transaction group with a valid checksum. If no valid uberblock exists on the primary drive, we cross-reference uberblocks from other pool members where the ring may be intact.
Destroyed Spacemap
Spacemaps log allocation and free events for each metaslab. Corruption typically occurs when a pool runs at 95%+ capacity and a write fails mid-commit. The pool refuses to mount because ZFS cannot verify which blocks are allocated. User data remains on disk at its original offsets. We import the pool read-only with spacemap validation bypassed and rebuild the allocation table from the block pointer tree, which is an independent metadata structure that records every live block's location.
Vdev GUID Mismatch
Every vdev and pool has a unique GUID recorded in the ZFS labels. If a drive is replaced and the resilver aborts partway through, the new drive carries a different GUID than the pool expects. OpenZFS refuses the import because the vdev tree no longer matches the topology stored in the uberblock. This is common on TrueNAS SCALE systems where a failed resilver leaves the replacement drive with an incomplete label. ZFS identifies drives by on-disk GUIDs, not by OS device names, so the mismatch is a metadata conflict. We reconstruct the correct topology from historical labels and reimport with the original GUIDs. Unlike hardware RAID recovery, where the controller stores configuration on a dedicated ROM chip, ZFS distributes its configuration across all member drives, so the topology can always be rebuilt from the surviving labels.
OpenZFS 2.2 Block Cloning Corruption
OpenZFS 2.2 introduced block cloning via the copy_file_range(2) system call (used by coreutils 9.x and cp --reflink). A concurrency bug in the DMU offset reporting causes chunks of cloned files to be silently replaced by zeroes. Standard zpool scrub won't catch it because the checksum matches the zero-filled block that ZFS committed. This affects TrueNAS SCALE, Proxmox, and any Linux system running OpenZFS 2.2.0 or 2.2.1 (fixed in 2.2.2). During forensic extraction of affected pools, we set zfs_dmu_offset_next_sync=0 to bypass the timing window and extract uncorrupted blocks from historical transaction groups where the original data predates the cloning operation. Recovery from this bug requires comparing block birth TXGs against the cloning timestamp to identify which files contain zeroed chunks.
Platform-Specific ZFS Recovery Notes
ZFS runs on four major platforms. The on-disk format is cross-platform compatible (a pool created on Solaris can be imported on Linux), but encryption layers, feature flags, and bootloader integration differ across platforms.
TrueNAS / FreeNAS
TrueNAS CORE (FreeBSD) uses GELI disk-level encryption. TrueNAS SCALE (Debian Linux) uses ZFS native encryption at the dataset level. Both require key material for encrypted pools. See our dedicated TrueNAS recovery page for GELI-specific workflows. GELI keys are stored on the boot pool; if the boot drive is lost and no backup exists, the encrypted pool is unrecoverable.
QNAP QuTS hero
Newer enterprise QNAP NAS devices (TVS-x72XT, TVS-hx74, TS-x73A series) run the QuTS hero operating system, which replaces QTS's traditional ext4/mdadm stack with a full ZFS implementation. QNAP adds proprietary volume management wrappers and SSD caching layers on top of ZFS. We image the individual drives, bypass the QNAP hardware interface, and parse the QuTS hero ZFS pool offline using standard OpenZFS tooling. For other QNAP models running standard QTS with mdadm, see our NAS data recovery service.
Proxmox VE
Proxmox uses OpenZFS on Linux for VM storage. Pools typically store qcow2 disk images or zvols used as raw block devices by KVM/QEMU. Recovery involves importing the pool from images and extracting the guest VM disk files, then mounting the guest filesystem (NTFS, ext4, XFS) to verify the VM data. See Proxmox recovery for Ceph-related failures.
Oracle Solaris
Solaris is the original ZFS platform and may use older pool versions (pre-feature flags, pool version 28 or earlier). Older Solaris pools lack features like LZ4 compression and large dnode support. Recovery is straightforward if the pool version is identified correctly; we import on a matching OpenZFS version, or use Solaris-native tools when the pool uses Oracle-proprietary versions (29 and later) that OpenZFS cannot read.
Linux OpenZFS
Ubuntu, Debian, Fedora, and Arch all support OpenZFS through the ZFS on Linux (ZoL) kernel module. Common in custom-built NAS servers and Proxmox hosts. Linux OpenZFS supports ZFS native encryption (dataset-level AES-256-GCM). Pool recovery is identical to other platforms once drives are imaged; the key difference is that Linux systems sometimes mix ZFS and mdadm (e.g., mdadm mirror for boot, ZFS pool for data), which requires handling both metadata formats.
Recovering VMware VMFS Datastores from ZFS iSCSI Zvols
TrueNAS Enterprise and custom ZFS servers frequently export zvols as iSCSI targets consumed by VMware ESXi hosts as VMFS datastores. ZFS treats each zvol as a raw block device and has no awareness of the VMFS structures or .vmdk files inside it.
When the underlying ZFS pool faults, the ESXi host loses access to the datastore and all VMs on it go offline. Recovery requires reconstructing the ZFS pool from drive images, importing it read-only, and extracting the raw zvol as a binary image. We then parse the VMFS volume header from the zvol image to locate the file descriptor table and extract individual .vmdk flat extents without requiring the original ESXi hypervisor. Each extracted VM disk is mounted independently to verify guest filesystem integrity (NTFS, ext4, XFS). For ESXi-specific failure modes outside the ZFS layer, see our VMware ESXi recovery service.
On-Disk Format Differences Across ZFS Implementations
Pool version 28 is the last universally interoperable format across all ZFS implementations. Modern pools use feature flags instead of version numbers, and these flags diverge between OpenZFS on Linux, FreeBSD, and legacy Solaris.
| Feature Flag | ZFS-on-Linux | FreeBSD ZFS | Solaris (pre-OpenZFS) | Recovery Impact |
|---|---|---|---|---|
| large_dnode | Supported (ZoL 0.7+) | Supported (FreeBSD 12+) | Not supported | Pools with large_dnode cannot be imported on Solaris. Recovery environment must run OpenZFS 0.7+. |
| spacemap_v2 | Supported (ZoL 0.8+) | Supported (FreeBSD 13+) | Not supported | Older ZoL versions (0.7.x) cannot read spacemap_v2 pools. Importing on a mismatched version produces a ZFS-8000-A5 error. |
| allocation_classes | Supported (ZoL 0.8+) | Supported (FreeBSD 13+) | Not supported | Pools using special allocation classes (metadata vdevs) require a recovery environment that supports this feature. |
| Native encryption | Dataset-level AES-256-GCM | GELI (disk-level) or native | Oracle proprietary | Encryption type determines key handling. GELI keys live on the boot pool; native encryption keys are per-dataset. Losing the key means the data is unrecoverable regardless of pool health. |
During recovery, we match the import environment's OpenZFS version to the pool's feature flags. Attempting to import a pool on an older OpenZFS version that lacks a required feature flag produces a ZFS-8000-A5 error and refuses the import entirely. We maintain recovery workstations running multiple OpenZFS versions to handle pools from NAS enclosures and Proxmox hosts running different kernel versions.
LSI HBA Firmware Crashes and ZFS Label Destruction
ZFS requires SAS/SATA Host Bus Adapters flashed to IT (Initiator Target) mode for direct disk access. Broadcom/LSI controllers (9211-8i, 9300-8i, 9400-8i) are the standard in enterprise ZFS deployments. If the HBA firmware reverts to IR (Integrated RAID) mode during a power event, the controller writes DDF RAID metadata at the end of each attached drive, destroying ZFS labels L2 and L3.
Administrators can verify their firmware mode by running sas2flash -list (for SAS2 controllers) or sas3flash -list (for SAS3). If the output shows IR firmware where IT was expected, the labels at the end of each drive have been overwritten by DDF metadata. We recover these pools by calculating the exact byte offset of the IR-mode metadata overlay and parsing the surviving L0 and L1 labels at the beginning of each drive to reconstruct the vdev topology. Because ZFS stores redundant label copies at both ends of every disk, a single-end overwrite is recoverable if the drives are not subsequently reformatted. For pools where the HBA also managed a hardware RAID array, the recovery becomes a dual-layer operation: reconstruct the hardware RAID geometry first, then parse ZFS structures from the reconstructed logical volume.
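The survivability check above comes down to geometry: DDF anchors its metadata in the last sectors of the drive, which is exactly where L2 and L3 live, while L0 and L1 at the front are untouched. A sketch, where the DDF region size is an illustrative assumption (real anchor sizes vary by controller):

```python
# Sketch: which ZFS labels survive a tail-of-disk DDF metadata overwrite.

LABEL_SIZE = 256 * 1024  # 256 KiB per label

def surviving_labels(device_size: int, ddf_region: int) -> list[str]:
    end = (device_size // LABEL_SIZE) * LABEL_SIZE
    labels = {"L0": 0, "L1": LABEL_SIZE,
              "L2": end - 2 * LABEL_SIZE, "L3": end - LABEL_SIZE}
    ddf_start = device_size - ddf_region
    # A label survives only if it ends before the overwritten tail begins
    return [name for name, off in labels.items()
            if off + LABEL_SIZE <= ddf_start]

# Toy-sized 10 MiB device with a hypothetical 1 MiB DDF anchor region:
# the tail overwrite claims L2 and L3, leaving L0 and L1 intact
survivors = surviving_labels(10 * 1024 * 1024, 1 << 20)
```

Because each surviving label carries the full vdev tree and its own uberblock ring, two intact front labels per drive are enough to reconstruct topology.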
Emergency Steps for a Failed ZFS Pool Import
If your ZFS pool will not import, follow these steps before running any destructive commands. Each step is diagnostic only and does not write to the pool.
- Check kernel I/O errors. Run dmesg | grep -i "error\|fault\|reset" to identify which drives reported hardware-level errors before ZFS faulted the pool. SCSI sense codes or ATA timeout messages point to the specific failing drive.
- Attempt a standard import without the -f flag. Run zpool import (no arguments) to list all visible pools and their state. If the pool appears as UNAVAIL or DEGRADED, note which vdevs are missing. Do not use -f at this stage.
- Verify ZFS labels on each drive. Run zdb -l /dev/sdX on each pool member to confirm label presence and read the vdev GUID, pool GUID, and highest transaction group number. Drives missing all four labels were either wiped or belong to a different pool.
- Power down if mechanical failure is suspected. Clicking, grinding, or repeated spin-up/spin-down cycles indicate physical drive failure requiring clean bench work. Continued operation risks platter scoring. Do not attempt to offline/online the drive or force a resilver. Ship the drives to a no-fix-no-fee recovery lab for imaging under controlled conditions.
Commands That Destroy ZFS Recovery Options
The following commands, commonly suggested in forum posts, will reduce or eliminate recovery chances if run on a failing pool. If a pool is degraded or faulted, executing these commands permanently overwrites the historical transaction groups required for offline forensic reconstruction. Power down the system instead.
zpool import -f on a faulted pool
Forces import of a pool that ZFS has refused. This writes new TXGs to the pool, overwriting the metadata ZFS needs for self-consistency checks. If the pool is faulted due to drive failures, the forced import will record that the failed drives are absent, and any subsequent export-reimport cycle will reference the damaged state rather than the pre-failure state.
How Forced Imports Destroy the Uberblock Ring
ZFS uses a copy-on-write model: every zpool import -f allocates new objset_phys_t metadata trees and commits a new Transaction Group (TXG) to every surviving drive. The 128-entry uberblock ring is a circular buffer; each new TXG overwrites the oldest entry. A forced import on a degraded pool permanently records the missing drives as absent in the new TXG baseline. Once that TXG is written, rolling back to the pre-failure topology becomes impossible because the uberblock entry that referenced the original vdev tree has been overwritten. If the forced import triggers additional write activity (scrub commands, dataset mounts, ZIL replays), multiple uberblock entries are consumed in rapid succession, shrinking the recovery window from 128 TXGs to as few as a handful. This is why we image every drive before attempting any import variant.
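The shrinking-window effect is easy to demonstrate with a toy model of the ring. This simulation (a simplification: real slot selection also depends on ashift and label layout) shows how a burst of forced-import TXGs permanently evicts the oldest recoverable states:

```python
# Simulation: the uberblock ring as a fixed circular buffer where every
# committed TXG evicts the oldest surviving entry.

class UberblockRing:
    def __init__(self, slots: int = 128):
        self.entries = [None] * slots

    def commit(self, txg: int):
        # Round-robin slot selection per committed transaction group
        self.entries[txg % len(self.entries)] = txg

    def oldest_recoverable(self):
        live = [t for t in self.entries if t is not None]
        return min(live) if live else None

ring = UberblockRing()
for txg in range(1, 200):       # pool history before the failure
    ring.commit(txg)
pre_failure_floor = ring.oldest_recoverable()   # only last 128 TXGs survive

for txg in range(200, 210):     # 10 TXGs from a forced import + mounts
    ring.commit(txg)
post_import_floor = ring.oldest_recoverable()   # ten more states destroyed
```

Every TXG the forced import commits raises the rollback floor by one; a scrub or ZIL replay on top of the import consumes entries even faster.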
zpool clear followed by resilver on a degraded pool
Clearing errors and resilvering writes parity data across all surviving drives. If any surviving drive has developing bad sectors (common with same-batch drives of the same age), the resilver can trigger that drive to fail, pushing the pool past its parity tolerance. We see this cascade failure regularly.
zfs_max_missing_tvds tunable
This kernel tunable allows ZFS to import a pool with missing top-level vdevs. Setting it to a non-zero value and importing writes new TXGs that permanently record the missing vdevs as absent. If you then add the missing vdevs back, ZFS treats them as foreign devices and will not reattach them. The original pool topology is overwritten. This tunable is a last-resort forensic tool, not a recovery shortcut.
Our ZFS Recovery Methodology
1. Drive Imaging with PC-3000
Every drive in the pool is imaged through PC-3000 with write-blocking. SAS drives (common in TrueNAS Enterprise and Solaris servers) are imaged via SAS HBAs in IT mode. For drives with bad sectors, we capture healthy regions first using sector maps, then retry damaged areas with aggressive read parameters. Drives with mechanical failures (clicking, motor seizure) receive clean bench work before imaging: head swaps, motor transplants, or platter stabilization, all performed under 0.02 µm ULPA filtration.
2. Vdev Label Analysis
We read all four ZFS labels from each drive image using zdb -l. The labels contain the pool name, pool GUID, vdev GUID, vdev tree (encoded as an nvlist), and the 128-entry uberblock ring. By comparing labels across all drive images, we reconstruct the complete vdev topology even when some drives have corrupted labels. The vdev tree tells us which drives belong to which vdev, whether each vdev is a mirror or RAIDZ, and the ashift (sector size alignment, typically 9 for 512-byte or 12 for 4K-native drives).
3. Uberblock Selection and TXG Rollback
The uberblock ring on each drive contains the last 128 transaction groups. We examine each uberblock using zdb -lu to find the highest TXG with a valid checksum. If the latest TXG is corrupted, we roll back to an earlier state using zpool import -T [txg]. The data loss from TXG rollback is limited to writes that occurred between the target TXG and the failed TXG. For most failures triggered by drive loss rather than active corruption, the rollback window is seconds.
4. Offline Pool Import and Dataset Extraction
The pool is imported read-only from the drive images using loopback devices on a dedicated recovery workstation. We verify the pool status, check for data errors using zpool status -v, and extract individual datasets, zvols, and snapshots. For zvols used as VM storage (Proxmox, bhyve), we mount the guest filesystem to verify the VM data is intact. Snapshots are preserved; if the live dataset has corruption but a recent snapshot is clean, we recover from the snapshot.
ZFS Recovery Pricing
Same transparent model as our RAID recovery pricing: per-drive imaging based on each drive's condition, plus a $400-$800 pool reconstruction fee covering vdev analysis, pool import, and dataset extraction. No data recovered = no charge.
| Service Tier | Price Range (Per Drive) | Description |
|---|---|---|
| Logical / Firmware Imaging | $250-$900 | Firmware module damage, SMART threshold failures, or filesystem corruption on individual pool members. |
| Mechanical (Head Swap / Motor) | $1,200-$1,500 (50% deposit) | Donor parts consumed during transplant. SAS drives (common in enterprise ZFS servers) require SAS-specific donors. |
| ZFS Pool Reconstruction | $400-$800 per pool | Vdev reconstruction, uberblock analysis, pool import, and dataset/zvol extraction. Includes ZFS native decryption or GELI decryption if key material is provided. |
No Data = No Charge: If we recover nothing from your ZFS pool, you owe $0. Free evaluation, no obligation.
Before sending drives: export your encryption key (GELI recovery key for TrueNAS CORE, ZFS encryption passphrase for TrueNAS SCALE or Linux). Note the pool name and vdev layout from zpool status if the system still boots.
Lab Location and Mail-In
All ZFS recovery work is performed in-house at our Austin lab: 2410 San Antonio Street, Austin, TX 78705. Walk-in evaluations are available Monday - Friday, 10 AM - 6 PM CT. For clients outside Austin, we accept mail-in shipments from all 50 states. Ship drives in anti-static bags with foam padding. Label each drive with its slot number from the original system if possible.
Data Recovery Standards & Verification
Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to make sure your hard drive is handled safely and properly. This approach allows us to serve clients nationwide with consistent technical standards.
Open-drive work is performed in a ULPA-filtered laminar-flow bench, validated to 0.02 µm particle count, verified using TSI P-Trak instrumentation.
Transparent History
Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.
Media Coverage
Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.
Aligned Incentives
Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.
Technical Oversight
Louis Rossmann
Louis Rossmann's well-trained staff review our lab protocols to ensure technical accuracy and honest service. Since 2008, his focus has been on clear technical communication and accurate diagnostics rather than sales-driven explanations.
We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.
See our clean bench validation data and particle test video.
ZFS Recovery: Common Questions
My ZFS pool shows FAULTED and zpool import fails. Can you recover the data?
I ran zpool import -f and it made things worse. Is recovery still possible?
Can you recover a ZFS pool after zpool destroy or zpool labelclear?
Does it matter if I use RAIDZ1, RAIDZ2, or RAIDZ3?
My ZFS pool uses deduplication and the DDT is corrupted. Can you recover files?
How is ZFS recovery priced?
Why does TrueNAS SCALE refuse to import my pool after a drive replacement?
Can you recover a TrueNAS ZFS pool built on top of a Dell PERC hardware RAID?
Can data be recovered if the ZFS spacemap is corrupted?
Is it safe to run zpool import -F on a degraded or unmountable pool?
Why does ZFS data recovery cost more than standard single-drive logical recovery?
What does it mean when zpool import hangs instead of returning an error?
How do you recover a pool if the ZIL (SLOG) drive fails during a synchronous write?
Can you recover data if a TrueNAS CORE to SCALE migration destroys the pool?
Ready to recover your ZFS pool?
Free evaluation. No data = no charge. Mail-in from anywhere in the U.S.