
ZFS Pool Recovery and Troubleshooting Guide

Your ZFS pool is reporting DEGRADED, FAULTED, or UNAVAIL in zpool status. The pool may have refused to import, or it imported but shows data errors. Before running zpool clear or zpool replace, you need to understand which operations are safe and which will overwrite the on-disk state you need for recovery.

This guide covers ZFS pool states, safe export/import procedures, transaction group rollbacks, and raidz fault tolerance.

Understanding ZFS Pool States

ZFS tracks pool health at the vdev level. Each vdev reports a health state; the four you will encounter most often are ONLINE, DEGRADED, FAULTED, and UNAVAIL. The pool state is the worst state among its top-level vdevs.

ONLINE

All vdevs are healthy. No errors detected. Normal operation. No action needed.

DEGRADED

One or more vdevs have lost a member but can still serve data using redundancy (raidz parity or mirror copies). The pool is operational but running without full fault tolerance. Safe to read from; assess before writing.

FAULTED

A vdev has lost too many members to maintain data integrity. For raidz1, this means 2+ drives are down. For raidz2, 3+ drives. The pool cannot serve I/O. Do not write to the remaining drives.

UNAVAIL

ZFS cannot open the device at all. The drive may be disconnected, failed, or its device path may have changed (common after controller or cabling changes). Check physical connections and device paths before assuming hardware failure.

Example: A TrueNAS server with a 6-drive raidz2 pool. Drive 4 fails (UNAVAIL). The vdev transitions to DEGRADED. The pool continues serving data. Drive 2 then reports checksum errors and ZFS marks it FAULTED. The raidz2 vdev has now lost two drives, the maximum it can tolerate. One more failure and the pool transitions to FAULTED. The admin has a narrow window to image the remaining drives.
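To see the per-vdev states and the read/write/checksum error counters that drive these transitions, query the pool status (the pool name tank is illustrative):

  # show pool state, per-vdev states, and error counters
  zpool status -v tank

  # list only pools that are not healthy
  zpool status -x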

Safe Export and Import Procedures

Exporting a ZFS pool flushes pending writes and marks the pool as cleanly closed. Importing reads the on-disk metadata and reconstructs the in-memory state. Both operations are safe when done correctly, but import with the wrong flags can overwrite recoverable metadata.

  1. zpool export poolname flushes all pending transaction groups to disk and marks the pool as exported. This is the cleanest way to take a pool offline. Only works if the pool is in ONLINE or DEGRADED state.
  2. zpool import -o readonly=on poolname imports the pool in read-only mode. ZFS will not write any metadata updates to the drives. This is the safest way to access data on a pool you suspect is damaged.
  3. zpool import -f poolname force-imports a pool that was not cleanly exported. ZFS replays any pending transaction groups. This writes to the drives and may advance the on-disk state past a recoverable point.
  4. If the pool will not import at all, do not use -f repeatedly. Image the drives and work from copies.

Example: A QNAP NAS data recovery case where the unit lost power during a scrub. The admin connects the drives to a Linux workstation and runs zpool import. The pool shows as available but not exported. Running zpool import -o readonly=on tank mounts the pool without writing anything. The admin copies data to a new destination before deciding whether to repair or rebuild the pool.
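A minimal version of that sequence, assuming the pool is named tank and the drives are already attached to the recovery workstation:

  # list pools that are available for import, without importing anything
  zpool import

  # import read-only so no metadata is written to the suspect drives
  zpool import -o readonly=on tank

  # copy the data off, then detach cleanly
  zpool export tank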

The Risks of zpool clear

zpool clear resets error counters on a vdev and tells ZFS to retry I/O. If the errors were transient (a loose cable), this brings the vdev back online. If the drive is failing, clearing errors masks the problem and allows further corruption.

  1. ZFS tracks read errors, write errors, and checksum errors per device. When error counts exceed internal thresholds, ZFS marks the vdev as FAULTED.
  2. zpool clear poolname resets these counters to zero and retries failed I/O. If the drive responds, ZFS marks it ONLINE again.
  3. If the underlying drive has bad sectors or a failing head, the errors will return. In the meantime, ZFS will write new data to the faulty drive, and that data may be lost when the errors recur.
  4. The next scrub will detect the corruption, but by then the pool may have advanced past the last consistent transaction group.

Rule: Only run zpool clear if you have identified and fixed the root cause (reseated a cable, replaced a controller, resolved a power issue). If the drive itself is failing (check SMART), replace it instead of clearing errors.

Example: A homelab server running TrueNAS with a 4-drive raidz1 pool. Drive 3 shows 47 checksum errors after a thunderstorm. The admin runs zpool clear. The errors disappear. Two weeks later, a scrub finds 200+ checksum errors on the same drive. The errors were not transient; the drive has failing sectors from a power surge. Data written to those sectors in the interim now fails checksum verification, and raidz1 can repair it only as long as every other drive holds clean data for those blocks; a second drive developing errors makes the loss permanent.
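A sketch of the safer order of operations, checking drive health before touching the error counters (the device path /dev/sdc and pool name tank are illustrative):

  # inspect SMART attributes: reallocated sectors, pending sectors, UNC errors
  smartctl -a /dev/sdc

  # only if the root cause was external (cable, controller, power) and SMART is clean
  zpool clear tank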

Transaction Group Rollbacks

ZFS uses copy-on-write for all data and metadata. Every write goes into a new location on disk, and the old data remains until the space is reclaimed. Transaction groups (TXGs) are batched commits that advance the pool to a new consistent state. If the latest TXG is corrupted, you can roll back to a previous one.

  1. ZFS flushes a new TXG to disk every 5 seconds (default) or when the write buffer fills.
  2. The uberblock at the top of the pool metadata tree points to the most recent valid TXG. ZFS stores a ring buffer of historical uberblocks, providing a history of recent TXGs.
  3. zpool import -T <txg> -o readonly=on poolname imports the pool using a specific historical TXG instead of the most recent. This effectively rolls back the pool to an earlier consistent state.
  4. TXG rollback only works if the on-disk blocks for the older TXG have not been overwritten by subsequent writes. Copy-on-write preserves old blocks until the space is needed, so recently written pools with available free space have better rollback success rates.
  5. Use zdb -u poolname to list available uberblocks and their TXG numbers before attempting a rollback.

Example: A FreeBSD server loses power during heavy writes. On reboot, zpool import fails with "one or more devices has experienced an unrecoverable error." The admin runs zdb -u tank and sees that the latest TXG (482917) is corrupt but TXG 482910 has a valid uberblock. Running zpool import -T 482910 -o readonly=on tank imports the pool at the state from 35 seconds before the power failure. The 7 missing TXGs represent approximately 35 seconds of writes; everything before that is intact.
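The command sequence from that case, generalized (the pool name and TXG number are taken from the example above):

  # list historical uberblocks and the TXG each one points to
  zdb -u tank

  # import read-only at the last known-good TXG found above
  zpool import -T 482910 -o readonly=on tank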

Handling FAULTED and UNAVAIL Vdevs

When a vdev is FAULTED or UNAVAIL, the decision to replace or stop depends on the pool's remaining redundancy, the value of the data, and whether the failure is a drive issue or a connection issue.

  1. Check physical connections first. UNAVAIL often means ZFS cannot find the device path. A reseated SATA cable or a different controller port may resolve it.
  2. Check SMART data. If the drive reports reallocated sectors, pending sectors, or UNC errors, the hardware is failing.
  3. If the pool is DEGRADED (still has margin), zpool replace poolname old-dev new-dev initiates a resilver (ZFS term for rebuild). This reads all surviving vdev members to reconstruct the replacement.
  4. If the pool is FAULTED (no remaining margin), do not attempt to replace. The pool cannot guarantee data integrity. Image every drive and attempt offline reconstruction.

For enterprise server data recovery, FAULTED ZFS pools on production systems require imaging before any repair attempt. ZFS's copy-on-write architecture means the historical TXG data is still on-disk; any write operation (including a replace or scrub) can overwrite those blocks.

Example: A Proxmox hypervisor with a 3-drive raidz1 pool. Drive 1 shows FAULTED with 312 checksum errors. The pool is DEGRADED. The admin checks SMART on drives 2 and 3: both are clean. The admin images all 3 drives, then runs zpool replace tank /dev/sda /dev/sdd on the live pool (using the images as a fallback). The resilver completes in 6 hours. If the resilver had failed, the admin could have reconstructed the pool offline from the pre-resilver images.
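A sketch of that workflow, assuming GNU ddrescue is available; device paths and the image destination are illustrative:

  # image each member first; ddrescue keeps a map of unreadable sectors
  ddrescue -d /dev/sda /mnt/images/sda.img /mnt/images/sda.map

  # replace the faulted member and resilver from the surviving drives
  zpool replace tank /dev/sda /dev/sdd

  # watch resilver progress and error counters on the survivors
  zpool status -v tank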

raidz1, raidz2, raidz3: Fault Tolerance Compared

ZFS raidz levels map to traditional RAID data recovery parity concepts: raidz1 is single-parity (like RAID 5), raidz2 is dual-parity (like RAID 6), and raidz3 is triple-parity (no traditional RAID equivalent).

raidz1

Tolerates 1 drive failure. Same URE risk as RAID 5 during resilver. Not recommended for drives larger than 2TB.

raidz2

Tolerates 2 drive failures. The current recommendation for most ZFS deployments with large drives. Resilver can complete even with one URE.

raidz3

Tolerates 3 drive failures. Used in large-capacity deployments (12+ drives) where resilver times exceed 48 hours and multi-drive failure is a realistic scenario.

ZFS has one advantage over traditional RAID during rebuilds: it only resilvers allocated blocks, not the entire drive. A raidz2 pool at 50% capacity resilvers roughly half the data compared to a RAID 6 rebuild. This reduces both the time window and the total bytes read, lowering URE risk.

Example: A FreeNAS server with 8x 12TB drives in raidz2. Total raw capacity: 96TB. Usable: 72TB. Current usage: 36TB (50%). Drive 5 fails. The resilver reads approximately 36TB of allocated data (plus its parity) from the 7 surviving drives, rather than their full capacity as a traditional RAID 6 rebuild would. At the enterprise URE rate, the probability of completing the resilver without error is higher than for a traditional full-disk rebuild.
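To see how much of a pool a resilver would actually have to walk, compare allocated space against total capacity (the pool name tank is illustrative):

  # ALLOC, not total capacity, is what a resilver has to reconstruct
  zpool list -v tank

  # per-dataset breakdown of where the allocated space lives
  zfs list -r tank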

Frequently Asked Questions

Can data be recovered from a FAULTED ZFS pool?

In most cases, yes. FAULTED means ZFS has determined that the pool cannot guarantee data integrity with its current set of available vdevs. The data is still on the drives. Recovery involves imaging each drive with a write-blocker and reconstructing the pool offline. ZFS stores extensive metadata, including multiple copies of the uberblock and transaction group history, which professional tools can use to rebuild the pool state even when the live pool refuses to import.

What does UNAVAIL mean in ZFS?

UNAVAIL means ZFS cannot open the vdev at all. The drive may have failed, been disconnected, or the device path may have changed. If the vdev is part of a raidz group and too many members are UNAVAIL (more than the parity level allows), the entire pool transitions to FAULTED. A single UNAVAIL vdev in a mirror is tolerated as long as the other mirror member is ONLINE. Check 'zpool status' for the specific vdev and drive identifier.

Is it safe to use zpool clear on a degraded pool?

zpool clear resets error counters and retries I/O on vdevs that ZFS has flagged. If the errors were transient (a loose cable, a temporary controller issue), clearing can bring the vdev back online. If the errors reflect a real hardware failure, clearing masks the problem and allows the pool to continue operating with a drive that is actively failing. Future writes to that drive may be lost. Only use zpool clear if you have identified and resolved the root cause of the errors.

ZFS pool FAULTED or UNAVAIL?

Free evaluation. Write-blocked drive imaging. Offline pool reconstruction with TXG history preserved. No data, no fee.