
Technical Reference

How ZFS Differs from Hardware RAID

Written by
Louis Rossmann
Founder & Chief Technician
Published March 8, 2026
Updated March 8, 2026

Summary: ZFS versus hardware RAID

ZFS checksums every block and self-heals from redundant copies during normal reads; hardware RAID relies on mirroring or parity without per-block checksums, so it cannot detect silent corruption. ZFS copy-on-write keeps the on-disk tree intrinsically consistent, while hardware RAID relies on a battery-backed write cache to survive torn writes. ZFS needs direct disk access through an HBA in passthrough or JBOD mode and should never sit behind a hardware RAID controller.

ZFS and hardware RAID both provide data redundancy across multiple drives, but they operate at different layers of the storage stack. Hardware RAID controllers (Dell PERC, HP SmartArray, LSI MegaRAID, Adaptec) manage redundancy below the filesystem, presenting a single virtual volume to the operating system.

ZFS manages redundancy within the filesystem itself, combining volume management and filesystem operations into a single integrated layer. This architectural difference affects data integrity, failure handling, and recovery.

How does copy-on-write differ from in-place update?

Traditional filesystems (NTFS, ext4, XFS) and hardware RAID controllers use in-place updates: when data is modified, the new version overwrites the old version at the same physical location. If power is lost during the write, the block may contain a mix of old and new data (a torn write). Hardware RAID controllers mitigate this with battery-backed write cache (BBU/BBM) that preserves pending writes across power loss. Filesystems use journaling to record write intent before committing changes.

ZFS uses copy-on-write (COW): modified data is always written to a new location. The old data remains intact until the new write completes and the metadata tree is updated to point to the new location. The metadata tree itself is also written copy-on-write, all the way up to the root (the "uberblock"). Only after the entire tree of changes is written does ZFS atomically update the uberblock pointer.

The effect: ZFS never overwrites live data. A power loss at any point during a write leaves the filesystem tree in a consistent state, either reflecting the old data or the new data, never a torn mix.
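The copy-on-write sequence described above can be sketched as a toy model. This is a minimal illustration, not ZFS code: the `CowStore` class, its append-only list standing in for the disk, and the `crash_before_root_update` flag are all hypothetical names invented for the example.

```python
class CowStore:
    """Toy copy-on-write store (hypothetical, not the ZFS on-disk format)."""

    def __init__(self):
        self.blocks = []   # append-only "disk": existing entries are never modified
        self.root = None   # index of the current root block (stands in for the uberblock pointer)

    def _write(self, payload):
        # Every write lands in a new location.
        self.blocks.append(payload)
        return len(self.blocks) - 1

    def commit(self, updates, crash_before_root_update=False):
        # Start from a copy of the current name -> block index map.
        new_map = dict(self.blocks[self.root]) if self.root is not None else {}
        for name, data in updates.items():
            new_map[name] = self._write(data)   # old data block stays intact
        new_root = self._write(new_map)          # the tree itself is also written to a new location
        if crash_before_root_update:
            return                               # simulated power loss: new blocks are unreachable
        self.root = new_root                     # the single atomic switch

    def read(self, name):
        return self.blocks[self.blocks[self.root][name]]


store = CowStore()
store.commit({"a": b"old"})
store.commit({"a": b"new"}, crash_before_root_update=True)
after_crash = store.read("a")    # still the old version, never a torn mix
store.commit({"a": b"new"})
after_commit = store.read("a")   # the new version, visible only after the root switch
```

Because the root pointer changes in one step, a reader sees either the tree before the commit or the tree after it.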

On mount, ZFS opens the most recent valid uberblock. If acknowledged synchronous writes (fsync, O_SYNC, NFS COMMIT) were logged to the ZFS Intent Log (ZIL) but not yet committed in a transaction group, ZFS replays the ZIL to recover them. Async writes that never reached stable storage are lost, same as on any filesystem.
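Choosing "the most recent valid uberblock" can be sketched as follows. This is a simplified illustration under stated assumptions: the `Uberblock` dataclass, the use of SHA-256 as the validity check, and the rule of picking the highest transaction group number are stand-ins for the example, and real ZFS applies additional checks when selecting an uberblock.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Uberblock:
    txg: int          # transaction group number (higher means more recent)
    payload: bytes    # hypothetical stand-in for the root pointer contents
    checksum: bytes   # checksum recorded when the uberblock was written


def make_uberblock(txg, payload):
    return Uberblock(txg, payload, hashlib.sha256(payload).digest())


def is_valid(ub):
    # An uberblock torn by power loss no longer matches its recorded checksum.
    return hashlib.sha256(ub.payload).digest() == ub.checksum


def select_active(uberblocks):
    """Return the valid uberblock with the highest txg."""
    valid = [ub for ub in uberblocks if is_valid(ub)]
    if not valid:
        raise ValueError("no valid uberblock found")
    return max(valid, key=lambda ub: ub.txg)


ring = [make_uberblock(10, b"root@10"),
        make_uberblock(11, b"root@11"),
        make_uberblock(12, b"root@12")]
ring[2].payload = b"torn write"   # simulate a partially written newest uberblock
active = select_active(ring)      # falls back to txg 11
```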

How do checksumming and self-healing reads work?

Hardware RAID has no mechanism to detect silent data corruption. If a drive returns incorrect data without reporting a read error, the RAID controller accepts it as valid and may even incorporate it into parity calculations. This is called silent data corruption or bit rot.

ZFS checksums every block of data and metadata using a 256-bit checksum (fletcher4 by default, or SHA-256 when deduplication is enabled). The checksum is stored in the block's parent pointer, not alongside the data. This separation means a single disk corruption event cannot simultaneously damage both the data and its checksum.

When ZFS reads a block, it verifies the checksum before returning the data. If the checksum does not match, ZFS knows the data is corrupt.

In a redundant configuration (mirror or RAIDZ), ZFS automatically reads the block from a different copy or reconstructs it from parity. If the alternate copy is valid, ZFS overwrites the corrupted copy with correct data. This is self-healing: corruption is detected and repaired transparently during normal reads.
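A self-healing read from a mirror can be sketched roughly like this. The function name `self_healing_read` and the list-of-copies model are hypothetical, and SHA-256 is used here only as a convenient stand-in for the checksum; the point is that the expected checksum comes from the parent pointer, not from the copy being read.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def self_healing_read(copies, expected_checksum):
    """Read mirror copies in order until one verifies, then repair the ones that failed.

    copies: mutable list with one entry per mirror side (hypothetical model).
    expected_checksum: the value held in the parent block pointer.
    """
    failed = []
    for index, data in enumerate(copies):
        if checksum(data) == expected_checksum:
            for bad_index in failed:
                copies[bad_index] = data   # overwrite the corrupt copy with good data
            return data
        failed.append(index)
    raise IOError("all copies failed checksum verification")


good = b"payroll records"
parent_pointer_checksum = checksum(good)
mirror = [b"payroll rec\x00rds", good]   # first side silently corrupted
result = self_healing_read(mirror, parent_pointer_checksum)
```

After the call, both entries in `mirror` hold the verified data. If no copy verifies, the read fails with an error instead of returning bad data.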

Feature | Hardware RAID | ZFS
Silent corruption detection | No (trusts drive-reported data) | Yes (every block checksummed)
Self-healing reads | No | Yes (with redundancy)
Write consistency | Requires BBU + journal | Copy-on-write (inherent)
Disk visibility | Controller hides individual disks | Filesystem manages individual disks
Expansion | Replace drives or add expansion unit | Add new vdevs (RAIDZ vdev expansion added in OpenZFS 2.3+)
Cache safety | BBU-backed write cache | ZIL on separate SLOG device (optional)

How does ZFS scrub differ from a hardware RAID rebuild?

Hardware RAID has no built-in mechanism to verify stored data against checksums. Some enterprise controllers support "patrol reads" that scan drives in the background, but these only detect drive-reported errors, not silent corruption.

ZFS scrub reads every allocated block on every drive and verifies its checksum. If a block fails verification, ZFS repairs it from redundant copies (mirror or parity). A scrub is non-destructive and can run on a live, mounted filesystem. Running regular scrubs (weekly or monthly) catches corruption early, before it accumulates across multiple blocks.

When a drive fails, ZFS resilvering (the equivalent of a RAID rebuild) only reads and writes the blocks that are actually allocated. If a 16 TB drive is 40% full, ZFS resilvers approximately 6.4 TB of data, not 16 TB. Hardware RAID rebuilds always process the entire drive capacity because the controller operates below the filesystem and does not know which blocks contain data.
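The arithmetic behind that comparison, as a small sketch. The function names are invented for the example, and the model ignores metadata overhead and fragmentation.

```python
def resilver_volume_tb(capacity_tb: float, fill_fraction: float) -> float:
    """Data a ZFS resilver must process: only the allocated portion of the drive."""
    return capacity_tb * fill_fraction


def raid_rebuild_volume_tb(capacity_tb: float) -> float:
    """Data a hardware RAID rebuild must process: the whole drive, regardless of usage."""
    return capacity_tb


zfs_tb = resilver_volume_tb(16, 0.40)    # the 16 TB, 40% full example from the text
raid_tb = raid_rebuild_volume_tb(16)
saved_fraction = 1 - zfs_tb / raid_tb    # share of the rebuild work ZFS skips
```

The emptier the pool, the larger the gap. On a completely full pool the two approaches process the same amount of data.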

What causes ZFS pools to fail, and what does recovery look like?

ZFS is not immune to failure. The most common scenarios:

  • Too many drive failures. RAIDZ1 (single parity) tolerates one drive failure. RAIDZ2 (double parity) tolerates two. RAIDZ3 tolerates three. Exceeding the redundancy level makes the pool unimportable.
  • Pool metadata corruption. The uberblock, MOS (Meta Object Set), and space maps are critical metadata structures. If these corrupt on all copies (possible with firmware bugs or controller errors during a multi-drive event), the pool cannot mount.
  • RAIDZ expansion complications. ZFS traditionally did not allow adding drives to an existing RAIDZ vdev (OpenZFS 2.3+ added this feature). Misconfigured pool expansions or interrupted vdev additions can leave the pool in an inconsistent state.
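  • Accidental pool destruction. The command "zpool destroy" takes effect immediately, with no confirmation prompt. It marks the pool as destroyed in the labels on all member drives. A destroyed pool can sometimes be brought back with "zpool import -D", but only if the drives have not been reused or overwritten since.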
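The redundancy limits in the first item of the list can be expressed as a short check. This is an illustrative sketch only: the `RAIDZ_PARITY` table and both function names are made up for the example. It relies on the fact that a pool needs every one of its top-level vdevs to survive.

```python
# Number of drive failures each RAIDZ level tolerates, per the list above.
RAIDZ_PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def vdev_survives(vdev_type: str, failed_drives: int) -> bool:
    """A RAIDZ vdev survives while failures do not exceed its parity level."""
    return failed_drives <= RAIDZ_PARITY[vdev_type]


def pool_importable(vdevs) -> bool:
    """vdevs: iterable of (vdev_type, failed_drives). Losing any one vdev loses the pool."""
    return all(vdev_survives(vdev_type, failed) for vdev_type, failed in vdevs)


healthy_enough = pool_importable([("raidz2", 2), ("raidz2", 0)])
lost = pool_importable([("raidz1", 2), ("raidz2", 0)])
```

In the second call, the two failures in the RAIDZ1 vdev are enough to lose the whole pool, even though the RAIDZ2 vdev is untouched.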

Recovery from a failed ZFS pool involves imaging all drives individually and using ZFS-aware recovery tools to parse the on-disk structures. Because ZFS stores metadata in a Merkle tree (every block pointer includes the checksum of the block it points to), recovery tools can validate data integrity during reconstruction. Damaged metadata blocks can sometimes be reconstructed from the multiple copies ZFS maintains: uberblocks are stored redundantly in the labels on every drive, and metadata is written with more copies than file data. The "copies" property raises the number of copies kept for file data as well.
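How a Merkle tree lets a recovery tool pinpoint damage can be sketched with a toy model. Everything here is hypothetical: the `BlockPointer` class, the dictionary standing in for imaged drives, and the use of SHA-256. It is not the ZFS block pointer format, only the same idea of a parent holding the checksum of each child.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockPointer:
    address: int      # where the child block lives on the (toy) disk
    checksum: bytes   # checksum of the child block, held by the parent


def block_digest(payload, children):
    # A block's checksum covers its payload and the pointers it contains.
    digest = hashlib.sha256(payload)
    for child in children:
        digest.update(child.checksum)
    return digest.digest()


def store(disk, address, payload, children=()):
    children = list(children)
    disk[address] = (payload, children)
    return BlockPointer(address, block_digest(payload, children))


def find_damaged(disk, pointer):
    """Return addresses of reachable blocks that fail the checksum held by their parent."""
    payload, children = disk[pointer.address]
    if block_digest(payload, children) != pointer.checksum:
        return [pointer.address]   # pointers inside a damaged block cannot be trusted
    damaged = []
    for child in children:
        damaged.extend(find_damaged(disk, child))
    return damaged


disk = {}
leaf_a = store(disk, 1, b"file A")
leaf_b = store(disk, 2, b"file B")
root = store(disk, 0, b"directory", [leaf_a, leaf_b])
before = find_damaged(disk, root)
disk[2] = (b"file X", [])          # simulate silent corruption of one leaf
after = find_damaged(disk, root)
```

The walk verifies each block against the checksum held one level up, so a single corrupted leaf is identified by address and the rest of the tree is confirmed intact.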

ZFS should never sit behind a hardware RAID controller.

ZFS needs direct access to individual disks to perform checksumming, self-healing, and copy-on-write operations. A hardware RAID controller hides individual disks behind a virtual volume, preventing ZFS from detecting which disk returned bad data. The controller's write cache can also interfere with ZFS's write ordering guarantees. Use an HBA (Host Bus Adapter) in passthrough or JBOD mode instead.

Frequently Asked Questions

Does ZFS make data recovery unnecessary?
No. ZFS provides stronger data integrity guarantees than hardware RAID through checksumming, copy-on-write, and self-healing reads. However, ZFS cannot protect against all failure modes. If enough drives in a vdev fail to exceed the redundancy level (e.g., two drives in a single-parity RAIDZ1), the pool becomes unimportable. If the pool metadata on all drives is corrupted (possible with firmware bugs, controller errors, or multiple simultaneous failures), the pool cannot mount. ZFS also cannot protect against user error (deleting files without snapshots) or NAND degradation in SSD-based pools.
Why should I not put ZFS behind a hardware RAID controller?
ZFS needs direct access to individual disks for its checksumming, copy-on-write, and self-healing operations to work as designed. A hardware RAID controller presents a single virtual volume to the operating system, hiding the individual disks. ZFS can still detect a checksum mismatch on that volume, but it cannot tell which disk returned bad data and has no redundant copy of its own to self-heal from. The RAID controller's write cache also interferes with ZFS's own write ordering guarantees, potentially corrupting the pool during power loss if the controller's battery-backed cache fails. ZFS should be connected to an HBA (Host Bus Adapter) in passthrough/JBOD mode, not a RAID controller.
