What "Volume Crashed" Means at the Linux Level
Synology DSM runs on Linux and uses mdadm for software RAID management. "Volume Crashed" means the mdadm array has entered an inactive or failed state. It is not a DSM UI glitch; it reflects a real failure in the underlying RAID layer.
1. DSM creates Linux md (multiple device) arrays using mdadm. Each storage pool corresponds to one or more md devices (/dev/md0, /dev/md1, etc.).
2. Each drive in the array carries an mdadm superblock containing the array UUID, layout, chunk size, and device role.
3. When enough member drives fail, disconnect, or report I/O errors, mdadm marks the array as inactive. DSM reads this state and displays "Volume Crashed."
4. The drives themselves are usually still readable individually. The RAID metadata binding them into a single volume is what has broken.
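The inactive state is visible directly in /proc/mdstat. A minimal sketch, using a hardcoded sample of what a crashed pool's mdstat can look like (illustrative text, not captured from a real unit):

```shell
# Illustrative /proc/mdstat text for a crashed SHR-1 pool (sample only;
# on a live NAS you would read /proc/mdstat itself).
mdstat_sample='Personalities : [raid1] [raid5]
md2 : inactive sda3[0](S) sdb3[1](S) sdc3[2](S)
      11706589632 blocks super 1.2
md1 : active raid1 sda2[0] sdb2[1] sdc2[2] sdd2[3]
      2097088 blocks [4/4] [UUUU]
md0 : active raid1 sda1[0] sdb1[1] sdc1[2] sdd1[3]
      2490176 blocks [4/4] [UUUU]'

# Any md device whose state is not "active" is what DSM surfaces as
# "Volume Crashed"; md0 (system) and md1 (swap) are still healthy here.
crashed=$(printf '%s\n' "$mdstat_sample" |
          awk '/^md/ && $3 != "active" {print $1, $3}')
echo "$crashed"
```

Here only md2, the data array, has dropped out; the system and swap arrays on the same drives keep running, which is why a NAS with a crashed volume can still boot DSM.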
Mechanism: In a four-bay SHR-1 array, a single drive can develop bad sectors over weeks and DSM will mark the volume degraded but keep it online. If a second drive then reports read errors during a scheduled data scrub, mdadm cannot maintain a single-parity array with two faulty members. The kernel marks the array inactive, and DSM reports "Volume Crashed".
SHR Architecture and mdadm Underneath
Synology Hybrid RAID (SHR) is not a custom RAID implementation. It is a partition layout that creates standard mdadm RAID arrays across partitions of different sizes, allowing mixed-capacity drives to share one storage pool.
1. SHR partitions each drive into slices sized to match the smallest drive in the pool.
2. Each slice group forms a standard mdadm RAID 5 array (for SHR-1) or RAID 6 array (for SHR-2).
3. Leftover capacity on larger drives forms additional mdadm arrays (often RAID 1 pairs) to use the extra space.
4. All md arrays are combined into a single LVM volume group, and the logical volume is formatted with Btrfs or EXT4.
5. A "Volume Crashed" error means at least one of these md arrays has failed, which takes the entire LVM volume offline.
Mechanism: In a mixed-capacity SHR-1 pool (for example, two 8TB drives paired with two 4TB drives), DSM creates a RAID 5 md array across the matching 4TB partitions, and a RAID 1 md array across the extra 4TB carved from each 8TB drive. Both md arrays are joined into a single LVM volume group. If the first md array fails, the entire LVM volume crashes even when the second md array remains healthy.
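The capacity arithmetic for that example pool can be sketched in a few lines of shell (whole-TB integer math; real usable space is slightly lower after metadata overhead):

```shell
# Capacity sketch for the example pool above: two 8 TB + two 4 TB drives
# in SHR-1. Sizes in whole TB for illustration.
small=4; large=8
n=4                                 # total drives in the pool

raid5=$(( (n - 1) * small ))        # RAID 5 slice across all four drives
raid1=$(( large - small ))          # RAID 1 pair on the 8 TB remainders
total=$(( raid5 + raid1 ))
echo "usable: ${total} TB"          # 12 TB + 4 TB
```

The same drives in plain RAID 5 would waste the 8 TB drives' extra capacity entirely; the second md array is where SHR recovers it.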
NVMe Read/Write Cache Failures
Not every "Volume Crashed" error comes from a failing hard drive. On models like the DS920+, DS1520+, and DS1621+, an M.2 NVMe cache drive failure can crash the volume while every mechanical HDD is healthy.
DSM supports two cache modes: read-only and read/write. Read-only cache stores copies of frequently accessed data and carries no crash risk. Read/write cache is the failure point: it intercepts writes before they commit to disk, so if the NVMe drive drops off the PCIe bus, those uncommitted writes are lost, the LVM volume becomes inconsistent, and DSM reports "Volume Crashed."
Common triggers include consumer-grade NVMe controllers that suffer firmware panics under sustained write loads or sudden power loss. The drive may report 0 bytes capacity or fail to enumerate on the PCIe bus entirely.
Recovery requires imaging the cache drive separately, then reconciling the uncommitted write journal with the HDD array images. Professional NAS data recovery with write-blocked imaging of both the cache and array drives is the safe path.
Btrfs vs EXT4: Filesystem Recovery Differences
Synology supports two filesystems: Btrfs (copy-on-write, with snapshots and checksumming) and EXT4 (traditional journaled filesystem). The filesystem type determines which recovery tools work, which failure modes are possible, and whether silent corruption is detected before data loss occurs.
| Feature | Btrfs | EXT4 |
|---|---|---|
| Write model | Copy-on-write; data is never overwritten in place | Journal-based; metadata writes are journaled, data may not be |
| Checksumming | Metadata and data checksums detect silent corruption | No built-in checksumming; RAID parity mismatches are not detected at the filesystem level |
| Snapshots | Snapshots may preserve earlier file versions after corruption of the latest copy | No native snapshots |
| Recovery tools | btrfs restore, btrfs check (Btrfs-specific; standard undelete tools do not understand COW metadata) | e2fsck, debugfs, extundelete (well-documented, widely available) |
Mechanism: When a Btrfs SHR volume takes a power loss mid-transaction, the mdadm array can reassemble while Btrfs refuses to mount because the filesystem tree root checksum does not match. Running "btrfs check --repair" can fix metadata inconsistencies but may also drop files whose metadata cannot be validated. On EXT4, e2fsck would instead replay the journal and reconnect orphaned inodes to lost+found. The recovery risk profile is different on each filesystem.
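A sketch of that first-move difference, assuming the array has already been assembled read-only from images; the device path /dev/vg1/lv and the restore target below are hypothetical placeholders, not Synology's actual volume paths:

```shell
# Sketch of the differing first moves per filesystem. The type string
# would normally come from `blkid` on the assembled, read-only array;
# it is hardcoded here for illustration.
fstype="btrfs"

case "$fstype" in
  btrfs)
    # Non-destructive first: copy files out without mounting or repairing.
    plan="btrfs restore -v /dev/vg1/lv /mnt/recovered" ;;
  ext4)
    # Dry-run check first; never run e2fsck -y against the only copy.
    plan="e2fsck -n /dev/vg1/lv" ;;
  *)
    plan="unknown filesystem: image and escalate" ;;
esac
echo "$plan"
```

The design point is ordering: on either filesystem, the read-only or dry-run variant comes before any repair flag, because repair operations write to the volume.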
Using photorec and testdisk on Crashed Volumes
photorec and testdisk are open-source tools that scan raw block devices for file signatures (file carving). They can recover files from damaged filesystems, but they carry real risks when run against live or degraded arrays.
1. photorec scans raw sectors for known file headers (JPEG, PDF, DOCX, etc.) and extracts files regardless of filesystem state. It does not preserve filenames or directory structure.
2. testdisk analyzes partition tables and can sometimes rebuild a damaged partition map or recover a deleted partition.
3. On Btrfs, file carving with photorec is less effective because COW scatters file extents across the device. Large files are often fragmented in ways that photorec cannot reconstruct.
Image first, scan second. Running recovery tools directly on a degraded array can trigger additional reads that stress failing drives, cause rapid mechanical degradation of weak read/write heads, or provoke mdadm to attempt resync operations. For irreplaceable data, create write-blocked images of every drive using ddrescue before running photorec, testdisk, or any other scanning tool. Work from the images, not the source drives.
Mechanism: Running photorec directly on a live md device of a degraded SHR array issues sustained sequential reads across every surviving drive. A drive that is already reporting SMART warnings can develop additional bad sectors under that read load. mdadm then kicks the drive from the array entirely, and a single-drive degraded failure escalates to a double-drive failure. This is why imaging precedes scanning.
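Signature carving itself is simple to illustrate: photorec's core technique is scanning raw bytes for known headers. A toy sketch that counts JPEG start-of-image markers (FF D8 FF) in a scratch file:

```shell
# Toy illustration of signature carving: scan raw bytes for the JPEG
# start-of-image marker (FF D8 FF, written below as octal escapes).
# Real carving also tracks end markers and cannot fix the COW
# fragmentation problem described above.
printf 'junk\377\330\377morejunk\377\330\377tail' > /tmp/carve_demo.bin

hits=$(od -An -tx1 -v /tmp/carve_demo.bin | tr -d ' \n' |
       grep -o 'ffd8ff' | wc -l)
hits=$((hits))                      # strip any whitespace from wc output
echo "JPEG signatures found: $hits"
```

This is why carved output loses filenames and directory structure: the scan sees only byte patterns, never filesystem metadata.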
Why Reinstalling DSM Destroys Your Data
When Synology DSM prompts you to reinstall or migrate after a volume crash, it is preparing to rewrite the system partition on every drive. A standard reinstall leaves user data on partition 3 in place, but it can alter partition tables and complicate recovery of corrupted LVM metadata.
1. Each Synology drive contains a small system partition (partition 1) holding the DSM operating system, a swap partition (partition 2), and one or more data partitions (partition 3 and up) holding the RAID members.
2. The DSM installer rewrites partition 1 (md0, system) and partition 2 (md1, swap). A standard Mode 2 reinstall does not directly touch user data on partition 3 and up (md2 and up), but partition table changes can complicate reassembly.
3. Without valid mdadm superblocks, the array cannot be automatically reassembled. Manual reconstruction requires knowing the exact RAID level, chunk size, layout, and drive order: information that was stored in the superblocks.
4. Moving drives to a new Synology unit carries the same risk. The new unit's DSM installer may treat the drives as uninitialized if it cannot read the existing RAID metadata.
Mechanism: Moving the drives from a crashed unit into a new Synology chassis and selecting "Migrate" causes DSM to rewrite the system partitions and attempt to import the md arrays. If the original array metadata was damaged during the crash, the import fails and DSM may only offer "Install fresh", which will destroy the remaining data partitions. At that point, Synology NAS data recovery requires imaging the drives before any further DSM operations.
How Do SMR Drives Cause Synology Volume Crashes?
Shingled Magnetic Recording (SMR) drives are a common hidden cause of Synology volume crashes during rebuild operations. Under sustained sequential write loads like an mdadm array resync, an SMR drive's cache zone overflows and the drive pauses to reorganize shingles. The Linux kernel interprets this timeout as a dead drive and drops it from the array.
SMR drives write data in overlapping tracks to increase density. They maintain a small CMR (Conventional Magnetic Recording) cache zone for incoming writes, then reorganize data into shingles during idle time. Under a rebuild workload, that cache overflows and the pause can last seconds to minutes.
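The overflow arithmetic can be sketched with assumed round numbers; the cache-zone size and reorganize rate below are illustrative guesses, not vendor specifications:

```shell
# Assumed round numbers, not vendor specs: a 25 GB CMR cache zone filling
# at the difference between the rebuild's incoming write rate and the
# drive's shingle-reorganize (drain) rate.
cache_gb=25
write_mbps=150
reorg_mbps=30

net=$(( write_mbps - reorg_mbps ))   # cache fills at the net rate
secs=$(( cache_gb * 1000 / net ))
echo "cache zone full after ~${secs}s of sustained writes"
```

A resync runs for hours, so the cache zone fills within the first few minutes; everything after that point is governed by the slow reorganize path, which is where the kernel-level timeouts come from.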
If DSM shows a degraded volume and one of the drives is SMR, clicking "Repair" to initiate a rebuild can trigger this exact timeout cascade. The rebuild operation hammers every surviving drive with sustained reads and writes; the SMR drive stalls, gets kicked, and the array collapses from degraded to crashed.
If you aren't sure whether your drives are CMR or SMR, do not attempt a RAID rebuild without professional guidance. A failed rebuild on a degraded array usually results in total data loss. If the underlying drives have mechanical damage, they need hard drive data recovery before any array-level work can begin.
What Are the Safe Diagnostic Steps After a Volume Crash?
If the volume contains data you need, the correct sequence is: stop DSM from making changes, image every drive, then attempt reassembly on the images. The original drives should not be written to at any point. Do not click Repair, Migrate, or Reinstall in DSM before drives are imaged.
- Power down the NAS. Do not click Repair, Migrate, or Reinstall in DSM.
- Label each drive with its bay number. Remove the drives.
- Connect each drive to a Linux workstation using a write-blocker or mount read-only.
- Run mdadm --examine /dev/sdX3 on each drive to read the RAID superblock. This tells you the array UUID, RAID level, chunk size, and device roles.
- Image each drive with ddrescue to a separate destination disk. This preserves the original state.
- Attempt mdadm --assemble --readonly on the images. If the array assembles, mount the filesystem read-only and copy files to a new destination.
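The examine step produces output like the following. A sketch of pulling the reassembly-critical fields from a hardcoded sample (the UUID and every other value below are made up for illustration):

```shell
# Hardcoded sample of `mdadm --examine` output; all values are invented.
# On real drives you would run the command itself against each partition.
examine_sample='/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
     Array UUID : 3f8c2a1d:9be4f7a2:c1d0e5b6:7a8f9c0d
     Raid Level : raid5
   Raid Devices : 4
     Chunk Size : 64K
    Device Role : Active device 0'

# Pull the fields needed for offline reassembly.
geometry=$(printf '%s\n' "$examine_sample" | awk -F' : ' '
  /Array UUID|Raid Level|Chunk Size/ { gsub(/^ +/, "", $1); print $1 "=" $2 }')
echo "$geometry"
```

Record these values for every drive before imaging; if a superblock is later lost, they are the only record of the array geometry.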
If you are not comfortable working with mdadm, LVM, and Btrfs or EXT4 at the command line, or if the array does not assemble from images, professional NAS data recovery with write-blocked imaging is the lower-risk path.
If mdadm --examine returns no valid superblock on one or more drives, the array metadata may have been overwritten by a DSM reinstall or a failed migration attempt. Reconstructing array geometry without intact metadata requires knowing the original RAID level, chunk size, layout, and drive order.
DSM 6.x vs 7.x: Volume Crash Behavior Differences
The DSM version running on the NAS changes what the user sees when a volume fails and what DSM will do on its own after a reboot. DSM 7.x added automated storage pool repair that can begin a rebuild without user confirmation, converting a degraded state into a Volume Crashed state on marginal hardware.
1. DSM 6.2 ships mdadm 3.4-era utilities. Degraded storage pools require an explicit Repair click in Storage Manager. The OS does not start a rebuild on its own.
2. DSM 7.0 and later add an automated pool repair workflow on supported models. After a reboot with a replacement drive inserted, DSM can begin resync without user confirmation. On healthy hardware this is convenient; on a degraded array with a marginal SMR drive, it can turn a degraded state into a crashed state.
3. DSM 7 also introduced the Storage Pool 2.0 layout on newer units. The data array moved off partition 3 on some models, so diagnostic commands that worked on DSM 6 (mdadm --examine /dev/sdX3) may need to target a different partition on DSM 7.
4. Btrfs support is mandatory for Snapshot Replication in DSM 7, so more DSM 7 volumes use Btrfs than DSM 6 volumes did. Btrfs metadata damage after a power loss is correspondingly a more common failure mode on DSM 7 units.
If the unit runs DSM 7 and shows any degraded or warning status, power it off before the next scheduled reboot. Auto Repair can be disabled under Storage Manager settings, but it ships enabled by default; an auto-resync onto a marginal drive is a common cause of degraded-to-crashed escalation during unattended overnight reboots.
PC-3000 Portable III for mdadm Superblock Extraction
When one or more Synology drives develop physical read errors, a standard Linux workstation with a SATA dock cannot produce a clean image. Standard controllers issue retries that stress failing heads; the Linux kernel times out and marks the drive dead. Purpose-built imaging hardware like PC-3000 Portable III handles bad-sector skipping and per-head retry budgets that standard controllers cannot.
1. The PC-3000 Portable III images Synology drives at the physical layer with head-map control, bad-sector skipping, reverse-read modes, and per-head retry budgets. The output is a sector-accurate image file even when the source drive cannot sustain continuous sequential reads.
2. The mdadm superblock lives at a fixed offset inside the data partition (commonly partition 3 on DSM 6; varies on DSM 7). Once the image exists, standard Linux utilities can read it: mdadm --examine image.img reveals the array UUID, RAID level, chunk size, and device role, all required for offline reassembly.
3. PC-3000 does not interpret the RAID metadata itself. Its role is producing a complete image from a drive that a normal controller cannot read. Array reconstruction then happens on a separate Linux system operating on the images, never on the original drives.
4. If a head has failed outright, imaging pauses until the drive is opened in the clean bench and donor heads are installed. Our Austin lab performs hard drive data recovery in a 0.02 µm ULPA-filtered clean bench; the imaged drive then goes back onto the PC-3000 to finish extraction.
RTO and RPO Planning for Synology Recovery
Businesses planning around a Synology failure need realistic numbers for Recovery Time Objective (RTO: how long until data is back in hand) and Recovery Point Objective (RPO: how much recent data is lost). The physics of sector-by-sector imaging set hard lower bounds that no amount of rush fee can compress.
1. Imaging time: ddrescue on a healthy 4 TB drive runs at roughly 120 to 180 MB/s sustained; a full pass takes 6 to 10 hours. A drive with read errors runs 10x slower on the affected zones. Expect 1 to 2 calendar days to image a 4-bay SHR-1 with mixed drive health.
2. Array and filesystem work: mdadm reassembly from images, Btrfs or EXT4 mount testing, and file extraction add 4 to 12 hours for a healthy-metadata case; 1 to 3 days if the filesystem requires block-level carving.
3. Head swaps: If a drive has a failed head stack, allow 3 to 7 days per drive for donor sourcing, physical transplant, and adaptive re-imaging. Multiple bad drives extend this in parallel rather than serially.
4. Typical RTO: 3 to 7 calendar days for a healthy-drive SHR-1 recovery; 10 to 21 days for cases involving head swaps, severe Btrfs metadata damage, or large arrays (6 drives and up). The optional +$100 rush fee moves a case to the front of the queue but does not change imaging throughput.
5. RPO: Snapshot Replication (DSM 7) or Hyper Backup to a second device is the only way to reduce RPO below the last scheduled backup window. On-NAS snapshots survive a Btrfs volume crash only if the snapshot metadata is independently intact, which is not guaranteed. Treat the NAS itself as a single failure domain when planning RPO.
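The imaging-time lower bound in item 1 is straightforward arithmetic:

```shell
# Lower-bound imaging time for one healthy 4 TB drive at the mid-range
# sustained rate quoted above (150 MB/s); integer math.
capacity_mb=$(( 4 * 1000 * 1000 ))   # 4 TB expressed in MB
rate_mbps=150
hours=$(( capacity_mb / rate_mbps / 3600 ))
echo "~${hours} h per healthy 4 TB pass"
```

Every error-handling retry, head swap, or slow zone only adds to this floor, which is why rush fees cannot compress it.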
For business-critical arrays, pair the NAS with an offsite RAID backup target running a different filesystem and hardware generation so that a single firmware or controller bug cannot take down both copies at once.
How Much Does Synology Recovery Cost?
Synology NAS recovery uses our standard hard drive data recovery tiers, applied per drive that requires work. Multi-drive arrays are not billed as a flat NAS fee; you pay for the recovery work each drive actually needs. Free diagnostic; no data, no fee; optional +$100 rush fee to move to the front of the queue.
| Tier | Complexity | Symptom | What it involves | Price | Turnaround |
|---|---|---|---|---|---|
| Simple Copy | Low | Your drive works; you just need the data moved off it | Functional drive; data transfer to new media. Rush available: +$100 | $100 | 3-5 business days |
| File System Recovery | Low | Drive isn't recognized by your computer, but isn't making unusual sounds | File system corruption; accessible with professional recovery software but not by the OS. Starting price; final depends on complexity | From $250 | 2-4 weeks |
| Firmware Repair | Medium | Drive is completely inaccessible; may be detected but shows the wrong size or won't respond | Firmware corruption (ROM, modules, or translator tables); requires PC-3000 terminal access. CMR: $600; SMR: $900 | $600–$900 | 3-6 weeks |
| Head Swap (most common) | High | Drive is clicking, beeping, or won't spin; the internal read/write heads have failed | Head stack assembly failure; heads transplanted from a matching donor drive on a clean bench. 50% deposit required | CMR: $1,200–$1,500 + donor; SMR: $1,500 + donor | 4-8 weeks |
| Surface / Platter Damage | High | Drive was dropped, has visible damage, or a head crash scraped the platters | Platter scoring or contamination; requires platter cleaning and head swap. 50% deposit required; donor parts are consumed in the repair. Most difficult recovery type | $2,000 | 4-8 weeks |
Hardware Repair vs. Software Locks
Our "no data, no fee" policy applies to hardware recovery. We do not bill for unsuccessful physical repairs. If we replace a hard drive read/write head assembly or repair a liquid-damaged logic board to a bootable state, the hardware repair is complete and standard rates apply. If data remains inaccessible due to user-configured software locks, a forgotten passcode, or a remote wipe command, the physical repair is still billable. We cannot bypass user encryption or activation locks.
No data, no fee. Free evaluation and firm quote before any paid work. Full guarantee details. Head swap and surface damage require a 50% deposit because donor parts are consumed in the attempt.
- Rush fee
- +$100 rush fee to move to the front of the queue
- Donor drives
- Donor drives are matching drives used for parts. Typical donor cost: $50–$150 for common drives, $200–$400 for rare or high-capacity models. We source the cheapest compatible donor available.
- Target drive
- The destination drive we copy recovered data onto. You can supply your own or we provide one at cost plus a small markup. For larger capacities (8TB, 10TB, 16TB and above), target drives cost $400+ extra. All prices are plus applicable tax.
Terminology Reference
- DSM (DiskStation Manager)
- Synology's Linux-based operating system. Runs on every DiskStation and RackStation unit and exposes the web UI that reports Volume Crashed.
- SHR (Synology Hybrid RAID)
- A partition layout that builds standard mdadm RAID 5 (SHR-1) or RAID 6 (SHR-2) arrays across same-sized slices of mixed-capacity drives, then joins them into one LVM volume group.
- btrfs scrub
- A Btrfs background process that reads every data and metadata block, verifies checksums, and rewrites blocks that fail verification using mirrored or parity data. A scheduled scrub on a marginal drive can be the event that escalates a degraded volume into a crashed state.
- dm-cache
- The Linux device-mapper layer Synology uses to implement SSD read/write cache in front of the HDD array. When the cache backing device drops off the PCIe bus mid-write, dm-cache cannot reconcile dirty blocks and the LVM volume on top of the cached array becomes inconsistent.
Data Recovery Standards & Verification
Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to make sure your hard drive is handled safely and properly. This approach allows us to serve clients nationwide with consistent technical standards.
Open-drive work is performed in a ULPA-filtered laminar-flow bench, validated to 0.02 µm particle count, verified using TSI P-Trak instrumentation.
Transparent History
Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.
Media Coverage
Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.
Aligned Incentives
Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.
Technical Oversight
Louis Rossmann
Louis Rossmann's well-trained staff review our lab protocols to ensure technical accuracy and honest service. Since 2008, his focus has been on clear technical communication and accurate diagnostics rather than sales-driven explanations.
We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.
See our clean bench validation data and particle test video.
Frequently Asked Questions
What does Volume Crashed mean on Synology?
Can I recover data after a Synology volume crash?
Should I reinstall DSM after a volume crash?
Can an NVMe SSD cache failure crash my Synology volume?
What does SMART attribute ID 199 mean on a Synology NAS?
How much does Synology NAS data recovery cost?
How does SHR-1 vs SHR-2 crash recovery differ?
What does a Btrfs 'parent transid verify failed' error mean for recovery?
Does DSM 7 auto-repair behave differently than DSM 6 on a crashed volume?
What is a realistic RTO (recovery time) for a crashed Synology volume?
Can PC-3000 read Synology mdadm superblocks from a damaged drive?
Can DRAM-less NVMe cache drives cause a Synology volume crash?
Rossmann Repair Group recovers data from all Synology DiskStation and RackStation models. We image every drive with write-blockers, reassemble the mdadm array offline, and extract files from the Btrfs or EXT4 filesystem. Our no data, no fee guarantee means you don't pay if we can't recover your files. See our Synology NAS recovery page for model-specific details and pricing.
Related services
Related Recovery Services
Model-specific Synology recovery details
Full NAS recovery service overview
Missing or corrupted mdadm superblocks
Hardware and software RAID recovery
Failed rebuild diagnostic guide
Recovering from degraded arrays
Synology volume crashed?
Free evaluation. Write-blocked drive imaging. mdadm array reconstruction. No data, no fee.
