
Virtual Machine Data Recovery

When the physical storage beneath your VMs fails, the hypervisor cannot help you. RAID controller failures, SAN LUN corruption, and drive mechanical failures take entire datastores offline. We image the failed drives, reconstruct the storage array, parse the host filesystem, and extract your virtual disk files with their guest data intact.

All work is performed in-house at our Austin, TX lab using PC-3000 and DeepSpar Disk Imager. No data recovered means no charge.


Written by Louis Rossmann, Founder & Chief Technician
Updated March 14, 2026

How Virtual Machine Recovery Works

Virtual machine recovery is a two-layer problem. The first layer is the physical storage: RAID arrays, SAN LUNs, or standalone drives that hold the datastore. The second layer is the virtual disk container: VMDK, VHDX, qcow2, or raw files stored on the host filesystem. A hardware failure at the physical layer makes both layers inaccessible, but the virtual disk data is typically intact within the container files once the underlying storage is reconstructed.

  1. Isolate and image the physical drives using PC-3000 with sector-by-sector cloning and custom read timeouts to prevent degraded heads from further damaging platters.
  2. Reconstruct the RAID array offline by parsing controller metadata (PERC, Smart Array, LSI, mdadm superblocks, ZFS labels) from drive images. Determine stripe size, parity rotation, and member ordering without touching original hardware.
  3. Parse the host filesystem (VMFS5/6, NTFS, ReFS, ext4, XFS, ZFS) from the reconstructed array to locate the virtual disk container files and their metadata.
  4. Extract the virtual disks and consolidate any snapshot deltas back into the base disk, producing a single flat image representing the VM's last consistent state.
  5. Verify guest filesystem integrity by mounting the recovered virtual disk read-only and confirming NTFS, ext4, or XFS structures are intact.
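The array-reconstruction step above can be sanity-checked programmatically: in RAID-5, the data and parity chunks in every stripe row XOR to zero regardless of where the parity rotates, so a candidate stripe size (and member set) can be verified against the drive images before committing to a full reassembly. A minimal sketch in Python — the file paths and sampling depth are illustrative, not part of any specific tool:

```python
# Sketch: verify a candidate RAID-5 stripe size against member images by
# checking that the XOR of all members is zero for each stripe-sized chunk.
# Works for any parity rotation, since parity position does not change the XOR.
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR across equally sized buffers."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def parity_consistent(image_paths: list[str], stripe_size: int,
                      stripes_to_check: int = 64) -> bool:
    """Return True if XOR across all members is zero for the sampled stripes."""
    handles = [open(p, "rb") for p in image_paths]
    try:
        for _ in range(stripes_to_check):
            chunks = [h.read(stripe_size) for h in handles]
            if any(len(c) != stripe_size for c in chunks):
                break  # ran off the end of an image; stop sampling
            if any(b != 0 for b in xor_blocks(chunks)):
                return False
        return True
    finally:
        for h in handles:
            h.close()
```

A wrong stripe-size guess fails this check almost immediately, which is why parameter detection runs against images rather than live hardware.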

Hypervisor-Specific Recovery

Each hypervisor platform stores virtual disks differently. The sections below cover platform-specific formats, failure modes, and recovery details.

Virtual Disk Formats and Failure Modes

Each virtual disk format has a distinct on-disk structure. The format determines what metadata must survive (or be reconstructed) for recovery to succeed.

VMDK (VMware ESXi / Workstation)

A VMDK typically consists of two files: a text-based descriptor file (.vmdk) containing geometry, adapter type, and extent references, and a flat data file (-flat.vmdk) containing the raw virtual disk contents. Monolithic flat VMDKs store everything in a single extent. Split VMDKs fragment the data into 2GB extent files (-s001.vmdk through -sNNN.vmdk).

Snapshot chains add delta disks (-delta.vmdk on VMFS5, -sesparse.vmdk on VMFS6). Each delta contains a grain directory and grain tables mapping changed blocks relative to the parent. The descriptor's CID field must match the parentCID of each child delta. When an ESXi host crashes during a "Delete All Snapshots" operation, orphaned deltas disconnect from the .vmsd file and the CID chain breaks. We reconstruct the chain by reading grain tables, determining the actual write sequence, and recalculating CID/parentCID values.
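The CID/parentCID relationship described above can be checked mechanically across a set of descriptor files. A small sketch, assuming the descriptor texts are supplied in base-to-newest order (the field names come from the VMDK descriptor format; the ordering input is an assumption a real job verifies from grain-table contents):

```python
# Sketch: find CID/parentCID breaks in a VMware snapshot chain.
# Input is a list of descriptor-file texts ordered base -> newest delta.
import re

CID_RE = re.compile(r'^CID\s*=\s*([0-9a-fA-F]{8})', re.M)
PARENT_RE = re.compile(r'^parentCID\s*=\s*([0-9a-fA-F]{8})', re.M)

def chain_breaks(descriptors: list[str]) -> list[int]:
    """Return indices of children whose parentCID does not match
    the CID of the descriptor immediately before them."""
    breaks = []
    for i in range(1, len(descriptors)):
        parent_cid = CID_RE.search(descriptors[i - 1]).group(1)
        child_parent = PARENT_RE.search(descriptors[i]).group(1)
        if child_parent.lower() != parent_cid.lower():
            breaks.append(i)
    return breaks
```

An empty result means the chain is internally consistent; any returned index marks the delta whose descriptor must be rewritten after the true lineage is confirmed.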

If the descriptor file is destroyed entirely, the flat file becomes headless. We locate the flat extent boundaries on the VMFS volume, calculate geometry (cylinders, heads, sectors) from the file size, and rebuild the descriptor manually.
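Rebuilding a descriptor for a headless flat file follows directly from the file size. A hedged sketch — the 255-head / 63-sectors-per-track geometry and lsilogic adapter are common VMware defaults, and the CID values shown are fresh-disk placeholders a real rebuild would adjust to match any surviving snapshot chain:

```python
# Sketch: regenerate a minimal VMDK descriptor for a headless -flat.vmdk.
import os

SECTOR = 512

def rebuild_descriptor(flat_path: str, adapter: str = "lsilogic") -> str:
    total_sectors = os.path.getsize(flat_path) // SECTOR
    heads, spt = 255, 63                      # assumed default geometry
    cylinders = total_sectors // (heads * spt)
    return "\n".join([
        "# Disk DescriptorFile",
        "version=1",
        "CID=fffffffe",                       # placeholder content ID
        "parentCID=ffffffff",                 # no parent (base disk)
        'createType="vmfs"',
        "",
        "# Extent description",
        f'RW {total_sectors} VMFS "{os.path.basename(flat_path)}"',
        "",
        "# The Disk Data Base",
        f'ddb.geometry.cylinders = "{cylinders}"',
        f'ddb.geometry.heads = "{heads}"',
        f'ddb.geometry.sectors = "{spt}"',
        f'ddb.adapterType = "{adapter}"',
        "",
    ])
```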

VHDX (Microsoft Hyper-V)

VHDX replaced the legacy VHD format in Windows Server 2012. It supports virtual disks up to 64 TB and uses a structured layout: a file type identifier, two redundant header copies (at 64 KB and 128 KB offsets), a region table pointing to the BAT (Block Allocation Table) and metadata regions, and a replay log for crash consistency.
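The redundant header copies are the first thing a recovery pass examines: per the VHDX specification, the copy with a valid "head" signature and the higher sequence number is current. A stdlib-only sketch — a complete check also validates each header's CRC32C checksum field, which is omitted here:

```python
# Sketch: pick the live VHDX header from the two redundant copies
# at 64 KiB and 128 KiB. Layout per the MS-VHDX spec, little-endian:
# signature(4) "head", checksum(4), sequence_number(8), ...
import struct

HEADER_OFFSETS = (64 * 1024, 128 * 1024)

def pick_active_header(path: str):
    """Return (offset, sequence_number) of the current header, or None
    if neither copy carries a valid signature."""
    best = None
    with open(path, "rb") as f:
        for off in HEADER_OFFSETS:
            f.seek(off)
            raw = f.read(16)
            if len(raw) < 16 or raw[:4] != b"head":
                continue  # missing or damaged copy
            seq = struct.unpack_from("<Q", raw, 8)[0]
            if best is None or seq > best[1]:
                best = (off, seq)
    return best
```

If only one copy survives, recovery proceeds from it; if both are gone, the geometry has to be inferred from the payload block layout instead.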

Fixed VHDX files pre-allocate all blocks at creation. Dynamic VHDX files allocate blocks on demand as the guest writes data. Dynamic disks are vulnerable to BAT corruption if the underlying storage disconnects mid-write, because the BAT update and the payload block write are separate I/O operations. If the BAT points to an uninitialized block offset, the guest filesystem reads garbage.

Hyper-V checkpoints create AVHDX differencing disks. Each AVHDX has its own BAT mapping changed blocks relative to the parent. A failed checkpoint merge leaves orphaned AVHDX files that the VM configuration (VMCX) no longer tracks. We parse each differencing disk's BAT, determine the correct parent-child ordering by creation timestamp, and consolidate the writes into a single base VHDX.

qcow2 (Proxmox VE / KVM / OpenStack)

qcow2 (QEMU Copy-On-Write version 2) uses a two-level reference table system: L1 entries point to L2 tables, and L2 entries point to data clusters. This indirection allows sparse allocation, internal snapshots, and backing file chains. A separate reference count table tracks cluster usage for copy-on-write operations.

Proxmox VE environments using cache=none bypass the host page cache, sending writes directly to the storage backend. If the storage loses power during a metadata commit, the L1/L2 tables and reference counts can become inconsistent. This is a torn write: the data cluster was written but the L2 entry still points to the old location (or to nothing). We scan the qcow2 file for valid cluster boundaries, rebuild the L1/L2 mapping tables from discovered data clusters, and recalculate reference counts.
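Before any table reconstruction, the fixed qcow2 header fields are parsed and sanity-checked; inconsistencies here confirm the torn-write diagnosis. A sketch following the field offsets in the QEMU qcow2 specification (all integers big-endian) — the consistency checks shown are illustrative of what a recovery pass verifies:

```python
# Sketch: parse and sanity-check the fixed qcow2 v2/v3 header fields.
# Offsets per the QEMU qcow2 spec; all integers are big-endian.
import struct

def read_qcow2_header(path: str) -> dict:
    with open(path, "rb") as f:
        raw = f.read(72)
    magic, version = struct.unpack_from(">4sI", raw, 0)
    if magic != b"QFI\xfb":
        raise ValueError("not a qcow2 image (bad magic)")
    cluster_bits, = struct.unpack_from(">I", raw, 20)
    size, = struct.unpack_from(">Q", raw, 24)
    l1_size, = struct.unpack_from(">I", raw, 36)
    l1_table_offset, = struct.unpack_from(">Q", raw, 40)
    cluster_size = 1 << cluster_bits
    hdr = {
        "version": version,
        "cluster_size": cluster_size,
        "virtual_size": size,
        "l1_size": l1_size,
        "l1_table_offset": l1_table_offset,
    }
    # Checks a recovery pass relies on: legal cluster size (512 B - 2 MiB)
    # and an L1 table large enough to map the whole virtual disk.
    entries_per_l2 = cluster_size // 8          # 8-byte L2 entries
    min_l1 = -(-size // (cluster_size * entries_per_l2))  # ceiling division
    hdr["l1_covers_disk"] = l1_size >= min_l1
    hdr["cluster_size_valid"] = 9 <= cluster_bits <= 21
    return hdr
```

A header that fails these checks is not trusted; the L1/L2 mapping is instead rebuilt from the data clusters themselves, as described above.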

Backing file chains (used for Proxmox linked clones) add another failure dimension. If the base image is on a different storage backend than the overlay, a failure on either backend breaks the chain. Both the base and overlay must be recovered and reconnected for a complete VM image.

SAN-Backed Datastore Failures

Enterprise VM environments typically store datastores on SAN LUNs presented over Fibre Channel or iSCSI. When the SAN controller fails, the array degrades, or multiple drives in the SAN shelf fail simultaneously, every VM on every datastore hosted by that LUN goes offline.

Recovery requires imaging the individual drives from the SAN shelf (not the LUN), reconstructing the RAID topology from the SAN controller's on-disk metadata, and then parsing the VMFS, NTFS, or ZFS filesystem on the reconstructed LUN to locate the virtual disk files. The SAN controller itself is not needed; the metadata is on the drives.

For SAN environments using SSD caching tiers (read cache or write-back cache), the cache drive must also be imaged. Write-back cache drives may contain committed writes that never reached the capacity tier. Losing the cache drive in this scenario means losing those pending writes permanently.

TRIM, UNMAP, and SSD-Backed VM Storage

VMFS6 enables automatic UNMAP by default, periodically issuing SCSI UNMAP commands to the underlying storage for deleted blocks. Modern Hyper-V environments on Windows Server 2016+ pass TRIM commands from the guest through ReFS/NTFS to the physical SSDs. Proxmox with ZFS also supports autotrim.

If the SAN or local SSD controller has executed TRIM on the blocks that held a deleted virtual disk, the controller marks those blocks as no longer needed. Garbage collection then erases the underlying NAND pages. Recovery of those blocks is not possible at any price point.

If you suspect a VM was accidentally deleted from an SSD-backed datastore, power down the storage immediately. Every second the storage remains online gives the SSD controller more time to execute pending TRIM operations and run garbage collection.

Physical vs. Logical Failure Domains

VM recovery splits into two distinct failure domains. Understanding which domain your failure falls into determines the recovery approach and cost.

Physical (Hardware)
  Failure examples: Drive head crash, motor seizure, PCB failure, SAS/SATA interface fault, SSD controller failure
  Recovery approach: Write-blocked imaging through PC-3000, head swaps in clean bench, firmware repair
  Tools: PC-3000, DeepSpar, 0.02 µm ULPA clean bench

Logical (Software)
  Failure examples: VMFS corruption, VMDK descriptor loss, VHDX BAT damage, qcow2 L1/L2 table corruption, broken snapshot chain
  Recovery approach: Host filesystem parsing, virtual disk metadata reconstruction, snapshot chain consolidation
  Tools: PC-3000 RAID Edition, hex analysis, custom parsing tools

RAID Rebuilds and VM Recovery

Rebuilding a degraded RAID array containing physically failing drives is not data recovery. It is a destructive process that forces the controller to read every sector from every surviving member and recalculate parity. If a second drive develops read errors during the rebuild (common with aged drives in the same batch), the controller drops the array entirely.

For RAID arrays hosting VM datastores with mechanically failing members: power down the server, label each drive with its bay position, and ship the drives to us. We image each drive individually through PC-3000, replacing heads on failed members as needed, then reconstruct the array virtually from the images. The original drives are never written to.
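Once every member is imaged, the virtual reconstruction interleaves the data chunks in software, skipping the rotating parity. A sketch for left-symmetric RAID-5 (the mdadm default layout) — the member order, stripe size, and file names are assumptions a real job first confirms against on-disk metadata and parity checks:

```python
# Sketch: virtually reassemble a left-symmetric RAID-5 from per-member
# image files into a single flat data image. Parity sits on disk
# (N-1 - row mod N); data chunks start on the disk after parity, wrapping.
import os

def reassemble_raid5(member_paths, stripe_size, out_path, rows=None):
    n = len(member_paths)
    members = [open(p, "rb") for p in member_paths]
    try:
        total_rows = min(os.path.getsize(p) // stripe_size
                         for p in member_paths)
        if rows is not None:
            total_rows = min(rows, total_rows)
        with open(out_path, "wb") as out:
            for r in range(total_rows):
                parity_disk = (n - 1) - (r % n)
                chunks = []
                for m in members:
                    m.seek(r * stripe_size)
                    chunks.append(m.read(stripe_size))
                # left-symmetric: data begins on the disk after parity
                for i in range(1, n):
                    out.write(chunks[(parity_disk + i) % n])
    finally:
        for m in members:
            m.close()
```

The original drives are never touched: every read in this step comes from the sector-by-sector clones.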

SMR Drives in Virtualization Environments

Shingled Magnetic Recording (SMR) drives overlap write tracks to increase density. Random writes require read-modify-write cycles across entire bands, which causes severe performance degradation under sustained random I/O. VM workloads produce high random I/O.

If SMR drives were used in a RAID or ZFS pool hosting VMs, rebuild times extend from hours to days. The extended stress on surviving members during a rebuild increases the probability of cascading failures. Recovery of arrays built on SMR drives requires imaging each member with extended timeout configurations to handle the slow random-read performance inherent to SMR architectures.

Virtual Machine Recovery Pricing

Virtual machine recovery pricing is based on each drive's physical condition. Per-drive pricing follows the same five published tiers used for all drive recoveries. Multi-drive arrays involve additional reconstruction work to detect RAID parameters, extract virtual disks, and consolidate snapshots. No data recovered means no charge.

Simple Copy — Low complexity — $100
  Your drive works; you just need the data moved off it.
  Functional drive; data transfer to new media.
  Rush available: +$100.

File System Recovery — Low complexity — From $250
  Your drive isn't recognized by your computer, but it's not making unusual sounds.
  File system corruption; accessible with professional recovery software but not by the OS.
  Starting price; final price depends on complexity.

Firmware Repair — Medium complexity, PC-3000 required — $600–$900
  Your drive is completely inaccessible. It may be detected but shows the wrong size or won't respond.
  Firmware corruption: ROM, modules, or translator tables corrupted; requires PC-3000 terminal access.
  Standard drives at the lower end; high-density drives at the higher end.

Head Swap — High complexity, clean bench surgery — $1,200–$1,500 (50% deposit)
  Your drive is clicking, beeping, or won't spin. The internal read/write heads have failed.
  Head stack assembly failure; transplanting heads from a matching donor drive on a clean bench.
  50% deposit required. Donor parts are consumed in the repair.

Surface / Platter Damage — High complexity, clean bench surgery — $2,000 (50% deposit)
  Your drive was dropped, has visible damage, or a head crash scraped the platters.
  Platter scoring or contamination; requires platter cleaning and head swap.
  50% deposit required. Donor parts are consumed in the repair. Most difficult recovery type.

Hardware Repair vs. Software Locks

Our "no data, no fee" policy applies to hardware recovery. We do not bill for unsuccessful physical repairs. If we replace a hard drive read/write head assembly or repair a liquid-damaged logic board to a bootable state, the hardware repair is complete and standard rates apply. If data remains inaccessible due to user-configured software locks, a forgotten passcode, or a remote wipe command, the physical repair is still billable. We cannot bypass user encryption or activation locks.

All tiers: Free evaluation and firm quote before any paid work. No data, no fee on simple copy, file system, and firmware tiers. Head swap and surface damage require a 50% deposit because donor parts are consumed in the attempt.

Target drive: The destination drive we copy recovered data onto. You can supply your own or we provide one at cost. For ultra-high-capacity drives (20TB and above), the target drive costs $400 or more due to the large media required. All prices are plus applicable tax.

Data Recovery Standards & Verification

Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to make sure your hard drive is handled safely and properly. This approach allows us to serve clients nationwide with consistent technical standards.

Open-drive work is performed in a ULPA-filtered laminar-flow bench validated to a 0.02 µm particle standard, with air quality verified using TSI P-Trak instrumentation.

Transparent History

Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.

Media Coverage

Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.

Aligned Incentives

Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.


Louis Rossmann

Louis Rossmann's well-trained staff review our lab protocols to ensure technical accuracy and honest service. Since 2008, his focus has been on clear technical communication and accurate diagnostics rather than sales-driven explanations.

We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.

See our clean bench validation data and particle test video

Virtual Machine Recovery FAQ

Which virtual disk formats can you recover?
We recover VMDK (VMware ESXi and Workstation), VHD and VHDX (Microsoft Hyper-V), QCOW2 (KVM, Proxmox VE, OpenStack), VDI (VirtualBox), and raw disk images. The virtual disk format determines the metadata structures we parse, but the underlying physical recovery process is the same: image the failed storage, reconstruct the array or volume, then extract the VM disk files from the host filesystem.
What causes a VMDK CID mismatch and can you fix it?
A Content ID (CID) mismatch occurs when a VMware snapshot chain breaks. This typically happens if the ESXi host crashes or loses storage connectivity during a snapshot commit, causing the parentCID of the delta disk to lose synchronization with the base flat file. We read the grain directory and grain tables from each delta, verify the actual data lineage, and reconstruct the descriptor file with correct CID/parentCID references.
Can you recover a dynamically expanding VHDX that became corrupted?
Yes. Dynamic VHDX files store data in blocks mapped by a Block Allocation Table (BAT). If the VHDX headers, BAT, or log entries are corrupted (common during power loss or storage disconnection), we parse the VHDX payload blocks from the raw disk image and reconstruct the BAT by scanning block signatures. If both redundant headers are destroyed, we calculate the virtual disk geometry from the payload block layout.
How do you handle qcow2 corruption on Proxmox VE?
Proxmox stores KVM virtual machine disks as qcow2 files on ZFS, LVM-thin, or Ceph storage backends. Power loss during write operations with cache=none can produce torn writes that corrupt the qcow2 L1/L2 mapping tables and reference counts. We reconstruct the qcow2 metadata by scanning the file for cluster boundaries and rebuilding the mapping tables from the data clusters themselves.
Do I need to send the entire server or just the drives?
Send the drives. We do not need the server chassis, controller card, or cabling. For RAID arrays, label each drive with its slot position (bay 0, bay 1, etc.) before removing them. We extract RAID metadata (DDF, PERC, Smart Array, mdadm superblocks, ZFS labels) from the drives themselves and reconstruct the array offline using PC-3000 RAID Edition.
Can deleted VMs be recovered from SSD-backed datastores?
It depends on whether TRIM/UNMAP was active. VMFS6 enables automatic UNMAP by default, and modern Hyper-V environments pass TRIM commands through to underlying storage. If the SAN or local SSD controller has already executed TRIM on the blocks that held the deleted virtual disk, the controller marks those blocks as no longer needed and garbage collection erases the NAND pages. Recovery is not possible. If TRIM was disabled or has not yet executed, recovery may still be feasible. Power down the storage immediately to prevent garbage collection.
How much does virtual machine data recovery cost?
Pricing depends on the physical condition of the drives hosting the datastore. Per-drive pricing starts at $100 for simple copies, $250 for file system recovery, $600 to $900 for firmware repair, $1,200 to $1,500 for head swaps, and $2,000 for platter damage. These are the same five published tiers we use for all drive recoveries. Multi-drive RAID arrays involve additional reconstruction work. No data recovered means no charge.

Ready to recover your virtual machines?

Free evaluation. No data = no charge. Ship your drives from anywhere in the U.S.