Enterprise Virtualization Recovery
VMware ESXi Data Recovery
We recover VMFS datastores from failed RAID arrays, repair broken snapshot chains, and extract individual .vmdk virtual disks from corrupted ESXi hosts and vSAN clusters. Free evaluation. No data = no charge.

How VMware ESXi Datastores Fail and How We Recover Them
VMware ESXi stores virtual machines on VMFS (Virtual Machine File System) datastores backed by RAID arrays. When the underlying array degrades, the ESXi host loses access to the VMFS volume and all VMs on it go offline. Recovery requires imaging the RAID member drives, reconstructing the array offline, and parsing VMFS metadata to extract each .vmdk virtual disk file.
VMFS is a clustered filesystem designed for shared storage access across multiple ESXi hosts. It uses on-disk locking mechanisms (heartbeat regions and ATS primitives on VMFS6) to coordinate concurrent access. When a RAID failure corrupts the volume header or allocation bitmap, the lock state becomes inconsistent and ESXi refuses to mount the datastore. Standard VMware tools (vmkfstools, vscsiStats) cannot repair a datastore with underlying media errors. The data must be recovered at the physical layer first.
VMFS Metadata Architecture and Failure Points
Understanding VMFS on-disk layout is essential for targeted recovery. Both VMFS5 and VMFS6 share a common structural pattern, but differ in block allocation, UNMAP behavior, and snapshot formats.
VMFS5 On-Disk Layout
- Volume header at LBA 0 contains the VMFS superblock, including UUID, version, and volume label
- Heartbeat region at offset 0x100000 (1MB); each ESXi host writes its UUID here to claim lock ownership
- Resource bitmap tracks 1MB block allocation across the volume; corruption here causes "no space" errors on a half-empty datastore
- File descriptor heap stores inode-like entries for .vmdk files, including pointer block addresses for data extents
- Sub-block allocation (8KB granularity) handles small files like .vmx config files and descriptor VMDKs
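To make the resource-bitmap failure mode above concrete, here is a minimal sketch (the one-bit-per-1MB-block encoding is a simplification for illustration, not the exact on-disk VMFS format):

```python
def allocated_mb(bitmap: bytes) -> int:
    """Count allocated 1MB file blocks in a resource bitmap.

    Simplified model: one bit per 1MB block. A corrupted bitmap that
    reads back as all 0xFF makes every block look allocated, which is
    why a half-empty datastore can throw "no space" errors.
    """
    return sum(bin(byte).count("1") for byte in bitmap)

# Two bytes tracking 16 blocks, 12 bits set -> 12MB reported allocated
print(allocated_mb(bytes([0xFF, 0x0F])))  # -> 12
```

If the bitmap region is damaged and reads as solid 0xFF, every block appears in use regardless of the real file layout, matching the symptom described above.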
VMFS6 Changes
- Automatic UNMAP (space reclamation) runs in the background, which can zero-fill previously allocated blocks on thin-provisioned LUNs
- SE Sparse (Space Efficient Sparse) snapshot format replaces vmfsSparse by default; uses grain directories and grain tables with a default 4KB grain size for block-level change tracking
- Native 512e and 4Kn drive support; VMFS6 aligns I/O to physical sector boundaries, affecting how data is laid out on AF drives
- GPT-based partition layout on the backing LUN (VMFS5 used MBR)
- ATS (Atomic Test and Set) VAAI primitives replace some SCSI reservation locks; ATS misfire during power loss can leave orphaned locks
When a RAID member fails mid-write, the VMFS journal may contain an incomplete transaction. ESXi attempts to replay this journal on mount. If the journal references sectors that are now unreadable (because the RAID array is degraded), the mount fails entirely. Our approach bypasses the ESXi mount process: we parse VMFS structures directly from the raw RAID image and extract .vmdk files by following pointer block chains, regardless of journal state.
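The journal-bypass idea can be sketched in a few lines (Python; `pointer_blocks` stands in for addresses already parsed from a file descriptor — the real pointer-block format is not reproduced here). The point is that data blocks are read straight from the raw image, so journal state never matters:

```python
import io

def read_extent(image, pointer_blocks, block_size=1024 * 1024):
    """Yield a file's data blocks directly from a raw RAID image.

    `pointer_blocks` is a hypothetical pre-parsed list of block
    addresses taken from the file descriptor; the VMFS journal is
    never consulted, so an unreplayable journal cannot block
    extraction.
    """
    for addr in pointer_blocks:
        image.seek(addr * block_size)
        yield image.read(block_size)

# Toy image with 4-byte "blocks" for demonstration
img = io.BytesIO(b"AAAABBBBCCCC")
print(b"".join(read_extent(img, [2, 0], block_size=4)))  # -> b'CCCCAAAA'
```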
ESXi Snapshot Chain Reconstruction
Snapshot chains in ESXi consist of a base .vmdk and one or more delta files (-delta.vmdk using vmfsSparse on VMFS5, -sesparse.vmdk by default on VMFS6). Each delta records changed blocks relative to its parent. When the chain breaks, the VM cannot power on and standard consolidation fails.
How Snapshot Chains Break
- CID mismatch: Each VMDK descriptor contains a Content ID (CID) and a Parent Content ID (parentCID). When a snapshot is created, the new delta's parentCID must match the parent's CID. ESXi crashes or storage disconnects during snapshot creation can leave these values out of sync.
- Orphaned deltas: Failed "Delete All Snapshots" operations can leave delta files on disk with no corresponding entry in the VM's .vmsd snapshot descriptor file. The snapshot manager no longer tracks these deltas, but the VM still references them in its disk chain.
- Corrupted grain tables: SE sparse deltas on VMFS6 use grain directories and grain tables to map changed sectors. A power loss during a grain table update can corrupt the mapping, causing reads to return incorrect data or I/O errors.
We reconstruct broken chains by reading the grain directory from each delta, determining the correct parent-child ordering from creation timestamps and CID values, and manually consolidating the changed blocks back into the base extent. The result is a single flat .vmdk representing the VM's most recent consistent state.
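The parent-child ordering step can be sketched as follows, assuming a single unbranched chain. VMDK descriptors are plain text with CID= and parentCID= lines, and the base disk carries parentCID=ffffffff:

```python
def parse_descriptor(text: str) -> dict:
    """Pull key=value fields (CID, parentCID, ...) out of a
    plain-text VMDK descriptor."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line and not line.startswith("#"):
            key, _, value = line.partition("=")
            fields[key.strip()] = value.strip().strip('"')
    return fields

def order_chain(descriptors):
    """Order a snapshot chain base-first by matching each delta's
    parentCID to its parent's CID. Assumes a linear, unbranched
    chain; a mismatch simply ends the walk early."""
    by_parent = {parse_descriptor(d)["parentCID"]: d for d in descriptors}
    chain, key = [], "ffffffff"
    while key in by_parent:
        desc = by_parent.pop(key)
        chain.append(desc)
        key = parse_descriptor(desc)["CID"]
    return chain

base = "CID=aaaa\nparentCID=ffffffff"
d1 = "CID=bbbb\nparentCID=aaaa"
d2 = "CID=cccc\nparentCID=bbbb"
print(order_chain([d2, base, d1]) == [base, d1, d2])  # -> True
```

A chain that walks to completion confirms the CID linkage; deltas left over after the walk are the orphaned or mismatched ones that need manual repair.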
vSAN Distributed Datastore Recovery
VMware vSAN aggregates local SSDs and HDDs from multiple ESXi hosts into a single distributed datastore. VM storage objects are split into components and distributed across hosts according to a storage policy. FTT=1 defaults to mirroring but can use RAID-5 erasure coding; FTT=2 defaults to triple mirroring but can use RAID-6 erasure coding, depending on the failure tolerance method (FTM) setting. Multi-node failures or CMMDS metadata corruption can take the entire vSAN datastore offline.
- CMMDS reconstruction: The Cluster Monitoring, Membership, and Directory Service maintains a distributed database of all object locations across the cluster. When CMMDS becomes inconsistent (typically after simultaneous host failures), we rebuild the object map by scanning each host's capacity disks for object headers and component metadata.
- DOM object reassembly: The Distributed Object Manager splits each .vmdk into components (up to 255GB per component on most vSAN versions). Each component is a RAID-1 mirror or RAID-5/6 stripe across disk groups. We locate each component on the physical disks, reconstruct the stripe or mirror, and reassemble the full .vmdk from its component pieces.
- Disk group structure: Each vSAN disk group contains one SSD cache tier and up to seven HDD/SSD capacity devices. The SSD cache provides a write buffer (and read cache in hybrid configurations); deduplication metadata, when enabled, resides on the capacity tier. We image the capacity devices (where persistent data resides) and use the cache device to resolve any in-flight writes.
- Witness and stretched clusters: Two-node vSAN configurations use a witness host for quorum. If the witness becomes unavailable simultaneously with a data node, the remaining node cannot confirm object ownership. We bypass the quorum requirement by working directly with the physical disk images.
Common ESXi Failure Scenarios We Handle
RAID Array Degradation
PERC or Smart Array controller detects multiple failed members. ESXi host loses access to the VMFS LUN. All VMs on the datastore go offline simultaneously.
VMFS Metadata Corruption
Power loss during metadata commit corrupts the resource bitmap or file descriptor heap. ESXi refuses to mount the datastore with "cannot open the disk" or "no such file" errors.
Failed Snapshot Consolidation
"Delete All Snapshots" task fails, leaving orphaned delta files. The VM runs on an increasingly fragmented chain until the datastore fills or performance degrades to zero.
ESXi Boot Failure / PSOD
ESXi host fails to boot after firmware update, boot media corruption, or a Purple Screen of Death (PSOD) kernel panic. VMs are intact on the VMFS datastore but inaccessible without a running hypervisor.
vSAN Multi-Node Failure
Power event takes down multiple vSAN hosts simultaneously. Object components become stale across the cluster and vSAN cannot rebuild without manual intervention.
Accidental VM Deletion
VM removed from inventory or .vmdk files deleted from the datastore browser. VMFS does not immediately zero-fill freed blocks, so recovery is possible if no new writes have overwritten the extents.
Recovery Methodology for IT Administrators
This section details the low-level procedures we use. If you are evaluating our technical capability, this is how the work gets done.
1. RAID Member Imaging with Sector-Level Granularity
Each member drive is imaged through PC-3000 using SAS HBAs for SAS drives or NVMe adapters for PCIe SSDs. The imaging process captures every addressable LBA, including those beyond the standard ATA/SCSI command set boundary (service area, G-list entries). For drives with bad sectors, we configure PC-3000 head maps to skip damaged heads on initial passes and return to them with aggressive retry parameters after capturing all healthy sectors. DeepSpar Disk Imager provides hardware-level timeout control for drives that lock up during reads.
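The multi-pass schedule above can be sketched as pseudologic in Python. The `read_lba(lba, attempts)` interface here is a hypothetical stand-in; real head-map and timeout control lives in the PC-3000/DeepSpar hardware:

```python
def image_in_passes(drive, dest, nsectors, retry_schedule=(1, 10)):
    """Two-pass imaging sketch: pass 1 gives each sector one quick
    attempt and skips failures; pass 2 returns to the skipped
    sectors with a higher retry budget. `drive.read_lba(lba,
    attempts)` is a hypothetical imager interface returning bytes
    on success or None on failure."""
    pending = set(range(nsectors))
    for attempts in retry_schedule:
        failed = set()
        for lba in sorted(pending):
            data = drive.read_lba(lba, attempts=attempts)
            if data is None:
                failed.add(lba)
            else:
                dest[lba] = data
        pending = failed
    return pending  # sectors that never yielded data
```

Capturing all healthy sectors first maximizes recovered data before the aggressive retries stress a failing head.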
2. Controller Metadata Extraction
PERC controllers store their DDF (Disk Data Format) metadata in the last sectors of each member drive. This metadata block contains the virtual disk configuration: RAID level, stripe size, member ordering, rebuild checkpoint, and consistency state. Smart Array controllers use a similar reserved area but with an HP-proprietary format. PC-3000 RAID Edition reads these metadata blocks and uses them to reconstruct the virtual disk layout without needing the original controller hardware. For arrays where the metadata has been overwritten or zeroed (firmware flash gone wrong), we fall back to brute-force parameter detection: testing stripe size permutations (64KB, 128KB, 256KB, 512KB, 1MB) and member orderings against known filesystem signatures.
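The brute-force fallback can be sketched as a parameter sweep (Python; a toy RAID-0 interleave over in-memory member images — real arrays need parity-rotation handling for RAID 5/6 and much larger test windows, and the byte strings below are illustrative):

```python
from itertools import permutations

def reassemble_raid0(members, order, stripe, nstripes):
    """Interleave member images round-robin under a candidate layout."""
    out = b""
    for i in range(nstripes):
        member = members[order[i % len(order)]]
        row = i // len(order)
        out += member[row * stripe:(row + 1) * stripe]
    return out

def detect_layout(members, signature, sig_offset, stripe_sizes):
    """Try every member ordering and stripe size until a known
    filesystem signature lands at its expected offset in the
    reassembled image. Returns (order, stripe) or None."""
    for stripe in stripe_sizes:
        for order in permutations(range(len(members))):
            image = reassemble_raid0(members, order, stripe,
                                     nstripes=len(members) * 2)
            if image[sig_offset:sig_offset + len(signature)] == signature:
                return order, stripe
    return None

# Toy 2-member array with 4-byte stripes; "BBBB" must appear at offset 4
print(detect_layout([b"BBBBDDDD", b"AAAACCCC"], b"BBBB", 4, [2, 4]))
# -> ((1, 0), 4)
```

In practice the signatures tested are filesystem and VMFS magic values at known offsets, which is why the sweep converges quickly despite the combinatorics.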
3. VMFS Parsing and VMDK Extraction
With the RAID image reconstructed, we parse the VMFS volume directly from the raw image. The process reads the superblock at LBA 0 to determine VMFS version, block size (always 1MB on VMFS5+), and total volume capacity. The file descriptor heap is scanned for entries matching .vmdk, .vmx, .nvram, and .vmsd file types. For each .vmdk, we read the descriptor file to determine whether it is a monolithic flat disk, a split sparse, or a snapshot delta. Flat extent data is located by following the pointer block chain from the file descriptor. The extracted .vmdk is verified by mounting it read-only and checking guest filesystem integrity (NTFS, ext4, XFS) with standard filesystem tools.
4. Hyper-V Coexistence
Environments migrated from Hyper-V to ESXi (or running both) may contain .vhdx files stored on VMFS datastores. We extract .vhdx files using the same VMFS parsing pipeline and process them separately. VHDX uses a 4KB log structure for crash consistency, and recovery follows the same image-first, parse-from-raw methodology. For broader server recovery needs including Hyper-V standalone environments, see our main server recovery page.
VMDK File Format and Descriptor Reconstruction
VMware virtual disks consist of two parts: a descriptor file (plain text, typically under 1KB) and one or more data extents containing the actual disk content. When the descriptor is lost or corrupted, the data extent is an opaque binary blob that ESXi cannot address. Reconstruction requires understanding the VMDK format at the byte level.
- monolithicFlat
- A single contiguous data extent file (-flat.vmdk) paired with a descriptor. Standard format for thick-provisioned VM disks on VMFS. The descriptor's RW line defines the extent size in 512-byte sectors.
- twoGbMaxExtentSparse
- Splits the virtual disk into multiple 2GB sparse extents. Used when exporting VMs for transport (OVF/OVA). Each extent has its own grain table tracking allocated blocks. Recovery requires reassembling the extents in the correct sequence defined by the descriptor.
- vmfsSparse (VMFS5 snapshots)
- Delta files (-delta.vmdk) used by VMFS5 for snapshot change tracking. Each delta records changed 512-byte sectors relative to its parent. The grain directory maps changed sectors to their physical location in the delta file.
- seSparse (VMFS6 snapshots)
- Space-Efficient Sparse format (-sesparse.vmdk), default on VMFS6. Uses 4KB grain size for finer-grained change tracking and supports space reclamation. The grain directory and grain table structure is more complex than vmfsSparse, with a two-level lookup (directory → table → grain).
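Since the descriptor is plain text, the format variants above can be distinguished by reading its createType line. A minimal sketch (the createType strings are based on common descriptor contents and may vary by ESXi version, so treat the mapping as illustrative):

```python
VMDK_TYPES = {
    # createType values mapped to the formats described above;
    # exact strings can vary by ESXi version, so this table is
    # illustrative rather than exhaustive.
    "vmfs": "thick flat disk on VMFS",
    "monolithicFlat": "single flat extent",
    "twoGbMaxExtentSparse": "2GB split sparse extents",
    "vmfsSparse": "VMFS5 snapshot delta",
    "seSparse": "VMFS6 snapshot delta",
}

def classify_vmdk(descriptor_text: str) -> str:
    """Read the createType line from a plain-text VMDK descriptor
    and name the on-disk format it implies."""
    for line in descriptor_text.splitlines():
        if line.strip().startswith("createType"):
            ctype = line.split("=", 1)[1].strip().strip('"')
            return VMDK_TYPES.get(ctype, f"unknown ({ctype})")
    return "descriptor missing createType"

print(classify_vmdk('createType="vmfs"'))  # -> thick flat disk on VMFS
```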
Descriptor File Reconstruction
When the descriptor file is missing but the flat extent remains on the VMFS volume, we reconstruct it by calculating the extent boundaries. For a 40GB monolithicFlat disk, the sector count is 40 × 1,073,741,824 / 512 = 83,886,080 sectors. The reconstructed descriptor maps this extent with the correct createType, CID, and parent references. For virtual machine disk recovery involving snapshot chains, the CID and parentCID values in each delta descriptor must match; a single mismatch renders the chain unreadable.
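The sector and geometry arithmetic can be checked in a few lines of Python; the 255-head, 63-sectors-per-track values are the legacy CHS convention used by the ddb.geometry lines in a descriptor:

```python
SECTOR = 512

def extent_sectors(size_gb: int) -> int:
    """Sector count for the descriptor's RW line: bytes / 512."""
    return size_gb * 1024**3 // SECTOR

def chs_geometry(sectors: int, heads: int = 255, spt: int = 63):
    """Legacy CHS values for the ddb.geometry.* descriptor lines."""
    return sectors // (heads * spt), heads, spt

print(extent_sectors(40))         # -> 83886080
print(chs_geometry(83886080)[0])  # -> 5221
```

Both results match the worked 40GB example: 83,886,080 sectors and 5,221 cylinders at 255 heads and 63 sectors per track.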
# Reconstructed VMDK descriptor (monolithicFlat)
version=1
CID=d0f5e5f6
parentCID=ffffffff
createType="vmfs"
# Extent description
RW 83886080 VMFS "server-flat.vmdk" 0
# Disk Data Base
ddb.virtualHWVersion = "21"
ddb.geometry.cylinders = "5221"
ddb.geometry.heads = "255"
ddb.geometry.sectors = "63"
ddb.adapterType = "lsilogic"

ESXi 8.0 and vSAN 8 Storage Architecture Changes
ESXi 8.0 introduced structural changes to the boot and storage layout that create new failure patterns for IT administrators to manage. These changes affect both standalone ESXi hosts and vSAN clusters.
ESX-OSData Partition Wear
ESXi 8.0 replaced the legacy /scratch partition with ESX-OSData, which stores VMware Tools images, host configuration, core dumps, and diagnostic logs. This partition generates sustained write activity. Hosts booting from SD cards or USB devices (common in ESXi 6.x and 7.x deployments) experience rapid media wear because these devices were never designed for continuous write workloads. VMware deprecated SD/USB boot in ESXi 8.0 for this reason. When the boot device fails, the host goes offline but the VMFS datastores on separate storage remain recoverable.
vSAN 8 Persistent Disk Headers
When deactivating a vSAN 8 cluster (OSA or ESA) to repurpose drives for local VMFS datastores, the physical disks retain low-level vSAN partition metadata. ESXi flags these drives as "In-use by vSAN" and blocks VMFS initialization. The standard fix is esxcli vsan storage remove -s <naaid>, but if the vSAN cluster is already destroyed, the disk group membership cannot be cleanly removed through the CLI. We handle these drives by reading and clearing the partition headers at the byte level during the imaging process.
RAID rebuild warning: If the VMFS datastore sits on a RAID array where one or more members have failed, do not initiate a RAID rebuild. Rebuilding a degraded array forces parity recalculation across surviving members, and if any surviving drive has unread bad sectors, the rebuild injects corrupt parity data directly into the VMFS metadata and VMDK flat extents. Power down the server and contact us for evaluation.
VMware Recovery Pricing
VMware datastore recovery follows the same transparent pricing model as every other service: per-drive imaging based on each drive's condition. VMFS parsing, array reconstruction, and VMDK extraction are included in the per-drive fee at no extra charge. No data recovered means no charge.
| Service Tier | Price Range (Per Drive) | Description |
|---|---|---|
| Logical / Firmware Imaging | $250-$900 | Firmware module damage, SMART threshold failures, or filesystem corruption on individual array members. |
| Mechanical (Head Swap / Motor) | $1,200-$1,500 (50% deposit) | Donor parts consumed during transplant. SAS drives require SAS-specific donors matched by model, firmware revision, and head count. |
No Data = No Charge: If we recover nothing from your VMware environment, you owe $0. Free evaluation, no obligation.
Enterprise competitors charge $5,000-$15,000 with opaque "emergency" surcharges. We publish our pricing because the work is the same regardless of what label gets put on the invoice.
We sign NDAs for corporate data recovery. All drives remain in our Austin lab under chain-of-custody documentation throughout the process. We are not HIPAA certified and do not sign BAAs, but we are willing to discuss your specific compliance requirements before work begins.
VMware ESXi Recovery: Common Questions
What causes VMFS datastore corruption and can it be recovered?
Can you fix a broken ESXi snapshot chain?
How do you recover data from a failed vSAN cluster?
Does the ESXi version affect recovery?
Can you recover thin-provisioned VMs that were deleted from the datastore?
How much does VMware datastore recovery cost?
Is it safe to run vmkfstools or filesystem checks on a corrupted VMFS datastore?
Why does ESXi report 'In-use by vSAN' when creating a new VMFS datastore?
Can you reconstruct a missing or corrupted VMDK descriptor file?
Need Recovery for Other Devices?
- PSOD diagnostics, MCE decoding, VMFS extraction after kernel panic
- Dell, HP, IBM enterprise servers
- Dell EMC, NetApp, HPE arrays
- RAID 0, 1, 5, 6, 10 arrays
- Synology, QNAP, Buffalo
- VMDK, VHD/VHDX, QCOW2 extraction
- NVMe and SATA SSDs
Data Recovery Standards & Verification
Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to make sure your hard drive is handled safely and properly. This approach allows us to serve clients nationwide with consistent technical standards.
Open-drive work is performed in a ULPA-filtered laminar-flow bench validated for particle counts down to 0.02 µm, verified with TSI P-Trak instrumentation.
Transparent History
Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.
Media Coverage
Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.
Aligned Incentives
Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.
Technical Oversight
Louis Rossmann
Louis Rossmann's well-trained staff review our lab protocols to ensure technical accuracy and honest service. Since 2008, his focus has been on clear technical communication and accurate diagnostics rather than sales-driven explanations.
We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.
See our clean bench validation data and particle test video

What Server Recovery Customers Say
“HIGHLIGHT & CONCLUSION ******Overall I'm having a good experience with this store because they have great customer services, best third party replacement parts, justify price for those replacement parts, short estimate waiting time to fix the device, 1 year warranty, and good prediction of pricing and the device life conditions whether it can fix it or not.”
“Didn't *fix* my issue but a great experience. Shipped a drive from an old NAS whose board had failed. Rossmann Repair wanted to go straight for data extraction (~$600-900). Did some research on my own and discovered the file table was Linux based and asked if they could take a look. They said that their decision still stands and would only go straight for data recovery.”
“I've been following the YouTube tutorials since my family and I were in India on business. My son spilled Geteraid on my keyboard and my computer wouldn't come on after I opened it and cleaned it, laying it upside down for a week. To make the story short I took my computer to the shop while I'm in New York on business and did charged me $45.00 for a rush assessment.”
Ready to recover your VMware environment?
Free evaluation. No data = no charge. Mail-in from anywhere in the U.S.