MongoDB Data Recovery
MongoDB stores data in BSON documents across WiredTiger collection files. When the underlying drive fails, mongod cannot start and standard repair tools cannot read the data directory. We image the failed drive using PC-3000, extract the WiredTiger files, rebuild the catalog metadata, and recover your collections. We handle standalone instances, replica sets, and sharded clusters. No data, no fee.

How WiredTiger Stores Data on Disk
Since MongoDB 3.2, WiredTiger is the default storage engine. Each collection is stored in a separate file (e.g., collection-0-123456789.wt), and each index gets its own file. The _mdb_catalog.wt file maps collection names to their on-disk filenames. The WiredTiger.wt metadata file and WiredTiger.turtle bootstrap file track the current checkpoint state.
WiredTiger uses multiversion concurrency control (MVCC) and writes through a journal before checkpointing to collection files. A checkpoint occurs approximately every 60 seconds or when 2GB of journal data has been written, whichever comes first. Between checkpoints, data exists only in the journal and in-memory cache.
This architecture creates specific failure modes:
- Metadata loss: If _mdb_catalog.wt is unreadable, MongoDB does not know which .wt file belongs to which collection. The data files are intact but unnamed.
- Checkpoint corruption: WiredTiger.turtle or WiredTiger.wt references a checkpoint that was partially written when the drive failed. MongoDB reports "WT_PANIC" and refuses to open the data directory.
- Journal gaps: The journal directory contains incomplete or corrupted journal files. MongoDB cannot replay recent writes, leaving collections in a state up to 60 seconds behind.
- Collection file corruption: Bad sectors on the platter hit a .wt file. Individual documents within the B-tree pages become unreadable, but surrounding pages remain intact.
Why mongod --repair Is Dangerous on a Failing Drive
The mongod --repair command reads every document from every collection, validates it, and rewrites the entire data directory. Documents that fail validation are discarded. This process requires the drive to be fully readable and writable.
On a drive with physical damage, --repair has two problems. First, it forces the drive to read every sector sequentially, including sectors in degraded areas. Each failed read attempt causes the heads to make repeated contact with damaged platter surfaces, expanding the area of damage. Second, --repair writes back to the same drive, overwriting data that a proper imaging process could have recovered using PC-3000's adaptive read parameters.
The MongoDB documentation itself warns that --repair should be used "only in cases where data is corrupted and cannot be fixed otherwise." When the corruption originates from physical media failure, fixing the drive first produces a more complete dataset than --repair can extract from degraded media.
Before running --repair or deleting journal files: Power down the server. Do not restart mongod. Each startup attempt forces a checkpoint recovery that writes to the data directory, overwriting data on damaged sectors. Ship the drives to us for forensic imaging first.
Our MongoDB Recovery Workflow
Drive assessment and forensic imaging
Evaluate the drive using PC-3000 diagnostics. If the drive is mechanically sound, we image it with write-blocking hardware. If heads have failed, we perform a head swap in the 0.02µm ULPA-filtered clean bench before imaging. The result is a complete sector-level clone on healthy media.
File system reconstruction and dbpath extraction
Mount the cloned image and extract the MongoDB data directory. If the file system (ext4, XFS, ZFS) is damaged, we parse the filesystem metadata directly to locate the WiredTiger files on disk. We recover the .wt collection files, the journal directory, and the WiredTiger metadata files.
WiredTiger metadata rebuild
Reconstruct _mdb_catalog.wt, WiredTiger.wt, and WiredTiger.turtle if damaged. Map each .wt file to its original collection name by parsing the B-tree page headers and document structure. Rebuild the checkpoint metadata so WiredTiger recognizes the data directory.
Oplog replay and collection validation
If journal files are intact, replay writes that occurred after the last checkpoint. For replica set members, the oplog (local.oplog.rs) can recover additional transactions. Validate each collection to identify documents with corrupted BSON structures.
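In practice, oplog replay is driven by tools like mongorestore with --oplogReplay. As a minimal illustration of the idea, the sketch below applies simplified oplog-style entries (insert, update, delete) to an in-memory collection. The entry shape (`op`, `o`, `o2`) mirrors the real oplog document fields, but `apply_oplog_entry` is a hypothetical helper that ignores timestamps, transactions, and command entries.

```python
# Illustrative sketch: applying simplified oplog entries to an in-memory
# collection keyed by _id. Real oplog entries live in local.oplog.rs as BSON;
# production replay uses mongorestore --oplogReplay, not hand-rolled code.

def apply_oplog_entry(collection, entry):
    """Apply one simplified oplog entry (dict) to `collection` (dict keyed by _id)."""
    op = entry["op"]
    if op == "i":                      # insert: 'o' is the full document
        doc = entry["o"]
        collection[doc["_id"]] = doc
    elif op == "u":                    # update: 'o2' identifies the doc, 'o' carries new fields
        target = collection.get(entry["o2"]["_id"])
        if target is not None:
            target.update(entry["o"].get("$set", entry["o"]))
    elif op == "d":                    # delete: 'o' identifies the doc
        collection.pop(entry["o"]["_id"], None)
    return collection

if __name__ == "__main__":
    coll = {}
    ops = [
        {"op": "i", "o": {"_id": 1, "name": "alice"}},
        {"op": "u", "o2": {"_id": 1}, "o": {"$set": {"name": "bob"}}},
        {"op": "i", "o": {"_id": 2, "name": "carol"}},
        {"op": "d", "o": {"_id": 2}},
    ]
    for e in ops:
        apply_oplog_entry(coll, e)
    print(coll)   # {1: {'_id': 1, 'name': 'bob'}}
```

Replaying entries in timestamp order brings a checkpoint-consistent dataset forward to the last write the journal or oplog captured.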
BSON export and delivery
Export recovered collections using mongodump to produce BSON/JSON files, or provide a complete mongodump archive that can be restored with mongorestore. For GridFS recoveries, we also deliver reassembled original files alongside the database dump.
Common MongoDB Failure Scenarios
| Scenario | Symptoms | Recovery Approach |
|---|---|---|
| WT_PANIC | mongod crashes with "WT_PANIC: WiredTiger library panic" in the logs. The server refuses to restart. | Image the drive. Rebuild WiredTiger.turtle and WiredTiger.wt checkpoint metadata from the last valid checkpoint in the collection files. |
| Missing journal | mongod logs "Unable to read from journal" or "Recovery requires journal files." Server will not start without --repair flag. | Extract collections from the last checkpoint state. Replay oplog entries from a replica set secondary if available. Uncheckpointed writes are lost. |
| Catalog corruption | Collections appear empty or missing in the mongo shell, but .wt files exist on disk. "No such catalog entry" errors in the log. | Rebuild _mdb_catalog.wt by scanning each .wt file's internal B-tree structure to identify its collection name, namespace, and index configuration. |
| Replica set rollback | A primary with acknowledged writes lost network connectivity and a new primary was elected. When the old primary reconnects, it rolls back writes to match the new primary. | Rolled-back documents are saved in the rollback/ directory as BSON files. We recover and merge these with the current dataset if the rollback directory was on the failed drive. |
| RAID array failure | Multiple drives in a RAID array failed. The MongoDB data directory spans a logical volume that is no longer mountable. | Image each member drive. Reconstruct the RAID stripe geometry and rebuild the virtual disk. Extract the file system and MongoDB data directory from the reconstructed array. |
Sharded Cluster Recovery
A sharded MongoDB deployment distributes data across multiple shard servers, each containing a subset of the documents determined by the shard key. The config servers store the chunk-to-shard mapping and cluster metadata. Mongos routers direct queries to the correct shard based on this mapping.
Recovering a sharded cluster requires three components: the data from each shard, the config server metadata, and knowledge of the shard key for each collection. The config database stores chunk boundaries in the config.chunks collection and shard membership in config.shards.
When config servers are lost along with one or more shards, we reconstruct the chunk boundaries from the data itself. Each shard contains documents whose shard key values fall within specific ranges. By scanning the documents on each recovered shard, we rebuild the chunk map and merge the shards into a complete dataset.
Config servers intact
When config server data is recoverable, the chunk-to-shard mapping is preserved. We image the failed shard drives, recover their collections, and the existing config metadata tells us exactly which documents belong where. Reassembly is straightforward.
Config servers lost
Without config server data, we must infer shard boundaries. We scan each recovered shard for the shard key field, determine the minimum and maximum values on each shard, and validate that no documents overlap between shards. This produces a complete merge with correct deduplication.
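The boundary-inference step can be sketched as follows. This is a simplified illustration, not our production tooling: shard contents are plain lists of documents here, the function names are hypothetical, and real recovery reads documents out of the extracted .wt files rather than Python lists.

```python
# Illustrative sketch: inferring shard key ranges from recovered shard data,
# validating that the ranges do not overlap, and merging into one dataset
# with deduplication on _id (chunks may have been mid-migration at failure).

def shard_key_range(docs, key):
    """Return (min, max) of the shard key across one shard's documents."""
    values = [d[key] for d in docs]
    return min(values), max(values)

def merge_shards(shards, key):
    """Merge shard document lists after verifying key ranges do not overlap."""
    annotated = sorted(((shard_key_range(docs, key), docs) for docs in shards),
                       key=lambda pair: pair[0])
    merged, prev_max = [], None
    for (lo, hi), docs in annotated:
        if prev_max is not None and lo <= prev_max:
            raise ValueError(f"overlapping shard key ranges at {lo!r}")
        prev_max = hi
        merged.extend(docs)
    seen, unique = set(), []           # dedupe on _id across shards
    for d in merged:
        if d["_id"] not in seen:
            seen.add(d["_id"])
            unique.append(d)
    return unique

if __name__ == "__main__":
    shard_a = [{"_id": 1, "user_id": 10}, {"_id": 2, "user_id": 20}]
    shard_b = [{"_id": 3, "user_id": 30}, {"_id": 4, "user_id": 40}]
    print(len(merge_shards([shard_b, shard_a], "user_id")))   # 4
```

An overlap between inferred ranges is a signal to investigate chunk migrations that were in flight when the drives failed, rather than to merge blindly.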
GridFS File Recovery
GridFS is MongoDB's specification for storing files larger than the 16MB BSON document limit. It splits each file into 255KB chunks and stores them as individual documents in the fs.chunks collection. File metadata (filename, content type, MD5 hash, upload date) is stored in fs.files.
When a drive fails, both collections may sustain damage. Missing chunks in the middle of a file produce a file with gaps. Missing metadata in fs.files means the chunks exist but we do not know what file they belong to.
Our recovery process handles both scenarios. We extract all documents from both collections, match chunks to files using the files_id field, and reassemble files in chunk order (the n field). For orphaned chunks without matching metadata, we identify the file type from the binary content and reconstruct the file entry.
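The chunk-matching step reduces to a join-and-sort. The sketch below is a minimal illustration using plain dicts in place of recovered BSON documents; `reassemble_gridfs` is a hypothetical helper, and real GridFS access normally goes through a driver's GridFS API.

```python
# Illustrative sketch: reassembling GridFS files from recovered fs.files and
# fs.chunks documents. Chunks are matched on files_id and concatenated in
# ascending `n` order; a gap in the `n` sequence means chunks are missing.

def reassemble_gridfs(files_docs, chunk_docs):
    """Return {filename: bytes} for files whose chunk sequence is complete."""
    by_file = {}
    for c in chunk_docs:
        by_file.setdefault(c["files_id"], []).append(c)
    out = {}
    for f in files_docs:
        chunks = sorted(by_file.get(f["_id"], []), key=lambda c: c["n"])
        # verify the sequence is exactly 0..k with no gaps before joining
        if [c["n"] for c in chunks] != list(range(len(chunks))):
            continue                    # incomplete file: chunks missing
        out[f["filename"]] = b"".join(c["data"] for c in chunks)
    return out

if __name__ == "__main__":
    files = [{"_id": "f1", "filename": "report.pdf"}]
    chunks = [{"files_id": "f1", "n": 1, "data": b"world"},
              {"files_id": "f1", "n": 0, "data": b"hello "}]
    print(reassemble_gridfs(files, chunks))   # {'report.pdf': b'hello world'}
```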
BSON Document Extraction from Damaged Collection Files
WiredTiger stores BSON documents in B-tree pages within each .wt file. Each B-tree leaf page contains a variable number of documents, prefixed by a 4-byte length field. When a page is corrupted (bad sectors, partial writes), the B-tree traversal fails and MongoDB reports the entire collection as damaged.
We bypass the B-tree index entirely and scan the raw .wt file for BSON document boundaries. The 4-byte little-endian length prefix of each BSON document serves as a structural marker. We validate each candidate document against the BSON specification: the length must be consistent with the field elements, and the document must end with a null terminator byte (0x00).
This approach recovers documents from pages that WiredTiger's own recovery cannot access because the B-tree internal pages pointing to those leaf pages are damaged. The B-tree structure is disposable; the documents themselves are the valuable data.
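The raw-scan technique can be sketched in a few lines. This is a deliberately simplified illustration: it checks only the length prefix and the trailing 0x00 terminator, whereas a full carver also walks the element list to validate field types, and must handle WiredTiger page compression before scanning.

```python
import struct

# Illustrative sketch: scanning a raw byte buffer for BSON document
# candidates. A candidate starts with a 4-byte little-endian total length
# and must end with a 0x00 terminator. The 16MB ceiling matches the BSON
# document size limit; 5 bytes is the size of the smallest valid document.

def scan_for_bson(buf, min_len=5, max_len=16 * 1024 * 1024):
    """Yield (offset, candidate_bytes) for plausible BSON documents in buf."""
    offset = 0
    while offset + 4 <= len(buf):
        (length,) = struct.unpack_from("<i", buf, offset)
        end = offset + length
        if min_len <= length <= max_len and end <= len(buf) and buf[end - 1] == 0x00:
            yield offset, buf[offset:end]
            offset = end            # resume scanning after the candidate
        else:
            offset += 1             # slide forward one byte and retry

if __name__ == "__main__":
    doc = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"   # BSON for {"a": 1}
    buf = b"\xff\xfe" + doc + b"\x00garbage"                  # doc amid debris
    print([off for off, _ in scan_for_bson(buf)])             # [2]
```

Candidates that pass this structural screen are then parsed fully; documents that fail deeper validation are set aside rather than discarded, since a partially readable document can still contain usable fields.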
Pricing
MongoDB recovery pricing is based on the physical condition of the drive. WiredTiger reconstruction, oplog replay, and BSON extraction are included at no additional charge. For RAID arrays or sharded clusters, each member drive is priced separately.
| Service Tier | Price | Description |
|---|---|---|
| Simple Copy (low complexity) | $100 | Functional drive; you just need the data moved off it to new media. Rush available: +$100. |
| File System Recovery (low complexity) | From $250 | Drive isn't recognized by your computer but makes no unusual sounds. File system corruption: accessible with professional recovery software but not by the OS. Starting price; final quote depends on complexity. |
| Firmware Repair (medium complexity – PC-3000 required) | $600–$900 | Drive is completely inaccessible; it may be detected but shows the wrong size or won't respond. Firmware corruption: ROM, modules, or translator tables; requires PC-3000 terminal access. Standard drives at the lower end; high-density drives at the higher end. |
| Head Swap (high complexity – clean bench surgery) | $1,200–$1,500 | Drive is clicking, beeping, or won't spin; the internal read/write heads have failed. Head stack assembly failure: heads are transplanted from a matching donor drive on a clean bench. 50% deposit required; donor parts are consumed in the repair. |
| Surface / Platter Damage (high complexity – clean bench surgery) | $2,000 | Drive was dropped, has visible damage, or a head crash scraped the platters. Platter scoring or contamination: requires platter cleaning and a head swap. 50% deposit required; donor parts are consumed in the repair. The most difficult recovery type. |
Hardware Repair vs. Software Locks
Our "no data, no fee" policy applies to hardware recovery. We do not bill for unsuccessful physical repairs. If we replace a hard drive read/write head assembly or repair a liquid-damaged logic board to a bootable state, the hardware repair is complete and standard rates apply. If data remains inaccessible due to user-configured software locks, a forgotten passcode, or a remote wipe command, the physical repair is still billable. We cannot bypass user encryption or activation locks.
All tiers: Free evaluation and firm quote before any paid work. No data, no fee on simple copy, file system, and firmware tiers. Head swap and surface damage require a 50% deposit because donor parts are consumed in the attempt.
Target drive: The destination drive we copy recovered data onto. You can supply your own or we provide one at cost. For ultra-high-capacity drives (20TB and above), the target drive costs approximately $400+ due to the large media required. All prices are plus applicable tax.
Data Recovery Standards & Verification
Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to make sure your hard drive is handled safely and properly. This approach allows us to serve clients nationwide with consistent technical standards.
Open-drive work is performed in a ULPA-filtered laminar-flow bench, validated to 0.02 µm particle count, verified using TSI P-Trak instrumentation.
Transparent History
Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.
Media Coverage
Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.
Aligned Incentives
Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.
Technical Oversight
Louis Rossmann
Louis Rossmann's well-trained staff review our lab protocols to ensure technical accuracy and honest service. Since 2008, his focus has been on clear technical communication and accurate diagnostics rather than sales-driven explanations.
We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.
See our clean bench validation data and particle test video.
MongoDB Recovery FAQ
Can you recover a MongoDB database from a physically failed drive?
What happens if WiredTiger journal files are missing?
Can you recover GridFS files from a corrupted MongoDB instance?
Should I run mongod --repair on a failing drive?
How is MongoDB recovery priced?
Related Recovery Services
All supported database engines
InnoDB tablespace, ibdata1, redo log reconstruction
WAL reconstruction, pg_control repair, TOAST recovery
RAID 0, 1, 5, 6, 10 arrays
Synology, QNAP, TrueNAS, ASUSTOR
Dell, HP, IBM enterprise servers
Recover Your MongoDB Database
Call Mon-Fri 10am-6pm CT or email for a free drive evaluation.