
MongoDB Data Recovery

MongoDB stores data in BSON documents across WiredTiger collection files. When the underlying drive fails, mongod cannot start and standard repair tools cannot read the data directory. We image the failed drive using PC-3000, extract the WiredTiger files, rebuild the catalog metadata, and recover your collections. We handle standalone instances, replica sets, and sharded clusters. No data, no fee.

Written by Louis Rossmann, Founder & Chief Technician
Updated March 2026

How WiredTiger Stores Data on Disk

Since MongoDB 3.2, WiredTiger is the default storage engine. Each collection is stored in a separate file (e.g., collection-0-123456789.wt), and each index gets its own file. The _mdb_catalog.wt file maps collection names to their on-disk filenames. The WiredTiger.wt metadata file and WiredTiger.turtle bootstrap file track the current checkpoint state.
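The file layout above makes a quick triage script possible. The sketch below is ours, not MongoDB tooling (the helper name and return shape are illustrative): it inventories a dbpath by filename pattern and flags a missing catalog, which corresponds to the "metadata loss" failure mode discussed below.

```python
import re
from pathlib import Path

# The three files WiredTiger/MongoDB use for catalog and checkpoint state.
METADATA = {"_mdb_catalog.wt", "WiredTiger.wt", "WiredTiger.turtle"}

def inventory_dbpath(dbpath):
    """Classify dbpath files by role; flag whether the catalog survived."""
    out = {"collections": [], "indexes": [], "metadata": []}
    for f in sorted(Path(dbpath).iterdir()):
        if f.name in METADATA:
            out["metadata"].append(f.name)
        elif re.match(r"collection-[\d-]+\.wt$", f.name):
            out["collections"].append(f.name)
        elif re.match(r"index-[\d-]+\.wt$", f.name):
            out["indexes"].append(f.name)
    # Collection files present but no _mdb_catalog.wt: data intact but unnamed.
    out["catalog_present"] = "_mdb_catalog.wt" in out["metadata"]
    return out
```

Run against a cloned image (never the failing drive), this tells you in seconds whether you are facing metadata loss or checkpoint corruption.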

WiredTiger uses multiversion concurrency control (MVCC) and writes through a journal before checkpointing to collection files. A checkpoint occurs approximately every 60 seconds or when 2GB of journal data has been written, whichever comes first. Between checkpoints, data exists only in the journal and in-memory cache.

This architecture creates specific failure modes:

  • Metadata loss: If _mdb_catalog.wt is unreadable, MongoDB does not know which .wt file belongs to which collection. The data files are intact but unnamed.
  • Checkpoint corruption: WiredTiger.turtle or WiredTiger.wt references a checkpoint that was partially written when the drive failed. MongoDB reports "WT_PANIC" and refuses to open the data directory.
  • Journal gaps: The journal directory contains incomplete or corrupted journal files. MongoDB cannot replay recent writes, leaving collections in a state up to 60 seconds behind.
  • Collection file corruption: Bad sectors on the platter hit a .wt file. Individual documents within the B-tree pages become unreadable, but surrounding pages remain intact.

Why mongod --repair Is Dangerous on a Failing Drive

The mongod --repair command reads every document from every collection, validates it, and rewrites the entire data directory. Documents that fail validation are discarded. This process requires the drive to be fully readable and writable.

On a drive with physical damage, --repair has two problems. First, it forces the drive to read every sector sequentially, including sectors in degraded areas. Each failed read attempt causes the heads to make repeated contact with damaged platter surfaces, expanding the area of damage. Second, --repair writes back to the same drive, overwriting data that a proper imaging process could have recovered using PC-3000's adaptive read parameters.

The MongoDB documentation itself warns that --repair should be used "only in cases where data is corrupted and cannot be fixed otherwise." When the corruption originates from physical media failure, fixing the drive first produces a more complete dataset than --repair can extract from degraded media.

Before running --repair or deleting journal files: Power down the server. Do not restart mongod. Each startup attempt forces a checkpoint recovery that writes to the data directory, overwriting data on damaged sectors. Ship the drives to us for forensic imaging first.

Our MongoDB Recovery Workflow

1. Drive assessment and forensic imaging

Evaluate the drive using PC-3000 diagnostics. If the drive is mechanically sound, we image it with write-blocking hardware. If heads have failed, we perform a head swap in the 0.02µm ULPA-filtered clean bench before imaging. The result is a complete sector-level clone on healthy media.

2. File system reconstruction and dbpath extraction

Mount the cloned image and extract the MongoDB data directory. If the file system (ext4, XFS, ZFS) is damaged, we parse the filesystem metadata directly to locate the WiredTiger files on disk. We recover the .wt collection files, the journal directory, and the WiredTiger metadata files.

3. WiredTiger metadata rebuild

Reconstruct _mdb_catalog.wt, WiredTiger.wt, and WiredTiger.turtle if damaged. Map each .wt file to its original collection name by parsing the B-tree page headers and document structure. Rebuild the checkpoint metadata so WiredTiger recognizes the data directory.

4. Oplog replay and collection validation

If journal files are intact, replay writes that occurred after the last checkpoint. For replica set members, the oplog (local.oplog.rs) can recover additional transactions. Validate each collection to identify documents with corrupted BSON structures.

5. BSON export and delivery

Export recovered collections using mongodump to produce BSON/JSON files, or provide a complete mongodump archive that can be restored with mongorestore. For GridFS recoveries, we also deliver reassembled original files alongside the database dump.

Common MongoDB Failure Scenarios

• WT_PANIC
  Symptoms: mongod crashes with "WT_PANIC: WiredTiger library panic" in the logs. The server refuses to restart.
  Recovery: Image the drive. Rebuild WiredTiger.turtle and WiredTiger.wt checkpoint metadata from the last valid checkpoint in the collection files.

• Missing journal
  Symptoms: mongod logs "Unable to read from journal" or "Recovery requires journal files." The server will not start without the --repair flag.
  Recovery: Extract collections from the last checkpoint state. Replay oplog entries from a replica set secondary if available. Uncheckpointed writes are lost.

• Catalog corruption
  Symptoms: Collections appear empty or missing in the mongo shell, but .wt files exist on disk. "No such catalog entry" errors in the log.
  Recovery: Rebuild _mdb_catalog.wt by scanning each .wt file's internal B-tree structure to identify its collection name, namespace, and index configuration.

• Replica set rollback
  Symptoms: A primary with acknowledged writes lost network connectivity and a new primary was elected. When the old primary reconnects, it rolls back writes to match the new primary.
  Recovery: Rolled-back documents are saved in the rollback/ directory as BSON files. We recover and merge these with the current dataset if the rollback directory was on the failed drive.

• RAID array failure
  Symptoms: Multiple drives in a RAID array failed. The MongoDB data directory spans a logical volume that is no longer mountable.
  Recovery: Image each member drive. Reconstruct the RAID stripe geometry and rebuild the virtual disk. Extract the file system and MongoDB data directory from the reconstructed array.

Sharded Cluster Recovery

A sharded MongoDB deployment distributes data across multiple shard servers, each containing a subset of the documents determined by the shard key. The config servers store the chunk-to-shard mapping and cluster metadata. Mongos routers direct queries to the correct shard based on this mapping.

Recovering a sharded cluster requires three components: the data from each shard, the config server metadata, and knowledge of the shard key for each collection. The config database stores chunk boundaries in the config.chunks collection and shard membership in config.shards.

When config servers are lost along with one or more shards, we reconstruct the chunk boundaries from the data itself. Each shard contains documents whose shard key values fall within specific ranges. By scanning the documents on each recovered shard, we rebuild the chunk map and merge the shards into a complete dataset.

Config servers intact

When config server data is recoverable, the chunk-to-shard mapping is preserved. We image the failed shard drives, recover their collections, and the existing config metadata tells us exactly which documents belong where. Reassembly is straightforward.

Config servers lost

Without config server data, we must infer shard boundaries. We scan each recovered shard for the shard key field, determine the minimum and maximum values on each shard, and validate that no documents overlap between shards. This produces a complete merge with correct deduplication.
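The boundary inference described above can be sketched in a few lines. This is a simplified illustration under our own assumptions (flat lists of shard-key values per shard, helper names invented here); real shard keys can be compound and compare by BSON type order.

```python
# Infer per-shard shard-key ranges from recovered documents and verify
# that no two shards' ranges intersect before merging them.
def infer_ranges(shards):
    """shards: {shard_name: [shard_key values]} -> {shard_name: (min, max)}."""
    return {name: (min(keys), max(keys)) for name, keys in shards.items() if keys}

def ranges_overlap(ranges):
    """True if any two inferred [min, max] ranges intersect."""
    spans = sorted(ranges.values())
    # After sorting by lower bound, only adjacent spans can overlap.
    return any(a[1] >= b[0] for a, b in zip(spans, spans[1:]))
```

If `ranges_overlap` reports an intersection, the same document may exist on two shards (for example, after an interrupted chunk migration) and the merge needs deduplication by _id.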

GridFS File Recovery

GridFS is MongoDB's specification for storing files larger than the 16MB BSON document limit. It splits each file into 255KB chunks and stores them as individual documents in the fs.chunks collection. File metadata (filename, content type, MD5 hash, upload date) is stored in fs.files.

When a drive fails, both collections may sustain damage. Missing chunks in the middle of a file produce a file with gaps. Missing metadata in fs.files means the chunks exist but we do not know what file they belong to.

Our recovery process handles both scenarios. We extract all documents from both collections, match chunks to files using the files_id field, and reassemble files in chunk order (the n field). For orphaned chunks without matching metadata, we identify the file type from the binary content and reconstruct the file entry.
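The reassembly step can be sketched as follows. Chunk documents are modeled here as plain dicts carrying the real GridFS fields (files_id, n, data); the helper name and return shape are our own, and the binary subtype wrapping is omitted for simplicity.

```python
# Reassemble one GridFS file from recovered fs.chunks documents.
# Gaps in the n sequence are reported rather than silently skipped.
def reassemble(chunks, files_id):
    """Return (data, missing) where missing lists any absent chunk numbers."""
    mine = {c["n"]: c["data"] for c in chunks if c["files_id"] == files_id}
    if not mine:
        return b"", []
    expected = range(max(mine) + 1)
    missing = [n for n in expected if n not in mine]
    data = b"".join(mine.get(n, b"") for n in expected)
    return data, missing
```

A non-empty `missing` list pinpoints exactly which 255KB regions of the original file were lost, so you know whether the gap falls in a recoverable area (e.g., padding) or in critical content.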

BSON Document Extraction from Damaged Collection Files

WiredTiger stores BSON documents in B-tree pages within each .wt file. Each B-tree leaf page contains a variable number of documents, prefixed by a 4-byte length field. When a page is corrupted (bad sectors, partial writes), the B-tree traversal fails and MongoDB reports the entire collection as damaged.

We bypass the B-tree index entirely and scan the raw .wt file for BSON document boundaries. The 4-byte little-endian length prefix of each BSON document serves as a structural marker. We validate each candidate document against the BSON specification: the length must be consistent with the field elements, and the document must end with a null terminator byte (0x00).

This approach recovers documents from pages that WiredTiger's own recovery cannot access because the B-tree internal pages pointing to those leaf pages are damaged. The B-tree structure is disposable; the documents themselves are the valuable data.
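The length-prefix scan described above can be sketched like this. It is a simplified illustration: it checks only the two structural invariants named in the text (plausible little-endian length, trailing 0x00), while a production scanner also walks each element's type byte and field name to reject false positives.

```python
import struct

def scan_for_bson(raw, min_len=5, max_len=16 * 1024 * 1024):
    """Yield (offset, length) of byte ranges that look like BSON documents.

    A BSON document starts with a 4-byte little-endian total length
    (which includes the length field itself) and ends with a 0x00 byte.
    """
    i = 0
    while i + 4 <= len(raw):
        (length,) = struct.unpack_from("<i", raw, i)
        end = i + length
        if min_len <= length <= max_len and end <= len(raw) and raw[end - 1] == 0:
            yield i, length
            i = end  # resume scanning after the candidate document
        else:
            i += 1   # slide one byte forward and try again
```

Because the scan never consults the B-tree, it pulls intact documents out of leaf pages whose parent pages are destroyed.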

Pricing

MongoDB recovery pricing is based on the physical condition of the drive. WiredTiger reconstruction, oplog replay, and BSON extraction are included at no additional charge. For RAID arrays or sharded clusters, each member drive is priced separately.

• Simple Copy (Low complexity): $100
  Your drive works, you just need the data moved off it.
  Functional drive; data transfer to new media.
  Rush available: +$100.

• File System Recovery (Low complexity): From $250
  Your drive isn't recognized by your computer, but it's not making unusual sounds.
  File system corruption; accessible with professional recovery software but not by the OS.
  Starting price; final price depends on complexity.

• Firmware Repair (Medium complexity, PC-3000 required): $600–$900
  Your drive is completely inaccessible. It may be detected but shows the wrong size or won't respond.
  Firmware corruption: ROM, modules, or translator tables corrupted; requires PC-3000 terminal access.
  Standard drives at the lower end; high-density drives at the higher end.

• Head Swap (High complexity, clean bench surgery): $1,200–$1,500, 50% deposit
  Your drive is clicking, beeping, or won't spin. The internal read/write heads have failed.
  Head stack assembly failure; transplanting heads from a matching donor drive on a clean bench.
  50% deposit required. Donor parts are consumed in the repair.

• Surface / Platter Damage (High complexity, clean bench surgery): $2,000, 50% deposit
  Your drive was dropped, has visible damage, or a head crash scraped the platters.
  Platter scoring or contamination; requires platter cleaning and head swap.
  50% deposit required. Donor parts are consumed in the repair. Most difficult recovery type.

Hardware Repair vs. Software Locks

Our "no data, no fee" policy applies to hardware recovery. We do not bill for unsuccessful physical repairs. If we replace a hard drive read/write head assembly or repair a liquid-damaged logic board to a bootable state, the hardware repair is complete and standard rates apply. If data remains inaccessible due to user-configured software locks, a forgotten passcode, or a remote wipe command, the physical repair is still billable. We cannot bypass user encryption or activation locks.

All tiers: Free evaluation and firm quote before any paid work. No data, no fee on simple copy, file system, and firmware tiers. Head swap and surface damage require a 50% deposit because donor parts are consumed in the attempt.

Target drive: The destination drive we copy recovered data onto. You can supply your own or we provide one at cost. For ultra-high-capacity drives (20TB and above), the target drive costs approximately $400+ due to the large media required. All prices are plus applicable tax.

Data Recovery Standards & Verification

Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to make sure your hard drive is handled safely and properly. This approach allows us to serve clients nationwide with consistent technical standards.

Open-drive work is performed in a ULPA-filtered laminar-flow bench, validated to 0.02 µm particle count, verified using TSI P-Trak instrumentation.

Transparent History

Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.

Media Coverage

Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.

Aligned Incentives

Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.

Louis Rossmann

Louis Rossmann's well-trained staff review our lab protocols to ensure technical accuracy and honest service. Since 2008, his focus has been on clear technical communication and accurate diagnostics rather than sales-driven explanations.

We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.

See our clean bench validation data and particle test video

MongoDB Recovery FAQ

Can you recover a MongoDB database from a physically failed drive?
Yes. We image the failed drive using PC-3000, reconstruct the file system, and extract the MongoDB data directory (dbpath). Once the WiredTiger collection files and journal are recovered, we rebuild the _mdb_catalog metadata and bring the database online. No data, no fee.
What happens if WiredTiger journal files are missing?
The WiredTiger journal records recent writes that have not yet been checkpointed to collection files. Without it, MongoDB refuses to start because it cannot guarantee data consistency. We extract the last valid checkpoint from WiredTiger.wt and WiredTiger.turtle, then start mongod with recovery flags to salvage collections from the checkpoint state. Uncheckpointed writes since the last checkpoint are lost.
Can you recover data from a sharded MongoDB cluster?
Yes. Each shard stores a subset of the data on its own drives. We image each shard's drives independently, recover the chunk ranges from the config server metadata, and reassemble the complete dataset. If the config servers are also lost, we reconstruct the shard key ranges from the collection data on each shard.
Can you recover GridFS files from a corrupted MongoDB instance?
Yes. GridFS splits files into 255KB chunks stored in the fs.chunks collection, with metadata in fs.files. We recover both collections from the WiredTiger data files, match chunks to their parent file by files_id, and reassemble the original files in sequence order.
Should I run mongod --repair on a failing drive?
No. The --repair flag rewrites every collection and index, which requires reading the entire dataset and writing it back to disk. On a drive with bad sectors or failing heads, this causes additional platter damage and can render remaining data unrecoverable. Power down the server and send the drives for professional imaging first.
How is MongoDB recovery priced?
Pricing is based on the physical condition of the drive. File system corruption: from $250. Firmware repair: $600-$900. Head swap: $1,200-$1,500. MongoDB logical repair (WiredTiger reconstruction, oplog replay, BSON extraction) is included at no additional charge. No data, no fee.

Recover Your MongoDB Database

Call Mon-Fri 10am-6pm CT or email for a free drive evaluation.