
MongoDB Data Recovery

MongoDB stores data in BSON documents across WiredTiger collection files. When the underlying drive fails, mongod cannot start and standard repair tools cannot read the data directory. We image the failed drive using PC-3000, extract the WiredTiger files, rebuild the catalog metadata, and recover your collections. We handle standalone instances, replica sets, and sharded clusters. No data, no fee.

Written by Louis Rossmann, Founder & Chief Technician
Updated March 2026

How WiredTiger Stores Data on Disk

Since MongoDB 3.2, WiredTiger is the default storage engine. Each collection is stored in a separate file (e.g., collection-0-123456789.wt), and each index gets its own file. The _mdb_catalog.wt file maps collection names to their on-disk filenames. The WiredTiger.wt metadata file and WiredTiger.turtle bootstrap file track the current checkpoint state.
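The file layout above makes a quick triage script possible. The sketch below is ours, not MongoDB tooling (the helper name and return shape are illustrative): it inventories a dbpath by filename pattern and flags a missing catalog, which corresponds to the "metadata loss" failure mode discussed below.

```python
import re
from pathlib import Path

# The three files WiredTiger/MongoDB use for catalog and checkpoint state.
METADATA = {"_mdb_catalog.wt", "WiredTiger.wt", "WiredTiger.turtle"}

def inventory_dbpath(dbpath):
    """Classify dbpath files by role; flag whether the catalog survived."""
    out = {"collections": [], "indexes": [], "metadata": []}
    for f in sorted(Path(dbpath).iterdir()):
        if f.name in METADATA:
            out["metadata"].append(f.name)
        elif re.match(r"collection-[\d-]+\.wt$", f.name):
            out["collections"].append(f.name)
        elif re.match(r"index-[\d-]+\.wt$", f.name):
            out["indexes"].append(f.name)
    # Collection files present but no _mdb_catalog.wt: data intact but unnamed.
    out["catalog_present"] = "_mdb_catalog.wt" in out["metadata"]
    return out
```

Run against a cloned image (never the failing drive), this tells you in seconds whether you are facing metadata loss or checkpoint corruption.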

WiredTiger uses multiversion concurrency control (MVCC) and writes through a journal before checkpointing to collection files. A checkpoint occurs approximately every 60 seconds or when 2GB of journal data has been written, whichever comes first. Between checkpoints, data exists only in the journal and in-memory cache.

This architecture creates specific failure modes:

  • Metadata loss: If _mdb_catalog.wt is unreadable, MongoDB does not know which .wt file belongs to which collection. The data files are intact but unnamed.
  • Checkpoint corruption: WiredTiger.turtle or WiredTiger.wt references a checkpoint that was partially written when the drive failed. MongoDB reports "WT_PANIC" and refuses to open the data directory.
  • Journal gaps: The journal directory contains incomplete or corrupted journal files. MongoDB cannot replay recent writes, leaving collections in a state up to 60 seconds behind.
  • Collection file corruption: Bad sectors on the platter hit a .wt file. Individual documents within the B-tree pages become unreadable, but surrounding pages remain intact.

Why mongod --repair Is Dangerous on a Failing Drive

The mongod --repair command reads every document from every collection, validates it, and rewrites the entire data directory. Documents that fail validation are discarded. This process requires the drive to be fully readable and writable.

On a drive with physical damage, --repair has two problems. First, it forces the drive to read every sector sequentially, including sectors in degraded areas. Each failed read attempt causes the heads to make repeated contact with damaged platter surfaces, expanding the area of damage. Second, --repair writes back to the same drive, overwriting data that a proper imaging process could have recovered using PC-3000's adaptive read parameters.

The MongoDB documentation itself warns that --repair should be used "only in cases where data is corrupted and cannot be fixed otherwise." When the corruption originates from physical media failure, fixing the drive first produces a more complete dataset than --repair can extract from degraded media.

Before running --repair or deleting journal files: Power down the server. Do not restart mongod. Each startup attempt forces a checkpoint recovery that writes to the data directory, overwriting data on damaged sectors. Ship the drives to us for forensic imaging first.

Our MongoDB Recovery Workflow

1. Drive assessment and forensic imaging

Evaluate the drive using PC-3000 diagnostics. If the drive is mechanically sound, we image it with write-blocking hardware. If heads have failed, we perform a head swap in the 0.02µm ULPA-filtered clean bench before imaging. The result is a complete sector-level clone on healthy media.

2. File system reconstruction and dbpath extraction

Mount the cloned image and extract the MongoDB data directory. If the file system (ext4, XFS, ZFS) is damaged, we parse the filesystem metadata directly to locate the WiredTiger files on disk. We recover the .wt collection files, the journal directory, and the WiredTiger metadata files.

3. WiredTiger metadata rebuild

Reconstruct _mdb_catalog.wt, WiredTiger.wt, and WiredTiger.turtle if damaged. Map each .wt file to its original collection name by parsing the B-tree page headers and document structure. Rebuild the checkpoint metadata so WiredTiger recognizes the data directory.

4. Oplog replay and collection validation

If journal files are intact, replay writes that occurred after the last checkpoint. For replica set members, the oplog (local.oplog.rs) can recover additional transactions. Validate each collection to identify documents with corrupted BSON structures.

5. BSON export and delivery

Export recovered collections using mongodump to produce BSON/JSON files, or provide a complete mongodump archive that can be restored with mongorestore. For GridFS recoveries, we also deliver reassembled original files alongside the database dump.

Common MongoDB Failure Scenarios

• WT_PANIC
  Symptoms: mongod crashes with "WT_PANIC: WiredTiger library panic" in the logs. The server refuses to restart.
  Recovery: Image the drive. Rebuild WiredTiger.turtle and WiredTiger.wt checkpoint metadata from the last valid checkpoint in the collection files.

• Missing journal
  Symptoms: mongod logs "Unable to read from journal" or "Recovery requires journal files." The server will not start without the --repair flag.
  Recovery: Extract collections from the last checkpoint state. Replay oplog entries from a replica set secondary if available. Uncheckpointed writes are lost.

• Catalog corruption
  Symptoms: Collections appear empty or missing in the mongo shell, but .wt files exist on disk. "No such catalog entry" errors in the log.
  Recovery: Rebuild _mdb_catalog.wt by scanning each .wt file's internal B-tree structure to identify its collection name, namespace, and index configuration.

• Replica set rollback
  Symptoms: A primary with acknowledged writes lost network connectivity and a new primary was elected. When the old primary reconnects, it rolls back writes to match the new primary.
  Recovery: Rolled-back documents are saved in the rollback/ directory as BSON files. We recover and merge these with the current dataset if the rollback directory was on the failed drive.

• RAID array failure
  Symptoms: Multiple drives in a RAID array failed. The MongoDB data directory spans a logical volume that is no longer mountable.
  Recovery: Image each member drive. Reconstruct the RAID stripe geometry and rebuild the virtual disk. Extract the file system and MongoDB data directory from the reconstructed array.

Sharded Cluster Recovery

A sharded MongoDB deployment distributes data across multiple shard servers, each containing a subset of the documents determined by the shard key. The config servers store the chunk-to-shard mapping and cluster metadata. Mongos routers direct queries to the correct shard based on this mapping.

Recovering a sharded cluster requires three components: the data from each shard, the config server metadata, and knowledge of the shard key for each collection. The config database stores chunk boundaries in the config.chunks collection and shard membership in config.shards.

When config servers are lost along with one or more shards, we reconstruct the chunk boundaries from the data itself. Each shard contains documents whose shard key values fall within specific ranges. By scanning the documents on each recovered shard, we rebuild the chunk map and merge the shards into a complete dataset.

Config servers intact

When config server data is recoverable, the chunk-to-shard mapping is preserved. We image the failed shard drives, recover their collections, and the existing config metadata tells us exactly which documents belong where. Reassembly is straightforward.

Config servers lost

Without config server data, we must infer shard boundaries. We scan each recovered shard for the shard key field, determine the minimum and maximum values on each shard, and validate that no documents overlap between shards. This produces a complete merge with correct deduplication.
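The boundary inference described above can be sketched in a few lines. This is a simplified illustration under our own assumptions (flat lists of shard-key values per shard, helper names invented here); real shard keys can be compound and compare by BSON type order.

```python
# Infer per-shard shard-key ranges from recovered documents and verify
# that no two shards' ranges intersect before merging them.
def infer_ranges(shards):
    """shards: {shard_name: [shard_key values]} -> {shard_name: (min, max)}."""
    return {name: (min(keys), max(keys)) for name, keys in shards.items() if keys}

def ranges_overlap(ranges):
    """True if any two inferred [min, max] ranges intersect."""
    spans = sorted(ranges.values())
    # After sorting by lower bound, only adjacent spans can overlap.
    return any(a[1] >= b[0] for a, b in zip(spans, spans[1:]))
```

If `ranges_overlap` reports an intersection, the same document may exist on two shards (for example, after an interrupted chunk migration) and the merge needs deduplication by _id.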

GridFS File Recovery

GridFS is MongoDB's specification for storing files larger than the 16MB BSON document limit. It splits each file into 255KB chunks and stores them as individual documents in the fs.chunks collection. File metadata (filename, content type, MD5 hash, upload date) is stored in fs.files.

When a drive fails, both collections may sustain damage. Missing chunks in the middle of a file produce a file with gaps. Missing metadata in fs.files means the chunks exist but we do not know what file they belong to.

Our recovery process handles both scenarios. We extract all documents from both collections, match chunks to files using the files_id field, and reassemble files in chunk order (the n field). For orphaned chunks without matching metadata, we identify the file type from the binary content and reconstruct the file entry.
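The reassembly step can be sketched as follows. Chunk documents are modeled here as plain dicts carrying the real GridFS fields (files_id, n, data); the helper name and return shape are our own, and the binary subtype wrapping is omitted for simplicity.

```python
# Reassemble one GridFS file from recovered fs.chunks documents.
# Gaps in the n sequence are reported rather than silently skipped.
def reassemble(chunks, files_id):
    """Return (data, missing) where missing lists any absent chunk numbers."""
    mine = {c["n"]: c["data"] for c in chunks if c["files_id"] == files_id}
    if not mine:
        return b"", []
    expected = range(max(mine) + 1)
    missing = [n for n in expected if n not in mine]
    data = b"".join(mine.get(n, b"") for n in expected)
    return data, missing
```

A non-empty `missing` list pinpoints exactly which 255KB regions of the original file were lost, so you know whether the gap falls in a recoverable area (e.g., padding) or in critical content.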

BSON Document Extraction from Damaged Collection Files

WiredTiger stores BSON documents in B-tree pages within each .wt file. Each B-tree leaf page contains a variable number of documents, prefixed by a 4-byte length field. When a page is corrupted (bad sectors, partial writes), the B-tree traversal fails and MongoDB reports the entire collection as damaged.

We bypass the B-tree index entirely and scan the raw .wt file for BSON document boundaries. The 4-byte little-endian length prefix of each BSON document serves as a structural marker. We validate each candidate document against the BSON specification: the length must be consistent with the field elements, and the document must end with a null terminator byte (0x00).

This approach recovers documents from pages that WiredTiger's own recovery cannot access because the B-tree internal pages pointing to those leaf pages are damaged. The B-tree structure is disposable; the documents themselves are the valuable data.
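The length-prefix scan described above can be sketched like this. It is a simplified illustration: it checks only the two structural invariants named in the text (plausible little-endian length, trailing 0x00), while a production scanner also walks each element's type byte and field name to reject false positives.

```python
import struct

def scan_for_bson(raw, min_len=5, max_len=16 * 1024 * 1024):
    """Yield (offset, length) of byte ranges that look like BSON documents.

    A BSON document starts with a 4-byte little-endian total length
    (which includes the length field itself) and ends with a 0x00 byte.
    """
    i = 0
    while i + 4 <= len(raw):
        (length,) = struct.unpack_from("<i", raw, i)
        end = i + length
        if min_len <= length <= max_len and end <= len(raw) and raw[end - 1] == 0:
            yield i, length
            i = end  # resume scanning after the candidate document
        else:
            i += 1   # slide one byte forward and try again
```

Because the scan never consults the B-tree, it pulls intact documents out of leaf pages whose parent pages are destroyed.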

Pricing

MongoDB recovery pricing is based on the physical condition of the drive. WiredTiger reconstruction, oplog replay, and BSON extraction are included at no additional charge. For RAID arrays or sharded clusters, each member drive is priced separately.

• Simple Copy (Low complexity): $100
  Your drive works, you just need the data moved off it.
  Functional drive; data transfer to new media.
  Rush available: +$100.

• File System Recovery (Low complexity): From $250
  Your drive isn't recognized by your computer, but it's not making unusual sounds.
  File system corruption; accessible with professional recovery software but not by the OS.
  Starting price; final price depends on complexity.

• Firmware Repair (Medium complexity, PC-3000 required): $600–$900
  Your drive is completely inaccessible. It may be detected but shows the wrong size or won't respond.
  Firmware corruption: ROM, modules, or translator tables corrupted; requires PC-3000 terminal access.
  Standard drives at the lower end; high-density drives at the higher end.

• Head Swap (High complexity, clean bench surgery): $1,200–$1,500, 50% deposit
  Your drive is clicking, beeping, or won't spin. The internal read/write heads have failed.
  Head stack assembly failure; transplanting heads from a matching donor drive on a clean bench.
  50% deposit required. Donor parts are consumed in the repair.

• Surface / Platter Damage (High complexity, clean bench surgery): $2,000, 50% deposit
  Your drive was dropped, has visible damage, or a head crash scraped the platters.
  Platter scoring or contamination; requires platter cleaning and head swap.
  50% deposit required. Donor parts are consumed in the repair. Most difficult recovery type.

Hardware Repair vs. Software Locks

Our "no data, no fee" policy applies to hardware recovery. We do not bill for unsuccessful physical repairs. If we replace a hard drive read/write head assembly or repair a liquid-damaged logic board to a bootable state, the hardware repair is complete and standard rates apply. If data remains inaccessible due to user-configured software locks, a forgotten passcode, or a remote wipe command, the physical repair is still billable. We cannot bypass user encryption or activation locks.

All tiers: Free evaluation and firm quote before any paid work. No data, no fee on simple copy, file system, and firmware tiers. Head swap and surface damage require a 50% deposit because donor parts are consumed in the attempt.

Target drive: The destination drive we copy recovered data onto. You can supply your own or we provide one at cost. For ultra-high-capacity drives (20TB and above), the target drive costs approximately $400+ due to the large media required. All prices are plus applicable tax.

Data Recovery Standards & Verification

Our Austin lab operates on a transparency-first model. We use industry-standard recovery tools, including PC-3000 and DeepSpar, combined with strict environmental controls to make sure your hard drive is handled safely and properly. This approach allows us to serve clients nationwide with consistent technical standards.

Open-drive work is performed in a ULPA-filtered laminar-flow bench, validated to 0.02 µm particle count, verified using TSI P-Trak instrumentation.

Transparent History

Serving clients nationwide via mail-in service since 2008. Our lead engineer holds PC-3000 and HEX Akademia certifications for hard drive firmware repair and mechanical recovery.

Media Coverage

Our repair work has been covered by The Wall Street Journal and Business Insider, with CBC News reporting on our pricing transparency. Louis Rossmann has testified in Right to Repair hearings in multiple states and founded the Repair Preservation Group.

Aligned Incentives

Our "No Data, No Charge" policy means we assume the risk of the recovery attempt, not the client.

Louis Rossmann

Louis Rossmann's well-trained staff review our lab protocols to ensure technical accuracy and honest service. Since 2008, his focus has been on clear technical communication and accurate diagnostics rather than sales-driven explanations.

We believe in proving standards rather than just stating them. We use TSI P-Trak instrumentation to verify that clean-air benchmarks are met before any drive is opened.

See our clean bench validation data and particle test video

MongoDB Recovery FAQ

Can you recover a MongoDB database from a physically failed drive?
Yes. We image the failed drive using PC-3000, reconstruct the file system, and extract the MongoDB data directory (dbpath). Once the WiredTiger collection files and journal are recovered, we rebuild the _mdb_catalog metadata and bring the database online. No data, no fee.
What happens if WiredTiger journal files are missing?
The WiredTiger journal records recent writes that have not yet been checkpointed to collection files. Without it, MongoDB refuses to start because it cannot guarantee data consistency. We extract the last valid checkpoint from WiredTiger.wt and WiredTiger.turtle, then start mongod with recovery flags to salvage collections from the checkpoint state. Uncheckpointed writes since the last checkpoint are lost.
Can you recover data from a sharded MongoDB cluster?
Yes. Each shard stores a subset of the data on its own drives. We image each shard's drives independently, recover the chunk ranges from the config server metadata, and reassemble the complete dataset. If the config servers are also lost, we reconstruct the shard key ranges from the collection data on each shard.
Can you recover GridFS files from a corrupted MongoDB instance?
Yes. GridFS splits files into 255KB chunks stored in the fs.chunks collection, with metadata in fs.files. We recover both collections from the WiredTiger data files, match chunks to their parent file by files_id, and reassemble the original files in sequence order.
Should I run mongod --repair on a failing drive?
No. The --repair flag rewrites every collection and index, which requires reading the entire dataset and writing it back to disk. On a drive with bad sectors or failing heads, this causes additional platter damage and can render remaining data unrecoverable. Power down the server and send the drives for professional imaging first.
How is MongoDB recovery priced?
Pricing is based on the physical condition of the drive. File system corruption: from $250. Firmware repair: $600-$900. Head swap: $1,200-$1,500. MongoDB logical repair (WiredTiger reconstruction, oplog replay, BSON extraction) is included at no additional charge. No data, no fee.

Recover Your MongoDB Database

Call Mon-Fri 10am-6pm CT or email for a free drive evaluation.