What Is SSD Garbage Collection?
SSD garbage collection (GC) is an automated background process run by the SSD's internal controller. It consolidates valid data from fragmented NAND blocks into clean blocks, then physically erases the old blocks so they can accept new writes. GC is the step that permanently destroys deleted data on an SSD.
When you delete a file, the operating system sends a TRIM command to the SSD. TRIM marks the file's logical addresses as invalid in the controller's Flash Translation Layer (FTL). TRIM does not erase anything. The controller queues those blocks for garbage collection, which runs during idle time and applies the erase voltage that resets the NAND cells to their unprogrammed state.
This two-step process is what makes SSD data recovery different from hard drive recovery. On a hard drive, deleted files stay on the magnetic platters until new data overwrites the same sectors. On an SSD, the controller actively hunts for stale blocks and erases them in the background, regardless of whether new data needs that space.
What Is DZAT (Deterministic Read Zero After TRIM)?
DZAT is a SATA specification feature where the SSD controller guarantees that any read request targeting a TRIMmed logical block address returns all zeroes. This happens immediately after TRIM processing, before garbage collection physically erases the NAND cells. The NVMe equivalent is DLFEAT=001b (Deallocate Logical Block Features).
- DZAT (Deterministic Read Zero After TRIM)
- The strictest SATA TRIM implementation. The controller intercepts all read commands to TRIMmed LBAs and returns a payload of all zeroes. The physical NAND may still hold the original charge states, but no software can reach them through the standard interface. Most modern SATA SSDs implement DZAT.
- DRAT (Deterministic Read After TRIM)
- A less strict implementation where the controller returns a consistent, deterministic value for TRIMmed LBAs. The value is usually zeroes but the specification allows other fixed patterns. Less common in modern drives.
- DLFEAT=001b (NVMe Deallocate Features)
- The NVMe equivalent of DZAT. When a drive sets DLFEAT to 001b in its namespace metadata, reading a deallocated logical block returns all zeroes. Most modern NVMe drives enforce this behavior for predictable latency and RAID parity consistency.
DZAT creates a critical problem for data recovery. The controller lies to the operating system. It reports that TRIMmed blocks contain nothing but zeroes, even while the original data still exists as electrical charges on the NAND cells. This is not a software limitation. It is a protocol-level enforcement built into the drive's firmware. The TRIM and DZAT protocol specifications define the exact ATA and NVMe opcodes, namespace metadata fields, and controller behavior that produce this zero-return result.
Why Does Recovery Software Fail on Modern SSDs?
Recovery software operates at the logical layer above the SSD controller. It sends standard read commands through the OS storage driver. When the controller has DZAT enabled, every read to a TRIMmed block returns zeroes. The software sees empty space and reports the data as unrecoverable.
| Layer | What Happens | Data Accessible? |
|---|---|---|
| File system (OS level) | OS sends TRIM after file deletion; marks LBAs invalid | No; file metadata removed |
| Controller logical interface | DZAT returns zeroes for all TRIMmed LBAs | No; software sees empty space |
| FTL mapping table | LBA-to-PBA mapping dropped; pages marked invalid | Not through standard commands |
| Physical NAND cells | Charge states may still hold original data until GC erases | Yes, via PC-3000 SSD raw NAND access |
Tools like Disk Drill, EaseUS, Recuva, and R-Studio all operate at the top two layers. They cannot bypass the controller's DZAT enforcement, and they cannot access raw NAND pages below the FTL. When these tools report zero recoverable files from a TRIMmed SSD, the data may still exist physically on the NAND. Reaching it requires hardware that communicates with the controller at the diagnostic level, not through the standard storage protocol.
SSD Data Recovery Pricing
Five published tiers, from $200 to $1,200–$1,500. NVMe SSD recovery ranges from $200 to $1,200–$2,500. Free evaluation, firm quote before any paid work. No data, no recovery fee. +$100 rush fee to move to the front of the queue.
| Tier | Complexity | Symptom | Technical scope | Notes | Price | Turnaround |
|---|---|---|---|---|---|---|
| Simple Copy | Low | Your drive works, you just need the data moved off it | Functional drive; data transfer to new media | Rush available: +$100 | $200 | 3-5 business days |
| File System Recovery | Low | Your drive isn't showing up, but it's not physically damaged | File system corruption. Visible to recovery software but not to OS | Starting price; final depends on complexity | From $250 | 2-4 weeks |
| Circuit Board Repair | Medium | Your drive won't power on or has shorted components | PCB issues: failed voltage regulators, dead PMICs, shorted capacitors | May require a donor drive (additional cost) | $450–$600 | 3-6 weeks |
| Firmware Recovery (most common) | Medium | Your drive is detected but shows the wrong name, wrong size, or no data | Firmware corruption: ROM, modules, or system files corrupted | Price depends on extent of bad areas in NAND | $600–$900 | 3-6 weeks |
| PCB / NAND Swap | High | Your drive's circuit board is severely damaged and requires NAND chip transplant to a donor PCB | NAND swap onto donor PCB. Precision microsoldering and BGA rework required | 50% deposit required; donor drive cost additional | $1,200–$1,500 | 4-8 weeks |
Hardware Repair vs. Software Locks
Our "no data, no fee" policy applies to hardware recovery. We do not bill for unsuccessful physical repairs. If we replace a hard drive read/write head assembly or repair a liquid-damaged logic board to a bootable state, the hardware repair is complete and standard rates apply. If data remains inaccessible due to user-configured software locks, a forgotten passcode, or a remote wipe command, the physical repair is still billable. We cannot bypass user encryption or activation locks.
No data, no fee. Free evaluation and firm quote before any paid work. NAND swap requires a 50% deposit because donor parts are consumed in the attempt.
- Rush fee
- +$100 rush fee to move to the front of the queue
- Donor drives
- A donor drive is a matching SSD used for its circuit board. Typical donor cost: $40–$100 for common models, $150–$300 for discontinued or rare controllers.
- Target drive
- The destination drive we copy recovered data onto. You can supply your own or we provide one at cost plus a small markup. All prices are plus applicable tax.
ATA TRIM Protocol and the Recovery Window Inside Over-Provisioned NAND
The time between a file deletion and a successful PC-3000 SSD read depends on two things: what the host sent to the drive (the ATA or NVMe command payload) and what the controller decides to do with the over-provisioned NAND pool. Both layers are specified by JEDEC and the T13 ACS-4 standard, and both behave differently across controller families.
ATA DATA SET MANAGEMENT: the Command That Carries TRIM
TRIM is a feature of the DATA SET MANAGEMENT command (opcode 0x06), introduced in ACS-2 and refined in ACS-4. The host sets the TRIM bit (bit 0) in the Features register, then transmits a buffer of LBA Range Entries; the Count register gives the buffer length in 512-byte sectors. Each entry is an 8-byte structure: a 6-byte starting LBA followed by a 2-byte range length. Entries are packed 64 to a 512-byte sector; a single command can carry up to 64 sectors, or 4,096 ranges per transfer. The controller queues those ranges into the FTL, walks the logical-to-physical map, and marks every matching page invalid. No NAND cell is touched by this command. The block-level ERASE that actually destroys the data is a separate, later step, run by garbage collection at the controller's discretion.
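The entry layout described above can be sketched in a few lines of Python. This is an illustrative packer only, not driver code: the function name `pack_trim_payload` and the decision to split long extents automatically are assumptions made for the example. Each entry is encoded as one little-endian 64-bit value, with the starting LBA in the low 48 bits and the range length in the high 16 bits.

```python
import struct

ENTRY_SIZE = 8            # bytes per LBA Range Entry
SECTOR_SIZE = 512         # bytes per payload sector (64 entries each)
MAX_RANGE_LEN = 0xFFFF    # the range length field is 16 bits wide
MAX_LBA = (1 << 48) - 1   # the starting LBA field is 48 bits wide


def pack_trim_payload(ranges):
    """Pack (start_lba, sector_count) pairs into a zero-padded TRIM payload.

    Hypothetical helper for illustration; it only builds the data buffer
    and does not issue any command to a drive.
    """
    entries = bytearray()
    for start, count in ranges:
        if start < 0 or count < 0 or start + count - 1 > MAX_LBA:
            raise ValueError("range does not fit in a 48-bit LBA space")
        # A single entry covers at most 65,535 sectors, so split long extents.
        while count > 0:
            chunk = min(count, MAX_RANGE_LEN)
            entries += struct.pack("<Q", (chunk << 48) | start)
            start += chunk
            count -= chunk
    # Unused entries in the final sector are left as zero padding.
    padding = -len(entries) % SECTOR_SIZE
    return bytes(entries) + bytes(padding)
```

Packing a single 8-sector range starting at LBA 0x1000 yields one 512-byte sector whose first eight bytes hold the entry and whose remainder is zero padding.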
The NVMe equivalent is the Dataset Management command (opcode 09h) with the Deallocate attribute, defined in NVM Express 2.0. The namespace's DLFEAT field in Identify Namespace tells the host what the drive will return on a subsequent read to a deallocated LBA. DLFEAT bit values determine whether the drive guarantees zeros, guarantees a fixed deterministic value, or makes no guarantee at all.
Deterministic vs Non-Deterministic TRIM Return Behavior
Not every SSD enforces DZAT. The SATA spec allows three distinct behaviors for reads to TRIMmed LBAs, and an enterprise NVMe drive can advertise a fourth through DLFEAT. A drive that returns data non-deterministically can leak plaintext through the standard read path for a limited window; a drive that enforces DZAT cannot. Both states sit above the NAND itself; neither tells you whether the underlying cells have been physically erased.
| Return Mode | Spec Source | What a Read to a TRIMmed LBA Returns |
|---|---|---|
| Non-deterministic | Pre-ACS-2 (legacy) | Implementation-defined. May return stale NAND contents, zeros, or different values on repeated reads. |
| DRAT (Deterministic Read After TRIM) | ACS-2 | A consistent, fixed pattern on every read. Pattern is vendor-chosen but must not change across reads of the same LBA. |
| DZAT (Deterministic Read Zero After TRIM) | ACS-2 | All zeros. Enforced by the controller on the read path, regardless of NAND state. |
| NVMe DLFEAT bits 2:0 = 001b | NVM Express 2.0 | All zeros for deallocated logical blocks (the NVMe analogue of DZAT). |
| NVMe DLFEAT bits 2:0 = 000b | NVM Express 2.0 | No guarantee. Reads may return any value. Present on some enterprise drives where Deallocate is a hint rather than a contract. |
None of these modes describe what has happened at the NAND layer. They only describe what the controller returns on the host interface. PC-3000 SSD operates below this interface entirely, which is why DZAT enforcement does not block a lab recovery when the physical cells still hold charge.
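As a rough illustration of the NVMe rows in the table, the sketch below decodes the read-behavior bits of DLFEAT from a raw Identify Namespace buffer. It assumes DLFEAT sits at byte 33 of that structure; the 010b entry (all bytes 0xFF) comes from the NVMe specification rather than from the table above, and the function name is hypothetical.

```python
# Bits 2:0 of DLFEAT describe what a read of a deallocated block returns.
DLFEAT_READ_BEHAVIOR = {
    0b000: "not reported (no guarantee)",
    0b001: "all zeros",
    0b010: "all 0xFF",
}


def decode_dlfeat(identify_namespace: bytes) -> str:
    """Return the advertised read behavior for deallocated logical blocks."""
    dlfeat = identify_namespace[33]          # assumed offset of DLFEAT
    return DLFEAT_READ_BEHAVIOR.get(dlfeat & 0b111, "reserved")
```

The result only describes the host-facing contract. It says nothing about whether the underlying NAND pages have been erased.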
How Over-Provisioning Preserves Invalidated Pages Across P/E Cycles
The recovery window is not a stopwatch. It is a count of program/erase cycles on other blocks between the moment TRIM ran and the moment wear-leveling scheduled the original block for erase. Over-provisioning is the reserve pool that gives the controller somewhere else to write while leaving invalidated pages in place.
Consumer SSDs typically allocate 7% of raw NAND to OP; mixed-workload enterprise drives allocate 20% to 28% (documented in JEDEC JESD218 and vendor datasheets). When OP is high, the controller has ample clean blocks to absorb host writes and can defer both GC and wear-leveling across thousands of write transactions. An invalidated page can survive until the controller's static wear-leveling sweep picks its block for refresh, which on lightly used drives happens once per thousands of P/E cycles on other blocks. This is why powering the drive off immediately halts the clock: with no host writes arriving, no OP pressure accumulates, no background sweep runs, and the invalidated pages stay programmed.
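One common explanation for the consumer figure of roughly 7% is the gap between binary and decimal gigabytes: the NAND is built in GiB while the advertised capacity is in GB. The sketch below works through that arithmetic. It assumes OP is expressed relative to user capacity, which is a common vendor convention but not the only one, and the function name is illustrative.

```python
def op_percent(raw_bytes: int, user_bytes: int) -> float:
    """Over-provisioning as a percentage of user-visible capacity (assumed convention)."""
    if user_bytes <= 0 or raw_bytes < user_bytes:
        raise ValueError("raw capacity must be at least the user capacity")
    return (raw_bytes - user_bytes) / user_bytes * 100.0


# 256 GiB of raw NAND sold as a 256 GB drive
consumer_op = op_percent(256 * 2**30, 256 * 10**9)   # about 7.4
```

A drive that exposes less of the same raw NAND to the host raises this percentage, which is how enterprise models reach the higher OP figures quoted above.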
Conversely, a drive written near capacity operates with an OP that behaves like zero. Every new host write forces the controller to reclaim a block immediately. Under this pressure, foreground GC erases TRIMmed blocks within seconds, and the recovery window closes before the drive can be shipped.
PC-3000 SSD FTL Translator Snapshot Extraction
When a deletion event has already run but the NAND has not yet been erased, raw physical reads return the programmed cell states. They do not return files. A modern SSD scatters a single file across hundreds of physical pages on different dies, indexed only by the FTL translator. To turn raw NAND pages back into files, the lab must rebuild the translator as it existed before the deletion.
Controllers store copies of the FTL translator in the NAND service area as a power-loss recovery safeguard. Every time the translator updates, the controller commits a new generation to service-area logs and retires the oldest. The number of retained generations is controller-specific: Silicon Motion SM2259XT drives carry multi-generation translator backups in the service area; Phison E12 and E18 use a rolling journal; Samsung firmware commits a checkpointed translator on shutdown and between wear-leveling passes.
After PC-3000 SSD enters the controller's diagnostic mode, Data Extractor walks the service area, parses every retained translator generation, and presents them as rollback points. The engineer picks a generation dated before the deletion event. Data Extractor re-applies that translator on top of the raw NAND image; pages the current FTL marks invalid resolve against the older translator and reassemble as live files. The recovery is not a scan for file headers; it is a forensic rollback of the mapping table.
Two conditions determine whether this works. The target blocks must not have been erased by GC since the deletion (so the physical pages still hold data). The service-area translator log must contain at least one generation dated before the deletion event (so there is a rollback point to restore). Drives shipped powered off within hours of a deletion event typically satisfy both conditions.
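The two conditions can be made concrete with a toy model. This is not PC-3000's data format or algorithm, which are proprietary; the `TranslatorGeneration` class, the use of a commit sequence number in place of a date, and the dictionary standing in for a raw NAND image are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TranslatorGeneration:
    sequence: int   # hypothetical commit counter; higher means newer
    mapping: dict   # LBA -> PBA as recorded in that generation


def pick_rollback(generations, deletion_sequence):
    """Newest retained generation committed before the deletion, or None."""
    older = [g for g in generations if g.sequence < deletion_sequence]
    return max(older, key=lambda g: g.sequence) if older else None


def resolve_pages(generation, raw_nand):
    """Map LBAs to page data, skipping pages that GC has already erased."""
    recovered = {}
    for lba, pba in generation.mapping.items():
        page = raw_nand.get(pba)
        # An erased page reads back as all 0xFF and holds nothing to recover.
        if page is not None and page != b"\xff" * len(page):
            recovered[lba] = page
    return recovered
```

If `pick_rollback` returns `None`, the second condition has failed. If `resolve_pages` returns an empty result, the first has.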
Retention Physics Bound the Powered-Off Recovery Window
Powering the drive off stops garbage collection, but it does not stop charge loss. An invalidated NAND page is still a programmed page; its electrons sit on a floating gate behind a tunnel oxide that has been stressed by every prior P/E cycle on that block. JEDEC JESD218 specifies the minimum retention the manufacturer must guarantee at End of Life: a client-class SSD must hold data for 1 year powered off at 30°C, an enterprise-class SSD for 3 months powered off at 40°C. Both numbers are floors at full P/E exhaustion, not steady-state behavior.
The Arrhenius rule governs the temperature dependence; retention roughly halves for every 5°C to 10°C rise in ambient. A drive shipped in a hot vehicle or stored on a warm shelf loses its window faster than the JEDEC floor suggests. Two physical mechanisms drive the decay on worn drives. Stress-Induced Leakage Current (SILC) is a weak steady leakage through trap sites in the tunnel oxide. Trap-Assisted Tunneling (TAT) is the more aggressive failure mode: trap sites align into a conductive path and electrons hop out of the floating gate orders of magnitude faster than through pristine oxide. A fresh drive can hold an invalidated page for years; a drive late in its rated P/E budget can drop below the LDPC correction threshold in months.
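The temperature dependence can be estimated with the standard Arrhenius acceleration factor. The sketch below is a back-of-the-envelope model, not a retention prediction for any specific drive: the 1.1 eV activation energy is an assumed value of the kind often used for NAND retention estimates, and real cells vary with wear and process.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(t_ref_c: float, t_hot_c: float, ea_ev: float = 1.1) -> float:
    """How much faster charge loss proceeds at t_hot_c than at t_ref_c.

    ea_ev is an assumed activation energy, not a measured one.
    """
    t_ref = t_ref_c + 273.15   # convert Celsius to kelvin
    t_hot = t_hot_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV * (1.0 / t_ref - 1.0 / t_hot))


def scaled_retention_days(base_days: float, t_ref_c: float, t_hot_c: float,
                          ea_ev: float = 1.1) -> float:
    """Scale a retention figure quoted at t_ref_c to a different temperature."""
    return base_days / acceleration_factor(t_ref_c, t_hot_c, ea_ev)
```

With the assumed 1.1 eV, a rise from 30°C to 40°C gives a factor a little under 4, which lands inside the halving-per-5°C-to-10°C range quoted above. A lower activation energy produces a gentler slope.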
Cell geometry compounds the problem. TLC partitions the cell voltage window into 8 Vt states; QLC partitions the same window into 16. The narrower QLC margins overlap under far less charge loss, so the powered-off recovery window on a QLC drive closes earlier than on TLC at equivalent wear. The controller's background refresh routine does not save these pages either; refresh scans skip pages the FTL has marked invalid, so retention loss on TRIMmed cells is never scrubbed. The practical implication for a customer is direct: ship the drive promptly and keep it cool. A drive sitting in a desk drawer at room temperature for a few days loses far less margin than the same drive baking in a vehicle cabin for the same period.
How the SSD Controller Manages Garbage Collection
NAND flash cannot overwrite data in place. The SSD must erase an entire block (128 to 256 pages, typically 256KB to 4MB) before writing new data to any page within that block. Garbage collection exists to prepare erased blocks for future writes by reclaiming blocks that contain a mix of valid and invalid pages.
Page-Level Writes, Block-Level Erases
Pages (4KB to 16KB) are the smallest unit the controller can read or write. Blocks (groups of 128 to 256 pages) are the smallest unit the controller can erase. When the OS modifies a file, the controller writes updated data to a new, clean page and marks the old page as invalid in the FTL. This out-of-place write model means blocks gradually accumulate invalid pages alongside valid ones.
The GC Cycle
- Identification: The controller selects a block with a high ratio of invalid pages. Controllers track valid/invalid ratios for every block on the drive.
- Valid page migration: The controller reads the remaining valid pages from the target block and writes them to a clean, pre-erased block.
- Block erase: The controller applies erase voltage to the entire target block, resetting all cells (both previously valid and invalid) to the unprogrammed state (0xFF). This step permanently destroys the deleted data.
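The three steps above can be sketched as a toy simulation. It is a simplification under stated assumptions: a greedy policy that picks the block with the most invalid pages, four pages per block, and `None` standing in for an erased cell. Real firmware also weighs wear leveling and is not public.

```python
from dataclasses import dataclass, field

PAGES_PER_BLOCK = 4  # tiny on purpose; real blocks hold 128 to 256 pages


@dataclass
class Block:
    data: list = field(default_factory=lambda: [None] * PAGES_PER_BLOCK)
    valid: list = field(default_factory=lambda: [False] * PAGES_PER_BLOCK)

    def invalid_count(self) -> int:
        # Programmed pages that the FTL no longer references.
        return sum(1 for d, v in zip(self.data, self.valid) if d is not None and not v)


def garbage_collect(blocks, free_block):
    """Run one GC cycle and return the erased victim block."""
    # 1. Identification: greedy choice of the block with the most invalid pages.
    victim = max(blocks, key=Block.invalid_count)
    # 2. Valid page migration: copy live pages into the pre-erased block.
    slot = 0
    for d, v in zip(victim.data, victim.valid):
        if v:
            free_block.data[slot] = d
            free_block.valid[slot] = True
            slot += 1
    # 3. Block erase: every page in the victim is reset, valid or not.
    victim.data = [None] * PAGES_PER_BLOCK
    victim.valid = [False] * PAGES_PER_BLOCK
    return victim
```

Until step 3 runs, the invalid pages in the victim block still hold their original contents. That interval is the recovery window this article describes.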
Background GC vs. Foreground GC
Background GC runs during idle periods without interrupting host I/O. The controller uses dedicated processing cores (Phison controllers use "CoX processors" for this) to clean blocks quietly. Foreground GC triggers when the pool of free blocks drops too low to handle incoming writes. The controller pauses host operations to erase blocks in real time, causing the latency spikes known as the "performance cliff." From a recovery perspective, foreground GC is the worst scenario: it erases TRIMmed blocks immediately to free space for new data.
Over-Provisioning and GC Timing
Over-provisioning (OP) is a reserve of NAND blocks hidden from the user. A drive with 10% OP always has a buffer of free blocks available for writes while background GC processes dirty blocks at its own pace. More OP means the controller can defer physical erasure for longer periods. A nearly full drive with minimal free space triggers aggressive foreground GC, erasing stale blocks within seconds of a TRIM command. Enterprise SSDs with 28% or more OP tend to have the most relaxed GC scheduling; consumer drives with 7% OP erase more aggressively.
Which Controllers Implement DZAT?
Every major SSD controller family implements DZAT or its NVMe equivalent (DLFEAT=001b). The differences that matter for data recovery are in GC timing: how quickly the controller physically erases TRIMmed blocks after the DZAT logical mask is in place.
Samsung Controllers (MKX, Elpis, Pascal)
Samsung designs controllers in-house: the MKX for the SATA 870 EVO, Elpis for the NVMe 980 PRO, and Pascal for the 990 PRO. Samsung firmware prioritizes aggressive GC for NAND endurance optimization. Background GC begins within seconds of idle time after receiving TRIM. Samsung's aggressive GC benefits drive longevity but creates the shortest recovery window of any controller family.
Phison Controllers (PS3111-S11, PS5012-E12, PS5018-E18)
Phison controllers power drives from Corsair, Kingston, Seagate, and Sabrent. Newer Gen4/Gen5 controllers (like the E18 and E26) use dedicated CoX processors for background GC, separating GC from host I/O processing. Older controllers like the PS3111-S11 have a documented firmware failure mode where GC operations cause the FTL to panic and throw the drive into SATAFIRM S11 safe mode. When this happens, GC freezes mid-cycle. The interrupted GC preserves NAND data that would otherwise have been erased, making PC-3000 recovery viable.
Silicon Motion Controllers (SM2258XT, SM2259XT, SM2262EN)
Silicon Motion controllers are used by Crucial, ADATA, and many OEM drives. The SM2259XT (a DRAM-less SATA controller) shows a favorable behavior for recovery: on a quick format, the controller erases the primary FTL translator but does not trigger a mass-GC cycle across the NAND array. The physical data persists on the memory cells. PC-3000 can access historical translator backups stored in the NAND service area and reconstruct the original directory structure. This makes Silicon Motion drives among the more recoverable after format events.
Marvell, Realtek, and Intel/Solidigm
Marvell controllers (enterprise-grade) follow strict DZAT compliance with steady, predictable GC timing. Realtek controllers appear in budget NVMe drives and can delay GC under heavy thermal throttling. Intel/Solidigm controllers enforce RZAT (Read Zero After TRIM) in enterprise environments where RAID parity consistency requires deterministic behavior across all drives in the array.
How Powering Off Preserves Data Before GC Completes
The gap between TRIM marking blocks as invalid and GC physically erasing those blocks is the only recovery window. Powering off the SSD immediately after data loss freezes the controller state and prevents GC from running.
- Power off the drive immediately. Disconnect the SSD from the system. Do not shut down the OS normally if you can avoid it; a normal shutdown gives the controller idle time to run GC before power is removed.
- Do not reconnect to a running OS. Plugging the SSD back into a running system sends queued TRIM commands and gives the controller idle time to resume GC. Even mounting the drive as read-only may trigger controller activity.
- Do not run recovery software. Consumer recovery tools send read commands through the standard interface. On a DZAT drive, these reads return zeroes (giving you no useful data). Worse, the drive is powered on and idle between read commands, allowing background GC to execute.
- Ship the drive powered off to a lab with PC-3000 SSD. The PC-3000 forces the controller into a diagnostic state that disables all background processes, including GC, before reading raw NAND.
If the SSD suffered a power loss or firmware failure before GC ran, the data is likely intact on the NAND. Firmware failures freeze the controller mid-operation, preventing all background tasks including GC. Drives stuck in safe mode (SATAFIRM S11, BSY state, 0-byte capacity) have not run GC since the failure occurred.
Controller-Level Garbage Collection Triggers
Garbage collection does not run on a wall-clock schedule. The controller fires erase cycles when one of three internal conditions is met: free-block pool exhaustion, write-amplification breach, or host idle-window detection. The order in which a specific drive hits these thresholds is what determines the practical recovery window. Powering the SSD off stops every one of these triggers from advancing.
The faster a deleted file enters the controller's erase queue, the shorter the window for a successful SSD data recovery. The trigger model is the same across major controller families. The thresholds and timing constants are not.
The Three Internal Trigger Conditions
- Free-block pool exhaustion
- The controller tracks a running count of erased blocks ready for new writes. When that count falls below a firmware-defined threshold, GC is scheduled to reclaim space. On consumer drives the threshold is typically a small single-digit percentage of total user-visible capacity; enterprise drives keep a larger reserve. Once the count drops far enough to coincide with incoming host writes, background GC escalates to foreground GC and erasure runs in real time.
- Write-amplification factor (WAF) breach
- The controller measures the ratio of physical NAND writes to logical host writes. If WAF climbs past a target, the controller compacts blocks that hold a high proportion of invalid pages, migrating the remaining valid pages and erasing the source block. WAF-driven GC tends to run during sustained write workloads, well before the free-block pool reaches its floor.
- Host idle-window detection
- The controller monitors the host command queue. After a configurable period of no incoming reads or writes, background GC begins erasing blocks the FTL has flagged as candidates. This is the trigger that closes the recovery window on a drive that has been deleted from, left powered on, and otherwise left alone. Samsung firmware reacts in seconds; Phison and Silicon Motion batch the work into longer idle intervals.
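A minimal sketch of how the three conditions might be evaluated together is shown below. Every threshold here (`free_floor`, `waf_target`, `idle_threshold_s`) is a hypothetical placeholder; actual values are firmware-private and differ by controller family and revision.

```python
from dataclasses import dataclass


@dataclass
class ControllerState:
    free_blocks: int
    total_blocks: int
    nand_bytes_written: int
    host_bytes_written: int
    idle_seconds: float


def gc_triggers(state, free_floor=0.03, waf_target=3.0, idle_threshold_s=5.0):
    """Return the names of the trigger conditions that currently hold.

    All three default thresholds are illustrative assumptions.
    """
    fired = []
    if state.free_blocks / state.total_blocks < free_floor:
        fired.append("free_block_pool")
    if (state.host_bytes_written > 0 and
            state.nand_bytes_written / state.host_bytes_written > waf_target):
        fired.append("waf_breach")
    if state.idle_seconds >= idle_threshold_s:
        fired.append("host_idle")
    return fired
```

A powered-off drive never reaches this evaluation at all, which is why removing power stops every trigger at once.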
Controller Family Aggressiveness Comparison
The table below summarizes the GC behavior of the controller families that appear most often on this bench. It is a survey of documented firmware behavior, not a guarantee for any one drive. Firmware revisions change scheduling constants.
| Controller Family | Dominant Trigger | Background GC Onset | Recovery Posture |
|---|---|---|---|
| Samsung in-house (MKX, Elpis, Pascal) | Host idle-window | Within seconds of idle | Shortest window. Power off the drive at the first sign of data loss. |
| Phison (PS3111-S11, PS5012-E12, PS5018-E18) | Batched WAF and idle window | Minutes, batched on Gen4/Gen5 CoX cores | Moderate window. PS3111-S11 SATAFIRM S11 freezes preserve invalidated pages indefinitely. |
| Silicon Motion (SM2258XT, SM2259XT, SM2262EN) | Free-block pool pressure | Deferred until pool falls below threshold | Longest window on lightly used drives. Multi-generation FTL translator snapshots survive in the service area. |
| Marvell and Intel/Solidigm enterprise | Steady WAF-driven scheduling | Predictable cadence with RAID parity guarantees | RZAT and DLFEAT=001b enforced; recovery still possible through Techno Mode before GC erases the source block. |
FTL Translator Behavior During an Erase Cycle
Once GC selects a victim block, the controller reads its valid pages and writes them to a clean destination block, updates the FTL mapping for those pages to point at the new physical block, and only then issues the block ERASE. The order matters for recovery. The instant the ERASE completes, the source block's physical pages are reset to 0xFF and the original logical-to-physical entries are dropped from the live FTL. The TRIM and DZAT protocol specifications describe the host-facing side of this transition; the controller-internal write, remap, erase sequence is firmware-private but its effect on the NAND is the same on every modern part. A drive shipped powered off before any of these sequences completes preserves both the source pages and the prior FTL generation in the service area, which is what makes the Techno Mode rollback procedure viable.
How PC-3000 SSD Reads Invalid NAND Pages
PC-3000 SSD bypasses the controller's standard SATA/NVMe interface entirely. Instead of asking the controller for logical data (which triggers DZAT zeroes), it communicates with the controller at the diagnostic level to access raw physical NAND pages that the FTL has marked as invalid but GC has not yet erased.
Safe Mode Entry and Loader Upload
The first step is preventing the controller from executing its standard firmware (and with it, background GC). Engineers use hardware techniques to force the controller into Safe Mode. With the primary firmware suspended, PC-3000 uploads a custom microcode loader directly into the controller's RAM. This loader grants access to Techno Mode.
Techno Mode: Freezing the Drive State
In Techno Mode, the SSD operates in a restricted, single-channel state. All background firmware tasks are disabled: no garbage collection, no wear leveling, no TRIM processing. The drive's forensic state is frozen. PC-3000 can now read raw Physical Block Addresses (PBAs) directly, bypassing the FTL mapping table and the DZAT logical mask.
FTL Translator Reconstruction
Raw NAND reads return fragmented data mixed with ECC parity bits and out-of-order blocks from wear leveling. To reconstruct usable files, the FTL translator must be rebuilt. Advanced SSD controllers (Silicon Motion SM2259XT, Phison E12) create backup copies of the FTL translator in the NAND service area as a safeguard against power loss. PC-3000 Data Extractor scans the raw NAND architecture for these historical translator backups, reverts the logical mapping to a version that existed before the deletion event, and rebuilds the original file system and directory structure.
This procedure depends on one condition: the physical NAND cells must still hold the original charge states. If GC has already erased the target blocks, the cells are reset to 0xFF and the data is gone. This is why immediate power-off is critical. For drives where the controller is dead beyond repair, chip-off NAND extraction desolders the NAND chips from the PCB for direct reading, but AES-256 encryption on many modern controllers means chip-off yields only ciphertext without the original controller to decrypt.
Why Chip-Off Fails After Block Erase
A NAND flash cell stores bits as trapped electrons on a floating gate isolated by two oxide layers. The number of electrons on that gate sets the cell's threshold voltage (Vt), which the sense amplifier measures to distinguish 0s from 1s. The physics of NAND cell programming and read is a charge-storage operation; cells retain their state because the oxide barrier traps the injected electrons.
The ERASE operation reverses this at the block level. The controller applies roughly 20V to the P-well substrate beneath every cell in the block and holds the control gates at ground. The resulting electric field pulls electrons off each floating gate through the tunnel oxide via Fowler-Nordheim tunneling. Each ERASE pulse stresses that tunnel oxide; over thousands of program/erase cycles the oxide accumulates trap sites that shorten the retention window for any data still resident on the chip. After ERASE, every cell in the block sits in the unprogrammed Vt distribution (near the erased state), read back as 0xFF. There is no residual charge pattern, no sub-threshold margin, no forensic technique available to a commercial lab that can recover the pre-erase program state. This is the mechanism that ends the recovery window. Before GC runs the ERASE, chip-off can read the raw programmed cells; after GC runs the ERASE, the chip holds no data to read.
LBA-to-PBA Mapping Before and After TRIM Issuance
The host operating system addresses an SSD by Logical Block Address (LBA), a flat numeric index that looks like a sector on a magnetic platter. The NAND chips underneath are organized by Physical Block Address (PBA): die, plane, block, page. The Flash Translation Layer is the lookup table that turns one into the other. The entire question of post-deletion recovery comes down to what that table looks like at three specific moments: before TRIM, immediately after TRIM, and after the erase cycle completes.
| Stage | FTL Entry for the LBA | PBA Page Contents | GC Queue Status |
|---|---|---|---|
| Before deletion | LBA n -> PBA (die, plane, block, page); valid | Programmed cells holding user data | Not queued |
| Immediately after TRIM (DATA SET MANAGEMENT 0x06) | LBA n unmapped from FTL; entry severed | Cells still programmed; charge intact | Containing block added to GC candidate queue with its valid/invalid page ratio recorded |
| After valid-page migration, before erase | Sibling valid pages remapped to a fresh PBA on a clean block | Original cells still programmed (invalid pages have not been touched) | Source block at head of erase queue; depth depends on free-block pressure |
| After block ERASE pulse | No live or historical entry points at the source PBA | All cells reset to unprogrammed state (0xFF); data destroyed at the physics layer | Block returned to free pool |
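The stages in the table can be traced with a toy model that separates the host-facing read path from the raw NAND contents. This is a deliberately simplified sketch: the class name `ToySSD` is hypothetical, erase is modeled per page rather than per block, and overwrites of a live LBA are not handled.

```python
class ToySSD:
    PAGE_SIZE = 4  # bytes; tiny for illustration

    def __init__(self):
        self.ftl = {}        # live map: LBA -> PBA
        self.nand = {}       # PBA -> programmed page contents
        self.gc_queue = []   # PBAs waiting for erase
        self._next_pba = 0

    def write(self, lba, data):
        pba = self._next_pba
        self._next_pba += 1
        self.nand[pba] = data
        self.ftl[lba] = pba
        return pba

    def trim(self, lba):
        # TRIM only severs the mapping and queues the page; NAND is untouched.
        pba = self.ftl.pop(lba, None)
        if pba is not None:
            self.gc_queue.append(pba)

    def host_read(self, lba):
        # DZAT: an unmapped LBA returns synthesized zeros without touching NAND.
        pba = self.ftl.get(lba)
        return bytes(self.PAGE_SIZE) if pba is None else self.nand[pba]

    def raw_read(self, pba):
        # Diagnostic-level read of the physical page, bypassing the FTL.
        return self.nand[pba]

    def run_gc(self):
        # Erase resets the cells to the unprogrammed state (0xFF).
        while self.gc_queue:
            self.nand[self.gc_queue.pop(0)] = b"\xff" * self.PAGE_SIZE
```

Between `trim` and `run_gc`, `host_read` and `raw_read` disagree about the same data. That disagreement is the whole basis of lab recovery on a DZAT drive.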
GC Queue Depth, Over-Provisioning, and Write Amplification
The GC queue is not a single FIFO; it is a multi-priority structure the controller walks every time it needs a free block. Each candidate block carries an internal score driven by its ratio of invalid pages to valid pages, its program/erase count relative to neighbor blocks (the wear-leveling input), and whether it sits in the user-visible region or in the over-provisioning reserve. A high-OP drive maintains queue depth in the thousands of blocks, which lets the controller pick a block with the highest invalid-page ratio every time. That choice minimizes the number of valid pages that must be migrated per reclaimed block, which directly suppresses write amplification.
Over-provisioning, wear leveling, and write-amplification factor (WAF) are connected through this queue. JEDEC JESD218 defines WAF as the ratio of NAND writes to host writes; every valid-page migration adds to the numerator without changing the denominator. Higher OP raises queue depth, which lowers the average migrations per reclaim, which lowers WAF, which lowers the rate at which fresh blocks must be reclaimed. The same loop governs the recovery window: with more free blocks in reserve, the controller can leave the block containing the invalidated pages at the bottom of the priority list across many host writes. Powering the drive off freezes the entire queue, regardless of where the source block sat.
A near-full drive inverts the picture. With OP behaving as zero, queue depth collapses to whatever blocks the controller can recycle on demand. Every host write forces the controller to pull the next candidate, migrate its valid pages, and erase the source. WAF climbs, free-block reserve stays at the floor, and the source block holding the deleted file moves to the head of the queue within seconds. This is the worst-case starting condition for any SSD recovery attempt.
DZAT Physics: ATA RETURN_ZERO, ACS-4 RZAT vs DRAT
DZAT is a controller-level read interceptor, not a NAND-level erase. The drive reports its post-TRIM contract to the host through the IDENTIFY DEVICE response (Word 69 Bit 5 for DZAT, Bit 14 for DRAT on SATA; the DLFEAT field at byte 33 of Identify Namespace on NVMe). When the host later reads a TRIMmed LBA, the controller's firmware short-circuits the read on the SATA or PCIe link and synthesizes a zero buffer. The NAND is never queried. The cells underneath can still be fully programmed with the user's original data.
ACS-4 (the T13 ATA Command Set 4 standard) carries TRIM inside the DATA SET MANAGEMENT command (opcode 0x06). The opcode itself is the same on every TRIM issuance; the post-TRIM read contract is set by the capability bits the drive advertises in its IDENTIFY DEVICE response. Two bits matter for SSD data loss analysis: Word 69 Bit 5 advertises the RZAT contract (Return Zero After Trim), and Word 69 Bit 14 advertises the DRAT contract (Deterministic Read After Trim). The host has no way to negotiate the contract per-command; whichever bit the drive set at boot is what the controller will enforce on every subsequent read to a TRIMmed LBA.
| Field | Location | Value | Meaning for the Host Read Path |
|---|---|---|---|
| TRIM supported | IDENTIFY DEVICE Word 169, Bit 0 | 1 | Drive accepts DATA SET MANAGEMENT 0x06 with the Trim feature set |
| DRAT | IDENTIFY DEVICE Word 69, Bit 14 | 1 | Reads to a TRIMmed LBA return a deterministic value across repeated reads (not necessarily zeros) |
| DZAT / RZAT | IDENTIFY DEVICE Word 69, Bit 5 | 1 | Reads to a TRIMmed LBA are guaranteed to return all zeros, synthesized by the controller |
| NVMe DLFEAT bits 2:0 | IDENTIFY NAMESPACE byte 33, bits 2:0 | 001b | Reads to a deallocated logical block return all zeros (NVMe analogue of DZAT) |
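The fields in the table can be decoded from raw identify buffers with a short helper. This is a sketch that assumes you already hold the 512-byte IDENTIFY DEVICE response and the Identify Namespace structure as bytes; the function names are hypothetical and the buffers below are synthetic. The labels for DLFEAT values other than 001b are taken from the NVMe base specification and should be checked against the revision your drive reports.

```python
import struct

def ata_word(identify: bytes, index: int) -> int:
    """Return 16-bit little-endian word `index` from an IDENTIFY DEVICE buffer."""
    return struct.unpack_from("<H", identify, index * 2)[0]


def decode_ata_trim_caps(identify: bytes) -> dict:
    if len(identify) != 512:
        raise ValueError("IDENTIFY DEVICE data is 512 bytes")
    w69 = ata_word(identify, 69)
    w169 = ata_word(identify, 169)
    return {
        "trim_supported": bool(w169 & (1 << 0)),  # Word 169, Bit 0
        "drat": bool(w69 & (1 << 14)),            # Word 69, Bit 14
        "rzat": bool(w69 & (1 << 5)),             # Word 69, Bit 5 (DZAT)
    }


def decode_nvme_dlfeat(identify_ns: bytes) -> str:
    """Interpret bits 2:0 of DLFEAT (byte 33 of Identify Namespace)."""
    value = identify_ns[33] & 0b111
    labels = {0b000: "not reported", 0b001: "all zeros", 0b010: "all 0xFF"}
    return labels.get(value, "reserved")


# Synthetic buffers for illustration; not captured from a real drive.
ident = bytearray(512)
struct.pack_into("<H", ident, 69 * 2, (1 << 14) | (1 << 5))
struct.pack_into("<H", ident, 169 * 2, 1)
caps = decode_ata_trim_caps(bytes(ident))

ns = bytearray(4096)
ns[33] = 0b001
dlfeat = decode_nvme_dlfeat(bytes(ns))
```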
Why the Zero Return Happens at the Controller, Not the NAND
On a read to a TRIMmed LBA, the SATA host adapter issues READ FPDMA QUEUED (0x60) or READ DMA EXT (0x25) with the target LBA in the command FIS. The firmware on the SSD controller receives that FIS, walks the FTL, and observes that the LBA carries no live mapping. With DZAT advertised in Word 69 Bit 5, the firmware does not attempt a NAND fetch. It allocates a buffer at the host interface DMA engine, fills it with 0x00 bytes for the requested transfer length, and returns the data over the SATA link as if it had been read from media. The cells the LBA used to point at are not addressed. Their charge state is irrelevant to the response. The same logic applies on NVMe through the Submission Queue / Completion Queue path when DLFEAT bits 2:0 = 001b. The controller is contractually obligated to return zeros to the host on any read to a TRIMmed LBA until the host writes new data to that LBA, regardless of what the NAND cells still hold.
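The read path described above can be modeled as a lookup that never reaches the media when the mapping is missing. This is a minimal sketch, assuming a dictionary FTL and 512-byte sectors; `ToyController` and its `nand_reads` counter are hypothetical names used only to make the behavior observable.

```python
class ToyController:
    """Hypothetical DZAT read path; sector size and names are illustrative."""

    SECTOR = 512

    def __init__(self):
        self.ftl = {}        # LBA -> physical block address (PBA)
        self.nand = {}       # PBA -> stored bytes
        self.nand_reads = 0  # counts actual media fetches

    def write(self, lba: int, data: bytes) -> None:
        pba = len(self.nand)  # next unused PBA in this toy model
        self.nand[pba] = data
        self.ftl[lba] = pba

    def trim(self, lba: int) -> None:
        self.ftl.pop(lba, None)  # unmap only; the NAND entry is left alone

    def read(self, lba: int) -> bytes:
        pba = self.ftl.get(lba)
        if pba is None:
            # No live mapping: synthesize zeros without touching NAND.
            return bytes(self.SECTOR)
        self.nand_reads += 1
        return self.nand[pba]


ctrl = ToyController()
ctrl.write(100, b"\x42" * 512)
ctrl.trim(100)
host_view = ctrl.read(100)  # what any host-side tool receives
raw_cells = ctrl.nand[0]    # what the media still holds in this model
```

The host sees a zero-filled sector and `nand_reads` stays at zero, while the stored bytes remain in place until something erases them.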
The mathematical reason this contract exists is RAID parity. A parity-redundant array (RAID 5, RAID 6, NVMe over RDMA arrays) computes parity across stripes from every member drive's data. If a TRIMmed sector on one drive returns non-deterministic content across reads, parity computed during a consistency check will not match parity computed during a rebuild, and the array silently corrupts. Enterprise drives must guarantee zeros for TRIMmed sectors so parity calculations stay consistent. The side effect for the home user is the same: the software-level read path returns zeros instantly, irrespective of NAND state.
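A small XOR example shows why the read contract matters for parity. This is a sketch of single-parity (RAID 5 style) arithmetic only; the stripe contents, including the "stale" bytes a non-deterministic drive might return, are invented for illustration.

```python
from functools import reduce

def xor_parity(stripes):
    """XOR equal-length byte strings column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*stripes))


d0 = bytes([0x11, 0x22, 0x33, 0x44])
d2 = bytes([0xA0, 0xB0, 0xC0, 0xD0])

# Member 1 holds a TRIMmed sector. With a zero contract, every read agrees.
trimmed_read_1 = bytes(4)
trimmed_read_2 = bytes(4)
parity_check = xor_parity([d0, trimmed_read_1, d2])
parity_rebuild = xor_parity([d0, trimmed_read_2, d2])
consistent = parity_check == parity_rebuild

# Without a deterministic contract, a later read may return different bytes.
stale_read = bytes([0xDE, 0xAD, 0xBE, 0xEF])
mismatch = xor_parity([d0, stale_read, d2]) != parity_check
```

When both reads of the TRIMmed sector return zeros, the two parity computations match; if one read returns anything else, they diverge.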
Why No Software-Only Tool Can Recover Data After TRIM
Consumer recovery software (Disk Drill, EaseUS, Recuva, R-Studio, DMDE, UFS Explorer in its standard scanning mode, PhotoRec) works by issuing standard READ commands through the operating system's block layer. The OS hands those reads to the AHCI or NVMe driver, the driver pushes them onto the drive's command queue, and the controller responds. On any SSD with DZAT or DLFEAT bits 2:0 = 001b, every one of those reads to a TRIMmed LBA returns zeros from the controller firmware. The software sees a flat field of zeros and reports the data as unrecoverable. The result is the same whether the NAND still holds the original charge states or whether GC erased the block ten seconds ago. None of these tools can see past the controller's synthesized zero buffer.
| Tool | Where It Reads From | What It Sees on a DZAT/DLFEAT=001b Drive |
|---|---|---|
| R-Studio (standard mode) | OS block device through AHCI/NVMe driver | Zeros for every TRIMmed LBA; finds nothing to carve |
| DMDE | OS block device | Zeros for every TRIMmed LBA; partition rebuild against an empty image |
| UFS Explorer (standard mode) | OS block device | Zeros; cannot reach controller diagnostic interface |
| Disk Drill, EaseUS, Recuva, PhotoRec | OS block device with signature carving | No file signatures present; carve returns empty |
| PC-3000 SSD (Techno Mode) | Controller diagnostic interface; raw PBA reads below the FTL | Programmed cells if GC has not erased; zero-state cells if it has |
The tools are not broken. They are doing exactly what their architecture allows. They read what the controller hands them. On a healthy drive with TRIM disabled and a logical-only failure (accidental delete on a system that never sent TRIM, a corrupted partition table, a botched file system check), R-Studio and DMDE can still produce results because the FTL never unmapped the LBAs and the controller still serves the original NAND contents. That is a logical recovery scenario, not a post-TRIM scenario. The moment TRIM runs and the drive enforces DZAT, the software-only path is closed. Recovery moves to TRIM-aware lab procedures that bypass the controller's standard interface, either through PC-3000 SSD diagnostic-mode entry or through chip-off NAND extraction when the controller is dead and chip-off is viable for the controller family (which excludes encrypted modern NVMe drives whose AES-256 key lives in controller silicon).
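The difference between the two scenarios is easy to see from the carving side. The following sketch assumes a tool that only searches a raw image for file signatures; the `carve` function is a simplified stand-in, not the algorithm of any product named above, and both images are synthetic.

```python
# A few well-known file signatures (magic bytes) used for carving.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF-",
}


def carve(image: bytes):
    """Return (offset, type) for every signature hit in a raw image."""
    hits = []
    for name, magic in SIGNATURES.items():
        start = 0
        while (pos := image.find(magic, start)) != -1:
            hits.append((pos, name))
            start = pos + 1
    return sorted(hits)


def is_zero_filled(image: bytes) -> bool:
    return image.count(0) == len(image)


# What the block layer returns for TRIMmed LBAs on a DZAT drive.
post_trim_image = bytes(1024 * 1024)
# What the same range could return had TRIM never been sent (synthetic data).
untrimmed_image = bytes(4096) + b"\xff\xd8\xff\xe0" + bytes(4092)

post_trim_hits = carve(post_trim_image)
untrimmed_hits = carve(untrimmed_image)
```

The carver behaves identically in both cases; only the bytes the controller hands back differ, so the post-TRIM image yields no hits at all.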
The Narrow Window Between Deletion and Physical Erase
The interval where recovery is still physically possible is the gap between TRIM's FTL unmap and the controller's block ERASE pulse. That gap is measured in seconds, minutes, or hours, never in days for an active drive. Five variables decide where on that range a specific situation falls. Powering the drive off stops the clock on every one of them.
The Five Variables That Set the Window
- Idle time after deletion. Background GC fires on idle-window detection. A drive left powered on with no host I/O after a delete reaches the idle threshold within seconds on aggressive firmware and within a longer batched window on Phison and Silicon Motion parts. A drive disconnected at the SATA or M.2 socket never reaches the threshold at all.
- Host write activity. Sustained writes (a large copy, an OS update, a backup job running across the same volume) push the controller into foreground GC. Foreground erasure can resolve a TRIMmed block within milliseconds of the write that demanded the space. This is the path that closes the window fastest.
- Over-provisioning ratio. A 28% OP enterprise drive holds a deep queue of clean reserve blocks; the source block sits low in the GC priority list and survives across many host writes. A 7% consumer drive operating near full capacity has effectively zero OP and pulls the source block into the erase queue on the next write.
- Controller vendor heuristics. Firmware sets the trigger thresholds. Phison batches GC into longer intervals on Gen4 and Gen5 parts with dedicated background cores; Silicon Motion defers GC until the free-block pool drops below its watermark and retains multi-generation FTL translator snapshots in the service area, which keeps Techno Mode rollback viable on lightly used drives.
- GC interruption by power loss or firmware fault. A sudden power cut, a PMIC failure, or a firmware crash freezes GC mid-cycle. The source block's pages stay programmed. Drives that arrive in SATAFIRM S11 safe mode, in BSY state, or reporting 0-byte capacity after an electrical event have stopped running GC since the fault occurred; their NAND is the most preserved state a lab will see outside of an immediate power-off.
How These Variables Stack in Practice
The window is not additive. Foreground GC under sustained writes will close it in seconds regardless of OP. A drive disconnected at the moment of the deletion event keeps it open indefinitely, regardless of vendor heuristics. The middle of the distribution (a drive left powered on at idle, with moderate OP, no fault) is where vendor behavior matters most: a Phison batched scheduler buys minutes; an aggressive idle-detect firmware closes the window inside that same minute. The correct response in every case is the same. Disconnect power at the wall or the cable, stop the OS from completing a clean shutdown if you can intercept it, and ship the drive powered down. Anything else is a bet against the controller's own scheduler. For drives where the controller has already failed, the recovery path runs through board-level SSD recovery or through NAND-degradation analysis if the failure mode is retention loss rather than active GC.
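The precedence among these variables can be written as a short decision function. This is a qualitative sketch only: the function name, its boolean inputs, and the returned labels are assumptions that mirror the ordering in the paragraph above, and it makes no attempt to predict real timings for a specific drive.

```python
def recovery_window(powered_on: bool, sustained_writes: bool,
                    batched_gc_firmware: bool) -> str:
    """Return a qualitative label; earlier checks override later ones."""
    if not powered_on:
        return "open indefinitely"   # GC cannot run without power
    if sustained_writes:
        return "seconds"             # foreground GC, regardless of OP
    if batched_gc_firmware:
        return "minutes"             # idle drive, batched scheduler
    return "under a minute"          # idle drive, aggressive idle-detect


unplugged = recovery_window(False, sustained_writes=True, batched_gc_firmware=False)
busy = recovery_window(True, sustained_writes=True, batched_gc_firmware=True)
idle_batched = recovery_window(True, sustained_writes=False, batched_gc_firmware=True)
```

The ordering of the checks is the point: power state dominates everything, write pressure dominates firmware heuristics, and vendor behavior only matters once the drive is powered and idle.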
SSD Garbage Collection Recovery FAQ
Can you recover data from an SSD after garbage collection?
Why does my data recovery software show zeroes on my SSD?
How long does garbage collection take to erase deleted SSD data?
What is the difference between TRIM and garbage collection?
What does DZAT mean for SSD data recovery?
Should I power off my SSD immediately after losing data?
Can PC-3000 SSD bypass DZAT and read the raw NAND?
Does over-provisioning affect how fast garbage collection erases data?
Does disabling TRIM prevent garbage collection?
What is the exact ATA command that sends a TRIM to an SSD?
How does over-provisioning give data a longer life after TRIM?
What specific events trigger SSD garbage collection?
Which SSD controller families erase deleted data fastest after TRIM?
Does enterprise NVMe Deallocate always return zeroes like consumer DZAT?
Where does the zero-return happen on a DZAT drive: in the NAND or in the controller?
Why does chip-off recovery fail after garbage collection runs?
SSD Not Responding? Get a Free Evaluation.
Garbage collection may not have erased your data yet. Power off the drive and ship it to our Austin lab. SATA and NVMe SSD recovery both start at $200. No data, no charge.
