ever when it comes to SSDs. The founding assumption of log-structured file systems - that reads are cheap and writes are expensive - is emphatically true for the bare-metal building blocks of SSDs, NAND-based flash. (For the rest of this article, "flash" refers to NAND-based flash and SSD refers to a NAND-based flash device with a wear-leveling, write-gathering flash translation layer.) When it comes to flash, reads may be done at small granularities - a few hundreds of bytes - but writes must be done in large contiguous blocks - on the order of tens of thousands or hundreds of thousands of bytes. A write to flash takes two steps: First the entire block is cleared, setting all the bits to the same value (usually 1, counter-intuitively). Second, individual bits in the block are flipped back to 0 until you get the block you wanted. Garbage collection • Data in a block becomes free as sectors within a block are written. – For example, block A contains sectors X, Y, and Z. When the host writes X, the new location for X is in Block B. The version of X in block A is now “stale” and represents free data (i.e. garbage). • Garbage Collection – When the number of “Free” Flash Blocks reaches a low level, blocks need to be freed. • The block with the most free data is selected and any valid data in the block is rewritten. • Garbage Collection can occur concurrently with data being written by the host. • Synonymous with “Recycling” In an HDD system, the Operating System (OS) can simply request that new data be written to the same location where the older, now invalid data, is stored, and the HDD will directly overwrite the old data. In an SSD, however, the page must first be erased before it can be written to locations previously holding data the SSD cannot directly overwrite existing data as stated earlier. The OS understands the files, their structure, and the logical locations where they are stored, but does not understand the physical storage structure of the storage device. In any storage system, the storage device doesn’t know the file structure it simply knows that there are bytes of data written in specific sectors. The storage system, whether SSD or HDD, returns the data from
the corresponding logical locations. When the OS deletes the file, it simply marks the space used for that data as free in its logical data table. With HDDs, the OS does not need to tell the storage device anything about the deletion because it would simply write something new into that same physical location in the future. In the case of an SSD, it only becomes aware that the data is deleted (or invalid) when the OS tries to write to that location again. At that time the SSD marks the old data as invalid and it writes the new data to a new physical location. It may also perform GC at that same time, but that varies between SSD architectures and other conditions at that moment.