Understanding how SSDs and flash memory work.
What’s inside SSDs and memory sticks/cards?
NAND flash based media are non-volatile storage devices. Flash-based storage is found in camera memory cards, USB pen drives, camcorders, mobile phones, dictaphones, tablets such as the iPad, and solid-state drives (SSDs).
As the cost per gigabyte decreases, these devices are becoming increasingly popular in roles where hard disk drives would otherwise be used. The most significant difference between NAND flash technology and hard disk technology is that the former has no mechanical components.
On the right is an introductory video courtesy of ExplainingComputers.com, which may prove a useful introduction to SSD & memory stick/card technology.
Under the cover
Here we have two images of flash-based media with their outer packaging removed. The SSDs and memory sticks that Cheadle Data Recovery receives for recovery have usually suffered a failure of the controller chip. This is the component that allows the storage device to interact with the computer and that controls the way data is distributed across the NAND flash chips. The controller converts requests for logical sectors (LBA values) into physical locations on the actual flash memory chips. It also runs a wear-levelling algorithm, which extends the working lifetime of the device.
When a controller chip fails, one of the most challenging aspects is working out where the data is spread across the flash chips. Data is not recorded in a simple linear fashion. The controller is designed to increase performance, which means spreading data across multiple locations rather than keeping related data in sequential blocks. As such, recovery of data from these devices can be extremely challenging.
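The idea of a controller translating logical sectors into scattered physical locations can be illustrated with a small sketch. This is a toy model, not the algorithm of any real controller: the class name `ToyFlashTranslationLayer`, the round-robin placement across chips, and the chip and page counts are all assumptions made for illustration.

```python
class ToyFlashTranslationLayer:
    """Toy logical-to-physical map; not any real controller's algorithm."""

    def __init__(self, num_chips, pages_per_chip):
        self.num_chips = num_chips
        self.pages_per_chip = pages_per_chip
        self.mapping = {}                    # LBA -> (chip, page)
        self.next_free = [0] * num_chips     # next unwritten page on each chip
        self.next_chip = 0                   # round-robin pointer

    def write(self, lba):
        """Place the sector on the next chip in rotation and record where it went."""
        chip = self.next_chip
        page = self.next_free[chip]
        if page >= self.pages_per_chip:
            raise RuntimeError("chip full; a real controller would garbage-collect")
        self.next_free[chip] += 1
        self.next_chip = (chip + 1) % self.num_chips
        self.mapping[lba] = (chip, page)     # any older copy is now stale
        return chip, page

    def read(self, lba):
        """Look up where the latest copy of a logical sector lives."""
        return self.mapping[lba]


ftl = ToyFlashTranslationLayer(num_chips=4, pages_per_chip=8)
for lba in (0, 1, 2, 3, 1):                  # LBA 1 is rewritten at the end
    ftl.write(lba)
locations = {lba: ftl.read(lba) for lba in range(4)}
print(locations)
```

After the rewrite, LBA 1 no longer sits between LBA 0 and LBA 2 on the media, and the only record of where it went is the mapping table. If that table is lost along with the controller, the layout has to be reconstructed from the raw chips, which is what makes recovery difficult.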
Flash memory stores information in an array of memory cells made from floating-gate transistors. In traditional single-level cell (SLC) devices, each cell stores only one bit of information. Some newer flash memory, known as multi-level cell (MLC) devices, including triple-level cell (TLC) devices, can store more than one bit per cell by choosing between multiple levels of electrical charge to apply to the floating gates of its cells.
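As a rough illustration of why more bits per cell is harder to achieve: storing n bits in one cell requires the cell to distinguish 2^n separate charge levels. The helper name `charge_levels` below is hypothetical and exists only for this sketch.

```python
def charge_levels(bits_per_cell):
    """Number of distinct charge levels needed to encode the given bits."""
    return 2 ** bits_per_cell


# Bits per cell for the three cell types named in the text.
levels = {name: charge_levels(bits)
          for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3))}
print(levels)
```

Each extra bit doubles the number of levels that must fit within the same charge range, so the margin between adjacent levels shrinks.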
NAND is similar to a hard-disk drive. It’s sector-based (page-based) and suited for storing sequential data such as pictures, audio, or PC data. Although random access can be accomplished at the system level by shadowing the data to RAM, doing so requires additional RAM storage. Also, like a hard disk, NAND devices have bad blocks, and require error-correcting code (ECC) to maintain data integrity. Due to the decrease in die area resulting from the small cell size, NAND provides the larger capacities required for today’s low-cost consumer market. NAND flash is used in almost all removable memory cards.
NAND basic operation
The 2-Gbit NAND device is organized as 2048 blocks, with 64 pages per block. Each page has 2112 bytes total, comprised of a 2048-byte data area and a 64-byte spare area. The spare area is typically used for ECC, wear-leveling information, and other software overhead functions, although it’s physically no different from the rest of the page. NAND devices are offered with either an 8- or 16-bit interface. Host data is connected to the NAND memory through a bidirectional data bus, 8 or 16 bits wide. In 16-bit mode, commands and addresses use only the lower 8 bits. The upper 8 bits are only used during data-transfer cycles.
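The geometry figures above can be checked with a little arithmetic. The sketch below uses only the numbers quoted in the text; the flat page index passed to `split_page_number` is an assumption for illustration, not the device's actual addressing scheme.

```python
BLOCKS = 2048            # blocks per device
PAGES_PER_BLOCK = 64     # pages per block
DATA_BYTES = 2048        # data area per page
SPARE_BYTES = 64         # spare area per page

page_bytes = DATA_BYTES + SPARE_BYTES                 # total bytes per page
total_pages = BLOCKS * PAGES_PER_BLOCK                # pages in the device
data_gbit = total_pages * DATA_BYTES * 8 / 2**30      # data capacity in Gbit


def split_page_number(page_number):
    """Split a flat page index into (block, page within that block)."""
    return divmod(page_number, PAGES_PER_BLOCK)


print(page_bytes, total_pages, data_gbit)
print(split_page_number(130))
```

The data areas alone add up to exactly 2 Gbit; the spare areas come on top of that figure.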
Erasing a block requires about 2 ms. Once the data is loaded in the register, programming a page requires about 300 µs. A page read requires approximately 25 µs, during which the page is accessed from the array and loaded into the 16,896-bit register. The register is then available for the user to clock out the data.
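To put these timings in perspective, here is a back-of-the-envelope calculation using only the figures quoted above (microseconds throughout). It deliberately ignores the time needed to clock data across the bus, which depends on an interface speed the text does not give, so the results are lower bounds rather than real-world figures.

```python
PAGE_BYTES = 2112        # 2048 data bytes + 64 spare bytes
PAGES_PER_BLOCK = 64
T_ERASE_US = 2000        # about 2 ms per block erase
T_PROGRAM_US = 300       # per page, once data is in the register
T_READ_US = 25           # per page, array to register

register_bits = PAGE_BYTES * 8                        # one full page in bits
block_read_us = PAGES_PER_BLOCK * T_READ_US           # array access for a block
block_program_us = PAGES_PER_BLOCK * T_PROGRAM_US     # programming a whole block
rewrite_block_us = T_ERASE_US + block_program_us      # erase, then reprogram

print(register_bits, block_read_us, block_program_us, rewrite_block_us)
```

The register holds exactly one page including its spare area, and rewriting a whole block costs over ten times as much array time as reading it.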
Multi-level cell
A multi-level cell (MLC) stores two bits per cell, whereas a traditional SLC stores only one. MLC technology has obvious density advantages. However, it doesn’t offer the speed or reliability of its SLC counterpart. Because of this, SLC is used in most media cards and wireless applications, while MLC devices are typically found in consumer and other low-cost products.
Error Checking
As mentioned, NAND requires ECC to ensure data integrity. NAND flash includes extra storage on each page. The extra storage is the spare area of 64 bytes (16 bytes per 512-byte sector). This area can store the ECC code as well as other information like wear-leveling or logical-to-physical block-mapping. ECC can be performed in hardware or software, but hardware implementation provides an obvious performance advantage. During a programming operation, the ECC unit calculates the error-correcting code based on the data stored in the sector. The ECC code for the respective data area is then written to the respective spare area. When the data is read out, the ECC code is also read, and the reverse operation is applied to check that the data is correct.
It’s possible for the ECC algorithm to correct data errors. The number of errors that can be corrected depends on the correction strength of the algorithm used. Including ECC in hardware or software provides a robust system-level solution. Simple Hamming codes provide the easiest hardware implementation, but can only correct single-bit errors. Reed-Solomon codes can provide a more robust error correction and are used on many of today’s controllers. Also, BCH codes are becoming popular due to their improved efficiency over Reed-Solomon.
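To make single-bit correction concrete, here is a minimal sketch of a Hamming(7,4) code, which protects 4 data bits with 3 parity bits. This is a textbook toy rather than what a NAND controller runs: real implementations work over whole sectors, and the function names here are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits as the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4        # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4        # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4        # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data bits, error position 1-7, or 0 if no error was seen)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome names the flipped position
    if position:
        c[position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position


stored = hamming74_encode([1, 0, 1, 1])
corrupted = list(stored)
corrupted[4] ^= 1                     # simulate a single-bit read error
recovered, error_position = hamming74_decode(corrupted)
print(stored, corrupted, recovered, error_position)
```

The parity bits play the role of the ECC code written to the spare area at program time, and the syndrome check is the "reverse operation" applied on read. If two bits flip, this code miscorrects, which is why stronger codes such as Reed-Solomon or BCH are used where more errors are expected.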
Software is needed to perform the NAND flash’s block management. This software is responsible for wear-leveling or logical-to-physical mapping. The software may also provide the ECC code if the processor does not include ECC hardware.
It’s important to read the status register after a program or erase operation, as it confirms successful completion of the operation. If the operation wasn’t successful, the block should be marked bad and no longer used. Previously programmed data should be moved out of the bad block into a new (good) block. The spec for a 2-Gbit NAND device states that it could have up to 40 bad blocks, a number that applies throughout the device’s life (nominally 100,000 program/erase cycles). Due mostly to their large die size, NAND devices can ship from the factory with some bad blocks. The software managing the device is responsible for mapping the bad blocks and replacing them with good blocks.
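The bad-block handling described above might be sketched as follows. The class `ToyBadBlockManager`, its method names, and the choice to reserve the last 40 blocks as spares are assumptions for illustration only; real block-management software also copies the previously programmed data across, which is omitted here.

```python
class ToyBadBlockManager:
    """Toy remapping table: retire failed blocks and substitute spares."""

    def __init__(self, num_blocks, num_spares):
        usable = num_blocks - num_spares
        self.spares = list(range(usable, num_blocks))  # reserved spare blocks
        self.bad = set()                               # retired physical blocks
        self.remap = {}                                # logical -> spare block

    def physical(self, block):
        """Physical block currently backing a logical block."""
        return self.remap.get(block, block)

    def report_status(self, block, ok):
        """Call after a program/erase with the status-register result."""
        if ok:
            return self.physical(block)
        self.bad.add(self.physical(block))             # never use it again
        if not self.spares:
            raise RuntimeError("no spare blocks left")
        self.remap[block] = self.spares.pop(0)
        return self.remap[block]


mgr = ToyBadBlockManager(num_blocks=2048, num_spares=40)
new_home = mgr.report_status(7, ok=False)              # block 7 failed to program
print(new_home, mgr.physical(7), sorted(mgr.bad))
```

Keeping the remap table separate from the bad-block set means a replacement block that later fails can itself be retired and replaced in the same way.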
Failure
While SSDs appear to be more reliable than HDDs, researchers at the Center for Magnetic Recording Research “are adamant that today’s SSDs aren’t an order of magnitude more reliable than hard drives”. SSD failures are often catastrophic, with total data loss. While HDDs can fail in this manner as well, they often give warning that they are failing, allowing much or all of their data to be recovered. Additionally, the robustness of an SSD varies greatly between models.
Traditional hard drives store their data in a linear, ordered manner. SSDs, however, constantly rearrange their data for the purpose of wear leveling, while keeping track of where each piece is located. As such, the flash memory controller and its firmware play a critical role in maintaining data integrity. One major cause of data loss in SSDs is firmware bugs, which rarely cause problems in HDDs.