More and more computing platforms, such as smartphones, tablet PCs, and datacenters, are equipped with flash memory because of its many advantages, such as shock resistance, fast access, and low power consumption. However, its distinguishing characteristics, including erase-before-update, asymmetric read/write/erase costs, and a limited number of erase cycles, make it necessary to reconsider existing database designs in order to exploit the hardware's potential. For example, a buffer replacement scheme for flash-based databases should consider not only the cache hit ratio but also the relatively heavy write and erase costs caused by flushing dirty pages. Frequent changes to the B+-trees of database systems can degrade performance and shorten the lifespan of flash memory. Furthermore, reliable erasing of data from storage devices is a critical component of secure data management. It is well understood for magnetic disks, but flash memory has unusual electronic limitations that make in-place updating impossible.
Most recent studies on buffer design focus on a clean-first LRU (Least Recently Used) strategy that evicts clean pages before dirty pages in order to minimize write accesses to flash. However, none of them distinguishes among cached pages that may affect the flash device differently under various storage managers. Meanwhile, most state-of-the-art studies on flash-aware index design focus mainly on buffer and storage mechanisms for obtaining efficient I/O to flash memory.
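As a point of reference, the clean-first LRU idea mentioned above can be sketched roughly as follows. This is a minimal illustration of the generic strategy, not the TSLA scheme proposed in this dissertation; the class name `CleanFirstLRU`, the `flash_writes` counter, and the assumption that the capacity is at least one page are all hypothetical choices made for the example.

```python
from collections import OrderedDict


class CleanFirstLRU:
    """Illustrative clean-first LRU buffer (hypothetical, not TSLA)."""

    def __init__(self, capacity):
        # Assumption: capacity is a positive number of pages.
        self.capacity = capacity
        # Maps page_id -> dirty flag, ordered from least to most recently used.
        self.pages = OrderedDict()
        # Counts dirty evictions, each of which would cost a flash write.
        self.flash_writes = 0

    def access(self, page_id, write=False):
        if page_id in self.pages:
            # Hit: a page stays dirty once it has been written.
            dirty = self.pages.pop(page_id) or write
        else:
            # Miss: make room first if the buffer is full.
            if len(self.pages) >= self.capacity:
                self._evict()
            dirty = write
        # Re-inserting places the page at the most recently used end.
        self.pages[page_id] = dirty

    def _evict(self):
        # Prefer the least recently used clean page: dropping it costs no write.
        victim = next((p for p, d in self.pages.items() if not d), None)
        if victim is None:
            # Only dirty pages remain: flush the least recently used one.
            victim = next(iter(self.pages))
            self.flash_writes += 1
        del self.pages[victim]
```

With a two-page buffer, writing page 1, reading page 2, and then reading page 3 evicts the clean page 2 even though page 1 is older, so no flash write occurs. Plain LRU would have flushed page 1 instead. The sketch treats every dirty page as equally expensive to flush, which is exactly the simplification the second sentence of the paragraph above criticizes.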
In this dissertation, I first propose a three-state log-aware buffer management scheme, called TSLA, which considers not only the imbalance between the read and write costs of flash memory but also the log block thrashing, associativity, and space utilization problems of log-based FTLs (flash translation layers). I then introduce two concepts, lazy-split and modify-two-node, which make possible the construction of a novel index solution, the Lazy-Split B+-tree (LSB+-tree). Specifically, lazy-split reduces the number of node splits, and modify-two-node reduces the number of node modifications. I also present a group round-robin based B+-tree index storage scheme (GRR), which applies dynamic grouping and round-robin techniques for erase-minimized storage of B+-trees in flash memory under update-heavy workloads. Lastly, this dissertation investigates a secure deletion and modification module, ESK, to improve both the information security and the erasing reliability of flash-based systems.
Experimental results show that the proposed TSLA buffer solution effectively reduces the garbage collection overhead under various FTLs, such as BAST, FAST, and IPL. GRR is efficient for frequently changing B+-tree structures and improves I/O performance by up to 2.14x compared to related work. The ESK module improves the level of information safety and reduces the number of page copies and block erases incurred by reliable erasing.