Data has overwhelmed the digital world in terms of volume, variety, and velocity. Individuals, business organizations, computational science simulations, and experiments produce huge volumes of data on a daily basis. Often, this data is shared among geographically distributed data centers for storage and analysis. Data transfer tools, however, face unprecedented challenges in moving such huge volumes of data across geo-distributed data centers in a timely manner.
Faults are one of the major challenges in distributed environments: hardware, network, and software components may fail at any instant. High-speed, fault-tolerant data transfer frameworks are therefore vital for moving data efficiently between data centers. In this thesis, we propose a novel Bloom filter-based data-aware probabilistic fault tolerance (DAFT) mechanism to recover efficiently from such failures. We also propose a data- and layout-aware fault tolerance (DLFT) mechanism to handle DAFT's false-positive matches effectively. We evaluate the data transfer and recovery time overheads that the proposed fault tolerance mechanisms impose on overall data transfer performance. The experimental results demonstrate that DAFT and DLFT recover from faults efficiently while minimizing memory, storage, computation, and recovery time overheads; furthermore, we observe negligible impact on overall data transfer performance.
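To make the idea concrete, the following is a minimal sketch, not the implementation evaluated in this thesis, of how a Bloom filter can act as a compact, periodically persisted log of completed transfer blocks: after a crash, only blocks absent from the filter are retransmitted. The names BloomFilter and blocks_to_retransmit are illustrative, and the sizing formulas are the standard ones for Bloom filters.

    import hashlib
    import math

    class BloomFilter:
        """Minimal Bloom filter: k hash probes into an m-bit array."""

        def __init__(self, n_items, fp_rate=0.01, seed=b""):
            # Standard sizing: m = -n*ln(p)/(ln 2)^2 bits, k = (m/n)*ln 2 hashes.
            self.m = max(1, math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2))
            self.k = max(1, round(self.m / n_items * math.log(2)))
            self.seed = seed          # distinct seeds yield independent filters
            self.bits = bytearray((self.m + 7) // 8)

        def _probes(self, item):
            # Double hashing: derive all k probe positions from one SHA-256 digest.
            d = hashlib.sha256(self.seed + item).digest()
            h1 = int.from_bytes(d[:8], "big")
            h2 = int.from_bytes(d[8:16], "big") | 1
            return ((h1 + i * h2) % self.m for i in range(self.k))

        def add(self, item):
            for p in self._probes(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def __contains__(self, item):
            return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._probes(item))

    def blocks_to_retransmit(all_block_ids, completed):
        """Resume a failed transfer: skip block IDs recorded in the filter.

        Bloom filters never yield false negatives, so every unfinished
        block is retransmitted; a false positive wrongly skips a block,
        which is exactly the case a layout-aware cross-check must catch.
        """
        return [b for b in all_block_ids if b not in completed]

In this sketch, the filter costs a few bits per block regardless of block size, which is why such a log remains cheap to persist even for very large transfers.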
Protecting the integrity of data against failures of the various intermediate components on the end-to-end data transfer path is a salient requirement of big data transfer tools. Although most of these components provide some degree of data integrity, they are either too expensive or inefficient at recovering corrupted data. This necessitates application-level end-to-end integrity verification during data transfer. However, owing to the sheer size of the data, supporting end-to-end integrity verification in big data transfer tools incurs computation, memory, and storage overheads. In this thesis, we propose a cross-referencing Bloom filter-based data integrity verification framework for big data transfer systems. This framework has three advantages over state-of-the-art data integrity techniques: lower computation overhead, lower memory overhead, and zero false-positive errors for a bounded number of elements. We evaluate the computation, memory, recovery time, and false-positive overheads of the proposed framework and compare them with state-of-the-art solutions. The evaluation results show that the framework detects and recovers from integrity errors efficiently while eliminating the false positives of the Bloom filter data structure. In addition, we observe negligible computation, memory, and recovery overheads for all workloads.
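To illustrate the verification step, again as a sketch under stated assumptions rather than the framework itself, destination-side checksums can be tested against two source-side Bloom filters built with independent hash seeds. Because a Bloom filter has no false negatives, genuine corruption is always detected; the residual risk is a false-positive match that lets a corrupted file pass, and requiring agreement from both filters shrinks that probability (the cross-referencing design proposed in this thesis eliminates it entirely for a bounded number of elements). Reusing the BloomFilter sketch above, with illustrative names:

    import hashlib

    def file_checksum(path, chunk_size=1 << 20):
        """Stream a file through SHA-256 so huge files never sit in memory."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.digest()

    def find_corrupted(dest_paths, source_filter, cross_filter):
        """Return destination files whose checksums the source never recorded."""
        corrupted = []
        for path in dest_paths:
            digest = file_checksum(path)
            # Membership must hold in BOTH independently seeded filters;
            # a corrupted file slips through only if it is a false
            # positive in each, which is far less likely than in one.
            if digest not in source_filter or digest not in cross_filter:
                corrupted.append(path)
        return corrupted

At the sender, each file's checksum would be added to both filters before transmission; any file flagged by find_corrupted is then retransmitted, which is the recovery path whose time overhead the evaluation measures.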