An on-line replication strategy to increase availability in Data Grids
Authors: Ming Lei, Susan V. Vrbsky, and Xiaoyan Hong
Complete Citation
- Ming Lei, Susan V. Vrbsky, and Xiaoyan Hong. An on-line replication strategy to increase availability in Data Grids. Future Generation Computer Systems, 24(2), 2008.
Abstract
Data is typically replicated in a Data Grid to improve the job response time and data availability. Strategies for data replication in a Data Grid have previously been proposed, but they typically assume unlimited storage for replicas. In this paper, we address the system-wide data availability problem assuming limited replica storage. We describe two new metrics to evaluate the reliability of the system, and propose an on-line optimizer algorithm that can Minimize the Data Missing Rate (MinDmr) in order to maximize the data availability. Based on MinDmr, we develop four optimizers associated with four different file access prediction functions. Simulation results utilizing the OptorSim show our MinDmr strategies achieve better performance overall than other strategies in terms of the goal of data availability using the two new metrics.
Annotations
- Define two metrics to measure data availability, the System File Missing Rate and the System Bytes Missing Rate.
- Heuristically choose which files to replicate at any given moment.
- If replica storage fills up, existing files must be deleted to make room for new replicas, and the cost of that deletion must be considered.
- The new heuristic considers the value of each file based on its usage.
- The OptorSim simulator is employed to study the net effect of several algorithms.
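
The two metrics in the first annotation can be illustrated with a minimal sketch. It assumes each metric is a simple ratio over all file requests issued by all jobs: unavailable files over requested files for the System File Missing Rate, and unavailable bytes over requested bytes for the System Bytes Missing Rate. The `FileRequest` type, the function names, and the sample data are hypothetical and are not taken from the paper or from OptorSim.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FileRequest:
    """One file request issued by a job (hypothetical record type)."""
    name: str
    size_bytes: int
    available: bool  # True if some replica could serve the request


def system_file_missing_rate(requests: Sequence[FileRequest]) -> float:
    """Fraction of requested files that were unavailable (assumed definition)."""
    if not requests:
        return 0.0
    missing = sum(1 for r in requests if not r.available)
    return missing / len(requests)


def system_bytes_missing_rate(requests: Sequence[FileRequest]) -> float:
    """Fraction of requested bytes that were unavailable (assumed definition)."""
    total = sum(r.size_bytes for r in requests)
    if total == 0:
        return 0.0
    missing = sum(r.size_bytes for r in requests if not r.available)
    return missing / total


# Made-up requests: two of four files are missing, and they are the large ones.
requests = [
    FileRequest("a.dat", 100, True),
    FileRequest("b.dat", 300, False),
    FileRequest("c.dat", 100, True),
    FileRequest("d.dat", 500, False),
]

sfmr = system_file_missing_rate(requests)
sbmr = system_bytes_missing_rate(requests)
print(f"SFMR = {sfmr:.2f}, SBMR = {sbmr:.2f}")
```

In this made-up workload the two rates differ because the missing files are larger than the available ones, which suggests why a byte-weighted metric is useful alongside a file-count metric.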
Related Work
-- JustinWozniak - 07 Nov 2007