Dynamic grid scheduling with job migration and rescheduling in the GridLab resource management system

Authors: K. Kurowski, B. Ludwiczak, J. Nabrzyski, A. Oleksiak and J. Pukacki - Poland

Complete Citation

  • K. Kurowski, B. Ludwiczak, J. Nabrzyski, A. Oleksiak and J. Pukacki. Dynamic grid scheduling with job migration and rescheduling in the GridLab resource management system. Scientific Programming, 12(4), 2004.

Abstract

Grid computing has become one of the most important research topics that appeared in the field of computing in the last years. Simultaneously, we have noticed the growing popularity of new Web-based technologies which allow us to create application-oriented Grid middleware services providing capabilities required for dynamic resource and job management, monitoring, security, etc. Consequently, end users are able to get easier access to geographically distributed resources. In this paper we present the results of our experiments with the Grid(Lab) Resource Management System (GRMS), which acts on behalf of end users and controls their computations efficiently using distributed heterogeneous resources. We show how resource matching techniques used within GRMS can be improved by the use of a job migration based rescheduling policy. The main aim of this policy is to shorten job pending times and reduce machine overloads. The influence of this method on application performance and resource utilization is studied in detail and compared with two other simple policies.

Annotations

The authors describe a new metascheduler called the Grid(Lab) Resource Management System (GRMS). The system functions as a feedback controller for job scheduling. A central component is the Broker Module, which may be modified by inserting new policy plug-ins. The paper's main subject is the Reschedule plug-in, which relaxes job requirements in an attempt to fit more jobs onto limited resources. Application-level checkpointing is used.

The authors compare three policies against each other:

  • Wait - Queue extra jobs (a control case).
  • Overload - Submit more jobs than resources can normally handle.
  • Reschedule - Apply the adaptive rescheduling policy.
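The difference between the three policies can be illustrated with a toy placement function. This is only a sketch of one reading of the list above, not GRMS code: the `Job`, `Machine`, `place` and `demo_cluster` names, the single memory requirement, and the one-slot machines are all assumptions made for illustration, and checkpointing is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    min_memory: int  # hypothetical single resource requirement


@dataclass
class Machine:
    name: str
    memory: int
    slots: int
    running: list = field(default_factory=list)

    def fits(self, job):
        return self.memory >= job.min_memory

    def has_free_slot(self):
        return len(self.running) < self.slots


def place(job, machines, policy):
    """Return a list of (action, job_name, machine_name) decisions."""
    candidates = [m for m in machines if m.fits(job)]
    free = [m for m in candidates if m.has_free_slot()]
    if free:
        # All three policies agree when a suitable machine is idle.
        free[0].running.append(job)
        return [("start", job.name, free[0].name)]

    if policy == "wait":
        # Control case: the job stays pending.
        return [("queue", job.name, None)]

    if policy == "overload":
        # Start the job anyway on the least oversubscribed suitable machine.
        if not candidates:
            return [("queue", job.name, None)]
        target = min(candidates, key=lambda m: len(m.running) - m.slots)
        target.running.append(job)
        return [("start_overloaded", job.name, target.name)]

    if policy == "reschedule":
        # Try to free a suitable machine by migrating one of its running
        # jobs to another machine that can host it.
        for m in candidates:
            for r in m.running:
                for other in machines:
                    if other is not m and other.fits(r) and other.has_free_slot():
                        m.running.remove(r)
                        other.running.append(r)
                        m.running.append(job)
                        return [("migrate", r.name, other.name),
                                ("start", job.name, m.name)]
        return [("queue", job.name, None)]

    raise ValueError(f"unknown policy: {policy}")


def demo_cluster():
    # A large machine occupied by a small job, and an idle small machine.
    fat = Machine("fat", memory=16, slots=1, running=[Job("small", 2)])
    thin = Machine("thin", memory=4, slots=1)
    return [fat, thin]


if __name__ == "__main__":
    for policy in ("wait", "overload", "reschedule"):
        print(policy, place(Job("big", 8), demo_cluster(), policy))
```

In this made-up cluster, a new job that needs the large machine is queued under Wait, started on the already busy machine under Overload, and started after the small job is moved to the idle machine under Reschedule.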

Reschedule narrowly beat Wait in the makespan experiments. Overload gave predictably bad results.

-- JustinWozniak - 14 Nov 2007

Topic revision: r4 - 14 Nov 2007 - 17:22:40 - JustinWozniak
 