Scheduling Optimization based on Resources State Prediction in Large Scale Distributed Systems

Florin Pop, Valentin Cristea

Abstract


This paper presents an approach to the prediction-based optimization of meta-scheduling in Large Scale Distributed Systems. Several resource state prediction methods are analyzed for use in meta-scheduling. Because delivered performance varies in level and fluctuates under contention from competing applications, schedulers must be able to predict the performance that an application will actually obtain when it eventually runs. Time series predictions of the state of distributed system resources, such as CPU or free memory, are considered in order to improve system availability. The prediction model is based on actual parameter values and historical information, both provided by the MonALISA monitoring system and its extensions, the repository and ApMon. The prediction system architecture is extensible: other monitoring parameters can easily be added and new prediction models can be included. The resource predictions are used in meta-scheduling for different types of tasks, especially tasks with dependencies that have an associated communication cost. Based on the dynamic resource information, which sometimes needs to be predicted, the scheduler can choose the combination of resources from the available pool that is expected to maximize application performance and use the resources efficiently. The paper presents the significant improvements obtained in scheduling optimization, which have an immediate effect on load balancing and resource utilization.
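As a purely illustrative sketch of the idea summarized above, the following Python fragment forecasts the next CPU load of each node from its history and then selects the node with the lowest predicted load. The abstract does not name the prediction models the paper uses, so simple exponential smoothing is an assumption made here for illustration only; the function names, node names, smoothing factor, and sample values are all hypothetical, and in a real deployment the histories would come from the monitoring system rather than from literals.

```python
from typing import Dict, List


def predict_next(history: List[float], alpha: float = 0.5) -> float:
    """One-step-ahead forecast using simple exponential smoothing.

    Assumption: this model stands in for whatever predictor the paper uses.
    """
    if not history:
        raise ValueError("history must contain at least one sample")
    level = history[0]
    for sample in history[1:]:
        # Blend the newest sample with the running smoothed level.
        level = alpha * sample + (1 - alpha) * level
    return level


def pick_resource(cpu_history: Dict[str, List[float]], alpha: float = 0.5) -> str:
    """Return the node whose predicted CPU load is lowest."""
    predictions = {node: predict_next(h, alpha) for node, h in cpu_history.items()}
    return min(predictions, key=predictions.get)


# Hypothetical CPU load samples (fraction of capacity), oldest first.
cpu_history = {
    "node-a": [0.2, 0.4, 0.8],  # load is rising
    "node-b": [0.9, 0.5, 0.3],  # load is falling
}
best = pick_resource(cpu_history)
```

With these samples the smoothed forecast is 0.55 for `node-a` and 0.50 for `node-b`, so the sketch selects `node-b` even though its oldest sample is the highest one. This is the kind of decision that a scheduler relying only on a long-run average load would not make.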
