A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability

Document Type : Research Article

Author

Department of Computer Science, Shahid Bahonar University of Kerman, Kerman, Iran

Abstract

A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. The heterogeneity and geographic dispersion of grid resources and applications raise complex problems such as job scheduling. Most existing Grid scheduling algorithms focus on only one kind of job, either data-intensive or computation-intensive. However, considering only one kind of job does not yield a suitable schedule from the viewpoint of the whole system, and sometimes wastes resources of the other kind. To address the challenge of considering both kinds of jobs simultaneously, this paper proposes a new Integrated Job Scheduling Strategy (IJSS). On the one hand, the IJSS algorithm considers both the data and the computational resource availability of the network; on the other hand, based on the corresponding requirements of each job, it assigns a value called W to the job. The W value determines the relative importance of the two aspects (data intensity and computation intensity) for each job, and the job is then assigned to the available resources accordingly. Simulation results with OptorSim show that IJSS outperforms the existing algorithms from the literature as the number of jobs increases.
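The idea of weighting data availability against computational availability by a per-job value W can be illustrated with a minimal sketch. The abstract does not give the formula for W or the site-selection rule, so everything below is an assumption made for illustration only: W is taken as the share of estimated transfer time in a job's total estimated time, and each site is scored by a W-weighted sum of the fraction of required files it already holds and the fraction of its CPUs that are free. The names `Job`, `Site`, `job_weight`, `site_score`, and `schedule`, and the reference bandwidth and CPU speed, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    required_files: frozenset  # names of the input files the job reads
    data_mb: float             # total input size in MB (hypothetical field)
    compute_mi: float          # job length in million instructions (hypothetical field)


@dataclass
class Site:
    name: str
    local_files: frozenset     # files already stored at this site
    free_cpus: int
    total_cpus: int


def job_weight(job, ref_bandwidth_mbps=100.0, ref_mips=1000.0):
    """Assumed definition of W: share of transfer time in total estimated time.

    W close to 1 means the job is data-intensive, W close to 0 means it is
    computation-intensive. The reference bandwidth and speed are placeholders.
    """
    transfer_s = job.data_mb * 8 / ref_bandwidth_mbps
    compute_s = job.compute_mi / ref_mips
    total = transfer_s + compute_s
    return transfer_s / total if total > 0 else 0.5


def site_score(job, site, w):
    """Assumed scoring rule: W-weighted mix of data and CPU availability."""
    if job.required_files:
        held = len(job.required_files & site.local_files)
        data_avail = held / len(job.required_files)
    else:
        data_avail = 1.0  # a job with no inputs is indifferent to data placement
    cpu_avail = site.free_cpus / site.total_cpus if site.total_cpus else 0.0
    return w * data_avail + (1 - w) * cpu_avail


def schedule(job, sites):
    """Pick the site with the highest score for this job."""
    w = job_weight(job)
    return max(sites, key=lambda s: site_score(job, s, w))


if __name__ == "__main__":
    files = frozenset({"f1", "f2"})
    site_a = Site("A", local_files=files, free_cpus=1, total_cpus=10)
    site_b = Site("B", local_files=frozenset(), free_cpus=10, total_cpus=10)

    data_job = Job(files, data_mb=10_000, compute_mi=1_000)
    cpu_job = Job(files, data_mb=1, compute_mi=1_000_000)

    print(schedule(data_job, [site_a, site_b]).name)
    print(schedule(cpu_job, [site_a, site_b]).name)
```

Under these assumptions, a job dominated by transfer time is drawn to the site that already holds its files even when few CPUs are free there, while a job dominated by computation is drawn to the idle site. The actual IJSS definitions may differ.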
