A Clustering Approach to Scientific Workflow Scheduling on the Cloud with Deadline and Cost Constraints

Document Type: Research Article

Authors

Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, IRAN

Abstract

One of the main features of High Throughput Computing systems is the availability of high-power processing resources. Cloud Computing systems can offer these features over the Internet through concepts such as Pay-Per-Use and Quality of Service (QoS). Many applications in Cloud computing are represented by workflows, and Quality of Service is one of the most important challenges in scheduling scientific workflows. At the same time, the remarkable growth of multicore processor technology has led service providers to use these processors as the building blocks of their infrastructure. Scheduling scientific workflows on the Cloud therefore requires special attention to the multicore processor infrastructure, which adds further challenges to the problem. In addition, users' QoS constraints, such as execution time and cost, must be taken into account. The main objective of this research is to schedule workflows on the Cloud while considering a multicore-based infrastructure. A new algorithm is proposed that finds clusters of the workflow that can be executed in parallel while having large data communications. Such clusters are appropriate candidates for execution on a multicore processor. In contrast, there are other clusters that should be executed serially, and the algorithm investigates whether serial execution of these clusters is possible. The experimental results show that the algorithm has a positive effect on both the execution time and the cost of workflow execution.
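
The abstract does not spell out how parallel clusters are identified, so the following is only a toy illustration of the general idea, not the authors' algorithm. It assumes the workflow is given as a dictionary of directed edges annotated with transferred data sizes, and it groups sibling tasks that receive a large amount of data from a common parent and have no dependency on one another. The function name `parallel_cluster_candidates`, the `edges` layout, and the `threshold` parameter are all hypothetical.

```python
from collections import defaultdict


def parallel_cluster_candidates(edges, threshold):
    """Toy sketch (not the paper's algorithm).

    edges: dict mapping (parent, child) -> data size sent from parent to child.
    threshold: hypothetical cut-off above which a transfer counts as "large".
    Returns a list of (parent, [children]) groups whose children are mutually
    independent and each receive at least `threshold` data from the parent.
    """
    heavy_children = defaultdict(list)  # parent -> children with large transfers
    successors = defaultdict(set)       # full adjacency, used for reachability
    for (parent, child), size in edges.items():
        successors[parent].add(child)
        if size >= threshold:
            heavy_children[parent].append(child)

    def reachable(src, dst):
        # Iterative depth-first search over the workflow graph.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(successors[node])
        return False

    clusters = []
    for parent, kids in heavy_children.items():
        # Keep only children with no dependency path to or from a sibling.
        independent = [
            k for k in kids
            if not any(
                o != k and (reachable(o, k) or reachable(k, o)) for o in kids
            )
        ]
        if len(independent) >= 2:
            clusters.append((parent, sorted(independent)))
    return clusters


# Hypothetical fork-join workflow: "split" feeds three tasks that join in "merge".
example_edges = {
    ("split", "a"): 500,
    ("split", "b"): 450,
    ("split", "c"): 5,
    ("a", "merge"): 10,
    ("b", "merge"): 10,
    ("c", "merge"): 10,
}
candidates = parallel_cluster_candidates(example_edges, threshold=100)
```

In this made-up example, tasks `a` and `b` would be grouped as a candidate for co-location on one multicore machine, since both receive large inputs from `split` and neither depends on the other. Deadline and cost checks, which the abstract lists as constraints, are outside the scope of this sketch.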

Keywords

