Grid computing is a distributed system with heterogeneous infrastructures. Resource management system (RMS) is one of the most important components which has great influence on the grid computing performance. The main part of RMS is the scheduler algorithm which has the responsibility to map submitted tasks to available resources. The complexity of scheduling problem is considered as a nondeterministic polynomial complete (NP-complete) problem and therefore, an intelligent algorithm is required to achieve better scheduling solution. One of the prominent intelligent algorithms is ant colony system (ACS) which is implemented widely to solve various types of scheduling problems. However, ACS suffers from stagnation problem in medium and large size grid computing system. ACS is based on exploitation and exploration mechanisms where the exploitation is sufficient but the exploration has a deficiency. The exploration in ACS is based on a random approach without any strategy. This study proposed four hybrid algorithms between ACS, Genetic Algorithm (GA), and Tabu Search (TS) algorithms to enhance the ACS performance. The algorithms are ACS(GA), ACS+GA, ACS(TS), and ACS+TS. These proposed hybrid algorithms will enhance ACS in terms of exploration mechanism and solution refinement by implementing low and high levels hybridization of ACS, GA, and TS algorithms. The proposed algorithms were evaluated against twelve metaheuristic algorithms in static (expected time to compute model) and dynamic (distribution pattern) grid computing environments. A simulator called ExSim was developed to mimic the static and dynamic nature of the grid computing. Experimental results show that the proposed algorithms outperform ACS in terms of best makespan values. Performance of ACS(GA), ACS+GA, ACS(TS), and ACS+TS are better than ACS by 0.35%, 2.03%, 4.65% and 6.99% respectively for static environment. For dynamic environment, performance of ACS(GA), ACS+GA, ACS+TS, and ACS(TS) are better than ACS by 0.01%, 0.56%, 1.16%, and 1.26% respectively. The proposed algorithms can be used to schedule tasks in grid computing with better performance in terms of makespan.
📄 Full text (200,000 characters)extracted from the PDF · click to expand
HYBRID ANT COLONY SYSTEM ALGORITHM FOR STATIC
AND DYNAMIC JOB SCHEDULING IN GRID COMPUTING
MUSTAFA MUWAFAK THEAB ALOBAEDY
DOCTOR OF PHILOSOPHY
UNIVERSITI UTARA MALAYSIA
2015
ii
Permission to Use
In presenting this thesis in fulfilment of the requirements for a postgraduate degree
from Universiti Utara Malaysia, I agree that the Universiti Library may make it freely
available for inspection. I further agree that permission for the copying of this thesis
in any manner, in whole or in part, for scholarly purpose may be granted by my
supervisor(s) or, in their absence, by the Dean of Awang Had Salleh Graduate School
of Arts and Sciences. It is understood that any copying or publication or use of this
thesis or parts thereof for financial gain shall not be allowed without my written
permission. It is also understood that due recognition shall be given to me and to
Universiti Utara Malaysia for any scholarly use which may be made of any material
from my thesis.
Requests for permission to copy or to make other use of materials in this thesis, in
whole or in part, should be addressed to:
Dean of Awang Had Salleh Graduate School of Arts and Sciences
UUM College of Arts and Sciences
Universiti Utara Malaysia
06010 UUM Sintok
iii
Abstrak
Pengkomputeran grid adalah sistem teragih dengan infrastruktur heterogen. Sistem
pengurusan sumber (RMS) adalah salah satu komponen terpenting yang mempunyai
pengaruh besar ke atas prestasi pengkomputeran grid. Bahagian utama RMS adalah
algoritma penjadual yang bertanggungjawab untuk memeta tugas kepada sumber
sedia ada. Kerumitan masalah penjadualan dianggap sebagai satu masalah lengkap
polinomial tidak berketentuan (NP-lengkap) dan oleh itu, satu algoritma pintar
diperlukan untuk mencapai penyelesaian penjadualan yang lebih baik. Salah satu
algoritma pintar yang menonjol adalah ant colony system (ACS) yang digunakan
secara meluas untuk menyelesaikan pelbagai jenis masalah penjadualan. Walau
bagaimanapun, ACS mengalami masalah genangan dalam pengkomputeran grid
bersaiz sederhana dan besar. ACS berasaskan mekanisma eksploitasi dan penerokaan
di mana eksploitasi adalah mencukupi tetapi berkurangan pada penerokaan.
Penerokaan dalam ACS adalah berasaskan kepada pendekatan rawak tanpa sebarang
strategi. Kajian ini mencadangkan empat algoritma hibrid di antara ACS, Genetic
Algorithm (GA), dan Tabu Search (TS) untuk meningkatkan prestasi ACS. Algoritma
tersebut adalah ACS(GA), ACS+GA, ACS(TS), dan ACS+TS. Algoritma hibrid yang
dicadangkan ini akan meningkatkan ACS dari segi mekanisma penerokaan dan
penghalusan penyelesaian dengan melaksanakan penghibridan tahap rendah dan
tinggi algoritma ACS, GA, dan TS. Semua algoritma yang dicadangkan telah dinilai
dan dibandingkan dengan dua belas metaheuristic algoritma dalam persekitaran
pengkomputeran grid statik (masa jangkaan kepada model pengiraan) dan dinamik
(corak taburan). Satu simulator yang dinamakan ExSim telah dibangunkan untuk
meniru sifat statik dan dinamik pengkomputeran grid. Keputusan eksperimen
menunjukkan algoritma yang dicadangkan mengatasi ACS dari segi nilai makespan
terbaik. Prestasi ACS(GA), ACS+GA, ACS(TS) dan ACS+TS adalah lebih baik
daripada ACS dengan masing-masing meningkat sebanyak 0.35%, 2.03%, 4.65% dan
6.99% untuk persekitaran statik. Untuk persekitaran dinamik, prestasi ACS(GA),
ACS+GA, ACS+TS dan ACS(TS) adalah lebih baik daripada ACS iaitu masingmasing meningkat sebanyak 0.01%, 0.56%, 1.16%, dan 1.26%. Algoritma yang
dicadangkan boleh digunakan untuk penjadualan tugas dalam pengkomputeran grid
dengan prestasi yang lebih baik dari segi makespan.
Kata Kunci: Algoritma metaheuristik, Ant colony system, Genetic algorithm, Tabu
search, Penjadualan kerja dalam pengkomputeran grid.
iv
Abstract
Grid computing is a distributed system with heterogeneous infrastructures. Resource
management system (RMS) is one of the most important components which has great
influence on the grid computing performance. The main part of RMS is the scheduler
algorithm which has the responsibility to map submitted tasks to available resources.
The complexity of scheduling problem is considered as a nondeterministic polynomial
complete (NP-complete) problem and therefore, an intelligent algorithm is required to
achieve better scheduling solution. One of the prominent intelligent algorithms is ant
colony system (ACS) which is implemented widely to solve various types of
scheduling problems. However, ACS suffers from stagnation problem in medium and
large size grid computing system. ACS is based on exploitation and exploration
mechanisms where the exploitation is sufficient but the exploration has a deficiency.
The exploration in ACS is based on a random approach without any strategy. This
study proposed four hybrid algorithms between ACS, Genetic Algorithm (GA), and
Tabu Search (TS) algorithms to enhance the ACS performance. The algorithms are
ACS(GA), ACS+GA, ACS(TS), and ACS+TS. These proposed hybrid algorithms
will enhance ACS in terms of exploration mechanism and solution refinement by
implementing low and high levels hybridization of ACS, GA, and TS algorithms. The
proposed algorithms were evaluated against twelve metaheuristic algorithms in static
(expected time to compute model) and dynamic (distribution pattern) grid computing
environments. A simulator called ExSim was developed to mimic the static and
dynamic nature of the grid computing. Experimental results show that the proposed
algorithms outperform ACS in terms of best makespan values. Performance of
ACS(GA), ACS+GA, ACS(TS), and ACS+TS are better than ACS by 0.35%, 2.03%,
4.65% and 6.99% respectively for static environment. For dynamic environment,
performance of ACS(GA), ACS+GA, ACS+TS, and ACS(TS) are better than ACS by
0.01%, 0.56%, 1.16%, and 1.26% respectively. The proposed algorithms can be used
to schedule tasks in grid computing with better performance in terms of makespan.
Keywords: Metaheuristic algorithms, Ant colony system, Genetic algorithm, Tabu
search, Job scheduling in grid computing.
v
Acknowledgement
Each part of this study is guided, inspired, and supported by many people. The most
important support and guidance were from my research supervisor Prof. Dr. Hjh. Ku
Ruhana Ku Mahamud. Thank you very much for your great help and support. It is an
honour for me to do a research under your supervision.
I would like to thank all the academic and technical staff in Utara Universiti Malaysia
for their help in the study process and providing all the excellent facilities.
Furthermore, I would like to thank all the members of my family for their
unconditional support. My goal would not be achieved without them.
Finally, I would like to thank all my friends for their support.
vi
Table of Contents
Permission to Use ........................................................................................................ ii
Abstrak ........................................................................................................................ iii
Abstract ....................................................................................................................... iv
Acknowledgement ....................................................................................................... v
Table of Contents ........................................................................................................ vi
List of Tables ............................................................................................................... x
List of Figures ............................................................................................................. xi
List of Appendices .................................................................................................... xiii
List of Abbreviations ................................................................................................ xiv
CHAPTER ONE INTRODUCTION...................................................................... 1
1.1 Problem Statement ................................................................................................. 8
1.2 Objective of the Study ......................................................................................... 10
1.3 Significance of the Study ..................................................................................... 11
1.4 Scope and Assumption of the Study .................................................................... 12
1.5 Thesis Organization ............................................................................................. 12
CHAPTER TWO LITERATURE REVIEW ....................................................... 14
2.1 Job Scheduling Algorithms in Computational Grid System ................................ 14
2.1.1 Heuristic Algorithms .................................................................................. 15
2.1.2 Evolutionary Algorithms ............................................................................ 19
2.1.3 Local Search ............................................................................................... 35
2.1.4 Swarm Intelligence Algorithms ................................................................. 44
2.2 Hybrid Approaches in Job Scheduling ................................................................ 67
2.3 Grid Simulator ..................................................................................................... 77
2.4 Conceptual Framework ........................................................................................ 79
2.5 Summary .............................................................................................................. 81
CHAPTER THREE RESEARCH METHODOLOGY ..................................... 83
3.1 Research Framework ........................................................................................... 83
3.2 Research Methodology ........................................................................................ 85
3.2.1 Problem Formulation .................................................................................. 85
vii
3.2.2 Dynamic Expected Time to Compute ........................................................ 87
3.2.3 Solution Encoding ...................................................................................... 87
3.2.4 Objective Function ..................................................................................... 89
3.2.5 Ant Colony System Algorithm Implementation ........................................ 90
3.4.6 Genetic Algorithm Implementation ........................................................... 91
3.4.7 Tabu Search Algorithm Implementation .................................................... 94
3.2.8 Enhance ACS exploration .......................................................................... 96
3.2.9 Refine the ACS solution ........................................................................... 104
3.2.10 Grid Simulator Development ................................................................. 111
3.2.11 Proposed Algorithm Evaluation ............................................................. 111
3.3 Summary ............................................................................................................ 113
CHAPTER FOUR SIMULATOR DEVELOPMENT......................................... 114
4.1 Identifying the Measurement Criteria ................................................................ 114
4.2 Implementing the Benchmark Problems Model ................................................ 115
4.3 Simulation Verification and Validation ............................................................. 125
4.3.1 Verification Techniques ........................................................................... 126
4.3.2 Validation Techniques .............................................................................. 127
4.3.3 Testing the Proposed Hybrid Algorithms ................................................. 127
4.4 Summary ............................................................................................................ 130
CHAPTER FIVE JOB SCHEDULING IN STATIC GRID COMPUTING ... 131
5.1 Static Environment............................................................................................. 131
5.2 Algorithms Parameters....................................................................................... 132
5.2.1 Genetic Algorithm Parameters ................................................................. 132
5.2.2 Ant System Parameters ............................................................................ 133
5.2.3 Ant Colony System Parameters ................................................................ 134
5.2.4 Tabu Search Parameters ........................................................................... 135
5.2.5 BABC, EBABC1, and EBABC2 Parameters ........................................... 136
5.2.6 PSO-GELS Parameters ............................................................................ 137
5.2.7 AS(TS), AS+TS, ACS(TS), and ACS+TS Parameters ............................ 137
5.2.8 AS(GA), AS+GA, ACS(GA), and ACS+GA Parameters ........................ 138
5.3 Experimental Result and Analysis ..................................................................... 139
5.3.1 Best Makespan Results ............................................................................. 140
viii
5.3.2 Average Makespan Results ...................................................................... 141
5.3.3 Best Flowtime Results .............................................................................. 143
5.3.4 Average Flowtime Results ....................................................................... 145
5.3.5 Best Utilization Results ............................................................................ 147
5.3.6 Average Utilization Results ..................................................................... 149
5.3.7 Discussion ................................................................................................ 150
5.4 Summary ............................................................................................................ 151
CHAPTER SIX JOB SCHEDULING IN DYNAMIC GRID COMPUTING .. 152
6.1 Dynamic Environment ....................................................................................... 152
6.2 Algorithm Parameters ........................................................................................ 154
6.2.1 Genetic Algorithm Parameters ................................................................. 154
6.2.2 Ant System Parameters ............................................................................ 156
6.2.3 Ant Colony System Parameters ................................................................ 157
6.2.4 Tabu Search Parameters ........................................................................... 157
6.2.5 BABC, EBABC1, and EBABC2 Parameters ........................................... 158
6.2.6 PSO-GELS Parameters ............................................................................ 159
6.2.7 AS(TS), AS+TS, ACS(TS), and ACS+TS Parameters ............................ 160
6.2.8 AS(GA), AS+GA, ACS(GA), and ACS+GA Parameters ........................ 160
6.3 Experimental Result and Analysis ..................................................................... 162
6.3.1 Best Makespan Results ............................................................................. 162
6.3.2 Average Makespan Results ...................................................................... 164
6.3.3 Best Flowtime Results .............................................................................. 166
6.3.4 Average Flowtime Results ....................................................................... 167
6.3.5 Best Utilization Results ............................................................................ 169
6.3.6 Average Utilization Results ..................................................................... 171
6.3.7 Discussion ................................................................................................ 172
6.4 Summary ............................................................................................................ 173
CHAPTER SEVEN CONCLUSION AND FUTURE WORK .......................... 174
7.1 Research Contribution ....................................................................................... 174
7.2 Limitation of the Study ...................................................................................... 176
7.3 Recommendation for Future Work .................................................................... 176
ix
REFERENCES ....................................................................................................... 178
x
List of Tables
Table 2.1 Difference between each variant algorithm in ACO .................................. 49
Table 2.2 Various research on different Domains and problems................................ 50
Table 3.1 The implemented algorithms source ......................................................... 112
Table 4.1 Algorithms evaluated with ETC model. ................................................... 115
Table 4.2 Experimental parameters .......................................................................... 122
Table 4.3 ETC matrix for 3 resources and 13 jobs ................................................... 128
Table 5.1 Algorithms resource for parameter values ................................................ 132
Table 5.2 GA parameter values ................................................................................ 132
Table 5.3 AS parameter value ................................................................................... 133
Table 5.4 ACS parameter values .............................................................................. 134
Table 5.5 TS parameter values .................................................................................. 135
Table 5.6 BABC, EBABC1, EBABC2 parameter values ......................................... 136
Table 5.7 PSO-GELS Algorithm Parameters Values ............................................... 137
Table 5.8 AS, ACS, and TS Algorithms Parameter Values ...................................... 138
Table 5.9 AS(GA) and AS+GA Algorithms Parameter Values ............................... 138
Table 5.10 ACS(GA) and ACS+GA Algorithms Parameter Values ........................ 139
Table 6.1 Datasets descriptions ................................................................................. 153
Table 6.2 Parameters for Generating Dynamic Benchmark ..................................... 153
Table 6.3 Algorithms resource for parameter values ................................................ 154
Table 6.4 GA parameter values ................................................................................ 155
Table 6.5 AS parameter value ................................................................................... 156
Table 6.6 ACS parameter values .............................................................................. 157
Table 6.7 TS parameter values .................................................................................. 157
Table 6.8 BABC, EBABC1, EBABC2 parameter values ......................................... 158
Table 6.9 PSO-GELS Algorithm Parameters Values ............................................... 159
Table 6.10 AS, ACS, and TS Algorithms Parameter Values .................................... 160
Table 6.11 AS(GA) and AS+GA Algorithms Parameter Values ............................. 161
Table 6.12 ACS(GA) and ACS+GA Algorithms Parameter Values ........................ 161
xi
List of Figures
Figure 2.1: Basic Genetic Algorithm (Zapfel et al., 2010) .......................................... 26
Figure 2.2: Visualization of GA population (Zapfel et al., 2010) ................................ 26
Figure 2.3: Examples of crossover operators (Zapfel et al., 2010) .............................. 28
Figure 2.4: Process of Genetic Algorithm (Zapfel et al., 2010) ................................. 29
Figure 2.5: Process of Tabu Search algorithm (Zapfel et al., 2010) ........................... 39
Figure 2.6: Tabu Search algorithm pseudocode (Zapfel et al., 2010) ......................... 41
Figure 2.7: Research conceptual framework ............................................................... 80
Figure 3.1: The Research Framework .......................................................................... 84
Figure 3.2: The solution vector used by the ants ......................................................... 88
Figure 3.3: Solution vectors used by genetic algorithm .............................................. 89
Figure 3.4: The new solution vectors produced by crossover operator ....................... 89
Figure 3.5: Chromosomes for five tasks and three machines ...................................... 92
Figure 3.6: ACS(GA) (low level) algorithm pseudocode ............................................ 98
Figure 3.7: ACS(TS) (low level) algorithm pseudocode ........................................... 101
Figure 3.8: ACS+GA (high level) pseudocode .......................................................... 105
Figure 3.9: ACS+TS (high level) algorithm pseudocode .......................................... 108
Figure 4.1: Workload modelling (Feitelson, 2013) ................................................... 119
Figure 4.2: DETC simulator interface ....................................................................... 122
Figure 4.3: Benchmark for dynamic grid computing................................................. 123
Figure 4.4: Grid computing simulator interface ........................................................ 124
Figure 4.5: Simulator charts ....................................................................................... 125
Figure 4.6: ACS(TS) Schedule table ......................................................................... 129
Figure 4.7: ACS+TS Schedule table .......................................................................... 129
Figure 4.8: ACS(GA) Schedule table ........................................................................ 130
Figure 4.9: ACS+GA Schedule table ......................................................................... 130
Figure 5.1: Geometric mean for the best makespan values ....................................... 140
Figure 5.2: The percentage enhancement of each hybrid algorithm in terms of the best
makespan values ........................................................................................................ 141
Figure 5.3: Geometric mean for the average makespan values ................................. 142
Figure 5.4: The percentage enhancement of each algorithm in terms of the average
makespan values ........................................................................................................ 143
Figure 5.5: Geometric mean for best flowtime values ............................................... 144
Figure 5.6: The percentage enhancement of each algorithm in terms of the best
flowtime values .......................................................................................................... 145
Figure 5.7: Geometric mean for average flowtime values ......................................... 146
Figure 5.8: The percentage enhancement of each algorithm in terms of the average
flowtime values .......................................................................................................... 147
Figure 5.9: Geometric mean for best utilization value .............................................. 148
Figure 5.10: The percentage enhancement of each algorithm in terms of the best
utilization values ........................................................................................................ 149
Figure 5.11: Geometric mean for average utilization values ..................................... 149
Figure 5.12: The percentage enhancement of each algorithm in terms of the average
utilization values ........................................................................................................ 150
Figure 6.1: Geometric mean for the best makespan values ....................................... 163
Figure 6.2: The percentage enhancement of each hybrid algorithm in terms of the best
makespan values ........................................................................................................ 164
Figure 6.3: Geometric mean for the average makespan values ................................. 165
xii
Figure 6.4: The percentage enhancement of each hybrid algorithm in terms of the
average makespan values ........................................................................................... 165
Figure 6.5: Geometric mean for the best flowtime values ......................................... 166
Figure 6.6: The percentage enhancement of each algorithm in terms of the best
flowtime values .......................................................................................................... 167
Figure 6.7: Geometric mean for the average flowtime values ................................... 168
Figure 6.8: The percentage enhancement of each hybrid algorithm in terms of the
average flowtime values ............................................................................................ 169
Figure 6.9: Geometric mean for the best utilization value......................................... 170
Figure 6.10: The percentage enhancement of each hybrid algorithm in terms of the
best utilization values ................................................................................................. 170
Figure 6.11: Geometric mean for the average utilization value................................. 171
Figure 6.12: The percentage enhancement of each hybrid algorithm in terms of the
average utilization values ........................................................................................... 172
xiii
List of Appendices
Appendix A Ant Colony System (C# Code)............................................................ 201
Appendix B Genetic Algorithm (C# Code) ............................................................. 204
Appendix C Tabu Search Algorithm (C# Code)...................................................... 213
Appendix D DETC Simulator .................................................................................. 220
xiv
List of Abbreviations
ABC Artificial Bee Colony
ACO Ant Colony Optimization
ACS Ant Colony System
ACS(GA) Low level hybridization between ACS and GA algorithms
ACS(TS) Low level hybridization between ACS and TS algorithms
ACS+GA High level hybridization between ACS and GA algorithms
ACS+TS High level hybridization between ACS and TS algorithms
AS Ant System
AS(GA) Low level hybridization between AS and GA algorithms
AS(TS) Low level hybridization between AS and TS algorithms
AS+GA High level hybridization between AS and GA algorithms
AS+TS High level hybridization between AS and TS algorithms
AS
rank
Rank-Based Ant System
BABC Binary Artificial Bee Colony
BACO Balanced Ant Colony Optimization
BFO Bacterial Foraging Optimization
BWAS Best-Worst Ant System
cMA Cellular Memetic Algorithm
CV Coefficient of Variation
DE Differential Evolution
DETC Dynamic Expected Time to Compute
EA Evolutionary Algorithms
xv
EAS Elitist strategy for Ant System
EBABC1 Efficient Binary Artificial Bee Colony
EBABC2 Efficient Binary Artificial Bee Colony with flexible ranking
ET Execution Time
ETC Expected Time to Compute
FANT Fast Ant System
FCFS First Come First Served
FPLTF Fastest Processor to Largest Task First
GA Genetic Algorithm
GA(TS) Low level hybridization between GA and TS algorithms
GA+TS High level hybridization between GA and TS algorithms
GBF Genetic Bacterial Foraging
GSA Genetic and Simulated Annealing
GTSP Generalized Traveling Salesman Problem
HACO Hybrid Ant Colony Optimization
HC Hill Climbing
HCACO Hybrid Converse Ant Colony Optimization
HGS Hierarchic Genetic Strategy
HPDSs High Performance Distributed Systems
IACO Improved Ant Algorithm
JSP Job Scheduling Problem
KPB K-Parents Best
LJFR-SJFR Longest Job to Fastest Resource-Shortest Job to Fastest Resource
LM Local Move
LMCTS Local Minimum Completion Time Swap
xvi
MA Memetic Algorithms
MA+TS High level hybridization between Memetic and Tabu Search
MACO Multiple Ant Colonies Optimization
MCT Minimum Completion Time
MDS Metacomputing Directory Service
MET Minimum Execution Time
MI Million Instructions
MIPS Million Instructions Per Second
MMAS Max-Min Ant System
MTEDD Minimum Time Earliest Due Date
MTERD Minimum Time Earliest Release Date
OLB Optimization Load Balancing
PGA1 Player’s Genetic Algorithm
PGA2 Parallel Genetic Algorithm
PMCT Player’s Minimum Completion Time
PSO Particle Swarm Optimization
PSO-GELS Particle Swarm Optimization and Gravitational Emulation Local
Search
QoS Quality of Service
RGA Risky Genetic Algorithm
SA Simulated Annealing
SGA Struggle Genetic Algorithm
SLM Steepest Local Move
SS Scatter Search
SSGA Steady-State Genetic Algorithm
xvii
SwA Switching Algorithm
TS Tabu Search
TSP Traveling Salesman Problem
1
CHAPTER ONE
INTRODUCTION
The concept of grid systems goes back to 1969 when Leonard Kleinrock wrote, “We
will probably see the spread of computer utilities, which, like present electric and
telephone utilities, will service individual homes and offices across the country”
(Wankar, 2008). From that time, many researchers presented various works and
contributed in grid systems fields. The popularity of grid systems started by the late
1990s when Foster developed a grid system called Globus Toolkit (Foster &
Kesselman, 1997; 2004).
Grid systems evolves from existing technology such as distributed computing, web
service, and Internet (Magoules, Pan, Tan, & Kumar, 2009). According to Xhafa and
Abraham (2010), grid computing is defined as “Geographically distributed computers,
linked through the internet in a Grid-like manner, which are used to create virtual
supercomputers of vast amount of computing capacity able to solve complex problem
from e-Science in less time than known before”.
Magoules, Nguyen, and Yu (2009) presented an extensive definition for grid systems
as “A hardware and software infrastructure that provides transparent, dependable,
pervasive and consistent access to large-scale distributed resources owned and shared
by multiple administrative organizations in order to deliver support for a wide range
of applications with the desired qualities of service. These applications can perform
either: high throughput computing, on-demand computing, data intensive computing,
or collaborative computing”.
2
From previous definitions, it can be concluded that grid computing is a collection of
geographically distributed and heterogeneous resources. They are connected like a
grid using internet technology to form a virtual supercomputer that has the capacity to
solve very complex problems. Grid can be used by different fields such as science,
commerce, and education.
Grid systems could be distributed geographically through different organizations
using different platforms (Kolodziej, 2012). Two services offered by grid systems are
computing-intensive services and data-intensive services (Xhafa & Abraham, 2008b).
In computing services, the grid process tasks are not possible or very difficult to be
processed in traditional computer resource, while data storage services provide a
storage which is available through many mirrors and servers. Grid computing
provides powerful computation resources for complex tasks such as scientific
research, stock markets, and business requirements for organizations. However,
according to Xhafa and Abraham (2008), grid computing is still in the developmental
stage, and there are many challenges to be addressed in the future.
Grid systems are classified as a modern High Performance Distributed Systems
(HPDSs) along with the clusters and cloud systems (Kolodziej, 2012). However, there
are crucial characteristics which are different between them such as scale, network
type, administrative domain, and resources structure (AL-Fawair, 2009; Montes,
Sanchez, & Perez, 2012). Grid systems are extended to many other types of grids such
as enterprise grid, sensor grid, campus grid, global grid, pc grid, and utility grid
(Babafemi, Sanjay, & Adigun, 2013; Conejero, Caminero, Carrion, & Tomas, 2014;
Fulop, 2008; Kolodziej, 2012).
3
Grid computing requirements, usage, and definitions have changed with time.
Berman, Fox, and Hey (2003) categorized the evolution of grid into three different
generations. The first generation is called meta-computing environment, namely
FAFNER and I-WAY projects. In the second generation, the grid technology is
developed such as grid resource management, resource brokers and schedulers, grid
portal, and complete integrated systems. Some projects developed at this era are
Globus, Legion, and UNICORE. The third generation represents the integration
between grid computing and web services technologies such as OGSI and WSRF.
According to Magoules, Pan, et al. (2009), grid architecture consists of four layers.
The first layer is the user application level (APIs). The second layer is the middleware
layer which includes management software and packages. The third layer deals with
resources available to the grid such as data storage, processing capabilities and other
application-specific hardware. The fourth layer deals with network components such
as routers, switches, and the protocols used for communication between sources in the
grid. In addition, the main characteristics of grid systems are distributed, nondedicated, service-oriented heterogeneous, and non-centralized (Montes et al., 2012).
Magoules, Pan, et al. (2009) classified the usage of grid applications into five major
groups: The first group is distributed computing which is the grid computing
application to solve problems that cannot be solved on a single system, such as
simulation of complex physical process, which needs many resources like CPU and
memory. The second group is called high-throughput computing where the grid
utilizes the unused processor cycles in order to perform independent tasks. Using this
method, a complex task can be divided into multiple tasks and the grid will schedule
and manage the process. Problems such as bio-statistical, molecular simulations of
4
liquid crystal and Monte Carlo simulations are very suited for high-throughput
computing. The third group is on-demand computing. For this type, some resources
cannot be cost-effectively or not locally located. Therefore, grid can provide an access
to such a resource. Yet, there are many issues to be addressed, such as security,
scheduling, code management, configuration, fault tolerance, and payments process.
A typical example of an application requiring on-demand computing is the use of
dynamically acquired supercomputer to perform cloud detection algorithm.
The fourth group is data intensive computing, which is a grid that can be used to
manage data from distributed data repositories, digital library and database. A field
such as High Energy Physics (HEP) is an example of application that requires data
intensive computing support. The fifth group is called collaborative computing,
whereby some applications require strict real time capabilities and different types of
interactions that can take place. A typical example of such application that could use a
collaborative computing is multi-conferencing.
One of the most important components in grid computing is resource management
system (Hussain et al., 2013). Resource management system has the responsibility to
map the submitted tasks to the available resources (Sim, 2009). Resource
management system is implemented with scheduling algorithm to map tasks to the
resources in an efficient way (Espling, 2013; Ma, Yan, Liu, Guan, & Lee, 2011). In
grid computing, job scheduling algorithm is the main issue for grid computing
performance (Kolodziej, 2012; Kousalya & Balasubramanie, 2009; Mathiyalagan,
Suriya, & Sivanandam, 2010; Visalakshi & Sivanandam, 2009). Scheduling can be
done in a simple way by assigning the incoming task to the available resource.
However, by using a more advanced and sophisticated scheduler algorithm, the grid
5
can obtain better computing performance (Kolodziej, 2012). The scheduler should
take into account many aspects, such as dynamic tasks environment, joining and
dropping of resources from grid and evaluating the current load of the resources. The
scheduler can be in hierarchical or distributed organization to deal with large scale of
the grid.
In order to solve the grid computing scheduling problem, it is very important to define
the problem first. The scheduling problem is defined as an NP-complete problem
(Maheshbhai, 2011; Mao, 2011; Wei, Zhang, Li, & Li, 2012). The NP-complete
problem needs heuristic or metaheuristics algorithms which have the ability to solve
combinatorial optimization problems in a reasonable time (Aleti, 2012).
Metaheuristics algorithms, such as Simulated Annealing (SA) algorithm, are
implemented for job scheduling in grid computing (Cai, Ning, Li, & Zhong, 2007;
Guo & Wang, 2010). However, SA needs a long running time to reach a good quality
solution which is very restricted in grid computing environment (Xhafa, Barolli, &
Durresi, 2007a). Genetic Algorithm (GA) is also implemented successfully in grid
computing scheduling problems (Carretero, Xhafa, & Abraham, 2007). GA is able to
find a good solution in consistent and semi-consistent environments. However, in
inconsistent environment, GA suffers from premature convergence (Kolodziej, Xhafa,
& Kolanko, 2009). Another important metaheuristics approach is Tabu Search (TS)
algorithm which is implemented for job scheduling on computational grid system
(Xhafa, Carretero, Dorronsoro, & Alba, 2009). TS algorithm has the benefit of
systematic search which makes the algorithm avoid random solution. However, TS
also suffers from stagnation due to the use of local neighbourhood search technique.
One more important family of metaheuristics algorithms are ACO algorithms which
are implemented widely on scheduling problems (Bandieramonte, Stefano, & Morana,
6
2008; Chang, Chang, & Lin, 2009; Cyril Daisy Christina & Miriam, 2012; Kant,
Sharma, Agarwal, & Chandra, 2010; Zhu, Zhao, & He, 2010a).
ACO algorithms which are inspired by biological ants present many solutions for
different types of NP-complete problems (Dorigo & Stutzle, 2004).
Biological ants have the ability to discover the shortest route from the nest to the
source of food (Dorigo & Stutzle, 2004). Although they do not have an advanced
vision system (Camazine et al., 2003), they have the ability to communicate with the
environment. Ants use a chemical substance called “pheromone” to communicate
with the environment and between each other (Dorigo & Gambardella, 1997a).
Pheromone substance has evaporation property which is a powerful mechanism to
update the route information. While an ant moves looking for food, it deposits a
pheromone along the path. The following ant will more likely select the route with
richer pheromone. This mechanism will make the ant choose the shortest path.
There are several enhanced ACO algorithms implemented in grid computing
scheduling problems such as balanced job scheduling using Ant Colony Optimization
(BACO) for grid environment by Chang, Chang, and Lin (2007). A study proposed by
Chang, Chang, and Lin (2009) implemented ant algorithm for balanced job
scheduling in computational grid. Kousalya and Balasubramanie (2009) presented a
study on improving ant colony optimization algorithm with local search for job
scheduling in computational grid systems. Liusuqin, Shuojun, Menglingfen, and
Lixingsheng (2009) published a study to improve ant colony optimization for Job
Scheduling Problem (JSP). A multiple ant colony model called “Cooperative multiant Colony Pseudo-parallel Optimization Algorithm” was proposed by Liu, Song, and
7
Dai (2010). Another study regarding ACO algorithm for job scheduling on
computation grid was proposed by Kant, Sharma, Agarwal, and Chandra (2010). A
research on task scheduling with load balancing using Multiple Ant Colonies
Optimization (MACO) in grid computing was conducted by Bai et al. (2010).
Enhanced ant colony algorithm for job scheduling in computational grid was
proposed by Maruthanayagam and UmaRani (2010). Mou (2011) presented a new
approach using double Pheromones technique for ant colony system. An improved
ACO algorithm for job scheduling in computational grid systems proposed by
MadadyarAdeh and Bagherzadeh (2011). Kokilavani and Amalarethinam (2013)
published a study on implementing ant colony optimization based load sharing for job
scheduling in computational grid systems.
The first version of ACO known as ant system algorithm was presented by Dorigo,
Maniezzo, and Colorni (1991). Ant system is utilized to solve job scheduling on grid
computing (Kousalya & Balasubramanie, 2008). Another version of ACO is elitist ant
system algorithm which has better performance than ant system (Dorigo, Maniezzo,
& Colorni, 1996). However, the performances of ant system and elitist ant system
algorithms drop dramatically once the size of the problem instance increase (Dorigo
& Stutzle, 2004). Another important version of ACO algorithms is Ant Colony
System (ACS) presented by Dorigo and Gambardella (1997b). ACS algorithm mimics
the behaviour of real ant colony to solve optimization problems such as Traveling
Salesman Problem (TSP) and network routing. ACS algorithm is considered as one of
the best algorithms for solving NP-complete problems (Gendreau & Potvin, 2010).
Therefore, this study selected ACS as the main algorithm to hybridize with GA and
TS algorithms.
8
In ACS algorithm, ants apply exploitation and exploration mechanism when they
select the next node to move. In addition, ACS applies local pheromone update and
global pheromone update to direct the search for the next iteration. The global update
is calculated based on the quality of the best solution so far, while the local update
applies the evaporation concept. In ACS algorithm, the exploration mechanism is
based on a stochastic process. Such a random process will affect the whole solution if
the ant selects a bad choice in any cycle of the solution construction process.
Therefore, a more deterministic and systematic exploration mechanism is required to
enhance ACS algorithm performance.
Hybridizing ACS algorithm with local search approaches such as genetic algorithm or
tabu search algorithms will enhance the solutions produced by ACS. In spite of
several enhanced ACS algorithms used for job scheduling problem, more studies are
required to enhance the algorithm performance.
1.1 Problem Statement
In grid computing systems, many criteria depend on scheduling algorithm efficiency
such as grid performance, utilization, Quality of Service (QoS), and load balancing
(Nithya & Shanmugam, 2011; Zhu & Wei, 2010).
Xhafa & Abraham (2010) stated that “Rather than a problem, scheduling in grid
systems is a family of problems. This is due to the many parameters that intervene
scheduling as well as due to the different needs of grid-enabled applications”.
In scheduling algorithm, there are many factors and parameters that should be taken
into account, such as job size, resource capacity, network speed, current load, and
9
expected time to complete (AL-Fawair, 2009; Bai et al., 2010). In addition, the
dynamic and heterogeneous nature of grid environment makes the scheduling process
more critical such as joining and dropping resources to the gird (Maheshbhai, 2011;
Malarvizhi & Uthariaraj, 2009; Qureshi et al., 2014). If the scheduling algorithm is
not efficient, the grid system user will experience a delay in response time, especially
when the number of tasks is increased (Ku-Mahamud & Nasir, 2010). Therefore,
scheduling algorithm is a very important part in grid computing systems and it needs
to be improved to answer the dynamic requirements. Hence, a sufficient algorithm for
dynamic grid scheduling problem is demanded to enhance the grid computing system
performance (Ku-Mahamud & Nasir, 2010; Maruthanayagam & UmaRani, 2010).
The traditional ACS is not efficient in large-scale computation problems due to the
stagnation nature of pheromone in ACS (Mathiyalagan et al., 2010). ACS suffers
from deficiency in the exploration mechanism. Experiments such as those in Dorigo
and Stutzle (2004) used 0.1 for exploration and 0.9 for exploitation which indicate
that the exploration mechanism is not sufficient, while the exploitation mechanism is
efficient. However, because of the high ratio in exploitation, ACS algorithm has
behaved like greedy algorithm more than metaheuristics algorithm. After observing
the algorithm iteration step by step, it is found that the exploration is a stochastic
process and not guided. As the ACS algorithm uses construction approach, any wrong
selection for one or more node will affect the whole solution quality and that is what
happens when the stochastic exploration has selected a wrong node. According to
Glover and Laguna (1997), “bad strategic choice can yield more information than a
good random choice”. Therefore, ACS algorithm needs to be enhanced in terms of
exploration mechanism. In addition, ACS algorithm needs a mechanism to correct the
construction phase after each cycle. Genetic algorithm and tabu search methods are
10
very good candidates to be hybridized with ACS as both algorithms complement
ACS. In other words, ACS algorithm works based on constructive approach, while
GA and TS algorithms work based on a local search which is suitable to be hybridized
with ACS algorithm. Moreover, GA and TS algorithms could be hybridized as a low
level as well as a high level with ACS.
The research questions answered in this study are as follows:
i. How to improve the job scheduling in static and dynamic grid computing.
ii. How to hybridize ant colony system algorithm with genetic algorithm to
enhance the exploration and exploitation mechanisms?
iii. How to hybridize ant colony system algorithm with tabu search algorithm to
enhance the exploration and exploitation mechanisms?
iv. How the hybridized algorithms will avoid the local optimum trap?
v. Can a simulator produce benchmark environment of the grid computing?
1.2 Objective of the Study
The main objective of the study is to develop a hybrid ant colony system algorithm in
order to improve the job scheduling in static and dynamic grid computing
environment.
Specific objectives of the study are:
i. To propose a hybrid low level algorithm to enhance the exploration
mechanism in ACS algorithm.
ii. To propose a hybrid high level algorithm in refining the final solution found
by ACS algorithm.
11
iii. To design and develop a simulator that can be used to simulate the static and
dynamic grid environment.
iv. To evaluate the performance of the proposed hybrid ant colony system
algorithm.
1.3 Significance of the Study
The developed hybrid algorithm can be considered as a new member in the family of
ant colony optimization algorithms. The hybrid ACS algorithm is a new contribution
to the body of knowledge in the area of swarm intelligence and job scheduling in grid
computing.
The hybridization between ant colony system algorithm, genetic algorithm, and tabu
search algorithm to enhance the exploration part inside ACS guided the ants’
exploration in a better way. This hybridized algorithm concept can be used to solve
other optimization problems with minimal customization. In addition, this study has
implemented, compared and analysed sixteen algorithms for job scheduling in
computational grid. The analyses provide a deep understanding regarding the
behaviour of these algorithms.
The developed gird computing environment simulator has the ability to generate static
and dynamic benchmark problems which are very useful and highly demanded in the
field of job scheduling in grid computing. The simulator can be extended easily for
other scheduling algorithms.
12
1.4 Scope and Assumption of the Study
This study has focused on a single ant colony system algorithm which is a variant of
the ant colony optimization algorithms. The algorithm enhancement focused on low
and high level hybridizations between ACS and local search algorithm to improve the
exploration mechanism and enhance the final solution.
The implemented algorithm is applied on static and dynamic environments using
batch mode scheduling and independent task approach. For static environment, the
experiments were conducted using benchmarks presented by Ali, Siegel,
Maheswaran, Hensgen, and Ali (2000). This benchmark is frequently used because of
its effectiveness in simulating job scheduling problems in grid systems (Xhafa, Alba,
Dorronsoro, Duran, & Abraham, 2008). For dynamic environment, a simulator was
developed to mimic the real grid computing environment using the same pattern of
distribution (Feitelson, 2013).
This study has investigated GA and TS algorithms to enhance the exploration
mechanism and refine the final solution in ACS algorithms. In addition, this study
used three performance metrics to measure the grid computing performance, namely
makespan, flowtime, and utilization.
1.5 Thesis Organization
This thesis is organized as follows: Chapter 2 presents a review on various scheduling
algorithms in computational grid, hybrid approaches in metaheuristics algorithms,
grid simulator, and conceptual framework. The research framework, methods, and
techniques are discussed in Chapter 3. Chapter 4 presents the simulator development
steps with verification and validation. Details regarding the problem formulation,
13
parameters, performance criteria, and static experiments are provided in Chapter 5.
Chapter 6 presents the experiments on dynamic environment. Chapter 7 discusses the
contributions, limitations, and concludes the study with suggestion for future research.
14
CHAPTER TWO
LITERATURE REVIEW
This chapter presents the review of previous studies that have been done in the areas
of job scheduling in computational grid systems, swarm intelligence, scheduling
algorithms based on hybrid approaches, and grid computing simulator.
Job scheduling is the heart of grid computing processes. One of the main issues in job
scheduling for grid computing is the dynamic environment. Many studies try to solve
this problem using different techniques and algorithms. This chapter reviews some of
these studies and points out how they evolved and developed. In addition, their limits
and gaps will be highlighted.
This chapter organized as follows: Section 2.1 discusses several algorithms
implemented in job scheduling on computational grid. Hybrid approaches for job
scheduling in grid computing are presented in Section 2.2. Studies on grid computing
simulator are discussed in Section 2.3. Section 3.4 presents the conceptual framework.
Finally, the chapter summary is presented in Section 2.5.
2.1 Job Scheduling Algorithms in Computational Grid System
Grid computing system can be categorized into two types, namely static and dynamic
environments (Kolodziej, 2012). In static environment, resources are assumed to be
available at all time. In addition, the resource capacity, number of tasks and load are
also fixed. For dynamic environment, resources may join and leave the grid at any
time. In addition, the resource capacity and load changes dynamically. Other
characteristics of the dynamic environment are variation of network speed and
15
availability. The performance of the static and dynamic grid computing depends upon
the efficiency of scheduler algorithm. The scheduler is considered as one of the most
important parts of resource management system in grid computing system. The
scheduler is implemented with heuristic or metaheuristics algorithms. The following
reviews discuss several types of algorithms which are implemented successfully to
solve job scheduling problems in computational grid.
2.1.1 Heuristic Algorithms
In general, a scheduling problem is considered as a complex decision making problem
which needs to be solved optimally (Zapfel, Braune, & Bogl, 2010). This type of
problem is known as optimization problem which is also categorized as NP-complete
problem (Abraham, Grosan, & Ishibuchi, 2007; Talbi, 2013b). Due to the complexity
of NP-complete problems, using exact approaches are not feasible (Framinan, Leisten,
& García, 2014). Therefore, the optimal solution could be sacrificed for the sake of
finding good solution in significantly reduced time using approximate algorithms
(Blum & Roli, 2003). Heuristics and metaheuristics algorithms have been
implemented to solve various optimization problems, such as routing, scheduling, and
planning (Agnetis, Billaut, Gawiejnowicz, Pacciarelli, & Soukhal, 2014; Burke &
Kendall, 2014). A simple approach could be used for job scheduling in grid
computing based on greedy approach (Boussaid, Lepagnot, & Siarry, 2013; Ma, Lu,
Zhang, & Sun, 2012). These types of algorithms provide fast solution which is
suitable for extremely small and time restricted grid computing systems (Xhafa &
Abraham, 2009).
A study on batch mode scheduling for job scheduling in computational grid systems
published by Xhafa, Barolli, and Durresi (2007b). The study aims to optimize four
16
criteria, namely makespan, flowtime, resource utilization, and matching proximity.
The algorithms proposed in their study are: Min-Min, Max-Min, Sufferage, Relative
Cost, and Longest Job to Fastest Resource-Shortest Job to Fastest Resource
algorithms. These algorithms evaluated using static benchmark problems based on
expected time to compute model defined by Braun et al. (2001). The experiment
results show that Min-Min algorithm outperforms other algorithms in term of
makespan and flowtime. In terms of average resource utilization, Max-Min algorithm
achieved the best results among all other algorithms. However, the results show that
Min-Min algorithm achieved the worst performance in terms of average resource
utilization. For matching proximity criterion, Min-Min algorithm achieved the best
performance closely followed by Relative Cost algorithm. The authors recommended
that the proposed algorithms are useful to be utilized to generate an initial solution for
other heuristics algorithms.
An immediate mode scheduling of independent job in grid computing proposed by
Xhafa, Barolli, and Durresi (2007c). Several heuristic algorithms were implemented
and four criteria were used to measure the grid system performance, namely
makespan, flowtime, resource utilization, and matching proximity. The study
implemented and evaluated five algorithms, which are, Optimization Load Balancing
(OLB), Minimum Completion Time (MCT), Minimum Execution Time (MET),
Switching Algorithm (SwA), and K-Parents Best (KPB). These methods were tested
using benchmark problems based on the expected time to compute model proposed
by Braun et al. (2001). The experiments show that MCT algorithm performs the best
in terms of makespan. Regarding flowtime criterion, SwA algorithm performs good,
followed by MCT. In terms of resource utilization, OLB algorithm outperforms other
algorithms. MET algorithm was able to achieve the best matching proximity values
17
with perfect results. It is clear from the results that the objectives of optimizing job
scheduling in computational grid system have some contradictions. Therefore, for
very heterogeneous systems, not all criteria are considered as a good indicator for grid
system performance.
A study on job scheduling algorithm in computational grid environment conducted by
Malarvizhi and Uthariaraj (2009) for grid job scheduling using minimum Time To
Release algorithm (TTR). They proposed architecture for grid scheduling; some of the
main architecture components are dispatcher, grid scheduler and load balancer. The
idea behind their approach is to predict the performance of each resource by
estimating the total time to release, and then map with each resource. Based on TTR,
each combination of job and resource are stored in an increasing order of TTR to
assign to a resource. They also calculate other parameters, such as transfer input time
and transfer output time which makes the module more realistic and accurate for real
life applications. The experiments were conducted using GridSim simulator and the
environment consists of scheduler, five users with different time requirements and
rates of task creating, and 30 nodes with different computer power. They compared
TTR algorithm with first come first serve algorithm and Min-Min algorithm. The
results show that the proposed algorithm performs better than the others in terms of
computation time. However, according to Xhafa and Abraham (2010), metaheuristics
algorithms such as ant colony optimization and genetic algorithm are much better than
the other types of algorithm.
A study presented by Izakian, Abraham, and Snsel (2009) proposed a heuristic
method called Min-Max to generate a solution for job scheduling in grid environment.
Min-Max algorithm could be used to produce a good initial solution for other
18
metaheuristics algorithms. In addition, Min-Max algorithm could be used in realworld situations where applying metaheuristics algorithm is not applicable. The study
focused on static environment which is based on expected time to compute model
presented by Braun et al. (2001). The proposed algorithm was compared with five
heuristic algorithms, namely Work Queue, Max-Min, LJFR-SJFR, Suffrage, and Min-
Min. The experiment results show that the proposed algorithm can minimize the
makespan and flowtime better than other algorithms. The study also conducted
experiment using the proposed algorithm to generate initial solution for simulated
annealing algorithm. The results show that simulated annealing algorithm could not
improve the solution that generated using the proposed algorithm. In addition,
reducing the makespan leads to increasing in the flowtime and vice versa. However,
such contradiction behaviour is common in the problem of job scheduling in grid
computing.
Bardsiri and Hashemi (2012) published a study on comparing seven static mapping
heuristics algorithms for job scheduling on computational grid. The study compared
seven popular heuristics algorithm, namely Opportunistic Load Balancing (OLB),
Minimum Completion Time (MCT), Min-Min, Max-Min, Sufferage, Maxstd, and
Longest Job to Fastest Resource-Shortest Job to Fastest Resource (LJFR-SJFR)
algorithms. The proposed approaches were tested using the benchmark problems
based on expected time to compute model developed by Braun et al. (2001). The
study aims to optimize makespan, resource utilization, and matching proximity
criteria for independent job scheduling in static grid computing environment. The
experiment results show that Min-Min and Sufferage algorithms achieved good
performance compared with other algorithms in terms of makespan. In terms of
resource utilization, all algorithms have similar performance with little favour to Max-
19
Min algorithm. For matching proximity criterion, Min-Min algorithm outperforms the
other algorithms followed by Sufferage algorithm. However, the experiments were
conducted using static environment which is not enough to explore the algorithm
behaviour. Nevertheless, these algorithms are useful to generate an initial solution for
other metaheuristics algorithms such as tabu search and genetic algorithm.
2.1.2 Evolutionary Algorithms
In spite of the fast solutions produced by schedulers based on greedy approach, the
quality of these solutions will drop down dramatically once the grid size increase
(Anousha, Anousha, & Ahmadi, 2014; Braun et al., 2001). In order to overcome the
obstacle of size increasing, metaheuristics algorithms emerged as a practical solution
for job scheduling with reasonable time and resources (Qureshi et al., 2014). One
important category of metaheuristics algorithms is Evolutionary Algorithms (EA) (Yu
& Gen, 2010). The process in EA algorithms is similar to natural process in living
organic such as crossover, mutation, and selection (Stoean & Stoean, 2014). Several
algorithms are developed under the EA category such as genetic algorithm (Holland,
1992), evolution strategies (Schwefel, 1995), evolutionary programming (Fogel,
Owens, & Walsh, 1996), and genetic programming (Koza et al., 2003). Evolutionary
algorithms have been implemented successfully for job scheduling in computational
grid systems with good results. Some of the EA works are discussed such as memetic
algorithms, differential evolution algorithms, and genetic algorithms.
I. Memetic Algorithms
Memetic Algorithms (MA) is an optimization technique which combines different
search methods such as population based search and local search algorithm (Moscato
20
& Cotta, 2010). The idea behind MA is that individuals are not simply doing
crossover and mutation, but also doing some enhancement to their solution elements.
This enhancement is accomplished by incorporating heuristic, approximation
algorithms, or local search technique (Gendreau & Potvin, 2010). Due to the
combination of different techniques, MA is sometimes called hybrid evolutionary
algorithms (Davis, 1991). Memetic algorithms are applied to job scheduling in grid
computing by several studies.
Cellular Memetic Algorithm (cMA) proposed to optimize the makespan and flowtime
for job scheduling in grid computing by Xhafa, Alba, and Dorronsoro (2007). As a
component of cMA algorithm, the study implemented three types of local search
algorithms: Local Move (LM), Steepest Local Move (SLM), and Local Minimum
Completion Time Swap (LMCTS). The experiments conducted using static
environment based on expected time to compute model presented by Braun et al.
(2001). In addition, the proposed algorithm were evaluated with three other versions
of GA as presented in Braun et al. (2001), Carretero and Xhafa (2006), and Xhafa
(2006). The first experiment shows that the local search algorithm LMCTS is the best
among the three considered algorithms. Experiment on makespan criterion shows that
the proposed cMA performs better than other algorithms in some instances. However,
the experiments were not organized properly for example, the flowtime results were
obtained only from comparing cMA with LJFR-SJFR and GA from Xhafa (2006) and
nothing was reported about the other two GA algorithms. Moreover, the study focused
only on static grid computing environment. Therefore, more study and experiments
using dynamic grid computing environment are required to explore the algorithm
behaviour.
21
Xhafa, Alba, Dorronsoro, Duran, and Abraham (2008) published a study using
Cellular Memetic Algorithm (cMA) for job scheduling in grid system. The study
aimed to optimize makespan and flowtime simultaneously as a bi-objective. The
authors implemented two different local moves: Local Minimum Completion Time
Swap (LMCTS), and Local Tabu Hop (LTH) based on Tabu Search (Glover &
Laguna, 1997). Therefore, the study proposed two different cMA algorithms, namely:
cMA+LMCTS and cMA+LTH. The proposed approaches were evaluated with static
and dynamic grid computing environments. The static environment benchmark
problems were generated using expected time to compute model presented in Braun et
al. (2001). On the other hand, the dynamic environment benchmark problems
generated using a simulator developed by Xhafa, Carretero, Barolli, and Durresi
(2007). The proposed algorithms were evaluated first with TS as presented by Xhafa,
Carretero, Alba, and Dorronsoro (2008) in order to select the best move using static
scenario. The first experiment shows that cMA+LTH algorithm was able to achieve
good makespan reduction on five instances while cMA+LMCTS algorithm could not
achieve any good results. On the other hand, TS achieved the best results on seven
instances. Nevertheless, cMA+LMCTS performs better than cMA+LTH with
flowtime reduction. The second experiment compared cMA+LTH performance with
three other versions of GA approaches as presented by Braun et al. (2001), Carretero
and Xhafa (2006), and Xhafa (2007). The empirical results show that cMA+LTH
outperforms other algorithms in static and dynamic environments. It is clear from the
results that some reduction in makespan value will lead to increase in flowtime
values. Although the study was conducted using static and dynamic grid computing
environments, the comparison is not sufficient, for example, the authors did not
22
compare with TS in dynamic scenario. However, the concept of utilizing local search
algorithm inside other metaheuristics algorithm is quite useful.
Zhong, Long, Zhang, and Song (2011) published a study on efficient memetic
algorithm for scheduling job in grid computing. The authors incorporated hillclimbing and tabu search algorithms as solution enhancement mechanisms. The
experiments were conducted using benchmark problems based on expected time to
compute model as introduced by Braun et al. (2001) and the fitness function to
minimize makespan and flowtime values as proposed in Xhafa, Alba, et al. (2008).
The proposed algorithms evaluated against genetic algorithm described in Braun et al.
(2001). The experiment results show that MA with hill-climbing and tabu search
algorithms outperforms genetic algorithm for consistent and semi-consistent grid
scenarios. Comparing between MA-Tabu search and MA-hill-climbing, MA-tabu
search showed faster performance. However, the flowtime results are not reported
which is supposed to be part of the optimization function. In addition, an experiment
conducted only on static environment is not enough to conclude the algorithm
behaviour on dynamic scenario. Moreover, the proposed algorithm did not compare
with other recent metaheuristics algorithms such as artificial bee colony and ant
colony optimization. Nevertheless, the idea of using tabu search algorithm as a
mechanism to enhance the individual solution could be integrated with other
metaheuristics algorithms such as ant colony optimization.
A comparison study on the performance of genetic algorithm, memetic algorithm,
cellular memetic algorithm, and hybrid algorithms with tabu search has been proposed
by Xhafa, Koodziej, Duran, Bogdanski, and Barolli (2011). The study illustrated the
advantages and limitations of different population based methods for job scheduling
23
in computational grid systems. In addition, the study tried to investigate the benefits
of hybridizing population algorithm with local search algorithms such as tabu search.
The authors considered a bi-objective scheduling problem in grid computing to
measure the scheduling effectiveness, namely makespan and flowtime which
optimized simultaneously. The study considered the tasks to be processed in a batch
mode as described in Xhafa and Abraham (2010). In addition, the problem was
formulized based on expected time to computed matrix model as proposed by Ali et
al. (2000b). The experimental analysis was performed using HyperSim-G simulator as
developed by Xhafa et al. (2007). These experiments were conducted on static and
dynamic instances. The experiment results for static instances show that memetic
algorithm achieved the best makespan value for large instances, while cellular
memetic algorithm hybridized with tabu search achieved the best results for small
instances. The dynamic experiments results show that the hybrid cellular memetic
algorithm with tabu search outperforms the other algorithms for small, medium, and
large instances, while memetic algorithm hybridized with tabu search achieved the
best makespan values for very large instances. The study concluded that hybridizing
memetic and cellular memetic algorithms with tabu search will enhance the algorithm
performance. However, the study did not show the result of the flowtime values
which is considered as a part of the bi-objective function. Nevertheless, the proposed
comparison provides good foundation regarding the performance of memetic and
cellular memetic algorithm when they hybridize with tabu search algorithm.
II. Differential Evolution Algorithm
Differential Evolution (DE) is an optimization algorithm developed by Kenneth Price
in 1995 (Price, Storn, & Lampinen, 2005). DE is a population-based algorithm that
24
has the operators: crossover, mutation, and selection to evolve a population of
candidate solutions toward an optimal solution. Differential evolution algorithm for
job scheduling in heterogeneous distributed environment has been proposed by
Kromer, Snasel, Platos, Abraham, and Izakian (2009). The study aimed to optimize
the job scheduling in terms of makespan and flowtime with priority to makespan as
suggested by Carretero et al. (2007). The proposed algorithm was implemented with
classic version adopted from Price et al. (2005). The experiments were conducted
using the expected time to compute model as proposed by Braun et al. (2001). In
addition, the experiments compared the proposed algorithm with max-min, suffrage,
min-min, and min-max algorithms. The experiment results show that the proposed
algorithm did not achieve a good result when it started with random initial solution.
However, with initial solution generated using the heuristic algorithm, the proposed
algorithm outperforms all other algorithms in terms of makespan. Nevertheless, in
terms of flowtime optimization, the proposed algorithm did not perform well.
Selvi and Manimegalai (2010) conducted a study on job scheduling for grid
computing based on differential evolution algorithm. The objective of the study is to
minimize the makespan value as an optimization objective. The experiments were
conducted using static benchmark problems used by Liu, Abraham, and Hassanien
(2010). The proposed algorithm was implemented using MATLAB application. The
performance of the implemented algorithm is compared with fuzzy discrete particle
swarm optimization algorithm as proposed in H. Liu et al. (2010). The experiment
results show that the proposed algorithm achieved good standard deviation and
completion time. However, in terms of makespan values, the proposed approach did
not achieve good results compared to other algorithms. In addition, the experiments
25
were conducted using static benchmark problems. Therefore, the proposed algorithm
needs to be tested on dynamic environment in order to draw the final conclusion.
III. Genetic Algorithm
Genetic Algorithm (GA) is a well-known algorithm to solve various types of
combinatorial optimization problems developed in 1975 by John Holland (Blum &
Roli, 2003). Genetic algorithm is applied in various types of scheduling problems,
such as manufacturing scheduling (Gen, Zhang, Lin, & Jo, 2014), scheduling of
production and transport systems (Hartmann, Makuschewitz, Frazzon, & Scholz-
Reiter, 2014), and scheduling workflow applications in cloud environment (Singh &
Singh, 2014).
GA has three prime operators, namely crossover, mutation, and selection (Yang,
2014). However, In terms of mathematics, there are no explicit mathematical
equations for general genetic algorithm (Yang, 2014). GA procedure’s details, such as
steps on how to generate a new generation from a population and how to process the
operators are provided in many literatures (Reeves & Rowe, 2003; Sivanandam &
Deepa, 2008). Figure 2.1 shows the pseudocode of basic genetic algorithm.
26
Figure 2.1. Basic Genetic Algorithm (Zapfel et al., 2010)
In Figure 2.1, the first step is initializing the population (P) which is generated
randomly or using some heuristic algorithm (Carretero & Xhafa, 2006). Figure 2.2
visualizing the GA population (Zapfel et al., 2010).
Figure 2.2. Visualization of GA population (Zapfel et al., 2010)
Procedure Genetic Algorithm
Step 1- 푃 ←initial population;
Step 2- Evaluate (푃);
Step 3- While termination criterion not satisfied do
Step 4- 푃
′
←Select
(
푃
)
;
Step 5- Crossover
(
푃
′
)
;
Step 6- Mutate(푃
′
);
Step 7- Evaluate(푃
′
);
Step 8- 푃 ←푅푒푝푙푎푐푒(푃
′
∪푃);
Step 9- End
End Procedure
Population
Individual / Chromosome
1, 1, 0, 0, 0, 0, 1, 0, 0, 0
0, 0, 0, 1, 0, 0, 1, 0, 1, 0
1, 1, 0, 0, 0, 0, 1, 0, 0, 0
1, 0, 0, 1, 0, 0, 1, 1, 0, 0
0, 0 0, 0, 0, 0, 1, 1, 1, 1
1, 1, 0, 1, 0, 0, 1, 0, 1, 0
0, 1, 0, 1, 1, 0, 1, 0, 0, 1
0, 1, 1, 0, 0, 1, 1, 0, 1, 0
1, 0, 1, 0, 0, 0, 1, 0, 0, 0
1, 1, 0, 0, 0, 0, 1, 0, 0, 0
1, 1, 0, 0, 0, 0, 1, 0, 0, 0
Gene at locus 1
27
The second step in GA is the evaluation process. Evaluation is an operator to calculate
the solution quality which is called fitness in terms of genetic algorithm. The solution
fitness is required for selection and replacement operators. In other words, solutions
with better fitness value are preferred in the selection process (Zapfel et al., 2010). In
order to calculate the fitness value in job scheduling problem in grid computing, the
following equation is used:
푓푖푡푛푒푠푠= 1 / 푚푎푘푒푠푝푎푛
(2.1)
where 푚푎푘푒푠푝푎푛 is the completion time of the last task by the system (Xhafa &
Abraham, 2010).
The third step in GA algorithm is the loop using while and termination condition. The
execution could be stopped using one or more conditions, such as, specific number of
iteration, specific time of execution, and maximum number of iteration without any
enhancement in the solution quality (Zapfel et al., 2010).
The fourth step in GA algorithm is the selection process which is responsible to select
part of the population for crossover and mutation operators. There several selection
methods, such as Roulette-Wheel-Selection, Linear-Rank-Selection, and Tournament
Selection (Zapfel et al., 2010).
The fifth step in GA is the recombination or crossover operators. Crossover operator
is the process of combining genes from the selected solutions in order to produce a
new solution. There are several types of crossover operator, such as one-point, twopoint, N-point, and uniform crossover (Zapfel et al., 2010). Figure 2.3 shows
examples of one-point, two-point, and uniform crossover.
28
Figure 2.3. Examples of crossover operators (Zapfel et al., 2010)
The sixth step in GA algorithm is the mutation operator. Mutation is the process of
perturbation of the solution with small probability. Mutation operator will help the
algorithm to prevent from premature convergence. In binary solution representation,
the mutation process could be done by changing the value from 1 to 0 or vice versa.
This method is known as bit-flip mutation. In job scheduling problem for grid
computing, mutation could be implemented using re-balance method (Xhafa, Duran,
Abraham, & Dahal, 2008).
The seventh step in GA is the evaluation of the new solutions. This step is similar to
step 2 using the new solutions instead of the whole population.
The last step in GA is the replace operator which replaces the new generated solution
with other solutions from the population. There are several replacement methods,
such as generational replacement and steady state replacement. In generational
method, the new generated solutions supersede the old solutions. In steady state
method, multiple solutions or only one solution is replaced. Figure 2.4 illustrates the
process of genetic algorithm.
29
Figure 2.4. Process of Genetic Algorithm (Zapfel et al., 2010)
An experimental study regarding job scheduler based genetic algorithm for large grid
environment has been proposed by Carretero and Xhafa (2006). The proposed
algorithm aims to minimize the makespan and flowtime values of job scheduling on
30
grid computing. The study considered two versions of optimization. First is
hierarchical structure where makespan is optimized first, then followed by flowtime.
Second is simultaneous structure where both objectives are optimized simultaneously.
The proposed algorithm was implemented using a skeleton as defined in Alba et al.
(2002). The study conducted static and dynamic experiments. The static benchmark
problems were generated using expected time to compute model from Braun et al.
(2001), while dynamic benchmark problems were generated using dynamic simulator
developed by Alba et al. (2002). For static experiments, the proposed algorithm
compared with Min-Min, MCT, and GA algorithms from Braun et al. (2001). The
results show that the proposed genetic algorithm with hierarchic optimization
structure outperforms other algorithms in terms of makespan. However, flowtime
results for static experiments are not reported in the study. For dynamic experiments,
the results show that the proposed algorithm with hierarchical structure achieved the
best makespan values for three instances, while simultaneous structure achieved the
best results for one instance in terms of makespan. Flowtime results were not reported
as well. The study provided the implementation details for genetic algorithm.
However, the proposed algorithm was compared only with heuristic algorithms and
other genetic algorithm implementation. Therefore, it is unknown how is the
performance of the proposed algorithm compared with other metaheuristics
algorithms such as ant colony optimization and artificial bee colony.
Job scheduling for grid computing based on genetic algorithm has been proposed by
Carretero et al. (2007). The study aimed to minimize makespan and flowtime values
either in a hierarchical mode with makespan as primary objective or in a simultaneous
mode. In addition, two types of encoding schemes have been presented in the study
with several operators implementation. The proposed algorithm was implemented
31
based on a generic skeleton as developed by Alba et al. (2006). The benchmark
problems were generated using expected time to compute model presented in Braun et
al. (2001) which consists of twelve static instances. The implemented algorithm was
evaluated against genetic algorithm results as presented in Braun et al. (2001). The
experiment results show that the proposed algorithm outperforms other algorithms in
terms of makespan with simultaneous mode. However, flowtime values are not
provided in Braun et al. (2001). The study compared the proposed algorithm in terms
of flowtime values between hierarchical and simultaneous modes which show favour
results to simultaneous approach. The authors also noticed that makespan and
flowtime are contradictory objectives. Therefore, trying to optimize one objective
may not suit the other objective, especially when the scheduling plan is close to
optimal solution. In spite of the good results achieved by the proposed algorithm,
more investigations are required especially under dynamic job scheduling
environment in order to conclude the algorithm performance in all circumstances.
Another experimental study on genetic algorithm for job scheduling on grid
computing has been proposed by Xhafa, Barolli, and Durresi (2007a). The authors
presented two algorithms for scheduling independent jobs to grid resources based on
two replacement mechanisms, namely Steady-State Genetic Algorithm (SSGA) and
Struggle Genetic Algorithm (SGA). The study experiments were conducted using
benchmark problems generated using expected time to compute model as presented in
Braun et al. (2001). The proposed algorithm was compared with genetic algorithm
developed by Braun et al. (2001). The experiment results show that SGA performs
better than SSGA for moderate grid size problems in terms of makespan. However,
for larger grid size problems, SGA did not achieve good results as SSGA. In terms of
flowtime, SGA also achieved better performance than SSGA. The study
32
recommended implementing SGA for small and medium grid size and SSGA for large
and very large grid size. However, the study only considered static grid computing
environment which is not enough to observe the algorithm behaviour. Therefore,
these two algorithms need to be tested on dynamic grid computing environment in
order to conclude the algorithm robustness.
Xhafa, Duran, Abraham, and Dahal (2008) published a study on Straggle strategy in
Genetic Algorithm (SGA) for job scheduling in computational grid. The study
implemented hash function for computing the similarity of solutions in order to
enhance the standard similarity measures. The aim of the proposed approach is to
minimize the makespan value as an optimization objective. The proposed algorithm
was evaluated with static benchmark problems generated using the expected time to
compute model as presented in Braun et al. (2001). The evaluation is done between
SGA and SGA with hash function. The experiment results show that using the hash
function with SGA, it improves the algorithm performance in terms of makespan
value for several types of instances. The idea of using hash function is useful in order
to avoid evaluating a similar solution for each cycle in the algorithm process which is
time consuming. This idea could be implemented with other metaheuristics
algorithms such as tabu search and ant colony optimization. However, more studies
and experiments are required to observe the algorithm behaviour in terms of flowtime
and resource utilization. In addition, more experiments on job scheduling in dynamic
grid computing environment will provide more information regarding the use of hash
function.
A hierarchic genetic algorithm scheduler of jobs in grid computing environment has
been published by Kolodziej, Xhafa, and Kolanko (2009). The authors proposed an
33
algorithm called Hierarchic Genetic Strategy (HGS) for job scheduling on grid
computing systems. The study aimed to optimize the makespan and flowtime values
simultaneously. In addition, the study objective is to investigate several variations of
HGS operators and parameters to identify the best configuration for job scheduling
problem. The experiments were conducted using static benchmark instances for job
scheduling on grid computing which is based on the expected time to compute model
as developed by Ali et al. (2000b). The proposed algorithm was compared with
classic genetic algorithms as presented by Braun et al. (2001) and Carretero et al.
(2007). The experiment results show that the proposed algorithm performs better than
the other two algorithms in terms of makespan and flowtime as well. The study also
revealed that rebalancing method is the best mutation operator for HGS. However, the
experiments were conducted on static grid computing environment; therefore, more
experiments are required to test the proposed algorithm on dynamic environment to
examine the algorithm robustness.
A study conducted by Kumar, Kumar, and Kumar (2011) for job scheduling used
genetic algorithm. In their study, they considered the network transmission time when
making scheduling decisions. They argue that the scheduler who does not take into
account the network load when making the scheduling decisions might not produce
optimal scheduling. In their work, they implemented multi-objective genetic
algorithm for job scheduling in grid computing using GridSim simulator as developed
by Buyya and Murshed (2002). The algorithm focused on minimizing the jobs’
finalization time and makespan by minimizing the jobs’ data transfer time between
data storage location and computing resource site over the network. The job size was
presented in Million Instructions (MI) and the resource capacity was presented in
Million Instructions Per Second (MIPS). To calculate the network load in the
34
scheduling algorithm, they used four arrays. The proposed algorithm was compared
with non-network-aware scheduling algorithm for grid computing. The results show
that the proposed scheduling algorithm performs better than the non-network-aware
scheduling algorithm. However, the experiments scenario is very limited (using only
50 jobs) which did not give a clear robustness picture about the algorithm.
Nevertheless, the idea of calculating the network transmission time is very important
when scheduling jobs in grid systems and needs more investigation to concrete the
concept.
Enhanced Genetic-based scheduling for grid computing has been developed by
Kolodziej and Xhafa (2011). The authors presented an implementation of hierarchic
genetic strategy for job scheduling in dynamic computational grid environment. HGS
has the ability to search the solution space concurrently using various evolutionary
processes. The study focused on bi-objective optimization, specifically, makespan and
flowtime to be simultaneously optimized. The experiments were conducted under
heterogeneous, large scale, and dynamic environments using grid simulator. The
algorithm was tested with static and dynamic grid computing environments. The
experiment with static environment is based on expected time to compute matrix
model as presented in Ali et al. (2000b). For dynamic environment, the authors used a
simulator developed by Xhafa and Carretero (2009). The proposed algorithm was
compared with two other GA-based schedulers presented in Braun et al.( 2001), and
Carretero et al. (2007). The results show that the proposed algorithm outperforms the
other GA-based scheduler algorithms. However, the proposed algorithm results were
compared only with GA. Therefore, it is unknown how the proposed algorithm will
perform against other metaheuristics algorithms.
35
2.1.3 Local Search
The basic idea of local search is that solution is successively modified by performing
moves that change solution locally (Vob, 2001). The solutions that can be reached by
moves are called neighbourhood solutions. Various techniques have been developed
to search the solution’s neighbourhood, such as random, iterated, greedy, variable
neighbourhood search, and steepest descent algorithms (Aarts & Lenstra, 2003;
Gendreau & Potvin, 2010; Zapfel et al., 2010). However, using metaheuristics
algorithms for local search, such as simulated annealing and tabu search, showed
promising performance in grid computing.
I. Simulated Annealing
Simulated Annealing (SA) is known as an optimization algorithm inspired by nature.
SA mimics the physical process in the annealing of materials when a metal cools and
freezes into a crystalline state (Yang, 2014). SA algorithm was developed in 1980s by
Kirkpatrick, Gelatt, and Vecchi (1983). Simulated annealing algorithm has been
implemented in almost every field of combinatorial optimization problems (Yang,
2014). The main advantages using SA is the ability to skip from local minima by
controlling the threshold value for the maximum allowed decrease in solution quality
(Vidal, 1993; Zapfel et al., 2010). SA algorithm has been proved to converge to the
global optimality if enough time and randomness are given with very slow cooling
(Aarts, Korst, & Michiels, 2014; Yang, 2014). SA algorithm has been implemented
for job scheduling in computational grid environment successfully.
YarKhan and Dongarra (2002) conducted an experimental study on using Simulated
Annealing (SA) algorithm for job scheduling in grid environment. The study used
dynamic machine status and connectivity information from the Globus
36
Metacomputing Directory Service (MDS) defined in YarKhan and Dongarra (2002)
and Network Weather System (NWS) proposed by Wolski, Spring, and Hayes (1999).
The proposed algorithm was compared with Ad-hoc greedy approach presented in
Ullman (1975). The experiments were conducted using real grid computing system
provided from the University of Tennessee and University of Illinois. The experiment
results show that the proposed simulated annealing algorithm outperforms Ad-hoc
greedy algorithm in terms of estimated execution time. In spite of the practical
experiments using real grid environment, the experiments were very limited in terms
of benchmark problems size. In addition, the proposed approach needs to be
compared with other metaheuristics algorithms in order to observe the algorithm
performance. However, using simulated annealing algorithm will help to avoid the
local minima and search near optimal solution.
A study on implementing simulated annealing algorithm for job scheduling in
computational grid has been proposed by Fidanova (2006). The authors claimed that
simulated annealing has two important features, which are: First, SA algorithm is
proved to converge to the optimal solution. Second, AS algorithm is easy to be
implemented to solve job scheduling problems in grid computing. The implemented
scheduler utilizes the expected time to compute model which is developed by
Maheswaran, Ali, Siegel, Hensgen, and Freund (1999). The objective of the study is
to minimize the makespan value for job scheduling as an optimization function in
batch mode. The proposed algorithm starts with generating initial solution using
greedy heuristic approach. The performance of the proposed AS algorithm compared
with online (queue) algorithm and ant colony optimization provided by Fidanova and
Durchova (2006). The experiments were conducted using a scenario developed by the
authors which consists of 5 resources and 20 jobs. In terms of makespan, the results
37
show that the proposed algorithm outperforms other algorithms followed by ant
colony optimization algorithm. However, the experiment scenario is very small in
terms of the number of resources and jobs. In addition, the study only focused on
static grid computing environment. Therefore, designing and conducting a bigger
scenario in static and dynamic environments will reveal the real performance of
simulated annealing algorithm for job scheduling in grid computing.
Cai, Ning, Li, and Zhong (2007) published a study on using simulated annealing
algorithm for independent task assignments in heterogeneous computing system. The
authors proposed two neighbourhood structures to search the best neighbour of the
current solution in the search space. The study aimed to minimize the makespan value
as an optimization objective. The experiments were conducted using benchmark
problems based on expected time to compute model defined in Braun et al. (1999).
The proposed algorithm was compared with Min-Min algorithm and simulated
annealing approach proposed by Braun et al. (1999). The empirical results show that
the proposed simulated annealing algorithm with random and swapping
neighbourhood structures outperforms the other algorithms in terms of makespan.
However, the proposed approach needs to be compared with other metaheuristics
algorithm. In addition, more experiments using dynamic grid computing environment
are required to test the algorithm robustness. Nonetheless, the idea of using various
neighbourhood structures is very important for local search algorithms such as
simulated annealing.
An implementation of simulated annealing algorithm for job scheduling in
computational grid has been proposed by Guo and Wang (2010). The proposed
approach tried to overcome two problems in implementing SA algorithm in grid
38
computing, namely the algorithm overhead and the best tuning parameters. The
authors developed a simulator called Ana-GridSim which is an extension of GridSim
simulator developed by Buyya and Murshed (2002). The experiment results show that
the proposed approach achieved good performance in terms of average error.
However, the experiment design was very poor in terms of benchmark problems (only
7 resources were used). In addition, optimization function was not defined, such as
makespan values have not been optimized. Moreover, the proposed algorithm was not
compared with other metaheuristics algorithms. Therefore, a more comprehensive
investigation on implementing simulated annealing algorithm for job scheduling on
grid computing systems is required.
II. Tabu Search
Tabu search is a metaheuristics algorithm based on a guided local search developed
by Glover (1986). TS algorithm has been successfully implemented to solve many
optimization problems, such as job shop scheduling, travelling salesman problem, and
vehicle routing problem (Gendreau & Potvin, 2014). According to Burke and Kendall
(2014), over the last 25 years, hundreds of papers presenting applications of tabu
search proposed to solve various combinatorial problems. TS is classified as a local
search that has the ability to skip from the local minimum by applying many
mechanisms such as memory and diversification (Rothlauf, 2011). Specifically, TS
applies the concept of adaptive memory and responsive exploration that make the
algorithm more flexible. TS algorithm operates using four types of memory, namely:
recency (short-term memory), frequency (long-term memory), quality, and influence
(Glover & Laguna, 1997). However, many successful applications use only one or
two of these memories. Figure 2.5 shows the tabu search algorithm process.
39
Figure 2.5. Process of Tabu Search algorithm (Zapfel et al., 2010)
The memory in tabu search could be explicit or attributive. Explicit memory records
the complete solution, while attributive memory records the information about the
solution attribute that changes when moving from one solution to another. For
example, in job scheduling scenario, moving a task 푡
푖
from machine 푚
푝
to machine
푚
푞
will create a new solution vector. Hence, TS memory could be implemented based
on recording the complete old solution or recording only the attribute that changed the
solution that is the history of assigning task 푡
푖
to machine 푚
푝
. Recording the
attributes will prevent the algorithm (tabu) to reverse to the old assignment for 푘
number of iterations with the same task and machine. However, this tabu attribute
could be override if the move will produce a solution better than the best-so-far
solution (could be any other criteria); this mechanism is called aspiration level. The
duration parameter for a move to be considered as a tabu is called tabu tenures
(Glover & Laguna, 1997). Effective tabu tenures depend on the size of a problem
instance.
40
Tabu search starts with initial solution which could be created randomly or by using
any ad-hoc algorithm such as Max-min, Min-min algorithm for scheduling (Xhafa,
Carretero, et al., 2008). From the initial solution, TS will start searching the
neighbourhood in order to find the local optima. If the move to the neighbour solution
is not tabu, then the neighbour solution is saved as the current solution. If the
neighbour solution is better than the best-so-far solution, then it saved as the best-sofar solution. In the case that the move is tabu, the inspire level will be checked in
order to override the tabu if the neighbour solution is better than best-so-far solution.
After moving to the neighbourhood solution, TS will update its memory and start a
new iteration if the termination condition is not met. The mechanism of searching the
neighbourhood repeatedly seems to guide the search towards an interesting area in the
search space quickly (Costa, 1994). However, there are many issues which need to be
addressed when implementing tabu search, such as what information to be saved in
the memory, the size of the tabu lists, how to move to the neighbourhoods, and how to
implement diversification (Thesen, 1998). Tabu search algorithm is applied for
scheduling problems in grid computing successfully. Figure 2.6 shows tabu search
algorithm pseudocode.
41
Figure 2.6. Tabu Search algorithm pseudocode (Zapfel et al., 2010)
In Figure 2.6, the first step in TS algorithm is creating the initial solution using
random approach or ad hog algorithm. The second step is initializing the global
solution from the initial solution. Third step is creating the tabu list to store the
movement history. The forth step in initializing the aspiration function to be used to
override the tabu movement.
Fifth step is the starting of the algorithm iteration which is terminated using stop
condition, such as the algorithm reached the maximum number of iterations, reach the
maximum time allowed, or no enhancement achieved for specific number of
iterations.
Procedure Tabu Search Algorithm
Step 1- Create initial solution 푠;
Step 2- Create global solution 푠
∗
← 푠;
Step 3- Create tabu list 푇퐿;
Step 4- Initialize the aspiration function 퐴;
Step 5- While (termination condition not satisfied) Do
Step 6- Search the neighbourhood 푁 of current solution 푠: {푠̂∈푁(푠)};
Step 7- If (move from 푠 to 푠̂ is not in 푇퐿) Then
Step 8- 푠 ← 푠̂;
Step 9- End If;
Step 10- Else If (푓
(
푠̂
)
<퐴(푓
(
푠
)
) Then
Step 11- 푠 ← 푠̂;
Step 12- End If;
Step 13- Update 푇퐿 memories;
Step 14- If (푓(푠) < 푓(푠
∗
)) Then
Step 15- 푠
∗
= 푠;
Step 16- End If;
Step 17- End While;
Step 18- Return Global solution 푠
∗
;
End Procedure;
42
Sixth step is searching the neighbourhood in order to generate different solution
which is based on current solution (Glover & Laguna, 1997). Step seven will check if
the movement to the new neighbourhood solution is not tabu (not visited before) then
the solution will be accepted even if it is worse than the current solution. This
technique makes TS algorithm skips from local optima trap. Step eight will save the
new accepted solution as the current solution and the condition ends at step nine. If
the solution is tabu, step ten will check the aspiration function to determine whether to
override the tabu if the solution quality better than the current solution or discard the
new solution. Step eleven will save the new solution as the current solution if the
aspiration condition satisfied and the step will end in step twelve.
Step thirteen will update the tabu list memory, such as the short and long memories.
Step fourteen will check if the current new solution is better than the global solution,
then the current solution will be saved as the globe best solution in step fifteen and
ends in step sixteen. In job scheduling problem, the solution quality is measured using
makespan metric. Step seventeen will end the while and the best global solution return
in step eighteen.
Tabu search design and evaluation for job scheduling in grid computing has been
proposed by Xhafa, Carretero, et al. (2008). The aim of the study is to minimize the
makespan and flowtime values as a bi-objective optimization problem. The biobjective function is implemented with a hierarchic approach in which makespan is
considered as a primary objective and flowtime as a secondary objective. The
proposed algorithm starts with the initial solution generated using Min-Min algorithm.
To search the neighbourhood of the initial solution, two types of movements were
implemented, namely transfer and swap which are adopted from Thesen (1998). The
43
proposed study used the expected time to compute model as developed by Braun et al.
(2001). The implemented algorithm was compared versus tabu search and ant colony
optimization algorithm hybridized with tabu search as proposed in Ritchie and Levine
(2004). The experiment results show that the proposed design and implementation
outperforms the approach proposed by Ritchie and Levine (2004). Another
experiment was conducted using the benchmark problem as defined by Carretero and
Xhafa (2006) which represent a large instance problem in dynamic environment. The
proposed algorithm was compared against steady-state genetic algorithm developed
by Carretero and Xhafa (2006). Again, the results show that the proposed algorithm
achieved better makespan values than steady-state genetic algorithm. The authors
concluded that the proposed design and implementation for tabu search algorithm are
more efficient than other implementation. In addition, the authors noticed that
makespan and flowtime are contradictory objectives; this observation is very
important to understand the complexity of job scheduling in grid computing.
Tabu search algorithm has been implemented for job scheduling in grid computing by
Xhafa, Carretero, Dorronsoro, and Alba (2009). The authors defined a bi-objective
optimization problem consisting of makespan and flowtime. Their proposed algorithm
adapted two types of neighbourhood movement namely transfer and swap. In
addition, the algorithm implanted intensification and diversification strategies to
achieve better results. Their study dealt with static and dynamic environments for
algorithm evaluation. For static environment, the benchmarks were generated based
on expected time to compute model as presented by Ali, Siegel, Maheswaran,
Hensgen, and Ali (2000a). While for dynamic environment, the benchmarks were
generated using extended HyperSim simulator as presented in Phatanapherom,
Uthayopas, and Kachitvichyanukul (2003). The proposed algorithm was compared
44
with three metaheuristics algorithms, namely TS, ACO+TS, and cMA (Ritchie &
Levine, 2004; Xhafa, Alba, et al., 2008) for static experiment. For dynamic
experiment, the proposed algorithm was compared with GA as presented by Carretero
and Xhafa (2006). The experiment results show that the proposed algorithm
outperforms the other algorithms in static and dynamic environment. The study
provides all the implementation details and the pseudo-code as well which makes the
study repeatable and easy to re-implement. However, there are many neighbourhood
movements other than transfer and swap, which has the ability to find better local
optima, such as insert and load balance moves.
2.1.4 Swarm Intelligence Algorithms
According to Gazi and Passino (2011), the terminology of “swarms” has come to
mean as “a set of agents possessing independent individual dynamics but exhibiting
intimately coupled behaviours and collectively performing some task”. Swarm
Intelligence (SI) algorithms are these algorithms which are nature-inspired such as ant
colony optimization, artificial bee colony, particle swarm optimization, cuckoo
search, and firefly algorithms (Yang, 2014). SI algorithms try to mimic the biological
behaviour of some creatures such as colonies of ants or bees, flocks of birds, and
schools of fish (Gazi & Passino, 2011). Swarm intelligence algorithms are inspired
the field of computing study, specifically the optimization field (Pintea, 2014). In
terms of computational model, swarm intelligence models are considered as
computing algorithms that are useful for solving distributed optimization problems
(Lim & Jain, 2009). The principles of swarm intelligence algorithm are proximity,
quality, diverse response, stability, and adaptability (Lim & Jain, 2009). Swarm
intelligence methods have shown very successful performance in the area of
45
scheduling which is of great importance for the industry and science (Blum & Li,
2008). The following subsections discuss the studies of swarm intelligence algorithms
for job scheduling in grid computing.
I. Ant Colony Optimization
In 1992, Marco Dorigo presented the first ACO algorithm in his PhD thesis to search
for an optimal solution in graph (Dorigo & Stutzle, 2004).The variants of ACO are:
(i) Ant System (AS), ant system is the first algorithm introduced in ant colony
optimization algorithms and the prototype of a number of ant algorithms extension. It
was initially proposed by Colorni, Dorigo, and Maniezzo (1991), and Dorigo et al.
(1991) aimed to search for an optimal path in a graph based on the behaviour of ants
seeking a path between their colony and food source. AS is also the first ACO
algorithm which has been applied to the travelling salesman problem (Dorigo et al.,
1996). Three different versions of ant system were proposed, which are ant-density,
ant-quantity, and ant-cycle. In ant-density and ant-quantity, the ants update the
pheromone directly after a move from a city to another city. But in ant–cycle, the
pheromone update was only done after all the ants had constructed the tours. The two
main phases of the ant system algorithm constitute the ants’ solution construction and
the pheromone update. The performance of ant system when compared to other
algorithms tends to decrease dramatically as the size of the test-instances increases.
For AS tour construction, an ant applies probabilistic action choice rule, called
random proportional rule, to decide which node to visit next. The probability of the
ant to move from node to node depends on pheromone and heuristic values. AS
updates the pheromone trails after all ants have constructed their tours. The first step
in pheromone updating is lowering the pheromone values (evaporation) on all arcs by
46
a constant factor. This step will enable the algorithm to forget the bad decision
previously taken, at the same time if the arc is not chosen by the ants, its associated
pheromone value decreases exponentially in the number of iterations. After
evaporation, all ants deposit pheromone on the arcs they have visited in their tour.
(ii) The first improvement on ant system, called the Elitist strategy for Ant System
(EAS), was introduced by Dorigo et al. (1991, 1996). This algorithm provides strong
additional reinforcement to the arcs belonging to the best tours found since the start of
the algorithms. In EAS, the global best solution deposits pheromone on all iterations
along with all the other ants and the pheromone evaporation is implemented as in the
ant system. The use of the elitist strategy allows the ant system to both find the better
tours and find them in a lower number of iterations. The additional reinforcement of
best tour is achieved by adding an extra quantity of pheromone to its arcs based on the
tour length and a new parameter defined as weight is given to the best-so-far tour.
(iii) Rank-Based Ant System (AS
rank
), another improvement over the ant system is the
rank-based version of ant system introduced by Bullnheimer, Hart, and Straub (1999).
In AS
rank
, each ant deposits an amount of pheromone that decreases with its rank. In
addition, as in EAS, the best-so-far ant always deposits the largest amount of
pheromone with during iteration. In AS
rank,
the first step for updating the pheromone
trails is sorting the ants by increasing the tour length. The quantity of pheromone an
ant will deposit is weighted according to the rank of the ant. During iterations, only
the best ranked ants and the ant that produced the best-so-far tour are allowed to
deposit pheromone. Among the AS-based algorithms, both, AS
rank
and EAS
performed significantly better than AS, with AS
rank
giving a slightly better result than
EAS.
47
(iv) Another variant of ACO algorithm is Max-Min Ant System (MMAS). This
algorithm has direct improvement over AS (Stutzle & Hoos, 1997, 2000). MMAS
differs from the basic approaches of AS in the following aspects. Firstly, it uses a
greedier search mechanism that allows a good exploitation of the accumulated
experiences. Secondly, MMAS uses a range of pheromone trail values to the interval
that help to avoid the premature stagnation (all ants converge early to one suboptimal
solution) of the search process. Thirdly, the initial value of pheromone trails is set to
the upper pheromone trail limit with a small pheromone evaporation rate to increase
the exploration of tours at the start of the search. Finally, in MMAS, pheromone trails
are reinitialized each time when the system does not produce an improved tour for a
certain number of consecutive iterations. To update pheromone trails, in MMAS, after
all ants have constructed a tour, pheromones are updated by applying evaporation as
in AS. After that, the deposit of a new pheromone is applied based on the best-so-far
tour. Only one of the two ants is allowed to add pheromone, either the best-so-far ant
or the iteration-best ant. In MMAS, lower and upper limits [휏
min
and 휏
max
] of
pheromone on any arc are used to avoid the search stagnation.
(v) Ant Colony System (ACS), this improvement has been introduced by Dorigo and
Gambardella (1997a, 1997b) to improve the performance of AS. ACS differs in three
main aspects from ant system. First, ACS uses a more aggressive action choice rule
than AS. Second, the pheromone is added only to moves belonging to the global-best
solution. Third, each time an ant moves on a path, it removes some pheromone from
that path. The three main phases of the ACS algorithm constitute the ants’ solution
construction, global pheromone trail update, and local pheromone trail update. ACS
algorithm starts solution construction when the ant moves from node to node. The ant
will choose the node using one of the two rules. The first rule is called pseudorandom
48
proportional rule which is based on exploitation mechanism. The second rule uses
exploration mechanism which is based on the probability distribution used in AS. The
tuning between exploitation and exploration is controlled by a parameter fixed by the
user. ACS algorithm applies the global pheromone trail update. In this update, only
one ant (the best-so-far ant) is allowed to add pheromone after all ants have finished
constructing their tours. In addition, ACS algorithm applies the local pheromone trail
update. In this update, all the ants apply a local pheromone update rule immediately
after moving on arcs during the tour construction using the evaporation concept.
In ACS algorithm, when ant k wants to move from node i to node j, it will choose the
node using a rule called pseudorandom proportional rule, calculated as:
푃
푖푗
퐴푛푡
푘
={
푎푟푔푚푎푥
푙∈푁
푖
푘
{휏푖푙
[
휂푖푙
]
훽
}, if 푞 ≤푞0;
퐽, otherwise
(2.2)
where q is a random variable uniformly distributed in [0, 1], q0 (0 ≤ q0 ≤ 1) is a
parameter, and J is a random variable selected according to the probability
distribution calculated as:
퐽=
[휏푖푗]
훼
[휂푖푗]
훽
∑
[휏푖푙]
훼
[휂푖푙]
훽
푙∈푁
푖
푘
,푖푓 푗 ∈ 푁
푖
푘
(2.3)
with α = 1. The tuning between exploitation and exploration is controlled by the
parameter q0.
In ACS only one ant (the best-so-far ant) is allowed to add pheromone after each
cycle. The Update is implemented using the following equation:
49
휏푖푗 ←
(
1−푃
)
τ푖푗+푝∆휏
푖푗
푏푠
, ∀
(
푖,푗
)
∈푇
푏푠
,
(2.4)
Where P is the pheromone evaporation rate, and ∆휏
푖푗
푏푠
= 1/ C
bs
.
For local pheromone update, a rule immediately applied after moving on arc (i, j)
during the solution construction using the following equation:
휏푖푗 ←
(
1− ξ
)
τ
ij
+ ξτ
0
,
(2.5)
Where ξ, 0 < ξ < 1, and τ
0
are two parameters. The value of τ
0
is equal to the initial
value for the pheromone trail.
Each variant of ACO algorithms implemented construction and pheromone update
methods. However, there are several differences between them. Table 2.1 shows the
differences between each variant in ACO algorithms.
Table 2.1
Difference between each variant algorithm in ACO
Algorithm
name
Differences and work mechanisms
AS 1. All ants deposit pheromone on the arcs they have crossed in their
tour.
EAS 1. Provides strong additional reinforcement to the arcs belonging to the
best tour by adding a quantity e/C
bs
.
AS
rank
1. Ant deposits an amount of pheromone that decrease with its rank.
2. In each cycle, only the best-so-far ant always deposits the largest
amount of pheromone.
MMAS 1. It strongly exploits the best tour found.
2. Limits the possible range of pheromone trail values to the interval
[휏
min
, 휏
max
].
3. The pheromone trails are initialized to the upper pheromone trail
limit.
4. Pheromone trails are reinitialized each time the system approaches
stagnation.
ACS 1. It exploits the search experience accumulated by the ant more
strongly than AS.
50
2. Pheromone evaporation and pheromone deposit take place only on
the arcs belonging to the best-so-far tour.
3. Each time an ant uses the arc (i, j) to move from node i to node j, it
removes some pheromone from the arc to increase the exploration.
ACO algorithm has been applied on several domains such as optimization,
classification, bioinformatics, and network.
Some domains contain more than one problem that needs to be solved using ant
colony optimization. Table 2.2 presents several types of researches conducted on
different domains.
Table 2.2
Various research on different Domains and problems
Domain Problem Name ACO type References
Routing Travelling Salesman
Problem
Novel
ACO
(Zhu, Zhao, & He, 2010b)
Vehicle Routing
Problem
ACS (Calvete, Gale, & Oliveros,
2012)
Vehicle Scheduling
Problem
Hybrid AS (Zhang, Ning, & Zhang,
2012)
Grid Computing Task Scheduling AS (Wei et al., 2012)
Resource Discovery AS (Devi & Pethalakshmi,
2012)
Grid Resource
Scheduling
Improved
AS
(Liu, Ma, Guo, & Wang,
2012)
Image
Processing
Image Edge Detection AS (Tian, Yu, Chen, & Ma,
2011)
Image Classification AS (Seo, 2012)
Optic Disc Detection AS (Pereira, Goncalves, &
Ferreira, 2013)
Operational
Research
Sequential Ordering
Problem
Enhanced
ACS
(Gambardella,
Montemanni, & Weyland,
2012)
Surgery Scheduling
Problem
AS (Yin & Xiang, 2012)
Process Planning
Optimization
AS (Liu, Yi, & Ni, 2013)
Manufacturing Assembly Sequence
Planning
MMAS (Yu & Wang, 2013)
Assembly Sequence AS (Youhui, Xinhua, & Qi,
51
Planning 2012)
Assembly Line
Balancing Problem
AS (Zheng, Li, Li, & Tang,
2013)
Database Distributed Join Query
Optimization
MMAS (Golshanara, Rankoohi, &
Shah-Hosseini, 2013)
Optimization For RDF
Chain Queries
AS (Hogenboom, Frasincar, &
Kaymak, 2013)
Optimization Of
Distributed Database
Queries
Hybrid AS (Dokeroglu & Cosar, 2012)
Electrical
Engineering
Control Of Ocean
Wave Energy
AS (See, Tai, & Molinas,
2012)
Bus Priority In Power
System
AS (Hamid, Musirin, Rahim,
& Kamari, 2012)
Power Signal Pattern
Classification
Hybrid AS (Biswal, Dash, & Mishra,
2011)
Data Mining Classification Rule
Discovery
AS (Hodnefjell & Junior,
2012)
Data Classification ACS (Michelakos, Mallios,
Papageorgiou, &
Vassilakopoulos, 2011)
Classification And Rule
Generation
AS (Tiwari & Verma, 2012)
Bioinformatics Epistasis Detection AS (Shang, Zhang, Lei, Zhang,
& Chen, 2012)
Finding Optimal
Spaced Seeds
AS Duc, Dinh, Dang, Laukens,
& Hoang, 2012)
Classifying Imbalanced
DNA
AS (Yu, Ni, & Zhao, 2013)
Robotics Robot Path Planning AS (Chen, Kong, Fang, & Wu,
2011)
Robot Path Planning AS (Bai, Chen, Jin, Chen, &
Mao, 2012)
Multi-Tasks
Distribution In
Heterogeneous Robot
AS (Lope, Maravall, &
Quinonez, 2012)
Networks Energy-Saving Routing
For Wireless Sensor
Networks
AS (Chen, Yu, Hong, & Dong,
2012)
Routing For
Hierarchical Wireless
Sensor Networks
AS (Wang, Jing, & Wang,
2012)
Routing And Spectrum
Allocation
AS (Wang, Zhang, Zhao,
Wang, & Gu, 2013)
Assignment Timetabling Problem AS (Nothegger, Mayer,
Chwatal, & Raidl, 2012)
Graph Colouring
Problem
AS (Douiri & Elbernoussi,
2012)
Aircraft Assigning ACS (Zhang, Lin, Qiu, & Fu,
52
Problem 2011)
Various AS algorithms have been derived and extended to exploit the search history
without losing the chance of exploring new areas of the search space. Among them,
ACS algorithm appears to be promising to extend the framework of ACO. It provides
a good opportunity to explore a wide area of the search space in reasonable time. All
variants of ACO algorithm have some similarity in their foundation, such as utilizing
the heuristic information, pheromone value, and the solution is based on constructing.
All ACO algorithms apply pheromone evaporation.
Cordon, Viana, and Herrer (2002) and Cordon, Viana, Herrera, and Moreno (2000)
proposed the Best-Worst Ant System (BWAS) as another extension of the basic idea
of AS by including some concepts from evolutionary computation algorithms. BWAS
uses the same transition rule as in AS algorithm to construct ants’ solutions. Then,
BWAS enhances the ants’ solution by using a local optimizer to bring each solution to
its local optimum. Like AS, AS
rank
and MMAS, all pheromone updates are performed
by daemon actions.
Lorpunmanee, Sap, Abdullah, and Chompoo-inwai (2007) presented a study called
“An Ant Colony Optimization for Dynamic Job Scheduling in grid Environment”. In
their study, they developed a general framework of grid scheduling using dynamic
information and an ant colony optimization algorithm to improve the decision of
scheduling. The experiment was conducted using GridSim simulator toolkit version
4.0 with an extension. They presented a comparison between ant colony optimization
and other various algorithms for job scheduling and dispatching rules for grid
environment, such as First Come First Served (FCFS), Minimum Time Earliest Due
53
Date (MTEDD), and Minimum Time Earliest Release Date (MTERD). Their
experiment results stated that ACO is able to perform the optimal scheduling job.
Besides that, the ACO accounts for less than 17% of the total tardiness time in the
average when it is compared with the other scheduler algorithms. However, in their
approach, ACO algorithm performs much slower than the other scheduler algorithms.
A research was conducted on Balanced job scheduling using Ant Colony
Optimization (BACO) for grid environment by Chang, Chang, and Lin (2007). The
main issue they tried to solve is how to schedule jobs efficiently in a grid
environment. In their approach, they used four main components: portal, information
server, jobs scheduler, and grid resource. BACO algorithm applied inside jobs
scheduler in order to select the most appropriate resource to execute the job. The
experiment was implemented using Taiwan UniGrid platform which consists of 26
campuses. BACO performance was compared with other two algorithms: Improved
Ant Algorithm (IACO) and Fastest Processor to Largest Task First Algorithm
(FPLTF). The results show that BACO has the ability to balance the job scheduling
load in the entire system. However, according to Bai et al. (2010) and Liu, Song, and
Dai (2010), using the single ant colony system leads to local optima because of the
stagnation that occurs due to the positive pheromone feedback mechanism.
A study proposed by Chang, Chang, and Lin (2009) implemented ant algorithm for
balanced job scheduling in computational grid. The study aimed to balance the entire
system load, at the same time tried to minimize the makespan of the set of jobs. In
addition, the study considered the bandwidth speed between the scheduler and
resources as well. The proposed algorithm is based on ant system algorithm. The
study was implemented in the Taiwan UniGrid platform which consists of more than
54
20 campuses in Taiwan. The experiments also simulated two problems, matrix
multiplication and linear programming. The evaluation was done by comparing it with
the improved ACO algorithm as proposed by Yan, Shen, Li, and Wu (2005), the
fastest processor to the largest task first algorithm (Menasce, Saha, Porto, Almeida, &
Tripathi, 1995), suffrage algorithm (Silva, Cirne, & Brasileiro, 2003), and random
method. In terms of makespan and balance, the experiment results show that the
proposed algorithm outperforms the other algorithms. However, the experiments are
very limited and unrepeatable due to the using of real grid computing environment.
Moreover, the comparison did not include other metaheuristics algorithms such as
genetic algorithm and artificial bee colony. Nevertheless, the study provides a
practical implementation guide which is useful to illustrate how job scheduling in
computational grid is working.
Kousalya and Balasubramanie (2009) presented a study on improving ant colony
optimization algorithm with local search for job scheduling in computational grid
systems. The proposed approach aims to minimize the makespan value as the main
objective. The study adopted local search technique from Ritchie and Levine (2003)
which is based on several neighbourhood searches such as Swap, Move, and Move
Top methods. The experiments were conducted using static benchmark problems
generated using Execution Time (ET) model presented by Fidanova (2006). The
proposed algorithm was evaluated versus Min-Min, ACO, Swap, Move, and Move
Top algorithms. The experiment results show that using ant colony optimization
algorithm with Move Top local search method outperforms other algorithms in terms
of makespan. However, the proposed algorithm was not compared with other
metaheuristics and hybrid algorithms in order to prove the algorithms performance. In
55
addition, more experiments using dynamic grid computing environment are required
which could provide more understanding to the behaviour of the proposed algorithm.
Liusuqin, Shuojun, Menglingfen, and Lixingsheng (2009) proposed a study to
improve ant colony optimization for Job Scheduling Problem (JSP). In their approach,
they address the problem of “misusing the great resources for minor purpose” which
makes some resources always idle and some resources are busy processing jobs. They
solved the problem by introducing improved ACO algorithm called “making
concessions in order to gain advantages” based on ACO. The experiment was
conducted using grid pheromone model simulation developed by the authors. The
proposed algorithm was compared with ant colony optimization using static
benchmark problems. The results show that the improved ACO algorithm could
perform better than the conventional ACO. The new algorithm could make better use
of the resources and solve the “misusing the great resources for minor purpose”
problem. However, the experiment size is very limited and did not contain
heterogeneous resources and tasks. In addition, the proposed algorithm was not
compared with other types of metaheuristics algorithms such as genetic algorithm and
tabu search algorithm.
A multiple ant colony model called “Cooperative multi-ant Colony Pseudo-parallel
Optimization Algorithm” was proposed by Liu et al. (2010). In their approach, three
subcolonies were used for optimization. Each subcolony respectively uses ant system
algorithm, ant colony system algorithm, and max-min ant system algorithm
independently. Each subcolony has its own pheromone matrix. By using three
different colonies, the pheromone matrix will have different distributions and
characteristics of change. After going through a certain number of iterations and
56
fulfilling the condition of pheromone interaction, the matrices of pheromone of the
three subcolonies will interact. The interaction will be according to their weight value
to gain new pheromone matrix. After the interaction process, the algorithm reinitializes the three pheromone matrices. They conducted experiments to solve the
travelling salesman problem. The experiment results show that their algorithm
performance is better than the classic algorithms (AS, ACS, and MMAS). The
researchers of the model claim the ability to prevent the system from stagnation
because of different distribution of different ant algorithm used. However, their
algorithm performs slightly better than classical MMAS, while MMAS has less
complexity in implementation and processing time.
Another study regarding ACO algorithm for job scheduling on computation grid has
been proposed by Kant, Sharma, Agarwal, and Chandra (2010). The authors proposed
two types of ant, namely red ants and black ants. The red ants’ responsibility is
system resource estimation, while the black ants’ responsibility is decision of resource
allocation. The study objective is to minimize the maximal total tardiness time of all
jobs within the machines in grid environment. The proposed approach was simulated
in real grid environment using 49 different resources. The comparative study was
done using Min-Min and FCFS algorithms (Fidanova & Durchova, 2006; K. Liu,
Chen, Jin, & Yang, 2009). The experiment results show that the proposed algorithm
outperforms other algorithms. However, the experiments are very limited and the
comparison did not include other metaheuristics algorithms. Therefore, more
investigations are required to test the algorithm robustness.
Load balancing is one of the important criteria in grid computing. A research on task
scheduling with load balancing using Multiple Ant Colonies Optimization (MACO) in
57
grid computing was conducted by Bai et al. (2010). In their work, they used multiple
ant colony optimizations to avoid the local optima from single colony behave. In their
framework, they considered both positive and negative feedbacks in searching for
solutions by sharing the search information and exploring a wider area of search space
with the cooperation between the ant colonies. They defined the degree of imbalance
by calculating the heuristic value using the load computing of each node. The
experiments were conducted to evaluate the proposed algorithm with benchmark
problems generated using GridSim simulator developed by Buyya and Murshed
(2002). They compared MACO with ant colony systems (Dorigo & Gambardella,
1997b) and first-come-first-served (Harchol-Balter, Crovella, & Murta, 1999). The
results showed that their algorithm outperforms other algorithms in terms of
makespan and load balancing. However, the solution for intractability between
performance and load balancing is not illustrated.
Enhanced ant colony algorithm for job scheduling in computational grid has been
proposed by Maruthanayagam and UmaRani (2010). The proposed approach is based
on Fast Ant System (FANT) algorithm which is a version of ACO algorithm. The
study focused on makespan optimization using the independent task model as defined
in Kousalya ad Balasubramanie (2009). The authors compared between two formulas
proposed to calculate the probability of selecting a resource for processing a task. The
experiment results show that using local search algorithm will improve the algorithm
performance significantly. However, the study lacks a proper experiment design such
as using known benchmark problems and comparing the proposed algorithm with
other metaheuristics algorithms. Nonetheless, using fast ant system could be suitable
for job scheduling in grid computing due to the time restriction imposed by
computational grid systems.
58
Mou (2011) proposed a new approach using double Pheromones techniques for ant
colony system. The study model was designed to solve Generalized Travelling
Salesman Problem (GTSP) which is an extension of the classical traveling salesman
problem. In GTSP, the nodes were partitioned into groups called clusters. The
solution to GTSP is to find the shortest closed tour visiting exactly one node from
each cluster. For such a problem, there are two pheromones, namely pheromone
between the groups and pheromone on the edges. The researcher tried to differentiate
between those pheromones by applying the double pheromones concept. In addition, a
mutation idea inspired from genetic algorithm was introduced in this study. According
to the experiment results conducted by the author, applying double pheromones
produced better performance. However, the instances used in the experiment were
small. According to Li, Liao, and Cai (2011), they stated about ant colony system that
“it is difficult to realize the overall optimum and it takes a long time when being
applied to large-scale TSP”. In addition, the implementation and influence of the
mutation idea was not illustrated in the study.
An improved ACO algorithm for job scheduling in computational grid systems has
been proposed by MadadyarAdeh and Bagherzadeh (2011). The main objective of the
study is to minimize the makespan value as an optimization objective in a batch
mode. The authors improved the ACO algorithm for job scheduling provided in
Kousalya and Balasubramanie (2008). The improvement is based on giving higher
probability to tasks that have higher standard deviation. For evaluation purpose, the
study adopted static benchmark problems based on expected time to compute model
using Range-Based method as proposed by Ali et al. (2000b). The proposed algorithm
was compared with ACO and P.ACO presented in Bagherzadeh and MadadyarAdeh
(2009). The experiment results show that the proposed algorithm achieved the best
59
makespan values among other algorithms. However, the experiments were conducted
using a very small scenario (32 tasks and 4 machines) which is not sufficient to test a
metaheuristics algorithm. In addition, only static environment were considered in the
experiment without any dynamic features to reflect the real job scheduling problem in
grid computing. Moreover, the proposed algorithm was only compared with ACO
approaches. Therefore, more comprehensive experiments and comparisons are
required in order to discover the efficiency of the proposed algorithm.
Kokilavani and Amalarethinam (2013) published a study on implementing ant colony
optimization based load sharing for job scheduling in computational grid systems. The
study aims to enhance the quality of service and share the load among the resources in
order to optimize the resource usage in the computational grid environment. The
proposed algorithm was implemented in MATLAB application simulating grid
computing with 2 resources and 5 tasks. The proposed ant colony optimization based
load sharing was compared with Min-Min and Max-Min algorithms. The experiment
results show that the proposed approach outperforms other algorithms in terms of total
wait time criterion. However, the experiments are very limited in terms of the
benchmark problem size. In addition, the study is based on static environment and
used unknown scenario. Moreover, the proposed algorithm was only compared with
simple heuristic approaches. Therefore, the proposed algorithm should be evaluated
on dynamic environment and compared with other metaheuristics algorithms.
II. Artificial Bee Colony
Bee algorithms are inspired by biological honeybee behavioural specifically the
foraging and exploration (Yang, 2014). There are several types of bee algorithms,
such as honeybee algorithm, virtual bee algorithm, artificial bee colony, and
60
honeybee-mating algorithm. Artificial Bee Colony (ABC) is an optimization
algorithm developed by Karaboga and Basturk (2008) and Karaboga (2005). The bees
in ABC algorithm are divided into three groups, namely: employed bees, onlooker
bees, and scout bees. The idea behind ABC is that for each food source, there is only
one employed bee. In other words, the total number of employed bees is equal to the
total number of food sources. When the food source is discarded, the employed bee of
that food source is forced to be a scout bee. The scout bee will search for new food
source randomly. The onlooker bees wait in the hive to obtain the information from
the employed bees. Based on that information, the onlooker bees will choose the best
food source probabilistically and start foraging (Yang, 2014). ABC algorithm is
applied to solve job scheduling problem in computational grid.
A recent study published by Kim, Byeon, Liu, Abraham, and McLoone (2013)
applied Artificial Bee Colony (ABC) for job scheduling in computational grid. The
authors proposed Binary ABC (BABC), Efficient Binary Artificial Bee Colony
(EBABC1), and flexible ranking strategy (EBABC2) algorithms. The study aimed to
minimize the makespan criterion for job scheduling in grid computing. The
experiments were conducted using a series of benchmark problems defined by Liu et
al. (2010). The proposed algorithms were compared with genetic algorithm, simulated
algorithm, and particle swarm optimization algorithm. In terms of makespan criterion,
EBABC1 and EBABC2 algorithms achieved the best results among all other
algorithms with superior performance for EBABC2. However, the experiments were
conducted using static environment which is not enough to conclude the algorithm
robustness in dynamic environment. Therefore, conducting more experiments is
required. In addition, hybridizing artificial bee colony with local search seems a
promising research area as well.
61
III. Bacterial Foraging Optimization Algorithm
Bacterial algorithms mimic the behaviour of bacteria in the nature such as foraging,
reproduction, and movement (Xing & Gao, 2014). There are several types of bacterial
algorithms, such as bacterial foraging algorithm, bacterial colony chemotaxis,
superbug algorithm, bacterial colony optimization, and viral system (Xing & Gao,
2014). Bacterial algorithms have been implemented successfully in various
scheduling problems such as job shop scheduling problems (Ge & Tan, 2012; Wu,
Zhang, Jiang, Yang, & Liang, 2007), flow shop scheduling problems (Botzheim,
Toda, & Kubota, 2012), and assembly line balancing (Atasagun & Kara, 2014).
Bacterial algorithms have been utilized to solve job scheduling problems in grid
computing systems.
Nayak, Padhy, and Panigrahi (2012) proposed an algorithm which combined the
merits of genetic algorithm and bacterial foraging optimization algorithm called
Genetic Bacterial Foraging (GBF). The proposed algorithm implemented a dynamic
mutation as presented in Michalewicz (1999) and crossover operator developed by
Michalewicz (1992). The aim of the study is to reduce the execution time as a cost
function. The experiment was conducted using dynamic environment generated with a
simulator developed by the authors. The proposed algorithm was compared with
Bacterial Foraging Optimization (BFO) algorithm. The experiment results show that
the proposed GBF algorithm outperforms BFO algorithm. However, the experiment
scenario was very small, using only four resources and five tasks. Therefore, more
studies are required to understand the behaviour of bacterial foraging optimization
algorithm.
62
A study has been conducted by Rajni and Chana (2013) on Bacterial Foraging
optimization (BFO) algorithm for resource scheduling on computational grid systems.
The study aimed to optimize makespan and cost values by considering Resource
Provisioning (RP) unit adopted from Aron and Chana (2012). The proposed approach
was implemented using GridSim simulator developed by Buyya and Murshed (2002).
The experiments were conducted by generating a workload using a model defined in
Lublin and Feitelson (2003) and expected time to compute model presented in Ali et
al. (2000a). The authors compared the proposed algorithm with genetic algorithm,
simulated annealing, and GA-TS algorithms. The experiment results show that the
proposed BFO algorithm outperforms other algorithms in terms of makespan and cost
values for both low and high machine heterogeneity benchmark problems. In addition,
the results show that the Coefficient of Variation (CV) of the proposed algorithm is in
the range 0%-2% which confirms the stability of the proposed algorithm. However,
the experiments are very limited and did not include some dynamic grid attributes
such as resource failure which is considered very important in dynamic grid
computing system (Feitelson, 2013).
IV. Particle Swarm Optimization Algorithm
Particle Swarm Optimization (PSO) algorithm was initially developed by Eberhart
and Kennedy (1995) and Kennedy and Eberhart (1995). PSO is considered as a
population-based optimization algorithm based on biological swarm intelligence
(Noghanian, Sabouni, Desell, & Ashtari, 2014). PSO has been implemented to solve
many real time problems such as face recognition (Kothari, Anuradha, Shah, &
Mittal, 2012), assembly scheduling problem (Tian, Liu, Yuan, & Wang, 2012),
Resource-Constrained Project Scheduling Problem (Jia & Seo, 2013), and job shop
63
scheduling problem (Li & Pan, 2012). PSO is applied successfully to solve job
scheduling problems in computational grid system.
In job scheduling problems for grid computing environment, a fuzzy PSO approach
was published by Abraham, Liu, Zhang, and Chang (2006). The proposed algorithm
extends the position and velocity of the particles from real vectors to fuzzy matrices.
The study aimed to optimize the scheduler performance in terms of makespan and
flowtime as a bi-objective. The performance of the proposed algorithm was evaluated
with genetic algorithm and simulated annealing approaches. The experiments were
conducted using three static instances generated by the authors. The evaluation results
show that the proposed fuzzy PSO algorithm was able to achieve better makespan
values than other algorithms. However, the results did not include flowtime values
which are supposed to be the second objective of the study. Thus, no conclusion could
be provided regarding the performance of the proposed algorithm. Moreover, the
experiments were conducted using only static scenario which is not enough to explore
the proposed algorithm behaviour.
Izakian, Abraham, and Snasel (2009a) proposed a particle swarm optimization
algorithm for meta-tasks scheduling in distributed heterogeneous computing systems.
The proposed approach aims to minimize makespan as an objective function. The
implemented PSO algorithm was compared with genetic algorithm as presented in
Braun et al. (2001), and continuous PSO developed by Salman, Ahmad, and Al-
Madani (2002). For the evaluation purpose, the authors generated benchmark
problems using expected time to compute model proposed in Braun et al. (2001). The
experiment results show that the proposed algorithm achieved the best makespan
values in all instances. Moreover, the proposed algorithm has the lowest standard
64
deviation. It is clear from the results that the proposed PSO algorithm was able to
achieve good results. However, the experiments were conducted on a static job
scheduling scenario which is not enough to make a conclusion about the algorithm
efficiency. Hence, more studies and experiments required using dynamic job
scheduling scenario in order to understand the algorithm behaviour.
Another study which implemented PSO approach has been proposed by Izakian,
Ladani, Zamanifar, and Abraham (2009). The study objectives are to minimize the
makespan as well as flowtime simultaneously. The approach’s implementation was
based on static environment using expected time to compute model to estimate the
required time for processing task in a machine. The proposed algorithm was
compared with Fuzzy PSO presented in Abraham et al. (2006). The experiment results
show that the proposed PSO approach performs better than Fuzzy PSO. However, no
details were provided regarding the benchmark problem and the flowtime results were
not reported in the study. In addition, the proposed PSO algorithm has not been
implemented with dynamic environment. Therefore, the proposed approach requires
more experiments with dynamic environment and compared with other metaheuristics
algorithms as well.
Another study using PSO to schedule jobs in heterogeneous computing systems has
been proposed by Izakian, Abraham, and Snasel (2009b). The proposed algorithm
aims to minimize the makespan value as a performance criterion. The study compared
the proposed algorithm with GA presented in Braun et al. (2001) and PSO presented
in Salman et al. (2002). The conducted experiment is based on static environment
using expected time to compute model proposed in Braun et al. (2001). The empirical
results show that the proposed PSO algorithm achieved the best makespan in all
65
instances. In addition, the algorithm convergence time was the lowest in most
instances. In spite of these good results achieved using the proposed PSO algorithm,
more experiments are required using dynamic environment in order to evaluate the
algorithm robustness.
A comparison of four metaheuristics algorithms for task scheduling in computational
grid system was presented by Meihong and Wenhua (2010). The algorithms used in
their study for comparison are genetic algorithm, ant colony optimization algorithm,
particle swarm optimization algorithm, and simulated annealing algorithms. The
evaluation criteria are makespan and the mean response time. The authors conducted
experiments using static environment. The results show that PSO algorithm has the
best performance among the other algorithms. However, the experiments were
conducted in static environment and very small scenario (5 users and 3 resources).
Therefore, the robustness of the compared algorithms is not proven. In addition, only
classical versions of the algorithms are used while enhanced versions are better in
terms of performance. In order to obtain a clear picture about which metaheuristics is
better, more investigations and experiments are required using a known benchmark
such as the one presented in Braun et al. (2001).
Izakian, Ladani, Abraham, and Snasel (2010) proposed a discrete particle swarm
optimization for job scheduling in grid computing. Their approach aims to minimize
the makespan and flowtime simultaneously in grid computing. In their study, they
provide two representations for mapping between problem solution and PSO particle.
The first representation used a direct encoding that is a vector with size equal to the
number of tasks. Each element in the vector represents the machine number. The
second representation used a binary matrix size of (jobs number * machines number).
66
The matrix was represented with values either 0 or 1. The benchmark problem used to
evaluate the proposed algorithm is based on expected time to compute model
presented by Braun et al. (2001). The proposed algorithm was compared with GA,
ACO, PSO, and Fuzzy PSO algorithms. The experiment results show that the
proposed algorithm achieved good results in makespan reduction, while for flowtime,
the algorithm performed the worst. Although the study aims to minimize makespan
and flowtime, the contradiction is clear between them such that the algorithm could
not reduce both of them simultaneously. This contradiction is mentioned by Xhafa
and Abraham (2010) in grid computing as well. In general, the proposed algorithm
performs better than other algorithms. However, the experiments were conducted
using only static environment. Therefore, more experiments on dynamic environment
are required to conclude the performance of the proposed algorithm.
Another study using fuzzy particle swarm optimization for job scheduling in grid
computing has been proposed in H. Liu et al. (2010). In their algorithm, they extended
the velocity and position of particles from the real vectors to fuzzy matrices. The
advantages of using fuzzy matrices in PSO are the speed of convergence and the
increase of the ability to find a faster and feasible solution. The study used the
makespan criterion to measure the algorithm performance. The performance of the
proposed algorithm was compared with genetic algorithm and simulated annealing
algorithm. The experiment results show that the proposed algorithm outperforms the
other algorithm especially in terms of execution time. However, the study did not use
a common benchmark in order to evaluate the proposed algorithm with other
approaches. In addition, only genetic algorithm and simulated annealing algorithms
were used for comparison which is also not enough to give a complete picture.
67
Moreover, the experiments were conducted with static environment only. Therefore, it
is not clear how the proposed algorithm will behave in dynamic environment.
2.2 Hybrid Approaches in Job Scheduling
The term hybrid refers to the concept of combining two or more algorithms in order to
complement each other hoping to achieve a better performance. Hybridization could
be between any types of algorithms such as heuristic and metaheuristics algorithms
(Talbi, 2013a). There are two levels of hybridization, namely low and high levels
(Xhafa, Kolodziej, Barolli, & Fundo, 2011). In low level (also called strong coupled)
hybridization, the algorithms interchange their inner procedures. One of the hybrid
algorithms is considered as the main while the others are subordinate algorithms. The
low level hybridization could be presented as 퐴푙푔표푟푖푡ℎ푚
1
(퐴푙푔표푟푖푡ℎ푚
2
), where
퐴푙푔표푟푖푡ℎ푚
1
is the main algorithm and 퐴푙푔표푟푖푡ℎ푚
2
is the subordinate algorithm. On
the other hand, high level hybridization could be represented as 퐴푙푔표푟푖푡ℎ푚
1
+
퐴푙푔표푟푖푡ℎ푚
2
+⋯+ 퐴푙푔표푟푖푡ℎ푚
푛
where 퐴푙푔표푟푖푡ℎ푚
1
will start first, then it will call
퐴푙푔표푟푖푡ℎ푚
2
after it finishes its process and so on. The sequences could be repeated
any number of times (depending on the algorithm design). Hybrid approach achieved
very good performance in many fields compared with stand-alone approaches
(Kolodziej, 2012). Hybridizing algorithms show promising results in job scheduling
for grid computing.
Ritchie and Levine (2004) published a hybrid approach between ant colony
optimization and tabu search algorithm for job scheduling in heterogeneous
computing environment. The idea behind this hybridization is that the tabu search
algorithm executed to the best-so-far solution found by the ants after some iteration
which is controlled by a parameter. In other words, tabu search algorithm is not
68
applied to every ant solution due to the longer processing time of tabu search
algorithm. The proposed algorithm adopted ACO implementation from Dorigo and
Stutzle (2003) and tabu search implementation from Ritchie and Levine (2003) which
is based on the approach described in Thesen (1998). The experiments were
conducted using static benchmark problems based on expected time to compute
model presented by Braun et al. (2001). The implemented algorithm was compared
with Min-Min, GA, Min-Min+LS, Min-Min+Tabu developed by Braun et al. (2001).
The experiment results show that the proposed approach outperforms other algorithms
for all instances in terms of makespan. However, the study reported that the proposed
algorithm took 3.5 hours to finish the execution of 1000 iterations of ACO which is
considered a very long time compared to 90 seconds according to Xhafa, Duran, et al.
(2008). In addition, the experiments were conducted using only static environment
and makespan criterion which is not enough to conclude the algorithm performance.
Nevertheless, the study provides practical hybridization details which could be reimplemented with enhancement.
A hybrid approach for job scheduling in computational grid systems proposed by
Xhafa (2007). The proposed algorithm is based on memetic algorithms and several
local search algorithms. The idea of the hybridization is that MA can use any of the
16 local search algorithms during the search process. In addition, MA algorithm
hybridized with TS algorithm as a high level hybridization (MA+TS). The proposed
algorithm aims to minimize the makespan and flowtime as a multi-objective
optimization. The study investigated the proposed algorithms on static and dynamic
grid computing environment. For static environment, the evaluation is based on ETC
model proposed in Braun et al. (2001) and the proposed algorithms MA and MA+TS
compared with two versions of GA implemented in Braun et al. (2001) and Carretero
69
and Xhafa (2006). Furthermore, the study conducted experiments on dynamic grid
computing environment using HyperSim simulator developed by Phatanapherom et
al. (2003). The experiment results show that for static environment, MA+TS
outperforms all other algorithms in terms of makespan, while MA outperforms all
other algorithms in terms of flowtime. Similar results were also obtained for dynamic
environment experiments. However, the experiments on dynamic environment did not
evaluate the proposed algorithms with other algorithms (only comparing MA with
MA+TS). Therefore, more evaluation is needed with other algorithms especially in
dynamic environment. Nevertheless, the study provides a good foundation regarding
the performance of local search with metaheuristics.
Another study called “An Improved Ant Colony Algorithm for Grid Scheduling
Problem” was conducted by Bagherzadeh and MadadyarAdeh (2009) to improve AS
algorithm. They argued that using the traditional AS (task, machine) will not achieve
the optimal solution. In their approach, they hybridized between MaxStd and AS
methods. The idea behind that is to give a higher probability to tasks that have higher
standard deviation. They conducted experiments with twelve different types of
problems. The results show an improvement in makespan and utilization vary from
3% to 29% depending on the problem type. However, the proposed algorithm was
compared with ant system algorithm, and not ant colony system, which is considered
the newest version in ant colony optimization algorithms.
A study on low level hybridization between Genetic Algorithm and Tabu Search
GA(TS) for job scheduling in grid computing has been published by Xhafa, Gonzalez,
Dahal, and Abraham (2009). The idea behind this hybridization is to enforce more
exploitation of solution space using the smart process of tabu search algorithm. The
70
proposed approach starts with genetic algorithm as the main algorithm and calls tabu
search algorithm to improve the population individuals. The hybrid algorithm
considers the scheduling problem as a bi-objective optimization problem. The study
aims to minimize makespan as a primary objective and then minimize flowtime as a
secondary objective. The implementation and comparison of genetic algorithm and
tabu search is adopted from Carretero et al. (2007) and Xhafa, Carretero, et al. (2009)
respectively. For evaluation purpose, the experiments were conducted using a grid
computing simulator developed by Xhafa et al. (2007). A three static scenario was
generated using expected time to compute model as a benchmark problem for
experiments. In terms of makespan, the experiment results show that the proposed
hybrid algorithm performs better than GA and TS for small and medium size
instances. However, GA(TS) achieved the worst value for large size problems. In
term of flowtime, GA(TS) achieved the best result for large size instances and the
worse values for other instances. The authors concluded that the proposed hybrid
algorithm outperforms other algorithms in terms of makespan for small and medium
size grid scenarios which is the prime objective of the study. In addition, the
experiment results show that the bi-objective optimization problem in grid computing
is a contradictive problem. However, the experiments are very limited in terms of
static instances and the algorithms compared with. In addition, it is very important to
test any hybrid approach on dynamic grid computing simulator in order to observe
how the algorithm can cope with dynamic change.
A hybrid genetic algorithm based scheduler with secure and task abortion features has
been proposed by Kolodziej, Xhafa, and Bogdanski (2010). The study proposed four
hybrid genetic algorithms, namely Secure Genetic Algorithm (SGA), Risky Genetic
Algorithm (RGA), Player’s Genetic Algorithm (PGA1), and Player’s Minimum
71
Completion Time (PMCT). Besides the proposed algorithm for scheduling jobs in
grid computing, the authors added security and task abortion mechanisms which are
considered as crucial issues in grid systems. The performance of the proposed
algorithm was measured through the makespan and flowtime metrics as bi-level
optimization problem. The study focused on independent job scheduling problem in
batch mode as described in Xhafa and Abraham (2010). In addition, the problem was
formulated using expected time to compute matrix model presented by Ali et al.
(2000b). The implementation of the proposed genetic algorithm for independent batch
scheduling was adopted from Carretero et al. (2007). The experiments were
conducted using discrete event-based grid simulator called HyperSim-G developed by
Xhafa and Carretero (2009). Using HyperSim-G, static and dynamic sets of instances
were generated as benchmark problems in order to evaluate the proposed algorithm.
The experiment results show that PMCT algorithm outperforms other algorithms for
static and dynamic instances in terms of makespan. For the case of flowtime, all
results are similar except for very large grid size where PMCT did not perform well as
other algorithms. Despite the fact that security and task abortion are very important
features in grid computing, security could be separated from the scheduler layer to
improve the algorithm performance. Regarding task abortion, more studies are
required to define a sufficient mechanism for such a complex process.
Song, Sun, and Cao (2010) presented a study on the convergence of converse ant
colony algorithm for job shop scheduling problem. In their study, they addressed two
problems with traditional ACO algorithm, slow and easy to fall into the local optimal
solution. To solve these problems, they proposed an algorithm called “Hybrid
Converse Ant Colony Optimization (HCACO)” with global convergence. HCACO
algorithm uses ACO algorithm with simulated annealing algorithm. The hybrid
72
algorithm can quickly rule out the poor solutions, so that the pheromone of optimal
path will be updated immediately, and the search time will be reduced. Because of
using SA algorithm which has the probability of escaping from the local optimization,
it is sure that ants will not fall into local optimal solution. The experiment was
conducted using a simulator with 13 hard benchmark problems. The results presented
a comparison between HCACO, Parallel Genetic Algorithm (PGA2), and ACO. From
the results, HCACO shows the best result in terms of average relative error percentage
which is smaller than PGA and ACO. The calculation time of HCACO and PGA was
equal.
A paper written by Wang, Duan, Jiang, and Zhu (2010) presented a new algorithm for
grid task scheduling using Genetic and Simulated Annealing algorithms (GSA). The
algorithm combined genetic algorithm with simulated annealing algorithm. They
pointed out that in spite of GA being fast in searching rate at the beginning; it suffers
from trapping in local minimum, while SA takes a long time to get the global
minimum. Based on those reasons, the authors combined GA with SA to inherent the
convergence property of simulated annealing and parallelism capability of genetic
algorithm. The hybrid algorithm GSA will start with GA which is stopped
prematurely after satisfying the termination condition. After that, each node visited at
the last generation with the best node found overall are taken as starting input for the
simulated annealing algorithm. The experiment results show that GSA performs better
than GA and SA. GSA has the ability to converge to a global minimum because of SA
property. However, the experiment was conducted using only 15 tasks and 3 resources
which are not enough to prove the robustness of the algorithm and the feasibility of
calculation time for big problem space such as 100 tasks and more than 10 resources.
Therefore, more investigation is needed to evaluate the hybrid algorithm GSA.
73
A hybrid approach between population and local search algorithms has been
developed by Xhafa, Kolodziej, Barolli, and Fundo (2011). The study represents a
high level hybridization between GA and TS, named GA+TS algorithm for
scheduling in grid computing. The algorithm starts with GA for a specific time, and
then passes the results to TS algorithm as an initial solution. TS algorithm will search
the neighbourhood of the initial solution until the termination condition is met. The
authors expected that GA will explore the solution space widely, while TS will make
in-depth exploration of the best solution found by GA. The objective of the proposed
algorithm is to enhance the makespan as a primary objective and flowtime as a
complement objective. This type of optimization scheme is referred as a hierarchic
optimization with priority to makespan criterion. Both algorithms, GA and TS, have
been implemented and evaluated for job scheduling in computational grid in Carretero
et al. (2007) and Xhafa, Carretero, et al. (2009). The experiments were conducted
using a HyperSim-G simulator developed by Xhafa et al. (2007). The study
considered static and dynamic grid computing environment. The proposed algorithm
evaluated with GA, TS and GA(TS) algorithms proposed in Carretero et al. (2007),
Xhafa, Carretero, et al. (2009), and Xhafa, Gonzalez, et al. (2009). The experiment
results show that the proposed algorithm outperforms other algorithms only in one
instance for static environment in terms of makespan and flowtime. While for
dynamic environment, the proposed algorithm achieved good results only in one small
size grid instance in terms of makespan. However, the GA+TS did not perform good
compared to other algorithms in terms of flowtime. In spite of these results, the
concept of using local search with population algorithm needs more investigation
specifically, the hybridization of different level and different population algorithms.
74
A study conducted by Xhafa, Kolodziej, Barolli, Kolici, et al. (2011) proposed a
hybrid approach between GA and TS for independent batch job scheduling in grid
computing. The hybrid algorithm aims to optimize the makespan and flowtime as a
bio-objectives scheduling problem. In addition, the authors proposed hierarchic and
simultaneous approaches for optimizing makespan and flowtime as well. Two types
of hybridization provided are low and high level hybridizations, which in turn result
in GA(TS) and GA+TS algorithms. The experiments were conducted considering
static and dynamic grid computing environment using HyperSim-G simulator
developed by Xhafa et al. (2007). The proposed algorithms were compared with GA
proposed in Carretero et al. (2007) and TS represented in Xhafa, Carretero, et al.
(2009). The experiment results show that the proposed hybrid algorithms outperform
the other stand-alone algorithms in makespan criterion. However, in terms of
flowtime criterion, GA and TS stand-alone algorithms outperform the proposed
hybrid algorithm; such a contradiction is normal for job scheduling in grid computing.
In spite of the limitation on the experiments and benchmark problem, the study
illustrated the implementation of the hybrid algorithms clearly.
A study has been proposed by Xhafa, Duran, and Kolodziej (2011) on exploitation
and exploration of solution space for job scheduling in computational grid systems.
The study aimed to utilize the population-based algorithm as an exploration
mechanism for search space and hybridize it with local search as an exploitation
mechanism. The authors proposed Memetic Algorithms (MAs) as a population-based
exploration method and Hill Climbing (HC) and Tabu Search (TS) algorithms as
exploitations methods. The proposed algorithm was evaluated using static benchmark
problem adopted from Braun et al. (2001) and dynamic benchmark problem generated
using HyperSim simulator developed by Phatanapherom et al. (2003). The proposed
75
hybrid MA+TS was compared with GA results implemented in Braun et al. (2001).
The experiment results show that the proposed algorithm achieved the best makespan
values for all instances. However, for flowtime results, the non-hybrid approach, that
is MA approach, achieved the best flowtime values. Similar results were obtained for
dynamic experiments where MA+TS achieved the best results for makespan, but
worst for flowtime. It is clear from the results that the proposed algorithm faced
contradictory optimization problem which is not possible to improve both objectives
of makespan and flowtime.
Nithya and Shanmugam (2011) proposed new Hybrid Ant Colony Optimization
(HACO) algorithm for job scheduling in grid computing. In their research, they
focused on high performance computing criteria to decrease the execution time in grid
computing. The proposed algorithm is based on ant colony optimization for dynamic
batch mode heuristic mapping. The new approach considers each job as an ant in the
colony and the pheromone details are provided to help in finding the optimal solution.
The proposed algorithm uses a new rule for updating the pheromone, and probability
matrix calculation formula in order to increase the efficiency of the existing ant
colony algorithm. Different types of experiments were conducted. The results show
that the proposed algorithm reduces the makespan in reasonable time. However, the
load balancing criterion is neglected in their research which is a very important factor
in grid computing performance and throughput.
Another study presented by Wei, Zhang, Li, and Li (2012) aimed to improve the ant
colony algorithm for grid task scheduling. They introduced a new type of pheromone
and a new node redistribution rule, at the same time, the algorithm can track the
performance of resources and tag it. The proposed algorithm considers the load
76
balance, task execution time, and resource fault. The approach replaces the path
pheromone into node pheromone to describe the handling capacity of current
resource. The meaning of pheromone in this algorithm is resource processing
capacity. The proposed system model consists of task receiver, task scheduler, and
resource information service collector. The new pheromone is called resource
suitability. By using this formula, the algorithm can evaluate the resource stability and
increase its pheromone. Another important improvement in this algorithm is the
resource redistribution rule which handles the unsuccessful task processing. The
results are compared with traditional ant algorithm. They claimed that their algorithm
performs better than the basic ant algorithm. The idea of a second type of pheromone
is interesting, but the experiment is very limited. In order to prove this concept, more
experiments are needed and more comparison with other algorithms such as ant
colony systems, and ant systems are required.
A recent study on hybrid approach between ant colony optimization and cuckoo
search algorithm for job scheduling in grid computing has been presented by Raju,
Babukarthik, and Dhavachelvan (2013). The authors tried to combine the advantages
of pheromone in ant colony optimization with local search feature of cuckoo search
algorithm. The study aimed to minimize the makespan value for job scheduling in
computational grid systems. The experiments were simulated using parallel
computing toolbox in MATLAB. The proposed algorithm performance was compared
with ant colony optimization algorithm using static scenario developed by the authors.
The experiment results show that the proposed algorithm executes faster than ACO
algorithm. However, the makespan values which is supposed to be the study objective
is not reported. In addition, the paper did not specify which ant colony optimization
member was used in the hybridization. Moreover, the experiments were very limited
77
in terms of tasks and resources used and the benchmarks were unknown. Therefore,
more studies are required to investigate the algorithm performance, especially in
dynamic environment. Nevertheless, the idea of hybridizing new metaheuristics
algorithms such as cuckoo search algorithm with ant colony optimization is a useful
idea.
2.3 Grid Simulator
One of the main factors that affect the performance of a grid computing is the
workload to which the system is subjected (Feitelson, 2013). Evaluating the job
scheduling algorithm with wrong workloads will lead to erroneous results which
cannot be relied upon (Smith, 2007). The workload could be classified into static and
dynamic workload. The author also stated that the differences between static and
dynamic workloads may have subtle implications for performance evaluation.
Therefore, an experiment that utilizes static workloads is incapable to evaluate the
performance of job scheduling algorithm. Feitelson (2013) has also stated that static
workload cannot be considered as valid samples of real dynamic workloads.
Organizations such as System Performance Evaluation Consortium, Grid Workloads
Archive and Transaction Processing Performance Council provide several
benchmarks on CPU, network file system, web servers, cluster, grid, database and
parallel distributed systems for evaluation of computer systems (Feitelson, 2013).
These benchmarks are useful to be analyzed and modelled.
One of the successful models for heterogeneous static computing system is Expected
Time to Compute (ETC) proposed by Ali et al. (2000). The model arranges the
information in a two dimension matrix called ETC matrix. Each entry in the matrix,
ETC [
i, j
], represents the expected execution time of task i on machine j. In ETC
78
matrix, the elements along a row represent the estimates of the expected execution
times of a given task on different machines, while the elements along a column give
the estimates of the expected times of different tasks on a given machine. Two
methods are used in ETC model, namely Range Based ETC Matrix Generation and
Coefficient-of-Variation Based ETC Matrix Generation. The first method used normal
distribution while the second method used gamma distribution. However, there is a
limitation with the ETC model. Its computing capacity of resources remains
unchanged (static) during tasks execution. Thus, ETC model does not reflect the real
dynamic environment in grid computing (Xhafa & Abraham, 2008a).
GridSim is one of the popular simulators for static job scheduling in grid computing
(Hao, Liu, & Wen, 2012). GridSim is a java-based discrete-event simulation toolkit
which can simulate heterogeneous resources, users, applications, brokers and
schedulers in grid computing. However, GridSim suffers when simulating more than
2,000 grid sites concurrently due to the memory consumption. In addition, GridSim
does not simulate the failure of resources which is one of the dynamic natures in real
grid computing environment. A simulator for mapping jobs to resources in grid
environment was also proposed by Chaturvedi and Sahu (2011). They developed their
simulator using C++ language for ten metaheuristics algorithms. However, their
simulator is based on ETC model which can simulate only static environment.
Caron, Garonne, and Tsaregorodtsev (2007) developed a simulator for many clusters
of heterogeneous nodes belonging to a local network. Their simulator was developed
based on Simgrid toolkit proposed by Grosan and Abraham (2007). They used the
improved Simgrid simulator in their experiments for batch system. Probability
distributions such as Gamma, Gaussian and Poisson were available to simulate the
79
pattern of arrival of jobs. However, the simulator lacks the attribute of resource
failure.
For large distributed grid systems and complex job scheduling, a simulator called
GangSim was developed by Dumitrescu and Foster (2005). The simulator focuses on
the interactions between local and community reservation allocation policies.
GangSim allows parallel execution running on real resources which make scheduling
process faster. GangSim includes components such as external scheduler, local
scheduler, data scheduler, monitoring distribution points, and virtual organization. In
spite of all these components in GangSim, the authors stated that GangSim is still far
from an accurate simulation of the grid environment, primarily due to various
idiosyncratic features.
Grid World Archive (GWA) provides archives of operational data that can be used in
evaluating job scheduling algorithms. However, GWA archives lack details and
systematic description of the grid or cluster resources and from where the data were
collected (Klusacek & Rudova, 2010). In addition, information on the background of
the load, resource failures or specific user’s requests were not provided. It can be seen
that present models cannot fully simulate the dynamic nature of jobs and resources in
the grid environment.
2.4 Conceptual Framework
This study focused on metaheuristics algorithms domain specifically on the
hybridization between ant colony system, genetic algorithm, and tabu search
algorithms. Figure 2.7 illustrates the research conceptual framework.
80
Figure 2.7. Research conceptual framework
In Figure 2.7, the framework starts with metaheuristics algorithms components. There
are several categorise of metaheuristics which could be classified as population or
individual based, swarm intelligence, local search, nature inspired algorithm.
However, there is no standard classification for metaheuristics algorithms (Blum &
Roli, 2003). This study selects ant colony system algorithm from swarm intelligence
branch, genetic algorithm from evolutionary branch, and tabu search from local
search branch. From these three algorithms, four hybridized algorithms are proposed
in this study, namely ACS(GA), ACS+GA, ACS(TS), and ACS+TS. These hybrid
algorithms are based on low and high level of hybridization between ACS, GA, and
TS algorithms. The low level hybridization will enhance the exploration mechanism
of ACS algorithm, while the high level hybridization will refine the final solution
found by ACS algorithm. For evaluation purpose, the proposed hybrid algorithms are
implemented and evaluated on job scheduling problem in static and dynamic grid
computing system.
Metaheuristics Algorithm
Local Search Algorithm
Evolutionary Algorithm Swarm Algorithm
Tabu Search Algorithm
Genetic Algorithm
Ant Colony System
ACS(GA)
ACS+GA
ACS(TS)
ACS+TS
81
2.5 Summary
It is clear from the previous studies that scheduling problem is NP-complete problem.
So far, there is no exact algorithm for solving NP-complete problems. Therefore,
approximate algorithms were used for solving such a problem. All the approximate
algorithms do not guarantee to find the optimum solution but they try to reach a near
optimum solution within reasonable time and resources. ACS is one of these
algorithms which have shown a good performance in solving different types of
optimization problems. However, in huge instances problem, ACS suffers from
stagnation problem which makes it insufficient in terms of computation time and
solution quality. The huge instance means the search space is very big. In order to
search a wide area of that search space, ants in ACS algorithm need to explore more
nodes and arcs. In addition, the number of ants needs to be increased. As the
mechanism of exploration in ACS influenced by random selection, any wrong path
selection in terms of cost will affect the whole solution quality. Therefore, by
increasing the exploration rate, the rate of wrong selection will also be increased and
the final solution definitely will be out of quality (it was clear in most of the literature
review that the rate of exploration is 0.1). On the other hand, increasing the number of
ants will make the search process very slow because each ant will construct its own
solution. Therefore, increasing any parameter value related to those issues will not
give a better result, instead of that; it will give a worse result. Hence, ACS needs
better exploration mechanism which is not based on random selection. Tabu search
algorithm shows very fast convergence with reasonable processing time. Genetic
algorithm and tabu search are considered as good candidates to do the enhancement
process for the solution found by ACS algorithm. Therefore, this study proposes
82
hybrid ACS with GA and TS algorithms. The hybrid approaches are implemented in
low and high level of hybridization.
83
CHAPTER THREE
RESEARCH METHODOLOGY
This chapter presents the framework and methodology of this study to implement the
hybrid ACS with GA and TS algorithms. In addition, the level of hybridization for
each approach is discussed in this chapter as well. Moreover, this chapter provides the
evaluation details.
The rest of this chapter is organised as follows. Section 3.1 discusses the research
framework and Section 3.2 describes the methodology, techniques and the proposed
algorithms. The summary is presented in Section 3.3.
3.1 Research Framework
The research framework started with enhancing the exploration mechanism in ACS
algorithm by implementing low level hybridization with GA and TS algorithms as
shows in Figure 3.1. Then, the GA and TS algorithms are hybridized with ACS
algorithm in a high level order to refine the solution found by ACS algorithm. The
design and development of the grid computing simulator is undertaken in the third
phase. Finally, phase four focuses on the evaluation of the proposed algorithms. The
following sections describe the methods and techniques used in each phase of the
research framework.
84
Figure 3.1. The Research Framework
Enhance ACS exploration
Phase 1:
Low level hybrid
algorithm.
1- ACS(GA)
2- ACS(TS)
(1
st
objective)
Refine the ACS solution
Phase 2:
High level hybrid
algorithm.
1- ACS+GA
2- ACS+TS
(2
nd
objective)
Develop static and
dynamic grid computing
simulator
Phase 3:
Simulator for
grid computing
system
(3
rd
objective)
Evaluation of the
proposed algorithm
Phase 4:
The best hybrid
algorithm for
grid computing
system
(4
th
objective)
Phase Activity Outcomes
85
3.2 Research Methodology
The methods that have been used in conducting the research are explained in the
following sections.
3.2.1 Problem Formulation
Job scheduling problem consists of complex job involving the execution of multiple
tasks. Each job contains one or more tasks (Kolodziej, 2012). This study considered a
static and dynamic grid computing system based on batch mode. In batch mode, the
tasks are grouped into a batches and each batch assigned to the resources via the
scheduler. In addition, this study deals with independent tasks that are tasks with no
relation between each other. The task size is expressed using Million of Instruction
(MI) and the resource capacity expressed using Million of Instruction Per Second
(MIPS) (Kolodziej, 2012). The time required to process a task on a resource is
calculated using Expected Time to Compute (ETC) model proposed by Braun et al.
(2001) as follows:
퐸푇퐶
[
푖,푗
]
=
푡푎푠푘
푖
푚푎푐ℎ푖푛푒
푗
(3.1)
퐸푇퐶
푛×푚
is a matrix with two dimensions 푛×푚 where 푛 is the number of tasks and
푚 is the number of machines. In addition, each machine has a load to process before
processing the new tasks. The previous load expressed using ready time vector
(Kolodziej, 2012). The ready time vector of all machines is defined as:
푟푒푎푑푦_푡푖푚푒=[푟푒푎푑푦
1
,푟푒푎푑푦
2
,...,푟푒푎푑푦
푚
]
86
The completion time of 푚푎푐ℎ푖푛푒
푗
is calculated using:
푐표푚푝푙푒푡푖표푛
[
푗
]
=푟푒푎푑푦
푗
+ ∑ 퐸푇퐶
[
푖,푗
]
,
푖∈푇푎푠푘(푗)
(3.2)
Where 푇푎푠푘(푗) is the set of tasks assigned to the 푚푎푐ℎ푖푛푒 푗 (Kolodziej, 2012).
The 푐표푚푝푙푒푡푖표푛
[
푗
]
parameters are the coordinates of the following completion
vector:
푐표푚푝푙푒푡푖표푛= [푐표푚푝푙푒푡푖표푛
[
1
]
,푐표푚푝푙푒푡푖표푛
[
2
]
,...,푐표푚푝푙푒푡푖표푛
[
푚
]
]
푇
(3.3)
Using completion vector, the makespan calculated using the following equation:
푚푎푘푒푠푝푎푛=max
j ∈ M
(
completion[j]
)
(3.4)
where M is the number of machines (Kolodziej, 2012).
The workflow of the sequence of tasks on a given 푚푎푐ℎ푖푛푒 푖 is calculated using the
following equation:
푊퐹
[
푗
]
= 푟푒푎푑푦
푗
+ ∑퐸푇퐶[푖,푗]
푖∈푆표푟푡푒푑[푗]
(3.5)
Where 푊퐹
[
푗
]
is the workflow of the 푚푎푐ℎ푖푛푒 푗, 푆표푟푡푒푑[푗] is a set of tasks assigned
to the 푚푎푐ℎ푖푛푒 푗 sorted in ascending order by the corresponding ETC values
(Kolodziej, 2012).
The flowtime value is the sum of 푊퐹
[
푗
]
parameters using the following equation:
87
퐹푙표푤푡푖푚푒= ∑푊퐹
[
푗
]
푗∈푀
(3.6)
The utilization metric is calculated using the following equation (Kolodziej, 2012):
푎푣푒푟푎푔푒 푢푡푖푙푖푧푎푡푖표푛=
∑
푐표푚푝푙푒푡푖표푛[푗]
{푗∈푀푎푐ℎ푖푛푒푠}
푚푎푘푒푠푝푎푛∙푛푢푚푛푒푟 표푓 푚푎푐ℎ푖푛푒푠
(3.7)
3.2.2 Dynamic Expected Time to Compute
For ACS implementation, the heuristic information needs to be defined. For static
environment, heuristic value is calculated from the Expected Time to Complete (ETC)
matrix using {1 / (ETC
푖푗
+Load
푗
)} where ETC
푖푗
represents the expected time to
compute task 푖 on machine 푗 Equation (3.1), and Load
푗
is the previous load assigned
to machine 푗 (Ku-Mahamud & Alobaedy, 2012). For dynamic environment, this study
will calculate the heuristic value from the Dynamic Expected Time to Complete
(DETC) matrix using {1 / (DETC
푖푗
+Load
푗
)} where DETC
푖푗
represents the dynamic
expected time to compute task 푖 on machine 푗, and Load
푗
is the previous load assigned
to machine 푗. Longer computing time and more loads will produce a smaller heuristic
value, which will make the probability of selecting this machine smaller and vice
versa.
3.2.3 Solution Encoding
One of the most important parts of any metaheuristics algorithms is how to encode the
solution which is related to the algorithm data structure (Michalewicz, 1999). This
study implemented the solution encoding using a vector (Kołodziej & Khan, 2012).
The size of the vector is equal to the number of tasks. The vector value indicates the
88
machine number; therefore, the vector values are in the range of (1 – number of
machines) (Xhafa et al., 2011). For example:
Solution_Vector [1] = 2, means task 1 is assigned to machine number 2.
This representation implemented for ant colony system, tabu search, and genetic
algorithm. For ant colony system, each ant holds an empty vector which represents
the schedule solution. During construction phase, each ant will assign a value to the
vector which is a machine number. Once the construction phase is finished, the ant
has the complete vector which is a complete schedule solution. Figure 3.2 shows the
solution vector used by the ants.
Figure 3.2. The solution vector used by the ants
For tabu search algorithm, same representation implemented as a solution trajectory.
TS algorithm searched the vector neighbourhood using methods, such as swapping the
adjacent machine. The vector will always represent a valid schedule solution as long
as the vector values in the range of (1- number of machines).
For genetic algorithm, the solution vector considered as a chromosome. The following
scenario is used in demonstrating the practicality of the proposed representation. A
89
scheduler has to assign five tasks (t) to four resources (r). Figure 3.3 shows the
solution vectors contain the resource number.
Figure 3.3. Solution vectors used by genetic algorithm
Note that the tasks sequences are fixed in ascending order ( 푡
1
,푡
2
,...,푡
푛
). This type of
solution vector contains less information (only resource number) and easier to
manipulate with operations such as crossover in genetic algorithm. For example,
applying crossover after the third gene in Figure 3.3 will produce new solutions as
shows in Figure 3.4.
Figure 3.4. The new solution vectors produced by crossover operator
In validating the solution, changing the resource order will always produce valid
solution even if there are resources that are assigned more than once or not utilized at
all. In other words, validation of the solution can be omitted. Thus applying this type
of representation will reduce the calculation process and time.
3.2.4 Objective Function
There are many criteria in job scheduling to measure the solution quality, such as
makespan, flowtime, utilization, matching, and balance. Due to the importance of
90
makespan metric, this is study considered the makespan value as the main objective to
minimize using the following fitness function for GA and ACS (Braun et al., 2001):
퐹푖푡푛푒푠푠=
1
푚푎푘푒푠푝푎푛
(3.8)
Makespan value calculated using equation (3.4). Solution with smaller makespan
value means it has higher fitness value. For TS algorithm, the makespan value is the
fitness value itself, that is: 퐹푖푡푛푒푠푠=푚푎푘푒푝푠푎푛.
3.2.5 Ant Colony System Algorithm Implementation
ACS algorithm starts with initializing the parameters and pheromone trails. The initial
pheromone is calculated as (Dorigo & Stutzle, 2004):
1/
(
(푛푒푎푟푒푠푡−푛푒푖푔ℎ푏표푟−푠표푙푢푡푖푛
)
∙
(
푛푢푚푏푒푟 표푓 푚푎푐ℎ푖푛푒∙
푛푢푚푏푒푟 표푓 푡푎푠푘푠)
)
(3.9)
Once the initializing process is done, each ant in ACS starts solution construction
process using the following equations:
푃
푖푗
퐴푛푡푘
={
푎푟푔푚푎푥 {[푡
푖푗
] .[1 / (퐷퐸푇퐶
푖푗
+퐿표푎푑
푗
)]
훽
}, 푖푓 푞 ≤푞
0
;
퐽 표푡ℎ푒푟푤푖푠푒;
(3.10)
where [푡
푖푗
] is the pheromone value between the task 푖 and machine 푗, 퐷퐸푇퐶
푖푗
is the
dynamic expected time to compute task 푖 on machine 푗 (퐸푇퐶
푖푗
is used for static),
Load
푗
is the previous load on machine 푗, 훽 is the parameter to control the influence of
the heuristic information, 푞 is a random variable uniformly distributed in [0, 1], 푞
0
(0
91
≤ 푞
0
≤ 1) is a parameter, and 퐽 is a random variable selected according to the
probability distribution using the following equation:
푃
푖푗
퐴푛푡푘
=
[푡
푖푗
] . [1 / (퐷퐸푇퐶
푖푗
+퐿표푎푑
푗
)]
훽
∑
[푡
푖푗
] . [1 / (퐷퐸푇퐶
푖푗
+퐿표푎푑
푗
)]
훽
푀
푗=1
(3.11)
During the construction process, each ant will apply a local pheromone update using
the following equation (Dorigo & Stutzle, 2004):
푡
푖푗
=
(
1−휌
)
∙푡
푖푗
+ 휌∙푡
0
(3.12)
Where 휌 (0 < 휌 < 1) is the evaporation rate and 푡
0
is the initial pheromone calculated
using equation (3.9).
Once all the ants finished their construction phase, the global pheromone update starts
using the following equation (Dorigo & Stutzle, 2004):
푡
푖푗
←
(
1−휌
)
∙푡
푖푗
+(휌∙∆푡
푖푗
푏푖푗
), ∀
(
푖,푗
)
∈푇
푏푠
,
(3.13)
Where ∆푡
푖푗
푏푖푗
is the fitness value found be the best-so-far ant using equation (3.8) and
푇
푏푠
are the arcs of the best solution (Dorigo & Stutzle, 2004).
Appendix A provides the C# code for ant colony system algorithm.
3.4.6 Genetic Algorithm Implementation
Genetic algorithm consists of several methods, namely generate population,
evaluation, selection, crossover, mutation, and combination operators. In this study,
92
each operator is implemented according to Xhafa et al. (2007a). The following are the
implementation details.
I. Population
A genetic algorithm is based on a population approach which is using many
chromosomes in order to apply operators, such as crossover and mutation operators.
Each chromosome is representing a complete solution. For job scheduling problem in
grid computing, the chromosome is a vector of integer numbers [1 - machines
numbers]. The size of the vector is equal to the number of tasks and the vector index
represents the task number (Xhafa et al., 2007a). For example Figure 3.5 shows two
chromosomes for 5 tasks and three machines.
Chromosome[1] = 1 2 2 1 2
Chromosome[2] = 2 2 2 1 1
Figure 3.5. Chromosomes for five tasks and three machines
Each chromosome in the population is created randomly. However, for the proposed
hybrid approach ACS(GA) and ACS+GA, one of the population chromosome is the
best solution passed from ACS to GA.
II. Evaluation
Every chromosome’s fitness value in the population is evaluated using the objective
function based on makespan value as defined in equation (3.8).
93
III. Selection
This study implemented the K-tournament method as a selection operator. In Ktournament operator, K chromosomes are randomly selected from the population.
From these K chromosomes, two chromosomes are selected with the highest fitness
values. This type of selection will give chance to all individual to compete fairly
(Xhafa et al., 2007a).
IV. Crossover
There are many types of crossover operators, such as one-point, two-points, multipoints, and uniform crossover. This study implemented a crossover operator knows as
Fitness-Based crossover (Xhafa et al., 2007a). In Fitness-Based crossover, the
crossing is made based on the fitness of the parents chromosomes (solutions). Let 푓
1
be the fitness of the first chromosome and 푓
2
is the fitness of the second chromosome.
Then the probability of interchanging for each gene (machine) is calculated using:
푝= 푓
1
/ (푓
1
+ 푓
2
)
(3.14)
In this method, if there is a large difference in the fitness values between two parent
chromosomes, then it is quite probable that a chromosome of new structure will be
obtained (Xhafa et al., 2007a).
V. Mutation
In this study, the mutation operator implemented based one the Re-balance mutation
method (Xhafa et al., 2007a). Re-balance mutation tries to reduce the lode of the most
overloaded machine by swapping jobs from the overloaded machine. The Re-balance
mutation is done in two steps:
94
a. Choose a machine 푚 from most overloaded machines.
b. Identify job 푡 assigned to 푚 and 푡
′
assigned to another machine 푚
′
such
that 퐸푇퐶
[
푡
′
][
푚
′
]
< 퐸푇퐶
[
푡
][
푚
]
. Jobs 푡 and 푡
′
are swapped.
VI. Replacement
The study implemented the replacement operator based on Steady-State Genetic
Algorithm (SSGA) strategy (Xhafa et al., 2007a). In SSGA, the worse portion of the
population will be replaced with the new generated chromosomes. The size of the
population is maintained constantly. In spite of the risk of stagnation, using SSGA
operator performs very well if a good solution is required to be find very quickly
such as the case of job scheduling problem in grid computing where the time to find
the solution is very restricted (Xhafa et al., 2007a).
Appendix B provides the C# code for genetic algorithm.
3.4.7 Tabu Search Algorithm Implementation
The major parts of the tabu search algorithm implementation in this study are adopted
from the studies proposed by (Xhafa et al., 2009; Xhafa et al., 2011). The
implemented tabu search algorithm consists of six parts as described in the following
points:
I. Initial Solution
In this study, the initial solution is passed from the best solution found by the ants in
ACS algorithm. Therefore, TS algorithm starts with good quality solution in order to
enhance it.
95
II. Movements
The implemented TS algorithm applies two types of movements in order to generate
new solution from the current solution’s neighbourhood, namely transfer and swap.
Transfer is the process of moving a job from one machine to another, while swap is
the process of exchanging two jobs assigned to different machines (Xhafa et al.,
2009).
III. Memory
This study implemented the recency memory that is, a tabu list with the last time each
job was assigned to every machine machines (Xhafa et al., 2009). The recency
memory is based on a matrix 푇퐿
푛×푚
where 푛 is the number of tasks and 푚 is the
number of machines (Xhafa et al., 2011).
IV. Aspiration Criteria
Two criteria are implemented to accept the tabu movements: First, if the movement to
the new solution produce better makespan value, then accept the movement. Second,
if the movement to the new solution produce equal makespan to the best found
solution with better flowtime value, then accept the movement.
V. Soft Diversification
Soft diversification implemented using swap load method in order to move to new
search space area near to the current solution. This method starts by identifying the
highest load machine and tries to swap it with the lowest load machine.
96
VI. Strong Diversification
During the algorithm execution, if the algorithm is not able to find better quality
solution within 50 iterations, then the strong diversification is triggered. The strong
diversification is based on a large perturbation of the current solution by changing the
assignments of 1% of the number of jobs to the random machines (Xhafa et al., 2009).
Appendix C provides the C# code for tabu search algorithm.
3.2.8 Enhance ACS exploration
Hybridization is a term which refers to the approach that combines two or more
algorithms in order to achieve a result which is not achievable using a stand-alone
approach (Xhafa, Gonzalez, et al., 2009). Algorithms could be hybridized fully or
partially to be able to get the best features of the combined algorithms. According to
Xhafa, Kolodziej, Barolli, and Fundo (2011), there are two levels of hybridization
between algorithms, namely, high level and low level which refers to the degree of
coupling between the metaheuristics algorithms.
In low level hybridization, also called strongly coupled, the algorithms inter-change
their inner procedures. The level of hybridization reflects the degree of inner
exchange among the hybridized algorithms. In low level hybridization, one of the
algorithms is the main algorithm, which calls other algorithms at any time of
execution (depending on the hybridization design). The low level hybridization
algorithm could be presented as 퐴푙푔표푟푖푡ℎ푚
1
(퐴푙푔표푟푖푡ℎ푚
2
) (Xhafa, Gonzalez, et al.,
2009). In this representation, 퐴푙푔표푟푖푡ℎ푚
1
is the main algorithm and 퐴푙푔표푟푖푡ℎ푚
2
is
the subordinated algorithm (Jourdan, Basseur, & Talbi, 2009; Xhafa, Kolodziej,
Barolli, & Fundo, 2011).
97
This study has implemented both levels in order to determine the best hybridization.
In low level hybridization, the combined algorithms ACS with GA “ACS(GA)” and
ACS with TS “ACS(TS)” will interchange their inner procedures. ACS is the main
algorithm which during its flow will call the GA and TS for enhancement. The
algorithm notation ACS(GA) and ACS(TS) for low level means ACS is the main
algorithm and GA or TS is the subordinated algorithm. Low level hybridization
between ACS and GA will refine the solution produced by each ant in ACS. On the
other hand, TS will enhance the exploration mechanism in ACS algorithm. Tabu
search algorithm is based on systematic process (Glover & Laguna, 1997). Therefore,
tabu search algorithm is a very suitable approach to be combined with ACS algorithm
to enhance the exploration mechanism. In low level hybridization, the best solution
produced by the ants is sent to the local search algorithm for enhancement. The
enhanced solution is returned to the ant for pheromone update. Therefore, the ant will
update the pheromone using the enhanced solution which makes the ants deposits
more pheromone value. The pheromone value will influence the movements of the
ants in the next iteration. Figures 3.6 and 3.7 represent the pseudocode for ACS(GA),
and ACS(TS) algorithms respectively.
98
Figure 3.6. ACS(GA) (low level) algorithm pseudocode
In ACS(GA) Figure 3.6, the first step is initializing the ants and distribute them
randomly (Dorigo & Stutzle, 2004). Second step is initializing the parameters and
pheromone trails. The initial pheromone is calculated using equation (3.9).
The third step is the while loop which is terminated when the condition is met, in this
study the termination condition is satisfied after 90 seconds of the algorithm
execution. The fourth step is the loop for each ant. The fifth step is the construction
using the equation (3.10).
Procedure ACS(GA)
Step 1- Initialize the number of ants 푛;
Step 2- Initialize parameters and pheromone trails; Equation (3.9)
Step 3- While (Termination condition not met) Do;
Step 4- For i=1 to 푛 Do;
Step 5- Construct new solution; Equation (3.10)
Step 6- Apply local pheromone update; Equation (3.12)
Step 7- End For;
// Genetic algorithm starts here;
Step 8- Initialize population (P);
Step 9- Add (best ant solution from ACS to P);
Step 10- Evaluate (P); Equation (3.8)
Step 11- While (termination condition not met);
Step 12- Ṕ← Select (P);
Step 13 - Crossover (Ṕ);
Step 14- Mutate (Ṕ);
Step 15- Evaluate (Ṕ); Equation (3.8)
Step 16- P ← Replace (Ṕ ∪ P);
Step 17- End While;
// Genetic algorithm ends here;
Step 18- Apply Global pheromone update; Equation (3.13)
Step 19- Update best found solution 푠
∗
;
Step 20- End while;
Step 21- Return the best solution;
End Procedure;
99
The sixth step is the local pheromone update using the equation (3.12). Step seven is
the end of the ants loop. Step eight is the start of genetic algorithm which is start by
initializing the population using random method. Step nine will add the best solution
found by the ants to the population of genetic algorithm. Step ten will evaluate the
population using the makespan fitness objective by equation (3.8). Step eleven is the
main genetic algorithm loop using (while) with termination condition of two seconds.
Step twelve is the selection process using tournament operator. Step thirteen is the
crossover operator using fitness based crossover. Step fourteen is the mutation process
using re-balance operator. Step fifteen will evaluate the new solution using the
makespan objective function by equation (3.8). Step sixteen is the replication operator
which replaces the old solution with the new generated solution using SSGA strategy.
Step seventeen ends the genetic algorithm execution. The best solution found by
genetic algorithm is passed back to the ant colony system algorithm.
Step eighteen is the global pheromone update using the equation (3.13). Step nineteen
will update the best so far solution found by the algorithm. Step twenty ends the ant
colony system algorithm and the best solution is returned by step twenty one.
Appendices A and B provides the C# code for ant colony system and genetic
algorithm respectively.
The proposed ACS(GA) algorithm is different than other proposed algorithm such as
the one presented by Liu, Chen, Dun, Liu, and Dong (2008) which is based on using
genetic algorithm to choose, cross, and mutate the parameters of ant colony algorithm.
In the proposed ACS(GA) algorithm, the genetic algorithm is used to select, cross,
and mutate the best-so-far solution found by ants in every cycle as illustrated in
Figure 3.6 with the step “Add (best ant solution from ACS to P)”. Therefore, GA
100
works as an exploration mechanism to explore the search space based on the solution
found by the ants in ACS algorithm.
Another low level hybridization between ant colony system and tabu search algorithm
is shown in Figure 3.7.
101
Figure 3.7. ACS(TS) (low level) algorithm pseudocode
In Figure 3.7, the first step is initializing the ants and distribute them randomly
(Dorigo & Stutzle, 2004). Second step is initializing the parameters and pheromone
trails. The initial pheromone is calculated using equation (3.9).
Procedure ACS(TS)
Step 1- Initialize the number of ants 푛;
Step 2- Initialize parameters and pheromone trails; Equation (3.9)
Step 3- While (Termination condition not met) Do;
Step 4- For i=1 to 푛 Do;
Step 5- Construct new solution; Equation (3.10)
Step 6- Apply local pheromone update; Equation (3.12)
Step 7- End For;
// Tabu search algorithm starts here;
Step 8- Create solution 풔 from best ant 푨푪푺_풔
∗
;
Step 9- Create global solution 푠
∗
← 푠;
Step 10- Create tabu list 푇퐿;
Step 11- Initialize the aspiration function 퐴;
Step 12- While (termination condition not satisfied) Do;
Step 13- Search the neighbourhood 푁 of current solution 푠: {푠̂∈푁(푠)};
Step 14- If (move from 푠 to 푠̂ is not in 푇퐿) Then;
Step 15- 푠 ← 푠̂;
Step 16- Update 푇퐿 memories;
Step 17- End If;
Step 18- Else If (푓
(
푠̂
)
<퐴(푓
(
푠
)
) Then;
Step 19- 푠 ← 푠̂;
Step 20- Update 푇퐿 memories;
Step 21- End If;
Step 22- If (푓(푠) < 푓(푠
∗
)) Then;
Step 23- 푠
∗
= 푠;
Step 24- End If;
Step 25- End While;
Step 26- 퐴퐶푆_푠
∗
← Global solution 푠
∗
;
// Tabu search algorithm ends here;
Step 27- Apply Global pheromone update; Equation (3.13)
Step 28- End while;
Step 29- Return Global solution 퐴퐶푆_푠
∗
;
End Procedure;
102
The third step is the while loop which is terminated when the condition is met, in this
study the termination condition is satisfied after 90 seconds of the algorithm
execution. The fourth step is the loop for each ant. The fifth step is the construction
using the equation (3.10). The sixth step is the local pheromone update using the
equation (3.12). Step seven is the end of the ants loop.
Step eight is the starting of tabu search algorithm. TS algorithm starts by creating
initial solution using the best-so-far solution found by ACS algorithm. Step nine
makes a global solution which is a copy of the initial solution. Step ten initializes the
tabu list to store the movement attributes. Step eleven initializes the aspiration
function. Step twelve is the (while) loop which is terminated after two seconds. Step
thirteen searches the neighbourhood of the current solution.
Step fourteen checks if the moving from the current solution to the neighbourhood
solution is not tabu. If so, the neighbourhood solution will be saved in the current
solution in step fifteen. Step sixteen will update the tabu list with old position to
prevent visiting the same location. Step seventeen ends the moving condition. In case
the movement to the new neighbourhood solution is tabu, the aspiration function at
step eighteen will check. If the aspiration function returns true, then the
neighbourhood solution will be saved in the current solution in step nineteen. Step
twenty will update the tabu list with old position to prevent visiting the same location.
Step twenty one ends the aspiration function. Step twenty two will check if the current
solution’s makespan is better than the global solution’s makespan. Step twenty three
saves the current solution as the global solution and ends with step twenty four.
103
Step twenty five ends the (while) loop and the best solution passed back to ACS
algorithm in step twenty six. Ant colony system will apply global pheromone update
in step twenty seven using equation (3.13). Step twenty eight ends the ACS (while)
loop and the best global solution found by ACS(TS) is returned in step twenty nine.
TS algorithm works as an enhancing exploration mechanism to explore the search
space based on the solution found by the ants in ACS algorithm.
Appendices A and C provides the C# code for ant colony system and tabu search
algorithms respectively.
Tabu search algorithm has been implemented in several hybrid algorithms as a local
search technique. A study proposed by Nagariya, Mishra, and Shrivastava (2014)
implemented ACO and TS algorithms for job scheduling in computational grid. In
Automatically extracted. Refer to the original PDF for figures, tables, and formatting.