Parallel and distributed architecture of genetic algorithm on Apache Hadoop and Spark

Hau Chun Lu, F. J. Hwang, Yao Huei Huang*

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

34 Scopus citations

Abstract

The genetic algorithm (GA), one of the best-known metaheuristic algorithms, has been used extensively in management science, operational research, and industrial engineering. The efficiency of GAs in solving large-scale optimization problems can be enhanced if the iterative processes required by the genetic operators are implemented in a parallel and distributed computing architecture. Apache Hadoop has recently become one of the most popular systems for distributed storage and parallel processing of big data. By tightly integrating the GA into Apache Hadoop, this study proposes an advanced parallel and distributed GA computing architecture that makes GA evolution both effective and efficient. Characterized by a sophisticated mechanism for dispatching the core GA operators into Apache Hadoop, the developed computing framework fits well with the cloud computing model. In computational experiments on traveling salesman problem instances, the presented GA parallelization architecture outperforms the state-of-the-art reference architectures. Our numerical experiments also demonstrate that the proposed architecture can readily be extended to Apache Spark.
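The abstract does not spell out how the GA operators are dispatched, so the following is only a minimal sketch of the general idea of a partitioned GA on a traveling salesman instance. It is not the authors' architecture. All names (`parallel_ga`, `evolve_partition`, the parameter defaults) are hypothetical, the operators (order crossover, swap mutation, truncation selection) are common textbook choices assumed here, and a local thread pool merely stands in for Hadoop map tasks or Spark partitions.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def tour_length(tour, cities):
    """Length of the closed tour visiting cities in the given order."""
    n = len(tour)
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % n]])
               for i in range(n))


def order_crossover(p1, p2, rng):
    """Order crossover: copy a slice of p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(child[a:b + 1])
    fill = iter(c for c in p2 if c not in kept)
    for i in range(n):
        if child[i] is None:
            child[i] = next(fill)
    return child


def swap_mutation(tour, rng, rate=0.2):
    """With probability `rate`, swap two cities in a copy of the tour."""
    tour = tour[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour


def evolve_partition(task):
    """'Map' step (assumed): evolve one sub-population independently."""
    subpop, cities, generations, seed = task
    rng = random.Random(seed)  # private RNG keeps each task reproducible
    for _ in range(generations):
        ranked = sorted(subpop, key=lambda t: tour_length(t, cities))
        elite = ranked[:len(ranked) // 2]  # truncation selection with elitism
        children = [
            swap_mutation(order_crossover(*rng.sample(elite, 2), rng), rng)
            for _ in range(len(ranked) - len(elite))
        ]
        subpop = elite + children
    return subpop


def parallel_ga(cities, pop_size=40, partitions=4, rounds=5,
                generations=10, seed=0):
    """Partitioned GA: split, evolve partitions concurrently, merge, repeat."""
    rng = random.Random(seed)
    n = len(cities)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for r in range(rounds):
        rng.shuffle(population)  # re-deal individuals across partitions
        chunks = [population[k::partitions] for k in range(partitions)]
        tasks = [(chunk, cities, generations, seed * 1000 + r * partitions + k)
                 for k, chunk in enumerate(chunks)]
        # Threads stand in for cluster workers; they give no real speedup here.
        with ThreadPoolExecutor(max_workers=partitions) as pool:
            results = list(pool.map(evolve_partition, tasks))
        # 'Reduce' step (assumed): merge the evolved sub-populations.
        population = [tour for sub in results for tour in sub]
    return min(population, key=lambda t: tour_length(t, cities))
```

Calling `parallel_ga(cities)` with a list of `(x, y)` coordinate pairs returns the best tour found as a list of city indices. On a real cluster each partition would be evolved by a separate worker, and the cost of shuffling individuals between rounds would become a central design concern.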

Original language: English
Article number: 106497
Journal: Applied Soft Computing Journal
Volume: 95
State: Published - Oct 2020

Bibliographical note

Publisher Copyright:
© 2020 Elsevier B.V.

Keywords

  • Apache Hadoop
  • Apache Spark
  • Genetic algorithm
  • Parallel and distributed computing
  • Traveling salesman problems
