Abstract
Healthcare services generate large volumes of multi-structured, low-latency patient data, which are challenging to process and analyze within the Service Level Agreement (SLA). In this paper, a Parallel Semi-Naive Bayes (PSNB) based probabilistic method is used to process healthcare big data in the cloud to predict future health conditions. To improve the accuracy of the PSNB method, a Modified Conjunctive Attribute (MCA) algorithm is proposed for dimensionality reduction. The emergency condition of each patient is taken into account by setting a global priority among the patients, and an Optimal Data Distribution (ODD) algorithm is proposed to place both batch and streaming patient data on the Spark nodes. Further, a Dynamic Job Scheduling (DJS) algorithm is designed to schedule jobs efficiently on the most suitable nodes for processing the data, taking the SLA into account. The proposed PSNB algorithm achieves an accuracy of 87.8% for both batch and streaming data, which is 12.8% higher than the original Naive Bayes (NB) algorithm, and can conveniently be employed in various patient monitoring applications.
Original language | English |
---|---|
Pages (from-to) | 121-135 |
Number of pages | 15 |
Journal | Journal of Parallel and Distributed Computing |
Volume | 119 |
State | Published - 09 2018 |
Bibliographical note
Publisher Copyright: © 2018 Elsevier Inc.
Keywords
- Big Data
- Cloud computing
- Healthcare
- Spark