Data Mining Predicts the Need for Immunization Vaccines Using the Naive Bayes Method

R Arri Widyanto, Meidar Hadi Avizenna, Nugroho Agung Prabowo, Kemal Alfata, Agus Ismanto


In December 2019, SARS-CoV-2, the virus that causes coronavirus disease (COVID-19), began spreading to countries around the world, infecting thousands of people and causing deaths. COVID-19 causes mild illness in most cases, although it can make some people seriously ill. In response, vaccines are in various phases of clinical development, and some have been approved for national use. The current situation shows a critical need for a fast, timely way to address the demand for COVID-19 vaccines. Non-clinical methods such as data mining and machine learning can help meet this need. This study focuses on US COVID-19 vaccination progress, using machine learning classification algorithms and geospatial analysis to visualize the results. The findings indicate which algorithm performs better for a given data set. The Naive Bayes algorithm is run on real-world data, and the results are analyzed to draw conclusions. The evaluation considered accuracy and performance, and Naive Bayes was found to be superior in terms of both time and accuracy.
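As an illustration of the kind of test the abstract describes, the following is a minimal sketch of a Naive Bayes classifier evaluated on accuracy and running time. The abstract does not state which Naive Bayes variant, features, or data set were used, so everything here is an assumption: the Gaussian variant, the two numeric features, the "low"/"high" labels, and the toy data are all hypothetical and are not taken from the study.

```python
import math
import time
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one row."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        # Naive assumption: features are independent given the class,
        # so the per-feature Gaussian log-likelihoods simply add up.
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the true label."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


# Synthetic toy data (hypothetical, NOT from the study): two numeric
# features per record and a made-up "low"/"high" class label.
train_x = [[10.0, 1.0], [12.0, 1.5], [11.0, 0.8],
           [50.0, 6.0], [55.0, 7.0], [52.0, 6.5]]
train_y = ["low", "low", "low", "high", "high", "high"]
test_x = [[11.5, 1.2], [53.0, 6.8]]
test_y = ["low", "high"]

start = time.perf_counter()
model = fit_gaussian_nb(train_x, train_y)
acc = accuracy(model, test_x, test_y)
elapsed = time.perf_counter() - start
print(f"accuracy={acc:.2f}, time={elapsed:.6f}s")
```

Timing the fit and the evaluation together mirrors the two criteria named in the abstract, accuracy and time. On real data the measurement would normally be repeated over several train/test splits.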



Keywords: COVID-19; Geospatial Analysis; Data Mining; Naive Bayes








Journal of Applied Data Sciences

ISSN : 2723-6471 (Online)
Organized by : Department of Information System, Universitas Amikom Purwokerto, Indonesia; Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia.

 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0