e-ISSN 2231-8526
ISSN 0128-7680
Ashok Kumar, Arun Lal Srivastav, Ishwar Dutt and Karan Bajaj
Pertanika Journal of Science & Technology, Volume 29, Issue 4, October 2021
DOI: https://doi.org/10.47836/pjst.29.4.06
Keywords: C4.5 Algorithm, classification algorithms, decision tree, health model, Shannon entropy
Published on: 29 October 2021
The high rate of urbanisation has increased the need for state-of-the-art health models that can meet society's growing needs during a pandemic. Information-theoretic algorithms based on decision trees can mine data and, by classifying the related records, establish standards for the final decision. Classification is an effective tool for analysing the existing health systems in India’s states and union territories. For this purpose, the data are categorised and then processed with an enhanced Shannon entropy-based C4.5 decision tree algorithm to derive a set of rules. These rules can identify the major gaps in the health care systems. If these gaps are properly addressed in the affected regions, the health care models can help achieve the Sustainable Development Goals.
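To make the splitting criterion concrete, the following is a minimal sketch of Shannon entropy and the standard C4.5 gain ratio for categorical attributes. It is not the enhanced variant used in the paper, whose details the abstract does not give. The function names, attribute names (`doctor_density`, `bed_density`, `status`) and the four toy records are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Standard C4.5 gain ratio for splitting `rows` on `attribute`."""
    total = len(rows)
    base = shannon_entropy([r[target] for r in rows])

    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])

    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * shannon_entropy(g)
                    for g in groups.values())

    # Split information: entropy of the attribute's own value distribution.
    split_info = shannon_entropy([r[attribute] for r in rows])

    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (illustrative only, not real health data).
records = [
    {"doctor_density": "low", "bed_density": "low", "status": "poor"},
    {"doctor_density": "low", "bed_density": "high", "status": "poor"},
    {"doctor_density": "high", "bed_density": "low", "status": "good"},
    {"doctor_density": "high", "bed_density": "high", "status": "good"},
]

for attr in ("doctor_density", "bed_density"):
    print(attr, gain_ratio(records, attr, "status"))
```

In this toy set, `doctor_density` separates the two classes perfectly and `bed_density` carries no information about `status`, so a C4.5-style tree would choose `doctor_density` as the first split.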