Machine Learning in Global Development: Applying k-Means Clustering to Identify Country Groupings by Economic and Health Performance

Section: Article
Published: Jan 1, 2026
Pages: 28-37

Abstract

This study applies k-means clustering to group countries on key economic and health indicators: Gross National Income per capita (GNIP), health expenditure, life expectancy, birth and death rates, and urbanization. The elbow method identified k = 3 as the optimal number of clusters, the point where the within-cluster sum of squares dropped sharply (from 5000 to 2000). The results reveal three distinct development groupings. A small cluster of 13 high-performing countries stands out with strong economic performance (GNIP = 0.658) and health outcomes (life expectancy = 0.856), along with low birth (0.116) and death (0.113) rates. This group also shows strong internal similarity (silhouette width = 0.58). The remaining countries fall into two broader clusters. The first includes 320 countries with moderate development, higher urbanization (UrbanP = 0.712), and relatively high health spending (healthE = 0.219), but lower GNIP (0.066). The second, with 277 countries, faces greater challenges, marked by low life expectancy (0.414), high birth rates (0.670), and weak economic indicators (GNIP = 0.067). Both larger clusters show weaker cohesion (silhouette widths = 0.29 and 0.32). These findings highlight the stratified and multidimensional nature of global development and offer a data-driven framework to inform policy decisions and tailor interventions to the characteristics of each cluster.
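
The workflow summarized above (scale the indicators, run k-means for several values of k, read the elbow from the within-cluster sum of squares, then check cohesion with silhouette widths) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the function names (`min_max_scale`, `farthest_first`, `kmeans`, `silhouette_widths`, `make_synthetic`) are hypothetical, min-max scaling is assumed because the reported values lie between 0 and 1, and the deterministic farthest-first seeding is a convenience chosen here, since the abstract does not state how the centroids were initialized.

```python
import math
import random


def min_max_scale(rows):
    """Rescale every column to [0, 1]; a constant column maps to 0."""
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]


def farthest_first(rows, k):
    """Deterministic seeding (an assumption): spread the start centroids out."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        nxt = max(rows, key=lambda r: min(math.dist(r, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(rows, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, centroids, wcss)."""
    centroids = farthest_first(rows, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(row, centroids[j]))
                      for row in rows]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    wcss = sum(math.dist(r, centroids[lab]) ** 2 for r, lab in zip(rows, labels))
    return labels, centroids, wcss


def silhouette_widths(rows, labels):
    """Per-point silhouette s = (b - a) / max(a, b); needs at least 2 clusters."""
    widths = []
    for i, (row, lab) in enumerate(zip(rows, labels)):
        mean_dist = {}
        for c in set(labels):
            d = [math.dist(row, other)
                 for j, (other, l2) in enumerate(zip(rows, labels))
                 if l2 == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[lab]  # mean distance to the point's own cluster
        if a is None:  # singleton cluster: width defined as 0
            widths.append(0.0)
            continue
        b = min(v for c, v in mean_dist.items() if c != lab)  # nearest other
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths


def make_synthetic(seed=42):
    """Three made-up groups of 20 rows: GNI per capita, health spending,
    life expectancy, birth rate, death rate, urban share. Not real data."""
    rng = random.Random(seed)
    centers = [(50000, 9.0, 81, 10, 8, 85),
               (8000, 6.0, 72, 18, 7, 60),
               (1500, 4.0, 58, 36, 11, 35)]
    spreads = (2500, 0.4, 1.5, 2.0, 0.5, 4.0)
    return [[rng.gauss(m, s) for m, s in zip(c, spreads)]
            for c in centers for _ in range(20)]


rows = min_max_scale(make_synthetic())

# Elbow method: within-cluster sum of squares for k = 1..6.
wcss_by_k = {k: kmeans(rows, k)[2] for k in range(1, 7)}
for k, w in wcss_by_k.items():
    print(f"k={k}  WCSS={w:.2f}")

# Fit the chosen k and report the average silhouette width per cluster.
labels, centroids, _ = kmeans(rows, 3)
widths = silhouette_widths(rows, labels)
for c in sorted(set(labels)):
    in_c = [w for w, lab in zip(widths, labels) if lab == c]
    print(f"cluster {c}: n={len(in_c)}  mean silhouette={sum(in_c) / len(in_c):.2f}")
```

When reading the output, the elbow is the value of k after which further increases reduce the within-cluster sum of squares only slightly, and a cluster's mean silhouette width approaches 1 when its members sit much closer to each other than to the nearest neighboring cluster. With real indicator data, a library implementation with multiple random restarts would usually be preferable to this hand-rolled version.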


How to Cite

[1]
O. P. Adebayo, I. Ahmed, I. Garba, and K. Oyeleke, “Machine Learning in Global Development: Applying k-Means Clustering to Identify Country Groupings by Economic and Health Performance”, JES, vol. 35, no. 1, pp. 28–37, Jan. 2026.