Date of Publication

8-10-2024

Document Type

Bachelor's Thesis

Degree Name

Bachelor of Science in Statistics Major in Actuarial Science

Subject Categories

Mathematics | Public Health

College

College of Science

Department/Unit

Mathematics and Statistics Department

Thesis Advisor

Rechel G. Arcilla

Defense Panel Chair

Olivia P. Pagulayan

Defense Panel Member

Kevynn P. Delgado

Abstract/Summary

Women face a greater risk of contracting HIV due to their anatomy and the impacts of gender inequality. Despite this, only 8% of Filipino women have ever been tested for HIV, according to the results of the 2022 National Demographic and Health Survey. This study aimed to identify the determinants of HIV testing to aid in the development of policies and interventions that could improve testing uptake. Factors retained by stepwise selection were used to predict HIV testing with logistic regression, random forest, and Naïve Bayes classifiers. Since the target class was highly imbalanced, Synthetic Minority Oversampling Technique (SMOTE) preprocessing was applied before machine learning classification. The classifiers were then evaluated using five-fold cross-validation, and the performance metrics precision, recall, and F1 score were computed from the resulting confusion matrices. Age, region, residence type, educational attainment, print media use, internet use, wealth, contraceptive use and intention, marital status, age at first sex, and some partner characteristics were found to be significant determinants of HIV testing among Filipino women. Lower rates of HIV testing were associated with older respondents, those from rural households, those with lower educational attainment, and those with lower socioeconomic status. Among the three classifiers, random forest performed best with SMOTE, and logistic regression performed best without it. SMOTE preprocessing did not result in any substantial improvement to the logistic regression classifier. For the two nonparametric machine learning classifiers, random forest and Naïve Bayes, SMOTE yielded higher F1 scores, with higher recall coming at the cost of lower precision.
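The metrics named in the abstract can be illustrated with a minimal sketch, assuming a binary confusion matrix with "ever tested for HIV" as the positive class. The function name `metrics_from_confusion` and the counts below are hypothetical and chosen only for illustration; they are not results from the thesis.

```python
def metrics_from_confusion(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 for the positive class.

    tp, fp, fn, tn are the four cells of a binary confusion matrix.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (NOT from the thesis): 1,000 respondents,
# 80 of whom were actually tested, mimicking an imbalanced target.
scores = metrics_from_confusion(tp=40, fp=60, fn=40, tn=860)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With a rare positive class, accuracy can look high even when many tested respondents are missed, which is one common reason to report precision, recall, and F1 instead.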

Abstract Format

html

Language

English

Format

Electronic

Keywords

HIV (Viruses)--Testing; Women--Health and hygiene--Philippines; Logistic regression analysis

Embargo Period

8-9-2025

Available for download on Saturday, August 09, 2025
