Date of Publication
8-2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Science Education Major in Biology
Subject Categories
Science and Mathematics Education
College
Br. Andrew Gonzalez FSC College of Education
Department/Unit
Science Education
Thesis Advisor
Maricar S. Prudente
Defense Panel Chair
Voltaire Mallari Mistades
Defense Panel Member
Mary Jane C. Flores
Lydia S. Roleda
Socorro E. Aguja
Denis Dyvee R. Errabo
Abstract/Summary
This study developed and psychometrically validated a Biology Item Bank using the Item Response Theory Four-Parameter Logistic (IRT–4PL) model to provide a standardized pool of calibrated items for Senior High School STEM students preparing for biology-intensive and health-allied college programs. The development process followed a multi-phase validation protocol integrating expert evaluation, empirical testing, and advanced psychometric modeling. An initial pool of 120 multiple-choice items was constructed and reviewed by five biology educators through online focus group discussions. Items were evaluated for content accuracy, linguistic clarity, and curricular relevance, and were classified by cognitive level using the six levels of Bloom’s revised taxonomy. A pilot validation confirmed semantic and content appropriateness, after which the test was administered to 1,017 STEM students from a private university in Metro Manila. Both dichotomous and polytomous scoring were employed to support distractor analysis. Reliability analysis yielded high internal consistency (Cronbach’s α = 0.920), which improved slightly (α = 0.923) after the removal of underperforming items. Additional distractor diagnostics led to further item revisions, producing an 88-item calibrated pool. Structural validity was established through exploratory factor analysis (KMO = 0.879; Bartlett’s test, p < .001) and confirmatory factor analysis, which demonstrated acceptable fit indices (RMSEA = 0.013, SRMR = 0.025, TLI = 0.916, CFI = 0.932). Item-level calibration under the IRT–4PL model provided parameter estimates for discrimination (a), difficulty (b), guessing (c), and slipping (d). A small number of items showed misfit or overfit, while the majority performed within psychometric expectations. The Item Characteristic Curves (ICCs) illustrated the psychometric soundness of retained items across cognitive domains.
Based on integrated statistical and expert criteria, the final classification consisted of 18 retained items, 20 revised, 16 reassigned to alternative domains, and 66 rejected due to psychometric flaws. This study affirms the utility of the IRT–4PL model in developing item banks for high-stakes assessments. The finalized test, rigorously validated, provides a dependable source of calibrated items for biology assessments and diagnostic purposes. Moreover, the study recommends extending the IRT–4PL framework to the development of item banks in other science domains, ensuring validity, fairness, and pedagogical alignment in assessment design.
Keywords: Biology test, IRT–4PL, item bank, discrimination, difficulty, guessing, slipping, factor analysis, latent trait/construct
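As an illustration of the four item parameters named in the abstract, the sketch below evaluates a 4PL item characteristic curve. It assumes the common parameterization P(θ) = c + (d − c) / (1 + exp(−a(θ − b))), in which c is the lower asymptote (guessing) and d is the upper asymptote, so that the slip probability is 1 − d. The dissertation's own parameterization may differ, and the function name `icc_4pl` and the item values are hypothetical, not taken from the study.

```python
import math


def icc_4pl(theta, a, b, c, d):
    """Probability of a correct response under an assumed 4PL parameterization.

    theta: examinee ability
    a: discrimination (slope at the inflection point)
    b: difficulty (location of the inflection point)
    c: lower asymptote (guessing)
    d: upper asymptote (assumed here; slip probability is 1 - d)
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters for illustration only (not from the dissertation).
item = {"a": 1.2, "b": 0.0, "c": 0.2, "d": 0.95}

# Evaluate the curve at low, average, and high ability.
for theta in (-3.0, 0.0, 3.0):
    p = icc_4pl(theta, **item)
    print(f"theta={theta:+.1f}  P(correct)={p:.3f}")
```

Under this parameterization the curve rises from c toward d as ability increases, and at θ = b the probability is the midpoint (c + d) / 2.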
Language
English
Format
Electronic
Keywords
Biology—Ability testing; Educational tests and measurements; Item response theory; Psychometrics
Recommended Citation
Mirabete, G. S. (2025). Development and validation of a biology item bank using item response theory – four parameter logistic (IRT–4PL) model. Retrieved from https://animorepository.dlsu.edu.ph/etdd_scied/55
Embargo Period
9-2028