Date of Publication
2024
Document Type
Dissertation/Thesis
Degree Name
Bachelor of Science (Honors) in Computer Science and Master of Science in Computer Science
College
College of Computer Studies
Department/Unit
Software Technology
Thesis Advisor
Charibeth K. Cheng
Defense Panel Chair
Ethel Chua Joy Ong
Defense Panel Member
Charibeth K. Cheng
Jennifer O. Contreras
Abstract (English)
Text style transfer is the task of automatically rewriting a sentence from one style into another. Exploring techniques for text style transfer is important because style plays a crucial role in making NLP systems more user-centered. However, research on text style transfer in non-English contexts has been limited, primarily because of the scarcity of resources such as parallel corpora, which are crucial for training text style transfer models. This limited body of work is a barrier to a comprehensive understanding of the current state of text style transfer approaches. This work was therefore carried out in the context of the Filipino language, where text style transfer is unexplored. It focused on the formality style transfer subtask, which aims to rewrite informal text in a formal style. To address the lack of parallel corpora in Filipino, this work proposed pseudo-parallel corpus construction, in which informal-formal text pairs are created using only non-parallel corpora. These pseudo-parallel pairs were used to train a sequence-to-sequence model to formalize Filipino text. Different modifications to the pipeline were explored, and performance was evaluated using the three standard metrics in text style transfer: style transfer score, meaning preservation, and fluency. Although the results show that the best model has below-average performance, the improvements gained from the pipeline modifications indicate that further refinement of the methodology could still improve the quality of style transfer. This study recommends exploring better sentence representations, finding adjacent datasets for augmentation, and using aggregation-based scores to refine the dataset. Furthermore, more robust metric implementations should be used to obtain reliable evaluation scores for Filipino text style transfer.
Language
English
Recommended Citation
Loquinte, K. (2024). Exploring Text Style Transfer for Formalizing Filipino Text. Retrieved from https://animorepository.dlsu.edu.ph/etdm_softtech/11
Embargo Period
4-17-2024