Software Technology Master's Theses

Exploring text style transfer for formalizing Filipino text

Kenneth Uriel Loquinte, De La Salle University, Manila

Date of Publication

2024

Document Type

Master's Thesis

Degree Name

Bachelor of Science (Honors) in Computer Science and Master of Science in Computer Science

Subject Categories

Computer Sciences | Software Engineering

College

College of Computer Studies

Department/Unit

Software Technology

Thesis Advisor

Charibeth K. Cheng

Defense Panel Chair

Ethel Chua Joy Ong

Defense Panel Member

Charibeth K. Cheng

Jennifer O. Contreras

Abstract (English)

Text style transfer involves automatically rewriting a sentence from one style to another. Exploring techniques for text style transfer is important, as style plays a crucial role in making NLP systems more user-centered. However, there has been limited research on text style transfer in non-English contexts, primarily due to the scarcity of resources such as parallel corpora, which are essential for training text style transfer models. The limited work conducted in non-English settings is a barrier to a comprehensive understanding of the current state of text style transfer approaches. To help address this gap, this work was done within the context of the Filipino language, where text style transfer is unexplored. This work focused on the formality style transfer subtask, which aims to rewrite informal text in a formal style. To address the lack of parallel corpora in the Filipino language, this work proposed pseudo-parallel corpus construction, where informal-formal text pairs are created using only non-parallel corpora. These pseudo-parallel pairs were used to train a sequence-to-sequence model to formalize Filipino text. Different modifications to the pipeline were explored, and performance was evaluated using the three standard metrics in text style transfer: style transfer score, meaning preservation, and fluency. Although results show that the best model has below-average performance, the improvements gained with pipeline modifications indicate that further refinement of the methodology could still improve the quality of style transfer. This study recommends exploring better sentence representations, finding adjacent datasets for augmentation, and using aggregation-based scores to refine the dataset. Furthermore, more robust metric implementations should be used to obtain reliable evaluation scores for Filipino text style transfer.

Language

English

Keywords

Text processing (Computer science); Natural language processing (Computer science); Low-resource languages; Filipino language -- Data processing

Recommended Citation

Loquinte, K. (2024). Exploring text style transfer for formalizing Filipino text. Retrieved from https://animorepository.dlsu.edu.ph/etdm_softtech/11

Embargo Period

4-17-2024

This document is currently not available here.
