Natural language generator for SUMO

Date of Publication

2012

Document Type

Master's Thesis

Degree Name

Master of Science in Computer Science

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Adviser

Ethel Ong

Abstract/Summary

The Suggested Upper Merged Ontology (SUMO) is a formal, open-source upper ontology developed to serve as a standardized database for knowledge representation. Sigma is a knowledge engineering environment that supports the application and development of SUMO. It provides a host of facilities for browsing, developing, inspecting, merging, and debugging ontologies. It also has a natural language paraphrasing capability, which currently generates language that can be difficult for humans to read and understand. This research involves modifying the current natural language paraphrase capability of Sigma to produce output that is more natural. By definition, a natural sentence is expressed in clear, unforced terms in the target language and is close to what a native speaker would produce.

The modified natural language capability is able to generate a sentence for axiom statements that contain CaseRole terms. Axiom statements were selected for different types of test case scenarios, designed to validate that the modified NLP capability can generate output for the different types of statements contained in SUMO. Output sentences were generated for all the selected axiom statements and were evaluated on their structure, judged against the SUMO axiom statement, and on their naturalness. Results showed marked improvements in the output sentences of the modified NLP capability over those of the current NLP capability.

A test was also made to investigate the system's potential to generate a sentence in a language other than English. The language used for testing was Filipino. Results showed that the modified NLP capability is able to generate a sentence in Filipino. However, many issues were encountered. The generated sentence is not natural because the sentence structure used in the modified NLP capability follows the standard structure of English sentences. Sentences in Filipino follow a different structure and take into consideration other factors, such as the morphology of Filipino words and the focus point of the sentence.

Abstract Format

html

Language

English

Format

Print

Accession Number

TG05284

Shelf Location

Archives, The Learning Commons, 12F Henry Sy Sr. Hall

Physical Description

vi, 47 leaves ; 28 cm.

Keywords

Computational linguistics; Natural language processing (Computer science)

This document is currently not available here.
