Master's Theses

Keyword extraction for very high dimensional datasets using random projection as key input representation scheme

Jeric Bryle S. Dy, De La Salle University, Manila

Date of Publication

2-1-2011

Document Type

Master's Thesis

Degree Name

Master of Science in Computer Science

Subject Categories

Computer Sciences

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Adviser

Arnulfo Azcarraga

Defense Panel Chair

Nelson Marcos

Defense Panel Member

Arnulfo Azcarraga
Charibeth Cheng

Abstract/Summary

Keywords are increasingly useful as users face the challenge of keeping up with the voluminous information they must process every day. The most straightforward way to extract keywords is to compute the term frequencies for each document. But when dealing with corpora containing hundreds of thousands of unique terms, the space and computing time required to extract the most relevant terms as keywords severely limit the practical implementation of current keyword extraction techniques. As such, the frequency counts of extracted terms need to be subjected to a data compression scheme. In this research, the random projection method is used to compress the extracted data, which allows various clustering and keyword extraction algorithms to run directly on the compressed data. Several experiments are conducted to assess the effect of the random projection method on the quality and time-space efficiency of k-means clustering and term extraction.
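
The compression step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the projection matrix is built, so this example assumes a dense Gaussian matrix scaled by 1/sqrt(k), which is one common choice. The function names (`term_frequencies`, `make_projection`, `project`), the toy documents, and the target dimension of 4 are all hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import math
import random
from collections import Counter


def term_frequencies(doc, vocabulary):
    """Count how often each vocabulary term occurs in one document."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocabulary]


def make_projection(n_terms, n_components, seed=0):
    """Build an n_terms x n_components Gaussian random matrix.

    Assumption: entries are drawn from N(0, 1) and scaled by
    1/sqrt(n_components); the thesis may use a different scheme.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
            for _ in range(n_terms)]


def project(vector, projection):
    """Map a term-frequency vector to the lower-dimensional space."""
    out = [0.0] * len(projection[0])
    for value, row in zip(vector, projection):
        if value:  # zero counts contribute nothing, so skip them
            for j, weight in enumerate(row):
                out[j] += value * weight
    return out


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "random projection compresses term frequency vectors",
    "k-means clustering groups similar documents",
    "keyword extraction ranks terms by frequency",
]
vocabulary = sorted({word for doc in docs for word in doc.lower().split()})
tf = [term_frequencies(doc, vocabulary) for doc in docs]

projection = make_projection(len(vocabulary), n_components=4)
compressed = [project(vector, projection) for vector in tf]

print(len(vocabulary), "terms reduced to", len(compressed[0]), "dimensions")
```

Each document is now represented by 4 numbers instead of one count per vocabulary term, and a clustering algorithm such as k-means could be run on `compressed` in place of `tf`. On a corpus this small the compression has no practical benefit; the sketch only shows the mechanics.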

Abstract Format

html

Language

English

Format

Electronic

Electronic File Format

MS WORD

Accession Number

CDTG004899

Shelf Location

Archives, The Learning Commons, 12F, Henry Sy Sr. Hall

Physical Description

1 computer optical disc, 4 3/4 in.

Keywords

Text processing (Computer science); Dimension reduction (Statistics); Document clustering

Recommended Citation

Dy, J. S. (2011). Keyword extraction for very high dimensional datasets using random projection as key input representation scheme. Retrieved from https://animorepository.dlsu.edu.ph/etd_masteral/6649

Embargo Period

4-18-2022
