ABC automatic blog categorizer using K-means algorithm
Date of Publication
2009
Document Type
Bachelor's Thesis
Degree Name
Bachelor of Science in Computer Science
Subject Categories
Computer Sciences
College
College of Computer Studies
Department/Unit
Computer Science
Thesis Adviser
Paul Salvador Inventado
Defense Panel Member
Charibeth K. Inventado
Merlin C. Suarez
Abstract/Summary
Many web logs are published daily across the World Wide Web. One reason blogs are popular is that they are free. According to a 2005 survey, there were around 60 million blogs on the internet (Riley, 2005). As the number of blogs grows each day, it becomes harder to find a specific blog. Organizing blogs can aid search, because each blog gains an identity based on its subject, making it easier to distinguish one concept from another. For example, a user may search for blogs containing the word "freestyle" in the sense of a swimming stroke; blogs about other senses, such as freestyle dance, can be filtered out by specifying the intended category. This research aims to address the problem by developing software that sorts blogs into their respective categories. Most document categorization software assigns documents to pre-defined categories. This research, however, aims to categorize blogs automatically based on their content, without using pre-defined categories. Over the course of the research, the proponents learned that the result of automated blog categorization depends heavily on the input provided to the system. For this dataset, using words alone, without a lexical analyzer or some other means of understanding the words, it is difficult to produce clusters with general topics, because the same words or terms may have different meanings.
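The approach summarized in the abstract (grouping blogs by word content with K-means, with no pre-defined categories) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the four toy posts, the length-normalized term-frequency weighting, and the hand-picked starting centroids (`init_indices`) are all assumptions made for illustration.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase a post and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(posts):
    """Turn each post into a length-normalized term-frequency vector."""
    counts = [Counter(tokenize(p)) for p in posts]
    vocab = sorted(set().union(*counts))
    vectors = []
    for c in counts:
        v = [float(c[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, init_indices, max_iter=100):
    """Plain K-means (Lloyd's algorithm).

    init_indices is a hypothetical parameter: it names the posts used as
    starting centroids so that the example is deterministic.
    """
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each post goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy posts (invented for illustration, not from the thesis dataset).
posts = [
    "freestyle swimming stroke pool lap",
    "pool swimming stroke training lap",
    "freestyle dance hiphop music battle",
    "dance music battle crew hiphop",
]
vocab, vectors = vectorize(posts)
labels = kmeans(vectors, k=2, init_indices=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

In this toy run the two posts containing "freestyle" end up in different clusters only because their surrounding words differ. With real blog text the surrounding vocabulary is far noisier, which is consistent with the abstract's observation that words alone make general-topic clusters hard to obtain.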
Abstract Format
html
Language
English
Format
Accession Number
TU19848
Shelf Location
Archives, The Learning Commons, 12F, Henry Sy Sr. Hall
Physical Description
1 v. (various foliations) : illustrations (some colored) ; 28 cm.
Keywords
Blogs; Blogs--Social aspects; Online journalism
Recommended Citation
Agustin, O. Y., Cruz, J. S., Flores, A. M., & Luna, C. G. (2009). ABC automatic blog categorizer using K-means algorithm. Retrieved from https://animorepository.dlsu.edu.ph/etd_bachelors/11139