Date of Publication

3-1-2019

Document Type

Master's Thesis

Degree Name

Master of Science in Computer Science

Subject Categories

Computer Sciences

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Adviser

Arnulfo P. Azcarraga

Defense Panel Chair

Conrado D. Ruiz, Jr.

Defense Panel Member

Joel P. Ilao
Arnulfo P. Azcarraga

Abstract/Summary

Despite recent improvements, the arbitrary sizes of objects still impede the predictive ability of object detectors. Recent solutions combine feature maps of different receptive fields to detect multi-scale objects. However, these methods have large computational costs, resulting in slower inference times that are impractical for real-time applications. Moreover, fusion methods that depend on large networks with many skip connections require a larger memory footprint, prohibiting their use on devices with limited memory. In this paper, we propose a novel, simpler fusion method that integrates multiple feature maps using a single concatenation operation. Our method can flexibly adapt to any base network, allowing performance to be tailored to different computational requirements. Our approach achieves 81.7% mAP at 41 FPS on the PASCAL VOC dataset using ResNet-50 as the base network, which is superior in terms of both speed and mAP to several state-of-the-art baselines that use larger base networks.
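The single-concatenation fusion described above can be illustrated with a minimal NumPy sketch: feature maps from several backbone stages are upsampled to a common spatial resolution and then merged with one concatenation along the channel axis. All names, shapes, and the nearest-neighbor upsampling choice here are illustrative assumptions, not the thesis's actual architecture.

```python
import numpy as np

def nearest_upsample(fmap, factor):
    """Nearest-neighbor upsampling of a (C, H, W) feature map by an integer factor."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_feature_maps(feature_maps, target_hw):
    """Upsample each (C, H, W) map to target_hw, then fuse with a single
    concatenation along the channel axis (a sketch of the idea, not the
    thesis's exact method)."""
    th, tw = target_hw
    resized = []
    for fm in feature_maps:
        c, h, w = fm.shape
        # assume the target size is an integer multiple of each map's size
        assert th % h == 0 and tw % w == 0 and th // h == tw // w
        resized.append(nearest_upsample(fm, th // h))
    return np.concatenate(resized, axis=0)

# hypothetical outputs of three backbone stages at decreasing resolution
f1 = np.random.rand(4, 8, 8)
f2 = np.random.rand(8, 4, 4)
f3 = np.random.rand(16, 2, 2)

fused = fuse_feature_maps([f1, f2, f3], (8, 8))
print(fused.shape)  # (28, 8, 8): channels stacked, one shared spatial grid
```

The single concatenation avoids the many per-level skip connections of heavier fusion designs, which is consistent with the abstract's emphasis on speed and memory footprint.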

Abstract Format

html

Language

English

Format

Electronic

Accession Number

CDTG008037

Keywords

Image converters; Neural networks (Computer science)

Upload Full Text

wf_yes

Embargo Period

2-28-2023
