Person detection and tracking from aerial videos

Date of Publication


Document Type

Bachelor's Thesis

Degree Name

Bachelor of Science in Computer Science

Subject Categories

Computer Sciences


College of Computer Studies


Computer Science

Thesis Adviser

Joel P. Ilao

Defense Panel Member

Roger Luis T. Uy

Macario G. Cordel, II

Ana Franchesca B. Laguna


In this study, a system was developed to detect and locate humans in video recordings collected by an unmanned aerial vehicle (UAV). Several machine vision algorithms were implemented. Lucas-Kanade optical flow computation and YUV color space conversion were used for feature point selection, and Mean-Shift clustering in feature space was used for image segmentation. Histogram of Oriented Gradients (HOG), Haar-like, and Speeded Up Robust Features (SURF) features were used for feature extraction, while Support Vector Machines (SVM) and Adaptive Boosting (AdaBoost) were used for training and classification. A Kalman filter was employed to track humans.
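As a rough illustration of the Mean-Shift clustering step, the sketch below groups 2D points using a flat kernel and a fixed bandwidth. It is not the thesis implementation: the function names `mean_shift` and `label_modes`, the 2D feature space, the flat kernel, and the sample points are all assumptions made for this example.

```python
import math

def mean_shift(points, bandwidth, max_iter=50, tol=1e-3):
    """Move each point to the mean of its neighbours until it stops moving."""
    modes = []
    for x, y in points:
        for _ in range(max_iter):
            # Flat kernel: average every sample within `bandwidth` of the estimate.
            near = [(qx, qy) for qx, qy in points
                    if math.hypot(qx - x, qy - y) <= bandwidth]
            nx = sum(q[0] for q in near) / len(near)
            ny = sum(q[1] for q in near) / len(near)
            moved = math.hypot(nx - x, ny - y)
            x, y = nx, ny
            if moved < tol:
                break
        modes.append((x, y))
    return modes

def label_modes(modes, bandwidth):
    """Assign one label to all points whose modes lie close together."""
    centers, labels = [], []
    for mx, my in modes:
        for i, (cx, cy) in enumerate(centers):
            if math.hypot(mx - cx, my - cy) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            # No existing center is close enough, so start a new cluster.
            centers.append((mx, my))
            labels.append(len(centers) - 1)
    return centers, labels

# Hypothetical feature points forming two well-separated groups.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11)]
centers, labels = label_modes(mean_shift(points, bandwidth=3.0), bandwidth=3.0)
```

The `bandwidth` argument plays the role of the constant bandwidth input mentioned in the abstract: a value that is too small splits one object into several clusters, while a value that is too large merges neighbouring objects.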

The Person Detection and Tracking from Aerial Videos (PDTrAV) system was tested on videos taken under different weather conditions and was able to detect and track people in them. System tests, however, yielded more false positives and false negatives than true positives. A minimum threshold on the ratio of the blob area to its bounding box area was enforced to reduce the number of false positives, which were attributed to diagonal lines or edges that were detected as people. Haar-like features attained the best recall at 32.3925%, while HOG features attained the best precision at 24.5945%.
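The area-ratio test can be sketched as follows. A thin diagonal edge covers only a small fraction of its bounding box, so it falls below the threshold, whereas a compact blob fills most of its box. The function names, the threshold value `min_extent=0.4`, and the example blob sizes are hypothetical; the abstract does not state the values actually used.

```python
def extent(blob_area, bbox_width, bbox_height):
    """Fraction of the bounding box that the blob actually covers (0 to 1)."""
    return blob_area / float(bbox_width * bbox_height)

def keep_blob(blob_area, bbox_width, bbox_height, min_extent=0.4):
    # min_extent is a hypothetical threshold chosen for illustration only.
    return extent(blob_area, bbox_width, bbox_height) >= min_extent

# A thin diagonal edge: about 30 pixels spread across a 30x30 bounding box.
diagonal = keep_blob(blob_area=30, bbox_width=30, bbox_height=30)

# A compact, person-like blob: 300 pixels inside a 15x30 bounding box.
compact = keep_blob(blob_area=300, bbox_width=15, bbox_height=30)
```

Under these assumed numbers the diagonal edge is rejected and the compact blob is kept.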

While further improvements are needed to make the system suitable for real-world applications, this study has shown that it is possible to detect humans from an aerial perspective using fewer and more cost-effective resources. The system was able to detect humans in aerial videos taken using an ordinary digital camera.

Further improvements to the Mean-Shift clustering, the Kalman filter, and the feature extraction methods are recommended to improve system performance. For the Mean-Shift algorithm, bandwidth estimators can be used, since the system used only a constant bandwidth input. Since the Kalman filter used in the system assumes a constant blob velocity when estimating blob locations, an adaptive Kalman filter could give more accurate estimates when the velocity of the blobs or of the camera becomes time-varying. For HOG and Haar feature extraction, a larger and more varied dataset should be used to provide more features to compare against. For SURF extraction, the size of the dataset, the number of clusters in the visual codebook, the number of octaves and scale levels, and the threshold value should be varied to determine the most suitable values for a given application.
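To make the constant-velocity assumption concrete, the sketch below implements a minimal Kalman filter for one coordinate of a blob centroid, with state (position, velocity). It is an illustration only, not the thesis code: the class name, the noise parameters `q` and `r`, the simplified process noise `Q = q * I`, and the synthetic measurements are all assumptions. A 2D tracker could run one such filter per image axis.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and position-only measurements."""

    def __init__(self, pos, vel=0.0, dt=1.0, q=1e-2, r=1.0):
        self.x = [pos, vel]                   # state estimate
        self.P = [[1.0, 0.0], [0.0, 1.0]]     # state covariance
        self.dt, self.q, self.r = dt, q, r    # time step, process and measurement noise

    def predict(self):
        dt = self.dt
        p, v = self.x
        # Constant-velocity motion model: position advances by dt * velocity.
        self.x = [p + dt * v, v]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.x[0]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r              # innovation covariance for H = [1, 0]
        k0, k1 = a / s, c / s       # Kalman gain K = P H^T / s
        y = z - self.x[0]           # innovation: measured minus predicted position
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.x[0]

# Synthetic, noise-free centroid coordinate moving 2 pixels per frame.
kf = ConstantVelocityKalman1D(pos=0.0)
for t in range(1, 31):
    kf.predict()
    kf.update(2.0 * t)
estimated_velocity = kf.x[1]
```

Because the motion model has no acceleration term, the estimate lags whenever the true velocity changes, which is the limitation that the recommended adaptive Kalman filter is meant to address.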

Abstract Format






Accession Number


Shelf Location

Archives, The Learning Commons, 12F, Henry Sy Sr. Hall

Physical Description

1 v. (various foliations) : illustrations ; 28 cm.


Drone aircraft in remote sensing; Aerial videography
