Person detection and tracking from aerial videos
Date of Publication
Bachelor of Science in Computer Science
College of Computer Studies
Joel P. Ilao
Defense Panel Member
Roger Luis T. Uy
Macario G. Cordel, II
Ana Franchesca B. Laguna
In this study, a system that can detect and locate humans in video recordings collected by an unmanned aerial vehicle (UAV) was developed. Several machine vision algorithms were implemented. Lucas-Kanade optical flow computation and YUV color space conversion were used for feature point selection, and Mean-Shift clustering in feature space was used for image segmentation. Histogram of Oriented Gradients (HOG), Haar-like features, and Speeded Up Robust Features (SURF) were used for feature extraction, while Support Vector Machines (SVM) and Adaptive Boosting (AdaBoost) were used for training and classification. A Kalman filter was employed for tracking humans.
The Person Detection and Tracking from Aerial Videos (PDTrAV) system was tested on videos taken under different weather conditions and was able to detect and track people in them. System tests, however, indicated more false positives and false negatives than true positives. A minimum threshold for the ratio of the blob area to its bounding box area was enforced in order to reduce the number of false positives, which were attributed to diagonal lines or edges that were detected as people. The use of Haar-like features attained the best recall at 32.3925%, while the use of HOG features attained the best precision at 24.5945%.
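The area-ratio filter described above can be illustrated with a minimal sketch. It assumes each blob is summarized by its pixel area and bounding-box width and height. The function names, the dictionary keys, and the threshold value of 0.4 are hypothetical and are not taken from the study. A thin diagonal edge fills only a small fraction of its bounding box, so it falls below the threshold, while a compact blob passes.

```python
def extent(blob_area, bbox_width, bbox_height):
    """Return the ratio of blob pixel area to bounding-box area."""
    bbox_area = bbox_width * bbox_height
    if bbox_area <= 0:
        return 0.0  # degenerate box: treat as an empty blob
    return blob_area / bbox_area


def filter_blobs(blobs, min_extent=0.4):
    """Keep blobs whose extent meets the threshold.

    min_extent=0.4 is an illustrative value, not the one used in the study.
    """
    return [b for b in blobs if extent(b["area"], b["w"], b["h"]) >= min_extent]


# Hypothetical blobs: a compact region and a thin diagonal streak.
blobs = [
    {"id": "compact", "area": 600, "w": 20, "h": 40},
    {"id": "diagonal", "area": 120, "w": 40, "h": 40},
]
kept = filter_blobs(blobs)
```

In practice the threshold would have to be tuned on labeled footage, since a value that is too high may also reject partially occluded people.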
While improvements can be made to make the system suitable for real-world applications, this study has shown that it is possible to detect humans from an aerial perspective using fewer and more cost-effective resources. The system was able to detect humans in aerial videos taken using an ordinary digital camera.
Further improvements to the Mean-Shift clustering, the Kalman filter, and the feature extraction methods are recommended for better system performance. For the Mean-Shift algorithm, bandwidth estimators can be used, since the system only used a constant bandwidth input. Since the Kalman filter used in the system can only estimate the constant velocity and location of the blobs, an adaptive Kalman filter can be used to obtain more accurate estimates when the velocity of the blobs or of the camera becomes time-varying. For HOG and Haar feature extraction, a larger and more varied dataset should be used so that more features are available for comparison. For SURF extraction, the size of the dataset, the number of clusters in the visual codebook, the number of octaves and scale levels, and the threshold value should be varied in order to determine the most suitable values for a given application.
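To make the constant-velocity limitation concrete, the following is a minimal sketch of a constant-velocity Kalman filter for one blob centroid, written with NumPy. It is not the study's implementation. The class name, the state layout `[x, y, vx, vy]`, and the noise values `process_var` and `meas_var` are assumptions chosen for illustration. Because the transition matrix `F` has no acceleration term, the filter lags whenever the blob or the camera changes speed, which is the case an adaptive filter is meant to handle.

```python
import numpy as np


class ConstantVelocityKalman:
    """Tracks a 2-D centroid with state [x, y, vx, vy] (illustrative only)."""

    def __init__(self, x, y, dt=1.0, process_var=1e-4, meas_var=1.0):
        self.state = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4) * 10.0  # initial uncertainty (assumed value)
        # Constant-velocity transition: position advances by velocity * dt.
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        # Only the centroid position is measured.
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * process_var  # process noise (assumed value)
        self.R = np.eye(2) * meas_var     # measurement noise (assumed value)

    def predict(self):
        """Propagate the state one step and return the predicted position."""
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2]

    def update(self, zx, zy):
        """Correct the state with a measured centroid (zx, zy)."""
        z = np.array([zx, zy], dtype=float)
        innovation = z - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.state[:2]


# Hypothetical blob moving 2 px/frame in x and 1 px/frame in y.
kf = ConstantVelocityKalman(0.0, 0.0)
for t in range(1, 21):
    kf.predict()
    kf.update(2.0 * t, 1.0 * t)
vx, vy = kf.state[2], kf.state[3]
next_x, next_y = kf.predict()
```

With noise-free, constant-velocity measurements the velocity estimate approaches the true values. An adaptive variant would typically adjust `Q` or `R` online rather than keeping them fixed as done here.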
Archives, The Learning Commons, 12F, Henry Sy Sr. Hall
1 v. (various foliations) : illustrations ; 28 cm.
Drone aircraft in remote sensing; Aerial videography
Garcia, A. B., Rufino, M. V., Sangalang, L. A., & Teodoro, J. R. (2014). Person detection and tracking from aerial videos. Retrieved from https://animorepository.dlsu.edu.ph/etd_bachelors/11042