Date of Publication

6-2022

Document Type

Bachelor's Thesis

Degree Name

Bachelor of Science in Electronics and Communications Engineering

Subject Categories

Electrical and Computer Engineering

College

Gokongwei College of Engineering

Department/Unit

Electronics and Communications Engineering

Thesis Advisor

Melvin K. Cabatuan

Defense Panel Chair

John Anthony Jose

Defense Panel Member

Edwin Sybingco
Melchizedek Alipio

Abstract/Summary

Unlike ballgame-type sports, scoring in boxing requires a keen eye and depends on the judges' perspective, and many factors can bias how a match is scored. To reduce the instances of declaring the wrong victor, this study develops a monitoring tool that scores boxing matches using deep learning algorithms. The tool uses programs that view and classify the gestures of the contending boxers and generate a scorecard, which is statistically evaluated against ground truth data verified by actual boxing judges. The process is broken down into four parts: boxer detection, dataset curation, multi-model scoring, and system testing. Boxer detection uses a YOLOv5-based model trained on a 1200-image dataset, reaching a mean average precision (mAP) of 0.82. Dataset curation takes three approaches: a video-level split and an image-level split of Olympic boxing videos, which are used to train, validate, and test the models, and a live footage dataset of three full recorded matches, which is used for system testing. The algorithms initially proposed for multi-model scoring are Pose Estimation, Action Recognition, and two Convolutional Neural Networks (CNNs). However, testing shows that the CNN models are the most effective. The CNN Custom Model obtains an 80% scoring accuracy and an 86.67% winner prediction rate on the Olympic testing dataset, and 77.78% and 88.89%, respectively, on live footage. The EfficientNet architecture obtains a 63.33% scoring accuracy and a 70% winner prediction rate on the Olympic footage, and 77.78% and 100%, respectively, on live footage.
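
The abstract reports two metrics, scoring accuracy and winner prediction rate, without defining them. The sketch below shows one plausible reading, not the thesis's actual evaluation code. It assumes that scoring accuracy is the share of rounds whose predicted score exactly matches the judges' score, and that winner prediction rate is the share of rounds whose predicted winner matches the judges' winner. The function names, the (red, blue) score tuples, the 10-point-style values, and the sample data are all hypothetical.

```python
from typing import List, Tuple

# Assumption: a round score is (red_points, blue_points), e.g. (10, 9).
Score = Tuple[int, int]


def winner(score: Score) -> str:
    """Return 'red', 'blue', or 'draw' for a single round score."""
    red, blue = score
    if red > blue:
        return "red"
    if blue > red:
        return "blue"
    return "draw"


def _check(pred: List[Score], truth: List[Score]) -> None:
    """Reject empty or mismatched inputs before computing a rate."""
    if not truth or len(pred) != len(truth):
        raise ValueError("pred and truth must be non-empty and equally long")


def scoring_accuracy(pred: List[Score], truth: List[Score]) -> float:
    """Fraction of rounds whose predicted score equals the judges' score."""
    _check(pred, truth)
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def winner_prediction_rate(pred: List[Score], truth: List[Score]) -> float:
    """Fraction of rounds whose predicted winner equals the judges' winner."""
    _check(pred, truth)
    return sum(winner(p) == winner(t) for p, t in zip(pred, truth)) / len(truth)


# Hypothetical three-round example (not data from the thesis).
predicted = [(10, 9), (10, 8), (9, 10)]
judged = [(10, 9), (10, 9), (9, 10)]

print(f"scoring accuracy: {scoring_accuracy(predicted, judged):.2%}")
print(f"winner prediction rate: {winner_prediction_rate(predicted, judged):.2%}")
```

Under these assumed definitions, a model can name the right winner of a round while missing the exact margin, which would explain why the winner prediction rate is never lower than the scoring accuracy in the reported results.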

Language

English

Keywords

Deep learning (Machine learning); Gesture recognition (Computer science); Computer vision; Boxing

Embargo Period

7-13-2023
