QCKer: An x86-AVX/AVX2 implementation of Q-gram counting filter for DNA sequence alignment

College

College of Computer Studies

Department/Unit

Computer Technology

Document Type

Article

Source Title

ACM International Conference Proceeding Series

First Page

49

Last Page

54

Publication Date

12-19-2019

Abstract

The paper presents an implementation of the q-gram counting filter using x86-AVX/AVX2 SIMD instructions. The research yields three novel findings. First, to eliminate inconsistency between the theoretical and experimental results, synthetic reads are generated using the DNA character "T" only, since generated synthetic reads otherwise create a random condition in which the number of seed instances is variable and thus cannot be predicted. Second, the presence and absence of various SIMD parameters, namely prefetching, multithreading, and AVX instruction sets, are tested to determine the speedup factor. Results show a 2% speedup with prefetching, a 2.7% speedup with AVX instruction sets, a 100.41% speedup with multithreading, and a 112.25% speedup when all parameters are used. This shows that multithreading has the biggest effect among these parameters. Third, the x86-AVX implementation is compared with Razers3, an existing read mapper that uses a q-gram counting filter. In terms of filtering only, the x86-AVX implementation is 12x faster than Razers3 for a small seed size of 4. However, Razers3 outperforms the x86-AVX implementation for a longer seed (i.e., a seed size of 12). This is attributed to Razers3 being optimized for q-grams of length 12 or higher. From these findings, real datasets are recommended over synthetic datasets, as is an implementation using a multithreaded approach. Future work could compare the multithreaded implementation with an FPGA implementation. © 2019 ACM.
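
For readers unfamiliar with the technique named in the abstract, the following is a minimal scalar sketch of a q-gram counting filter, not the paper's AVX/AVX2 implementation. It assumes the standard q-gram lemma threshold (a read of length n within e errors of a window shares at least n + 1 - (e + 1)q q-grams with it); all function names are hypothetical and chosen only for illustration. It also shows why an all-"T" read makes the number of seed instances predictable: every one of its n - q + 1 q-grams is the same string.

```python
from collections import Counter


def qgrams(seq, q):
    """Return all overlapping substrings of length q (the 'seeds')."""
    return [seq[i:i + q] for i in range(len(seq) - q + 1)]


def qgram_threshold(n, q, e):
    """q-gram lemma: minimum shared q-grams for a read of length n with <= e errors."""
    return max(0, n + 1 - (e + 1) * q)


def count_shared_qgrams(read, window, q):
    """Count q-grams common to read and window (multiset intersection).

    Assumption: a plain multiset comparison stands in for the indexed,
    sliding-window counting that a real read mapper would use.
    """
    read_counts = Counter(qgrams(read, q))
    window_counts = Counter(qgrams(window, q))
    return sum(min(c, window_counts[g]) for g, c in read_counts.items())


def passes_filter(read, window, q, e):
    """Keep the window as a candidate only if enough q-grams are shared."""
    return count_shared_qgrams(read, window, q) >= qgram_threshold(len(read), q, e)


if __name__ == "__main__":
    # An all-"T" read of length 8 with q = 4 has exactly 8 - 4 + 1 = 5 identical seeds.
    print(Counter(qgrams("TTTTTTTT", 4)))
    # One substitution still leaves enough shared q-grams to pass with e = 1.
    print(passes_filter("ACGTACGT", "ACGTTCGT", q=4, e=1))
```

The filter only discards windows that cannot contain a match within e errors; windows that pass still need verification by a full alignment step.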

Digital Object Identifier (DOI)

10.1145/3383783.3383806

Disciplines

Computer Sciences

Keywords

Simultaneous multithreading processors; Nucleotide sequence; Gene mapping
