Master's Thesis

BLAST-compatible non-heuristic biological sequence alignment on heterogeneous systems

Biological sequence alignment is widely used in computational biology and bioinformatics, and both heuristic and non-heuristic algorithms have been implemented for it. Recent advances in high-performance systems have motivated researchers to propose a variety of accelerated implementations of the Smith-Waterman algorithm, a time-consuming non-heuristic algorithm. In this thesis we propose a BLAST-compatible non-heuristic local sequence alignment that combines the statistics and functionality of the de facto standard heuristic approach, BLAST, with the accuracy of Smith-Waterman. The resulting product, BCSW (BLAST Compatible Smith-Waterman), supports affine gap penalties, traceback, and multiple alignments for a pair of sequences. BCSW is accelerated on multicore, GPU, and CPU-GPU heterogeneous systems, and its performance is evaluated on each. To achieve higher performance, the implementation is enhanced with SIMD/SSE vector instructions, and a workload balancing scheme is used to maximize resource utilization in the heterogeneous system. In our experiments, the SSE-enhanced heterogeneous computation model of BCSW showed the best performance among all the high-performance versions we developed, with a speedup of 112x over the serial version. In addition, we integrate existing high-performance implementations of Smith-Waterman into BCSW and demonstrate that BCSW is suitable for such integration.
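For readers unfamiliar with the underlying recurrence, the following is a minimal serial sketch of Smith-Waterman local alignment scoring with affine gaps (the Gotoh formulation). It is not the BCSW implementation: the function name `sw_affine_score`, the simple match/mismatch scoring, and the default penalty values are assumptions chosen for illustration, and traceback, BLAST statistics, and all SIMD/GPU acceleration are omitted.

```python
def sw_affine_score(a, b, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Best local alignment score of strings a and b with affine gaps.

    Illustrative sketch only. A gap of length k costs
    gap_open + (k - 1) * gap_extend. All parameter names and default
    values are hypothetical, not taken from BCSW.
    """
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0] * (m + 1)        # H: best score ending at (i-1, j)
    f_prev = [neg_inf] * (m + 1)  # F: best score ending in a gap along b's column
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # E: best score ending in a gap along a's row
        for j in range(1, m + 1):
            # Either open a new gap from H or extend an existing one.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores are clamped at zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another. That independence is the general property that vectorized and GPU implementations of Smith-Waterman exploit; the sketch keeps only two rows in memory because it returns the score without traceback.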