The MIIS Eprints Archive

A two-base encoded DNA sequence alignment problem in computational biology

Chang, Jen-Mei and Drakes, Chiaka and Huang, Erya and Langat, Deidrey and McGrath, Joseph and Morabito, Mark and Pacheco, Jose and Rodriguez, Nancy and Salazar, Daniel and Vemuri, Rao and Vu, Man and Wadhar, Hem and Wu, Qin (2009) A two-base encoded DNA sequence alignment problem in computational biology. [Study Group Report]

Abstract

The recent introduction of instruments capable of producing millions of DNA sequence reads in a single run is rapidly changing the landscape of genetics. The primary objective of the "sequence alignment" problem is to search for a new algorithm that facilitates the use of two-base encoded data for large-scale re-sequencing projects. This algorithm should be able to perform local sequence alignment as well as error detection and correction in a reliable and systematic manner, enabling the direct comparison of encoded DNA sequence reads to a candidate reference DNA sequence.

We first briefly review two well-known sequence alignment approaches and provide a rudimentary improvement for implementation on parallel systems. We then carefully examine the SOLiD™ System, a unique sequencing technique that can be implemented, and follow with the results from global and local sequence alignment.
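To make the two-base (color space) encoding concrete, here is a minimal Python sketch. It assumes the commonly described di-base scheme in which each pair of adjacent bases maps to one of four colors (0 to 3), which is equivalent to XOR-ing 2-bit base codes with A=0, C=1, G=2, T=3. The function names are hypothetical and are not taken from the report.

```python
# Illustrative sketch of two-base (color space) encoding.
# Assumption: colors are the XOR of 2-bit base codes (A=0, C=1, G=2, T=3).
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_color_space(seq):
    """Return the list of colors for each adjacent base pair in seq."""
    codes = [BASE_TO_BITS[base] for base in seq.upper()]
    return [left ^ right for left, right in zip(codes, codes[1:])]


def decode_color_space(first_base, colors):
    """Rebuild the base sequence from a known first base and its colors."""
    code = BASE_TO_BITS[first_base.upper()]
    bases = [BITS_TO_BASE[code]]
    for color in colors:
        code ^= color  # each color determines the next base from the previous one
        bases.append(BITS_TO_BASE[code])
    return "".join(bases)
```

Because every base contributes to two consecutive colors, a single wrong color changes every base decoded after it. This is one reason it can be preferable to compare reads and reference in color space rather than decoding the reads first.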

In this report, the team presents an explanation of the algorithms for color space sequence data from the high-throughput re-sequencing technology and a theoretical parallel approach to the dynamic programming method for global and local alignment. The combination of the di-base approach and dynamic programming offers a possible approach to large-scale re-sequencing projects. We anticipate that distributed computing will be the next-generation engine for large-scale problems such as this one.
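As an illustration of dynamic programming for local alignment, the sketch below computes a Smith-Waterman style score. It is not the report's implementation: the function name, the scoring values (match, mismatch, gap) and the linear gap penalty are all assumptions chosen for brevity. The inputs may be strings of bases or lists of colors.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Assumed scoring: linear gap penalty, illustrative default values.
    Only two rows of the dynamic programming table are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + step,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its upper, left, and upper-left neighbours, so all cells on the same anti-diagonal are mutually independent. That property is the usual starting point for parallelizing this recurrence.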

Item Type: Study Group Report
Problem Sectors: Medical and pharmaceutical
Study Groups: Claremont Colleges Math-in-Industry Workshop > Claremont Colleges Math-in-Industry Workshop 2009
Company Name: National Institute of Genomic Medicine, México
ID Code: 285
Deposited By: Dr Kamel Bentahar
Deposited On: 11 Mar 2010 15:41
Last Modified: 29 May 2015 19:54
