eprintid: 285
rev_number: 35
eprint_status: archive
userid: 7
dir: disk0/00/00/02/85
datestamp: 2010-03-11 15:41:50
lastmod: 2015-05-29 19:54:28
status_changed: 2010-03-11 15:41:50
type: report
metadata_visibility: show
item_issues_count: 0
creators_name: Chang, Jen-Mei
creators_name: Drakes, Chiaka
creators_name: Huang, Erya
creators_name: Langat, Deidrey
creators_name: McGrath, Joseph
creators_name: Morabito, Mark
creators_name: Pacheco, Jose
creators_name: Rodriguez, Nancy
creators_name: Salazar, Daniel
creators_name: Vemuri, Rao
creators_name: Vu, Man
creators_name: Wadhar, Hem
creators_name: Wu, Qin
corp_creators: Claudia Rangel
corp_creators: Pat McLaughlin
title: A two-base encoded DNA sequence alignment problem in computational biology
ispublished: pub
subjects: medicine
studygroups: ccmi2009
companyname: National Institute of Genomic Medicine, México
full_text_status: public
abstract: The recent introduction of instruments capable of producing millions of DNA sequence reads in a single run is rapidly changing the landscape of genetics. The primary objective of the "sequence alignment" problem is to search for a new algorithm that facilitates the use of two-base encoded data for large-scale re-sequencing projects. This algorithm should be able to perform local sequence alignment as well as error detection and correction in a reliable and systematic manner, enabling the direct comparison of encoded DNA sequence reads to a candidate reference DNA sequence. We first briefly review two well-known sequence alignment approaches and provide a rudimentary improvement for implementation on parallel systems. We then carefully examine a unique sequencing technique known as the SOLiD™ System that can be implemented, followed by the results from global and local sequence alignment.
In this report, the team presents an explanation of the algorithms for color space sequence data from the high-throughput re-sequencing technology and a theoretical parallel approach to the dynamic programming method for global and local alignment. The combination of the di-base approach and dynamic programming provides a possible viewpoint for large-scale re-sequencing projects. We anticipate that distributed computing will be the next-generation engine for large-scale problems such as this.
date: 2009
related_url_url: http://ccms.claremont.edu/mini/problems/national-institute-genomic-medicine
citation: Chang, Jen-Mei and Drakes, Chiaka and Huang, Erya and Langat, Deidrey and McGrath, Joseph and Morabito, Mark and Pacheco, Jose and Rodriguez, Nancy and Salazar, Daniel and Vemuri, Rao and Vu, Man and Wadhar, Hem and Wu, Qin (2009) A two-base encoded DNA sequence alignment problem in computational biology. [Study Group Report]
document_url: http://miis.maths.ox.ac.uk/miis/285/1/4_National_Institute_of_Genomic_Medicine.pdf