eprintid: 748
rev_number: 7
eprint_status: archive
userid: 17
dir: disk0/00/00/07/48
datestamp: 2019-01-21 22:43:33
lastmod: 2019-01-21 22:43:33
status_changed: 2019-01-21 22:43:33
type: report
metadata_visibility: show
creators_name: Croci, Matteo
creators_name: Morawiecki, P.
creators_name: Prater, John
creators_name: Sulzer, Valentin
creators_name: Theil, Florian
corp_creators: Alexandra Harvey
corp_creators: Emily Matthews
title: Classification of Two-Dimensional Gas Chromatography Data
ispublished: pub
subjects: aerodef
studygroups: esgi130
companyname: DSTL
full_text_status: public
abstract: Gas chromatography (GC) is a popular tool for chemical analysis. Some samples are so complex that a single column does not have enough power to separate all of the analytes. In such cases a higher-resolution GC method, known as comprehensive two-dimensional gas chromatography (GCxGC), is used. DSTL want to be able to use data from GCxGC to attribute samples to a particular region or cultivar. However, the nature of the data means that several difficulties must be overcome before this is possible: noise from the sample, peak misalignment, and a small number of samples. In this report, we investigate several methods to overcome these difficulties, and then classify the data. We are very successful at telling blanks apart from seeds, but have only limited success in distinguishing between seeds. The method that shows the most promise is k-Nearest Neighbours classification by Wasserstein distance. However, this is still quite sensitive to the noise created by the solvent in the sample. We therefore suggest that more blank runs be obtained, so that the ‘ground truth’ behaviour of the solvent is better understood, allowing us to remove the effect of the solvent from the seed data. We also hope that the methods explored here will be more successful on the full raw data than they were on the limited ‘peaks’ data available to us for the purpose of this study.
date: 2017
date_type: completed
citation: Croci, Matteo and Morawiecki, P. and Prater, John and Sulzer, Valentin and Theil, Florian (2017) Classification of Two-Dimensional Gas Chromatography Data. [Study Group Report]
document_url: http://miis.maths.ox.ac.uk/miis/748/1/DSTL_StudyGroupReport_v1.pdf