Title: Detecting Modified Bases in MinION Data
Supervisor: doc. Mgr. Tomáš Vinař, PhD.
Keywords: MinION, DNA methylation, anomaly detection, deep learning, autoencoders
Abstract: The goal of this master's thesis is to computationally identify modified DNA bases from raw MinION data. The MinION is a portable DNA sequencing device that does not require DNA amplification during sample preparation. Consequently, DNA modifications remain present in the DNA strand, which is sequenced as it passes through the nanopore. Modified bases cause shifts in the measured signal, which can later be identified computationally. Current tools for identifying modified bases from MinION data require a labeled training set composed of modified and non-modified (canonical) bases. Creating such a dataset experimentally is difficult and expensive. In this thesis, we instead take a semi-supervised approach to the problem. We train an autoencoder on a dataset without modifications to learn the characteristics of non-modified bases. We then analyze the reconstruction error of the autoencoder to identify bases that do not conform to the learnt characterization. We focus on DNA methylation, but our approach can be used to detect any DNA modification. Our results show that the reconstruction error of the autoencoders from a single read is not sufficient to differentiate between methylated and unmethylated DNA bases. However, when we aggregate reconstruction errors from multiple reads, the results are more promising: for most of the methylations, ten reads are enough to differentiate between methylated and unmethylated samples.
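The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the autoencoder itself is omitted, the per-read reconstruction errors are synthetic, and the function names (aggregate_error, threshold_from_canonical, is_flagged), the bootstrap procedure, and the 0.99 quantile cutoff are all assumptions made for illustration. The sketch only shows why averaging errors over several reads at one position can separate samples that a single read cannot.

```python
import random
import statistics


def aggregate_error(errors):
    """Mean reconstruction error over the reads covering one position."""
    return statistics.fmean(errors)


def threshold_from_canonical(canonical_errors, reads_per_position,
                             quantile=0.99, trials=2000, seed=0):
    """Estimate a cutoff using only errors from unmodified (canonical) data.

    Hypothetical procedure: resample groups of `reads_per_position` errors,
    average each group, and take an upper quantile of those averages.
    """
    rng = random.Random(seed)
    means = sorted(
        aggregate_error(rng.choices(canonical_errors, k=reads_per_position))
        for _ in range(trials)
    )
    return means[min(int(quantile * trials), trials - 1)]


def is_flagged(errors, threshold):
    """Flag a position whose aggregated error exceeds the canonical cutoff."""
    return aggregate_error(errors) > threshold


# Synthetic stand-ins for squared reconstruction errors (assumption, not real data).
rng = random.Random(1)
canonical = [rng.gauss(0.0, 1.0) ** 2 for _ in range(5000)]
cutoff = threshold_from_canonical(canonical, reads_per_position=10)

# Ten reads at one position drawn from a wider error distribution.
shifted_reads = [rng.gauss(0.0, 1.5) ** 2 for _ in range(10)]
print("cutoff:", cutoff)
print("flagged:", is_flagged(shifted_reads, cutoff))
```

Because the cutoff is derived from canonical data alone, no labeled modified bases are needed, which matches the semi-supervised setting described above; how many reads are needed in practice is an empirical question that this toy example does not answer.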

Thesis files: