Názov:The on-line Viterbi algorithm
Vedúci:Tomáą Vinař, PhD.
Kµúčové slová:hidden Markov models, Viterbi algorithm, information theory, gene finding
Abstrakt:Hidden Markov models (HMMs) are probabilistic models that have been extremely successful in addressing problems in bioinformatics, error-correcting codes, natural language processing, and other important areas. In many of these applications, the Viterbi algorithm is used to find the most probable state path through the model generating sequences that can be extremely long. Known algorithms either use at least $\Omega(n)$ memory in the best case and always find the most probable state path, or use $o(n)$ memory in the worst case, but do not guarantee correct result. We introduce a modification of the Viterbi algorithm which is guaranteed to find the most probable state path and uses anywhere between $O(1)$ and $O(n)$ memory, depending on the model and the input sequence. For a simple class of models, we show that the expected memory needed to process a sequence is $O(\log(n))$. We present results from experiments with more complicated models for the problem of gene finding in bioinformatics that indicate similar expected memory requirements.

Súbory diplomovej práce: