Multiple sequence alignment using a hidden Markov model with an augmented set and its effect on phylogenetic tree accuracy
Main Author: | |
---|---|
Format: | Theses and Dissertations NonPeerReviewed |
Published: |
[Yogyakarta] : Universitas Gadjah Mada
2010
|
Subjects: | |
Online Access: | https://repository.ugm.ac.id/84910/ http://etd.ugm.ac.id/index.php?mod=penelitian_detail&sub=PenelitianDetail&act=view&typ=html&buku_id=45774 |
Institution: | Universitas Gadjah Mada |
Summary: | Multiple sequence alignment (MSA) and phylogenetic tree inference are basic tasks in molecular biology data analysis. The quality of a phylogenetic tree depends on the quality of the MSA. The hidden Markov model (HMM) is a good method for generating an MSA, but for sequences with low similarity it produces a less than optimal MSA. This research performs multiple alignment of low-similarity protein sequences using an HMM, so that the resulting MSA can be used as input to produce a more accurate phylogenetic tree. This is done by building an augmented set, whose parameters are the number of child sequences and the percentage of mutation applied to each child sequence. Two kinds of mutation process are used: the first is based on the BLOSUM 80 substitution matrix, and the second is random mutation. The augmented set is used as input to the HMM to obtain the MSA. The Baum-Welch learning algorithm is used to estimate the HMM parameters, while the Viterbi algorithm is used to build the alignment from the unaligned sequences. The prototype tool is built in the Java programming language using the BioJava library. The accuracy of phylogenetic trees built from an MSA with the augmented set is compared with that of trees built from an MSA without it. Two phylogenetic tree inference methods are used: neighbour joining, carried out with the ClustalX tool, and parsimony, carried out with the Phylip Protpars tool. The data are the amino acid sequences of ribosomes 16S from mitochondria. With the augmented set based on the BLOSUM 80 matrix and the neighbour joining method, the accuracy of the phylogenetic tree increases when the dataset meets these criteria: the number of sequences and the number of HDS (highly divergent sequences) are small enough, and the difference between the maximum and the average sequence length is small enough. With the augmented set and the parsimony method, the accuracy of the phylogenetic tree can increase or decrease arbitrarily. |
---|
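
The summary describes the augmented set only in outline: each original sequence produces a number of child sequences, and a percentage of positions in each child is mutated. A minimal sketch of the random-mutation variant follows. It is an illustration under stated assumptions, not the thesis's Java/BioJava implementation: the function names, the seed parameter, the rounding of the mutation count, and the choice to keep each parent in the set are all hypothetical. The BLOSUM 80 variant is not shown, since the record does not say how substitutions are drawn from the matrix.

```python
import random

# The 20 standard amino acid residues (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(parent, mutation_pct, rng):
    """Return a child of `parent` with mutation_pct percent of its
    positions replaced by a different, randomly chosen residue."""
    # Assumption: the number of mutated positions is rounded to an integer.
    n_mut = round(len(parent) * mutation_pct / 100)
    positions = rng.sample(range(len(parent)), n_mut)
    child = list(parent)
    for pos in positions:
        # Exclude the current residue so every chosen position really changes.
        choices = [aa for aa in AMINO_ACIDS if aa != child[pos]]
        child[pos] = rng.choice(choices)
    return "".join(child)


def build_augmented_set(parents, n_children, mutation_pct, seed=0):
    """Return each parent followed by n_children mutated copies of it.

    Assumption: the parents themselves stay in the augmented set.
    """
    rng = random.Random(seed)
    augmented = []
    for parent in parents:
        augmented.append(parent)
        for _ in range(n_children):
            augmented.append(mutate(parent, mutation_pct, rng))
    return augmented
```

Calling `build_augmented_set(["ACDEFGHIKLMNPQRSTVWY"], 3, 10)` returns four sequences: the parent and three children that each differ from it at two positions. In the workflow the summary describes, such a set would then be passed to the HMM training step in place of the original sequences alone.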