Select the most related sequences in a multiple sequence alignment
Hi all!
I'm looking for a method to quickly select or visualise the most closely related peptide sequences in a list of aligned sequences.
I have a group of 1,500 peptide sequences that I have aligned with multiple sequence alignment software.
I cannot obtain an alignment score (or pairwise identity %, or identical sites %) for each pair of sequences, and obviously I cannot manually align every possible pair (i.e. sequence 1 with sequence 2, 1 with 3, 1 with 4... 1 with 1500, 2 with 3, 2 with 4...) to get a score for each one.
So I'm looking for a way to visualise subgroups of closely related sequences among my 1,500 peptides.
I'm thinking about using a tool to build a phylogenetic tree, but I'm not a pro in this field and I don't know which matrix to choose or how to interpret the tree (do two similar sequences sit under the same node, or do they have similar distances?).
I'm totally lost! If someone is used to working with this kind of tool, his/her explanations would be very helpful.
Re: Select the most related sequences in a multiple sequence alignment
There are several ways to get sequence similarity information from your alignment.
Phylogenetic trees are the most advanced option: the resulting tree shows all the groups at once, and similar sequences end up close to each other in it.
Another way is to build and analyze a distance matrix, which shows the distance (for example, pairwise identity %) between every pair of sequences in the alignment. This gives you what you would get by manually comparing every pair of sequences, and it is usually the first phase of a tree-construction algorithm.
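To make the distance matrix idea concrete, here is a minimal Python sketch that computes pairwise identity % directly from rows of an existing alignment. It is only an illustration, not what any particular tool does internally: the function names (`pairwise_identity`, `identity_matrix`), the toy peptides, and the choice to skip columns where both rows have a gap are all assumptions made for this example.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Percent identity between two rows taken from the same alignment.

    Assumption for this sketch: columns where BOTH rows have a gap ('-')
    are ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Return {(name1, name2): percent identity} for every pair of rows."""
    names = list(aligned)
    matrix = {(n, n): 100.0 for n in names}
    for n1, n2 in combinations(names, 2):
        pid = pairwise_identity(aligned[n1], aligned[n2])
        matrix[(n1, n2)] = pid
        matrix[(n2, n1)] = pid
    return matrix


# Toy alignment (hypothetical data, already aligned to the same length).
aligned = {
    "pep1": "ACDEFG-K",
    "pep2": "ACDEFGHK",
    "pep3": "TCD-YGHK",
}

matrix = identity_matrix(aligned)
for n1, n2 in combinations(aligned, 2):
    print(f"{n1} vs {n2}: {matrix[(n1, n2)]:.1f}%")
```

For 1,500 sequences this loop covers about 1.1 million pairs, which is tedious by hand but quick for a script, since the alignment has already been done.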
You can get the distance matrix, as well as the tree, with the UGENE tool using the following steps:
1) Open the file with your alignment (the alignment editor opens automatically for alignment files)
2) Use the "Tree->Build Tree" or "Statistics->Generate Distance Matrix" context menu options
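If you would rather get explicit subgroups than read them off a tree, pairwise identities can also be thresholded. The Python sketch below is a rough stand-in for that idea, not a phylogenetic method and not something UGENE is known to do: it links any two aligned sequences whose identity is at or above a cutoff and reports the connected groups (single-linkage grouping). The function names, the toy peptides, and the 80% cutoff are hypothetical choices for illustration.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity of two aligned rows, skipping gap-gap columns."""
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def group_by_identity(aligned, threshold):
    """Single-linkage grouping: sequences joined by any chain of pairs
    with identity >= threshold end up in the same group."""
    parent = {name: name for name in aligned}

    def find(name):
        # Follow parent links to the group representative (union-find).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for n1, n2 in combinations(aligned, 2):
        if percent_identity(aligned[n1], aligned[n2]) >= threshold:
            parent[find(n1)] = find(n2)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Toy alignment (hypothetical data).
aligned = {
    "pepA": "ACDEFGHK",
    "pepB": "ACDEFGHR",
    "pepC": "WWYYLLMM",
    "pepD": "WWYYLLMV",
}

groups = group_by_identity(aligned, threshold=80.0)
for members in groups:
    print(members)
```

Single linkage can chain loosely related sequences into one large group, so it is worth trying a few cutoffs and comparing the result against the tree.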