Hi everyone!
I need to find sequence similarities among, and align, about 5,000 peptide sequences (10 to 60 amino acids long) coming from 2 different organisms.
These sequences have highly conserved cysteine residues arranged in different, but also conserved, cysteine frameworks:
xCxxxCCxxxCx = framework 1
xCxCxCx = framework 2
xCCxxxCxxCx = framework 3
(where "C" = cysteine, and "x" can be any amino acid except cysteine)
So I would like to find software that can:
1st - Compare the entire sequences (with all the amino acids).
2nd - If I don't find sequence similarities, compare only the cysteine frameworks (i.e. the positions of the cysteines relative to each other + the lengths of the inter-cysteine sequences), even if the amino acids between the cysteines are totally different.
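To make the second step concrete, here is a minimal Python sketch of what I mean by comparing frameworks. It assumes a framework can be summarised as the lengths of the segments between consecutive cysteines, and that two sequences share a framework only when those lengths match exactly. The function names and the three example peptides are made up for illustration.

```python
from collections import defaultdict

def cysteine_framework(seq):
    """Return the lengths of the segments between consecutive cysteines."""
    positions = [i for i, aa in enumerate(seq.upper()) if aa == "C"]
    return tuple(b - a - 1 for a, b in zip(positions, positions[1:]))

def group_by_framework(seqs):
    """Group sequence names by their cysteine framework (exact match only)."""
    groups = defaultdict(list)
    for name, seq in seqs.items():
        groups[cysteine_framework(seq)].append(name)
    return dict(groups)

# Hypothetical peptides, not real data
peptides = {
    "pepA": "AACGGGCCKKKCA",
    "pepB": "WCPPPCCRRRCW",
    "pepC": "GCACQCG",
}
for framework, names in group_by_framework(peptides).items():
    print(framework, names)
```

Here pepA and pepB end up in the same group even though the residues between their cysteines are completely different, which is the behaviour I am looking for. The flanking residues before the first and after the last cysteine are ignored in this sketch.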
I have listed all these sequences in an Excel sheet, and I can easily convert them into a FASTA file if needed.
I also modified the sequences to display only the cysteines, as in the following example:
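For the conversion, something like the following Python sketch should be enough. It assumes the sheet has been exported as CSV with the sequence name in the first column and the sequence in the second; that layout, the function name, and the example names are assumptions for illustration.

```python
import csv
import io

def csv_to_fasta(csv_text):
    """Convert CSV rows of (name, sequence) into FASTA-formatted text."""
    records = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2 or not row[1].strip():
            continue  # skip blank or incomplete rows
        name, seq = row[0].strip(), row[1].strip().upper()
        records.append(f">{name}\n{seq}\n")
    return "".join(records)

# Hypothetical two-row export
example = "pep1,NNKYCCHLWCTKHPRC\npep2,GCACQCG\n"
print(csv_to_fasta(example), end="")
```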
NNKYCCHLWCTKHPRC into ----CC---C-----C
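The masking itself is simple to reproduce outside Excel. A minimal Python sketch (the function name is just illustrative):

```python
def mask_non_cysteines(seq):
    """Replace every residue except cysteine (C) with '-'."""
    return "".join("C" if aa == "C" else "-" for aa in seq.upper())

print(mask_non_cysteines("NNKYCCHLWCTKHPRC"))  # prints ----CC---C-----C
```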
- What about MacVector?
- Would it be possible to align the sequences directly in Excel with a formula or a script?
Thanks in advance!