| Here's my problem: I'm trying to find nucleotide sequences in the 3' UTRs of the Drosophila melanogaster genome that can be used to uniquely identify genes. My approach was to run a portion of the gene in question through BLAST (to find matches elsewhere in the genome). I then ran the BLAST results through a simple Perl script I wrote to build a histogram of how often each nucleotide position is matched. The first figure shows the results I got for the 5-HT7 gene (short name). Based on these results, I figured that any sequence within the region of zero frequency would be unique in the genome. I then took the first 100 nucleotides of the original sample and ran them through BLAST again. I got the histogram in the second figure. Clearly, there were a few matches (14 to be exact) that did not show up on the first run through BLAST. This was irksome, but not really problematic. Just for grins and giggles, I then took only the first 23 nucleotides and ran them once more through BLAST. The results are shown in the third figure. This is problematic. Any idea why BLAST shows different results depending on the length of the sequence I use? Thanks. |
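The Perl script itself is not shown in the post. As a rough sketch of the kind of per-position histogram described, here is a Python version. It assumes the BLAST hits were saved in the standard 12-column tabular format, where columns 7 and 8 hold the query start and end; the function name `coverage_histogram` and the example hits are made up for illustration.

```python
def coverage_histogram(tabular_lines, query_length):
    """Count, for each 1-based query position, how many hits cover it.

    Assumes tab-separated lines in the standard 12-column BLAST tabular
    layout: qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore.
    """
    counts = [0] * query_length
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        qstart, qend = int(fields[6]), int(fields[7])
        lo, hi = min(qstart, qend), max(qstart, qend)
        # Clamp to the query length in case of unexpected coordinates.
        for pos in range(max(lo, 1), min(hi, query_length) + 1):
            counts[pos - 1] += 1
    return counts


# Hypothetical hits against an 8-nucleotide query, for illustration only.
example_hits = [
    "q\tchr2L\t100.0\t5\t0\t0\t1\t5\t100\t104\t1e-3\t10.0",
    "q\tchr3R\t100.0\t3\t0\t0\t4\t6\t200\t202\t1e-1\t6.0",
]
histogram = coverage_histogram(example_hits, 8)
```

Positions where the count is zero are the ones no reported hit overlaps. That only says something about the hits BLAST chose to report for that particular query and parameter set.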
| I'm not exactly sure what you're doing -- for B, for example, you took 100 nucleotides. Did you run BLAST once on those 100 nucleotides and collate your histogram from the multiple hits you got on that sequence, or did you take subsets of the sequence and run BLAST multiple times? The shorter the string you send to BLAST, the more likely it is to match somewhere in the genome by chance. You can increase the stringency of the BLAST search by changing the algorithm you use and by adjusting the expect threshold, word size, and scoring parameters. |
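One way to see how query length interacts with the expect threshold is the Karlin-Altschul estimate E = K * m * n * exp(-lambda * S), where m is the query length, n the database length, and S the raw alignment score. The sketch below is only an illustration under stated assumptions: the `lam` and `k` defaults are illustrative values for an ungapped +1/-3 nucleotide scoring scheme, the genome size is a rough placeholder, and real BLAST uses effective (edge-corrected) lengths, so the numbers it reports will differ.

```python
import math


def expected_chance_hits(score, query_len, db_len, lam=1.374, k=0.711):
    """Karlin-Altschul estimate of chance alignments scoring >= score.

    lam and k are assumed, illustrative values; BLAST prints the ones it
    actually used at the end of its report.
    """
    return k * query_len * db_len * math.exp(-lam * score)


GENOME_LEN = 1.4e8   # rough placeholder for the genome size, an assumption
RAW_SCORE = 18       # hypothetical score of one short, weak alignment

e_full = expected_chance_hits(RAW_SCORE, 100, GENOME_LEN)
e_short = expected_chance_hits(RAW_SCORE, 23, GENOME_LEN)
ratio = e_short / e_full  # scales with query length: 23 / 100
```

Under this model the same alignment gets a smaller E-value when the query is shorter, so a weak hit that falls above the expect threshold for the long query can fall below it, and be reported, for the 23-nucleotide query. That is one plausible reason the three runs disagree, not a confirmed diagnosis of the results in the figures.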