| Arabidopsis and Plant Biology Discuss Arabidopsis and Plant Biology Research. |
Dear Colleagues,

In our previous announcement about the gene content of TIGR annotation 5.0, we failed to point out a change in our approach to annotating transposons. We also inadvertently lumped transposons and other pseudogenes together, with the result that the apparent number of protein-coding genes in this latest annotation is lower than in previous releases. The following paragraphs attempt to clarify the situation.

"Transposons and pseudogenes were the last categories of genes to be addressed by the re-annotation process. Many ORFs were originally annotated as being similar to transposons or transposon-related proteins. However, the majority of these regions are degenerate, so that it is difficult or impossible to model ORFs across their entire extent, although shorter ORFs may be contained within the boundaries of transposon similarity. Thus the legacy annotation for transposon-related sequences consisted of a mixture of genes and pseudogenes, only some of which were annotated as transposon-related.

In the latest release (5.0), we have provided a uniform annotation of all transposon-related sequences by searching the entire genome against a curated database of transposon sequences and automatically applying the corresponding Tn family annotation. The majority of such sequences are degenerate and clearly pseudogenes. We have not attempted to discriminate between possible complete ORFs and pseudogenes in this (transposon) category. There are 2,424 loci annotated as transposons in the current release and, in contrast to all previous releases, these are no longer included in the count of "protein coding genes" nor in that dataset.

In addition to transposon-related sequences, there are approximately 500 "genuine" pseudogenes that are clearly related to genes of identifiable function. Finally, there are ~850 pseudogenes that are similar to proteins of no known function from Arabidopsis or other species; these may represent degenerate ORFs of hypothetical proteins yet to be characterized, or may be related to proteins from either of the above categories. Users should also note that the naming of these pseudogenes has not been subjected to the same uniform curation standards that we applied to the full set of non-pseudogenes, and thus contains a mixture of TIGR and legacy annotation."

Please feel free to comment or write with questions.

Best wishes,
Chris Town

_________________________________
Chris Town
Associate Investigator
The Institute for Genomic Research
9712 Medical Center Drive
Rockville, MD 20850

**********NOTE NEW PHONE NUMBERS**********
Office Phone: 301-795-7523
To page me at TIGR: 301-795-7000
Fax: 301-838-0208
Home Phone: 301-990-0878
Cell Phone: 512-422-8810
Tags: annotation, clarification, pseudogenes, tigr, transposons