pdb-l: About PDB Files and Secondary Structures
Narges Habibi wrote
None of the above.
Predicting contact maps using known structure is cheating. You should
be predicting the local structure, not extracting it from known
structures. Any way that data from known structures can creep into
your inputs invalidates your testing, and makes it impossible to say
with confidence that your method does anything useful. Given the
rather low quality of contact prediction at the current state of the
art, even small amounts of information from the real structure can
make a big difference.
The following paper by my student is a pretty good summary of the
best method as of CASP7---improvements since then have been modest:
George Shackelford and Kevin Karplus.
Contact Prediction using Mutual Information and Neural Nets.
Proteins: Structure, Function, and Bioinformatics,
69(S8):159-164, 2007. (CASP7 special issue).
I see a lot of "prediction" work that is complete garbage, because the
authors fooled themselves by using data that could only come from
knowing the real structures. The even more common problem is
insufficient separation of train and test sets, in which computer
scientists assume that the random partition of a data set is all that
is needed---but the data sets we have aren't independent samples, so
one has to go to some effort to ensure that the test set does not
contain examples that are very close to training set examples.
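
One way to picture the train/test separation described above is to group similar sequences first and then assign whole groups to one side of the split. The following is only a minimal sketch under stated assumptions: the function names, the 0.5 threshold, and the use of Python's difflib ratio as a similarity score are all illustrative choices, not something from the post. A real pipeline would presumably substitute an alignment-based identity or a structural classification for the similarity function.

```python
import random
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1].

    ASSUMPTION: a stand-in for a real sequence-identity measure;
    it is not an alignment and is used here only for illustration.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster_sequences(seqs: dict[str, str], threshold: float) -> list[list[str]]:
    """Single-linkage clustering: any pair at or above threshold is merged."""
    ids = sorted(seqs)
    parent = {i: i for i in ids}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if similarity(seqs[a], seqs[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters: dict[str, list[str]] = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def split_by_cluster(seqs, threshold=0.5, test_fraction=0.25, seed=0):
    """Assign whole clusters to train or test, never splitting a cluster."""
    clusters = cluster_sequences(seqs, threshold)
    random.Random(seed).shuffle(clusters)
    target = test_fraction * len(seqs)
    train, test = [], []
    for cluster in clusters:
        # Fill the test set first, then send remaining clusters to train.
        (test if len(test) < target else train).extend(cluster)
    return train, test
```

Because clusters are never split, no test example can be at or above the threshold similarity to any training example, which is the property a plain random partition does not give. The test fraction is only approximate, since cluster sizes are uneven.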
Kevin Karplus
Professor of Biomolecular Engineering, University of California, Santa Cruz
Undergraduate Director, Bioinformatics
(Senior member, IEEE) (Board of Directors & Chair of Education Committee, ISCB)
Affiliations for identification only.