Support vector machine learning from heterogeneous data: an empirical analysis using protein sequence and structure.

Abstract Text:

    Darrin P Lewis, Tony Jebara, William Stafford Noble

    MOTIVATION: Drawing inferences from large, heterogeneous sets of biological data requires a theoretical framework that is capable of representing, e.g., DNA and protein sequences, protein structures, microarray expression data and various types of interaction networks. Recently, a class of algorithms known as kernel methods has emerged as a powerful framework for combining diverse types of data. The support vector machine (SVM) algorithm is the most popular kernel method, due to its theoretical underpinnings and strong empirical performance on a wide variety of classification tasks. Furthermore, several recently described extensions allow the SVM to assign relative weights to various datasets, depending upon their utilities in performing a given classification task.

    RESULTS: In this work, we empirically investigate the performance of the SVM on the task of inferring gene functional annotations from a combination of protein sequence and structure data. Our results suggest that the SVM is quite robust to noise in the input datasets. Consequently, in the presence of only two types of data, an SVM trained from an unweighted combination of datasets performs as well as or better than a more sophisticated algorithm that assigns weights to individual data types. Indeed, for this simple case, we can demonstrate empirically that no solution is significantly better than the naive, unweighted average of the two datasets. On the other hand, when multiple noisy datasets are included in the experiment, the naive approach fares worse than the weighted approach. Our results suggest that for many applications, a naive unweighted sum of kernels may be sufficient.

    AVAILABILITY: http://noble.gs.washington.edu/proj/seqstruct
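
The contrast between an unweighted and a weighted combination of kernels can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the two feature "views", the labels, the weights 0.8 and 0.2, and all function names are hypothetical, and a kernel perceptron is used as a simple stand-in for the SVM so that the sketch needs nothing beyond the standard library.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def linear_gram(X: Sequence[Sequence[float]]) -> Matrix:
    """Gram matrix of the linear kernel k(a, b) = a . b for one data type."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def combine_kernels(grams: Sequence[Matrix],
                    weights: Optional[Sequence[float]] = None) -> Matrix:
    """Weighted sum of Gram matrices; weights=None gives the unweighted average."""
    if weights is None:
        weights = [1.0 / len(grams)] * len(grams)
    n = len(grams[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, grams))
             for j in range(n)] for i in range(n)]


def train_kernel_perceptron(K: Matrix, y: Sequence[int],
                            epochs: int = 50) -> List[int]:
    """Kernel perceptron on a precomputed Gram matrix (stand-in for an SVM)."""
    n = len(y)
    alpha = [0] * n  # mistake count per training example
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * y[j] * K[j][i] for j in range(n))
            if y[i] * score <= 0:  # misclassified or on the boundary
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: stop early
            break
    return alpha


def predict_training(K: Matrix, y: Sequence[int],
                     alpha: Sequence[int]) -> List[int]:
    """Predicted labels (+1 or -1) for the training examples themselves."""
    n = len(y)
    return [1 if sum(alpha[j] * y[j] * K[j][i] for j in range(n)) > 0 else -1
            for i in range(n)]


# Hypothetical toy data: two views of the same four proteins.
seq_view = [[1.0, 0.0], [0.9, 0.2], [-1.0, 0.1], [-0.8, -0.3]]
struct_view = [[0.5], [0.7], [-0.6], [-0.4]]
labels = [1, 1, -1, -1]

K_seq = linear_gram(seq_view)
K_struct = linear_gram(struct_view)

# Naive combination: every data type contributes equally.
K_unweighted = combine_kernels([K_seq, K_struct])
# Weighted combination: the weights here are picked by hand, not learned.
K_weighted = combine_kernels([K_seq, K_struct], weights=[0.8, 0.2])

for name, K in [("unweighted", K_unweighted), ("weighted", K_weighted)]:
    alpha = train_kernel_perceptron(K, labels)
    print(name, predict_training(K, labels, alpha))
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, either combined matrix can be handed to any learner that accepts a precomputed Gram matrix. In the weighted methods the abstract refers to, the weights are determined by the algorithm rather than fixed by hand as they are here.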

    Publishing Authors By Initials

    DP Lewis, T Jebara, WS Noble

    PUBMED ID PMID:

    MEDLINE DATE:

    Journal Published:

    PUBLICATION TYPE: Research Support, U.S. Gov't,

    Journal: Bioinformatics (Oxford, England)

    VOLUME: 22

    Page Numbers: 2753-60

    Journal Abbreviation: Bioinformatics

    ISSN: 1460-2059

    DAY: 11

    MONTH: 09

    YEAR: 2006

    Information

    Number of References:

    LANGUAGE: eng

    NlmUniqueID: 9808944

    Keywords and MeSH Terms:

    KEYWORDS: Software

    MESH TERMS: methods

    Chemical & Substance Information

    Substance Name: Proteins

    Registry Number: 0

    Grant and Affiliation Information

    AFFILIATION: Department of Computer Science, Columbia University, New York, NY, 10027.

    Country: England

    AGENCY: United States NHGRI

    GRANT: R33 HG003070

    ACRONYM: HG

    MEDLINETA: Bioinformatics

    REFSOURCE:

    DATABASENAME:

    ACCESSION NUMBER:

    Number Hits: 0
