Predicting Protein Relationships to Human Pathways through a Relational Learning Approach Based on Simple Sequence Features

publication date

  • August 2014

start page

  • 753

end page

  • 765

issue

  • 4

volume

  • 11

International Standard Serial Number (ISSN)

  • 1545-5963

Electronic International Standard Serial Number (EISSN)

  • 1557-9964

abstract

  • Biological pathways are important elements of systems biology, and in the past decade an increasing number of pathway databases have been set up to document the growing understanding of complex cellular processes. Although more genome-sequence data are becoming available, a large fraction of these data remains functionally uncharacterized. Thus, it is important to be able to predict the mapping of poorly annotated proteins to original pathway models. Results: We have developed a Relational Learning-based Extension (RLE) system to investigate pathway membership through a function prediction approach that relies mainly on combinations of simple properties attributed to each protein. RLE searches for proteins with molecular similarities to specific pathway components. Using RLE, we associated 383 uncharacterized proteins with 28 pre-defined human Reactome pathways, demonstrating relative confidence after proper evaluation. In specific cases, manual inspection of the database annotations and the related literature supported the proposed classifications. Examples of possible additional components of the Electron transport system, Telomere maintenance, and Integrin cell surface interactions pathways are discussed in detail. Availability: All predicted human proteins for Reactome releases 30 (2009) and 40 (2012) are available at http://rle.bioinfo.cnio.es.

keywords

  • pathway relationship prediction; sequence-based prediction; knowledge relational representation; machine learning; function prediction; human Reactome pathways; biological pathways; systems biology; ligase activity; bioinformatics; ontology; database