Gpos-PLoc

Gpos-PLoc: an ensemble classifier for predicting subcellular localization of Gram-positive bacterial proteins. A statistical analysis indicated that, of the 35,016 Gram-positive bacterial proteins in the recent Swiss-Prot database, approximately 57% lack subcellular location annotations. In the gene ontology database the share of proteins without subcellular component annotations is even higher, approximately 67%. With the avalanche of gene products generated in the post-genomic era, the number of such location-unknown entries will continue to increase. An automated method for identifying their subcellular localization quickly and accurately is highly desirable, because the information thus obtained is very useful for both basic research and drug discovery. In view of this, an ensemble classifier called ‘Gpos-PLoc’ was developed for predicting Gram-positive protein subcellular localization. The new predictor fuses many basic classifiers, each of which was engineered according to the optimized evidence-theoretic K-nearest neighbors rule. As a demonstration, tests were performed on Gram-positive proteins among the following five subcellular location sites: (1) cell wall, (2) cytoplasm, (3) extracell, (4) periplasm and (5) plasma membrane. To eliminate redundancy and homology bias, only proteins with < 25% sequence identity to any other protein in the same subcellular location were included in the benchmark datasets. The overall success rates achieved by Gpos-PLoc were > 80% for both the jackknife cross-validation test and the independent dataset test, suggesting that Gpos-PLoc may become a very useful tool for expediting the analysis of Gram-positive bacterial proteins. Gpos-PLoc is freely accessible to the public as a web server at http://202.120.37.186/bioinf/Gpos/.
To support the needs of investigators in the relevant areas, a downloadable file is provided at the same website. It lists the results identified by Gpos-PLoc for 31,898 Gram-positive bacterial protein entries in the Swiss-Prot database that either have no subcellular location annotation or are annotated with uncertain terms such as ‘probable’, ‘potential’, ‘perhaps’ and ‘by similarity’. These large-scale results will be updated once a year to include new entries of Gram-positive bacterial proteins and to reflect the continuous development of Gpos-PLoc.
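
The two ideas named in the abstract, fusing several basic classifiers and scoring them with a jackknife (leave-one-out) test, can be illustrated with a minimal sketch. This is not the Gpos-PLoc implementation: plain K-nearest-neighbor voting stands in for the optimized evidence-theoretic K-nearest neighbors rule, the fusion is a simple majority vote, and the function names and toy feature vectors are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Plain K-nearest-neighbor vote (a stand-in, NOT the OET-KNN rule)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def ensemble_predict(train, query, ks=(1, 3, 5)):
    """Fuse one basic classifier per K value by majority vote (assumed scheme)."""
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


def jackknife_success_rate(data, ks=(1, 3, 5)):
    """Leave each sample out in turn and predict it from the remaining ones."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if ensemble_predict(rest, features, ks) == label:
            hits += 1
    return hits / len(data)


# Hypothetical 2-D feature vectors; real predictors use sequence-derived descriptors.
toy = [
    ((0.0, 0.0), "cytoplasm"), ((0.1, 0.2), "cytoplasm"),
    ((0.2, 0.1), "cytoplasm"), ((0.1, 0.1), "cytoplasm"),
    ((1.0, 1.0), "cell wall"), ((0.9, 0.8), "cell wall"),
    ((0.8, 0.9), "cell wall"), ((0.9, 0.9), "cell wall"),
]

rate = jackknife_success_rate(toy)
```

In the jackknife test every protein is predicted by a model that has never seen it, so the resulting success rate is not inflated by the sample's own label.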


References in zbMATH (referenced in 11 articles)

  1. Shatabda, Swakkhar; Saha, Sanjay; Sharma, Alok; Dehzangi, Abdollah: iPHLoc-ES: identification of bacteriophage protein locations using evolutionary and structural features (2017)
  2. Hu, Yinxia; Li, Tonghua; Sun, Jiangming; Tang, Shengnan; Xiong, Wenwei; Li, Dapeng; Chen, Guanyan; Cong, Peisheng: Predicting Gram-positive bacterial protein subcellular localization based on localization motifs (2012)
  3. Chou, Kuo-Chen: Some remarks on protein attribute prediction and pseudo amino acid composition (2011)
  4. Zakeri, Pooya; Moshiri, Behzad; Sadeghi, Mehdi: Prediction of protein submitochondria locations based on data fusion of various features of sequences (2011)
  5. Nanni, Loris; Brahnam, Sheryl; Lumini, Alessandra: High performance set of PseAAC and sequence based descriptors for protein classification (2010)
  6. Zhang, Li; Liao, Bo; Li, Dachao; Zhu, Wen: A novel representation for apoptosis protein subcellular localization prediction using support vector machine (2009)
  7. Lin, Hao: The modified Mahalanobis discriminant for predicting outer membrane proteins by using Chou’s pseudo amino acid composition (2008)
  8. Zhang, Tong-Liang; Ding, Yong-Sheng; Chou, Kuo-Chen: Prediction protein structural classes with pseudo-amino acid composition: approximate entropy and hydrophobicity pattern (2008)
  9. Chen, Ying-Li; Li, Qian-Zhong: Prediction of apoptosis protein subcellular location using improved hybrid approach and pseudo-amino acid composition (2007)
  10. Jahandideh, Samad; Sarvestani, Amir Sabet; Abdolmaleki, Parviz; Jahandideh, Mina; Barfeie, Mahdyar: γ-Turn types prediction in proteins using the support vector machines (2007)
  11. Kurgan, Lukasz A.; Stach, Wojciech; Ruan, Jishou: Novel scales based on hydrophobicity indices for secondary protein structure (2007)