ESLpred

ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST.

Automated prediction of the subcellular localization of proteins is an important step in the functional annotation of genomes. Existing subcellular localization prediction methods are based on either the amino acid composition or the N-terminal characteristics of proteins. In this paper, a support vector machine (SVM) is used to predict the subcellular location of eukaryotic proteins from several features: amino acid composition, dipeptide composition and physico-chemical properties. The SVM module based on dipeptide composition performed better than the SVM modules based on amino acid composition or physico-chemical properties. In addition, PSI-BLAST was used to search the query sequence against a dataset of experimentally annotated proteins to predict its subcellular location.

To improve the prediction accuracy, we developed a hybrid module using all features of a protein, with an input vector of 458 dimensions (400 dipeptide compositions, 33 properties, 20 amino acid compositions and 5 values from the PSI-BLAST output). Using this hybrid approach, the prediction accuracies for nuclear, cytoplasmic, mitochondrial and extracellular proteins reached 95.3%, 85.2%, 68.2% and 88.9%, respectively. The overall prediction accuracy of the SVM modules based on amino acid composition, physico-chemical properties, dipeptide composition and the hybrid approach was 78.1%, 77.8%, 82.9% and 88.0%, respectively. The accuracy of all modules was evaluated using 5-fold cross-validation. With a reliability index of at least 3, 73.5% of the predictions can be made with an accuracy of 96.4%. Based on this approach, an online web server, ESLpred, was developed; it is available at http://www.imtech.res.in/raghava/eslpred/.
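The two composition features named in the abstract can be illustrated with a minimal sketch. This is not the ESLpred implementation: the function names, the use of fractions rather than percentages, and the handling of non-standard residues (skipped) are assumptions made for illustration. It shows how a sequence maps to a 20-dimensional amino acid composition vector and a 400-dimensional dipeptide composition vector.

```python
from itertools import product

# The 20 standard amino acids, in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of standard amino acids: AA, AC, AD, ...
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return 20 fractions: occurrences of each residue / standard residues.

    Assumption: non-standard symbols (e.g. X, B, Z) are ignored.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions: occurrences of each adjacent pair / valid pairs.

    Assumption: a pair containing a non-standard symbol is skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    example = "MKVLAAGIVGLLLA"  # hypothetical sequence, for illustration only
    features = amino_acid_composition(example) + dipeptide_composition(example)
    print(len(features))  # 20 + 400 composition features
```

Concatenating these 420 values with the 33 physico-chemical properties and the 5 PSI-BLAST outputs mentioned in the abstract would give the 458-dimensional hybrid input; those two feature groups are not specified here and are therefore left out of the sketch.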


References in zbMATH (referenced in 20 articles)

  1. Qiu, Wenying; Li, Shan; Cui, Xiaowen; Yu, Zhaomin; Wang, Minghui; Du, Junwei; Peng, Yanjun; Yu, Bin: Predicting protein submitochondrial locations by incorporating the pseudo-position specific scoring matrix into the general Chou’s pseudo-amino acid composition (2018)
  2. Ali, Farman; Hayat, Maqsood: Machine learning approaches for discrimination of extracellular matrix proteins using hybrid feature space (2016)
  3. Arango-Argoty, G. A.; Jaramillo-Garzón, J. A.; Castellanos-Domínguez, G.: Feature extraction by statistical contact potentials and wavelet transform for predicting subcellular localizations in gram negative bacterial proteins (2015)
  4. Wang, Jim Jing-Yan; Gao, Xin: Max-min distance nonnegative matrix factorization (2015)
  5. Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil: Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology (2014)
  6. Hu, Yinxia; Li, Tonghua; Sun, Jiangming; Tang, Shengnan; Xiong, Wenwei; Li, Dapeng; Chen, Guanyan; Cong, Peisheng: Predicting Gram-positive bacterial protein subcellular localization based on localization motifs (2012)
  7. Zakeri, Pooya; Moshiri, Behzad; Sadeghi, Mehdi: Prediction of protein submitochondria locations based on data fusion of various features of sequences (2011)
  8. Lapinsh, Maris; Wikberg, Jarl E. S.: Kinome-wide interaction modelling using alignment-based and alignment-independent approaches for kinase description and linear and non-linear data analysis techniques (2010)
  9. Wang, Tong; Xia, Tian; Hu, Xiao-ming: Geometry preserving projections algorithm for predicting membrane protein types (2010)
  10. Blum, Torsten; Briesemeister, Sebastian; Kohlbacher, Oliver: MultiLoc2: integrating phylogeny and gene ontology terms improves subcellular protein localization prediction (2009)
  11. Du, Pufeng; Cao, Shengjiao; Li, Yanda: SubChlo: predicting protein subchloroplast locations with pseudo-amino acid composition and the evidence-theoretic K-nearest neighbor (ET-KNN) algorithm (2009)
  12. Kumar, Manish; Raghava, Gajendra P. S.: Prediction of nuclear proteins using SVM and HMM models (2009)
  13. Xu, Qian; Hu, Derek Hao; Xue, Hong; Yu, Weichuan; Yang, Qiang: Semi-supervised protein subcellular localization (2009)
  14. Zhang, Li; Liao, Bo; Li, Dachao; Zhu, Wen: A novel representation for apoptosis protein subcellular localization prediction using support vector machine (2009)
  15. Shah, Anuj R.; Oehmen, Christopher S.; Harper, Jill; Webb-Robertson, Bobbie-Jo M.: Integrating subcellular location for improving machine learning models of remote homology detection in eukaryotic organisms (2007)
  16. Shen, Yao Qing; Burger, Gertraud: ‘Unite and conquer’: Enhanced prediction of protein subcellular localization by integrating multiple specialized tools (2007)
  17. Tamura, Takeyuki; Akutsu, Tatsuya: Subcellular location prediction of proteins using support vector machines with alignment of block sequences utilizing amino acid composition (2007)
  18. Kim, Jong Kyoung; Bang, Sung-Yang; Choi, Seungjin: Sequence-driven features for prediction of subcellular localization of proteins (2006)
  19. Gao, Qing-Bin; Wang, Zheng-Zhi: Using nearest feature line and tunable nearest neighbor methods for prediction of protein subcellular locations (2005)
  20. Bhasin, Manoj; Raghava, G. P. S.: ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST (2004)