Euk-mPLoc

Euk-mPLoc: a fusion classifier for large-scale eukaryotic protein subcellular location prediction by incorporating multiple sites. One of the critical challenges in predicting protein subcellular localization is how to deal with proteins that have multiple location sites. Unfortunately, so far, no efforts have been made in this regard except for one that focused on proteins in budding yeast only. Most existing predictors either exclude multiple-site proteins from consideration or assume that they do not exist. In fact, proteins may simultaneously exist at, or move between, two or more different subcellular locations. For instance, according to the Swiss-Prot database (version 50.7, released 19-Sept-2006), among the 33,925 eukaryotic protein entries that have experimentally observed subcellular location annotations, 2,715 have multiple location sites, meaning that about 8% bear the multiplex feature. Proteins with multiple locations or dynamic features of this kind are particularly interesting because they may have special biological functions of interest to investigators in both basic research and drug discovery. Meanwhile, according to the same Swiss-Prot database, the total number of eukaryotic protein entries (excluding those annotated as "fragment" or those with fewer than 50 amino acids) is 90,909, leaving a gap of (90,909-33,925) = 56,984 entries for which no knowledge is available about their subcellular locations. Although computational approaches can be used to predict the missing information, so far all existing methods for predicting eukaryotic protein subcellular localization are limited to the case of a single location site. To overcome this barrier, a new ensemble classifier, named Euk-mPLoc, was developed that can also handle the case of multiple location sites. Euk-mPLoc is freely accessible to the public as a Web server at http://202.120.37.186/bioinf/euk-multi.
Meanwhile, to support researchers working in the relevant areas, Euk-mPLoc has been used to predict subcellular locations for all eukaryotic protein entries in the Swiss-Prot database that do not have subcellular location annotations or are annotated as being uncertain. The large-scale results thus obtained have been deposited at the same Web site as a downloadable Microsoft Excel file named "Tab_Euk-mPLoc.xls". Furthermore, to include new eukaryotic protein entries and reflect the continuing development of Euk-mPLoc in both coverage and prediction accuracy, we will update the downloadable file and the predictor in a timely manner, and keep users informed by publishing a short note in the Journal and posting an announcement on the Web page.
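
The Swiss-Prot counts quoted in the abstract can be checked with a few lines of arithmetic. The sketch below only reproduces the two figures stated in the text (the roughly 8% multi-site share and the 56,984-entry gap); the variable names are illustrative and are not part of Euk-mPLoc.

```python
# Counts quoted in the abstract (Swiss-Prot version 50.7, released 19-Sept-2006).
total_entries = 90_909       # eukaryotic entries, excluding fragments and sequences under 50 residues
annotated_entries = 33_925   # entries with experimentally observed subcellular location annotations
multi_site_entries = 2_715   # annotated entries that have more than one location site

# Share of annotated entries that bear the multiplex feature.
multi_site_fraction = multi_site_entries / annotated_entries

# Entries for which no subcellular location knowledge is available.
unannotated_entries = total_entries - annotated_entries

print(f"multi-site share of annotated entries: {multi_site_fraction:.1%}")
print(f"entries lacking location annotation: {unannotated_entries}")
```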


References in zbMATH (referenced in 40 articles)

Showing results 1 to 20 of 40.

  1. Shen, Yinan; Tang, Jijun; Guo, Fei: Identification of protein subcellular localization via integrating evolutionary and physicochemical information into Chou’s general PseAAC (2019)
  2. Sabooh, M. Fazli; Iqbal, Nadeem; Khan, Mukhtaj; Khan, Muslim; Maqbool, H. F.: Identifying 5-methylcytosine sites in RNA sequence using composite encoding feature into Chou’s PseKNC (2018)
  3. Tarafder, Sumit; Toukir Ahmed, Md.; Iqbal, Sumaiya; Tamjidul Hoque, Md; Sohel Rahman, M.: RBSURFpred: modeling protein accessible surface area in real and binary space using regularized and optimized regression (2018)
  4. Georgiou, D. N.; Karakasidis, T. E.; Megaritis, A. C.; Nieto, Juan J.; Torres, A.: An extension of fuzzy topological approach for comparison of genetic sequences (2015)
  5. Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan: mLASSO-Hum: a LASSO-based interpretable human-protein subcellular localization predictor (2015)
  6. Mei, Suyu: SVM ensemble based transfer learning for large-scale membrane proteins discrimination (2014)
  7. Yang, Lei; Lv, Yingli; Li, Tao; Zuo, Yongchun; Jiang, Wei: Human proteins characterization with subcellular localizations (2014)
  8. Fan, Guo-Liang; Li, Qian-Zhong: Discriminating bioluminescent proteins by incorporating average chemical shift and evolutionary information into the general form of Chou’s pseudo amino acid composition (2013)
  9. Huang, Chao; Yuan, Jing-Qi: Predicting protein subchloroplast locations with both single and multiple sites via three different modes of Chou’s pseudo amino acid compositions (2013)
  10. Fan, Guo-Liang; Li, Qian-Zhong: Predicting mycobacterial proteins subcellular locations by incorporating pseudo-average chemical shift into the general form of Chou’s pseudo amino acid composition (2012)
  11. He, Jianjun; Gu, Hong; Wang, Zhelong: Bayesian multi-instance multi-label learning using Gaussian process prior (2012)
  12. Li, Tao; Li, Qian-Zhong: Annotating the protein-RNA interaction sites in proteins using evolutionary information and protein backbone structure (2012)
  13. Mei, Suyu: Multi-kernel transfer learning based on Chou’s PseAAC formulation for protein submitochondria localization (2012)
  14. Mishra, Pooja; Nath Pandey, Paras: Elman RNN based classification of proteins sequences on account of their mutual information (2012)
  15. Qiu, Zhijun; Wang, Xicheng: Prediction of protein-protein interaction sites using patch-based residue characterization (2012)
  16. Chou, Kuo-Chen: Some remarks on protein attribute prediction and pseudo amino acid composition (2011)
  17. Hayat, Maqsood; Khan, Asifullah: Predicting membrane protein types by fusing composite protein sequence features into pseudo amino acid composition (2011)
  18. Kavousi, Kaveh; Moshiri, Behzad; Sadeghi, Mehdi; Araabi, Babak N.; Moosavi-Movahedi, Ali Akbar: A protein fold classifier formed by fusing different modes of pseudo amino acid composition via PSSM (2011)
  19. Khan, Asifullah; Majid, Abdul; Hayat, Maqsood: CE-PLoc: An ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition (2011)
  20. Lin, Hao; Ding, Hui: Predicting ion channels and their types by the dipeptide mode of pseudo amino acid composition (2011)
