AFP-Pred: A random forest approach for predicting antifreeze proteins from sequence-derived properties.

Some organisms living at extremely low temperatures produce special proteins called "antifreeze proteins" (AFPs), which prevent cell and body fluids from freezing. AFPs are present in vertebrates, invertebrates, plants, bacteria, fungi, etc. Although AFPs share a common function, they show a high degree of diversity in sequence and structure. Search methods based on sequence similarity therefore often fail to identify AFPs in sequence databases. In this work, we report a random forest approach, "AFP-Pred", for predicting antifreeze proteins from protein sequence. AFP-Pred was trained on a dataset containing 300 AFPs and 300 non-AFPs and tested on a dataset containing 181 AFPs and 9193 non-AFPs. It achieved 81.33% accuracy on the training dataset and 83.38% on the test dataset. The performance of AFP-Pred was compared with BLAST and HMM. The high prediction accuracy and the successful prediction of hypothetical proteins suggest that AFP-Pred can be a useful approach for identifying antifreeze proteins from sequence information, irrespective of their sequence similarity.
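
The abstract does not say which sequence-derived properties AFP-Pred uses. As a minimal sketch of what such a property can look like, the Python snippet below computes amino acid composition, a common sequence-derived feature. The function name `amino_acid_composition` and the toy sequence are illustrative assumptions, not part of AFP-Pred.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid as a 20-element list.

    Assumption: this is one plausible sequence-derived feature; the actual
    AFP-Pred feature set is not given in the abstract.
    """
    seq = sequence.upper()
    # Count only standard residues; ambiguous codes such as 'X' are ignored.
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not a real antifreeze protein.
features = amino_acid_composition("AATAAAAKAAA")
```

Fixed-length vectors of this kind, one per protein, could then be passed to any random forest implementation together with AFP/non-AFP labels.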

References in zbMATH (referenced in 24 articles)

Showing results 1 to 20 of 24.

  1. Jia, Jianhua; Li, Xiaoyan; Qiu, Wangren; Xiao, Xuan; Chou, Kuo-Chen: iPPI-PseAAC(CGR): identify protein-protein interactions by incorporating chaos game representation into PseAAC (2019)
  2. Arif, Muhammad; Hayat, Maqsood; Jan, Zahoor: IMem-2LSAAC: a two-level model for discrimination of membrane proteins and their types by extending the notion of SAAC into Chou’s pseudo amino acid composition (2018)
  3. Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen: pLoc_bal-mGneg: predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC (2018)
  4. Mei, Juan; Fu, Yi; Zhao, Ji: Analysis and prediction of ion channel inhibitors by using feature selection and Chou’s general pseudo amino acid composition (2018)
  5. Sabooh, M. Fazli; Iqbal, Nadeem; Khan, Mukhtaj; Khan, Muslim; Maqbool, H. F.: Identifying 5-methylcytosine sites in RNA sequence using composite encoding feature into Chou’s PseKNC (2018)
  6. Zhao, Xian; Chen, Lei; Lu, Jing: A similarity-based method for prediction of drug side effects with heterogeneous information (2018)
  7. Zhao, He; Williams, Graham; Huang, Joshua: wsrf: an R package for classification with scalable weighted subspace random forests (2017) (not in zbMATH)
  8. Jiao, Xiong; Ranganathan, Shoba: Prediction of interface residue based on the features of residue interaction network (2017)
  9. Jia, Jianhua; Liu, Zi; Xiao, Xuan; Liu, Bingxiang; Chou, Kuo-Chen: pSuc-Lys: predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach (2016)
  10. Georgiou, D. N.; Karakasidis, T. E.; Megaritis, A. C.; Nieto, Juan J.; Torres, A.: An extension of fuzzy topological approach for comparison of genetic sequences (2015)
  11. Mondal, Sukanta; Pai, Priyadarshini P.: Chou’s pseudo amino acid composition improves sequence-based antifreeze protein prediction (2014)
  12. Sharma, Alok; Lyons, James; Dehzangi, Abdollah; Paliwal, Kuldip K.: A feature extraction technique using bi-gram probabilities of position specific scoring matrix for protein fold recognition (2013)
  13. Fan, Guo-Liang; Li, Qian-Zhong: Predicting mycobacterial proteins subcellular locations by incorporating pseudo-average chemical shift into the general form of Chou’s pseudo amino acid composition (2012)
  14. Hemmateenejad, Bahram; Miri, Ramin; Elyasi, Maryam: A segmented principal component analysis -- regression approach to QSAR study of peptides (2012)
  15. Jahandideh, Samad; Mahdavi, Abbas: RFCRYS: sequence-based protein crystallization propensity prediction by means of random forest (2012)
  16. Jahandideh, Samad; Srinivasasainagendra, Vinodh; Zhi, Degui: Comprehensive comparative analysis and identification of RNA-binding protein domains: multi-class classification and feature selection (2012)
  17. Liu, Guoqing; Liu, Jia; Cui, Xiangjun; Cai, Lu: Sequence-dependent prediction of recombination hotspots in Saccharomyces cerevisiae (2012)
  18. Qiu, Zhijun; Wang, Xicheng: Prediction of protein-protein interaction sites using patch-based residue characterization (2012)
  19. Cheng, Feng; Theodorescu, Dan; Schulman, Ira G.; Lee, Jae K.: In vitro transcriptomic prediction of hepatotoxicity for early drug discovery (2011)
  20. de Avila e Silva, Scheila; Echeverrigaray, Sergio; Gerhardt, Günther J. L.: BacPP: bacterial promoter prediction -- a tool for accurate sigma-factor specific assignment in enterobacteria (2011)
