iNR-PhysChem: a sequence-based predictor for identifying nuclear receptors and their subfamilies via physical-chemical property matrix.

Nuclear receptors (NRs) form a family of ligand-activated transcription factors that regulate a wide variety of biological processes, such as homeostasis, reproduction, development, and metabolism. The human genome contains 48 genes encoding NRs. These receptors have become one of the most important targets for therapeutic drug development. According to their different action mechanisms or functions, NRs have been classified into seven subfamilies. With the avalanche of protein sequences generated in the postgenomic age, two challenging problems arise. Given an uncharacterized protein sequence, how can we identify whether it is a nuclear receptor? If it is, which subfamily does it belong to? To address these problems, we developed a predictor called iNR-PhysChem, in which the protein samples were expressed by a novel mode of pseudo amino acid composition (PseAAC) whose components were derived from a physical-chemical matrix via a series of auto-covariance and cross-covariance transformations. The overall success rate achieved by iNR-PhysChem was over 98% in distinguishing NRs from non-NRs, and over 92% in assigning NRs to the following seven subfamilies: NR1 (thyroid hormone like), NR2 (HNF4-like), NR3 (estrogen like), NR4 (nerve growth factor IB-like), NR5 (fushi tarazu-F1 like), NR6 (germ cell nuclear factor like), and NR0 (knirps like). These rates were derived by jackknife tests on a stringent benchmark dataset in which no protein sequence has ≥60% pairwise sequence identity to any other in the same subset. As a user-friendly web server, iNR-PhysChem is freely accessible to the public. A step-by-step guide is also provided on how to use the web server to get the desired results without the need to follow the complicated mathematics involved in developing the predictor. It is anticipated that iNR-PhysChem may become a useful high-throughput tool for both basic research and drug design.
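The auto-covariance and cross-covariance transformations mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `ac_cc_features`, the centering step, the normalization by `L - lag`, and the toy matrix are all assumptions made for illustration. The input is a matrix with one row per residue and one column per physical-chemical property; the output is a fixed-length vector, regardless of sequence length.

```python
import numpy as np

def ac_cc_features(P, max_lag):
    """Lagged auto-/cross-covariance features of a property matrix.

    P       : (L, M) array; row i holds M property values of residue i
              (hypothetical input layout, assumed for this sketch).
    max_lag : largest sequence separation to consider (1 <= max_lag < L).

    Returns a vector of length max_lag * M * M. For each lag, entry
    [a, b] couples property a at position i with property b at
    position i + lag: a == b gives an auto-covariance term,
    a != b gives a cross-covariance term.
    """
    P = np.asarray(P, dtype=float)
    L, M = P.shape
    if not 1 <= max_lag < L:
        raise ValueError("max_lag must satisfy 1 <= max_lag < L")

    C = P - P.mean(axis=0)  # center each property over the sequence
    feats = []
    for lag in range(1, max_lag + 1):
        # (M, M) matrix averaged over the L - lag residue pairs at this lag
        cov = C[:-lag].T @ C[lag:] / (L - lag)
        feats.append(cov.ravel())
    return np.concatenate(feats)

# Toy 4-residue, 2-property matrix with arbitrary illustrative values.
toy = [[1.0, 0.0],
       [2.0, 1.0],
       [3.0, 0.0],
       [4.0, 1.0]]
vector = ac_cc_features(toy, max_lag=2)
print(vector.shape)
```

Because the feature length depends only on `max_lag` and the number of properties, sequences of different lengths map to vectors of the same dimension, which is what a downstream classifier needs.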

References in zbMATH (referenced in 8 articles)

  1. Akbar, Shahid; Hayat, Maqsood: iMethyl-STTNC: identification of N^6-methyladenosine sites by extending the idea of SAAC into Chou’s PseAAC to formulate RNA sequences (2018)
  2. Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen: pLoc_bal-mGneg: predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC (2018)
  3. Sabooh, M. Fazli; Iqbal, Nadeem; Khan, Mukhtaj; Khan, Muslim; Maqbool, H. F.: Identifying 5-methylcytosine sites in RNA sequence using composite encoding feature into Chou’s PseKNC (2018)
  4. Lin, Thy-Hou; Tsai, Tsung-Lin: Constructing a linear QSAR for some metabolizable drugs by human or pig flavin-containing monooxygenases using some molecular features selected by a genetic algorithm trained SVM (2014)
  5. Xiao, Xuan; Min, Jian-Liang; Wang, Pu; Chou, Kuo-Chen: iCDI-PseFpt: identify the channel-drug interaction in cellular networking with PseAAC and molecular fingerprints (2013)
  6. Yu, Chenglong; Deng, Mo; Cheng, Shiu-Yuen; Yau, Shek-Chung; He, Rong L.; Yau, Stephen S.-T.: Protein space: a natural method for realizing the nature of protein universe (2013)
  7. Jahandideh, Samad; Srinivasasainagendra, Vinodh; Zhi, Degui: Comprehensive comparative analysis and identification of RNA-binding protein domains: multi-class classification and feature selection (2012)
  8. Mishra, Pooja; Nath Pandey, Paras: Elman RNN based classification of proteins sequences on account of their mutual information (2012)