iMiRNA-PseDPC

iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach. A microRNA (miRNA) is a small non-coding RNA molecule, functioning in transcriptional and post-transcriptional regulation of gene expression. The human genome may encode over 1000 miRNAs. Albeit poorly characterized, miRNAs are widely deemed to be important regulators of biological processes. Aberrant expression of miRNAs has been observed in many cancers and other disease states, indicating that they are deeply implicated in these diseases, particularly in carcinogenesis. Therefore, it is important for both basic research and miRNA-based therapy to discriminate the real pre-miRNAs from the false ones (such as hairpin sequences with similar stem-loops). Particularly, with the avalanche of RNA sequences generated in the post-genomic age, it is highly desirable to develop computational sequence-based methods for effectively identifying the human pre-miRNAs. Here, we propose a predictor called “iMiRNA-PseDPC”, in which the RNA sequences are formulated by a novel feature vector called “pseudo distance-pair composition” (PseDPC) with 10 types of structure statuses. Rigorous cross-validations on a much larger and more stringent newly constructed benchmark dataset showed that our approach has remarkably outperformed the existing ones in either prediction accuracy or efficiency, indicating the new predictor is quite promising or at least may become a complementary tool to the existing predictors in this area. For the convenience of most experimental scientists, a user-friendly web server for the new predictor has been established at http://bioinformatics.hitsz.edu.cn/iMiRNA-PseDPC/, by which users can easily get their desired results without the need to go through the mathematical details. It is anticipated that the new predictor may become a useful high-throughput tool for genome analysis, particularly in dealing with large-scale data.


References in zbMATH (referenced in 10 articles)

  1. Jia, Jianhua; Li, Xiaoyan; Qiu, Wangren; Xiao, Xuan; Chou, Kuo-Chen: iPPI-PseAAC(CGR): identify protein-protein interactions by incorporating chaos game representation into PseAAC (2019)
  2. Akbar, Shahid; Hayat, Maqsood: iMethyl-STTNC: identification of N^6-methyladenosine sites by extending the idea of SAAC into Chou’s PseAAC to formulate RNA sequences (2018)
  3. Mei, Juan; Fu, Yi; Zhao, Ji: Analysis and prediction of ion channel inhibitors by using feature selection and Chou’s general pseudo amino acid composition (2018)
  4. Sabooh, M. Fazli; Iqbal, Nadeem; Khan, Mukhtaj; Khan, Muslim; Maqbool, H. F.: Identifying 5-methylcytosine sites in RNA sequence using composite encoding feature into Chou’s PseKNC (2018)
  5. Jia, Jianhua; Liu, Zi; Xiao, Xuan; Liu, Bingxiang; Chou, Kuo-Chen: pSuc-Lys: predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach (2016)
  6. Jiao, Ya-Sen; Du, Pu-Feng: Prediction of Golgi-resident protein types using general form of Chou’s pseudo-amino acid compositions: approaches with minimal redundancy maximal relevance feature selection (2016)
  7. Jiao, Ya-Sen; Du, Pu-Feng: Predicting Golgi-resident protein types using pseudo amino acid compositions: approaches with positional specific physicochemical properties (2016)
  8. Yang, Lei; Wang, Shiyuan; Zhou, Meng; Chen, Xiaowen; Zuo, Yongchun; Lv, Yingli: Characterization of BioPlex network by topological properties (2016)
  9. Kou, Gaoshan; Feng, Yonge: Identify five kinds of simple super-secondary structures with quadratic discriminant algorithm based on the chemical shifts (2015)
  10. Liu, Guoqing; Xing, Yongqiang; Cai, Lu: Using weighted features to predict recombination hotspots in Saccharomyces cerevisiae (2015)