GenXHC: a probabilistic generative model for cross-hybridization compensation in high-density genome-wide microarray data.

Motivation: Microarray designs containing millions to hundreds of millions of probes that tile entire genomes are currently being released. Within the next 2 months, our group will release a microarray data set containing over 12 000 000 microarray measurements taken from 37 mouse tissues. A problem that will become increasingly significant in the upcoming era of genome-wide exon-tiling microarray experiments is the removal of cross-hybridization noise. We present a probabilistic generative model for cross-hybridization in microarray data and a corresponding variational learning method for cross-hybridization compensation, GenXHC, that reduces cross-hybridization noise by taking into account multiple sources for each mRNA expression level measurement, as well as prior knowledge of hybridization similarities between the nucleotide sequences of microarray probes and their target cDNAs.

Results: The algorithm is applied to a subset of an exon-resolution genome-wide Agilent microarray data set for chromosome 16 of Mus musculus and is found to produce statistically significant reductions in cross-hybridization noise. The denoised data is found to produce enrichment in multiple gene ontology–biological process (GO–BP) functional groups. The algorithm is found to outperform robust multi-array analysis, another method for cross-hybridization compensation.
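
The idea of treating each probe measurement as a mixture of several sources can be illustrated with a toy example. The sketch below is not the GenXHC algorithm: it assumes a linear mixing model X = ΛZ + noise with a known hybridization matrix Λ, and it swaps the variational learning method for ordinary least squares. The function names (`simulate`, `compensate`), the matrix sizes, and the 0.3 cross-hybridization weight are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n_transcripts=5, n_tissues=8, cross_weight=0.3, seed=0):
    """Generate toy data under an ASSUMED linear mixing model X = L @ Z + noise."""
    rng = np.random.default_rng(seed)
    # True (unobserved) expression levels: transcripts x tissues, non-negative.
    Z = rng.gamma(2.0, 1.0, size=(n_transcripts, n_tissues))
    # Hypothetical hybridization matrix: each probe binds its own target
    # (weight 1) and partially binds the next transcript (cross_weight).
    L = np.eye(n_transcripts)
    for i in range(n_transcripts - 1):
        L[i, i + 1] = cross_weight
    # Observed probe intensities with a small amount of additive noise.
    X = L @ Z + 0.01 * rng.standard_normal((n_transcripts, n_tissues))
    return X, L, Z


def compensate(X, L):
    """Estimate Z by least squares, given a known mixing matrix L.

    This stands in for the variational method named in the abstract;
    it is a deliberately simpler technique.
    """
    Z_hat, *_ = np.linalg.lstsq(L, X, rcond=None)
    return Z_hat


if __name__ == "__main__":
    X, L, Z = simulate()
    Z_hat = compensate(X, L)
    print("mean abs error, raw intensities:", np.abs(X - Z).mean())
    print("mean abs error, compensated:   ", np.abs(Z_hat - Z).mean())
```

In this toy setting the mixing matrix is known exactly, which is what makes a direct solve possible. The abstract instead describes using sequence-based hybridization similarities as prior knowledge, so the real problem also involves uncertainty about the mixing weights.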

References in zbMATH (referenced in 2 articles)

  1. Khamiakova, Tatsiana; Shkedy, Ziv; Amaratunga, Dhammika; Talloen, Willem; Göhlmann, Hinrich; Bijnens, Luc; Kasim, Adetayo: Quality control of platinum spike dataset by probe-level mixed models (2014)
  2. Robinson, Mark D.; Speed, Terence P.: Differential splicing using whole-transcript microarrays (2009)