DWE

DWE: Discriminating Word Enumerator.

Motivation: Tissue-specific transcription factor binding sites give insight into tissue-specific transcriptional regulation.

Results: We describe a word-counting tool for de novo discovery of tissue-specific transcription factor binding sites that uses expression information in addition to sequence information. We incorporate tissue-specific gene expression by classifying genes as positively expressed or repressed. We present a direct statistical approach for finding transcription factor binding sites that are overrepresented in a foreground set of promoter sequences relative to a background set of promoter sequences. Our approach extends naturally to the search for synergistic transcription factor binding sites. We find putative transcription factor binding sites that are overrepresented in the proximal promoters of liver-specific genes relative to the proximal promoters of liver-independent genes. Our results indicate that binding sites for hepatocyte nuclear factors (especially HNF-1 and HNF-4) and CCAAT/enhancer-binding protein (C/EBPβ) are the most overrepresented in proximal promoters of liver-specific genes. Our results suggest that HNF-4 has strong synergistic relationships with HNF-1, HNF-4 and HNF-3β and with C/EBPβ.

Availability: Programs are available for use over the Web at http://rulai.cshl.edu/tools/dwe
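
The foreground-versus-background word counting described above can be illustrated with a minimal sketch. This is not the DWE implementation: the abstract does not specify the statistic, so the one-sided hypergeometric test, the fixed word length `k`, and all function names here are assumptions chosen for illustration.

```python
from math import comb  # requires Python 3.8+


def words_in(seq, k):
    """Return the set of distinct length-k words occurring in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hypergeom_tail(x, n_fg, n_bg, hits_total):
    """P(X >= x) for the number of foreground sequences containing a word,
    given that hits_total of the n_fg + n_bg sequences contain it.
    (Assumed test; the abstract does not name the statistic.)"""
    total = n_fg + n_bg
    upper = min(n_fg, hits_total)
    numer = sum(
        comb(hits_total, i) * comb(total - hits_total, n_fg - i)
        for i in range(x, upper + 1)
    )
    return numer / comb(total, n_fg)


def enumerate_discriminating_words(foreground, background, k=6):
    """Rank every k-mer seen in the foreground by overrepresentation.

    Returns (word, fg_hits, bg_hits, p_value) tuples, smallest p first.
    Each sequence counts at most once per word.
    """
    fg_sets = [words_in(s, k) for s in foreground]
    bg_sets = [words_in(s, k) for s in background]
    results = []
    for word in set().union(*fg_sets):
        fg_hits = sum(word in s for s in fg_sets)
        bg_hits = sum(word in s for s in bg_sets)
        p = hypergeom_tail(fg_hits, len(fg_sets), len(bg_sets),
                           fg_hits + bg_hits)
        results.append((word, fg_hits, bg_hits, p))
    results.sort(key=lambda r: r[3])
    return results
```

Under the same assumptions, a synergistic search could apply the same test to pairs of words, counting a sequence as a hit only when both words occur in it; a practical tool would also need to handle reverse complements and multiple-testing correction, which this sketch omits.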