Interactions between transcription factors (TFs) and cis-acting regulatory elements (CREs) provide crucial information on the regulation of gene expression. Determining TF-binding sites and CREs experimentally is costly and time-intensive. The availability of whole-genome sequence and transcriptome data makes possible the in silico identification and annotation of TFs and the prediction of CREs in rice. In this study, we tested the applicability of two algorithms developed for other model systems to the identification of biologically significant CREs of co-expressed genes in rice. CREs were identified from DNA sequences located upstream of the transcription start sites, in untranslated regions (UTRs) and introns, and downstream of the translational stop codons of co-expressed genes. The biological significance of each CRE was determined by correlating its presence or absence in each gene with that gene’s expression profile, using a meta-database constructed from 50 rice microarray data sets. The reliability of these methods in predicting CREs and their corresponding TFs was supported by previous wet-lab experimental data and a literature review. New CREs corresponding to abiotic stresses, biotic stresses, specific tissues, and developmental stages were identified in rice, providing new information for future experimental testing. The effectiveness of some, but not all, CREs was found to be affected by copy number, position, and orientation. The TFs most likely correlated with each CRE were also identified. These findings not only contribute to the prioritization of candidates for further analysis but also advance the understanding of the gene regulatory network.


