The context of gene expression regulation
Program in Systems Biology; Program in Gene Function and Expression, University of Massachusetts Medical School, Worcester, MA, USA
The electronic version of this article is the complete one and can be found at: http://f1000.com/reports/b/4/8
Recent advances in sequencing technologies have uncovered a world of RNAs that do not code for proteins, known as non-protein coding RNAs, that play important roles in gene regulation. Along with histone modifications and transcription factors, non-coding RNA is part of a layer of transcriptional control on top of the DNA code. This layer of components and their interactions specifically enables (or disables) the modulation of three-dimensional folding of chromatin to create a context for transcriptional regulation that underlies cell-specific transcription. In this perspective, we propose a structural and functional hierarchy, in which the DNA code, proteins and non-coding RNAs act as context creators to fold chromosomes and regulate genes.
Transcriptional regulation of eukaryotic genes has classically been viewed as the interaction of elements in the immediate vicinity of the transcription start site (promoters) with upstream elements (enhancers) [1,2]. Mutations in conserved upstream promoter DNA sequences affect gene expression directly [3,4]. However, transcriptional regulation is determined not only by DNA sequence but also by additional layers of control, including nucleosome positioning, DNA-binding regulatory proteins such as transcription factors, histone modifications and non-coding RNA [5-7]. Moreover, the development of high-throughput chromosome conformation capture techniques [8] has shown that the three-dimensional organization of genes and upstream elements affects transcriptional activity [9-11]. Here, we present how the combination of these factors can provide a “context” for the regulation of gene expression.
Putting the genome into context
The concept of context
Gene expression regulation in eukaryotes includes interactions between promoters and enhancers, but our understanding of the mechanisms that drive these interactions, or that determine their specificity, is far from complete. At the basis of transcription lies the DNA code that directly determines the composition and location of DNA elements and provides specific recognition sites for DNA binding proteins. The binding of transcription factors and recruitment of complexes that modify histones create an environment that allows for element interaction and initiation of gene transcription. However, predicting the location of promoters and enhancers based solely on histone modifications and transcription factor binding relies on complicated models that are still suboptimal [12-14]. Moreover, recent findings on the involvement of non-coding RNA in transcriptional regulation further imply a more complicated reality [15,16]. Is there a combination of histone modifications which is sufficient to predict the position of regulatory elements? Are specific non-coding RNA transcripts involved in recruiting proteins to regulatory elements?
It is currently much debated whether chromatin modifications comprise a “code” similar to DNA (reviewed by Henikoff and Shilatifard [17]). The crosstalk between histones, transcription factors and non-coding RNA suggests that they interact to form a highly interwoven level of organization. In the context model, transcriptional regulation is subdivided into three levels of interactions: the DNA level, the local chromatin level, and the three-dimensional folding of the genome. The first level, the DNA code, forms an interaction platform by providing protein binding sites for transcription factors that, together with non-coding RNAs and histone modifications, form the next layer of gene regulation. This layer of interactions enables specific long-range interactions that result in a three-dimensional folding of the chromatin. This higher order organization provides transcriptional context that can either facilitate or block the initiation of transcription (Figure 1). Importantly, changes at any level are not necessarily unidirectional. Compaction of higher order structures, for example, will influence the accessibility of DNA and binding of transcription factors, providing a likely feedback mechanism. Below we describe each of the layers of context in more detail.
Level 1: the DNA code
The binding of (core) transcription factors critically depends on the recognition of specific DNA sequences, known as DNA motifs. High-throughput techniques, such as ChIP-seq and enhanced yeast one-hybrid, which visualizes the interaction of a transcription factor with a bait DNA sequence, are now employed to uncover transcription factor-DNA and DNA-transcription factor interactions, respectively [18,19]. Such studies show that transcription factor-binding events are abundant and can occur at large genomic distances from genes. Similarly, conserved non-coding sequences can be found throughout the genome. Their conservation implies that they have an important function: they may affect the binding affinity of factors, or encode non-coding RNAs [20,21]. Interestingly, deletion of these sequences with unknown function can influence the expression of genes located hundreds of kb away, implying that long-range looping of DNA brings the sequences into contact with the genes they regulate [22].
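To make the idea of a DNA motif concrete, the following is a minimal sketch of scanning a sequence for a consensus motif written in IUPAC nucleotide codes. It is only an illustration under simplifying assumptions: the function names (`find_motif`, `reverse_complement`) and the example sequences are hypothetical, exact consensus matching stands in for the position weight matrices that are more commonly used, and a sequence match says nothing about whether a factor is actually bound in vivo.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# Complement table for the IUPAC codes above (R<->Y, K<->M; S, W, N unchanged).
_COMPLEMENT = str.maketrans("ACGTRYSWKMN", "TGCAYRSWMKN")


def reverse_complement(motif):
    """Return the reverse complement of an IUPAC consensus string."""
    return motif.translate(_COMPLEMENT)[::-1]


def find_motif(sequence, consensus):
    """Return sorted (start, strand) pairs for every consensus match.

    Positions are 0-based and refer to the forward strand. A lookahead is
    used so that overlapping matches are reported as well.
    """
    sequence = sequence.upper()
    hits = set()
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        pattern = "(?=(" + "".join(IUPAC[base] for base in motif) + "))"
        for match in re.finditer(pattern, sequence):
            hits.add((match.start(), strand))
    return sorted(hits)


# Hypothetical example: a toy sequence scanned for the illustrative
# consensus TATAWA (W = A or T).
example_hits = find_motif("CCTATAAAGG", "TATAWA")
```

Because the scan covers both strands, a palindromic consensus is reported once per strand at each position; whether to merge such hits is a design choice left open here.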
Level 2: context creators
This level of interactions involves nucleosome positioning and histone modifications, repertoires of transcription factors and non-coding RNAs, and the interplay between them. It is perhaps the most complex level, and it has become the center of attention in recent years.
Although nucleosome positioning, histone modifications, non-coding RNA and transcription factor binding are useful descriptors for genomic elements, they do not seem to define regulatory elements by themselves. Histone modifications have been used to classify upstream regions as promoters (e.g. H3K4me3) and enhancers (e.g. H3K4me1, H3K27ac), but this may not reflect the complete picture. Recent developments in DNA sequencing have led to the appreciation of the importance of non-coding RNA as a regulatory component in the genome [15,23,24]. Although the extent of non-coding RNA involvement is still debated [25], more and more examples of the involvement of non-coding RNA in transcriptional regulation appear in the current literature. Both long and short non-coding RNAs have been identified at regulatory elements [26,27]. Their function is still mostly unknown, and their nomenclature is purely descriptive, based on their site of occurrence (e.g. PASR for Promoter-Associated Short RNA or eRNA for enhancer-RNA) [23,28]. It is possible that non-coding RNA can act as a fast and flexible intermediate to recruit histone-modifying complexes to DNA elements [29-32]. Several reports on long non-coding RNAs have shown their involvement in chromatin remodeling, affecting differentiation and disease [33] (reviewed by Huarte and Rinn [34] and Hung and Chang [35]). Long non-coding RNAs have also been found to be involved in transcriptional repression via polycomb proteins, which are known to maintain cell identity by repressing developmental regulators in certain cell types [15,16,31]. Although long non-coding RNAs are now widely studied, small non-coding RNA and antisense RNA have also been implicated in polycomb-mediated transcriptional gene silencing [36-38]. The combinatorial complexity at this level of chromatin regulation and structure is further modulated by feedback and feedforward signals between histone modifications, non-coding RNA and transcription factors.
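The mark-based classification mentioned above can be sketched as a simple rule set. This is a deliberately naive illustration, not a published method: the function name `classify_region`, the labels and the rule order are assumptions made here, and the input is assumed to be a set of marks for which a peak has already been called over the region. As the text notes, such rules do not by themselves define regulatory elements.

```python
def classify_region(marks):
    """Assign a coarse label to a region from its histone marks.

    `marks` is an iterable of mark names with a called peak over the region.
    The rules are illustrative assumptions: H3K4me3 is treated as
    promoter-like, H3K4me1 or H3K27ac (without H3K4me3) as enhancer-like.
    """
    marks = set(marks)
    if "H3K4me3" in marks:
        return "promoter-like"
    if "H3K4me1" in marks or "H3K27ac" in marks:
        return "enhancer-like"
    return "unclassified"


# Hypothetical regions and marks, for illustration only.
regions = {
    "region_A": {"H3K4me3", "H3K27ac"},
    "region_B": {"H3K4me1", "H3K27ac"},
    "region_C": set(),
}
labels = {name: classify_region(marks) for name, marks in regions.items()}
```

A rule set like this ignores non-coding RNA, transcription factor occupancy and three-dimensional contacts, which is exactly the information the context model argues is needed in addition to histone marks.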
Level 3: context
The three-dimensional folding of DNA is the final context that allows for gene transcription to initiate. The folding of DNA into higher order structures is not a random event, and it has long been thought to affect gene transcription [39-42]. At the nuclear level, chromosomes occupy specific nuclear territories (reviewed by Cremer and Cremer [43]). Chromatin interactions within and between broad zones of chromosomes lead to nuclear compartments in which active genes tend to co-locate near the center of the nucleus and inactive genes cluster near the nuclear periphery. This indicates a strong correlation, though not necessarily causation, between nuclear positioning and gene activity (reviewed by Geyer et al. [44]). At the finest scale, precise DNA folding or “looping” interactions between gene promoters and their distal regulatory elements can be found. DNA looping (level 3) is guided by long-range interactions between DNA sequence elements (level 1), which can be mediated by interacting context creators (level 2). An illustration of the interplay between multiple levels of regulation that leads to a context for gene expression can be found in the regulation of the HOXA locus. Here, chromosomal looping brings the non-coding RNA HOTTIP in close proximity to HOXA genes. HOTTIP recruits the histone H3 lysine 4 modifying complex MLL by binding to WDR5, targeting this complex to the HOXA locus. As a result, HOTTIP controls HOXA gene expression by bridging higher-order chromosomal looping and chromatin modifications [26]. This exemplifies how context provides an environment for communication between regulatory elements in three-dimensional space, leading to either activation or repression of gene transcription.
Recent technical advances in DNA sequencing have enabled genome-wide analysis at each of the three levels: (1) genome sequencing to identify conserved regulatory elements; (2) ChIP-seq [45], DeepCAGE [46], RNA-seq [47], NET-seq [48], ChIRP [49] and CHART [50] to identify chromatin modifications, transcription factor binding and (non-coding) RNA expression and localization; and (3) 3C combined with deep sequencing (e.g. Hi-C and 3-seq) [8,51,52] to probe three-dimensional folding of the genome. Combined, these tools allow an integrated systems approach towards a more complete understanding of the context in which the genome is regulated.
The authors declare they have no competing interests.
Supported by a Rubicon grant from the Netherlands Organisation for Scientific Research and a Dutch Cancer Society Fellowship to Johan H. Gibcus. Supported by grants from the National Institutes of Health, National Human Genome Research Institute (HG003143 and HG003143-06S1), and a W.M. Keck Foundation distinguished young scholar in medical research grant to Job Dekker.
1. Mitchell PJ, Tjian R: Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science. 1989, 245:371–8.
2. Ren X, Siegel R, Kim U, Roeder RG: Direct interactions of OCA-B and TFII-I regulate immunoglobulin heavy-chain gene transcription by facilitating enhancer-promoter communication. Mol. Cell. 2011, 42:342–55.
3. Heintzman ND, Ren B: The gateway to transcription: identifying, characterizing and understanding promoters in the eukaryotic genome. Cell. Mol. Life Sci. 2007, 64:386–400.
4. Stefano JE, Ackerson JW, Gralla JD: Alterations in two conserved regions of promoter sequence lead to altered rates of polymerase binding and levels of gene expression. Nucleic Acids Res. 1980, 8:2709–23.
5. Venters BJ, Pugh BF: How eukaryotic genes are transcribed. Crit. Rev. Biochem. Mol. Biol. 2009, 44:117–41.
6. Mercer TR, Dinger ME, Mattick JS: Long non-coding RNAs: insights into functions. Nat. Rev. Genet. 2009, 10:155–9.
7. Goodrich JA, Kugel JF: Non-coding-RNA regulators of RNA polymerase II transcription. Nat. Rev. Mol. Cell Biol. 2006, 7:612–6.
8. van Steensel B, Dekker J: Genomics tools for unraveling chromosome architecture. Nat. Biotechnol. 2010, 28:1089–95.
9. Looping and interaction between hypersensitive sites in the active beta-globin locus. Mol. Cell. 2002, 10:1453–65.
10. Vernimmen D, Marques-Kranc F, Sharpe JA, Sloane-Stanley JA, Wood WG, Wallace H, Smith A, Higgs DR: Chromosome looping at the human alpha-globin locus is mediated via the major upstream regulatory element (HS -40). Blood. 2009, 114:4253–60.
11. The three-dimensional folding of the α-globin gene domain reveals formation of chromatin globules. Nat. Struct. Mol. Biol. 2011, 18:107–14.
12. Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature. 2009, 459:108–12.
13. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 2007, 39:311–8.
14. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011, 473:43–9.
15. Khalil AM, Guttman M, Huarte M, Garber M, Raj A, Rivea Morales D, Thomas K, Presser A, Bernstein BE, van Oudenaarden A, Regev A, Lander ES, Rinn JL: Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc. Natl. Acad. Sci. U.S.A. 2009, 106:11667–72.
16. Yang L, Lin C, Liu W, Zhang J, Ohgi KA, Grinstein JD, Dorrestein PC, Rosenfeld MG: ncRNA- and Pc2 methylation-dependent gene relocation between nuclear structures mediates gene activation programs. Cell. 2011, 147:773–88.
17. Henikoff S, Shilatifard A: Histone modification: cause or cog? Trends Genet. 2011, 27:389–96.
18. Genome-wide mapping of in vivo protein-DNA interactions. Science. 2007, 316:1497–502.
19. Reece-Hoyes JS, Barutcu AR, McCord RP, Jeong JS, Jiang L, MacWilliams A, Yang X, Salehi-Ashtiani K, Hill DE, Blackshaw S, Zhu H, Dekker J, Walhout A: Yeast one-hybrid assays for gene-centered human gene regulatory network mapping. Nat. Methods. 2011, 8:1050–2.
20. Human-specific gain of function in a developmental enhancer. Science. 2008, 321:1346–50.
21. The Evf-2 noncoding RNA is transcribed from the Dlx-5/6 ultraconserved region and functions as a Dlx-2 transcriptional coactivator. Genes Dev. 2006, 20:1470–84.
22. D'haene B, Attanasio C, Beysen D, Dostie J, Lemire E, Bouchard P, Field M, Jones K, Lorenz B, Menten B, Buysse K, Pattyn F, Friedli M, Ucla C, Rossier C, Wyss C, Speleman F, de Paepe A, Dekker J, Antonarakis SE, de Baere E: Disease-causing 7.4 kb cis-regulatory deletion disrupting conserved non-coding sequences and their interaction with the FOXL2 promotor: implications for mutation screening. PLoS Genet. 2009, 5:e1000522.
23. RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science. 2007, 316:1484–8.
24. Non-coding RNAs: the architects of eukaryotic complexity. EMBO Rep. 2001, 2:986–91.
25. Most “dark matter” transcripts are associated with known genes. PLoS Biol. 2010, 8:e1000371.
26. A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature. 2011, 472:120–4.
27. Long noncoding RNAs with enhancer-like function in human cells. Cell. 2010, 143:46–58.
28. Widespread transcription at neuronal activity-regulated enhancers. Nature. 2010, 465:182–7.
29. Nagano T, Fraser P: No-nonsense functions for long noncoding RNAs. Cell. 2011, 145:178–81.
30. Wang KC, Chang HY: Molecular mechanisms of long noncoding RNAs. Mol. Cell. 2011, 43:904–14.
31. Genome-wide identification of polycomb-associated RNAs by RIP-seq. Mol. Cell. 2010, 40:939–53.
32. YY1 tethers Xist RNA to the inactive X nucleation center. Cell. 2011, 146:119–33.
33. Gupta RA, Shah N, Wang KC, Kim J, Horlings HM, Wong DJ, Tsai M, Hung T, Argani P, Rinn JL, Wang Y, Brzoska P, Kong B, Li R, West RB, van de Vijver MJ, Sukumar S, Chang HY: Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature. 2010, 464:1071–6.
34. Huarte M, Rinn JL: Large non-coding RNAs: missing links in cancer? Hum. Mol. Genet. 2010, 19:R152–61.
35. Hung T, Chang HY: Long noncoding RNA in genome regulation: prospects and mechanisms. RNA Biol. 7:582–5.
36. Short RNAs are transcribed from repressed polycomb target genes and interact with polycomb repressive complex-2. Mol. Cell. 2010, 38:675–88.
37. Janowski BA, Huffman KE, Schwartz JC, Ram R, Nordsell R, Shames DS, Minna JD, Corey DR: Involvement of AGO1 and AGO2 in mammalian transcriptional silencing. Nat. Struct. Mol. Biol. 2006, 13:787–92.
38. Argonaute-1 directs siRNA-mediated transcriptional gene silencing in human cells. Nat. Struct. Mol. Biol. 2006, 13:793–7.
39. Horn PJ, Peterson CL: Molecular biology. Chromatin higher order folding--wrapping up transcription. Science. 2002, 297:1824–7.
40. Sexton T, Schober H, Fraser P, Gasser SM: Gene regulation through nuclear organization. Nat. Struct. Mol. Biol. 2007, 14:1049–55.
41. Lanctôt C, Cheutin T, Cremer M, Cavalli G, Cremer T: Dynamic genome architecture in the nuclear space: regulation of gene expression in three dimensions. Nat. Rev. Genet. 2007, 8:104–15.
42. Dekker J: Gene regulation in the third dimension. Science. 2008, 319:1793–4.
43. Cremer T, Cremer M: Chromosome territories. Cold Spring Harb Perspect Biol. 2010, 2:a003889.
44. Geyer PK, Vitalini MW, Wallrath LL: Nuclear organization: taking a position on gene expression. Curr. Opin. Cell Biol. 2011, 23:354–9.
45. Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A, Thiessen N, Griffith OL, He A, Marra M, Snyder M, Jones S: Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat. Methods. 2007, 4:651–7.
46. Valen E, Pascarella G, Chalk A, Maeda N, Kojima M, Kawazu C, Murata M, Nishiyori H, Lazarevic D, Motti D, Marstrand TT, Tang ME, Zhao X, Krogh A, Winther O, Arakawa T, Kawai J, Wells C, Daub C, Harbers M, Hayashizaki Y, Gustincich S, Sandelin A, Carninci P: Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE. Genome Res. 2009, 19:255–65.
47. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods. 2008, 5:621–8.
48. Nascent transcript sequencing visualizes transcription at nucleotide resolution. Nature. 2011, 469:368–73.
49. Genomic maps of long noncoding RNA occupancy reveal principles of RNA-chromatin interactions. Mol. Cell. 2011, 44:667–78.
50. Simon MD, Wang CI, Kharchenko PV, West JA, Chapman BA, Alekseyenko AA, Borowsky ML, Kuroda MI, Kingston RE: The genomic binding sites of a noncoding RNA. Proc. Natl. Acad. Sci. U.S.A. 2011, 108:20497–502.
51. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009, 326:289–93.
52. Soler E, Andrieu-Soler C, de Boer E, Bryne JC, Thongjuea S, Stadhouders R, Palstra R, Stevens M, Kockx C, van Ijcken W, Hou J, Steinhoff C, Rijkers E, Lenhard B, Grosveld F: The genome-wide dynamics of the binding of Ldb1 complexes during erythroid differentiation. Genes Dev. 2010, 24:277–89.