Counting on co-transcriptional splicing
Max Planck Institute of Molecular and Cellular Biology and Genetics, Pfotenhauerstrasse 108, 01309 Dresden, Germany
†The contribution of the first two authors is equal, and their names are listed alphabetically.
The electronic version of this article is the complete one and can be found at: http://f1000.com/prime/reports/b/5/9
Splicing is the removal of intron sequences from pre-mRNA by the spliceosome. Researchers working in multiple model organisms – notably yeast, insects and mammalian cells – have shown that pre-mRNA can be spliced during the process of transcription (i.e. co-transcriptionally), as well as after transcription termination (i.e. post-transcriptionally). Co-transcriptional splicing does not assume that transcription and splicing machineries are mechanistically coupled, yet it raises this possibility. Early studies were based on a limited number of genes, which were often chosen because of their experimental accessibility. Since 2010, eight studies have used global datasets as counting tools, in order to quantify co-transcriptional intron removal. The consensus view, based on four organisms, is that the majority of splicing events take place co-transcriptionally in most cells and tissues. Here, we discuss the nature of the various global datasets and how bioinformatic analyses were conducted. Considering the broad differences in experimental approach and analysis, the level of agreement on the prevalence of co-transcriptional splicing is remarkable.
Transcription and pre-mRNA splicing are carried out by two distinct macromolecular machines, RNA polymerase II (Pol II) and the spliceosome, which are capable of functioning independently of one another. Polyadenylation of transcripts occurs upon co-transcriptional cleavage of the nascent RNA chain and, thereby, indicates that transcription is complete. Although most mature poly A+ transcripts are spliced, early studies yielded mixed results as to whether splicing is completed before polyadenylation [1-3]. Subsequently, a number of key studies in several model systems supported the notion that splicing at least begins co-transcriptionally, i.e. on nascent RNA tethered to chromatin by elongating Pol II [4-12]. Co-transcriptional splicing is not surprising: nucleoplasm is filled with spliceosomal components that should be able to associate with nascent RNA, much as ribosomes are capable of translating RNA co-transcriptionally in bacteria. Moreover, in vivo rates for the splicing reaction are fast: in the order of 30 seconds to 3 minutes from the time the intron is complete, depending on the species and the method used [5,12-15]; splicing takes significantly less time than gene transcription. Nevertheless, some dramatic examples of post-transcriptional splicing – for example, in anucleate platelets upon activation and in developing fern gametes – remind us of the potential importance of not splicing co-transcriptionally in certain biological contexts [17,18].
The regulatory potential of co-transcriptional splicing is what the field has found so exciting. We started to consider the possibilities that Pol II and the spliceosome could interact physically, that transcription might influence splicing and vice versa, and that co-transcriptional RNA processing could yield a more “fit” mRNP [16,19]. We asked ourselves: how frequent is co-transcriptional splicing among genes and introns? Early studies were based on specific genes, which were handpicked because of special properties such as transcriptional inducibility, gene length, and accessibility to light and/or electron microscopy. Since 2010, eight global studies, investigating multiple tissues and cell types in four organisms, have been published. The consensus view is that co-transcriptional splicing is widespread (see Table 1). These important studies are the subject of this review.
Global studies of co-transcriptional splicing
The first global study on co-transcriptional splicing was undertaken in budding yeast, where it became clear that most introns are spliced co-transcriptionally; 50% of introns are >74% spliced before transcription termination. Table 1 shows that co-transcriptional splicing frequencies are similarly high in fly and human cell lines and tissues [22-26]. Despite general agreement, there are important experimental and analytical differences among the studies. Carrillo Oesterreich et al., Khodor et al. [23,27] and Tilgner et al. all based their analyses on RNA isolated from biochemically purified chromatin, whereas Ameur et al. and Windhager et al. used total RNA sequencing and 4-thio-uridine (4sU)-labeling, respectively, to quantify co-transcriptional splicing. Even active spliceosomes and their nuclear location can be monitored, using protein biochemistry and immunofluorescence, showing that the majority of active spliceosomes are associated with chromatin. That said, not all intron removal is co-transcriptional. In particular, terminal introns are least well removed co-transcriptionally, and 20% of activated spliceosomes in the cell are not chromatin-associated [24,26].
Table 1 and Figure 1 illustrate the key points of each global study and how co-transcriptional splicing frequencies were quantified. Due to different sample preparations and RNA pools analyzed, one can rarely apply the same method of analysis to a given dataset. For example, Ameur et al. performed total RNA sequencing; because exon reads could come from nascent or mature RNA, analysis was restricted to intron reads. Gene architecture differences (e.g. short and few introns in yeast versus long and many introns in humans) also play a role. Three modes of analysis have been employed, using either junction reads – reads representing spliced (exon-exon) and unspliced (exon-intron and intron-exon) junctions – or intron and/or exon coverage for calculating a splicing score:
- Exon-centered splicing score [22,24]. Pro: Optimal for alternative cassette exon usage. Con: First and terminal exons have to be analyzed differently.
- Intron-centered splicing score [21,23,27]. Pro: First and terminal exons can be included in the analysis. Con: Short exons are a disadvantage.
- Gene-based splicing score [24,28]. Pro: Inclusion of many reads reduces noise. Con: No independent splicing values for individual introns; no information about alternative splicing events per gene; stable RNAs encoded within introns artificially lower splicing, if not filtered out; because RNAseq data are often biased towards 3’ ends, results could depend on transcript length.
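As an illustration of the intron-centered approach, the following minimal Python sketch computes a splicing score from junction read counts. It is not the scoring scheme of any of the cited studies (those are summarized in Figure 1): the function name, the argument names, the example counts, and the choice to average the two unspliced junction counts are all assumptions made for this example.

```python
def intron_splicing_score(spliced, unspliced_5ss, unspliced_3ss):
    """Fraction of junction reads indicating that an intron was removed.

    spliced       -- reads spanning the exon-exon junction
    unspliced_5ss -- reads spanning the upstream exon-intron junction
    unspliced_3ss -- reads spanning the downstream intron-exon junction
    """
    # Assumption of this sketch: average the two unspliced junctions so
    # that one unspliced transcript is not counted twice.
    unspliced = (unspliced_5ss + unspliced_3ss) / 2
    total = spliced + unspliced
    if total == 0:
        return None  # no junction reads, so the score is undefined
    return spliced / total


# Hypothetical junction read counts: (spliced, unspliced 5'ss, unspliced 3'ss)
counts = {
    "intron_1": (90, 12, 8),
    "intron_2": (30, 35, 25),
    "intron_3": (0, 0, 0),
}
scores = {name: intron_splicing_score(*c) for name, c in counts.items()}
```

With these made-up counts, intron_1 scores 0.9 and intron_2 scores 0.5, while intron_3 has no junction reads and is left undefined rather than being reported as unspliced.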
Table 1. Global studies on co-transcriptional splicing in four different organisms, listed in chronological order of publication
Abbreviations: 4sU – 4-thio-uridine, I – intensity, ds – downstream, us – upstream, NA – not applicable; for score calculation schemes refer to Figure 1.
Differences among the methods could well influence the numerical results obtained and/or the interpretations, sometimes making it hard to compare the studies. For example, intron length negatively correlates with co-transcriptional splicing frequency in Drosophila, mouse and human cells [23,24,27]. However, Ameur et al., focusing their analysis in human brain on highly expressed genes with long introns, conclude the opposite. Experimental validation of RNA sequencing and array data by RT-qPCR strengthens and extends results from these approaches [21,22]. Though Ameur et al. could not calculate co-transcriptional splicing frequencies for short introns genome-wide, their RT-PCR results suggest that high co-transcriptional splicing observed for long introns can also be inferred for shorter ones. Remarkably, numerous studies agree that constitutive splicing is more co-transcriptional than alternative splicing [22-24,27].
Last year, one study conducted in induced mouse macrophages reported that full-length, polyA cleaved RNAs accumulate on chromatin in a partially spliced state. The inference that splicing is completed post-transcriptionally in this cell type has been rather hastily interpreted as evidence against co-transcriptional splicing in general [29,30]. However, no overall numerical values for co-transcriptional splicing are provided. Direct comparison to the other global studies is more difficult, because this analysis calculates splicing values on a per gene basis, using coverage over the whole gene (see Table 1 and Figure 1); since splicing values vary from intron to intron (see above), calculations for individual splicing events are more informative. It is possible that the gene-based frequency of splicing yields an underestimate of intron removal; for example, summation of the coverage data from Tilgner et al. (see Table S2 of that study) also yields a lower co-transcriptional splicing frequency than usage of splice junction reads (Table 1). Introns contain a number of stable RNAs, such as snoRNAs, which contribute reads that do not represent unspliced transcripts. Gene-based calculations may be influenced by coverage biases that can reflect differences in nucleotide content (e.g. fraction GC), directionality of sequencing (e.g. from the 3’ end) or RT-priming, which all influence the preparation of RNAseq libraries. The average co-transcriptional splicing frequencies obtained for each gene will likely be influenced by gene length and the total number of introns within the gene; terminal exons are long and generally full of reads, and terminal introns tend to be least well spliced co-transcriptionally [6,11,24]. Nevertheless, it is clear from this and previous studies (referenced within) that processing may be delayed in these cells, such that a higher proportion of splicing is post-transcriptional.
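The effect of stable intronic RNAs on a coverage-based, per-gene score can be illustrated with a small Python sketch. The score definition used here (one minus the ratio of mean intron coverage to mean exon coverage), the function names, and the coverage values are assumptions chosen for illustration; they are not taken from any of the studies discussed.

```python
def mean(values):
    return sum(values) / len(values)


def gene_splicing_score(exon_cov, intron_cov):
    """Hypothetical gene-based score from per-base read coverage.

    Defined here (as an assumption) as 1 - mean intron coverage divided by
    mean exon coverage, floored at 0.
    """
    exon_mean = mean(exon_cov)
    if exon_mean == 0:
        return None  # gene not covered, so the score is undefined
    return max(0.0, 1.0 - mean(intron_cov) / exon_mean)


# Made-up per-base coverage for one gene. The last two intronic positions
# stand in for a stable intron-encoded RNA such as a snoRNA.
exon_cov = [100] * 10
intron_cov = [10] * 8 + [210] * 2
snorna_mask = [False] * 8 + [True] * 2

raw = gene_splicing_score(exon_cov, intron_cov)
filtered = gene_splicing_score(
    exon_cov,
    [c for c, masked in zip(intron_cov, snorna_mask) if not masked],
)
```

In this toy case the unfiltered score is 0.5, whereas masking the two snoRNA positions raises it to about 0.9, showing how reads from stable intronic RNAs can make a gene appear less spliced than it is.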
It would be fascinating to know which introns are being retained and, indeed, whether all introns within the same transcript are retained. The relatively low co-transcriptional splicing frequencies from both mouse studies contrast sharply with the high co-transcriptional splicing frequencies from yeast, fly, and human (Table 1). Perhaps the easiest means of addressing this would be to analyze directly comparable human and mouse cell types.
It is difficult to resolve differences among studies when validation of co-transcriptional splicing frequencies by an independent method, such as RT-PCR, is omitted. Unfortunately, most current studies do not include validation. Validation acknowledges that something can be unexpected in either the experiment or the analysis, such as differences in biochemical purification, library biases or genome annotation. For example, chromatin preparations can be contaminated with mRNA, which is highly abundant and could lead to an over-estimate in the degree of co-transcriptional splicing. A co-transcriptional process is one that occurs before polyA cleavage, so one would ideally like to incorporate this property into the validation. Due to fluctuations in read densities, the degree of polyA cleavage in the RNA sample can be difficult to ascertain from RNAseq. A prominently used assay employs reverse transcription to specifically copy only uncleaved transcripts, by utilizing a reverse primer placed downstream of the polyA cleavage site; subsequent PCR can query the spliced or unspliced status of the nascent RNA [11,12,21,33]. This method can be difficult in mammals, where polyA cleavage sites are hard to predict. Nevertheless, an independent, small-scale study focusing on 22 human genes was able to validate the high frequencies of co-transcriptional splicing seen in the global data, even among terminal and alternative introns.
Summary and future directions
Taken together, this array of high quality global studies enables us to reach a consensus on co-transcriptional splicing: it is widespread, albeit not 100%. Future challenges encompass the relative importance of co- and post-transcriptional splicing in terms of the fate of the RNA, on the one hand, and/or transcriptional activity of the gene, on the other hand. For example, histone modifications, which would have a bearing on co-transcriptional but not post-transcriptional splicing, can directly or indirectly recruit splicing factors and modify alternative splicing [34,35]. Moreover, transcription elongation rates are influenced by nucleosome positioning and histone modifications, which influence alternative splice site choice [16,35,36]. Co-transcriptional splicing may also have long-lasting effects on the RNA's lifetime, by ensuring proper assembly of export-competent mRNPs. These examples show that co-transcriptional splicing is important for mRNA biogenesis. Co-transcriptional splicing has also emerged as an important regulator of transcriptional activity. It has long been known that the presence of promoter-proximal introns can stimulate gene expression [37-39]. Recent work shows that splicing feeds back to transcription through a distance-dependent enhancer-like activity of the first 5’ splice site.
Thus, genes and gene expression machinery have evolved coordinately to take advantage of crosstalk between transcription and splicing. If specific biological situations – such as the activation of transcriptional programs in macrophages or the repression of splicing in platelets – circumvent co-transcriptional processes, then perhaps there are additional regulatory reasons. In this sense, it is important to recognize that no study claims 100% of introns are 100% co-transcriptionally removed. Advances in high-throughput sequencing that enable sequencing of longer DNA molecules (in the kilobase range) will provide clarity and facilitate analysis, as well as providing insight into the order of intron removal and co-transcriptional dynamics of alternative splicing. Those introns, such as alternative introns, that are spliced post-transcriptionally may be subject to different regulatory mechanisms [19,40]. A challenge for the future will be to more fully explore the significance of post-transcriptional splicing for gene expression.
The authors declare that they have no disclosures.
We thank members of our laboratory, Fernando Carrillo Oesterreich, Karen Adelman, Jean Beggs, and Thoru Pederson for helpful discussions and comments on the manuscript. Also, Mattia Brugiolo was supported by the EU FP7 ITN project RNPnet (contract number 289007).
1. Nevins J, Darnell JE: Steps in the processing of Ad2 mRNA: Poly(A)+ nuclear sequences are conserved and poly(A) addition precedes splicing. Cell. 1978, 15:1477–93.
2. Tilghman SM, Curtis PJ, Tiemeier DC, Leder P, Weissmann C: The intervening sequence of a mouse beta-globin gene is transcribed within the 15S beta-globin mRNA precursor. Proc Natl Acad Sci USA. 1978, 75:1309–13.
3. LeMeur M, Glanville N, Mandel JL, Gerlinger P, Palmiter R, Chambon P: The ovalbumin gene family: hormonal control of X and Y gene transcription and mRNA accumulation. Cell. 1981, 23:561–71.
4. RNP particles at splice junction sequences on Drosophila chorion transcripts. Cell. 1985, 43:143–51.
5. Splice site selection, rate of splicing, and alternative splicing on nascent transcripts. Genes & Development. 1988, 2:754–65.
6. Localization of pre-mRNA splicing in mammalian nuclei. Nature. 1994, 372:809–12.
7. Baurén G, Wieslander L: Splicing of Balbiani ring 1 gene pre-mRNA occurs simultaneously with transcription. Cell. 1994, 76:183–92.
8. The intranuclear site of excision of each intron in Balbiani ring 3 pre-mRNA is influenced by the time remaining to transcription termination and different excision efficiencies for the various introns. RNA. 1996, 2:641–51.
9. Rates of in situ transcription and splicing in large human genes. Nat Struct and Mol Biol. 2009, 16:1128–33.
10. A wave of nascent transcription on activated human genes. Proc Natl Acad Sci USA. 2009, 106:18357–61.
11. Co-transcriptional splicing of constitutive and alternative exons. RNA. 2009, 15:1896–908.
12. Real-time imaging of cotranscriptional splicing reveals a kinetic model that reduces noise: implications for alternative splicing regulation. J Cell Biol. 2011, 193:819–29.
13. In vivo commitment to yeast cotranscriptional splicing is sensitive to transcription elongation mutants. Genes & Development. 2006, 20:2055–66.
14. RiboSys, a high-resolution, quantitative approach to measure the in vivo kinetics of pre-mRNA splicing and 3'-end processing in Saccharomyces cerevisiae. RNA. 2010, 16:2570–80.
15. Huranová M, Ivani I, Benda A, Poser I, Brody Y, Hof M, Shav-Tal Y, Neugebauer KM, Stanek D: The differential interaction of snRNPs with pre-mRNA reveals splicing kinetics in living cells. J Cell Biol. 2010, 191:75–86.
16. Carrillo Oesterreich F, Bieberstein N, Neugebauer KM: Pause locally, splice globally. Trends in Cell Biology. 2011, :1–8.
17. Escaping the nuclear confines: signal-dependent pre-mRNA splicing in anucleate platelets. Cell. 2005, 122:379–91.
18. Removal of Retained Introns Regulates Translation in the Rapidly Developing Gametophyte of Marsilea vestita. Dev Cell. 2013, 24 [Epub ahead of print].
19. Han J, Xiong J, Wang D, Fu XD: Pre-mRNA splicing: where and when in the nucleus. Trends in Cell Biology. 2011, 21:336–43.
20. Pandya-Jones A: Pre-mRNA splicing during transcription in the mammalian system. WIREs RNA. 2011, 2:700–17.
21. Global Analysis of Nascent RNA Reveals Transcriptional Pausing in Terminal Exons. Mol Cell. 2010, 40:571–81.
22. Total RNA sequencing reveals nascent transcription and widespread co-transcriptional splicing in the human brain. Nat Struct and Mol Biol. 2011, 18:1435–40.
23. Nascent-seq indicates widespread cotranscriptional pre-mRNA splicing in Drosophila. Genes & Development. 2011, 25:2502–12.
24. Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs. Genome Res. 2012, 22:1616–25.
25. Ultrashort and progressive 4sU-tagging reveals key characteristics of RNA processing at nucleotide resolution. Genome Res. 2012, 22:2031–42.
26. Post-transcriptional spliceosomes are retained in nuclear speckles until splicing completion. Nat Comms. 2012, 3:994.
27. Khodor Y, Menet J, Tolan M, Rosbash M: Cotranscriptional splicing efficiency differs dramatically between Drosophila and mouse. RNA. 2012, 18:2174–86.
28. Transcript dynamics of proinflammatory genes revealed by sequence analysis of subcellular RNA fractions. Cell. 2012, 150:279–90.
29. Sen R, Fugmann SD: Transcription, splicing, and release: are we there yet? Cell. 2012, 150:241–3.
30. Stower H: Splicing: Waiting to be spliced. Nat Rev Genet. 2012, 13:599.
31. St Laurent G, Shtokalo D, Tackett MR, Yang Z, Eremina T, Wahlestedt C, Urcuqui-Inchima S, Seilheimer B, McCaffrey TA, Kapranov P: Intronic RNAs constitute the major fraction of the non-coding RNA in mammalian cells. BMC Genomics. 2012, 13:504.
32. Plocik A, Graveley B: New Insights from Existing Sequence Data: Generating Breakthroughs without a Pipette. Mol Cell. 2013, 49:605–17.
33. Bieberstein NI, Carrillo Oesterreich F, Straube K, Neugebauer KM: First Exon Length Controls Active Chromatin Signatures and Transcription. Cell Reports. 2012, 2:62–8.
34. Epigenetics in Alternative Pre-mRNA Splicing. Cell. 2011, 144:16–26.
35. Kornblihtt A, Schor I, Alló M, Dujardin G, Petrillo E, Muñoz MJ: Alternative splicing: a pivotal step between eukaryotic transcription and translation. Nature Reviews Molecular Cell Biology. 2013, 14:153–65.
36. Gómez Acuña L, Fiszbein A, Alló M, Schor I, Kornblihtt A: Connections between chromatin signatures and splicing. WIREs RNA. 2013, 4:77–91.
37. Introns increase transcriptional efficiency in transgenic mice. Proc Natl Acad Sci USA. 1988, 85:836–40.
38. Promoter proximal splice sites enhance transcription. Genes & Development. 2002, 16:2792–9.
39. Rose A: The effect of intron location on intron-mediated enhancement of gene expression in Arabidopsis. The Plant Journal. 2004, 40:744–51.
40. Single-Molecule Imaging of Transcriptionally Coupled and Uncoupled Splicing. Cell. 2011, 147:1054–65.