Sources of stochasticity in constitutive and autoregulated gene expression

Published 30 November 2012 © 2012 The Royal Swedish Academy of Sciences
Citation: Rahul Marathe et al 2012 Phys. Scr. 2012 014068, DOI 10.1088/0031-8949/2012/T151/014068

1402-4896/2012/T151/014068

Abstract

Gene expression is inherently noisy as many steps in the read-out of the genetic information are stochastic. To disentangle the effect of different sources of stochasticity in such systems, we consider various models that describe some processes as stochastic and others as deterministic. We review earlier results for unregulated (constitutive) gene expression and present new results for a gene controlled by negative autoregulation with cell growth modeled by linear volume growth.

1. Introduction

Mathematical and physical methods and concepts are increasingly used in the life sciences. For example, the dynamics of gene regulatory circuits is often studied by describing these circuits with simple but nontrivial mathematical models [1–4]. An important issue is then the choice of an appropriate level of mathematical description. Cells are very dynamic and often adapt to external conditions by changing their global properties, which may in turn affect the function of the genetic circuits hosted by the cell [5–8]. Thus, one needs to address the question of how to model such a dynamic system mathematically, at least under constant external conditions. Moreover, every individual cell grows and divides while the circuits it contains perform their programmed functions. Many models use a mean-field-like description that averages over these processes, but it is often not clear how accurate such approximations are. For example, the gene copy number is often described by an average, and the actual doubling of the gene during the cell division cycle is not considered.

Yet another issue is whether the description of such a regulatory circuit should be deterministic or stochastic. Many important molecules are present in a cell in low copy numbers. Hence, fluctuations can be expected to be important, so that a stochastic description of gene expression is necessary [9–11]. These effects, which have been studied theoretically for a long time [9, 12–19], have recently become accessible to direct quantitative experiments thanks to the development of single-cell approaches [11, 20–22].

During the division cycle of a cell, stochasticity arises from different sources and at different points, namely from the inherent stochasticity of the synthesis of proteins (which occurs throughout the division cycle) and from the partitioning of the protein molecules among the daughter cells during cell division (an approximately instantaneous event). An obvious question that arises in this context is whether there is a dominant source of noise. This question is related to the problem of which mathematical description is most appropriate: which sources of noise need to be taken into account for a minimal but realistic description? Do different descriptions of the noise (with or without explicit volume growth, with explicit or implicit cell division, etc.) lead to approximately the same predictions, or do they differ considerably in the noise that is generated? In a recent study [23], we addressed some of these issues by considering various simple models that include or exclude certain sources of noise. The comparison of these results showed that often there is no dominant source of noise, i.e. different sources contribute comparably (an exception is so-called bursty protein synthesis: if many proteins are produced from relatively rare transcription events, this bursting is clearly the dominant source of noise). The absence of a dominant noise source means, on the one hand, that all sources have to be included for accurate results and, on the other hand, that omitting any one of those sources will still lead to fluctuations of the same order of magnitude.

In our previous study, these questions were studied for unregulated genes. Here we extend our approach to regulated genes. We focus on a simple but important regulatory system, namely negative autoregulation, where the protein product of a gene controls the read-out of that gene, such that large concentrations of the protein suppress further synthesis of that protein [16, 24, 25]. Fluctuations arising from both sources we consider (stochastic synthesis and stochastic partitioning during cell division) are found to be suppressed substantially by the negative feedback. A complication that arises generally for regulated genes is that regulation depends on protein concentrations, which in turn depend on the cell volume. This means that the growth of the cellular volume needs to be taken into account explicitly, which was not necessary for unregulated genes, which could be described fully by the number of protein molecules per cell.

The paper is organized as follows. In section 2, we review some key results from our previous study [23] comparing different sources of noise for unregulated gene expression. An alternative analytical derivation of one central result is presented in the appendix. In section 3, this type of analysis is extended to a gene controlled by negative autoregulation. We end with some concluding remarks.

2. Sources of stochasticity for constitutive gene expression

Recently we studied different models for the stochastic expression of an unregulated (constitutively expressed) gene in order to disentangle different sources of (intrinsic) stochasticity [23]. Stochasticity arises from the process of protein synthesis and also from degradation, if the proteins are unstable. When a cell divides, the partitioning of proteins among daughter cells also generates fluctuations. In our recent work [23], we analyzed these different sources in a systematic way to see which sources contribute to the observed noise and whether there is a dominant source. In this section, we briefly summarize some key results obtained with these models.

Protein synthesis is a two-step process consisting of transcription and translation. In the first step, a gene sequence is transcribed into mRNA and then it is translated by ribosomes to produce proteins. If M and P represent the mRNA and protein copy numbers, respectively, their time evolution is described by:

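$$\frac{\mathrm{d}M}{\mathrm{d}t} = \alpha_m g - \beta_m M, \qquad \frac{\mathrm{d}P}{\mathrm{d}t} = \alpha_p M - \beta_p P \qquad (1)$$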

where αm, αp and βm, βp are the synthesis and degradation rates of mRNAs and proteins, respectively, and g is the gene copy number. In bacteria, proteins are often stable (with lifetimes long compared to the generation time T) [28]. Then the degradation rate in equation (2) is an effective degradation rate representing dilution by cell growth and division, with β = ln 2/T. By contrast, mRNA is typically rather short-lived, with lifetimes in the range of a few minutes [22, 27], and one can approximate the equation for M by its steady state, M = αmg/βm. In that case the above two-step process is reduced to an effective one-step process:

$$\frac{\mathrm{d}P}{\mathrm{d}t} = \alpha g - \beta P \qquad (2)$$

with α = αpαm/βm. We would like to note that when mRNA is treated as a fast variable and considered to be in a steady state, one obtains a correct description of the average protein number, but the fluctuations are underestimated, in particular, if a single transcription event (or one mRNA molecule) gives rise to many copies of the protein. This effect, where the output of transcription is strongly amplified by translation, is known as bursty protein synthesis and will not be considered here. The reader is referred to [9, 16, 23] for discussions of this issue.

We now consider different models [23] that are based on equation (2), but describe cell division explicitly. In that case, the degradation rate for stable proteins is β = 0 and the protein copy number per cell is divided by 2 at cell division. We start with a model where protein synthesis is described deterministically, while proteins are distributed stochastically among the two daughter cells during cell division (we note that in all our models a cell divides into exactly two daughter cells and we follow only one lineage of cells; for some more complex cases see, e.g., [26]). During division, each protein molecule thus has a probability r = 1/2 to go to either of the daughter cells. This means that in every generation a constant number Q = αT of protein molecules is synthesized, but the initial protein number in each cycle fluctuates due to the stochastic division. Figure 1(a) shows a time series of such a process as obtained from simulations. For this case we have obtained a number of analytical results [23] using a method proposed in [29]. An alternative derivation based on generating functions is given in the appendix. The average copy number after division and the variance of that number are found to be given by $\langle P_0\rangle = Q = \alpha T$ and $\delta P_0^2 = 2Q/3$, respectively. Two commonly used characteristics of noise are the noise strength $\eta^2$, defined as

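$$\eta^2 = \frac{\delta P^2}{\langle P\rangle^2} = \frac{\langle P^2\rangle - \langle P\rangle^2}{\langle P\rangle^2} \qquad (3)$$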

and the Fano factor $F = \eta^2 \langle P\rangle$. The noise strength typically scales as $\eta^2 \sim 1/\langle P\rangle$, so the Fano factor characterizes the prefactor of that scaling. For the case under consideration, we obtain

$$\eta_0^2 = \frac{\delta P_0^2}{\langle P_0\rangle^2} = \frac{2}{3\langle P_0\rangle} \qquad (4)$$

or F0 = 2/3 (the index '0' in these expressions indicates that we have taken averages over a population of cells immediately after division).
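The result F0 = 2/3 can be checked with a few lines of simulation. The following is a minimal sketch, not the simulation code used for the figures: it follows one lineage, adds exactly Q molecules per cycle and partitions them binomially with r = 1/2. The function names, the seed and the number of generations are illustrative assumptions; Q = 20 corresponds to the parameter values quoted for figure 1.

```python
import random

def binomial_half(n, rng):
    """Molecules (out of n) inherited by one daughter, each with probability 1/2."""
    # The number of set bits among n fair random bits is Binomial(n, 1/2).
    return bin(rng.getrandbits(n)).count("1") if n > 0 else 0

def simulate_partitioning(Q, generations, seed=0):
    """One lineage: add Q molecules per cycle, then partition stochastically."""
    rng = random.Random(seed)
    P = Q  # arbitrary start; the memory of it decays within a few divisions
    after_division = []
    for _ in range(generations):
        P = binomial_half(P + Q, rng)  # deterministic synthesis, stochastic division
        after_division.append(P)
    return after_division[100:]  # discard the initial transient

Q = 20  # alpha * T with alpha = 0.5 per min and T = 40 min
samples = simulate_partitioning(Q, generations=50_000)
mean_P0 = sum(samples) / len(samples)
var_P0 = sum((x - mean_P0) ** 2 for x in samples) / len(samples)
eta2_0 = var_P0 / mean_P0 ** 2   # noise strength after division
fano_0 = eta2_0 * mean_P0        # Fano factor after division
print(f"<P0> = {mean_P0:.2f}, variance = {var_P0:.2f}, F0 = {fano_0:.3f}")
```

Up to sampling error, the printed mean should be close to Q and the Fano factor close to 2/3.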

Figure 1. Stochastic models of unregulated protein synthesis. (a)–(c) Trajectories of the protein copy number from stochastic simulations with stochastic synthesis, partitioning during cell division, or both, all with cell division modeled explicitly. (d)–(f) Corresponding concentrations of the protein calculated for a volume that increases linearly during the division cycle and is halved at multiples of the division time T. Here the cell volume does not affect the protein synthesis rate. The parameter values used for these plots are $\alpha = 0.5\min^{-1}$, $T = 40\min$.

In the complementary case, synthesis of proteins is stochastic and division is deterministic. When a cell divides, each daughter cell thus gets exactly half of the available proteins, as shown in figure 1(b) (for an odd protein number P, we take the number after division to be either (P + 1)/2 or (P − 1)/2, each with probability 1/2, which leaves a minimal remnant of stochasticity in our otherwise deterministic description of cell division). We also assume the synthesis rate to be constant and do not explicitly describe gene duplication. We then obtain

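$$\langle P_0\rangle = Q, \qquad \delta P_0^2 = \frac{Q}{3}, \qquad \eta_0^2 = \frac{1}{3\langle P_0\rangle} \qquad (5)$$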

The last result implies that the Fano factor is F0 = 1/3, which is just half of what we have seen for stochastic partitioning (equation (4)).

Finally, we combine both sources of stochasticity, so that both the synthesis of proteins and their partitioning during cell division occur stochastically (figure 1(c)). Using again the method of [29], we obtain

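$$\eta_0^2 = \frac{2}{3\langle P_0\rangle} + \frac{1}{3\langle P_0\rangle} = \frac{1}{\langle P_0\rangle} \qquad (6)$$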

Two points should be noted: (i) the noise strengths ($\eta^2$) of independent sources are additive. In our case, the noise in equation (6) is the sum of the contributions from stochastic partitioning, $2/(3\langle P_0\rangle)$, and from stochastic synthesis, $1/(3\langle P_0\rangle)$; (ii) the contributions from both sources of noise are of the same order of magnitude, implying that there is no dominant source of noise in this simple case (figure 2).
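The additivity of the two contributions can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the original simulation code: synthesis at constant rate α is sampled from exponential waiting times, deterministic division halves the protein number (with a coin flip for odd numbers, as described above), and stochastic division is binomial with r = 1/2. All function names, the seed and the number of cycles are hypothetical choices.

```python
import random

def poisson_count(rate, duration, rng):
    """Events of a constant-rate Poisson process within `duration`."""
    count, t = 0, rng.expovariate(rate)
    while t < duration:
        count += 1
        t += rng.expovariate(rate)
    return count

def divide(P, stochastic, rng):
    """Protein number inherited by the daughter cell that is followed."""
    if stochastic:
        # each molecule is kept with probability 1/2 (set bits of P fair bits)
        return bin(rng.getrandbits(P)).count("1") if P > 0 else 0
    half, odd = divmod(P, 2)
    return half + (1 if odd and rng.random() < 0.5 else 0)

def fano_after_division(alpha, T, stochastic_synthesis, stochastic_partitioning,
                        cycles=20_000, seed=0):
    """Fano factor eta_0^2 * <P0> of the protein number right after division."""
    rng = random.Random(seed)
    Q = round(alpha * T)  # molecules added per cycle in the deterministic case
    P, samples = Q, []
    for _ in range(cycles):
        P += poisson_count(alpha, T, rng) if stochastic_synthesis else Q
        P = divide(P, stochastic_partitioning, rng)
        samples.append(P)
    samples = samples[100:]  # discard the initial transient
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return var / mean

F_part = fano_after_division(0.5, 40, False, True)   # partitioning noise only
F_syn = fano_after_division(0.5, 40, True, False)    # synthesis noise only
F_both = fano_after_division(0.5, 40, True, True)    # both sources
print(f"F_part = {F_part:.3f}, F_syn = {F_syn:.3f}, F_both = {F_both:.3f}")
```

Because the mean after division is the same in all three variants, additivity of $\eta_0^2$ is equivalent to additivity of the Fano factor, so F_both should be close to F_part + F_syn. The coin flip for odd protein numbers adds a small extra variance, so F_syn is expected to lie slightly above 1/3 for small Q.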

Figure 2. Stochastic models of protein synthesis: noise strength $\eta^2$ as a function of the average protein copy number $\langle P\rangle$ (varied by varying the synthesis rate α) for the different models (for the models with explicit cell division, averages over cells immediately after division are plotted, i.e. $\eta_0^2$ and $\langle P_0\rangle$). $T = 40\min$.

In figures 1(d)–(f) we show time series for the concentrations of the protein for the three cases discussed above. The concentration fluctuates around its mean, and shows no systematic dependence on the cell division cycle. The latter observation arises from the fact that both the volume and the protein number increase (on average) linearly during the cycle. A systematic variation over the course of the division cycle is obtained if an explicit description of gene duplication is included or if the volume growth is not linear [23].

3. Protein synthesis with negative autoregulation

Gene regulation is incorporated into models of the type of equation (2) via synthesis (or degradation) rates that depend on the concentration of a regulatory protein, for example a transcription factor. Here we consider one specific case, namely negative autoregulation, where the output of a gene (the protein product) modifies the read-out of that gene in such a way that the synthesis of a protein is suppressed by a high concentration of that protein [16, 30–32]. The dependence of the synthesis rate on the protein concentration p = P/V is expressed by a so-called Hill function

$$\alpha(p) = \frac{\alpha_0}{1 + (p/K)^n} \qquad (7)$$

Here α0 is the maximal synthesis rate, K is a concentration scale that defines which protein concentration is required to affect the synthesis rate (in the simplest case, it is given by the dissociation constant for binding of a transcription factor to its binding site on the DNA). n is called the Hill coefficient, which describes the cooperativity of regulation and characterizes the steepness of the regulation function. In the following we take n to be equal to 2.
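As a small illustration of this regulation function, the following sketch assumes the standard repressing Hill form, α(p) = α0/(1 + (p/K)^n), which matches the roles of α0, K and n described above. The function name `synthesis_rate` and the example values are illustrative assumptions.

```python
def synthesis_rate(P, V, alpha0, K, n=2):
    """Repressing Hill function: the rate drops from alpha0 once p = P/V exceeds K."""
    p = P / V  # protein concentration
    return alpha0 / (1.0 + (p / K) ** n)

# Example values: alpha0 in 1/min, V in um^3, K in molecules/um^3
for P in (0, 20, 200):
    rate = synthesis_rate(P, V=2.0, alpha0=2.0, K=10.0)
    print(f"P = {P:3d} molecules: synthesis rate = {rate:.3f} per min")
```

At p = K the rate is half of its maximal value, and for n = 2 it decays as 1/p² for p well above K.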

The synthesis rate α(p) in equation (7) is time-dependent through both the protein copy number (which changes in discrete steps of synthesis and degradation) and the cell volume (which changes in a continuous fashion). So in contrast to the unregulated case discussed before, volume growth affects the dynamics of the protein synthesis process in the presence of (concentration-dependent) gene regulation. In our case, volume growth is taken to be linear in time, starting from an initial volume V0 directly after cell division and reaching 2V0 just before the next division. The growth is implemented via a discrete time step Δt in which the volume increases by ΔV . The volume is halved exactly at the division. We are again interested in the different sources of stochasticity and consider the synthesis of protein and the partitioning of molecular content to be either stochastic or deterministic.

We begin with the case where both synthesis and division are stochastic. The variation of the protein number with time for this case is depicted in figure 3(c); the corresponding concentration is shown in figure 3(f). As before, we consider the dependence of the noise strength $\eta_0^2$ on the average protein number. In this case $\eta_0^2$ follows a $1/\langle P_0\rangle$ behavior for small $\langle P_0\rangle$, but crosses over to $2/(3\langle P_0\rangle)$ for large $\langle P_0\rangle$ (see figure 4, blue line with filled squares). The crossover occurs for values of $\langle P_0\rangle$ of the order of $KV_0$, i.e. it occurs when the autoregulation mechanism becomes important. For small α0 (or $\langle P_0\rangle$), the system behaves like an unregulated system with considerable fluctuations due to protein synthesis as well as division. For large α0 (or $\langle P_0\rangle$), autoregulation becomes active and suppresses protein number fluctuations, so that protein synthesis becomes approximately deterministic, but partitioning during division remains stochastic, which leads to the observed $2/(3\langle P_0\rangle)$ behavior.
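A minimal sketch of such a simulation is given below. It assumes the repressing Hill form α(p) = α0/(1 + (p/K)^n) and, instead of the discrete time step Δt described above, uses a thinning (rejection) scheme: candidate synthesis events are drawn at the maximal rate α0 and accepted with probability α(p)/α0 at the current volume. The function names, the seed and the value K = 3 molecules μm⁻³ (chosen here so that the regulated run lies well inside the regulated regime) are illustrative assumptions, not values from the figures.

```python
import random

def simulate_autoregulated(alpha0, K, V0, T, cycles, n=2, seed=1):
    """Protein numbers right after division; synthesis and partitioning stochastic."""
    rng = random.Random(seed)
    P, after_division = 0, []
    for _ in range(cycles):
        t = rng.expovariate(alpha0)          # candidate events at the maximal rate
        while t < T:
            V = V0 * (1.0 + t / T)           # linear growth from V0 to 2*V0
            accept = 1.0 / (1.0 + (P / (V * K)) ** n)   # alpha(p) / alpha0
            if rng.random() < accept:
                P += 1
            t += rng.expovariate(alpha0)
        # stochastic partitioning: each molecule is kept with probability 1/2
        P = bin(rng.getrandbits(P)).count("1") if P > 0 else 0
        after_division.append(P)
    return after_division[50:]               # discard the initial transient

def mean_and_fano(samples):
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return mean, var / mean

# Regulated run (illustrative K) and an effectively unregulated run (very large K)
mean_reg, F_reg = mean_and_fano(simulate_autoregulated(2.0, 3.0, 2.0, 50.0, 4000))
mean_unreg, F_unreg = mean_and_fano(simulate_autoregulated(2.0, 1e9, 2.0, 50.0, 4000))
print(f"regulated:   <P0> = {mean_reg:.1f}, F0 = {F_reg:.2f}")
print(f"unregulated: <P0> = {mean_unreg:.1f}, F0 = {F_unreg:.2f}")
```

In the effectively unregulated run the Fano factor should be close to 1 and the mean close to α0T, whereas the regulated run should show both a much smaller mean and a visibly reduced Fano factor.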

Figure 3. Stochastic models of protein synthesis with negative autoregulation. (a)–(c) Trajectories of the protein copy number from stochastic simulations with stochastic synthesis, cell division, or both, all with cell division modeled explicitly and linear volume growth. (d)–(f) Corresponding concentration of the protein. The parameter values used for these plots are $\alpha_0 = 2.0\,\min^{-1}$, $V_0 = 2\,\mu\mathrm{m}^3$, $T = 50\,\min$, $K = 10$ molecules $\mu\mathrm{m}^{-3}$.


Figure 4. Noise strength $\eta_0^2$ as a function of the average protein copy number $\langle P_0\rangle$ (varied by varying the synthesis rate α0) for the different models with negative autoregulation. Averages are taken over cells immediately after cell division. The parameter values are $V_0 = 2\,\mu\mathrm{m}^3$, $T = 50\,\min$, $K = 100$ molecules $\mu\mathrm{m}^{-3}$.

Now we separate the two sources as we did before for the unregulated gene. In the first case, proteins are added deterministically and partitioned stochastically. Deterministic addition means integrating the synthesis rate of equation (7) over a cycle to obtain the number of proteins added; this number depends on the initial protein number. The integration usually leads to a noninteger value of the protein number. In such cases the noninteger remainder is interpreted probabilistically: one additional molecule is added with a probability equal to the fractional part. A trajectory of the number of molecules per cell for this case is depicted in figure 3(a) and the corresponding concentration is shown in figure 3(d). Our simulations show that in this case $\eta_0^2$ behaves as $2/(3\langle P_0\rangle)$ over the entire range of $\langle P_0\rangle$, as in the case without autoregulation (see figure 4, black line with filled triangles).
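The deterministic addition step with probabilistic rounding can be sketched as follows. This is an illustration under assumptions: the synthesis rate is taken to be the repressing Hill form α0/(1 + (p/K)^n), the integration over a cycle uses a simple Euler scheme, and the function names and step count are hypothetical.

```python
import random

def integrate_cycle(P0, alpha0, K, V0, T, n=2, steps=1000):
    """Euler integration of dP/dt = alpha(P/V) over one cycle with linear volume growth."""
    dt = T / steps
    P = float(P0)
    for i in range(steps):
        V = V0 * (1.0 + i * dt / T)              # volume grows from V0 to 2*V0
        P += dt * alpha0 / (1.0 + (P / (V * K)) ** n)
    return P

def stochastic_round(x, rng):
    """Round a nonnegative x up with probability equal to its fractional part."""
    whole = int(x)
    return whole + (1 if rng.random() < x - whole else 0)

def next_generation(P0, alpha0, K, V0, T, rng):
    """One cycle: deterministic synthesis, probabilistic rounding, binomial division."""
    P_end = stochastic_round(integrate_cycle(P0, alpha0, K, V0, T), rng)
    return bin(rng.getrandbits(P_end)).count("1") if P_end > 0 else 0

rng = random.Random(2)
P = 30
for generation in range(5):
    P = next_generation(P, alpha0=2.0, K=10.0, V0=2.0, T=50.0, rng=rng)
    print(f"generation {generation + 1}: P0 = {P}")
```

The rounding step preserves the mean of the integrated protein number while keeping the molecule count an integer, so that the subsequent binomial partitioning is well defined.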

Finally we consider the case where the synthesis of the protein is a stochastic process, but partitioning during cell division is deterministic (see figure 3(b) for the protein number and figure 3(e) for the corresponding concentration). In this case, for small $\langle P_0\rangle$, autoregulation does not kick in and the system behaves effectively as an unregulated gene with $\eta_0^2 = 1/(3\langle P_0\rangle)$. For large $\langle P_0\rangle$, both $\eta_0^2$ and the Fano factor $\eta_0^2 \langle P_0\rangle$ are strongly reduced, as the synthesis becomes almost deterministic when protein number fluctuations are suppressed by the negative autoregulation (see figure 4, green line with filled circles).

In an experiment, one typically looks at a population of cells that are at different points in the division cycle. To address this situation we take averages of the protein number or concentration over many realizations and over time through the full division cycle, instead of over different realizations all taken directly after division. In doing so, the average protein concentration is a more relevant parameter than the average protein number, because the protein number increases twofold over the division cycle. We therefore determine a noise parameter $\eta_p^2$ for the concentration. We plot simulation results for the different cases in figure 5. When both synthesis and division are stochastic (blue line with filled squares in figure 5), $\eta_p^2$ retains the $1/\langle P\rangle$ behavior for small $\langle P\rangle$ (where the gene is effectively unregulated). For large $\langle P\rangle$, that is in the presence of negative autoregulation, the noise is suppressed and $\eta_p^2$ shows a $1/(3\langle P\rangle)$ behavior. The other two cases, namely stochastic synthesis with deterministic division and vice versa, exhibit the same dependence as each other. For both, the noise is increased compared to the earlier case where averages were taken immediately after division. Here $\eta_p^2$ goes as $\simeq 1/(2\langle P\rangle)$ for small $\langle P\rangle$ and $\simeq 1/(6\langle P\rangle)$ for large $\langle P\rangle$ (see figure 5, black line with filled triangles and green line with filled circles). The latter results indicate that negative autoregulation suppresses noise from both sources (stochasticity of protein synthesis and stochastic partitioning during cell division). The observation that our results above (figure 4) only showed suppression of the synthesis noise and not of the partitioning noise is due to taking averages directly after division, which leaves no time to compensate for variations of the protein concentration introduced during partitioning.

Figure 5. Noise strength of the concentration $\eta_p^2$ as a function of the average protein number $\langle P\rangle$ (varied by varying the synthesis rate α0) for the different models with negative autoregulation. Averages are taken over the whole division cycle. The parameter values are $V_0 = 2\,\mu\mathrm{m}^3$, $T = 50\,\min$, $K = 100$ molecules $\mu\mathrm{m}^{-3}$.

4. Conclusions

In this paper, we have reviewed models for constitutive (unregulated) protein synthesis that we have studied previously [23] and presented some new results for protein synthesis with negative autoregulation. In both cases, different model variants were considered to disentangle different sources of stochasticity, specifically stochastic synthesis of the protein and stochastic partitioning during cell division.

The different models for unregulated gene expression show that there is no dominant source of stochasticity, as switching off one or the other source of noise leads to similar results (we note, however, that 'bursting' in protein synthesis, which we did not discuss here, can be dominant [9, 16, 23]). We found similar behavior for the models with negative autoregulation and explicit linear volume growth, but for these models the relation between the noise parameter $\eta^2$ and the average protein number shows a crossover at protein concentrations at which autoregulation becomes important. Fluctuations are suppressed by negative autoregulation, as expected for such control systems and known from previous studies [16, 25]. Specifically, our results show that fluctuations arising from both sources (stochastic synthesis and stochastic partitioning) are suppressed by negative autoregulation, as shown by the approximately threefold reduction in the Fano factor for large values of $\langle P\rangle$, where the autoregulatory mechanism becomes active.

Acknowledgments

The authors would like to thank Veronika Bierbaum for useful discussions during the course of this study, and Angelo Valleriani especially for the calculation presented in the appendix.

Appendix. Calculation of moments for stochastic partitioning using generating functions

The case of deterministic addition of Q molecules during the cell division cycle and stochastic partitioning during cell division, depicted in figure A.1, can be solved using the method of generating functions [33]. In this case the number of protein molecules will follow a binomial distribution at the time of cell division with parameters Q and r. Thus for the generating function we have

Equation (A.1)

Let X and Y be two binomially distributed random variables. Let $g_{X|y}(v_1)$ be the generating function of the variable X conditioned on a fixed value y of the variable Y, such that

Equation (A.2)

and let $g_Y(v_0)$ be the generating function of the variable Y, such that

Equation (A.3)

Then we can write

Equation (A.4)

Figure A.1. Depiction of the model with deterministic synthesis and stochastic partitioning in cell division. We start with N0 = Q particles. At every step (generation) the cell divides into two daughter cells and each protein goes to one of the daughter cells with probability r = 1/2. Between two divisions, a constant number Q of proteins is added, corresponding to a synthesis rate α = Q/T. Only the rightmost branch of the tree diagram, i.e. one lineage of cells, is considered.

Now let N1 be the random number of molecules in box 1 (see figure A.1). Its probability distribution is given by

Equation (A.5)

In the next generation, Q particles are added and the total is then distributed randomly between the daughter cells. Thus in the next generation we need to divide N1 + Q molecules, with the distribution

Equation (A.6)

Then the unconditional probability $\mathcal {P}(N_2=k)$ is given by

Equation (A.7)

Switching back to generating functions and taking into account that y = N1 + Q, we find that

Equation (A.8)

which can be generalized to m subdivisions leading to

Equation (A.9)

This expression is sufficient to evaluate the moments. Let us call the product in the bracket $F_m$, so that $g_m = F_m^Q$, and let $v = v_{m-1}$. Also, we have

Equation (A.10)

for 1 ⩽ j ⩽ m . This derivative is equivalent to

Equation (A.11)

for 0 ⩽ l ⩽ m − 1. Then we have

Equation (A.12)

and for the second derivative, with $\frac{\mathrm{d}^2 v_j}{\mathrm{d}v^2} = 0$, we have

Equation (A.13)

both of which need to be evaluated at vj = 1. We get

Equation (A.14)

which in the limit m → ∞ and for r = 1/2 leads to the average value E[n|Q] = Q. On the other hand, the second derivative gives:

Equation (A.15)

which for m → ∞ and r = 1/2 gives $E[n(n-1)|Q] = Q^2 - Q/3$. Finally we obtain the variance as

$$\mathrm{Var}[n|Q] = E[n(n-1)|Q] + E[n|Q] - E[n|Q]^2 = \frac{2Q}{3} \qquad (\mathrm{A.16})$$

which gives $\eta^2 = 2/(3Q)$.
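These limiting values can be cross-checked without generating functions by iterating the first two factorial moments directly. The sketch below is an independent check, not the derivation above: it uses the facts that adding Q molecules shifts n to n + Q and that binomial partitioning with probability r multiplies the k-th factorial moment by r^k. The function name and the choice of 60 iterations are illustrative assumptions.

```python
from fractions import Fraction

def moments_after_divisions(Q, r, m):
    """E[n] and E[n(n-1)] after m rounds of 'add Q molecules, keep each with prob. r'."""
    mean = Fraction(0)    # E[n]
    fact2 = Fraction(0)   # E[n(n-1)]
    for _ in range(m):
        # adding Q: E[(n+Q)(n+Q-1)] = E[n(n-1)] + 2*Q*E[n] + Q*(Q-1)
        fact2 = fact2 + 2 * Q * mean + Q * (Q - 1)
        mean = mean + Q
        # binomial partitioning scales the k-th factorial moment by r**k
        mean, fact2 = r * mean, r * r * fact2
    return mean, fact2

Q, r = 20, Fraction(1, 2)
mean, fact2 = moments_after_divisions(Q, r, m=60)
variance = fact2 + mean - mean ** 2
print(f"E[n] = {float(mean):.6f}")
print(f"E[n(n-1)] = {float(fact2):.6f}  (Q^2 - Q/3 = {Q * Q - Q / 3:.6f})")
print(f"variance = {float(variance):.6f}  (2Q/3 = {2 * Q / 3:.6f})")
```

After a single division (m = 1) the recursion reproduces the binomial moments of N1, and for large m it converges geometrically to E[n|Q] = Q and a variance of 2Q/3.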
