Bulletin of Mathematical Biology
Printedin GreatBritain
Vol.45, No.4, pp. 521-554, 1983.
0092-8240/8353.00 + 0,00 Pergamon Press Ltd. Society for Mathematical Biology
STOCHASTIC THEORY OF POPULATION GENETICS 9 TAKEO MARUYAMA National Institute of Genetics, Mishima, Shizuokaken 411, Japan Stochastic models of population genetics are studied with special reference to the biological interest. Mathematical methods are described for treating some simple models and their modifications aimed at the problems of the molecular evolution. Unified theory for treating different quantities is extensively developed and applied to some typical problems of current interest in genetics. Mathematical methods for treating geographically structured populations are given. Approximation formulae and their accuracy are discussed. Some criteria are given for a structured population to behave almost like a panmictic population of the same total size. Some quantities are shown to be independent of the geographical structure and their dynamics are described.
1. Introduction. Stochastic theory has a rather long history in population genetics. The model often referred to as the Fisher-Wright type was first studied by Fisher (1922) and then extensively by Wright. The issue which made Fisher formulate the model was the time (generations) required for genetic variation existing in a population to be lost by random drift due to sampling of gametes from one generation to the next. He devised a diffusion equation to describe the gene frequency change of a mutant in question and obtained the answer, which was then rather surprising: namely, the rate at which the genetic variability decreases in each generation has been shown to be approximately equal to the reciprocal of the population size. In 1931 Wright published his model of evolution based on the stochastic theory. In his model of evolution, organisms are viewed as a highly interacting complex system of many genes. Therefore in order to move from one adaptive system to another, possibly to a better system, it is necessary to pass through systems which are inferior in view of the natural selection. On the other hand, the natural selection acts only to increase immediate fitness, but not necessarily the fitness for the future, unless the interacting complex of genes is an additive system of some kind. Hence if the natural selection acting on the interacting system of genes is to determine the course of evolution, it may end up at a deadlock and may never move out to a better system. Therefore it is necessary to have some force which can bring a species from one well-adapted system to a system from which the species can pass to a better adapted one by a force counteracting to the natural selection. 521
522
TAKEO MARUYAMA
Wright (1931) thought that the force which can move a species out of one adapted system to another system leading to still other systems was random genetic drift of various kinds. There are many factors which can generate random drift, for example the random sampling of gametes, the fluctuation of environment, random immigrations from outside populations, and so forth. Certainly some random drift, such as drift due to sampling of gametes, works efficiently in small populations. Therefore, if a species consists of many small local demes which are connected by migration, the random drift can act in each of the demes almost independently and produce fitness in some local populations to various less-adapted systems to which natural selection would not lead. However, being in such less-adapted systems of many kinds opens up a possibility which eventually leads some of the demes to higher systems. Once a deme reaches a better adaptation it will spread out to its neighboring demes by migration and natural selection, and eventually to the entire population by spreading into all the demes. This is Wright's model of evolution. The random drift and the geographical structure are probably the two most important factors in the model. Wright (1931) developed much of the mathematical machineries and obtained marly important results. One of them is the correct determination of the rate at which the genetic variation is lost. In the 1931 paper Wright had many important concepts in mind and had worked out problems relevant to his model of evolution. His models are becoming increasingly important in population genetics and the study of evolution, which are becoming very topical subjects, particularly after the spectacular development of modern molecular biology. It is, however, important to note that though many individual problems have been worked out by Wright, Kimura, Mal6cot and many others, Wright's model of evolution still remains very much in a conceptual stage. It is therefore necessary in the future to develop adequate mathematics and investigate the model of evolution in which the stochastic processes play vital roles. This is also a challenging issue which may have a large impact in modern biology and in understanding of evolution. For Wright's model of evolution see Wright (1931, 1949, 1969, 1970). While Wright has been working on his model of evolution he has developed many important theories of population genetics, and has given foundation to much of the current population genetics theory. Steady distributions of gene frequency, the fate of mutant genes in a population, isolation by geographical distance, Wright's F-statistics describing the population structure and many other theories have been advanced by him, though not necessarily solved completely.
2. Theory of Panmictic Populations.
One of the Fisher-Wright models considers a locus at which two alleles, say A and a, are segregating in a popu-
STOCHASTIC THEORY OF POPULATION GENETICS
523
lation. We will denote the relative frequency of A to a in the population by x or y. Hence the frequency of a is 1 - - x or 1 - - y . We assume that the fate of the alleles is determined by various factors, some stochastic and other deterministic, and that the alleles obey the laws of Mendelian inheritance. Then the temporal development of x and y forms a Markov process which can be approximated as a classical diffusion (Ethier and Nagylaki, 1980). Therefore, if we let Max and Vax be the infinitesimal mean change and variance of the gene frequency we have a pair of Kolmogorov equations, namely 02 = 1 02 { Vay ~} _ at 2ay2
0
-~y { Maydp}
(2.1)
and
a~_ vax a2O + Max O~ at
ax
(2.2)
where ~b = r x, y) is the probability density that the frequency of A is y at time t, given that it is x at t = 0. Equation (2.1) has been introduced by Kolmogorov (1935) and Wright (1945) and has been used extensively (Wright, 1945, 1948, 1949, 1969; Kimura, 1954, 1955a;Crow and Kimura, 1970). Among many studies using equation (2.1), Kimura (1955b) obtained, for the first time, an eigenfunction expansion of the fundamental solution r x, y) for the special case where May = 0 and Vay = y(1 - - y ) , which represents the process of a pure random drift due to the sampling of gamates. Some processes have steady states to which ~(t, x, y) converges and becomes independent of the initial frequency x. For such cases we have equilibrium distributions of y given by 2
,
(2.3)
which is called Wright's formula. Although they are the adjoints of each other, equation (2.2) has been shown to be more useful than equation (2.1) and has provided much information of biological interest. One of the first uses of equation (2.2) was to find out the probability of a mutant gene increasing its frequency and becoming eventually fixed in a population (Kimura, 1962). Ewens (1963) also used the same equation to obtain the average sojourn time of x at given values, and this is related to the distribution of the gene frequency. Feller (1951 ) demonstrated how to derive equations (2.1) and (2.2) from a discrete Markov chain which can be directly formulated from a discrete
524
TAKEO MARUYAMA
generation model of finite size. Feller (1954) gives all t>e necessary mathematics to deal with equation (2.2) and in this sense the problems which can be formulated as one-dimensional classical diffusion processes are completely solved, at least mathematically. In addition to the fixation probability of mutant genes and the sojourn time, many other problems of population genetics have been answered by the solution of equation (2.2). These are the time required for a mutaiat to become fixed in a population, the number of individuals who are to become carriers of a particular gene, the number of heterozygotes to be formed, and so forth (Kimura and Ohta, 1969; Maruyama, 1972; Li, 1973). A well-studied special case of biological importance is the Wright-Fisher model of finite population. We assume that a population consists of N diploid individuals (2N genes) and that two alleles A and a are segregating at a locus. Denoting the frequency of A by x, we have the following cases of particular importance: Vsx = x ( 1 - - x ) / 2 N ,
(2.4)
representing the variance component due to the binomial sampling of gametes from one generation to the next; V6x = ax2(1 - - x ) 2,
(2.5)
the variance due to the fluctuation o f the relative fitnesses o f A and a; Msx = 0,
(2.6)
representing no systematic pressure; M~,, = v - - (u + v ) x ,
(2.7)
a case where allele A mutates to a at a rate of u and the reverse rate v;
Mnx =
x(1 -- x) dW 2 dx'
(2.8)
a general selection model, where 9 = W l l X z + 2w12x(1 - - x ) + w22(1 - - x ) 2, with w11, wx2 and w22 being the relative fitnesses of A A , A a and aa respectively; Msx = sx(1 -- x),
(2.9)
STOCHASTIC THEORY OF POPULATION GENETICS
525
a special case of (2.8), where wit = 1, wt2 = 1 + s and w22 = 1 + 2s; and
Msx = sx(1 -- x) (1 -- 2x)/2,
(2.10)
another case o f (2.9), where w~x = w22 = I and w12 = 1 + s. If s > 0 in (2.10), the model is called overdominance, and if s < 0 underdominance. Various combinations of these M~x and V~x describe important models o f population genetics. For instance, the neutral hypothesis o f molecular evolution is the process governed by M6x = 0 and Vsx of (2.4) (Kimura, 1968). The neutral hypothesis of molecular polymorphisms is given by M6~ = - u x o f (2.7) with v = 0 and Vsx o f (2.4) (Kimura and Ohta, 1971; Nei, 1975). Another example is the Hb S gene of the sickle cell anemia in African black populations. Individuals carrying two Hb S genes are likely to develop sickle cell anemia, but those carrying one Hb S and one normal gene are more resistant to malaria than normal homozygotes. Thus wt2 > wtl > w22 ~ 0, and the Hb S genes are maintained in the populations by overdominant selection. Interestingly, the frequency of Hb S in American black populations has been greatly reduced because being more resistant to malaria is no longer advantageous in America, i.e. wx2 ~ wn > w22. Presumably the selection on the sickle cell anemia homozygotes (Hb S/Hb S) has caused the decrease in the Hb S frequency. Although originally each problem was solved independently of the others, it is nice to have a unified theory which includes those individual problems as special cases. For that, consider an arbitrary function f(x), and let
F(n)(x) = E { [ ~ o t~ f ( x ( w , t))] n } ,
(2.11)
where r indicates a particular sample path of the process, rio the time when the path ~o exits from the interval (0, 1) and E{} the expectation with respect to the sample paths. Then F('O(x) represents the expectation of the nth m o m e n t of the sum o f the quantity given by f ( x ) from t = 0 until the exit time ro~, given that the initial gene frequency is x. For example, if we want to measure the total heterozygosity we may put f ( x ) = 2x(1 -- x). Now let the second-order differential operator
V~x d 2 d L - = - 2 dx 2 + M ~ x d- -x"
(2.12)
Then it can be shown that F (n) satisfies
LFO)(x) + f ( x ) = 0
(2.13)
526
TAKEOMARUYAMA
and
LF(nl(x)+nF(n-1)(x)=O f o r n
= 2, 3 , . . . .
(2.14)
For equation (2.13) and (2.14) we need to impose appropriate boundary conditions. In many cases they are F(n)(0) = F(n)(1) = 0. The fixation probability of mutant genes is given by the solution of
Lu(x)
= O,
(2.15)
satisfying u(0) = 0 and u(1) = 1 (Kimura, 1962). In general, we are concerned with an ordinary second-order differential equation of the form
LY(x)+K(x) = O,
(2.16)
where K(x) is a given function o f x and Y(x) satisfies a boundary condition, say Y(0) = Y(1) = 0 or Y'(0) < ~ and Y'(1) < ~ (see Nagylaki, 1974). Equation (2.16) can be integrated and we have
Y(x) =
{ 1 -- u(x)}
j.x Ok(~)u(~)d~ + u(x) o
~k { 1 -- u(~)}d~, (2.17)
where ~bk(~) = 2k(~) f01
G(O)dO/{V~tG(~)}
u(~) = fo I G(O)dO/fol G(O)dO
(2.18)
and G(~) = e x p { - - 2
f: (M~o/V~o)dO}.
(2.19)
If we put f(x) = 6(x --y), where ~(.) is the Dirac delta function, then the unique solution, denoted by q~(x, y) of equation (2.13) with boundary conditions 4~(0, y) = 4~(1, y) = 0, gives the average sojourn time at y for the paths starting at x. For general theory see Dynkin (1965) and for application to genetics see Maruyama and Kimura (1971, 1975), Nagylaki (1974) and Maruyama (1977).
STOCHASTIC THEORY OF POPULATION GENETICS
527
In population genetics we sometimes want to distinguish a certain kind of sample path. For instance, we want to know a special quantity for those paths destined for the fixation o f a particular mutant gene. Then it is quite simple. Let us assume that we want to know about those paths in which allele A, whose initial frequency is x, is ultimately fixed in the whole population. Then let u(x) be the fixation probability, given by the solution of (2.15), and define a new operator 1
LI - u(x )LU(x ),
(2.20)
where L is that given by (2.12). Let us denote by F}n)(x) the function defined in (2.11), except that the sample paths concerned here are only those in which allele A becomes eventually fixed, excluding sample paths in which A is lost. Like F(n)(x), Fl~(x) satisfies (2.21)
LIFp)(x) + f i x ) = o and
L1F~n)(x) + nF(n-1)(x) = O, with F~n)(0) = F~n)(1) = 0. Substituting 8(x - - y ) for f(x) in (2.21) and solving for F~l)(x), we obtain the average sojourn time ~l(x, y) for the paths starting from x and going to fixation: q51(x, y) = 2u(y){ 1 - - u ( y ) } / [
Vsyu'(y)} for y ~> x (2.22)
01(x, y) = 2{ 1 --
u(x)}u2(y)/[ u(x)Vsyu'(y)} for y ~
where u'(y) = du(y)/dy. In an important case in which a single mutant gene is destined to eventual fixation the average number of generations which the mutant allele spends in the interval between x and x + dx is given by ~ l ( x ) = 2u(x){ 1 --
u(x)}/[ V~xu'(x)},
x >1
1 2N
(2.23)
For a neutral m u t a n t this reduces to q51(x) = 4N, showing that the mutant spends, on average, two generations at each frequency class, i.e. x = 1/2N, 2/2N .....
Another important case of choosing sample paths is related to the problem o f estimating the age of mutant genes based on their present frequency.
528
TAKEO MARUYAMA
Here we assume that every mutant is new to the population and the mutation rate per gene per generation is u. Then the function needed for modifying the operator is the solution of
Lob(x, y) + 6(x --y) = O,
(2.24)
where L is given by (2.12) and 6(') is the Dirac delta function. With this choice of the modifying function let
Ly
z
r-
-
1
y) Lob(x, y).
(2.25)
With the aid of this operator we can investigate the history o f a sample path of an equilibrium population, given that the gene frequency is in (y, y + dy) at present. If we let A(x, y) be the age of an allele whose frequency is y, then it satisfies
LyA(x, y) + 1 = 0
(2.26)
or, equivalently,
LG(x, y) + d#(x, y) = O, where G(x, y) = r y)A(x, y). For instance, if we assume that every mutant is selectively neutral, the average age of a mutant whose present frequency is y is given by A (2.27) where F = 1 -- 4Nu. Some numerical examples are given in Table I (Maruyama, 1974). We can now examine some rather realistic cases. Most existing data on protein polymorphisms suggest that the value of 4Nu is between 0 and 0.5. Here we assume the neutral hypothesis of polymorphisms. If we take, for instance, the column in Table I headed by 0.2, an allele of frequency 0.01 would have persisted about 4N • 0.1 = 0.4N generations in a population. Therefore if N = 10 s the allele would be on the average 40,000 generations old. Likewise, if the frequency is 0.5 the allele would have an average age of about 6 X 4N = 24N generations, and if N = 105 the gene would be more than two million generations old. Therefore a gene of high frequency can be very old if the population size is large.
STOCHASTIC THEORY OF POPULATION GENETICS
~
0
~
~
Z 0
<
0
~
~
0
~
,,-i 0
t"4
d
<
d d d d o d ~
t--i ;~
o
o"
9
('-1
dddddddddd ;~ o t"-I
0 0 0 0 0 0 0 0 0 0
0 0 r~
>
dodddododd 0 t"-I
dodddddddd 0
Z
dddddddddc~
529
530
TAKEO MARUYAMA
While the mean age of a gene is given by formula (2.27), the second m o m e n t of the age is given by 8N y {1--(1--~)F}m(~)d~+{l_(1 [fo --F~( 1 - - ~)F
_y)F}
f,
m(~)
~( 1 - - ~)
Fd Jl (2.28)
where
' Efo
m(~) = ~
0(1 -- 0) P
+ { 1 -- (1 -- ~)F}
;i 0(1 -'- 0 ) Pd0 ] "
Analaysis of the age problem can be treated by the time reverse method of Nagasawa (1964) and can be greatly simplified (Nagasawa and Maruyama, 1979; Maruyama and Fuerst, 1983). The first arrival time can be treated by a method similar to that used in the analysis of the age. Suppose that the frequency of a particular allele in question is x at t = 0 and, further, that the frequency of that allele reaches y at least once before it becomes extinct. Then the probability that the above event happens is equal to 1 --(1
--x) F
uy(x) = 1 - - ( 1 = 1
__y)F f o r x
(2.29)
forx >y,
where F = 1 -- 4Nu. This formula is valid for the neutral allele model where every mutant is new. Under this hypothesis a trivial case would be x = 1 and uy(1) = 1 for ally. Define a new operator 1
Lr - uy (x ) Luy(x ),
(2.30)
where L is given by (2.12). Let fy(x) be the mean first arrival time that an allele starting from frequency x arrives at y for the first time. Then the mean first arrival time fy(x) satisfies an equation similar to (2.26) in which Ly is replaced by Lf of (2.30). Two particular solutions of our interest are
fy(0)
4N{ 1 -- (1 -- y)P} ( , up(G){ 1 -- uy(~)} a~ F Jo ~( 1 -- ~/)F -'-
(2.31)
STOCHASTIC THEORY OF POPULATION GENETICS
fy(1)=-~ -
(l--y) F
/j(l_/j) F
+logy
.
531
(2.32)
Formula (2.31) gives the mean of the first arrival time for a mutant to reach frequency y, given that it starts from a very low frequency and that it reaches y at least once. Besides the mean and variance of the first arrival time, the distribution of the time is also important and can be studied using the operator defined by (2.30). Let fy(t, x)dt be the probability that the first arrival time falls in an interval (t, t + dt). Define
Fy(t, x) - f : g(~,
x)d/j.
Then the cumulative distribution o f the first arrival time satisfies
aF/t, x) at
= LflFy(t,
x)],
(2.33)
where L: is given by (2.30). The Kolmogorov backward equation (2.33) can be easily integrated numerically, though an explicit analytical solution may be difficult. Figure 1 shows graphs offy(t, x) with y = 0.5 and x = 1.
15
10 Z0
C~ .1 .2 .3 ./, .5 .6 .7 .8 .9 1.0 1.1 1.2 TIM E (N-GENERATIONS)
Figure 1. Distribution (density) of the first arrival times from x = 1-0.5. The abscissas are the time measured in units of N generations.
532
TAKEO MARUYAMA
3. Geographically Structured Populations. In this section we deal with problems whose exact solutions are not known. The methods used are intuitive and cannot be regarded as proven. However, they give a qualitative indication of where the solution lies and in many cases have been checked by computer simulation. They should have heuristic value and I hope that they may point the way to rigorous exact solutions at some later time. A population under consideration consists of a finite number of colonies (or lines). Each colony is subject to extinction with a certain probability, and when a colony becomes extinct it is immediately replaced by a new colony formed by taking some individuals from one of the colonies, including the one to become extinct. We assume that a new colony consists of an arbitrary number of randomly chosen individuals. Every mutant is new, as we assumed in the preceding section. The population structure is assumed to be arbitrary but in a stable equilibrium. Using Kolmogorov backward equations, we will study this general model of structured poulations. We will show that the method is useful in studying the model for various quantities of biological interest, though the analyses are approximations. One of the emphases in this section is to see how closely a structured population behaves like a panmictic population, and to reveal the parameters which determine the fate of genes. Let us define the following quantities: n = number of lines, fixed; N/ = number o f individuals in colony i; .~ = the arithmetic mean ofNi; N = the harmonic mean o f Nt; ?~ = extinction rate of a colony; v = mutation rate; s = selection coefficient of a mutant allele; h = degree of dominance; xi = frequency of a mutant allele in colony i; P!
x = n?7 ~ NiX/= the mean frequency; i--1
and r x, y ) = transition probability density for the mean frequency change. In order to derive a Kolmogorov equation for 4>(t, x, y) we need the mean and variance of ZLvt = (xt+ 1 -- xt), where t denotes time in units o f generation. Simple algebraic calculation gives
v+x=
x --E(x~)
+
2X{ E ( x ~ ) - - x 2} I -
n
{x -- E(x/2)} + 2)QX{E(x~) -- x 2 } nN
(3.1)
STOCHASTIC THEORY OF POPULATION GENETICS
533
and
Max = - v x for neutral mutation,
(3.2)
msx={x(1--xt) I 9
2wi
dxi
Nt for the selection model,
(3.3)
in which E{ } indicates the expectation over all the colonies and wi = 1 +
sx 2 + 2shxi(1 -- xi). Therefore the Kolmogorov equation for the neutral mutation is
a._~ = Vax a2r at
2
vxa._.~
ax 2
(3.4)
ax
and for the selection model a~b Vax a2~ + 8 9 a"t- = 2 ax 2
dwi}a(9 wi
dx~
ax"
(3.5)
These equations do not have an explicit migration term, but the coefficients involve the local gene frequencies xi. Equations (3.4) and (3.5) are very difficult for exact analyses, but they can be useful in studying some special cases. First, we assume that the migration is extremely restricted and almost always each local colony is either fixed with a particular allele or lost, i.e. xi ~ 1 or xi "~ O. With this assumption
E(x 2) ~ x
(3.6)
x - - E ( x ] ) .~ O.
(3.7)
and therefore
Noting this relation in (3.1), equation (3.4) reduces to aq~ = Xx( 1 -- x) a2~b at
n
ax 2
a~b vx
ax
(3.8)
This asserts that if the migration is strongly restricted and the local heterozygosity is low, the process can be approximated by a panmictic population of the effective size n/(2X).
534
TAKEO MARUYAMA
If the whole population is nearly panmictic, we have E(X~) ~ X 2
and therefore E ( x ~ ) - x 2 ~ O.
Substituting this relationship in (3.1) and (3.4), we have ar at
x(1-x) =
a2~
2n~
Ox 2
a~ PX
bx"
F i x a t i o n probability o f m u t a n t genes. A special case of no dominance and no extinction can be exactly solved. If ~ = 0, h = 0.5 and wt ~ 1, then the equation for the fixation probability u(x) of a mutant with initial frequency x is given by the solution of 1
2 n N { x -- E(x])}u(X)xx + s{ x -- E ( x ] ) } u ( x ) x = O.
Eliminating a common factor in {}, we have u(X)xx + 2nslTu(x)x = O,
(3.9)
which is exactly the same as that of a panmictic population of the effective size n-N. Still assuming X = 0, let us investigate limiting cases of extremely restricted migration for a general selection scheme. If the migration is strongly limited we expect every xi to be either close to 1 or close to 0. Furthermore, the fraction of xls close to 1 must be x and that of xis close to 0 must be 1 -- x because x is the mean o f x i. Let el be a small quantity such that either x i = 1 --ei or xi = ei holds. We can assume, without loss o f generality, thatxt= 1--e tfori= 1, 2 . . . . . k a n d x i = e t f o r t = k + 1 . . . . ,n, i n w h i c h kin ~ x. We have the following approximations:
E(x~)=
~(1--ei)3+ i=1
~ l=k+ 1
et3
~x--3x.e(x)+3xe(x2),
(3.11)
STOCHASTIC THEORY OF POPULATION GENETICS
535
where e(x) = E{ el} and e(x 2) = E{ e~} for given x. Note e(x) -* 0 as x ~ 0 or x -~ 1. Assuming X = 0 and thus IV = N, we have x -- E { x ~ } d2u(x) du(x) 2nN ------Sdx + E[xl(1 --xt)(s(1 -- 2h)xi + sh}] ~ = 0.
(3.12)
Applying (3.10) and (3.11) to the coefficients of (3.12), E{xi(1 -- x/)[s(1 -- 2h)xt + sh] } = E{ (x 2 -- x/a)s(1 -- 2h) + (xi - - x ~ ) s h } ~, xe(x)s(1 -- 2h) + 2 x e ( x ) s h = s x e ( x )
(3.13) and x -- E ( x ] ) = E{ xi( 1 -- xi)} "~ 2xe(x).
(3.14)
The the variance V~x ~, 2 x e ( x ) / n N and Msx ~ s x e ( x ) , and for u ( x ) d2u(x) dx
du(x) + nNs~ = O. dx
(3.15)
The solution is given by 1 u(x) -
-
-
1 --
e -nNsx e -nNs
"
(3.16)
Therefore in the limit of low migration the fixation probability converges to that of an additive gene in a panmictic population. Using a different method, Slatkin (1981) has obtained the same result. The selection scheme given in (3.3) is certainly general and covers most situations. However, an exactly symmetric overdominance model must be formulated in a slightly different way. Now assume that the relative fitnesses o f A A , A a and aa are 1 - - s , 1 and 1 - - s respectively. Thus wi = s { x ~ + (1--xt)2}.
Assuming wt -~ 1, we have M6xi = E{s( 1 -- 2x~-)xt(1 -- xt)}.
536
TAKEO MARUYAMA
Applying the approximations (3.10) and (3.11) to the above formula, we have Max "~ -- 3se(x)2( l -- 2x).
Now assume X = 0; then x -- E(x~)
2xe(x)
2xe(x)
Nn
nN
nN
Therefore 2Mex
Vex
.~
6se(x)2( 1 -- 2 x ) n N
2x e(x )
= -- 3 s n N
1 -- 2x
x
e(x).
Note that the above expression is proportional to e(x). This implies that if the migration is strongly restricted, the process of the symmetric overdominance model becomes very much like that of the neutral model. Still assuming strongly restricted migration and that (3.13) and (3.14) hold, let ~ > 0 and thus N 4: N. Then the variance given in (3.1) becomes
Vex ~-
2xe(x) + 2 ~ ( x - x 2) n~l
(3.17)
and the mean Mex ~ xe(x){ s(1 -- 2h) + 2sh} = xe(x)s.
(3.18)
Therefore we have for u ( x ) d2u(x) dx 2
{ +
n~Isxe(x) } d u ( x ) = o. x e ( x ) + 2](/'a(x--x 2)
(3.19)
If L~r is large and e(x) is small for all x the selection term in the above equation becomes small and the fixation probability approaches that of a neutral gene given by u ( x ) = x. Therefore if the extinction rate is high and the local heterozygosity is low, the fate of a mutant gene becomes very much like that of neutral genes. Next, assume that the whole population is nearly panmictic and therefore E(x 2) "~ x 2 and E(x~) ~ x 3.
STOCHASTIC THEORY OF POPULATION GENETICS
Under this assumption the term {x -- E(x~)}/nN X x(1 -dominates the other term 2~{ E(x~)}/n. Thus
x)/n~l
537
in (3.1)
x(1 - - x )
V~x ~
nl(l
and
M~x
~ x(1 -- x){ ( 1 --
2h)sx + sh}.
Thus
~o G(x)dx u(x)
=
,
(3.20)
Jo~ G(x)dx in which 2
The formula given in (3.20) is identical to that o f a panmictic population o f the effective size n ~ / ( K i m u r a , 1962). Some examples showing the convergence o f the fixation probability to that o f formula (3.16) are given in Table II. In case 1 o f the simulations the mutant allele is completely recessive and in case 2 it is completely dominant, while the homozygotes for the mutant have identical fitness (s = 0.1). It is worth noting that in case 1 the fixation probability increases as the migration rate decreases, while a reverse trend is apparent in case 2. A recessive, advantageous mutant has less opportunity to become fixed in a population than a dominant, advantageous mutant of the same effect in homozygotes. This is consistent with the values o f U(Xo) given in the two examples of cases 1 and 2. Cases 3 and 4 are deleterious genes completely recessive and dominant respectively. Since the mutant is recessive in case 3 the fixation probability is higher in a panmictic population than that in a strongly structured population, while the situation is reversed in case 4 for a dominant allele. Case 5 is an example of negative heterosis in which the fixation o f a mutant allele is much restricted in panmictic populations. As the migration becomes less frequent and the local heterozygosity declines, the process converges to a neutral model o f the effective size nN. This situation is clearly shown in case 5.
538
TAKEO MARUYAMA
TABLE II Some Numerical Examples to Demonstrate the Convergence of the Fixation Probabilities to Those Given by the Formula (3.16) as the Migration Rate (m) approaches Zero m
U(Xo) m u(xo) m U(Xo) m
U(Xo) m U(Xo)
Case 1 : s = 0.2, h = 0 (recessive, a d v a n t a g e o u s m u t a n t ) 0.5 0.25 0.1 0.05 0.01 0.001 formula 0.0532 0.0588 0.0674 0.0726 0.0802 0.0810 0.0952 Case 2: s = 0.1, h = 1 ( d o m i n a n t , a d v a n t a g e o u s m u t a n t ) 0.5 0.15 0.01 0.05 0.01 0.001 formula 0.1378 0.1314 0.1214 0.1056 0.0970 0.1012 0.0952 Case 3: s = 0.05, h = 0 (recessive, h a r m f u l m u t a n t ) 0.05 0.25 0.1 0.05 0.01 formula 0.0096 0.0066 0.0050 0.0060 0.0056 0.00458 Case 4: s = --0.05, h = 1 ( d o m i n a n t , h a r m f u l m u t a n t ) 0.05 0.25 0.01 0.05 0.01 formula 0.0032 0.0034 0.0032 0.0042 0.0050 0.00458 Case 5: s = 0.001, h = - - 5 0 (negative, o v e r d o m i n a n t m u t a n t ) 0.5 0.25 0.1 0.05 0.01 0.001 formula 0.0082 0.0122 0.0132 0.0180 0.0188 0.0204 0.0200
(3.16)
(3.16)
(3.16)
(3.16)
(3.16)
The simulations were based on a circular stepping stone structure (n = 5 and N = 10) X = 0(N = N) and each fixation probability u(xo) was calculated from the results of 5000 duplicates (x 0 = 1/nN = 0.02). The values given by formula (3.16) are the fixation probability of the limiting cases.
TABLE III Numerical Examples showing the Effect of Colony Extinction on the Fixation Probability m
U(Xo) m
u(xo) m
U(Xo) m
U(Xo) m
U(Xo)
Case 1 : s = 0. I, h = 0 (recessive, a d v a n t a g e o u s m u t a n t ) 0.5 0.25 0.1 0.05 0.01 0.001 n e u t r a l limit 0.0182 0.0116 0.0110 0.0128 0.0134 0.0118 0.0100 Case 2: s = 0.1, h = 1 ( d o m i n a n t , a d v a n t a g e o u s m u t a n t ) 0.5 0.25 0.1 0.05 0.01 0.001 n e u t r a l limit 0.0330 0.0224 0.0164 0.0138 0.0122 0.0114 0.0100 Case 3: s = - - 0 . 0 5 , h = 0 (recessive, deleterious m u t a n t ) 0.5 0.25 0,1 0.05 0.01 0.001 n e t u r a l limit 0.0082 0.0080 0.0076 0.0100 0.0114 0.0128 0.0100 Case 4: s = - - 0 . 0 5 , h = 1 ( d o m i n a n t , deleterious m u t a n t ) 0.5 0.25 0.1 0.05 0.01 0.001 n e u t r a l limit 0.0070 0.0060 0.0060 0.0106 0.0084 0.0072 0.0100 Case 5: s = 0.001, h = - - 5 0 (negative, o v e r d o m i n a n t m u t a n t ) 0.5 0.25 0.I 0.05 0.01 0.001 n e u t r a l limit 0.0068 0.0096 0.0096 0.0070 0.0098 0.0084 0.0100
In the simulations the circular stepping stone models of n = 5 where used and h = 0.2. Each colony consists of 10 individuals, and when the extinction occurs to a colony it is replaced by two randomly chosen individuals from one of the neighboring colonies and they expand to 10 individuals in one generation. Each fixation probability u(xo) was calculated from the results of 5000 duplicates (x o = 1/nN = 0.01).
STOCHASTICTHEORYOF POPULATIONGENETICS
539
In Table III the effect of colony extinction is demonstrated assuming X = 0.2. The variance and mean, given in (3.17) and (3.18), and the equation (3.19) indicates that if ~ is large and the migration is restricted, the fate of a mutant gene will be very much like that of a neutral gene whose fixation probability is equal to its initial frequency. The same is true if the local homozygosity is nearly one and the line extinction occurs quite frequently. This is irrespective of the selection scheme. We should recall that when X = 0 the fixation probability converges to the corresponding case of no dominance (formula 3.16) as the migration becomes less and the local homozygosity increases. Therefore the two situations of X = 0 and X > 0 can have quite different biological implications. Gene frequency distribution. Let us assume here that all the segregating alleles are selectively neutral and every mutant is new. Substituting Vsx of (3.1) into (3.4), we have
ar 8 9 = Ot
nN
+2X[(E(x~)--x2l n
] a2d~ 0(~ ~ -- vx-~x, Ox2
(3.21)
Where xt is the gene frequency in colony L x is the average of xts, and ~ = ~(t, x, y). Let
H(x) = x --E(x~) = E{ xi(1 --xi)},
(3.22)
which is equal to a half of the average local heterozygosity, and let ~(x, y)dy be the mean sojourn time that a particular allele starting from the initial frequency x stays in a dy neighborhood of y before becoming extinct from the population. Then ~(x, y) satisfies the differential equation: 89{ H ( x ) ( 1 - 2~')+n~ 2LK/(x--x2) } d2~dx 2
vx d~ + 8 ( x - - y ) = 0, dx
(3.23)
where 8 ( ) is the delta function. The solution of (3.23) satisfying the boundary conditions
9 (0, y) = 0 and d~(x, y) / = 0 (Ix Ix =1 is given by ~(Xo, y) - - -
540
TAKEOMARUYAMA
where F(y) = H(y)(1 -- 2L~) + 2L~y(1 -- y) and H(y) = E{y;(1 --y;)}. Note that H(y) ~> 0 and therefore if we apply the mean value theorem of integral calculus,
y 2nlVv~
fo
F--~ d ~ =
~(1 - } ) d} "y ( 1 -- })F(/~)
2Nnv Jo
2Nnvp( 1 -- p) F(p)
log(1 -- y),
where 0 ~<)~ ~
1 9 (Xo = n N ' y) by ~(y), we have 2Nnv~(1--~)
~b(y)
F(y)N (1 -- y)
,
(3.25)
where F(y) = H(y)(1 -- 2Lr(r) + 2L~y( 1 -- y) and 0 ~
F(y)=y(1--y)
Hfy) (1--2L~)+2L~]
y(1 - - y )
Substituting the above expression of F(y) into (3.25) and multiplying by Nnv, which is the average number of new mutants introduced each generation into a whole population, we have the density of the number of alleles at frequency y in an equilibrium population: 2~rnv
9 (y) =
21Vnv
i
y(1 - y ) (1 - 2x~O + 2 x ~
/
y-X(1 - - y )
--1
{kO--k) H(~)(l--2h/~r)+2~}
(3.26) where 0 ~<~ ~
STOCHASTIC THEORY OF POPULATION GENETICS
H(y)/y(l
541
-- y ) is the ratio of the average local heterozygosity to the
'global heterozygosity' (the heterozygosity of the whole population). Obviously, as the gene frequency approaches 0 the ratio tends to 1; namely, H(y) y(1--y)
~- 1 a s y ~ 0.
Therefore formula (3.26) asserts that for small values of y the spectral density can be approximated by ~b(y) = 2~lnv y - l ( 1 __ y)2ffnv-1
(3.27)
which is the same as that for a panmictic population of effective size 2qn. Hence the frequency spectrum of low-frequency alleles should follow closely that of a panmictic population (given by (3.27). This is quite important because in some hypotheses the distribution of low-frequency alleles is critical. The ratio H ( y ) / y ( 1 -- y ) is not constant but a function of y and it decreases monotonically as y tends to 0.5 and increases to 1 as y approaches 1, assuming a m i n i m u m at y = 0.5. However, the change is not very large unless the population structure is rigid and migration is extremely limited. In fact, most experimental data indicate values of the ratio to be usually larger than 0.7 or 0.75, and often about 0.9 or larger. Therefore, assuming the ratio to be approximately constant and denoting it by r, we have 2s 2Nnv
_ y){2xgO-O+, } -l
,I~(y) ~,{ 2k~t(1 - - r ) + r} Y-I(1
(3.28)
with r = H ( y ) / y ( 1 -- y ) , where the bars indicate the means. Note that (3.28) can be written as 9 (y) = txy-l( 1 -- y),~-x and that h -= ~o y(1 - - y ) ~ ( y ) d y = ot ~ot (1 - - y ) " dy = 1 ot + or"
(3.29)
This is the probability that when two genes are chosen randomly from the whole population they are different alleles. This is an observable quantity.
542
TAKEO MARUYAMA
Therefore, from formulae (3.27) and (3.29), we can say that the frequency spectrum density of alleles is very much like that of a panmictic population which maintains the same level of genetic variability. To confirm the above analysis on the frequency spectrum I have carried out several simulations, each with sufficient repeats. Agreement between the simulation result and the theoretical expectation in all cases was about the same as that presented in Figure 2.
2
0
0
I
I
I
I
I
0.2
0.4
0.6
0.8
tO
x
Figure 2. Comparison of the gene frequency distribution from a computer simulation to that of the theoretical expectation based on formula (3.28). In the simulation the parameters were set as follows: n = 10, N = 40, v = 0.000125. = 0.01, r = 0.639 and ~ = 28.6. The curve indicates the theoretical expectation and the dots the simulation results.
Local gene frequency. Treating the global gene frequency has not required steady distributions of the local gene frequencies. However, if the gene frequencies in local populations rapidly reach steady distributions, the following analysis may be important. Let A1, A2, ..., Ak be the existing alleles with reference to the whole population and let xj(0 be the frequency of allele Ai in colony j. Let
j
Although the value of k varies with time, it is always finite. We assume that the change in xl(O is much faster than that in x (0. Therefore if we let
STOCHASTICTHEORYOF POPULATIONGENETICS
543
q~(Sl, X2, ..., Xk) be a kind of pseudo-equilibrium distribution o f A1, A2, ..., Ak in colony j, we have, from Wright (1949), k
(~i(Xl, x2, . . ., x k ) = c ~
x4N jPj ~(i)-'1
(3.30)
i=1
where O/is the migration rate and xx + x2 + ... + xk = 1. Geographically invariant p r o p e r t y . Here we will show that some quantities associated with geographically structured populations are independent of the structure when we assume that the colony's extinction does not occur. Let X = 0 (no extinction o f colonies), h = 0.5 (additive fitness) and v = 0 (no mutation). Then equation (3.5) can be written a_~_~= H ( x ) a2~t) + s H ( x ) at 2n~ r a x 2 2
a•
(3.31)
ax'
where H ( x ) = x - - E ( x ~ ) = E { x i ( 1 -- xi)}, which is half of the average local heterozygosity. Note that the two coefficients given in (3.31) have a comm o n factor H(x); this is a biologically observable quantity. N o w define t = ~o '~ H ( x ( w , ~)d~,
(3.32)
where w indicates the sample path. With this definition we can regard r as a new time parameter. Biologically, r is the sum o f the heterozygotes, because H ( x ) = ~_ xi( 1 - - xi). i
Therefore the new time r,o can be different depending on the sample path. Mathematically, r,o is a non-negative additive functional of the process and thus can be used as a time parameter. Using rw, let q(r, x, y) be the probability density that the gene frequency o f allele A goes from x at rw = 0 to y at r,~ = r. Then equation (3.31) becomes aq at
1 a2q S aq 2nN ax 2 + 2 a x '
(3.33)
where q = q(r, x, y ) . Equation (3.33) has a two-fold consequence: namely,
544
TAKEO MARUYAMA
(i) the process governed by q(r, x, y) is a Browian motion; and (ii) it is independent of the population structure. The appropriate solution of (3.33) is q(r, x, y) = 2e -s(x-y) ~ e -(n''~+s')rl2n~ sin nrrx sin nrry, n=l
where S = s n N . Returning to equation (3.33), we will investigate various path integrals of the process governed by q(r, x, y). The ultimate fixation probability is not altered by going from the process governed by equation (3.31) to the process governed by equation (3.32) because in this transformation only the time scale is changed and the 'road map' remains exactly the same. Therefore the fixation probability u ( x ) can be obtained by searching for a solution of (3.33) satisfying u(0) = 0 and u(1) = 1. The appropriate solution is 1 --
e -ngsx
u ( x ) = 1 -- e - ' g s
'
(3.34)
which is the same as u ( x ) given in (3.16). Next, consider the time T ( x ) to fixation or extinction; then according to (2.13) T ( x ) satisfies 1
d2 T(x)
2nN
dx 2
s dT(x) +-~+1=0, 2 dx
with the boundary condition T(0) = T(1) = 0. The appropriate solution is
T(x) -
2{ u ( x ) - - x }
where u ( x ) is the fixation probability given by (3.34). If s = 0 then T ( x ) = nlVx( 1 -- x ) ,
and in particular if x = 1 / n N ,
STOCHASTIC THEORY OF POPULATION GENETICS
545
For s 4: 0, if nlVs >> 1 and s "~ 1, T
(1)
~2,
and if nNs ,~ -- 1 and is t ~ 1,
T
(1)
~ n--~s"
Mathematically, the quantity given by these formulae is the average time, measured in units of r, a sample path stays in (0, 1) before it goes to fixation or extinction. Biologically, they are the average of the sum of the heterozygotes which appear in the population. Therefore we have shown that the ultimate fixation probability of a mutant gene is independent of the geographical structure of population and, likewise, the sum of heterozygote individuals. These assertions have quite important biological implications. The fixation probability of a mutant gene is one of the most important factors in evolution because this makes species different from each other by incorporating different genes into each population. This probability is now shown to be independent of the population structure. It may be worth noting that the time of fixation measured in units of generations may change depending upon the population structure. Another quantity shown to be independent is the sum of the local heterozygosity formed among mutant genes. This is also interesting because the total amount of genetic variability maintained in a population or species may differ depending upon the population. Nevertheless, the local variation, or rather, the average local variation, is independent. Actually the ratio between the two variables is a good measurement in assessing the rigidness of the population structure or the degree of local differentiation of the population. Namely, in our present notation the ratio is H(x)/x(1 --x). The importance of this ratio was first emphasized by Robertson (1964) in studying the effect of population structure on the fate of mutant genes.
4. Numerical Analysis based on Stochastic Integrals. As we have seen in the preceding sections, most of the early stochastic models deal with one locus at which two alleles are present and where one-dimensional diffusion theory can be successfully applied. In these early studies most of the effort was spent in searching for analytic solutions of problems. Particularly when the subject can be formulated in one-dimensional theory, it can almost always be solved analytically (Kimura, 1964; Wright, 1969; Crow and Kimura, 1970; Maruyama, 1977; Ewens, 1979). On the other hand, except for some special
546
TAKEO MARUYAMA
cases which turn out to be important biologically, analytic solutions to multiple allele problems and linked locus problems have been very difficult to obtain. Experimental population genetics now strongly demands solutions to such stochastic models so that further biological insight can be obtained. In fact, many problems which now appear to be biologically realistic seem to be hopelessly difficult for analytic solution. Therefore it is natural that a number o f numerical methods have been developed. There are two basically different methods. One is to describe the process in a classical diffusion equation and solve it numerically. The other is to carry out simulations on the models. Since a diffusion equation or an equation derived from it gives the expectation of a quantity and the numerical solutions are more stable, it is usually more desirable to take this approach rather than simulation. However, perhaps with a few exceptions, the numerical integration of a diffusion equation is limited to processes with no more than two or three space variables. Recently we developed a new numerical method which enables us to handle various stochastic models in a very general form (Maruyama, 1980, 1981 ; Maruyama and Nei, 1981 ; Maruyama and Takahata, 1981). Because of the availability o f modern high-speed computers, the method proved to be useful, permitting us to parameterize the problem and seek a concrete answer. Here I outline the method and present some of the findings on various problems which are biologically important. Mathematical method. Let xi be random variables representing gene frequencies or gamete frequencies. Then the set of random variables xi satisfies a system of stochastic differential equations:
dxi = ~" aO(xl, x2,...)dB/+ bi(xl, x2, ...)dt,
(4.1)
/
where a 0 and bi are functions of x i and Bj independent Brownian motions. The coefficients a0 are elements o f the non-negative definite square root of the covariance matrix whose ith row, ]th column element is given by 1
a~i = lim - - E{ Ax~Axi} , , , ~ o At
(4.2)
where E{ } indicates the expectation. In practice it is usually easy to determine the covariance matrix [o//]. The [a0] is the square root o f the matrix [a0]. The other coefficient bs represent the infinitesimal mean change in xi; that is, 1
bt = lim - - E{ Axi}. At-*0 A t
(4.3)
STOCHASTIC THEORY OF POPULATION GENETICS
547
The differential of the Brownian motion represents the random change which occurs in an infinitely small time interval. In a special case of onedimensional processes oll of (4.2) and bl of (4.3) correspond to V6x and M~x of (2.2) respectively. Since (4.1) requires operations involving infinitesimally small time intervals it is not possible to simulate it by computer. As with differential equations of deterministic processes, (4.1) can be approximated by difference equations:
Axi "~ Z a//(X1, x 2 , - " .)AWl q- bi(xx, x2, ...)At,
(4.4)
i where AWj indicates a white noise with variance equal to the time increment At. Let tk be discrete time incremented by At for k = 1, 2, ... and xi(t k) be the random variable xi at time tk. Then we can rewrite (4.4) as
x,(tk+l ) = xt(t,) + ~. a~j(xl(tk), x2(tk), ...)AW i i + bi(xl(tk), x2(tk) .... )At.
(4.5)
This equation gives the values of xi at these discrete times tk but not between these specified times. Now let
xi(t) = xi(t~,)
(4.6)
for tk+l ~ t >1 tk. With this definition the path constructed by (4.4) or, equivalently, (4.5) is defined for all values of time, and each path forms a step function having discontinuity at tk. If we are dealing with differential equations of a deterministic process, it is almost obvious that solutions given by a difference equation converge to those of the original equations, as At becomes smaller. However, with a differential equation involving differentials of random processes this is not simple and in fact requires highly technical mathematics to show the convergence ofxi(t) given by (4.5) and (4.6) to the solution of the original equation (4.1) (Skorokhod, 1965; McShane, 1974). Equation (4.1) is a symbolic representation of the behavior of sample paths, and actually depending on the way of integrating (4.1), many different processes can be constructed which no longer satisfy the original equation (4.1). Therefore it is important to note that among the schemes of integrating (4.1) the one given by (4.5) and (4.6) simulates sample paths which converge to those specified by (4.1) (Wong and Zakai, 1965; Arnold, 1973; McShane, 1974).
548
TAKEO MARUYAMA
In the case of differential equations for deterministic processes there are many different improved methods for numerical analysis, such as the RungeKutta approximation, which enables us to have a faster convergence than a method equivalent to (4.5) and (4.6). However, with differential equations describing sample paths of random processes a method yielding an improved convergence rate can often lead us to a different process. The reason is that the second-order moment ~[ AIVj}2 does not vanish as the time increment decreases to zero, while a corresponding quantity in differential equations for deterministic processes can be made arbitrarily small by letting At be small (see Rumelin, 1980). In some models of population genetics we assume that the possible number of variables, which are usually either the gene frequencies or the gamete frequencies, is fixed. In many recent models dealing with molecular evolution the number of variables is not constant but changes with time. This is because the number of possible allelic states of a gene or gamete has been shown to be extremely large but those represented in a population at any one time is rather small. Furthermore, in reality, once a particular variable appears and then disappears from the population it will never come back again to the same population (Kimura and Crow, 1964). Therefore it is necessary to incorporate newly arising variables into the system (4.1), while the variables which disappear must be removed from the system. It is easy to eliminate a vanishing variable. We re-name the variables, omitting the one which we want to remove. To add a new variable we do as follows: during the time interval of At = t k + l - - tk the new variable
xn+l(&+l) = e
(4.7)
is introduced with a probability that can be specified by the model. For instance, in the case of a gene mutation at one locus in a population of N diploid organisms with mutation rate v the probability is 2NvAt/e, provided that this is much less than unity. However, 2NvAt/e, the mean number of new mutants to be introduced into a population during At, is not necessarily small. Therefore in general we introduce m new mutants, all having equal frequency e : X n + l ( t k + 1) = e . . . . . Xn+m(t~+l) = e, where m is a Poisson random number with mean 2NvAt/e. If rn = 0, no new mutants will be introduced, if m = 1, one new mutant will be intro&tced, and so forth. It is very important to make e > At, particularly when the value of 2Nv is smaller than unity. If e < At with 2Nv < 1, the simulation can lead to a process of a quite different nature at the boundary x = 0. That property requiring e > At for small 2Nv appears to be unique for this numerical method. The same requirement applies to cases with a fixed number of
STOCHASTIC THEORY OF POPULATION GENETICS
549
variables. The mutation from the jth allele to the ith allele cannot be performed by adding 2Nv~jxj(t)At to x~(t) and subtracting the same quantity from xj(t) when 2Nvq is small. Instead, it must be carried out by adding me to x(t) and subtracting the same from xj(t), where m is a Poisson random number with mean equal to 2Nvqxi(t)At/e and e > At. There is more than one form o f the stochastic differential equation representing the same process (Watanabe, 1971). The equation given in (4.1) is a standard one that is often used in the mathematical literature (It6 and McKean, 1965). However, any matrix A = [ a i i ] having property AA* = [oq] can represent a diffusion coefficient matrix. Here A* represents the transpose o f A and oq the coefficients given in (4.2). Stochastic models studied in population genetics usually involve the diffusion coefficients resulting from multinomial sampling of gametes. In that case aq = xi(6ij -- xj), where ~q = 1 if i = j and 0 otherwise. With this system of diffusion coefficients several different representations are possible, besides the one given in (4.1), in which the square root of [oij] is used. The formulations o f Pederson (1973), Itoh (1979) and Kimura (1980) can be used in cases o f multinomial sampling. Itoh's presentation is simple and is given below. Itoh (1979) has shown that when oq = xi(Sq -- xj), the diffusion can be expressed as follows: tl
dxt(t) = ~" a(i, j)x/(xi(t)xj(t)dBq)
(4.8)
j=0
for i = 1, 2, ..., n, where o(i,j) = 1 i f i < j , o ( i , j ) = - - I ifi>jandBij for i < j is independent Brownian motion and Bji = Bq. While Itoh's formula (4.8) is simple in its form, it tends to involve a larger number of independent Brownian motions Bq, for the number grows proportional to the square of the number n of the random variables. There are processes in population genetics which cannot be represented by (4.8) and therefore require the square root of the [oil] matrix. The deterministic change in xi(t) is governed by bl(xx, x2 . . . . )At of (4.4). For instance, if an extant allele A i mutates to new alleles at a rate v,
bi(xl , x2 .... )At = -- vx~( t )At.
(4.9)
Under natural selection 1
bW
bi(Xl, x2 .... )At = 2W ~xi At,
(4.10)
550
TAKEOMARUYAMA
in which W=
~ ~" (1 + sij)x~(t)xj(t), J J
with sij being relative fitness of the genotype AiAi. Accuracy. For the following two reasons the accuracy of the approximations represented by equations (4.4)-(4.7) need to be checked. One reason is that the difference equation (4.4) together with (4.5) converges rather slowly to a continuous limit of a diffusion process. The other reason is that the nature of the process can be greatly altered by the input equation (4.7) for new mutants, particularly when sample paths have a high probability of being near boundaries. In the test simulation given below I used At = 0.001 and e = 5At throughout. Since the allele frequency distribution and heterozygosity are probably the most important quantities in population genetics, I have tested the accuracy of the method in predicting these quantities. For the numerical simulations I used the infinite allele model of Kimura and Crow (1964), in which every mutant is assumed to be new. Two schemes of selection have been used. One model assumes that all mutants are selectively neutral and the other is an overdominance model in which every heterozygote has relative fitness 1 + s, while fitness is equal to unity for every homozygote. The average level (h) of heterozygosity is defined as the mean of 1 -- Zx 2, and the allele frequency distribution q~(x) is the average number of alleles having frequency x. For the neutral model both of these quantities can be obtained exactly (Kimura and Crow, 1964; see also Wright, 1949). The average heterozygosity for a small value of 2Ns for the case of overdominant selection can be studied by the formulae of Wright (1949), Watterson (1977) or Li (1978). The theoretical average h obtained by these methods are presented in Table IV and are compared with the simulation results. It is clear that the agreement between theory and numerical simulation is satisfactory in all cases. Figures 3a and 3b show the theoretical allele frequency distribution of the neutral model together with the results of simulations. The theoretical allele frequency distribution was obtained by the formula ~ ( x ) = 4Nvx-l(1 --x) 4 N v ' - I (Kimura and Crow 1964). Figure 3a shows the comparison between the theoretical distribution and the simulation results for the whole range o f x partitioned into 50 intervals of equal length (0.02). Since in population genetics the distribution of rare alleles is also important, I have tested the distribution near x = 0 and x = 1. Figure 3b shows the comparison near x = 0 and x = 1 by partitioning [0, 0.02] and [0.98, 1.0]. The agreement between theory and simulation is again satisfactory. Thus this method of simulation appears to give quite
STOCHASTIC THEORY OF POPULATION GENETICS
551
TABLE IV Mean Heterozygosities for testing the Accuracy the Computer 4Nv
2Ns
0.02 0.1 0.2 0.5 0.004 0.012 0.026 0.060
0 0 0 0 2 6 13 30
of
Simulation used Mean heterozygosity Theoretical Simulation 0.0196 0.0909 0.1667 0.3333 0.0080 0.0970 0.4548 0.6265
0.0195 0.0897 0.1658 0.3316 0.0079 0.0964 0.4501 0.6360
The theoretical values for neutral alleles were obtained by k7 = 4Nv/l(1 + 4Nv) and those for overdominant alleles were obtained
by numerical integration of formulae given by Wright (1949).
140
120
5
100
;~(x) 4
8O
~cx)
40
(a)
Lj I
0.2
0.4
0.6 X
0,8
1.0
0.01
I--
--
0.02
--I
0.98
[
0.99
x
Figure 3. Theoretical allele frequency distributions [qb(x) = 4Nvx -1 X (1 X)4iVy-l] and the distributions obtained by computer simulations. (a) Distribution over entire region of x; (b) distribution near x = 0 and x = 1. -
-
d
1.0
552
TAKEOMARUYAMA
an accurate result when studying the population parameters we are interested in. It is important to note that even though the input parameter e = 0.005 was only five times larger than the smallest range of frequency interval of [0, 0.001] the accuracy for the range turned out to be good (Figure 3b). This numerical method of stochastic integrals has been proved to be useful and has been applied in a number of difficult problems; a study of gene silencing at duplicated loci (Maruyama and Takahata, 1981), overdominance models (Maruyama and Nei, 1981) and linkage disequilibrium (Maruyama, 1981). These problems are very difficult and the method presented here is probably one of the best ways now available. A future task along the numerical method is to improve the convergence rate. As pointed out earlier, formula (4.5) has a rather slow convergence rate. This research was supported by a grant (57120001) from the Ministry of Education, Science and Culture.
LITERATURE Arnold, L. 1973. Stochastic Differential Equations: Theory and Applications. New York: John Wiley & Sons. Crow, J. F. and M. Kimura. 1970. An Introduction to Population Genetics Theory. New York: Harper and Row. Dynkin, E. B. 1965. Markov Processes, Vols 1 and 2. Berlin: Springer-Verlag. Ethier, S. N. and T. Nagylaki. 1980. "Diffusion Approximations of Markov Chains with Two Time Scales and Applications to Population Genetics." Adv. appl. Prob. 12, 14-49. Ewens, W. J. 1963. "The Diffusion Equation and a Pseudo-distribution in Genetics. JIR. statist9 Soc. (B) 25,405-412. . 1979. Mathematical Population Genetics. Berlin: Springer-Verlag. Feller, W. 1951. "Diffusion Processes in Genetics." Proc. 2nd Berkeley Symp. on Math. Stat. and Prob., pp. 227-246. 9 1954. "Diffusion Processes in One Dimension 9 Trans. Am. math. Soc. 77, 1-31. Fisher, R. A. 1922. "On the Dominance Ratio." Proc. R. Soc., Edinburgh 42, 321-341. It6, K, and H. P. McKean. 1965. Diffusion Processes and Their Sample Paths9 Berlin: Springer-Verlag. Itoh, Y. 1979. "Random Collision Process of Oriented Graph." Institute of Statistical Mathematics (Japan), Research Memorandum No. 154, pp. 1-20. Kimura, M9 1954. "Process Leading to Quasi-fixation of Genes in Natural Populations due to Random Fluctuation of Selection Intensities." Genetics 39, 280-295. . 1955a. "Solution of a Process of Random Genetic Drift with a Continuous Model." Proc. natn Acad. Sci. U.S.A. 41, 144-1 50. 9 1955b. "Stochastic Processes and Distribution of Gene Frequencies under Natural Selections." Cold Spring Harb. Symp. quant. Biol. 20, 33-53. 1962. "On the Probability of Fixation of Mutant Genes in a Population." Genetics 47, 713-719. 9 1964. "Diffusion Models in Population Genetics." J. appl. Prob. 1, 177-232. --.. 1968. "Evolutionary Rate at the Molecular Level." Nature, Lond. 217, 624-626.
STOCHASTIC THEORY OF POPULATIONGENETICS
-
553
9 1980. "Average Time until Fixation of a Mutant Allele in a Finite Population under Mutation Pressure: Studies by Analytical, Numerical and Pseudo-sampling Methods." Proc. natn Acad. Sci. U.S.A. 7 7 , 5 2 2 - 5 2 6 . and J. F. Crow. 1964. "The Number of Alleles that can be Maintained in a Finite Population." Genetics 49, 725-728. and T. Ohta. 1969. "The Average Number of Generations until Fixation of an Individual Mutant Gene in a Finite Population." Genetics 63, 701-709 9 and - - - . 1971. "Protein Polymorphism as a Phase of Molecular Evolution." Nature, Lond. 2 2 9 , 4 6 7 - 4 6 9 . Kolmogorov, A. 1935. "Deviations from Hardy's Formula in Partial Isolation." C. r. Acad. Sci. U.R.S.S. 3, 129-132. Li, W.-H. 1973. "Total Number of Individuals Affected by a Single Deleterious Mutant in a Finite Population." Am. J. Human. Genet. 2 4 1 , 6 6 7 - 6 7 9 . 9 1978. "Maintenance of Genetic Variability under the Joint Effect of Mutation, Selection and Random Drift9 Genetics 90, 349-382. Maruyama, T. 1972. "The Average Number and the Variance of Generations at a Particular Gene Frequency in the Course of Fixation of a Mutant Gene in a Finite Population." Genet. Res. 19, 109-113. 9 1974. "The Age of an Allele in a Finite Population 9 Genet. Res. 23, 137-143. --. 1977. Lecture Notes in Biomathematics 17. Stochastic Problems in Population Genetics. Berlin: Springer-Verlag. 9 1980. "On an Overdominant Model of Population Genetics." Adv. appl.. Prob. 12,274-275. . 1981. "Stochastic Problems in Population Genetics: Applications of It6's Stochastic Integrals." In Stochastic Nonlinear Systems, Eds. L. Arnold and R. Lefever, pp. 154-161. Berlin: Springer-Verlag. and P. A. Fuerst. "Analyses of the Age of Genes and the First Arrival Times in a Finite Population." Genetics. In press9 and Kimura, M. 1971. "Some Methods for treating Continuous Stochastic Processes in Population Genetics." Jap. J. Genet. 46,407--410. and - - . 1975. "Moments for Sum of an Arbitrary Function of Gene Frequency along a Stochastic Path of Gene .Frequency Change." Proc. natn Acad. sci. U.S.A. 72, 1602-1604. and M. Nei. 1981. "Genetic Variability maintained by Mutation and Overdominant Selection in Finite Populations." Genetics 98, 491-459 9 and N. Takahata. 1981. "Numericl Studies of the Frequency Trajectories in the Process of Fixation of Null Genes at Duplicated Loci. Heredity 46, 49-57. McShane, E. J. 1974. Stochastic Calculus and Stochastic Models. New York: Academic Press. Nagasawa, M. 1964. "Time Reversions of Markov Processes." Nagoya math. J. 24, 177-204. and Maruyama, T. 1979. "An Application of Time Reversal of Markov Processes to a Problem of Population Genetics." Adv. appL Prob. 1 1 , 4 5 7 - 4 7 8 . Nagylaki, T. 1974. "The Moments of Stochastic Integrals and the Distribution of Sojourn Times." Proc. natn Acad. ScL U.S.A. 7 1 , 7 4 6 - 7 4 9 9 Nei, M. 1975. Molecular Population Genetics and Evolution. New York: North-Holland/ American Elsevier. Pederson, D. G. 1973. "Note: An Approximation Method of Sampling a Multinomial Population." Bio metrics 29, 814-821. Robertson, A. 1964. "The Effect of Non-random Mating within Inbred Lines on the Rate of Inbreeding." Genet. Res. 5, 164-167. Rumelin, W. 1980. "Numerical Treatment of Stochastic Differential Equations. Report No. 12, Forschungsschwerpunkt Dynamische Systeme, Universit/it Bremen. Skorokhod, A. V. 1965. Studies in the Theory o f Random Processes. Reading, MA: Addison Wesley9 -
-
-
-
-
-
-
-
-
-
-
-
-
-
-
554
TAKEO MARUYAMA
Slatkin, M. 1981. "Fixation Probability and Fixation Times in a Subdivided Population," Evolution 35,477-488. Watanabe, S. 1971. "On the Stochastic Differential Equations for Multidimensional Diffusion Process with Boundary Conditions." J. Math. Kyoto Univ. 11, 169-180. Watterson, G. A. 1977. "Heterosis or Neutrality?" Genetics 85, 789-814. Wong, E. and M. Zakai. 1965. "On the Convergence of Ordinary Integrals to Stochastic Integrals." Ann. math. Statist. 36, 1560-1564. Wright, S. 1931. "Evolution in Mendelian Populations." Genetics 16, 97-159. 9 1945. "Differential Equations of the Distribution of Gene Frequencies." Proc. natn Acad. Sci. U.S.A. 3 1 , 3 8 2 - 3 8 9 . --. 1948. "Genetics of Populations." In Encyclopedia Britannica, Vol. 10, 111, 111A-D, 112. 9 1949. "Adaptation and Selection." In Genetics, Palaeontology, and Evolution, Ed. G. G. Simpson, G. L. Jepsen and E. Mayr, pp. 365-389. Princeton, NJ: Princeton University Press. --. 1969. Evolution and Genetics of Populations, Vol. 2. The Theory of Gene Frequencies. Chicago, IL: University of Chicago Press. --. 1970. "Tandom Drift and the Shifting Balance Theory or Evolution." In Mathematical Topics in Population Genetics, Ed. K. Kojima, pp. 1-31. Berlin: SpringerVerlag.