Hum Genet (1984) 66 : 1-16
© Springer-Verlag 1984
Review articles
DNA restriction fragment length polymorphisms and heterozygosity in the human genome
David N. Cooper and Jörg Schmidtke
Institut für Humangenetik der Universität Göttingen, Nikolausberger Weg 5a, D-3400 Göttingen, Federal Republic of Germany
Summary. A list is presented of published reports of DNA polymorphisms found in the human genome by restriction enzyme analysis. While the list indicates the large number of restriction fragment length polymorphisms (RFLPs) detected to date, the information collated is insufficient to permit an estimate of heterozygosity for the genome as a whole. Data from our laboratory are therefore also presented on RFLPs detected using a random sample of cloned DNA segments. Such an analysis has permitted a first unbiased estimate of heterozygosity for the human genome. Since this figure is an order of magnitude higher than previous estimates derived from protein data, the majority of polymorphic variation present in the human genome must, by implication, occur in noncoding sequences. In addition, it was confirmed that enzymes containing the dinucleotide CpG in their recognition sequences detect more polymorphic variation than those that do not contain a CpG. Also presented are the clinical applications of DNA polymorphisms in the diagnosis of human genetic disease.
Supported by the Deutsche Forschungsgemeinschaft
Offprint requests to: Jörg Schmidtke

Introduction

How polymorphic is the human genome? Previous attempts to answer this question have relied upon serologic and electrophoretic methods of detection (Vogel and Motulsky 1982). Using these now classical techniques, variability at the DNA level could only be inferred and estimated from protein data. Therefore, only variability of coding sequences could be measured. With the advent of recombinant DNA technology, it is now possible not only to examine DNA polymorphisms directly but also to detect them both in coding and non-coding regions of the genome. Restriction enzymes can detect polymorphisms arising from single base-pair changes introducing or removing a restriction site, or from sequence additions, deletions, and rearrangements affecting the length of DNA between sites. The vast majority of DNA variants seem to be due to single base-pair exchanges. However, the techniques usually employed would not detect the insertion or deletion of only a few base pairs.

Restriction fragment length polymorphisms (RFLPs) promise to be useful in a number of different ways. The utilisation of widely spaced RFLPs as "genetic signposts" promises to be the basis for the future successful mapping of the human genome (Botstein et al. 1980; Southern 1982). When linked to human disease loci, polymorphic sequences could serve as markers in the antenatal diagnosis of hitherto undetectable genetic defects, and in the identification of heterozygous carriers. Other possible uses include paternity testing and population studies (e.g. Roberts 1982).

The purpose of this review is to present a list of published reports of DNA polymorphisms found to date in the human genome by restriction enzyme analysis, usually together with Southern blotting. To this list have been added data from our own laboratory on polymorphisms associated with a random sample of human DNA segments. These data are useful in several respects. They permit the first unbiased, although still preliminary, estimate of the overall heterozygosity present in the human genome. In addition, it is now possible to address the question of whether polymorphisms occur uniformly in the genome or whether there are regional and/or chromosomal differences in their frequency of occurrence. Finally, it was intended to determine whether or not some enzymes, notably those recognizing a CpG-containing sequence, detect polymorphisms at a higher frequency than those which do not possess this dinucleotide in their recognition sequence.

A polymorphism has generally but somewhat arbitrarily been defined as a Mendelian trait that exists in the population in at least two phenotypes, neither of which occurs with a frequency of less than 1% (Harris and Hopkinson 1972; Vogel and Motulsky 1982). An allele present at less than a 1% frequency is simply termed a variant. Although this distinction has been the subject of recent discussion (Meisler 1983), we adhere to this definition here. Table 1 therefore includes DNA polymorphisms sensu stricto, together with rare variants detected by restriction enzyme analysis.

DNA polymorphisms in the human genome
A list of published reports of DNA polymorphisms and rare variants in the human genome is presented in Table 1. For each gene region or DNA segment, the nature of the polymorphism (i.e. single base-pair exchange, deletion, insertion, rearrangement etc.), the sample size tested, and the enzyme(s) used to detect it are given. Also listed are the characteristic sizes of the restriction fragments corresponding to the variant alleles and their frequencies of occurrence. Cases where no polymorphism was detected using a specific restriction enzyme have also been included. Whether or not studies were conducted to confirm Mendelian inheritance of the polymorphism is mentioned. The chromosomal assignment is after McKusick (1982) unless otherwise stated.

[Table 1. Published reports of DNA polymorphisms and rare variants in the human genome; the table body is not recoverable from the source.]

The list does, however, make some intentional omissions. Sequence rearrangements associated with the haemoglobinopathies are not included since these have been treated adequately elsewhere (Little 1981; Spritz and Forget 1983). Also omitted are polymorphisms detected solely by comparative analysis of cloned sequences, since artefactual sequence alterations resulting from the cloning process have often not been excluded. Polymorphisms exhibited by uncharacterized repetitive sequence families (e.g. Lerman et al. 1983) are not listed since their repetitiveness severely limits their usefulness in genome mapping studies and the diagnosis of genetic disease. Polymorphisms detected by methods other than restriction enzyme analysis, such as chromosome banding, are not relevant to the intended purpose of this review. Finally, although Skolnick and Francke (1982) include other polymorphisms in their list on the basis of personal communication of results, such unpublished data are not presented here.

As can be inferred from the absence of much of the relevant information on the polymorphisms detected, most authors were not primarily concerned with the polymorphisms per se. In most cases, the sample size, the frequency of the variant alleles, or even sometimes the restriction enzymes used, are not quoted. Authors tended to mention only those enzymes which detected polymorphisms and to ignore those that failed to detect variation. This omitted information, however, is vital for any accurate assessment of the heterozygosity of the human genome and for any comparison of the frequencies with which restriction enzyme recognition sites contain or encompass polymorphic variants. Very often, the nature of the polymorphism (single base-pair exchange, deletion/insertion, or sequence rearrangement) has not been properly determined. A few studies have gone some way towards answering the above questions.
Seven different enzymes were used to examine the immune interferon gene region for polymorphisms but without success (Gray and Goeddel 1982). Using a "battery of restriction enzymes", Barker and White (1982) found five polymorphisms in 126 restriction sites analysed using 11 cloned probes. Pearson et al. (1982) reported a similar frequency (five polymorphisms using 12 probes and three different restriction enzymes). Bakker et al. (1983), using X chromosome-derived DNA segments, found "one high frequency polymorphism (above 20%) for every three unique sequence probes analysed with up to six restriction enzymes". From a comparison of these results a hitherto unnoticed lower frequency of DNA polymorphisms exhibited by X chromosome-derived sequences becomes apparent. This finding will be further discussed below in the light of results from our laboratory. Analysis of another X chromosome-derived segment, λRC8, has yielded a Taq I polymorphism when examined with five restriction enzymes (Murray et al. 1982).

The most systematic study to date of variation present within a defined region of the genome is that of Jeffreys (1979) for the β-globin locus. This analysis yielded three polymorphisms when 60 individuals were screened with eight different restriction enzymes. A Pst I polymorphism which occurred at a very low frequency could easily have gone undetected had the sample size not been so large. The author concluded from his results that "at least one in one hundred base pairs varies polymorphically" within the β-globin gene region, while pointing out that this analysis may have detected fewer than 0.7% of all variants present in this gene region. The data are, however, insufficient to provide us with an estimate of heterozygosity for the genome as a whole since the analysis was confined to a region containing several coding sequences.
The level of heterozygosity might well be expected to differ in and around coding sequences on the one hand, and non-coding sequences on the other. The latter, comprising the majority of the genome, are likely to be under less stringent selective pressure than the former. Heterozygosity will also differ between different coding sequences depending on the extent of conservation of those sequences.

Murray et al. (1983) have recently presented a study of the variation at the human serum albumin locus. They reported finding six RFLPs with Hae III (3), Msp I, Taq I, and EcoR V after screening with 18 restriction enzymes. They calculate that some 1/95 nucleotides were affected by an RFLP, similar to the figure estimated by Jeffreys (1979). This analysis, however, does not provide a direct estimate of heterozygosity since the method of calculation of polymorphism frequency presented by the authors ignores the number of individuals screened with each enzyme; this differs some 30-fold between enzymes. Recalculation of their results yields an estimate of heterozygosity at the albumin locus of h = 0.0025, a value only marginally lower than our own estimate and that derived from the data of Jeffreys (1979).

In our laboratory we have attempted to circumvent the aforementioned problems by utilising DNA segments, cloned at random with respect to their coding potential, as probes to examine the extent of variation in different parts of the genome. The clones were obtained from DNA libraries made from flow-sorted metaphase chromosomes. Their chromosomal allocation was confirmed with somatic cell hybrid DNA hybridization experiments. The selection of suitable clones for this analysis was based upon their ability to produce a good hybridization signal and one which was compatible with their being unique sequences. Hence, there is no reason to suppose that the selection of clones was not random with respect to the polymorphism detection frequency.
The data will shortly be presented in more detail elsewhere (Cooper, Smith, Cooke, Niemann and Schmidtke, submitted for publication).

Table 2. Estimation of heterozygosity in the human genome using cloned DNA segments as hybridization probes

| DNA segment | Chromosome | Heterozygosity (h) | Base pairs screened | Variants (and No. of individuals found for each) |
|---|---|---|---|---|
| pH72 | X | 0.00000 | 1764 | 0 |
| pJu78 | X | 0.00000 | 1098 | 0 |
| pB97 | X^a | 0.00265 | 1488 | 2 (Msp I) |
| pB20 | X | 0.00312 | 1260 | 2 (Msp I) |
| pB22 | X | 0.00000 | 1168 | 0 |
| λ22.1 | 22 | 0.00351 | 1680 | 3 (Bam HI) |
| λ22.3 | 22 | 0.00706 | 1092 | 2 (Bam HI), 2 (Eco RI) |
| λ22.4 | 22 | 0.00000 | 480 | 0 |
| λ22.6 | 22 | 0.00000 | 1536 | 0 |
| pJ3.11 | 7 | 0.01200 | 1660 | 9 (Msp I), 1 (Taq I) |
| pJ5.11 | 7 | 0.00000 | 744 | 0 |
| Total | | | 13,970 | |

^a Assignment still tentative
H (autosomal) = 0.00376; H (X chromosome) = 0.00115

Table 2 lists the heterozygosities calculated for each probe and the number of polymorphic variants detected with each enzyme. In all, eleven different clones were used to examine a total of nearly 14,000 base pairs in homologous genomic regions with up to six restriction enzymes (Bam HI, Eco RI, Hind III, Kpn I, Msp I, and Taq I) in 10-15 different individuals. Seven different polymorphisms were found. Heterozygosities (h) were calculated using the formula of Nei (1975)
$$ h = 1 - \left[ \left( \frac{a}{b} \right)^{2} + \left( \frac{b-a}{b} \right)^{2} \right] $$
where a equals the number of polymorphic base pairs found with all enzymes and b equals the total number of base pairs examined with all enzymes. This latter figure, b, was calculated in the following way:

$$ b = \left( \text{number of autoradiographic bands} + 1 \right) \times \left( \text{number of base pairs in restriction site} \right) \times \left( \text{number of chromosomes screened} \right) $$

This approach would overestimate heterozygosity when a probe detects several noncontiguous sequences in the genome because a greater number of recognition sites would have been screened. Conversely, heterozygosity would be underestimated either when the fragment length variation lies below the limits of resolution or when small restriction fragments are not detected due to inefficient binding to the nitrocellulose membrane. It is, however, thought unlikely that the results are affected greatly by these possible sources of error.

The 11 clones used in this study probably detect 11 distinct loci in the genome. This assertion is based upon the clones themselves being chromosome-specific as well as their being single or low-copy number in the genome. In addition, although multiple (more than two) autoradiographic bands were seen with specific probes with some enzymes, a solitary band was always seen with at least one enzyme. These points are consistent with the view that each clone does indeed detect a single distinct locus in the genome.

The data permit not only an estimate of heterozygosity of the human genome as a whole but also a comparison of heterozygosity between different chromosomes. Average heterozygosity (H) is calculated as the arithmetic mean of individual heterozygosities (h) over all the loci examined. For the 11 loci in this study, the average heterozygosity was calculated to be 0.00257. Average heterozygosities exhibited by the autosomes and by the X chromosome were calculated to be 0.00376 and 0.00115 respectively.
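The calculation of b and h described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the two-class form of Nei's (1975) gene diversity, h = 1 − Σx², with x₁ = a/b and x₂ = (b − a)/b, and the function names and example inputs (three bands, a 4-bp recognition site, 30 chromosomes) are hypothetical rather than taken from the study. Because the per-enzyme screening details are not given, the sketch is not expected to reproduce the printed h values exactly.

```python
def base_pairs_screened(bands: int, site_length: int, chromosomes: int) -> int:
    """b = (autoradiographic bands + 1) x (bp per recognition site) x (chromosomes screened)."""
    return (bands + 1) * site_length * chromosomes


def heterozygosity(a: int, b: int) -> float:
    """Two-class gene diversity (assumed form): h = 1 - [(a/b)^2 + ((b - a)/b)^2]."""
    if b <= 0 or not 0 <= a <= b:
        raise ValueError("need b > 0 and 0 <= a <= b")
    p = a / b          # fraction of screened base pairs found to be polymorphic
    q = (b - a) / b    # fraction found to be invariant
    return 1.0 - (p * p + q * q)


# Hypothetical example: 3 bands seen with a 4-bp cutter in 15 people (30 chromosomes).
b_example = base_pairs_screened(bands=3, site_length=4, chromosomes=30)
h_example = heterozygosity(a=2, b=b_example)
print(b_example, round(h_example, 5))
```

When a is much smaller than b, as in every row of Table 2, h under this form is close to 2a/b.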
Thus polymorphisms appear to occur with an approximately three-fold higher frequency on the autosomes than on the X chromosome. However, using the nonparametric test of homogeneity of Kolmogoroff and Smirnoff, the difference between the polymorphism frequencies exhibited by the X chromosome and the autosomes was not significant. While Bakker et al. (1983) have assumed that there is no difference between DNA sequence variation on the X chromosome as compared with that present on the autosomes, such a difference would be consistent with the disparity between previously reported polymorphism detection rates for autosomal sequences (Barker and White 1982) and X chromosome-derived sequences (Bakker et al. 1983). Limitations of our data include small sample size, both of the number of DNA regions examined and the number of base pairs within each region. In addition, we have not yet confirmed the Mendelian inheritance of all the polymorphisms detected. Further studies are therefore required to substantiate our findings.
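The averages quoted above can be checked directly from the per-segment h values in Table 2. The following sketch simply takes arithmetic means by chromosome class; the segment labels are written in ASCII and the variable names are illustrative assumptions.

```python
from statistics import mean

# (DNA segment, chromosome, heterozygosity h), transcribed from Table 2.
TABLE_2 = [
    ("pH72", "X", 0.00000),
    ("pJu78", "X", 0.00000),
    ("pB97", "X", 0.00265),
    ("pB20", "X", 0.00312),
    ("pB22", "X", 0.00000),
    ("lambda22.1", "22", 0.00351),
    ("lambda22.3", "22", 0.00706),
    ("lambda22.4", "22", 0.00000),
    ("lambda22.6", "22", 0.00000),
    ("pJ3.11", "7", 0.01200),
    ("pJ5.11", "7", 0.00000),
]


def average_heterozygosity(rows):
    """Arithmetic mean of h over the loci in `rows`."""
    return mean(h for _, _, h in rows)


x_rows = [row for row in TABLE_2 if row[1] == "X"]
autosomal_rows = [row for row in TABLE_2 if row[1] != "X"]

H_all = average_heterozygosity(TABLE_2)
H_x = average_heterozygosity(x_rows)
H_autosomal = average_heterozygosity(autosomal_rows)
fold_difference = H_autosomal / H_x

print(f"all loci: {H_all:.5f}  X: {H_x:.5f}  autosomes: {H_autosomal:.5f}")
print(f"autosomal / X ratio: {fold_difference:.2f}")
```

The means agree with the quoted figures to within rounding in the last decimal place, and the ratio comes out a little above three, in line with the "approximately three-fold" statement.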
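Table 3. Frequency of detection of restriction fragment length polymorphisms in the human genome with various restriction enzymes. The last four columns give data from randomly selected DNA segments.

| Enzyme | Polymorphisms reported in the literature | Base pairs screened (BP) | Polymorphic variants found (P) | Frequency (P/BP) | Heterozygosity (h) |
|---|---|---|---|---|---|
| Msp I | 10 | 1392 | 13 | 0.0093 | 0.0147 |
| Taq I | 10 | 1284 | 1 | 0.0008 | 0.0014 |
| Hind III | 9 | 2568 | 0 | 0.0000 | 0.0000 |
| Bgl II | 9 | | | | |
| EcoR I | 9 | 3410 | 2 | 0.0006 | 0.0012 |
| Sst I | 7 | | | | |
| Bam HI | 5 | 3366 | 5 | 0.0015 | 0.0029 |
| Hinc II | 5 | | | | |
| Pst I | 4 | | | | |
| Pvu II | 2 | | | | |
| Sac I | 2 | | | | |
| Xba I | 2 | | | | |
| Asu I | 1 | | | | |
| Ava I | 1 | | | | |
| Ava II | 1 | | | | |
| Dde I | 1 | | | | |
| Hpa I | 1 | | | | |
| Mbo I | 1 | | | | |
| Mst I | 1 | | | | |
| Hinf I | 1 | | | | |
| Kpn I | 0 | 1680 | 0 | 0.0000 | 0.0000 |
| (enzyme not recoverable from the source) | | 240 | 0 | 0.0000 | 0.0000 |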
If the X chromosome is indeed under greater constraint with respect to sequence variability, this would extend "Ohno's law" (Ohno 1967) of conservation of the X chromosome from identical localization of gene sequences among different species to the conservation of DNA sequences present on the X chromosome among individuals of the same species. One explanation for this conservation may be that the X chromosome contains a higher proportion of coding and/or regulatory sequences which are not at liberty to accumulate polymorphic variants to the same degree as the autosomes.

It should be pointed out that the calculated figure for heterozygosity on the autosomes (0.00376) is very close to that calculated from Jeffreys' (1979) data (0.00405). Thus either heterozygosity of the β-globin gene region is close to that exhibited by the rest of the genome or our figure is close to that exhibited around coding sequences. It can be assumed that most of the variation seen is actually contained in noncoding stretches of DNA since previous estimates of heterozygosity in coding sequences, as inferred from protein data, are about one order of magnitude lower (Nei 1975). Stretches of coding DNA that are comparatively neutral with respect to point mutations and subsequent amino acid replacement might also be expected to exhibit a higher degree of variation than highly conserved DNA coding, for example, for "active" sites in a protein (Vogel and Kopun 1977).

Which enzymes detect most polymorphisms?
The selection of restriction enzymes with which to examine specific DNA segments for polymorphisms is important in aiding rapid detection of such variation. It has been claimed that more RFLPs occur at restriction enzyme recognition sites that contain the dinucleotide CpG (Barker and White 1982). The theoretical basis for this observation lies in the fact that the human genome is highly methylated at this dinucleotide (van der Ploeg and Flavell 1980; reviewed by Cooper 1983). 5-Methylcytosine is subject to frequent replacement by thymine due to the deamination of the methylated base (Coulondre et al. 1978). This explains both the high frequency of the C → T transition (Vogel 1972) and the resulting CpG deficiency observed in vertebrate genomes (Salser 1978; Bird 1980). Such deamination events would in a certain proportion of cases result in the addition or removal of a restriction site. Msp I (CCGG) and Taq I (TCGA), for example, would therefore be expected to detect such variation with a higher efficiency than enzymes that do not contain CpG in their recognition sequences. However, until now no systematic study has been done on the relative efficiency of detection of RFLPs with different restriction enzymes.

Table 3 shows the numbers of RFLPs from published work detected with different restriction enzymes. Enzymes that have been used to detect sequence deletions, additions, and rearrangements have been excluded here, as have mitochondrial DNA polymorphisms, the latter since the rate of base replacement for extrachromosomal DNA is probably different (Brown et al. 1979). Published data are, however, probably misleading in a number of different respects. Many authors only mention enzyme usage when they have detected an RFLP and omit to mention enzymes which failed to detect RFLPs. Since enzyme usage is also often determined by economic considerations and enzyme availability, it is probable that some enzymes are used far more frequently than others.
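The mechanism described above, in which deamination of a methylated CpG removes a restriction site and so changes the fragment pattern, can be illustrated with a toy digest. The sketch below is an assumption-laden illustration: the 31-bp sequence is invented, the function names are hypothetical, and only one strand is modelled. Msp I is taken to cut after the first base of its CCGG site.

```python
def cut_positions(seq: str, site: str) -> list[int]:
    """Start indices of every occurrence of a recognition site (overlaps allowed)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq: str, site: str, offset: int) -> list[int]:
    """Fragment lengths after cutting `offset` bases into each recognition site."""
    cuts = [p + offset for p in cut_positions(seq, site)]
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]


def deaminate_cpg(seq: str, c_index: int) -> str:
    """Replace the C of a CpG dinucleotide with T (5-methylcytosine to thymine)."""
    if seq[c_index:c_index + 2] != "CG":
        raise ValueError("position is not the C of a CpG dinucleotide")
    return seq[:c_index] + "T" + seq[c_index + 1:]


MSP_I = ("CCGG", 1)  # recognition site and cut offset (C^CGG)

# Invented reference allele with two Msp I sites, at positions 6 and 20.
reference = "ATATAT" + "CCGG" + "TTAAGGCTTA" + "CCGG" + "GATTACA"
# Variant allele: the CpG inside the first site is deaminated (CCGG becomes CTGG).
variant = deaminate_cpg(reference, 7)

print(fragment_lengths(reference, *MSP_I))
print(fragment_lengths(variant, *MSP_I))
```

In the variant allele the two left-hand fragments merge into one longer fragment, which is the kind of length difference a Southern blot would reveal as an RFLP.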
Table 4. Analysis of human genetic disease by means of recombinant DNA technology

A. Direct analysis of a genetic disease using gene probes to detect intragenic defects

| Disease | Gene probe | Reference |
|---|---|---|
| Antithrombin III deficiency | Antithrombin III | Prochownick et al. (1983b) |
| α1-Antitrypsin deficiency | Synthetic oligonucleotide | Kidd et al. (1983) |
| Atherosclerosis | Apolipoprotein A-1 | Karathanasis et al. (1983c) |
| Diabetes | Insulin | Haneda et al. (1983) |
| Ehlers-Danlos syndrome | α(I) Collagen | Pope et al. (1983) |
| Growth hormone deficiency | Growth hormone | Phillips et al. (1981) |
| Haemophilia B | Factor IX | Giannelli et al. (1983) |
| Hereditary persistence of foetal haemoglobin | β-Globin | Farquhar et al. (1983), Tuan et al. (1979) |
| Hypoxanthine-guanine phosphoribosyltransferase (HPRT) deficiency | HPRT | Wilson et al. (1983) |
| Lesch-Nyhan syndrome | HPRT | Yang et al. (1983), Nussbaum et al. (1983) |
| Osteogenesis imperfecta | Pro α1(I) collagen | Chu et al. (1983), Pope et al. (1983) |
| Retinoblastoma | Chromosome 13 DNA segments | Cavenee et al. (1983) |
| Sickle cell anaemia | β-Globin | Geever et al. (1981) |
| Sickle cell anaemia | Synthetic oligonucleotide | Conner et al. (1983) |
| Thalassaemia | α- and β-globin | Orkin et al. (1978), Little et al. (1980) |

B. Indirect analysis of genetic disease using gene probes to detect closely linked polymorphisms

| Disease | Gene probe | Reference |
|---|---|---|
| Atherosclerosis | Apolipoprotein A-1 | Karathanasis et al. (1983a) |
| Diabetes (Type II) | Insulin | Rotwein et al. (1981, 1983)^a |
| Growth hormone deficiency Type I | Growth hormone | Phillips et al. (1982) |
| Hypertriglyceridaemia | Apolipoprotein A-1 | Rees et al. (1983) |
| Sickle cell anaemia | β-Globin | Kan and Dozy (1978), Phillips et al. (1980), Boehm et al. (1983) |
| Lesch-Nyhan syndrome | HPRT | Nussbaum et al. (1983), Yang et al. (1983) |
| Osteogenesis imperfecta | Pro α2(I) collagen | Tsipouras et al. (1983) |
| Phenylketonuria | Phenylalanine hydroxylase | Woo et al. (1983) |
| Thalassaemia | β-Globin | Boehm et al. (1983) |

C. Indirect analysis of genetic disease using cloned DNA segments to detect linked DNA polymorphisms

| Disease | Probe | Distance between probe and disease locus (cM) | Reference |
|---|---|---|---|
| Fragile X-mental retardation syndrome | Factor IX | < 12 | Camerino et al. (1983) |
| Huntington's chorea | G8 | < 10 | Gusella et al. (1983) |
| Menkes kinky hair | L1.28 | 16 | Wieacker et al. (1983b) |
| Muscular dystrophy, Becker | L1.28 | 16 | Kingston et al. (1983) |
| Muscular dystrophy, Duchenne | λRC8 | 17 | Davies et al. (1983a) |
| Muscular dystrophy, Duchenne | L1.28 | 17 | Davies et al. (1983a) |
| Myotonic dystrophy | Complement C3 gene | 7 | Davies et al. (1983b) |
| Retinoschisis | λRC8 | 15 | Wieacker et al. (1983e) |
| Steroid-sulphatase-X-linked ichthyosis | λRC8 | 25 | Wieacker et al. (1983a) |

^a A firm association between diabetes and polymorphisms 5' to the insulin gene has been challenged by e.g. Yokoyama (1983)
Eco RI, for example, is comparatively cheap and easily available. These considerations make it difficult on the basis of the inadequate published data to conclude whether some restriction sites are more polymorphic than others. It may be, for instance, that since Msp I and Taq I are believed to detect polymorphisms at a higher frequency than others, these enzymes are actually employed more often and are consequently more likely to detect variation.
Our results with 11 cloned DNA segments and six enzymes do, however, permit a preliminary conclusion. These results are summarized in Table 3. It can be seen that Msp I detects variation much more frequently than the other enzymes and that Taq I is comparable to Eco RI and Bam HI in its detection efficiency. These results are therefore consistent with the hypothesis (Barker and White 1982) that more RFLPs occur in restriction enzyme recognition sites that contain CpG. We also tentatively conclude from our data that the average DNA fragment size after restriction does not appear to correlate with polymorphism detection rates as suggested by others (Bastié-Sigeac and Lucotte 1983).
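The per-enzyme detection frequencies behind this comparison are simply P/BP. The sketch below recomputes them for the five enzymes whose rows can be read unambiguously from Table 3 and ranks the enzymes. The pooled comparison of CpG-containing and other sites at the end is an added illustration, not a calculation made in the text, and the variable names are assumptions.

```python
# Enzyme -> (base pairs screened, polymorphic variants found), as read from Table 3.
SCREEN = {
    "Msp I": (1392, 13),
    "Taq I": (1284, 1),
    "Hind III": (2568, 0),
    "Eco RI": (3410, 2),
    "Bam HI": (3366, 5),
}
CPG_ENZYMES = {"Msp I", "Taq I"}  # recognition sequences CCGG and TCGA

# Detection frequency per enzyme: polymorphic variants per base pair screened.
frequency = {enzyme: p / bp for enzyme, (bp, p) in SCREEN.items()}
ranked = sorted(frequency, key=frequency.get, reverse=True)


def pooled_frequency(enzymes):
    """Total variants divided by total base pairs over a group of enzymes."""
    total_bp = sum(SCREEN[e][0] for e in enzymes)
    total_p = sum(SCREEN[e][1] for e in enzymes)
    return total_p / total_bp


cpg_rate = pooled_frequency(CPG_ENZYMES)
other_rate = pooled_frequency(set(SCREEN) - CPG_ENZYMES)

for enzyme in ranked:
    print(f"{enzyme:9s} {frequency[enzyme]:.4f}")
print(f"pooled CpG sites: {cpg_rate:.4f}  pooled other sites: {other_rate:.4f}")
```

Almost all of the pooled CpG signal comes from Msp I, which matches the observation that Taq I alone performs no better than Eco RI or Bam HI.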
Clinical applications

A number of polymorphic DNA sequences have now been utilized as marker loci for human disease genes. While direct analysis of the genetic defect (Table 4A) is the most reliable and hence the most desirable means of detection, it is often not possible to locate the site of mutation within the gene if it is not a gross deletion. Reasons for this include the lack of a suitable restriction enzyme or of the appropriate gene probe. One alternative is to utilize DNA polymorphisms flanking the region of interest (Table 4B). Another is to establish linkage between a known polymorphic DNA segment and the gene under study (Table 4C). This latter approach is applicable in cases where the exact location of the mutant gene (e.g. muscular dystrophies) is unknown. Obviously, the closer the linkage between the polymorphic DNA segment and the gene, the less likely will be the occurrence of a recombination event between them and the more accurate will be the subsequent diagnosis.

Examples of the clinical use of linked DNA polymorphisms detected either by the appropriate gene probe or by a linked DNA segment are given in Tables 4B and C. In addition, attempts have been made to link specific polymorphisms to thyroxine-binding globulin deficiency (Hill et al. 1982), Xg blood group (Murray et al. 1982), colour blindness (Murray et al. 1982), and citrullinaemia (Su et al. 1982) without success. Together the strategies outlined briefly above hold out the promise of antenatal diagnosis and carrier detection of many human genetic defects without needing to identify either the primary gene product or the basic biochemical mechanism of the disease.

Both for gene mapping and gene linkage studies, it is necessary to have a large number of polymorphic loci to identify each linkage group or chromosome.
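The relationship between marker distance and diagnostic accuracy can be made concrete with a map function. The sketch below converts the probe-to-locus distances listed in Table 4C into recombination fractions using Haldane's map function, θ = ½(1 − e^(−2d)) with d in morgans. The choice of Haldane's function and the reading of θ as the approximate per-meiosis error of a diagnosis based on a single linked marker with known phase are assumptions made for illustration; they are not stated in the text.

```python
import math


def recombination_fraction(distance_cm: float) -> float:
    """Haldane map function: theta = 0.5 * (1 - exp(-2d)), with d in morgans."""
    d = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d))


# (disease, probe, distance in cM), transcribed from Table 4C.
LINKED_PROBES = [
    ("Myotonic dystrophy", "Complement C3 gene", 7),
    ("Retinoschisis", "lambda RC8", 15),
    ("Becker muscular dystrophy", "L1.28", 16),
    ("Duchenne muscular dystrophy", "lambda RC8", 17),
    ("Steroid sulphatase / X-linked ichthyosis", "lambda RC8", 25),
]

for disease, probe, distance in LINKED_PROBES:
    theta = recombination_fraction(distance)
    print(f"{disease:42s} {probe:20s} {distance:3d} cM  theta = {theta:.3f}")
```

Under these assumptions a marker 7 cM from the disease locus would mislead in roughly 6-7% of meioses and one at 25 cM in close to 20%, which is why tight linkage matters for a reliable diagnosis.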
Estimates of the number of RFLPs thought to be required such that a DNA segment is certain to be linked to an RFLP locus at a distance no greater than 0.1 morgans vary from 150 (Botstein et al. 1980) to 1500 (Lange and Boehnke 1982). It is also necessary to possess a sufficient number of markers segregating in families to determine linkage for the trait in question. For a highly accurate diagnosis, linkage between the marker sequence and the disease locus must be very tight. Moreover, these polymorphisms must be frequent in the population under study in order to render the procedure applicable to a majority of patients. The data compiled in this review demonstrate that these criteria are seldom met and therefore also indicate that much work still remains to be done until analysis at the DNA level becomes a routine procedure in clinical genetics.

After completion of the manuscript a provisional report of the "Committee on Human Gene Mapping By Recombinant DNA Techniques, 1983" was circulated. Whereas this list includes many unpublished reports and work in press, it is far from complete with regard to already published data. We therefore hope that the information included here in Table 1 can be used to augment the Committee's Report.
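The wide range of marker numbers quoted above (150 to 1500) can be put in perspective with a back-of-envelope calculation. The sketch below is not a reconstruction of either cited estimate. It assumes a total genetic length of 3300 cM for the human genome (a figure not given in this text), ignores chromosome ends, and contrasts perfectly evenly spaced markers with markers placed at random.

```python
import math

GENOME_CM = 3300.0  # assumed genetic length of the human genome; not given in the text


def evenly_spaced_markers(max_distance_cm: float, genome_cm: float = GENOME_CM) -> int:
    """Markers needed if they are spaced so that no locus is farther than max_distance_cm."""
    return math.ceil(genome_cm / (2.0 * max_distance_cm))


def coverage_random(n_markers: int, max_distance_cm: float,
                    genome_cm: float = GENOME_CM) -> float:
    """Approximate chance that a random locus lies within max_distance_cm of at
    least one of n randomly placed markers (Poisson approximation)."""
    return 1.0 - math.exp(-2.0 * n_markers * max_distance_cm / genome_cm)


print(evenly_spaced_markers(10))
for n in (150, 1500):
    print(n, round(coverage_random(n, 10), 4))
```

With ideal spacing, a number of markers close to the lower estimate would suffice under these assumptions, whereas with random placement that many markers would leave a large part of the genome more than 0.1 morgans from any marker, and something nearer the upper estimate would be needed for near-complete coverage.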
Acknowledgements. We would like to thank Juiiet Honeycombe and Bryan Young for the provision of clones, and Chris Bostock and Ed Southern for advice and help during the course of our own work included in this review. We also thank Mrs. Doris Immke for secretarial help. J. S. was supported by the Deutsche Forschungsgemeinschaft. References Antonarakis SE, Boehm CD, Giardina PJV, Kazazian HH (1982) Nonrandom association of polymorphic restriction sites in the /Lglobin gene cluster. Proc Natl Acad Sci USA 79 : 137-141 Auffray C, Ben-Nun A, Roux-Dosseto M, Germain RN, Seidman JG, Strominger JL (1983) Polymorphism and complexity of the human DC and murine I-Aa chain genes. EMBO J 2 : 121-124 Bakker E, Wieacker P, Beverstock GC, Pearson PL (1983) Recombinant DNA techniques for mapping the human X chromosome. Clin Genet 23 : 225 Barker D, White R (1982) More base pair change polymorphisms at sites containing CpG. Cytogenet Cell Genet 32 : 253 Basti6-Sigeac F, Lucotte G (1983) Optimal use of restriction enzymes in the analysis of human DNA polymorphism. Hum Genet 63 : 162-165 Bell GI, Pictet RL, Rutter WJ, Cordell B, Tischer E, Goodman HM (1980a) Sequence of the human insulin gene. Nature 284:26-32 Bell GI, Pictet R, Rutter WJ (1980b) Analysis of the regions flanking the human insulin gene and sequence of an Alu family member. Nucleic Acids Res 8 : 4091-4109 Bell GI, Karan JH, Rutter WJ (1981) Polymorphic DNA region adjacent to the 5' end of the human insulin gene. Proc Natl Acad Sci USA 78 : 5759-5762 Bell GI, Selby MJ, Rutter WJ (1982) The highly polymorphic region near the human insulin gene is composed of simple tandemly repeating sequences. Nature 295 : 31-35 Beutler E, Kuhl W, Johnson C (1981) A common mutant EcoRI restriction endonuclease site in the 5' flanking portion of the human a-globin gene. Proc Natl Acad Sci USA 78 : 7056-7058 Bird AP (1980) DNA methylation and the frequency of CpG in animal DNA. 
Nucleic Acids Res 8 : 1499-1504 Blanc H, Chen K-H, D'Amore MA, Wallace DC (1983) Amino acid change associated with the major polymorphic Hinc II site of Oriental and Caucasian mitochondrial DNAs. Am J Hum Genet 35 : 167-176 Boehm CD, Antonarakis SE, Phillips JA, Stetten G, Kazazian HH (1983) Prenatal diagnosis using DNA polymorphisms. Report on 95 pregnancies at risk for sickle-cell disease or fl-thalassemia. N Engl J Med 308 : 1054-1058 B6hme J, Owerbach D, Denaro M, Lernmark A, Peterson PA, Rask L (1983) Human class I1 major histocompatibility antigen /?-chains are derived from at least three loci. Nature 301 : 82-84 Boothby M, Ruddon RW, Anderson C, McWitliams D, Boime I (1981) A single gonadotropin a-subunit gene in normal tissue and tumor-derived cell lines. J Biol Chem 256 : 5121-5127 Botstein D, White RL, Skolnick M, Davis RW (1980) Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 32:314-331 Brown WM, George M, Wilson AC (1979) Rapid evolution of animal mitochondrial DNA. Proc Natl Acad Sci USA 76 : 1967-1971 Calabretta B, Robberson DL, Barrera-Saldafia HA, Lambrou IP, Saunders GF (1982) Genome instability in a region of human DNA enriched in Alu repeat sequences. Nature 296 : 219-225 Camerino G, Mattei MG, Mattei JF, Jaye M, Mandel JL (1983) Close linkage of fragile X-mental retardation syndrome to Haemophilia B and transmission through a normal male. Nature 306 : 701-704 Cann HM, Ascanio L, Paul P, Marcadet A, Dausset J, Cohen D (1983) Polymorphic restriction endonuclease fragment segregates and correlates with the gene for HLA-B8. Proc Natl Acad Sci USA 80 : 1665-1668 Cavenee WK, Dryja TP, Phillips RA, Benedict WF, Godbout R, Gallie BL, Murphree AL, Strong LC, White RL (1983)Expression of recessive alleles by chromosomal mechanisms in retinoblastoma. Nature 305 : 779-784
Chandra T, Stackhouse R, Leicht M, Long G, Kurachi K, Davie EW, Woo SLC (1982) DNA polymorphism in the human chromosomal alpha-1-antitrypsin gene. DNA 1 : 210 Chu M-L, Williams CJ, Pepe G, Hirsch JL, Prockop DJ, Ramirez F (1983) Internal deletion in a collagen gene in a perinatal lethal form of osteogenesis imperfecta. Nature 304 : 78-80 Conner BJ, Reyes AA, Morin C, Itakura K, Teplitz RL, Wallace RB (1983) Detection of sickle cell βS-globin allele by hybridization with synthetic oligonucleotides. Proc Natl Acad Sci USA 80 : 278-282 Cooke HJ, Noel B (1979) Confirmation of Y/autosome translocations using recombinant DNA. Hum Genet 50 : 39-44 Cooper DN (1983) Eukaryotic DNA methylation. Hum Genet 64 : 315-333 Coulondre C, Miller JH, Farabaugh PJ, Gilbert W (1978) Molecular basis of base substitution hotspots in Escherichia coli. Nature 274 : 775-780 Daiger SP, Wildin RS, Su T-S (1982) Sequences on the human Y chromosome homologous to the autosomal gene for argininosuccinate synthetase. Nature 298 : 682-684 Dalla-Favera R, Martinotti S, Gallo RR, Eriksson J, Croce CM (1983) Translocation and rearrangements of the c-myc oncogene locus in human undifferentiated B-cell lymphomas. Science 219 : 963-967 Darlington GJ, Astrin KH, Muirhead SP, Desnick RJ, Smith M (1982) Assignment of human alpha-1-antitrypsin gene to chromosome 14. Cytogenet Cell Genet 32 : 262-263 Davies KE, Pearson PL, Harper PS, Murray JM, O'Brien TO, Sarfarazi M, Williamson R (1983a) Linkage analysis of two cloned DNA sequences flanking the Duchenne muscular dystrophy locus on the short arm of the human X chromosome. Nucleic Acids Res 11 : 2303-2312 Davies KE, Jackson J, Williamson R, Harper PS, Ball S, Sarfarazi M, Meredith L, Fey G (1983b) Linkage analysis of myotonic dystrophy and sequences on chromosome 19 using a cloned complement 3 gene probe. 
J Med Genet 20 : 259-263 De Martinville B, Wyman HR, White R, Francke U (1982a) Assignment of the first random restriction fragment length polymorphism (RFLP) locus (D14S1) to a region of human chromosome 14. Am J Hum Genet 34 : 216-226 De Martinville B, Wyman A, White R, Francke U (1982b) Assignment of the first highly polymorphic marker locus to a human chromosome region. Cytogenet Cell Genet 32 : 265 Denaro M, Blanc H, Johnson MJ, Chen KH, Wilmsen E, Cavalli-Sforza LL, Wallace DC (1981) Ethnic variation in Hpa II endonuclease cleavage patterns of human mitochondrial DNA. Proc Natl Acad Sci USA 78 : 5768-5772 Driesel AJ, Schumacher AM, Flavell RA (1982) A Hind III restriction site polymorphism in the human collagen a(I)-like gene on chromosome No. 7. Hum Genet 62 : 175-176 Driscoll MC, Baird M, Bank A (1981) A new polymorphism in the human β-globin gene useful in antenatal diagnosis. J Clin Invest 68 : 915-919 Engel JN, Gunning PW, Kedes L (1981) Isolation and characterization of human actin genes. Proc Natl Acad Sci USA 78 : 4674-4678 Erickson JM, Rushford CL, Dorney DJ, Wilson GN, Schmickel RD (1981) Structure and variation of human ribosomal DNA: molecular analysis of cloned fragments. Gene 16 : 1-9 Erlich HA, Stetler D, Saiki R, Gladstone P, Pious D (1983) Mapping of the genes encoding the HLA-DRα chain and the HLA-related antigens to a chromosome 6 deletion by using genomic blotting. Proc Natl Acad Sci USA 80 : 2300-2304 Farquhar M, Gelinas R, Tatsis B, Murray J, Yagi M, Mueller R, Stamatoyannopoulos G (1983) Restriction endonuclease mapping of γ-δ-β-globin region in Gγ(β)+ HPFH and a Chinese Aγ HPFH variant. Am J Hum Genet 35 : 611-620 Geever RF, Wilson LB, Nallaseth FS, Milner PF, Bittner M, Wilson JT (1981) Direct identification of sickle cell anemia by blot hybridization. Proc Natl Acad Sci USA 78 : 5081-5085 Giannelli F, Choo KH, Rees DJG, Boyd Y, Rizza CR, Brownlee GG (1983) Gene deletions in patients with haemophilia B and antifactor IX antibodies. 
Nature 303 : 181-182
Goldfarb M, Shimizu K, Perucho M, Wigler M (1982) Isolation and preliminary characterization of a human transforming gene from T24 bladder carcinoma cells. Nature 296 : 404-409 Gray PW, Goeddel DV (1982) Structure of the human immune interferon gene. Nature 298 : 859-863 Gusella JF, Wexler NS, Conneally PM, Naylor SL, Anderson MA, Tanzi RE, Watkins PC, Ottina K, Wallace MR, Sakaguchi AY, Young AB, Shoulson I, Bonilla E, Martin JB (1983) A polymorphic DNA marker genetically linked to Huntington's disease. Nature 306 : 234-238 Haneda M, Chan SJ, Kwok SCM, Rubenstein AH, Steiner DF (1983) Studies on mutant human insulin genes: identification and sequence analysis of a gene encoding [SerB24] insulin. Proc Natl Acad Sci USA 80 : 6366-6370 Harris H, Hopkinson DA (1972) Average heterozygosity per locus in man: an estimate based on the incidence of enzyme polymorphisms. Ann Hum Genet 36 : 9-20 Hieter PA, Hollis GF, Korsmeyer SJ, Waldmann TA, Leder P (1981) Clustered arrangement of immunoglobulin λ constant region genes in man. Nature 294 : 536-540 Higgs DR, Goodbourn SEY, Wainscoat JS, Clegg JB, Weatherall DJ (1981) Highly variable regions of DNA flank the human α globin genes. Nucleic Acids Res 9 : 4213-4224 Hill MEE, Davies KE, Harper P, Williamson R (1982) The Mendelian inheritance of a human X chromosome-specific DNA sequence polymorphism and its use in linkage studies of genetic disease. Hum Genet 60 : 222-226 Hollis GF, Hieter PA, McBride OW, Swan D, Leder P (1982) Processed genes: a dispersed human immunoglobulin gene bearing evidence of RNA type processing. Nature 296 : 321-325 Jeffreys AJ (1979) DNA sequence variants in the Gγ-, Aγ-, δ- and β-globin genes of man. Cell 18 : 1-10 Kan YW, Dozy AM (1978) Polymorphism of DNA sequence adjacent to human β-globin structural gene: relationship to sickle mutation. Proc Natl Acad Sci USA 75 : 5631-5635 Kan YWK, Lee KY, Furbetta M, Angius A, Cao A (1980) Polymorphism of DNA sequence in the β-globin gene region. 
N Engl J Med 302 : 185-188 Kao F-T, Hawkins JW, Law ML, Dugaiczyk A (1982) Assignment of the structural gene coding for albumin to human chromosome 4. Hum Genet 62 : 337-341 Karathanasis SK, Norum RA, Zannis VI, Breslow JL (1983a) An inherited polymorphism in the human apolipoprotein A-I gene locus related to the development of atherosclerosis. Nature 301 : 718-720 Karathanasis SK, McPherson J, Zannis VI, Breslow J (1983b) Linkage of human apolipoproteins A-I and C-III genes. Nature 304 : 371-373 Karathanasis SK, Zannis VI, Breslow JL (1983c) A DNA insertion in the apolipoprotein A-I gene of patients with premature atherosclerosis. Nature 305 : 823-825 Karin M, Richards RI (1982) Human metallothionein genes, primary structure of the metallothionein-II gene and a related processed gene. Nature 299 : 797-802 Kidd VJ, Wallace RB, Itakura K, Woo SLC (1983) α1-Antitrypsin deficiency detection by direct analysis of the mutation in the gene. Nature 304 : 230-234 Kingston HM, Thomas NST, Pearson PL, Sarfarazi M, Harper PS (1983) Genetic linkage between Becker muscular dystrophy and a polymorphic DNA sequence on the short arm of the X chromosome. J Med Genet 20 : 255-258 Kohen G, Philippe N, Godet J (1982) Polymorphism of the Hinf I restriction site located 1 kb 5' to the human β-globin gene. Hum Genet 62 : 121-123 Krystal M, D'Eustachio P, Ruddle FH, Arnheim N (1981) Human nucleolar organizers on nonhomologous chromosomes can share the same ribosomal gene variants. Proc Natl Acad Sci USA 78 : 5744-5748 Lange K, Boehnke M (1982) How many polymorphic marker genes will it take to span the human genome? Am J Hum Genet 34 : 842-845 Lawn RM, Fritsch EF, Parker RC, Blake G, Maniatis T (1978) The isolation and characterization of linked δ- and β-globin genes from a cloned library of human DNA. Cell 15 : 1151-1174
Lebo RV, Kan YW, Cheung MC, Carrano AV, Yu L-C, Change JC, Cordell B, Goodman HM (1982) Assigning the polymorphic human insulin gene to the short arm of chromosome 11 by chromosome sorting. Hum Genet 60 : 10-15 Lebo RV, Chakravarti A, Buetow KH, Cheung M-C, Cann H, Cordell B, Goodman H (1983) Recombination within and between the human insulin and β-globin gene loci. Proc Natl Acad Sci USA 80 : 4808-4812 Lee JS, Trowsdale J, Bodmer WF (1982) cDNA clones coding for the heavy chain of human HLA-DR antigen. Proc Natl Acad Sci USA 79 : 545-549 Lefranc M-P, Lefranc G, Rabbitts TH (1982) Inherited deletion of immunoglobulin heavy chain constant region genes in normal human individuals. Nature 300 : 760-762 Leinwand LA, Saez L, McNally E, Nadal-Ginard B (1983a) Isolation and characterization of human myosin heavy chain genes. Proc Natl Acad Sci USA 80 : 3716-3720 Leinwand LA, Fournier REK, Nadal-Ginard B, Shows TB (1983b) Multigene family for sarcomeric myosin heavy chain in mouse and human DNA: Localization on a single chromosome. Science 221 : 766-769 Lerman MI, Thayer RE, Singer MF (1983) Kpn I family of long interspersed repeated DNA sequences in primates: polymorphism of family members and evidence for transcription. Proc Natl Acad Sci USA 80 : 3966-3970 Lie-Injo LE, Herrera AR, Kan YW (1981) Two types of triplicated α-globin loci in humans. Nucleic Acids Res 9 : 3707-3717 Little PFR (1981) DNA analysis and the antenatal diagnosis of hemoglobinopathies. In: Williamson R (ed) Genetic engineering, vol I. Academic Press, London Little PFR, Annison G, Darling S, Williamson R, Camba L, Modell B (1980) Model for antenatal diagnosis of β-thalassaemia and other monogenic disorders by molecular analysis of linked DNA polymorphisms. Nature 285 : 144-147 McKusick VA (1982) The human gene map. 20. October 1982. Clin Genet 22 : 359-391 Matthyssens G, Rabbitts TH (1980) Structure and multiplicity of genes for the human immunoglobulin heavy chain variable region. 
Proc Natl Acad Sci USA 77 : 6561-6565 Meisler MH (1983) DNA polymorphisms. Nature 303 : 108 Michelson AM, Markham AF, Orkin SH (1983) Isolation and DNA sequence of a full-length cDNA clone for human X-chromosome-encoded phosphoglycerate kinase. Proc Natl Acad Sci USA 80 : 472-476 Mignone N, Feder J, Cann H, van West B, Hwang J, Takahashi N, Honjo T, Piazza A, Cavalli-Sforza LL (1983) Multiple DNA fragment polymorphisms associated with immunoglobulin μ chain switch-like regions in man. Proc Natl Acad Sci USA 80 : 467-471 Murray JM, Davies KE, Harper PS, Meredith L, Mueller CR, Williamson R (1982) Linkage relationship of a cloned DNA sequence on the short arm of the X chromosome to Duchenne muscular dystrophy. Nature 300 : 69-71 Murray JC, Demopulos CM, Lawn RM, Motulsky AG (1983) Molecular genetics of human serum albumin: Restriction enzyme fragment length polymorphisms and analbuminemia. Proc Natl Acad Sci USA 80 : 5951-5955 Myers JC, Dickson LA, de Wet WJ, Bernard MP, Chu M-L, Di Liberto M, Pepe G, Sangiorgi FO, Ramirez F (1983) Analysis of the 3' end of the human pro-α2(I) collagen gene. J Biol Chem 258 : 10128-10135 Naylor SL, Sakaguchi AY, Gusella JF, Housman D, Shows TB (1982a) Mapping of an arbitrary restriction polymorphism (ARP2) to human chromosome 3. Cytogenet Cell Genet 32 : 302 Naylor SL, Sakaguchi AY, Gutai MW, Schmickel RD, Shows TB (1982b) Chromosomal organization of rDNA spacer length variants on human chromosomes 13, 14, 15, 21, and 22 in somatic cell hybrids. Cytogenet Cell Genet 32 : 302 Naylor SL, Sakaguchi AY, Shen L-P, Bell GI, Rutter WJ, Shows TB (1983a) Polymorphic human somatostatin gene is located on chromosome 3. Proc Natl Acad Sci USA 80 : 2686-2689 Naylor SL, Sakaguchi AY, Shows TB, Grzeschik K-H, Holmes M, Zasloff M (1983b) Two nonallelic tRNA Met i genes are located in the p23→q12 region of human chromosome 6. Proc Natl Acad Sci USA 80 : 5027-5031
Nei M (1975) Molecular population genetics and evolution. North Holland Publishing Co, Amsterdam Nishida Y, Miki T, Hisajima H, Honjo T (1982) Cloning of human immunoglobulin ε chain genes: evidence for multiple Cε genes. Proc Natl Acad Sci USA 79 : 3833-3837 Nussbaum RL, Crowder WE, Nyhan WL, Caskey CT (1983) A three-allele restriction-fragment-length polymorphism at the hypoxanthine phosphoribosyltransferase locus in man. Proc Natl Acad Sci USA 80 : 4035-4039 Ohno S (1967) Sex chromosomes and sex-linked genes. Springer, Berlin Old JM, Wainscoat JS (1983) A new DNA polymorphism in the β-globin gene cluster can be used for antenatal diagnosis of β-thalassaemia. Br J Haematol 53 : 337-341 Orkin SH, Alter BP, Altay C, Mahoney MJ, Lazarus H, Hobbins JC, Nathan DG (1978) Application of endonuclease mapping to the analysis and prenatal diagnosis of thalassemias caused by globin-gene deletion. N Engl J Med 299 : 166-172 Orkin SH, Kazazian HH, Antonarakis SE, Goff SC, Boehm CD, Sexton JP, Waber PG, Giardina PJV (1982) Linkage of β-thalassaemia mutations and β-globin gene polymorphisms with DNA polymorphisms in human β-globin gene cluster. Nature 296 : 627-631 Owerbach D, Rutter WJ, Cooke NE, Martial JA, Shows TB (1981) The prolactin gene is located on chromosome 6 in humans. Science 212 : 815-816 Owerbach D, Lernmark A, Rask L, Peterson PA, Platz P, Svejgaard A (1983a) Detection of HLA-D/DR-related DNA polymorphism in HLA-D homozygous typing cells. Proc Natl Acad Sci USA 80 : 3758-3761 Owerbach D, Lernmark A, Platz P, Ryder LP, Rask L, Peterson PA, Ludvigsson J (1983b) HLA-D region β-chain DNA endonuclease fragments differ between HLA-DR identical healthy and insulin-dependent diabetic individuals. Nature 303 : 815 Page D, De Martinville B, Barker D, Wyman A, White R, Francke U, Botstein D (1982) Single copy sequence hybridizes to polymorphic and homologous loci on human X and Y chromosomes. 
Proc Natl Acad Sci USA 79 : 5352-5356 Panny SR, Scott AF, Smith KD, Phillips JA, Kazazian HH, Talbot CC, Boehm CD (1981) Population heterogeneity of the Hpa I restriction site associated with the β-globin gene: implications for prenatal diagnosis. Am J Hum Genet 33 : 25-35 Parks JS, Herd JE, Wurzel JM (1981) Human growth hormone (HGH) deficiency and polymorphism within the HGH and human placental lactogen (HPL) gene cluster. Pediatr Res 15 : 513 Pearson PL, Bakker E, Flavell RA (1982) Considerations in designing an efficient strategy for localizing unique sequence DNA fragments to human chromosomes. Cytogenet Cell Genet 32 : 308 Phillips JA, Panny SR, Kazazian HH, Boehm CD, Scott AF, Smith KD (1980) Prenatal diagnosis of sickle cell anemia by restriction endonuclease analysis: Hind III polymorphisms in γ-globin genes extend test applicability. Proc Natl Acad Sci USA 77 : 2853-2856 Phillips JA, Hjelle BL, Seeburg PH, Zachmann M (1981) Molecular basis for familial isolated growth hormone deficiency. Proc Natl Acad Sci USA 78 : 6372-6375 Phillips JA, Parks JS, Hjelle BL, Herd JE, Plotnick LP, Migeon CJ, Seeburg PH (1982) Genetic analysis of familial isolated growth hormone deficiency type 1. J Clin Invest 70 : 489-495 Pope FM, Nicholls AC, Grosveld FG (1983) Similar α1(I)-like gene deletions cause some types of Ehlers Danlos syndrome type II and lethal osteogenesis imperfecta. Clin Genet 24 : 303 Prochownik EV, Markham AF, Orkin SH (1983a) Isolation of a cDNA clone for human antithrombin III. J Biol Chem 258 : 8389-8394 Prochownik EV, Antonarakis S, Bauer KA, Rosenberg RD, Fearon ER, Orkin SH (1983b) Molecular heterogeneity of inherited antithrombin III deficiency. N Engl J Med 308 : 1549-1552 Raugei G, Bensi G, Colantuoni V, Romano V, Santoro C, Costanzo F, Cortese R (1983) Sequence of human haptoglobin cDNA: evidence that the α and β subunits are coded by the same mRNA. 
Nucleic Acids Res 11 : 5811-5819 Rees A, Shoulders CC, Stocks J, Galton DJ, Baralle FE (1983) DNA polymorphism adjacent to human apoprotein A-1 gene: relation to hypertriglyceridemia. Lancet 1 : 444 Riggin CH, Pitha PM (1982) Methylation and a polymorphic restriction site adjacent to human β-interferon gene. DNA 1 : 267-271
Roberts D (1982) Applications of polymorphisms in anthropogenetic studies. Hum Biol 54 : 175 Rotwein P, Chyn R, Chirgwin J, Cordell B, Goodman HM, Permutt MA (1981) Polymorphism in the 5'-flanking region of the human insulin gene and its possible relation to type 2 diabetes. Science 213 : 1117-1120 Rotwein PS, Chirgwin J, Province M, Knowler WC, Pettitt DJ, Cordell B, Goodman HM, Permutt MA (1983) Polymorphism in the 5' flanking region of the human insulin gene: a genetic marker for non-insulin-dependent diabetes. N Engl J Med 308 : 65-71 Sakaguchi AY, Naylor SL, Schmickel RD, Shows TB (1982) Assignment of an arbitrary restriction fragment, ARF-1 to human chromosome 6. Cytogenet Cell Genet 32 : 313-314 Sakaguchi AY, Naylor SL, Shows TB, Toole JJ, McCoy M, Weinberg RA (1983) Human c-Ki-ras 2 proto-oncogene on chromosome 12. Science 219 : 1081-1085 Salser W (1978) Globin mRNA sequences: analysis of base pairing and evolutionary implications. Cold Spring Harbor Symp Quant Biol 42 : 985-1002 Santos E, Tronick SR, Aaronson SA, Pulciani S, Barbacid M (1982) T24 human bladder carcinoma oncogene is an activated form of the normal human homologue of BALB- and Harvey-MSV transforming genes. Nature 298 : 343-347 Schäfer M, White R (1982) Three random loci in the genome with base pair change polymorphisms. Cytogenet Cell Genet 32 : 314-315 Schmickel RD, Waterson JR, Knoller M, Szura LL, Wilson GN (1980) HeLa cell identification by analysis of ribosomal DNA segment patterns generated by endonuclease restriction. Am J Hum Genet 32 : 890-897 Schmid M, Gall H, Schempp W, Weber L, Schmidtke J (1981) Characterization of a new aberration of the human Y chromosome by banding methods and DNA restriction endonuclease analysis. Hum Genet 59 : 26-35 Shows TB, Sakaguchi AY, Naylor SL (1982) Mapping the human genome, cloned genes, DNA polymorphisms and inherited disease. 
Adv Hum Genet 12 : 341-452 Shmookler Reis RJ, Lumpkin CK, McGill JR, Riabowol KT, Goldstein S (1983) Extra-chromosomal circular copies of an "InterAlu" unstable sequence in human DNA are amplified during in vitro and in vivo ageing. Nature 301 : 394-398 Skolnick MH, Francke U (1982) Report of the committee on human gene mapping by recombinant DNA techniques. Cytogenet Cell Genet 32 : 194-204 Soriano P, Szabo P, Bernardi G (1982) The scattered distribution of actin genes in the mouse and human genomes. EMBO J 1 : 579-583 Southern EM (1982) Application of DNA analysis to mapping the human genome. Cytogenet Cell Genet 32 : 52-57 Spritz RA, Forget BG (1983) The thalassemias: molecular mechanisms of human genetic disease. Am J Hum Genet 35 : 333-361 Stetler D, Das H, Nunberg JH, Saiki R, Sheng-Dong R, Mullis KB, Weissman SM, Erlich HA (1982) Isolation of a cDNA clone for the human HLA-DR antigen α chain by using a synthetic oligonucleotide as a hybridization probe. Proc Natl Acad Sci USA 79 : 5966-5970 Su T-S, Bock H-GO, Beaudet AL, O'Brien WE (1982) Molecular analysis of argininosuccinate synthetase deficiency in human fibroblasts. J Clin Invest 70 : 1334-1335 Sukumaran PK, Nakatsuji T, Gardiner MB, Reese AL, Gilman JG, Huisman THJ (1983) Gamma thalassemia resulting from the deletion of a γ-globin gene. Nucleic Acids Res 11 : 4635-4643 Tabin CJ, Bradley SM, Bargmann CI, Weinberg RA, Papageorge AG, Scolnick EM, Dhar R, Lowy DR, Chang EH (1982) Mechanism of activation of a human oncogene. Nature 300 : 143-149 Taub RA, Hollis GF, Hieter PA, Korsmeyer S, Waldmann TA, Leder P (1983) Variable amplification of immunoglobulin λ light-chain genes in human populations. Nature 304 : 172-174 Trowsdale J, Lee J, Carey J, Grosveld F, Bodmer J, Bodmer W (1983) Sequences related to HLA-DR α chain on human chromosome 6: restriction enzyme polymorphism detected with DC α chain probes. Proc Natl Acad Sci USA 80 : 1972-1976
Tsipouras P, Myers J, Prockop D, Ramirez F (1983) Genetic analysis of the mild autosomal dominant osteogenesis imperfecta with restriction fragment length polymorphism associated with the pro-α2(I) collagen gene. Abstracts, 34th annual meeting, American Society of Human Genetics : 182A Tuan D, Biro PA, deRiel JK, Lazarus H, Forget BG (1979) Restriction endonuclease mapping of the γ-globin gene loci. Nucleic Acids Res 6 : 2519-2544 Ullrich A, Dull TJ, Gray A, Philips JA, Peter S (1982) Variation in the sequence and modification state of the human insulin gene flanking regions. Nucleic Acids Res 10 : 2225-2240 Van der Ploeg LHR, Flavell RA (1980) DNA methylation in the human γδβ-globin locus in erythroid and non-erythroid tissues. Cell 19 : 947-958 Vogel F (1972) Non-randomness of base replacement in point mutation. J Mol Evol 1 : 334-367 Vogel F, Kopun M (1977) Higher frequencies of transitions among point mutations. J Mol Evol 9 : 159-180 Vogel F, Motulsky AG (1982) Human genetics, 2nd edn. Springer, Berlin Wake CT, Long EO, Strubin M, Gross N, Accolla R, Carrel S, Mach B (1982a) Isolation of cDNA clones encoding HLA-DR α chains. Proc Natl Acad Sci USA 79 : 6979-6983 Wake CT, Long EO, Mach B (1982b) Allelic polymorphism and complexity of the genes for HLA-DR β-chains: direct analysis by DNA-DNA hybridization. Nature 300 : 372-374 Whitehead AS, Bruns GAP, Markham AF, Colten HR, Woods DE (1983) Isolation of human C-reactive protein complementary DNA and localization of the gene to chromosome 1. Science 221 : 69-71 Wieacker P, Davies KE, Mevorah B, Ropers HH (1983a) Linkage studies in a family with X-linked recessive ichthyosis employing a cloned DNA sequence from the distal short arm of the X chromosome. Hum Genet 63 : 113-116 Wieacker P, Horn N, Pearson P, Wienker TF, McKay E, Ropers HH (1983b) Menkes kinky hair disease: a search for closely linked restriction fragment length polymorphism. 
Hum Genet 64 : 139-142 Wieacker P, Wienker TF, Dallapiccola B, Bender K, Davies KE, Ropers HH (1983c) Linkage relationships between retinoschisis, Xg, and a cloned DNA sequence from the distal short arm of the X chromosome. Hum Genet 64 : 143-145 Wilson JT, Milner PF, Summer MF, Nallaseth FS, Fadel HE, Reindollar RH, McDonough PG, Wilson LB (1982a) Use of restriction endonucleases for mapping the allele for βS-globin. Proc Natl Acad Sci USA 79 : 3628-3631 Wilson GN, Szura LL, Rushford C, Jackson D, Erickson J (1982b) Structure and variation of human ribosomal DNA: the external transcribed spacer and adjacent regions. Am J Hum Genet 34 : 32-49 Wilson JM, Young AB, Kelley WN (1983) Hypoxanthine-guanine phosphoribosyltransferase deficiency: the molecular basis of the clinical syndromes. N Engl J Med 309 : 900-910 Winichagoon P, Higgs DR, Goodbourn SEY, Lamb J, Clegg JB, Weatherall DJ (1982) Multiple arrangements of the human embryonic zeta globin genes. Nucleic Acids Res 10 : 5853-5868 Woo SLC, Lidsky AS, Güttler F, Chandra T, Robson KJH (1983) Cloned human phenylalanine hydroxylase gene allows prenatal diagnosis and carrier detection of classical phenylketonuria. Nature 306 : 151-155 Wyman AR, White R (1980) A highly polymorphic locus in human DNA. Proc Natl Acad Sci USA 77 : 6754-6758 Yang TP, Patel PI, Brennand J, Chinault AC, Caskey CT (1983) Molecular analysis of the human HPRT locus. Abstracts, 34th annual meeting, American Society of Human Genetics : 185A Yokoyama S (1983) Polymorphism in the 5'-flanking region of the human insulin gene and the incidence of diabetes. Am J Hum Genet 35 : 193-200
Received October 10, 1983 / Revised October 31, 1983