Diabetologia
Diabetologia 18, 431-436 (1980)
© by Springer-Verlag 1980
Editorial Review
Recombinant DNA - A New Source of Insulin
W. L. Miller 1 and J. D. Baxter 2
Departments of Pediatrics, Medicine, and Biochemistry and Biophysics, Howard Hughes Medical Institute Laboratories and Metabolic Research Unit, University of California, San Francisco, California, USA
1 Clinical Investigator of the NIAMDD
2 Investigator of the Howard Hughes Medical Institute
0012-186X/80/0018/0431/$01.20

Recombinant DNA technology may soon result in the large-scale synthesis of insulin useful for medical therapy. This technology, which developed from decades of basic research on the nature of DNA, now offers the means to synthesize virtually any useful protein, such as insulin, and provides a vital tool for studying gene activity. In this article, the production of human insulin is reviewed as an example of the power and usefulness of this technology. However, other developments in this field, such as the ability to insert genes back into human cells, may ultimately provide more complete approaches to the therapy of diabetes.

"Recombinant DNA" specifically refers to the joining of two separate pieces of DNA; however, in general usage, the term refers to a number of biochemical techniques for handling DNA. These include cutting DNA at specific sites, synthesizing specific DNA molecules, inserting DNA into bacteria or mammalian cells so that the cell will replicate the DNA (cloning), and manipulating cloned DNA so that the host cell makes the protein for which the DNA codes. The term "cloning" is used because a bacterium receiving a properly constructed recombinant DNA molecule will propagate that molecule as the bacterium divides, so that all daughter bacteria contain identical "cloned" DNAs derived from the single, original recombinant molecule. Recombinant systems may be devised wherein the bacteria not only propagate the cloned DNA but also express it, i.e., transcribe it into messenger RNA and translate that RNA into the protein coded by the gene (for review see [1]).

Cloning of the gene for human insulin may thus permit the production of human insulin by bacteria. Such bacteria could provide a sure and consistent source of unlimited amounts of insulin. Whereas at present there is generally enough insulin for medical needs, the supply from animal sources may not keep up with the demand as the world's population increases and receives better health care. Furthermore, patients in whom animal insulins are significantly antigenic may benefit from treatment with human insulin.

When these techniques first became possible in the early 1970s, some biochemists expressed concern that inserting foreign DNA into bacteria might produce harmful organisms [2]. Some feared that genes responsible for cancer or for the production of toxins could be cloned accidentally, and that bacteria containing these genes might survive and become infectious. As a result, a number of agencies, including the National Institutes of Health, promulgated a series of guidelines which regulated the source and purity of DNA to be cloned, the type of bacterial DNA into which this could be inserted for cloning, the organisms that could receive such recombinant DNA, and the physical containment precautions to be observed in the laboratories [3]. Numerous experiments now indicate that these initial fears were probably unwarranted. For example, experiments with cloned DNA of polyoma virus (a tumor-causing virus which is highly infective in mice) failed to demonstrate that infection could be induced in susceptible animals by E. coli harbouring recombinant polyoma DNA [4, 5]. Therefore, although certain restrictions remain in force, it is now possible to proceed with useful experiments.

Mammalian and bacterial cells synthesize proteins in similar fashions. DNA is transcribed by a complex polymerase into RNA. The sequence of nucleotides in the RNA is dictated by base pairing wherein adenine (A) pairs with thymine (T) in
W.L. Miller and J. D. Baxter: Recombinant DNA
[Figure 1. Flow diagram titled "Generation of gene fragment analogues by reverse transcription". Labelled steps: RNA; oligo(dT)-cellulose chromatography; poly(A) mRNA; reverse transcriptase; NaOH; cDNA; reverse transcriptase and Pol I; DS cDNA; blunt-ended DS cDNA; linkers, T4 ligase, Eco RI; clone in plasmid.]
Fig. 1. Preparation of DNA complementary to insulin mRNA. RNA was prepared from rat pancreatic islet cells which contain mRNA coding for insulin. Insulin mRNA, like most mRNAs from higher organisms, is polyadenylated, so that it was among the mRNAs prepared by affinity chromatography on oligo(dT)-cellulose [12]. This mRNA was copied into single-stranded DNA complementary to it (cDNA) by reverse transcriptase from avian myeloblastosis virus [10, 11]. After alkali digestion of the mRNA template, the single-stranded cDNA was again reverse-transcribed into a double-stranded cDNA. This results in a hairpin loop because the first strand folds back on itself to initiate the synthesis of the second strand. The hairpin was opened and the single-stranded ends were partially digested with S1 nuclease [13], an enzyme which digests single-stranded but not double-stranded DNA. Any remaining single-stranded ends not destroyed by S1 were filled in by DNA polymerase I to yield "blunt-ended" cDNA. Synthetic DNA "linkers" containing the recognition sequence of the restriction enzyme Eco RI [14] were attached to the blunt-ended cDNA by DNA ligase from bacteriophage T4. The cDNA was then digested with Eco RI to yield cohesive termini suitable for ligation into a plasmid as shown in Figure 2.
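The copying steps in this legend can be illustrated with a minimal sketch in Python. It is an illustration under stated assumptions only: strands are represented as plain strings written 5' to 3', the example mRNA is invented rather than taken from insulin mRNA, and the function names (has_poly_a_tail, reverse_transcribe, second_strand) are hypothetical. The hairpin, S1 nuclease, and linker steps are not modelled.

```python
# Illustrative sketch only: strings stand in for nucleic acid strands,
# all written 5'->3'. The example mRNA is invented, not insulin mRNA.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def has_poly_a_tail(mrna: str, min_len: int = 4) -> bool:
    """True if the mRNA ends in at least min_len adenines.

    Such a tail is what lets the mRNA bind to oligo(dT)-cellulose.
    """
    return mrna.endswith("A" * min_len)


def reverse_transcribe(mrna: str) -> str:
    """Return the first-strand cDNA (5'->3') complementary to the mRNA."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(cdna: str) -> str:
    """Return the DNA strand (5'->3') complementary to the first strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna))


mrna = "GGCAUGUUCGUGAACUAAAAAA"        # invented message with a poly(A) tail
first = reverse_transcribe(mrna)       # begins with the T's copied from the tail
second = second_strand(first)          # same sequence as the mRNA, T for U

print(has_poly_a_tail(mrna), first, second)
```

Under this simplification the second strand has the same sequence as the original mRNA with T in place of U, which is why the double-stranded cDNA carries the coding information of the message.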
DNA or uracil (U) in RNA, and guanine (G) pairs with cytosine (C). Thus, the RNA contains bases that are "complementary" to those in the strand of DNA that was transcribed. The RNA may undergo further processing before it is used as messenger RNA (mRNA). The mRNA is translated into protein on ribosomes, where it directs the assembly of amino
acids (carried to the ribosome by transfer RNA) into polypeptides.

The mRNA for insulin is translated into a polypeptide chain of 104 amino acids [6] termed preproinsulin. The first 23 amino acids, which constitute the "pre" sequence of the protein (sometimes also referred to as a "signal peptide"), appear to facilitate the entry of the protein into the endoplasmic reticulum of the cell by hydrophobic interactions [7, 8]. The pre-sequence is cleaved from the protein during this process, leaving proinsulin within the endoplasmic reticulum. Further processing occurs when the C-peptide, a 30 amino acid sequence in the middle of the proinsulin chain, is excised by a trypsin-like proteolytic enzyme to yield the insulin secreted by the β-cell. The A-chain (21 amino acids) and B-chain (30 amino acids) remain joined to one another by disulfide bonds which formed between cysteine residues during the initial synthesis of the protein. Thus, to obtain human insulin by recombinant DNA technology, it is first necessary to clone DNA coding for insulin and to manipulate the bacteria to transcribe and translate that DNA into protein. That protein must then be processed so that it contains the A and B chains connected properly, without a C-peptide or a pre-sequence.

The first step in the production of insulin by bacteria was achieved in 1977. Ullrich et al. [9] prepared mRNA from the islets of Langerhans of rats, then synthesized double-stranded DNA complementary to this mRNA (cDNA) using reverse transcriptase, a viral enzyme which makes DNA from a template of RNA [10, 11]. The details of these procedures are shown in Figure 1. This DNA was then transferred to bacteria with the use of a plasmid. Plasmids (resistance-transfer factors) are circular pieces of DNA which will replicate in a bacterium independently of the bacterial chromosome. The insulin cDNA was inserted into the plasmid with the aid of restriction endonucleases.
These enzymes recognize and cut specific sequences of DNA four to six nucleotides long (Fig. 2). The plasmid used contained a single site for the restriction enzyme Eco RI. This enzyme recognizes the sequence

    5' GAATTC 3'
    3' CTTAAG 5'

and cleaves it to yield:

    5' ...G        AATTC... 3'
    3' ...CTTAA        G... 5'

To insert the insulin cDNA into this site, chemically synthesized DNA containing an identical site was added (by an enzyme, DNA ligase) to both ends of the cDNA. Digestion with Eco RI yielded insulin cDNA with the same bases shown above on each end. In this way the ...TTAA on one end of the
[Figures 2 and 3. Diagrams. Figure 2: a linear plasmid opened at its Eco RI site and DS cDNA with Eco RI linkers are joined by ligase to give a recombinant plasmid. Figure 3: a plasmid opened with Pst I is tailed by terminal transferase with excess dG; DS cDNA is tailed by terminal transferase with excess dC; hybridization of the DS cDNA segment with the plasmid gives a recombinant plasmid.]
Fig. 2. Cloning of cDNA in a plasmid. The cDNA, prepared as shown in Figure 1, now has identical single-stranded sequences at each end. These sequences, the remains of the "linkers" following digestion with Eco RI, are the recognition sequence of Eco RI. A plasmid (a circular DNA molecule which confers antibiotic resistance on bacteria) was chosen which contained a single Eco RI site. The plasmid was opened with Eco RI and mixed with the cDNA. The complementary, cohesive Eco RI termini of the plasmid and of the cDNA aligned by hydrogen-bond base pairing, and the strands were sealed with DNA ligase. This chimeric plasmid was then put into suitable strains of E. coli, which then replicated the plasmid.
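The Eco RI cutting and rejoining described in this legend can be sketched in a few lines of Python. This is a simplified illustration, not a model of the actual experiment: only the top strand (5' to 3') is tracked, the cDNA and plasmid sequences are invented, and the helper names (digest_linear, open_plasmid, count_sites_circular) are hypothetical.

```python
# Illustrative sketch: top strands are written 5'->3'; sequences are invented.
SITE = "GAATTC"   # Eco RI recognition sequence (identical on both strands)
CUT = 1           # Eco RI cuts between G and A, leaving a 5' AATT overhang
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def digest_linear(top: str) -> list[str]:
    """Cut the top strand of a linear duplex at every Eco RI site."""
    fragments, start = [], 0
    pos = top.find(SITE)
    while pos != -1:
        fragments.append(top[start:pos + CUT])
        start = pos + CUT
        pos = top.find(SITE, pos + 1)
    fragments.append(top[start:])
    return fragments


def open_plasmid(circle: str) -> str:
    """Linearize a circular plasmid that has exactly one Eco RI site.

    Assumes the site does not span the ends of the string.
    """
    if circle.count(SITE) != 1:
        raise ValueError("expected a single Eco RI site")
    pos = circle.find(SITE)
    return circle[pos + CUT:] + circle[:pos + CUT]


def count_sites_circular(circle: str) -> int:
    """Count Eco RI sites in a circular molecule, including across the join."""
    return (circle + circle[:len(SITE) - 1]).count(SITE)


cdna_with_linkers = "CCGAATTCATGCATGGAATTCGG"        # invented cDNA plus linkers
insert = digest_linear(cdna_with_linkers)[1]          # internal fragment
linear_plasmid = open_plasmid("TTGACAGAATTCGGCATA")   # invented plasmid
recombinant = linear_plasmid + insert                 # circular once ligated

print(insert, linear_plasmid, count_sites_circular(recombinant))
```

The overhang AATT is its own reverse complement, which is why the ends of the cut cDNA can pair with the ends of the cut plasmid. In this sketch both junctions of the recombinant regenerate an Eco RI site, so the insert can later be cut out again with the same enzyme.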
Fig. 3. Cloning of cDNA by "tailing". A bacterial plasmid such as pBR322, which has been specially designed for use in recombinant DNA experiments [17], is cleaved with a restriction endonuclease such as Pst I. As there is only one Pst I site in pBR322, the cleavage results in a linear plasmid having short single-stranded 3' ends. A polymeric chain ("tail") of deoxyguanosine residues is added to these 3' ends by terminal transferase, an enzyme which extends single-stranded 3' ends [18]. Deoxycytidine residues are added to linear double-stranded cDNA in the same manner. The tailed cDNA and tailed plasmid are mixed, and the 3' ends of poly(G) and poly(C) hybridize by base pairing, forming a stable, circular, recombinant plasmid. In practice, the tails of dC and dG are 15-30 nucleotides in length, forming very stable hybrids which do not need to be sealed by DNA ligase as shown in Figure 2.
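The tailing step in this legend can be illustrated with a small Python sketch. It assumes strands are strings written 5' to 3' and treats terminal transferase simply as appending a homopolymer to the 3' end; the sequences, tail lengths, and function names (add_tail, paired_length) are hypothetical choices made for the example.

```python
# Illustrative sketch: strands are strings written 5'->3'; sequences invented.
PAIRS = {("G", "C"), ("C", "G"), ("A", "T"), ("T", "A")}


def add_tail(strand: str, base: str, length: int) -> str:
    """Extend the 3' end of a strand with a homopolymer tail."""
    return strand + base * length


def paired_length(tail_a: str, tail_b: str) -> int:
    """Number of consecutive base pairs two antiparallel tails can form."""
    pairs = 0
    for a, b in zip(tail_a, reversed(tail_b)):
        if (a, b) not in PAIRS:
            break
        pairs += 1
    return pairs


plasmid_end = add_tail("TTGACA", "G", 20)   # dG tail on a plasmid 3' end
cdna_end = add_tail("GATTACA", "C", 15)     # dC tail on a cDNA 3' end
overlap = paired_length(plasmid_end[-20:], cdna_end[-15:])

print(overlap)
```

Two tails of the same base form no pairs in this model, so a dG-tailed plasmid cannot simply close on itself, whereas a dG tail and a dC tail pair along the full length of the shorter one.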
cDNA could bind by base pairing to the AATT... on one end of the cleaved plasmid, and the AATT... on the other end of the cDNA could bind to the ...TTAA on the other end of the cleaved plasmid. The resulting recombinant DNA molecule was then inserted into a strain of E. coli which can only live in special laboratory environments [15]. A colony of E. coli containing the recombinant plasmid was selected and grown in order to prepare a large amount of plasmid DNA, all derived from the single original recombinant plasmid. The cloned DNA complementary to the mRNA for rat preproinsulin was excised with the Eco RI enzyme and its nucleotide sequence was determined. The cloned DNA contained the complete coding region of rat proinsulin, part of the
pre-peptide sequence, and the entire DNA sequence coding for an untranslated portion of the mRNA on its 3' terminal end.

A year later, Villa-Komaroff et al. achieved the next step, the creation of bacterial clones which actually synthesized proinsulin [16]. Beginning with mRNA from a rat insulinoma, these investigators synthesized cDNA as did Ullrich et al. Rather than completing the synthesis of the double-stranded cDNA with DNA polymerase I and adding restriction sites to the ends of the cDNA, as shown in Figure 1, they added single-stranded "tails" of polymeric deoxycytidine to the 3' ends of the cDNA, and tails of polymeric deoxyguanosine to the 3' ends of a plasmid (pBR322) which had been opened at its single site recognized by the restriction endonuclease Pst I
[Displayed sequence: the Eco RI recognition site, 5' GAATTC 3' paired with 3' CTTAAG 5'.]
[Figure 4. Diagram. One E. coli strain carries the β-gal gene joined to the insulin A-chain gene and another carries the β-gal gene joined to the B-chain gene. For each strain: lyse cells, cleave with CNBr, partially purify the chain. The insulin A-chain and insulin B-chain are then combined by oxidation to give active insulin.]
Fig. 4. Production of human insulin from cloned synthetic genes. Chemically synthesized pieces of double-stranded DNA were designed to contain the information coding for all the amino acids in the A or B chain of human insulin. Each synthetic gene began with the nucleotide triplet coding for methionine and ended with two "stop" signals to terminate translation. The synthetic genes were inserted into plasmids within the gene for β-galactosidase, distal to the site where synthesis of the mRNA for β-galactosidase begins. These plasmids were put into bacteria, which were then grown in medium containing galactose, but no glucose. In order to metabolize galactose, the bacteria had to make β-galactosidase, which was connected by a methionine residue to an insulin A or B chain. Bacterial proteins were prepared and cleaved with cyanogen bromide, which cleaves proteins only at methionine residues. The liberated A or B chains were partially purified by differential solubility, DEAE-cellulose chromatography, Sephadex G-75 gel filtration, and RP-8 reversed-phase column chromatography. The preparations of A and B chains were reduced, mixed, and oxidized to yield immunologically active insulin.
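The cyanogen bromide step in this legend can be sketched as a simple string operation in Python. The sketch assumes proteins are written in one-letter amino acid code and that cleavage occurs on the carboxyl side of every methionine (M). The human insulin chain sequences are supplied here for illustration and do not appear in the text, the carrier sequence is an invented placeholder rather than real β-galactosidase, and the function name cnbr_cleave is hypothetical.

```python
# Illustrative sketch: proteins in one-letter code, N-terminus first.
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"            # human insulin A chain, 21 residues
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"   # human insulin B chain, 30 residues
CARRIER = "MKTLLDSMAVRG"                     # invented stand-in for beta-galactosidase


def cnbr_cleave(protein: str) -> list[str]:
    """Split a protein after every methionine residue."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Fused protein: carrier, then the linking methionine, then an insulin chain.
fusion_a = CARRIER + "M" + A_CHAIN
fusion_b = CARRIER + "M" + B_CHAIN

released_a = cnbr_cleave(fusion_a)[-1]
released_b = cnbr_cleave(fusion_b)[-1]

print(released_a, released_b)
```

Because neither insulin chain contains methionine, the last fragment in each digest is the intact chain, while the carrier is broken wherever it contains a methionine.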
(Fig. 3). When the plasmid and the cDNA were mixed, the single-stranded tails of dC on the cDNA hybridized by base pairing to the single-stranded tails of dG on the plasmid, thus inserting the cDNA into the plasmid. E. coli transformed with this recombinant plasmid were grown, and colonies harbouring insulin gene sequences were chosen as before. The Pst I site in pBR322 lies within the plasmid gene coding for β-lactamase, which permits the plasmid to confer penicillin resistance on host bacteria. Inserting insulin cDNA into this gene results in the
creation of a fused gene. This results in the synthesis of a fused protein, part of which consists of the first 182 amino acids of β-lactamase (penicillinase), while the remainder contains the amino acids of proinsulin. By analysing the bacterial colonies with a solid-phase radioimmunoassay [19], a clone was identified which synthesized immunoassayable rat proinsulin. The insulin secreted by these bacteria, however, is imperfect. It is fused to a large portion of the penicillinase molecule and it retains the C-peptide. Furthermore, it is rat, not human, insulin. However, the experiment was important because it demonstrated that recombinant DNA technology could be used to produce mammalian proteins such as insulin in bacteria.

While similar work with human insulinomas proceeded in several laboratories, Crea et al. [20] applied elaborate organic chemistry to the synthesis of insulin genes by purely chemical means. These investigators had previously created a totally synthetic gene for somatostatin and had achieved its expression in bacteria [21]. This proved that a synthetic gene could be designed from a known amino acid sequence and be propagated and expressed in bacteria even if the synthetic gene did not contain exactly the same nucleotide sequence found in the native gene. From the known amino acid sequence of insulin and the known genetic code, these workers designed a DNA sequence that would code for human insulin. Beginning with chemically modified nucleotides, a library of 32 different nucleotide trimers was created. These trimers were then used to assemble 29 different oligodeoxyribonucleotides, varying from 10 to 15 bases in length. These larger building blocks were then assembled into two different double-stranded DNA fragments, coding for the amino acids in the A and B chains, respectively. To facilitate their insertion into and excision from plasmid DNA, restriction enzyme sites were added to each end of the synthetic gene.
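Designing a coding sequence from a known amino acid sequence, as described above, can be sketched in Python. This is a minimal illustration and not the design used by Crea et al.: the codon chosen for each amino acid is simply the first one listed in the standard genetic code table, the A-chain sequence in one-letter code is supplied here for the example, the flanking restriction sites are omitted, and the function names (design_gene, translate) are hypothetical. The leading methionine codon and the two stop codons follow the description in the legend to Figure 4.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Arbitrary choice for this sketch: the first codon listed for each amino acid.
FIRST_CODON = {}
for codon, aa in CODON_TABLE.items():
    FIRST_CODON.setdefault(aa, codon)

A_CHAIN = "GIVEQCCTSICSLYQLENYCN"   # human insulin A chain, one-letter code


def design_gene(peptide: str) -> str:
    """Return a coding strand: Met codon, chosen codons, then two stop codons."""
    body = "".join(FIRST_CODON[aa] for aa in peptide)
    return "ATG" + body + "TAA" + "TAG"


def translate(dna: str) -> str:
    """Translate a coding strand codon by codon until the first stop codon."""
    protein = ""
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein += aa
    return protein


gene = design_gene(A_CHAIN)
print(gene)
print(translate(gene))
```

Translating the designed sequence back gives methionine followed by the A chain, which is the check that matters: any choice of synonymous codons would pass it, so the designer is free to pick codons that make chemical assembly convenient.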
Large quantities of each synthetic gene fragment were generated by cloning them in bacteria. Goeddel et al. [22] then inserted the synthetic gene for the B-chain into a bacterial gene for β-galactosidase, which had previously been cloned in a plasmid. Transcription and translation of this fused gene resulted in the production of a fused protein consisting of β-galactosidase and the insulin B-chain. The gene for the A-chain was inserted into the β-galactosidase gene in the same manner. When the synthetic A and B genes were designed, DNA coding for an extra methionine residue was placed at each 5' (amino terminal) end so that this amino acid would be inserted between the β-galactosidase and insulin sequences. Cyanogen bromide, a proteolytic agent which cleaves proteins
only at methionine residues, was then used to liberate the insulin chains and fragment the β-galactosidase. The A and B chains, which contain no methionine, remained intact. The A and B chains were then partially purified, mixed, and joined by forming disulfide bonds, thus generating radioimmunoassayable insulin. These steps are outlined in Figure 4. The A-chain preparation was purified further by high-performance liquid chromatography and found to have the amino acid composition of human insulin A-chain, as predicted by the designed DNA sequence.

The work of Crea and Goeddel and their collaborators clearly demonstrated that human insulin can be made with the use of recombinant DNA technology. Since each amino acid may be coded by more than one array of three nucleotides, they could design and synthesize DNA sequences which code for human insulin chains and which could be manipulated and assembled conveniently. Even though the synthetic genes were not identical to the native gene sequences, the resulting insulin chains are chemically identical to native human insulin chains. By inserting the insulin genes into the gene for β-galactosidase, Goeddel et al. brought the production of insulin under the control of bacterial elements regulating the synthesis of β-galactosidase. By growing the bacteria in glucose-free medium containing galactose, synthesis of both β-galactosidase and insulin chains was induced, resulting in the production of 1.4 mg insulin per litre of bacterial culture. However, the A and B chains must still be separated cleanly from the fragmented β-galactosidase, and must then be united efficiently into functional insulin. Although Goeddel et al. detected immunoassayable insulin, the procedure used for the union of the A and B chains was only 10-15% efficient. Thus, the production of separate A and B chains may not be the best approach to producing insulin in bacteria.
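The freedom that the degenerate genetic code gives the gene designer can be made concrete with a short Python calculation. The sketch counts how many different DNA sequences would encode a given peptide under the standard genetic code. The chain sequences in one-letter code are supplied here for illustration, and the function name count_encodings is hypothetical.

```python
from collections import Counter
from math import prod

# Standard genetic code as a 64-letter string (codons ordered with T, C, A, G);
# only the number of codons per amino acid is needed here.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_AMINO_ACID = Counter(aa for aa in AMINO if aa != "*")

A_CHAIN = "GIVEQCCTSICSLYQLENYCN"            # human insulin A chain
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"   # human insulin B chain


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences that translate to the peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide)


for name, chain in (("A", A_CHAIN), ("B", B_CHAIN)):
    print(name, len(chain), count_encodings(chain))
```

Even for chains of 21 and 30 residues the count is very large, so a sequence can be selected for ease of synthesis and assembly without changing the protein that is made.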
Other groups are attempting the bacterial cloning of DNA sequences coding for the entire human proinsulin molecule, including the C-peptide. It is thought that retaining the C-peptide may help the proper folding of the polypeptide chain so that the A and B chains are properly oriented and form the appropriate disulfide bonds. The C-peptide could then be removed by tryptic cleavage. Another approach is to clone the entire gene for pre-insulin in a mammalian virus such as SV-40, rather than in a bacterial plasmid. Cultured cells infected with such a virus might then produce and secrete intact, functional, fully processed insulin. A further advantage of such an approach would be the greater ease of harvesting and purifying insulin from culture medium rather than from bacterial cells.

Regardless of which recombinant techniques are employed, human insulin produced by this technology will still have to undergo extensive testing for biologic activity and antigenicity before it is approved for general use. However, this work is commanding the attention of a large number of skilled investigators and is fuelled by ample resources from the drug industry. Hence, there appears little doubt that human insulin produced from recombinant DNA will be ready for animal and human testing in the near future.
References
1. Miller WL (1979) Use of recombinant DNA for the production of polypeptides. In: Petricciani C, Hopps HE, Chapple PS (eds) Cell substrates: Their use in the production of biologicals. Plenum Press, New York, p 153-174
2. Committee on Recombinant DNA Molecules, Assembly of Life Sciences, National Research Council, National Academy of Sciences, Berg P, Baltimore D, Boyer HW, Cohen SN, Davis RW, Hogness DS, Nathans D, Roblin R, Watson JD, Weissman S, Zinder ND (1974) Potential biohazards of recombinant DNA molecules. Proc Natl Acad Sci USA 71: 2593-2594
3. Recombinant DNA research - Guidelines (1976) Federal Register 41: 27902-27943; Recombinant DNA research - Revised Guidelines (1978) Federal Register 43: 60080-60131
4. Israel MA, Chan HW, Rowe WP, Martin MA (1979) Molecular cloning of polyoma virus DNA in Escherichia coli: Plasmid vector system. Science 203: 883-887
5. Chan HW, Israel MA, Garon CF, Rowe WP, Martin MA (1979) Molecular cloning of polyoma virus DNA in Escherichia coli: Lambda phage vector system. Science 203: 887-892
6. Chan SJ, Keim P, Steiner DF (1976) Cell-free synthesis of rat preproinsulins: Characterization and partial amino acid sequence determination. Proc Natl Acad Sci USA 73: 1964-1968
7. Blobel G, Dobberstein B (1975) Transfer of proteins across membranes. I. Presence of proteolytically processed and unprocessed nascent immunoglobulin light chains on membrane-bound ribosomes of murine myeloma. J Cell Biol 67: 835-851
8. Blobel G, Dobberstein B (1975) Transfer of proteins across membranes. II. Reconstitution of functional rough microsomes from heterologous components. J Cell Biol 67: 852-862
9. Ullrich A, Shine J, Chirgwin J, Pictet R, Tischer E, Rutter WJ, Goodman HM (1977) Rat insulin genes: Construction of plasmids containing the coding sequences. Science 196: 1313-1319
10. Baltimore D (1970) RNA-dependent DNA polymerase in virions of RNA tumor viruses. Nature 226: 1209-1211
11. Temin HM, Mizutani S (1970) RNA-dependent DNA polymerase in virions of Rous sarcoma virus. Nature 226: 1211-1213
12. Aviv H, Leder P (1972) Purification of biologically active globin messenger RNA by chromatography on oligothymidylic acid-cellulose. Proc Natl Acad Sci USA 69: 1408-1412
13. Leong JA, Garapin A-C, Jackson N, Fanshier L, Levinson W, Bishop JM (1972) Virus-specific ribonucleic acid in cells producing Rous sarcoma virus: Detection and characterization. J Virol 9: 891-902
14. Scheller RH, Dickerson RE, Boyer HW, Riggs AD, Itakura K (1977) Chemical synthesis of restriction enzyme recognition sites useful for cloning. Science 196: 177-180
15. Curtiss R III, Pereira DA, Hsu DC, Hull SC, Clarke JE, Maturin LS, Goldschmidt R, Moody R, Inoue M, Alexander L (1977) Biologic containment: The subordination of Escherichia coli K-12. In: Beers RF Jr, Bassett EG (eds) Recombinant molecules: Impact on science and society. Raven Press, New York, p 45-46
16. Villa-Komaroff L, Efstratiadis A, Broome S, Lomedico P, Tizard R, Naber SP, Chick WL, Gilbert W (1978) A bacterial clone synthesizing proinsulin. Proc Natl Acad Sci USA 75: 3727-3731
17. Betlach M, Herschfield V, Chow L, Brown W, Goodman HM, Boyer HW (1976) A restriction endonuclease analysis of the bacterial plasmid controlling the Eco RI restriction and modification of DNA. Fed Proc 35: 2037-2043
18. Roychoudhury R, Jay E, Wu R (1976) Terminal labeling and addition of homopolymeric tracts to duplex DNA fragments by terminal deoxynucleotidyl transferase. Nucleic Acids Res 3: 863-877
19. Broome S, Gilbert W (1978) Immunological screening method to detect specific translation products. Proc Natl Acad Sci USA 75: 2746-2749
20. Crea R, Kraszewski A, Hirose T, Itakura K (1978) Chemical synthesis of genes for human insulin. Proc Natl Acad Sci USA 75: 5765-5769
21. Itakura K, Hirose T, Crea R, Riggs AD, Heyneker HL, Bolivar F, Boyer HW (1977) Expression in Escherichia coli of a chemically synthesized gene for the hormone somatostatin. Science 198: 1056-1063
22. Goeddel DV, Kleid DG, Bolivar F, Heyneker HL, Yansura DG, Crea R, Hirose T, Kraszewski A, Itakura K, Riggs AD (1979) Expression in Escherichia coli of chemically synthesized genes for human insulin. Proc Natl Acad Sci USA 76: 106-110
Received: March 3, 1980
Dr. W. L. Miller Department of Pediatrics University of California San Francisco, CA 94143 USA