Mol Gen Genet (1981) 181:169-175 © Springer-Verlag 1981
The Sequence of IS4
R. Klaer, S. Kühn*, E. Tillmann, H.-J. Fritz, and P. Starlinger
Institut für Genetik der Universität Köln, Weyertal 121, D-5000 Köln 40, Federal Republic of Germany
Summary. IS-elements are devoid of easily recognizable trans-acting functions and exert their visible effects in the cis position only (for recent reviews see Calos and Miller 1980; Starlinger 1980). It has been a matter of debate whether these elements encode functions for their own transposition. In the case of the E. coli IS-elements this could not easily be determined by genetic methods, because most of these elements are present in several copies (Saedler and Heiss 1973; Deonier et al. 1979). In the case of the IS-elements flanking transposons, evidence has recently been brought forward that these carry the transposition specificity (Rothstein et al. 1980; Kleckner 1980; Grindley 1981). IS4 is present in one copy only in several E. coli K12 strains and should, therefore, be suitable for genetic and physiological studies (Chadwell et al. 1979). It has been cloned from several sites on the E. coli chromosome in pBR322 (Klaer and Starlinger 1980). Here we report the DNA sequence of IS4, which contains an open reading frame for 442 amino acids, and of the junctions of this element with surrounding DNA at three different sites in the E. coli chromosome.
in Fig. 1. It can be seen from this figure that the whole sequence is covered by overlaps, and that every sequence was determined from both strands, in most cases several times, until no violation of the DNA base pairing rules remained. The only exceptions are a few positions in which a G in one strand corresponds to a gap in the other strand (example shown in Fig. 2). In all of these cases, the position is part of the sequence CC(A/T)GG. It is assumed that, in these cases, a C residue escaped detection because of methylation by a modification system with the specificity of the EcoRII system. The sequence of IS4 is shown in Fig. 3. The integration sites, as inferred from the junctions of IS4 with adjacent bacterial DNA, are shown in Fig. 4. Table 1 is a list of all restriction cleavage sites for the enzymes listed in the catalogue by Roberts (1980).
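The strand-against-strand check described above can be illustrated with a short Python sketch. It is not the Staden software used for this work; the function name and the use of "-" for a missing band are assumptions made for the illustration. Both strands are written left to right as in Fig. 3, so each column should be a Watson-Crick pair. The example uses positions 306-318 of the upper strand (the region of Fig. 2) with a computed complement, and a hypothetical reading in which the C opposite the G at position 312 gave no band.

```python
# Watson-Crick partners. '-' (a band missing from the gel) has no partner,
# so it is always reported; this convention is an assumption of the sketch.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pairing_violations(upper, lower, first_position=1):
    """Return (position, upper_base, lower_base) for each column that is not a Watson-Crick pair."""
    if len(upper) != len(lower):
        raise ValueError("both strands must cover the same positions")
    violations = []
    for offset, (up, low) in enumerate(zip(upper, lower)):
        if PAIR.get(up) != low:
            violations.append((first_position + offset, up, low))
    return violations


# Positions 306-318 of the upper strand as printed in Fig. 3 (contains CCTGG at 309-313).
upper = "TCGCCTGGACATC"
lower = "".join(PAIR[base] for base in upper)  # computed complement, same left-to-right order
print(pairing_violations(upper, lower, 306))

# Hypothetical gel reading: the C opposite G-312 is methylated and leaves a gap.
gapped = lower[:6] + "-" + lower[7:]
print(pairing_violations(upper, gapped, 306))
```

A reading that reports only G-against-gap columns inside CC(A/T)GG would match the exceptions described in the text; any other reported column would call for resequencing.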
Discussion
The following features are recognized on the sequence of IS4:
1. An inverted terminal repeat of 18 nucleotides with two mismatches, in positions 8 and 9.
Materials and Methods
Plasmids pKS57 and pKS58 have been described (Klaer and Starlinger 1980) and were carried in E. coli K12, F165 Δgal trp (Fiethen and Starlinger 1970). pKS57 was used for the determination of the whole sequence of IS4 in its common chromosomal site. pKS58 carries the second copy of IS4 of strain F165 and served as a DNA source for the determination of the junction sequences at this integration site. DNA preparation, isolation of restriction fragments on a preparative scale, and treatment with calf intestinal alkaline phosphatase (Boehringer, Mannheim) were as described previously (Habermann et al. 1979). Fragments were labelled at the 5'-termini with polynucleotide kinase and γ-32P-ATP (Amersham/Buchler, Braunschweig), and at the 3'-termini with the Klenow fragment of DNA polymerase I (Boehringer, Mannheim) and the appropriate α-32P-dXTPs (Amersham/Buchler, Braunschweig). The fragments were either recleaved with an appropriate restriction enzyme, or single strands were separated on polyacrylamide gels, as described by Maxam and Gilbert (1977). The chemical degradation of Maxam and Gilbert (1977) was used throughout, using the A > C alkaline degradation for the determination of A residues. Computer programs designed by R. Staden (pers. comm.) were used with minor modifications after adaptation to a Cyber 72 computer.
Results
The fragments used for the determination of the sequence of IS4 in plasmid pKS57, and the direction of reading, are shown
* Present Address: Max-Planck-Institut für Zellbiologie, D-6802 Ladenburg, Federal Republic of Germany
Offprint requests to: P. Starlinger
2. A potential stem and loop structure, followed by a stretch of T residues, located near both of the termini and pointing into the interior of IS4 (Fig. 5). These structures strongly resemble rho-independent transcription stop signals (Rosenberg and Court 1979). It is possible that these structures are responsible for the strong polarity of IS4 (Besemer and Herpers 1977), though this will have to be supported by functional studies. If these structures are transcription stop signals, they prevent transcription from the outside into IS4.
3. In orientation II, a long open reading frame extends from an ATG codon beginning at position 85 to a TAA termination codon at position 1411. A sequence of this length could encode 442 amino acids, and the polypeptide would have a molecular weight of 54000 dalton, as deduced from the amino acid sequence encoded by the open frame. The amino acid composition of this protein and also the structural features predicted by the rules of Chou and Fasman (1974) are inconspicuous, and would not predict an unusual or particularly unstable 3D structure (K. Trinks, pers. comm.).
4. Preceding the open reading frame, a sequence GGA occupies positions -10 to -8 and would qualify as a ribosome binding site, similar to that found for the coat protein gene of bacteriophage f1 (Steitz 1979). A possible promoter sequence could be formed by a Pribnow box AAGAATC starting at position 47 and a -35 site around the sequence GTTGAC at positions 24-29. These sequences match rather well the consensus sequence for a promoter described by Rosenberg and Court (1979) (Fig. 6). If this sequence were a promoter, transcription would
[Fig. 1 graphic: restriction map of IS4 with sequencing arrows above and below the bar; site labels legible in the scan: HhaI, AluI, TaqI, HaeIII, HinfI, HpaII, HindII, AvaI, PstI, Sau3A]
Fig. 1. A restriction map of IS4 in the common site shows the cleavage sites used for sequencing. Arrows above the bar indicate sequences determined from the upper strand, arrows below the bar denote sequences read from the lower strand. The length of the arrow indicates the length of the fragment used; the heavy black part of the arrow denotes the extent of the sequence determined. The heads of the arrows indicate whether the fragments were read in 3'→5' or 5'→3' direction. Most of the degradation products were separated both on 40 cm 20% polyacrylamide gels, and on 40 or 80 cm 12% polyacrylamide gels
[Fig. 2 graphic: autoradiographs of sequencing gels for the two complementary strands; lanes G, A>C, C+T and C]
Fig. 2. Sequencing gels of complementary strands from position 306 to 318 of the IS4 sequence. Arrows indicate gaps opposite to a G residue in the sequence CC(A/T)GG
1-100
TAATGCCCATCAGTTAAGGATCAGTTGACCGATCCAGTGGCTGTGTAAGAATCCGGAAACGCTCACTTGTTTCCGGATTTTTTTATGCACATTGGACAGG
ATTACGGCTAGTCAATTCCTAGTCAACTGGCTAGGTCACCGACACATTCTTAGGCCTTTGCGAGTGAACAAAGGCCTAAAAAAATACGTGTAACCTGTCC

101-200
CTCTTGATCTGGTATCCCGTTACGATTCTCTGCGTAACCCACTGACTTCTCTGGGGGATTACCTCGACCCCGAACTCATCTCTCGTTGCCTTGCCGAATC
GAGAACTAGACCATAGGGCAATGCTAAGAGACGCATTGGGTGACTGAAGAGACCCCCTAATGGAGCTGGGGCTTGAGTAGAGAGCAACGGAACGGCTTAG

201-300
AGGTACTGTAACGCTACGCAAGCGCCGTCTTCCCCTCGAAATGATGGTCTGGTGTATTGTTGGCATGGCGCTTGAGCGTAAAGAACCTCTTCACCAGATT
TCCATGACATTGCGATGCGTTCGCGGCAGAAGGGGAGCTTTACTACCAGACCACATAACAACCGTACCGCGAACTCGCATTTCTTGGAGAAGTGGTCTAA

301-400
GTGAATCGCCTGGACATCATGCTGCCGGGCAATCGCCCCTTCGTTGCCCCCAGTGCCGTTATTCAGGCCCGCCAGCGCCTGGGAAGTGAGGCTGTCCGCC
CACTTAGCGGACCTGTAGTACGACGGCCCGTTAGCGGGGAAGCAACGGGGGTCACGGCAATAAGTCCGGGCGGTCGCGGACCCTTCACTCCGACAGGCGG

401-500
GCGTGTTCACGAAAACAGCGCAGCTCTGGCATAACGCCACGCCGCATCCGCACTGGTGCGGCCTGACCCTGCTGGCCATCGATGGTGTGTTCTGGCGCAC
CGCACAAGTGCTTTTGTCGCGTCGAGACCGTATTGCGGTGCGGCGTAGGCGTGACCACGCCGGACTGGGACGACCGGTAGCTACCACACAAGACCGCGTG

501-600
ACCGGATACACCAGAGAACGATGCAGCCTTCCCCCGCCAGACACATGCCGGGAACCCGGCGCTCTACCCGCAGGTCAAAATGGTCTGCCAGATGGAACTG
TGGCCTATGTGGTCTCTTGCTACGTCGGAAGGGGGCGGTCTGTGTACGGCCCTTGGGCCGCGAGATGGGCGTCCAGTTTTACCAGACGGTCTACCTTGAC

601-700
ACCAGCCATCTGCTGACGGCTGCAGCCTTCGGCACGATGAAGAACAGCGAAAATGAGCTTGCTGAGCAACTTATAGAACAAACCGGCGATAACACTCTGA
TGGTCGGTAGACGACTGCCGACGTCGGAAGCCGTGCTACTTCTTGTCGCTTTTACTCGAACGACTCGTTGAATATCTTGTTTGGCCGCTATTGTGAGACT

701-800
CGTTAATGGATAAAGGTTATTACTCACTGGGACTGTTAAATGCCTGGAGCCTGGCGGGAGAACACCGCCACTGGATGATACCTCTCAGAAAGGGAGCGCA
GCAATTACCTATTTCCAATAATGAGTGACCCTGACAATTTACGGACCTCGGACCGCCCTCTTGTGGCGGTGACCTACTATGGAGAGTCTTTCCCTCGCGT

801-900
ATATGAAGAGATCAGAAAACTGGGTAAAGGCGATCATCTGGTGAAGCTGAAAACCAGCCCGCAGGCACGAAAAAAGTGGCCGGGACTGGGAAATGAAGTG
TATACTTCTCTAGTCTTTTGACCCATTTCCGCTAGTAGACCACTTCGACTTTTGGTCGGGCGTCCGTGCTTTTTTCACCGGCCCTGACCCTTTACTTCAC

901-1000
ACTGCCCGCCTGCTGACCGTGACGCGCAAAGGAAAAGTCTGCCATCTGCTGACGTCGATGACGGACGCCATGCGCTTCCCCGGAGGAGAAATGGGGGATC
TGACGGGCGGACGACTGGCACTGCGCGTTTCCTTTTCAGACGGTAGACGACTGCAGCTACTGCCTGCGGTACGCGAAGGGGCCTCCTCTTTACCCCCTAG

1001-1100
TGTACAGTCATCGCTGGGAAATCGAACTGGGATACAGGGAGATAAAACAGACGATGCAACGGAGCAGGCTGACGCTGAGAAGTAAAAAGCCGGAGCTTGT
ACATGTCAGTAGCGACCCTTTAGCTTGACCCTATGTCCCTCTATTTTGTCTGCTACGTTGCCTCGTCCGACTGCGACTCTTCATTTTTCGGCCTCGAACA

1101-1200
GGAGCAAGAGCTGTGGGGTGTCTTACTGGCTTATAATCTGGTGAGATATCAGATGATTAAAATGGCGGAACATCTGAAAGGTTACTGGCCGAATCAACTG
CCTCGTTCTCGACACCCCACAGAATGACCGAATATTAGACCACTCTATAGTCTACTAATTTTACCGCCTTGTAGACTTTCCAATGACCGGCTTAGTTGAC

1201-1300
AGTTTCTCAGAATCATGCGGAATGGTGATGAGAATGCTGATGACATTGCAGGGCGCTTCACCGGGACGTATACCGGAGCTGATGCGCGATCTTGCAAGTA
TCAAAGAGTCTTAGTACGCCTTACCACTACTCTTACGACTACTGTAACGTCCCGCGAAGTGGCCCTGCATATGGCCTCGACTACGCGCTAGAACGTTCAT

1301-1400
TGGGACAACTTGTGAAATTACCGACAAGAAGGGAAAGGGCCTTCCCGAGAGTGGTAAAGGAGAGGCCCTGGAAATACCCCACAGCCCCGAAAAAGAGCCA
ACCCTGTTGAACACTTTAATCGCTGTTCTTCCCTTTCCCGGAAGGGCTCTCACCATTTCCTCTCCGGGACCTTTATGGGGTGTCGGGCCTTTTTCTCGGT

1401-1426
GTCAGTTGCTTAACTGACTGCCATTA
CACTCAACGAATTGACTGACCGTAAT

Fig. 3. Sequence of IS4 in the common site. The numbering of nucleotides is from 1 to 1426, corresponding to orientation II in galT
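The coordinates quoted in the Discussion for the large open reading frame and its putative control signals can be checked against the printed sequence. The Python sketch below is only an illustration: it uses the first 100 and last 26 nucleotides of the upper strand as transcribed from Fig. 3 (position 5 is illegible in the print and is inferred here from the complementary strand), and the helper name `segment` is an assumption of the sketch.

```python
# Upper strand, positions 1-100 and 1401-1426, transcribed from Fig. 3.
# Position 5 is inferred from the complementary strand (assumption).
FIRST_100 = ("TAATGCCCATCAGTTAAGGATCAGTTGACCGATCCAGTGGCTGTGTAAGA"
             "ATCCGGAAACGCTCACTTGTTTCCGGATTTTTTTATGCACATTGGACAGG")
LAST_26 = "GTCAGTTGCTTAACTGACTGCCATTA"


def segment(seq, start, end, offset=1):
    """Return seq[start..end] in 1-based inclusive coordinates; offset is the coordinate of seq[0]."""
    return seq[start - offset:end - offset + 1]


ORF_START, STOP_START = 85, 1411

print(segment(FIRST_100, 85, 87))                   # start codon of the large open frame
print(segment(LAST_26, 1411, 1413, offset=1401))    # termination codon
print(segment(FIRST_100, ORF_START - 10, ORF_START - 8))  # candidate ribosome binding site
print(segment(FIRST_100, 24, 29))                   # candidate -35 site
print(segment(FIRST_100, 47, 53))                   # candidate Pribnow box

# Number of complete codons between the first base of ATG and the first base of TAA.
codons, remainder = divmod(STOP_START - ORF_START, 3)
print(codons, remainder)
```

The division leaves no remainder, so the stop codon at 1411 is in the same frame as the ATG at 85, and the frame holds 442 codons.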
Table 1. Computer printout of cleavage sites within IS4 of the restriction enzymes listed in Roberts (1980). No sites are found for AvaII, AvaIII, AvrII, BamHI, BclI, BglI, BglII, BstEII, EcoB, EcoK, EcoRI, HgiAI, HindIII, HpaI, KpnI, MstI, PvuI, PvuII, SacI, SacII, SalI, SphI, XbaI, XhoI, XmaI, XmaIII (isoschizomers not given)

Enzyme    Cleavage sites
AccI      1268
AcyI      951, 964
AluI      422, 656, 845, 1094, 1109, 1277
AsuI      366, 1337, 1364
AvaI      1344
BalI      473
BbvI      321, 420, 523, 619, 622
ClaI      478
DdeI      662, 784, 1075, 1198, 1206
EcoP1     246, 582
EcoP15    469, 610, 910, 946
EcoRII    309, 378, 743, 750, 1367
Fnu4HI    321, 398, 420, 441, 458, 523, 619, 622
FnuDII    400, 923, 1285
HaeI      473
HaeII     221, 267, 374, 558, 1252
HaeIII    366, 460, 474, 878, 1187, 1338, 1364
HgaI      921, 964, 1071
HhaI      222, 268, 375, 418, 495, 559, 796, 924, 972, 1253, 1284
HindII    24
HinfI     49, 124, 196, 303, 1191, 1210
HpaII     53, 73, 325, 502, 548, 556, 683, 880, 980, 1090, 1261, 1273
HphI      291, 840, 1140, 1224, 1258
MboII     228, 288, 639, 805
MnlI      162, 234, 286, 388, 781, 983, 1362
PstI      620
RsaI      203, 1002
Sau3A     8, 19, 31, 106, 810, 832, 997, 1288
SfaNI     444, 520, 1053, 1281
TaqI      164, 236, 479, 955, 1022
XhoII     996
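A site list of the kind shown in Table 1 can be produced by scanning the sequence for each recognition sequence. The following Python sketch is not the program used for the printout; it assumes the standard recognition sequences GTYRAC (HindII), GANTC (HinfI) and CCGG (HpaII), reports the position of the first nucleotide of each match, and is applied only to the first 100 nucleotides of the upper strand as transcribed from Fig. 3 (position 5 inferred from the complementary strand).

```python
import re

# IUPAC nucleotide codes needed for the recognition sequences below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]"}


def find_sites(sequence, recognition):
    """Return the 1-based position of the first nucleotide of every match, overlaps included."""
    pattern = "".join(IUPAC[base] for base in recognition)
    # The lookahead matches at every start position, so overlapping sites are not skipped.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", sequence)]


# Upper strand, positions 1-100, transcribed from Fig. 3 (position 5 inferred).
FIRST_100 = ("TAATGCCCATCAGTTAAGGATCAGTTGACCGATCCAGTGGCTGTGTAAGA"
             "ATCCGGAAACGCTCACTTGTTTCCGGATTTTTTTATGCACATTGGACAGG")

# Recognition sequences assumed for this illustration.
RECOGNITION = {"HindII": "GTYRAC", "HinfI": "GANTC", "HpaII": "CCGG"}

for enzyme, site in RECOGNITION.items():
    print(enzyme, find_sites(FIRST_100, site))
```

For these three enzymes the positions found within the first 100 nucleotides agree with the entries of Table 1 that fall in that range (24 for HindII, 49 for HinfI, 53 and 73 for HpaII). All three sites are palindromic, so scanning one strand is sufficient here.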
start at an A residue in position 59 or at a G residue in position 61. The RNA molecule could not form the stem and loop structure and would thus escape the effect of the putative stop signal, which conceivably would terminate all transcription from outside promoters. In spite of these features, the putative IS4 gene does not seem to be transcribed appreciably in vivo. If it were, transcription would not find a stop signal before the end of IS4, and its continuation out of IS4 at the integration site in galT would lead to a relief of polarity of galactokinase formation, which is not observed. RNA polymerase binding studies with isolated fragments (to be published elsewhere) and the investigation of an IS4-dependent protein produced in minicells (Trinks and Ehring, pers. comm.) also support the notion that the expression of the large IS4 gene is weak, though detectable. This poses the question of the regulation of this gene. If the product of the gene is a protein participating in the transposition process, it would not be expected to be produced in large amounts and may, therefore, be subject to regulation of some kind, which will be interesting to study.
5. An open frame, encoding 131 amino acids, is present in the other orientation, starting with an ATG in position 609 (reading backwards). Here, no ribosome binding site is present, and the sequences CAAGCTC at position 661-665 and TTGT at position 681-678 might qualify for the promoter. No evidence is available for the expression of this gene in vivo or in vitro. It is interesting, however, that both IS2 (Ghosal et al. 1979) and IS903 (Grindley and Clarke 1981) show a similar arrangement of a large open frame in one orientation and a smaller open frame encoding approximately 130 amino acids in the other orientation. No similar arrangement is seen on the sequence of IS1 (Ohtsubo and Ohtsubo 1978).
6. The comparison of the integration sites of IS4 in the three sites determined so far shows an 11 or 12 base pair duplication,
[Fig. 4 graphic: double-stranded sequences of the three integration sites, panels a, b and c, with the duplicated 11 or 12 bp and the shaded similarities marked; the sequences are not legible in the scan]
Fig. 4. Integration sites of IS4. These sites have been drawn from the sequence determination of the junctions of IS4 with adjacent bacterial DNA, assuming that before integration of IS4 the flanking duplication of 11 or 12 bp was represented once only in the sequence. This assumption is proven for IS4 only in galT (Bidwell and Landy 1979; Habermann et al. 1979). The integration site of the common copy of IS4 is shown in a. The first 19 bp and the last 49 bp have been determined from one strand only. The junction sequence of IS4 in b (2nd copy in strain F165; pKS58) has been determined from one DNA strand only, reading outward from IS4 in both directions. c shows the integration site in galT, where galE is to the right. IS4 is integrated in a and b with the HindII end to the left. This would mean orientation II in c
[Fig. 5 graphic: base-paired stem and loop drawings at the HindII end of IS4 and at the AvaI end of IS4]
Fig. 5. Potential stem and loop structures near both of the termini of IS4
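The stem and loop near the HindII end can be checked directly on the printed sequence. The Python sketch below is an illustration only: it takes the first 100 nucleotides of the upper strand as transcribed from Fig. 3 (position 5 inferred from the complementary strand), starts from the outermost pair of positions 50 and 79, and counts how many consecutive Watson-Crick pairs form towards the loop. Function names are assumptions of the sketch, and no thermodynamic evaluation is attempted.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Upper strand, positions 1-100, transcribed from Fig. 3 (position 5 inferred).
FIRST_100 = ("TAATGCCCATCAGTTAAGGATCAGTTGACCGATCCAGTGGCTGTGTAAGA"
             "ATCCGGAAACGCTCACTTGTTTCCGGATTTTTTTATGCACATTGGACAGG")


def stem_length(seq, left_start, right_end):
    """Count consecutive pairs left_start+i / right_end-i (1-based) that are Watson-Crick pairs."""
    i, j = left_start - 1, right_end - 1
    length = 0
    while i < j and COMPLEMENT[seq[i]] == seq[j]:
        length += 1
        i += 1
        j -= 1
    return length


def t_run(seq, start):
    """Length of the uninterrupted run of T residues beginning at 1-based position start."""
    n = 0
    while start - 1 + n < len(seq) and seq[start - 1 + n] == "T":
        n += 1
    return n


pairs = stem_length(FIRST_100, 50, 79)
loop = FIRST_100[50 - 1 + pairs:79 - pairs]   # unpaired bases between the two arms
print(pairs, loop, t_run(FIRST_100, 80))
```

With this transcription the two arms pair over 11 positions (50-60 with 69-79), enclose an 8-nucleotide loop, and are followed by further T residues up to the ATG at position 85, which is the arrangement expected for a rho-independent terminator read from outside into IS4.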
[Fig. 6 graphic: (a) stem and loop structure with the putative ribosome binding site, start codon, -35 box and Pribnow box boxed; (b) alignment of the hypothetical IS4 promoter with the consensus promoter sequence]
Fig. 6. Sequences preceding the large open reading frame of IS4 (6a), indicating the position of a putative ribosome binding site and promoter (boxed areas) relative to the stem and loop structure of Fig. 5. In 6b the hypothetical IS4 promoter is compared to a consensus sequence for E. coli promoters (Rosenberg and Court 1979)
and no indication of either AT-richness or limited homology of sequences outside the duplicated region to the reverted repeats (Fig. 4). In these respects, IS4 is different from IS1, which has these preferences (Meyer et al. 1980; Galas et al. 1980). This is not surprising, as IS4 integrates in the gal operon into one position only, which is not yet occupied by an IS1. The latter, however, is found in several positions and preferentially in the leader sequence (Saedler et al. 1972). The insertions of IS4 in galT that have been sequenced so far are all in the same position (Habermann et al. 1979). Four insertions of IS1 in the leader sequence of the gal operon occupy three different, though closely linked positions (Kühn et al. 1979). The difference in integration specificity thus seems to be matched by a difference in the structure of the integration site. Comparison of the integration sites of IS4 shows the limited similarities indicated by the shaded areas in Fig. 4. It is conceivable that these eight nucleotides form an integration site. There is a small violation of this site in galT, at which the duplication of 11 instead of 12 base pairs flanking IS4 is observed. The not very close correspondence of sites is similar to the integration sites of other transposons, to the consensus sequence of DNA gyrase (Cozzarelli 1980) and to the diversity of promoters (Rosenberg and Court 1979), and is unlike the stringency of the recognition sites of restriction enzymes. It will be interesting to see whether the site proposed from the three known integration sites will hold up when new sites become known.
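The way a flanking duplication is read from junction sequences, and the way the site before integration is inferred from it, can be sketched as follows. This Python fragment is an illustration only; the two flank sequences are invented for the example and are not those of Fig. 4, and the function names are assumptions. It applies the assumption stated in the legend of Fig. 4, namely that the duplicated stretch was present once before integration.

```python
def flanking_duplication(left_flank, right_flank, max_len=20):
    """Longest k for which the last k bases left of the element equal the first k bases right of it."""
    for k in range(min(max_len, len(left_flank), len(right_flank)), 0, -1):
        if left_flank[-k:] == right_flank[:k]:
            return k
    return 0


def site_before_insertion(left_flank, right_flank):
    """Join the flanks, keeping the duplicated stretch only once."""
    k = flanking_duplication(left_flank, right_flank)
    return left_flank + right_flank[k:]


# Invented flanks (hypothetical data): 11 bp are repeated on both sides of the element.
left = "GGCATTACGTTAGCCATGCA"    # bacterial DNA ending at the left junction
right = "TTAGCCATGCAGGTTCA"      # bacterial DNA starting at the right junction

print(flanking_duplication(left, right))
print(site_before_insertion(left, right))
```

A chance identity of one or two bases next to the true duplication would lengthen the value returned, so a result of 11 versus 12 is best judged against an independent sequence of the unoccupied site where one is available.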
Table 2. Sequence homologies between IS4 and IS1, IS2, 16S rRNA, and 23S rRNA. Only homologies that are in one block are calculated. Note that, e.g., an 11 bp block is counted twice in the 10 bp block column, etc. x-fold exp. is the number found divided by the number expected for random sequences

Block length       8                9                10               11
               Found  x-fold   Found  x-fold   Found  x-fold   Found  x-fold
                      exp.            exp.            exp.            exp.
IS1              62    1.9       19    2.3        5    2.4        1    1.9
IS2              82    1.4       16    1.1        3    0.8        -     -
16S rRNA         80    1.2       18    1.1        2    0.5        -     -
23S rRNA        162    1.3       41    1.3       13    1.7        3    1.5
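The counting convention of Table 2 can be illustrated with a short Python sketch. The text does not state how the expected numbers were calculated; the formula below (equal base frequencies, independent positions, both relative orientations compared) is an assumption, as is the 768 bp length used for the second sequence. The two 12-nucleotide segments are those aligned in Fig. 7 (IS4 positions 1226-1237 and a segment of IS2), which share one block of 10 bp.

```python
def count_shared_blocks(a, b, k):
    """Count pairs of windows (one from a, one from b) of length k with identical sequence."""
    index = {}
    for j in range(len(b) - k + 1):
        index.setdefault(b[j:j + k], []).append(j)
    return sum(len(index.get(a[i:i + k], ())) for i in range(len(a) - k + 1))


def expected_shared_blocks(len_a, len_b, k, both_orientations=True):
    """Expected number of identical k-blocks between two random sequences (assumed model)."""
    window_pairs = (len_a - k + 1) * (len_b - k + 1)
    expected = window_pairs * 0.25 ** k
    return 2 * expected if both_orientations else expected


is4_segment = "TGATGAGAATGC"   # IS4, positions 1226-1237
is2_segment = "AGATGAGAATGT"   # aligned IS2 segment from Fig. 7

for k in (10, 9, 8):
    print(k, count_shared_blocks(is4_segment, is2_segment, k))

# IS4 is 1426 bp long; 768 bp for the other sequence is an assumed value.
print(round(expected_shared_blocks(1426, 768, 8), 1))
```

The single 10 bp block is counted once in the 10 bp column, twice in the 9 bp column and three times in the 8 bp column, which is the overlap counting that the table legend describes.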
7. Nisen and Shapiro (1979) have compared IS1 and IS2 to each other and to the DNA encoding E. coli rRNA precursors, and have discovered sequence homologies. We have compared IS4 to IS1, IS2, 16S rRNA, and 23S rRNA and have found no striking similarities, but homologies slightly more frequent than expected for random DNA sequences (Table 2). A 10 and an 11 base pair homology are seen between the two strands of the stem and loop structure in positions 50-79 of IS4 and positions 1495-1504 of E. coli 23S rRNA (Brosius et al. 1980). Some of these sequence homologies are presented in Fig. 7.
[Fig. 7 graphic: pairwise alignments of IS4 segments (among them positions 371-382, 1051-1062, 1156-1175, 1226-1237 and 1327-1338) with segments of IS1 and IS2; identical positions are marked X. The coordinates of the IS1 and IS2 segments are not legible in the scan]
8. Preceding the open reading frame, a HpaII site is located which may serve as a cleavage site to separate the putative promoter from the open frame and to link these structures to other genes and promoters, respectively. Such studies are presently being carried out in this laboratory. We hope to complement the structure of IS4 by functional studies and eventually to gain an understanding of the transposition process.
9. The EcoRII sites in IS4 are modified in the second C residue of the sequence CC(A/T)GG. In the same position methylation has been noted for EcoRII sites in the lacI gene of E. coli K12, but not of E. coli B (Coulondre et al. 1978). In a different strain, E. coli SK, the first C residue of the EcoRII sites is methylated (Nikolskaya et al. 1979).
Acknowledgements. This work was supported by Deutsche Forschungsgemeinschaft through SFB 74.
Fig. 7. Examples of sequence homologies between IS4 and IS1 or IS2

References
Besemer J, Herpers M (1977) Suppression of polarity of insertion mutations within the gal operon of E. coli. Mol Gen Genet 151:295-304
Bidwell K, Landy A (1979) Structural features of λ site-specific recombination at a secondary att site in galT. Cell 16:397-406
Brosius J, Dull T, Noller HF (1980) Complete nucleotide sequence of a 23S ribosomal RNA gene from Escherichia coli. Proc Natl Acad Sci USA 77:201-204
Calos MP, Miller JH (1980) Transposable elements. Cell (in press)
Chadwell H, Fritz H-J, Habermann P, Klaer R, Kühn S, Starlinger P (1979) Studies with IS1 and IS4. Cold Spring Harbor Symp Quant Biol 43:1187-1192
Chou PY, Fasman GD (1974) Prediction of protein conformation. Biochem 13:222-245
Coulondre C, Miller JH, Farabaugh PJ, Gilbert W (1978) Molecular basis of base substitution hotspots in Escherichia coli. Nature 274:775-780
Cozzarelli NR (1980) DNA gyrase and the supercoiling of DNA. Science 207:953-960
Deonier R, Hadley R, Hu M (1979) Enumeration and identification of IS3 elements in Escherichia coli strains. J Bact 137:1421-1424
Fiethen L, Starlinger P (1970) Mutations in the galactose operator. Mol Gen Genet 108:322-330
Galas DJ, Miller JH, Calos MP (1980) Sequence analysis of Tn9 insertions in the lacZ gene. Cell (in press)
Ghosal D, Sommer H, Saedler H (1979) Nucleotide sequence of the transposable element IS2. Nucl Acids Res 6:1111-1122
Grindley NF, Clarke CM (1981) Structure and function of the transposon Tn903. Cold Spring Harbor Symp Quant Biol 45 (in press)
Habermann P, Klaer R, Kühn S, Starlinger P (1979) IS4 is found between eleven and twelve base pair duplications. Mol Gen Genet 175:369-373
Klaer R, Starlinger P (1980) IS4 at its chromosomal site in E. coli K12. Mol Gen Genet 178:285-293
Kleckner N (1980) DNA sequence analysis of Tn10 insertions. Cell (in press)
Kühn S, Fritz H-J, Starlinger P (1979) Close vicinity of IS1 integration sites in the leader sequence of the gal operon of E. coli. Mol Gen Genet 167:235-241
Maxam A, Gilbert W (1977) A new method for sequencing DNA. Proc Natl Acad Sci USA 74:560-564
Meyer J, Iida S, Arber W (1980) Does the insertion element IS1 transpose preferentially into A+T-rich segments? Mol Gen Genet (in press)
Nikolskaya II, Lopatina NG, Anikeicheva NV, Debov SS (1979) Determination of the recognition sites of cytosine DNA-methylases from Escherichia coli SK. Nucl Acids Res 7:517-528
Nisen P, Shapiro L (1979) E. coli ribosomal RNA contains sequences homologous to insertion sequences IS1 and IS2. Nature 282:872-874
Ohtsubo H, Ohtsubo E (1978) Nucleotide sequence of an insertion element IS1. Proc Natl Acad Sci USA 75:615-619
Roberts RJ (1980) Restriction and modification enzymes and their recognition sequences. Nucl Acids Res 8:r63-r80
Rosenberg M, Court D (1979) Regulatory sequences involved in the promotion and termination of RNA transcription. Annu Rev Genet 13:319-354
Rothstein SJ, Jorgensen RA, Postle K, Reznikoff WS (1980) The inverted repeats of Tn5 are functionally different. Cell 19:795-805
Saedler H, Besemer J, Kemper B, Rosenwirth B, Starlinger P (1972) Insertion mutations in the control region of the gal operon of E. coli. I. Biological characterization of the mutations. Mol Gen Genet 115:258-265
Saedler H, Heiss B (1973) Multiple copies of the insertion DNA sequences IS1 and IS2 in the chromosome of E. coli K12. Mol Gen Genet 122:267-277
Starlinger P (1980) IS elements and transposons. Plasmid 3:241-259
Steitz JA (1979) Genetic signals and nucleotide sequences in messenger RNA. Biological regulation and development, vol 1, Gene expression (R. Goldberger ed) Plenum Press, New York, p 349
Communicated by E. Bautz
Received August 14, 1980