Journal of Protein Chemistry, Vol. 12, No. 3, 1993
The Amino Acid Sequence of Hemoglobin III from the Symbiont-Harboring Clam Lucina pectinata
Jerrolynn D. Hockenhull-Johnson,1 Mary S. Stern,1 Jonathan B. Wittenberg,2 Serge N. Vinogradov,3 Oscar H. Kapp,4 and Daniel A. Walz1,5
Received October 28, 1992
The cytoplasmic hemoglobin III from the gill of the symbiont-harboring clam Lucina pectinata consists of 152 amino acid residues, has a calculated Mm of 18,068, including heme, and has N-acetyl-serine as the N-terminal residue. Based on the alignment of its sequence with other vertebrate and nonvertebrate globins, it retains the invariant residues Phe45 at position CD1 and His98 at the proximal position F8, as well as the highly conserved Trp16 and Pro39 at positions A12 and C2, respectively. The most likely candidate for the distal residue at position E7 is Gln66. Lucina hemoglobin III shares 95 identical residues with hemoglobin II (J. D. Hockenhull-Johnson et al., J. Prot. Chem. 10, 609-622, 1991), including Tyr at position B10. Modeling of the hemoglobin II sequence on the crystal structure of sperm whale metmyoglobin has shown that this Tyr is capable of entering the distal heme cavity and placing its hydroxyl group within 2.8 Å of the water molecule occupying the distal ligand position. The amino acid sequences of the two Lucina globins are compared in detail with the known sequences of mollusc globins, including seven cytoplasmic and 11 intracellular globins. Relative to the 75% homology between the two Lucina globins (counting identical and conserved residues), both sequences have percent homology scores ranging from 36 to 49% when compared to the two groups of mollusc globins. The highest homology appears to exist between the Lucina globins and the cytoplasmic hemoglobin of Busycon canaliculatum.
KEY WORDS: Hemoglobin; invertebrate; bivalve mollusc; Lucina pectinata.
1. INTRODUCTION
Symbiotic associations between invertebrates and chemoautotrophic sulfide-oxidizing bacteria were first observed over a decade ago in several different groups of invertebrates living near hydrothermal vents in the ocean floor (reviewed by Jones, 1985). Since then, such symbioses have also been found in many invertebrates occurring in sulfide-rich coastal and deep sediments (Southward, 1987; Fisher, 1990; Childress et al., 1992). In symbiont-harboring bivalve molluscs, the bacterial symbiont is located intracellularly within the very large gills, which are generally rich in cytoplasmic hemoglobins. Wittenberg et al. have suggested that the latter may play a physiological role in the symbiosis (Wittenberg, 1985; Wittenberg and Wittenberg, 1990; Wittenberg and Kraus, 1991). Kraus and Wittenberg (1990) have shown that in Lucina pectinata, formerly known as Phacoides pectinatus (Read, 1966), this hemoglobin consists of three components, two of which, Hb II and III, combine with oxygen and are unaffected by the presence of sulfide. The third component, Hb I, is monomeric and reacts with sulfide rapidly and reversibly, in the presence of oxygen, to form a ferric hemoglobin sulfide
1 Department of Physiology, Wayne State University School of Medicine, Detroit, Michigan 48201.
2 Department of Physiology and Biophysics, Albert Einstein College of Medicine, Bronx, New York 10461.
3 Department of Biochemistry, Wayne State University School of Medicine, Detroit, Michigan 48201.
4 Enrico Fermi Institute, University of Chicago, Chicago, Illinois 60637.
5 To whom all correspondence should be addressed.
0277-8033/93/0600-0261$07.00/0 © 1993 Plenum Publishing Corporation
Hockenhull-Johnson et al.
Table I. Amino Acid Compositions of Lucina pectinata Hemoglobin III

                        No. of residues/mol
Amino acid              AAA(a)     AASeq(b)
Asp (D+N)               21         20
Glu (E+Q)               13         12
Ser (S)                 9          10
Gly (G)                 10         9
His (H)                 5          5
Arg (R)                 7          7
Thr (T)                 8          8
Ala (A)                 10         9
Pro (P)                 6          4
Tyr (Y)                 5          5
Val (V)                 9          10
Met (M)                 7          7
Cys (C)                 3          3
Ile (I)                 5          4
Leu (L)                 16         16
Phe (F)                 10         11
Lys (K)                 11         10
Trp (W)                 2          2
Total                   157        152
Calc. apoprotein MW     17,146     17,409

a Amino acid composition of Hb III provided by Dr. Jonathan B. Wittenberg.
b Amino acid composition of Hb III determined from sequence analysis of Hb III by Edman degradation.
(Kraus and Wittenberg, 1990; Kraus et al., 1990). The oxygen-reactive components exhibit moderately high oxygen affinities (P50 = 0.1-0.2 torr) and unusually slow rates for oxygen association and dissociation. Furthermore, Hb II and Hb III self-associate, and a preliminary X-ray diffraction study of crystals of a complex of the two components (Kemling et al., 1991) has suggested that an equimolar, tetrameric complex is formed, similar to the heterotetrameric structure of vertebrate hemoglobins. An investigation of the optical and EPR spectra of ferrihemoglobins II and III has implicated a tyrosinate group as the distal ligand to the heme iron (Kraus et al., 1990). The amino acid sequence of hemoglobin II determined earlier (Hockenhull-Johnson et al., 1991) has the residue Tyr30 in position B10, which was shown by computer modeling of the sequence onto the known crystal structure of sperm whale aquometmyoglobin to be capable of moving into the distal heme cavity. We present below the results of a determination of the complete amino acid sequence of hemoglobin III, the remaining oxygen-reactive globin of Lucina pectinata.
2. MATERIALS AND METHODS
2.1. Materials
Hemoglobin III was prepared from the gills of Lucina pectinata as described previously (Kraus and Wittenberg, 1990). Removal of heme was effected by extraction with methylethylketone at acid pH (Teale, 1959). The Hb III globin was purified by reversed-phase chromatography on a C8 column (Pro RPC HR 5/10, Pharmacia) using 90% acetonitrile in 0.1% TFA. The globin was reduced with dithiothreitol and carboxymethylated with iodoacetic acid (Crestfield et al., 1963).
2.2. Chemical Cleavage and Proteolysis
Chemical cleavage of Hb III with cyanogen bromide was performed as described by Gross and Witkop (1962) using a 500-fold molar excess of CNBr. Trypsin proteolysis was carried out for 4 hr at 37°C in 0.2 M ammonium bicarbonate, pH 8.2, with an enzyme-to-protein ratio of 1:50 (w/w). Digestion with chymotrypsin was performed in trypsin buffer at 24°C for 1 hr with an enzyme-to-protein ratio of 1:100 (w/w). Digestion with endoproteinase Asp-N was also carried out in trypsin buffer for 24 hr at 37°C with an enzyme-to-protein ratio of 1:76 (w/w), with 0.8 M urea used to prevent precipitation of the protein.
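As a rough illustration of the cleavage specificities used in this section, the sketch below applies three simplified rules (CNBr cuts after Met, trypsin after Lys or Arg, Asp-N before Asp) to the Hb III sequence. The `HB3` string is our transcription of the sequence from the peptide tables of this paper, the names `RULES` and `digest` are hypothetical, and the rules ignore missed or partial cleavages, so this is not the procedure the authors followed.

```python
# Hb III sequence as transcribed from the peptide tables of this paper
# (assumption: position 118 is taken as Tyr, following the tryptic peptide).
HB3 = (
    "SSGLTGPQKAALKSSWSRFM"  # 1-20
    "DNAVTNGTNFYM"          # 21-32
    "DLFKAYP"               # 33-39
    "DTLTPFKSLFE"           # 40-50
    "DVSFNQMT"              # 51-58
    "DHPTMKAQALVFC"         # 59-71
    "DGMSSFVDNL"            # 72-81
    "DDHEVLVVLLQKMAKLH"     # 82-98
    "FNRGIRIKELR"           # 99-109
    "DGYGVLLR"              # 110-117
    "YLEDHCHVEGSTK"         # 118-130
    "NAWEDFIAYICR"          # 131-142
    "VQGDFMKERL"            # 143-152
)

# Simplified rules: (target residues, side of the target that is cut).
RULES = {
    "CNBr": ("M", "C"),      # cut after Met
    "trypsin": ("KR", "C"),  # cut after Lys/Arg (Pro exception ignored)
    "Asp-N": ("D", "N"),     # cut before Asp
}


def digest(seq, reagent):
    """Return 1-based inclusive (start, end) ranges of the predicted fragments."""
    targets, side = RULES[reagent]
    ends = set()
    for pos, aa in enumerate(seq, start=1):
        if aa in targets:
            end = pos if side == "C" else pos - 1  # last residue of a fragment
            if 0 < end < len(seq):
                ends.add(end)
    bounds = [0] + sorted(ends) + [len(seq)]
    return [(a + 1, b) for a, b in zip(bounds, bounds[1:])]


for reagent in RULES:
    print(reagent, digest(HB3, reagent))
```

Under these rules the predicted CNBr fragments include 1-20, 21-32, 33-57, 64-74, 75-94, 95-148, and 149-152, which are the positions reported for the seven CNBr fragments in Table VI.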
2.3. Peptide Separation
The peptides obtained by chemical and proteolytic cleavages were separated by reversed-phase chromatography on a C18 column (Pep RPC 5/5, Pharmacia) using a Pharmacia FPLC system, in which the absorbance of the eluate was monitored at 214 and 280 nm. Gradients of 26-55%, 19-50%, 0-50%, and 0-40% acetonitrile in 0.1% trifluoroacetic acid were used for the separations of the CNBr, tryptic, chymotryptic, and Asp-N peptides, respectively.
2.4. Amino Acid Analysis
Samples (2 nmol or 10% of peptide fractions from reversed-phase chromatography) were hydrolyzed in 6 N HCl containing 1% phenol for 1.5 hr at 155°C, dried once, and then dried again in methanol: 0.75 M sodium acetate: triethylamine (2:2:1). Derivatization to form the phenylthiocarbamyl (PTC) derivative was performed by incubation in methanol: water: triethylamine: phenylisothiocyanate
Hemoglobin III Sequence from L. pectinata
Fig. 1. Reversed-phase chromatography of the Asp-N peptides of Hb III on a C18 column using a gradient of 0-40% acetonitrile in 0.1% TFA. [Chromatogram not reproduced; x-axis: Time (min).]
Table II. Amino Acid Composition of Asp-N Peptides from Lucina pectinata Hemoglobin III

Asp-N peptide(a)    Total residues    Sequence(c)
1                   20                1-20
2a                  11                21-31
2b                  7                 33-39
3                   11                40-50
4                   8                 51-58
5                   13                59-71
6                   28                82-109
7                   11                110-120
8                   14                121-134
9                   11                135-145
10                  7                 146-152

[Per-residue molar ratios for (D+N), (E+Q), S, G, H, R, T, A, P, Y, V, M, C, I, L, F, K, and W are not reproduced.]

a Asp-N peptide numbers refer to the respective peaks identified in Figure 1.
b Residues determined as molar ratios to P (peptides 1 and 5), L (peptide 2b), K (peptides 3 and 8), R (peptide 7), V (peptide 2a), and F (peptides 4, 6, 9, and 10). Residues unable to be quantitated but whose presence was known due to 280 nm absorbance are designated as +. Numbers in parentheses are the number of residues observed by direct sequence analysis.
c Position in complete sequence.
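The molar ratios described in footnote b can be sketched as a simple normalization: each amino acid amount from the hydrolysate is divided by the amount of a reference residue whose count in the peptide is known. The function name `molar_ratios` and the picomole values below are hypothetical and serve only to illustrate the arithmetic; they are not data from this paper.

```python
def molar_ratios(amounts_pmol, reference, reference_count=1):
    """Express each amino acid amount as residues per peptide.

    amounts_pmol: mapping of one-letter code to recovered amount (pmol)
    reference: residue used as the internal standard for the peptide
    reference_count: number of reference residues assumed in the peptide
    """
    per_residue = amounts_pmol[reference] / reference_count
    return {aa: round(amount / per_residue, 1)
            for aa, amount in amounts_pmol.items()}


# Hypothetical hydrolysate of a short peptide, normalized to Phe.
example = {"F": 200.0, "D": 220.0, "L": 420.0, "K": 180.0}
print(molar_ratios(example, reference="F"))
```

Rounded ratios of this kind are what the composition tables pair with the residue counts found by direct sequencing.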
Table III. Amino Acid Sequences of Asp-N Peptides from Lucina pectinata Hemoglobin III

Each row lists the residues found at successive Edman cycles, starting at cycle 1, with the recovery(a) in parentheses.

Asp-N peptide    Sequence(d)    Residues by cycle
1(b)             1-20           S (528), S (315), G (268), L (97), T (297), G (268), P (155), Q (107), K (45), A (74), A (66), L --, K --, S (7), S (7), W (6), S (3), R (5), F --, M --
2a(c)            21-27          D --, N (193), A (171), V (119), T (64), T (21), M (5)
2b               33-39          D --, L (560), F (249), K (102), A (128), Y (107), P (26)
3                40-50          D (146), T (186), L (114), T (41), P (11), F (7), K --, S (2), L (2), F (2), E --
4                51-58          D (193), V (136), S (58), F (86), N (67), Q (36), M (17), T (12)
5                59-71          D (450), H --, P (200), T (208), M (255), K (163), A (247), Q (184), A (110), L (76), V (71), F (59), C (33)
6                82-98          D (36), D (54), H --, E (25), V (20), L (16), V (12), V (12), L --, L (9), Q (6), K (1), M (3), A (6), K (1), L (4), H --
8                121-134        D (391), H --, C (164), H --, V (127), E (105), G (66), S (22), T (31), K (4), N (30), A (15), W --, E (8)
9                135-145        D (139), F (116), I (105), A (109), Y (60), I (62), C (42), R --, V (34), Q (25), G (10)
10               146-152        D (330), F (248), M (190), K (99), E (98), R --, L (13)

a Quantity, in pmoles, recovered at each cycle. When no quantitation was possible, the residue is represented by --.
b N-terminal peptide was sequenced following acid-catalyzed deblocking.
c This sequence was never overlapped and is felt to be an example of microheterogeneity in Hb III.
d Position in complete sequence.
(7:1:1:1) for 20 min at room temperature. After drying, the derivative was analyzed by reversed-phase chromatography (Henrickson and Meredith, 1984) using the Pico-Tag System of Waters Associates. All samples were analyzed in duplicate.
2.5. Amino Acid Sequence Determination
Sequence analysis via automated Edman degradation was performed using a Beckman 890M spinning cup sequencer and/or an Applied Biosystems 470A gas phase sequencer equipped with a 120A PTH-analyzer. Phenylthiohydantoin derivatives from the spinning cup sequencer were separated and quantitated using a Beckman Ultrasphere column isocratically eluted with 0.01 M sodium acetate, pH 4.9:acetonitrile (62:38, v/v) maintained at 56°C and a flow rate of 1.0 ml/min (Chan, 1984). Each peptide was sequenced at least twice. The blocked N-terminus of Hb III was analyzed using acid-catalyzed deblocking of N-acetyl-serine in the N-terminal trypsin and Asp-N peptides (Wellner et al., 1990).
3. RESULTS
3.1. Amino Acid Composition
The amino acid composition of Lucina pectinata Hb III obtained by acid hydrolysis (Kraus and Wittenberg, 1990) is in good agreement with that calculated from the sequence given in Table I.
3.2. Asp-N Peptides
The Asp-N peptides from Hb III were resolved by reversed-phase chromatography into 10 major peaks containing 11 peptides (Fig. 1). Table II provides the amino acid composition of the 11 peptides. Edman degradation of 10 peptides provided the placement of 108 amino acid residues (Table III). Microheterogeneity of Hb III may exist, as suggested by the discrepancies between the composition and the sequence of peptide 2a (Tables II and III).
3.3. Tryptic Peptides
The tryptic peptides from Hb III were separated by reversed-phase chromatography into nine peaks containing 10 peptides (Fig. 2). Table IV presents the amino acid compositions of nine tryptic peptides. The sequences of the 10 peptides are given in Table V; these results establish the amino acid sequence of 116 residues.

Fig. 2. Reversed-phase chromatography of the tryptic peptides of Hb III on a C18 column using a gradient of 19-50% acetonitrile in 0.1% TFA. [Chromatogram not reproduced; x-axis: Time (min).]
3.4. Cyanogen Bromide Peptides
Reversed-phase chromatography of CNBr-cleaved Hb III is shown in Fig. 3; seven major peaks were obtained. The results of the amino acid analyses of the seven major peaks are presented in Table VI. Automated Edman degradation of the seven peptides (Table VII) identified 66 residues.
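The residue counts quoted for the three digests can be checked against the sequence positions listed in Tables III, V, and VII. The sketch below is our own bookkeeping, not a step reported by the authors: the position ranges are transcribed from those tables (plus the chymotryptic peptide of Table IX), and the names `SEQUENCED`, `residues_placed`, and `covered_positions` are hypothetical. It assumes that the Asp-N total of 108 leaves out peptide 2a, the one described as never overlapped.

```python
# 1-based inclusive positions of the stretches read by Edman degradation,
# transcribed from Tables III, V, VII and IX.
SEQUENCED = {
    "Asp-N": {"1": (1, 20), "2a": (21, 27), "2b": (33, 39), "3": (40, 50),
              "4": (51, 58), "5": (59, 71), "6": (82, 98), "8": (121, 134),
              "9": (135, 145), "10": (146, 152)},
    "tryptic": {"1a": (1, 9), "1b": (143, 149), "2": (14, 18), "3": (19, 36),
                "4": (37, 46), "5": (47, 64), "6": (65, 80), "7": (110, 117),
                "8": (118, 130), "9": (131, 142)},
    "CNBr": {"2": (21, 32), "4": (64, 74), "5": (75, 86), "6": (95, 121),
             "7": (149, 152)},
    "chymotryptic": {"1": (78, 89)},
}


def residues_placed(ranges, skip=()):
    """Sum the lengths of the sequenced stretches, optionally skipping peptides."""
    return sum(end - start + 1
               for name, (start, end) in ranges.items() if name not in skip)


def covered_positions(digests):
    """Return the set of positions read in at least one peptide."""
    covered = set()
    for ranges in digests.values():
        for start, end in ranges.values():
            covered.update(range(start, end + 1))
    return covered


print("Asp-N:", residues_placed(SEQUENCED["Asp-N"], skip=("2a",)))
print("tryptic:", residues_placed(SEQUENCED["tryptic"]))
print("CNBr:", residues_placed(SEQUENCED["CNBr"]))
print("positions never read:", sorted(set(range(1, 153)) - covered_positions(SEQUENCED)))
```

With these ranges the three totals come to 108, 116, and 66 residues, matching the figures given in Sections 3.2 to 3.4, and every position from 1 to 152 is read in at least one peptide.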
3.5. Chymotryptic Peptides
The chymotryptic peptides of Hb III were separated by reversed-phase chromatography into eight major peaks (figure not shown). The amino acid composition of peptide 1 (Table VIII) and its sequence (Table IX) led to the placement of nine amino acid residues.
3.6. N-Terminal Sequence
The determination of the N-terminal sequence of Hb III involved the acid-catalyzed deblocking of the N-terminal acetyl-serine (Wellner et al., 1990). The presence of this group was inferred from the amino acid compositions of the N-terminal Asp-N and tryptic peptides and from the nature of the N-blocking group in the other oxygen-reactive component, Hb II, determined previously (Hockenhull-Johnson et al., 1991).
3.7. Complete Amino Acid Sequence of L. pectinata Hb III
The result of the alignment of the overlapping proteolytic fragments is shown in Fig. 4. All overlaps were at least two amino acids. The calculated amino
Table IV. Amino Acid Composition of Tryptic Peptides from Lucina pectinata Hemoglobin III

Tryptic peptide(a)    Total residues    Sequence(c)
1a                    9                 1-9
2                     5                 14-18
3                     18                19-36
4                     10                37-46
5                     19                47-64
6                     29                65-93
7                     8                 110-117
8                     13                118-130
9                     12                131-142

[Per-residue molar ratios for (D+N), (E+Q), S, G, H, R, T, A, P, Y, V, M, C, I, L, F, K, and W are not reproduced.]

a Tryptic peptide numbers refer to the respective peaks identified in Fig. 2.
b Residues determined as molar ratios to R (peptides 2 and 7), L (peptide 3), F (peptide ...). Residues whose presence was known due to 280 nm absorbance or 3H are designated as +. Numbers in parentheses are the number of residues observed by direct sequence analysis.
c Sequence designation refers to the respective positions within the complete sequence.
Table V. Amino Acid Sequences of Tryptic Peptides from Lucina pectinata Hemoglobin III

Each row lists the residues found at successive Edman cycles, starting at cycle 1, with the recovery(a) in parentheses.

Tryptic peptide    Sequence(c)    Residues by cycle
1a(b)              1-9            S (357), S (201), G (270), L (209), T (94), G (91), P (84), Q (66), K (121)
1b                 143-149        V (102), Q (36), G (270), D (299), F (20), M (27), K (68)
2                  14-18          S --, S --, W (33), S --, R --
3                  19-36          F (198), M (132), D (107), N (103), A (79), V (70), T (34), N (6), G (4), T (3), N (1), F (1), Y (1), M (1), D (1), L --, F --, K --
4                  37-46          A (378), Y (247), P (59), D (106), T --, L (72), T (1), P (3), F (13), K (4)
5                  47-64          S --, L (91), F (65), E (48), D --, V --, S --, F --, N (16), Q (13), M (8), T (5), D (7), H --, P (3), T (1), M (1), K --
6                  65-80          A (52), Q (15), A (24), L (22), V (25), F (26), C (6), D (55), G (24), M (4), S --, S (4), F (8), V (3), D (9), N (3)
7                  110-117        D (217), G (94), Y (243), G (54), V (130), L (54), L (35), R --
8                  118-130        Y (110), L (117), E (101), D (90), H (37), C (57), H (27), V (31), E (31), G (22), S (4), T (7), K (6)
9                  131-142        N (32), A (58), W --, E (20), D (21), F (22), I (16), A (13), Y (11), I (13), C (9), R --

a Quantity, in pmoles, recovered at each cycle. When no quantitation was possible, the residue is represented by --.
b N-terminal peptide which was sequenced following acid-catalyzed deblocking.
c Position in complete sequence.
Fig. 3. Reversed-phase chromatography of the CNBr fragments of Hb III on a C18 column using a gradient of 26-55% acetonitrile in 0.1% TFA. [Chromatogram not reproduced; x-axis: Time (min).]

Table VI. Amino Acid Composition of Cyanogen Bromide Fragments from Lucina pectinata Hemoglobin III

Cyanogen bromide fragment(a)    Total residues    Sequence(c)
1                               20                1-20
2                               12                21-32
3                               25                33-57
4                               11                64-74
5                               20                75-94
6                               54                95-148
7                               4                 149-152

[Per-residue molar ratios for (D+N), (E+Q), S, G, H, R, T, A, P, Y, V, M, C, I, L, F, K, and W are not reproduced.]

a Cyanogen bromide fragment numbers refer to the respective peaks identified in Fig. 3.
b Residues determined as mole ratios to F (fragments 2 and 7), R (fragment 1), K (fragments 4 and 6), L (fragment 3), and E + Q (fragment 5). Residues unable to be quantitated, but whose presence was known due to 280 nm absorbance, 3H, or homoserine lactone presence, are designated as +. Numbers in parentheses are the number of residues observed by direct sequence analysis.
c Sequence designation refers to the respective positions within the complete sequence, as shown in Fig. 5.
Table VII. Amino Acid Sequences of Cyanogen Bromide Fragments from Lucina pectinata Hemoglobin III

Each row lists the residues found at successive Edman cycles, starting at cycle 1, with the recovery(a) in parentheses.

CNBr fragment    Sequence(b)    Residues by cycle
2                21-32          D (118), N (62), A (108), V (168), T --, N (22), G (25), T --, N (22), F (5), Y (6), M --
4                64-74          K (114), A (99), Q (52), A (47), L (55), V (44), F (49), C --, D (6), G (10), M --
5                75-86          S --, S (92), F --, V (150), D (95), N (178), L (113), D (96), D (106), H --, E (7), V (11)
6                95-121         A (55), K (84), L (73), H --, F (40), N (25), R --, G (50), I (20), R --, I (18), K (21), E (18), L (28), R --, D (11), G (27), Y (12), G (18), V (3), L (11), L (8), R --, V (13), L (4), E (5), D (4)
7                149-152        K (332), E (337), R --, L (19)

a Quantity, in pmoles, recovered at each cycle. When no quantitation was possible, the residue is represented by --.
b Position in complete sequence.
acid composition is in good agreement with the composition determined by amino acid analysis (Table I).
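The agreement with Table I can be illustrated by counting residues directly. In the sketch below, `HB3` is our transcription of the complete sequence from the peptide tables (with Tyr assumed at position 118, as in the tryptic peptide), and `TABLE_I_AASEQ` copies the AASeq column of Table I; both names are hypothetical. The final lines show one additive reading of the reported masses, using standard values for an acetyl group and heme b, which we assume (but the paper does not state) is how the calculated Mm of 18,068 was obtained.

```python
from collections import Counter

HB3 = (
    "SSGLTGPQKAALKSSWSRFM" "DNAVTNGTNFYM" "DLFKAYP" "DTLTPFKSLFE"
    "DVSFNQMT" "DHPTMKAQALVFC" "DGMSSFVDNL" "DDHEVLVVLLQKMAKLH"
    "FNRGIRIKELR" "DGYGVLLR" "YLEDHCHVEGSTK" "NAWEDFIAYICR" "VQGDFMKERL"
)

# AASeq column of Table I (acid hydrolysis cannot separate D from N or E from Q).
TABLE_I_AASEQ = {
    "D+N": 20, "E+Q": 12, "S": 10, "G": 9, "H": 5, "R": 7, "T": 8, "A": 9,
    "P": 4, "Y": 5, "V": 10, "M": 7, "C": 3, "I": 4, "L": 16, "F": 11,
    "K": 10, "W": 2,
}


def grouped_composition(seq):
    """Count residues, pooling Asp/Asn and Glu/Gln as in Table I."""
    counts = Counter(seq)
    comp = {"D+N": counts["D"] + counts["N"], "E+Q": counts["E"] + counts["Q"]}
    for aa in "SGHRTAPYVMCILFKW":
        comp[aa] = counts[aa]
    return comp


observed = grouped_composition(HB3)
mismatches = {k: (observed[k], v) for k, v in TABLE_I_AASEQ.items() if observed[k] != v}
print("residues:", len(HB3), "mismatches vs Table I:", mismatches)

# Assumed mass bookkeeping: apoprotein (Table I) + N-acetyl group + heme b.
APOPROTEIN_MW = 17409
ACETYL = 42.04   # C2H2O added on N-terminal acetylation
HEME_B = 616.5   # C34H32FeN4O4
print("with acetyl and heme:", round(APOPROTEIN_MW + ACETYL + HEME_B))
```

Counting from the transcribed sequence gives 152 residues and reproduces every entry of the AASeq column.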
4. DISCUSSION
4.1. Structural Basis for Alignment of Globin Sequences
The two Lucina sequences, Hb II obtained previously (Hockenhull-Johnson et al., 1991) and Hb III determined in this study, were fitted to the "myoglobin fold" (Lesk and Chothia, 1980) using Template I of Bashford et al. (1987). This template consists of a series of motifs for the globin helical regions incorporating the residues that are either invariant or conservatively replaced in a database of 226 globins, predominantly vertebrate. The sequences were divided into subsets in which at least one sequence had a known crystal structure; the sequences within each subset were aligned with each other based on the key sequence structure; finally, the subsets of sequences were aligned with each other based on the seven known tertiary structures. The results of fitting the sequences of the two Lucina globins are shown in Fig. 5, which also provides the secondary structures of sperm whale (Physeter) myoglobin, of Aplysia limacina myoglobin (Bolognesi et al., 1989, 1990, 1991), and of Scapharca inaequivalvis Hb I (Royer et al., 1985, 1989). Except for the D helix, which is absent in Scapharca, all the helical regions in the two structures line up very well with each other and with the Physeter secondary structure. All the 71 positions comprising Template I of Bashford et al. (1987) are identified in the myoglobin fold: (1) the 34 positions occupied by interior residues with low, medium, and severe restrictions in
Table VIII. Amino Acid Compositions of One Selected Chymotryptic Peptide from Lucina pectinata Hb III

Totals and sequence positions(c) listed: 14 residues, 36-49; 34 residues, 79-112.

[Per-residue molar ratios for D+N, E+Q, S, G, H, R, T, A, P, Y, V, M, C, I, L, F, K, and W are not reproduced.]

a Chymotryptic peptide number refers to the peak labeled no. 1 (data not shown).
b Residues determined as a molar ratio to Y (peptide 1). Numbers in parentheses are the number of residues observed by direct sequence analysis.
c Sequence designation refers to the respective position within the complete sequence, as shown in Fig. 5.

Table IX. Amino Acid Sequence of a Selected Chymotryptic Peptide from Lucina pectinata Hb III

Chymotryptic peptide    Sequence(b)    Residues by cycle, from cycle 1, with recovery(a)
1                       78-89          V (41), D (36), N (27), L (23), D (20), D (27), H --, E (11), V (12), L (1), V (9), V (14)

a Quantity, in pmoles, recovered at each cycle. When no quantitation was possible, the residue is represented by --. The sequence was aborted prior to the peptide end.
b Position in complete sequence.
size according to Table 4 of Bashford et al. (1987) are in lowercase, lowercase boldface, and capital boldface, respectively; these residues are in boldface in all the sequences; (2) the 32 underlined positions indicate surface residues; (3) the five positions where any residue is acceptable (B5, EF1, FG2, G19, and H20) are indicated by a boldface X.
4.2. Comparison of Lucina Hb III and Hb II
Lucina Hb III is 152 amino acids long, compared with 150 for Hb II, and has an N-terminal acetyl-serine instead of acetyl-threonine (Hockenhull-Johnson et al., 1991). There are 94 identical residues, including the Phe at position CD1, the proximal His at position F8, and the distal Gln at position E7. Lucina Hb III also contains the Tyr at B10, which was shown by modeling to be able to move into the distal portion of the heme cavity of Hb II (Hockenhull-Johnson et al., 1991), in accord with the suggestion that tyrosinate may act as the distal heme ligand in the alkaline form of ferri-Hb II (Kraus et al., 1990). Furthermore, the residues CD4Leu and B14Phe, which were thought likely to facilitate the movement of TyrB10 into the distal part of the heme cavity of Hb II, are conserved in Hb III. In addition to the identities, a number of conservative replacements occur in Hb III compared to Hb II. These can be identified using the matrices for preferred amino acid exchange for buried and for surface residues obtained by Bordo and Argos (1990, 1991) in a search for structurally equivalent pairs of different amino acid residues in a database of 135 alpha-like, 130 beta-like, and 52 myoglobin sequences that were aligned on the basis of known crystal structures. Counting identities and conservative replacements provides a homology score of 75% between the sequences of Lucina Hb II and Hb III.
4.3. Alignment of Lucina Globins with Other Mollusc Globin Sequences
The two Lucina sequences were aligned with the known sequences of bivalve mollusc globins: the cytoplasmic globins of Bursatella leachii (Suzuki and Furukohri, 1990), Dolabella auricularia (Suzuki, 1986), Aplysia limacina (Tentori et al., 1973), Aplysia kurodai (Suzuki et al., 1981), Aplysia juliana (Takagi et al., 1984), Busycon canaliculatum (Bonner and Laursen, 1977), and Cerithidea rhizophorarum (Takagi et al., 1983), and the intracellular globins of Calyptogena soyoe (Suzuki et al., 1989a, b), Anadara
  1  SSGLTGPQKA ALKSSWSRFM DNAVTNGTNF YMDLFKAYPD TLTPFKSLFE   50
 51  DVSFNQMTDH PTMKAQALVF CDGMSSFVDN LDDHEVLVVL LQKMAKLHFN  100
101  RGIRIKELRD GYGVLLRYLE DHCHVEGSTK NAWEDFIAYI CRVQGDFMKE  150
151  RL

Fig. 4. Summary of the data used to establish the complete amino acid sequence of Lucina pectinata Hb III. C, CNBr fragments; T, tryptic peptides (T-DA-1 is deblocked); A, Asp-N peptides (A-DA-1 is deblocked); CT, chymotryptic peptides. The length of each line corresponds to the extent of automated Edman sequencing and the / designates the cleavage site. [Peptide overlap lines not reproduced.]
broughtonii (Furuta and Kajita, 1983, 1986, 1991), A. trapezia (Como and Thompson, 1980; Fisher et al., 1984; Gilbert et al., 1985; Mann et al., 1986; Titchen et al., 1991), Scapharca inaequivalvis (Peruzzelli et al., 1985, 1989), and Barbatia reeveana (Riggs and Riggs, 1986, 1990), based on the alignment of Bashford et al. (1987). The results are also presented in Fig. 5. The 34 interior residues of Template I of Bashford et al. (1987) are in boldface in all the sequences. It can be seen that at these positions the amino acid residues are either conserved or replaced conservatively according to the criteria of Bordo and Argos (1990, 1991). The only invariant residues are A12Trp, B13Leu, B14Phe, and CD1Phe and the proximal His at position F8. In addition, there are several other positions which exhibit very little variation: A11Ser in 20 of the 23 sequences, B6Gly in all but one of the sequences, G5Phe in 19 of the 23 sequences, and H8Trp found in all but one sequence. At the distal ligand position E7, the mollusc sequences exhibit surprising diversity, with Val in five, His in 14, and Gln in the two Lucina sequences. It is interesting to note that the five mollusc cytoplasmic globin sequences which have a distal Val are monomeric, while Busycon and Cerithidea, which have distal His, and the two Lucina globins can self-associate. Among the mollusc globin sequences, the two Lucina and the two Calyptogena sequences are unique in having a Tyr at B10. A matrix of homologies for the presently available mollusc sequences is given in Table X. The entries across the diagonal are the number of amino acids in the given sequence. The entries above the diagonal provide the overall percent homology score [100(I + C)/N], where I is the number of identical residues, C is the number of conservative replacements allowed by the criteria of Bordo and Argos (1991), and N is the number of positions along which the two sequences are compared. Because of the complexity of Table X, we have sought to simplify the results provided by comparing the Lucina sequences as a group with other groups of sequences, defined as monomeric cytoplasmic (Bursatella, Dolabella, Aplysia limacina, A. kurodai, and A. juliana), dimeric cytoplasmic (Busycon and Cerithidea), and the intracellular globins of Anadara broughtonii, A. trapezia, Scapharca inaequivalvis, Barbatia reeveana, and Calyptogena soyoe: the results are presented in Table
[Fig. 5. Alignment of the Lucina Hb II and Hb III sequences with the myoglobin fold, Physeter myoglobin, and the mollusc globin sequences of Bursatella, Dolabella, Aplysia limacina, A. kurodai, A. juliana, Busycon, Cerithidea, Anadara broughtonii, A. trapezia, Scapharca, Barbatia, and Calyptogena, with the helical regions of the Physeter, Aplysia limacina, and Scapharca Hb I folds marked. Alignment not reproduced.]
Fig. 5. Alignment of Lucina pectinata globins (Hb II and Hb l i d with sperm whale (Physeter) myoglobin and with the known sequences of mollusc globins: the cytoplasmic globins: the cytoplasmic globins of Bursatella leaehii, Dolabella aurieularia, Aplysi?t limacina, A. kurodai, A. juliana, Busyeon canaficulatum, Cerithedea rhizophorarum, and the intracellular globins of Calyptogena soyoe, Anadara broughtonii, A. trapezia, Seapharca inaequioalvis, and Barbatia reeoeana. The N-terminal blocking (usually acetylation) of the A. broughtonii (Ila), A. trapezia (a and fl), and Seapharea IIA globins is not shown. The myoglobin fold shows the 71 positions which comprise Template I of Bashford et al. (1987): (I) the 34 positions occupied by interior residues, with low, medium, and severe restrictions in size, are in lowercase, lowercase boldface, and capital boldface, respectively: these residues are in boldface in all the sequences; (2) the 32 positions occupied by surface residues are underlined; (3) the five positions where any residue is acceptable (positions B5, EFI, FG2, G19, and H20) are indicated by a boldface X. The distal and proximal residues are indicated by D and P, respectively. The other secondary structures are of A. limaeina myoglobin (Bolognesi et al., 1989) and Scapharea globin I (Royer et al., 1989). The A. trapezia sequences have been corrected based on recently determined genomic sequences (E. O. P. Thompson, personal communication) Abbreviations: A, Ala; B, Asx; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; W, Trp; X, any amino acid; Y, Tyr; Z, Glx.
Fig. 5. Continued.
XI as mean ± SD. Furthermore, we provide in Fig. 6 skyscraper plots of the percent homologies of the foregoing groups of globins.
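The group comparisons summarized as mean ± SD can be reproduced in outline from pairwise scores. The following is a minimal sketch, not the authors' procedure: it counts the pairwise comparisons within and between groups and summarizes the three within-group Barbatia scores quoted in Section 4.5 (57, 61, and 84). The assignment of five cytoplasmic globins as monomers and the use of the sample standard deviation are assumptions made for illustration.

```python
from itertools import combinations, product
from statistics import mean, stdev

# Group membership as read from the Fig. 5 caption and Section 4.4.
# Treating the five non-dimeric cytoplasmic globins as monomers is an assumption.
lucina = ["Lucina II", "Lucina III"]
monomers = ["Bursatella", "Dolabella", "A. limacina", "A. kurodai", "A. juliana"]
dimers = ["Busycon", "Cerithidea"]


def within_group_pairs(group):
    """All unordered pairs of distinct sequences inside one group."""
    return list(combinations(group, 2))


def between_group_pairs(group_a, group_b):
    """All pairs taking one sequence from each of two groups."""
    return list(product(group_a, group_b))


def mean_sd(scores):
    """Mean and sample standard deviation (assumed; the paper does not say which SD)."""
    return mean(scores), stdev(scores)


n_monomer_within = len(within_group_pairs(monomers))
n_lucina_vs_monomer = len(between_group_pairs(lucina, monomers))
n_lucina_vs_dimer = len(between_group_pairs(lucina, dimers))

# Within-group Barbatia scores quoted in Section 4.5.
barbatia_scores = [57, 61, 84]
barbatia_mean, barbatia_sd = mean_sd(barbatia_scores)

print(n_monomer_within, n_lucina_vs_monomer, n_lucina_vs_dimer)
print(f"{barbatia_mean:.1f} ± {barbatia_sd:.1f}")
```

With these inputs the pair counts are 10, 10, and 4, matching the n values quoted in Section 4.4, and the Barbatia summary is 67.3 ± 14.6, close to the 67 ± 14 listed in Table XI.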
4.4. Comparison of Lucina Hb III and Hb II with Cytoplasmic Mollusc Hemoglobins

In contrast to the monomeric state of vertebrate myoglobins, as exemplified by sperm whale myoglobin, the molluscan cytoplasmic hemoglobins (myoglobins) can exist either as monomers or, in the case of the Cerithidea and Busycon molecules, as homodimers, whereas the intracellular globins form homodimers and heterotetramers (Bonaventura and Bonaventura, 1983; Nagel, 1985; Terwilliger and Terwilliger, 1985). It can be seen from Tables X and XI and Fig. 6 that the two dimeric myoglobins show much higher homology with each other (63%) than with the other cytoplasmic hemoglobins (44 ± 2%, n = 10), a fact noted earlier (Suzuki and Furukohri, 1989). Although Lucina Hb II and Hb III can self-associate as well as form complexes with each other (Kraus and Wittenberg, 1990; Kemling et al., 1991), their homologies with the dimeric and monomeric cytoplasmic hemoglobins are comparable: 45 ± 3% (n = 4) and 42 ± 3% (n = 10), respectively. As expected, the monomeric sequences exhibit the highest homology within their group, 86 ± 4% (n = 10).

4.5. Comparison of Lucina Hb III and Hb II with Intracellular Mollusc Hemoglobins

The results shown in Tables X and XI and in Fig. 6 indicate that the two intracellular globins of the hydrothermal vent mollusc Calyptogena have low homologies with all other mollusc globins, including the Lucina sequences. When examined among themselves, the two Anadara groups and the Scapharca group exhibit very high homology scores, in excess of 90%, within the following triplets of sequences: AnbI-Antγ-ScaI, Anbα-Antα-ScaIIA, and Anbβ-Antβ-ScaIIB, corresponding with the tendencies of the members of the first group to form homodimers and of the last two groups to participate in the formation of heterotetramers. However, the two Lucina sequences have similar homologies with all three chains of the two Anadara groups and the Scapharca group: 43 ± 3, 43 ± 3, and 43 ± 2, respectively, marginally higher than with the Barbatia group, 38 ± 2 (n = 6), and comparable to their homology with the monomeric and dimeric cytoplasmic hemoglobins. The Barbatia sequences show lower homologies with the Lucina sequences (38 ± 2) than with the two Anadara and the Scapharca groups, 51 ± 2, 49 ± 3, and 51 ± 3, respectively. Within the Barbatia group, the single-domain chain M12, which forms a heterotetramer with another Barbatia globin chain whose sequence has not been determined, has comparable homologies with the two domains of the polymeric hemoglobin (57% and 61%), much lower than the 84% homology between the two domains.

4.6. Comparison of Lucina Hb III and Hb II with Selected Vertebrate and Nonvertebrate Globins

In addition to comparing the Lucina sequences with those of mollusc globins, we have also sought
Table XI. Percent Homology Between Lucina Globin Group and Groups of Cytoplasmic and Intracellular Mollusc Globins

                       Cytoplasmic         Intracellular
              Lucina   Monomer   Dimer^a   Anada                 Scaph                Barba             Calyp
Lucina        75       42 ± 3    45 ± 3    43 ± 3                43 ± 2               38 ± 2            37 ± 1
Monomer                86 ± 4    44 ± 2    42 ± 2                42 ± 1               36 ± 2            35 ± 1
Dimer^a                          63        48 ± 2                48 ± 1               42 ± 1            32 ± 3
Anadara                                    64 ± 5 (n = 6)^b      65 ± 6 (n = 12)^e    50 ± 3 (n = 18)   36 ± 1
                                           64 ± 5 (n = 6)^c,d
Scapharca                                                        67 ± 8 (n = 3)       51 ± 3 (n = 9)    37 ± 2
Barbatia                                                                              67 ± 14           32 ± 3
Calyptogena                                                                                             54
^a The Busycon and Cerithidea sequences.
^b Comparison within each subgroup (i.e., between the A. broughtonii sequences and between the A. trapezia sequences).
^c Comparison of A. broughtonii sequences with the A. trapezia sequences.
^d Excluding the high homologies between the following pairs of sequences: AnbI and Antγ (93), Anbα and Antα (90), and Anbβ and Antβ (96).
^e Excluding the high homologies between the following triplets of sequences: ScaI-AnbI-Antγ (95 ± 3), ScaIIA-Anbα-Antα (92 ± 4), and ScaIIB-Anbβ-Antβ (97 ± 1).

Fig. 6. Two-dimensional bar graphs of the homologies provided in Table XI between the Lucina hemoglobin sequences and the sequences of other mollusc hemoglobins: cytoplasmic hemoglobins (A) and intracellular hemoglobins (B).

to compare them with several other selected globin sequences from distantly related phyla: vertebrate sequences, including the alpha and beta chains of human hemoglobin (Fermi and Perutz, 1981) and lamprey (Petromyzon marinus) hemoglobin (Li and Riggs, 1970; Honzatko et al., 1985; Pastore et al., 1988); two annelid sequences, the intracellular Glycera dibranchiata hemoglobin M-II (Imamura et al., 1972; Arents and Love, 1990) and chain I of the extracellular hexagonal bilayer hemoglobin of Lumbricus terrestris (Shishikura et al., 1987); one echiuran sequence, the intracellular hemoglobin F-1 of Urechis caupo (Garey and Riggs, 1986); one insect globin, Hb III of Chironomus thummi thummi (Goodman et al., 1983); one nematode globin, Trichostrongylus colubriformis hemoglobin (Frenkel et al., 1992); one plant globin, Lupinus luteus Hb II (Yegorov et al., 1980; Konieczny et al., 1987); and the bacterial hemoglobin of Vitreoscilla (Wakabayashi et al., 1986). Table XII provides a matrix of percent homology based on the number of identities and the number of conservative substitutions according to the criteria of Bordo and Argos, obtained from pairwise comparisons of the foregoing globins and the globins of Aplysia limacina and Scapharca inaequivalvis. These results provide a perspective of the extent of homology between very distant groups of living organisms. The 75% homology between the Lucina hemoglobins II and III should be compared to the 65% between human alpha and beta chains. Surprisingly, the Lucina sequences exhibit as much homology with the human alpha and beta chains and lamprey globin (34-37%) as with other nonvertebrate globins, except for the substantially lower homology with the nematode Trichostrongylus and the expected higher homologies with Aplysia and Scapharca. The intracellular annelid globin of Glycera appears to exhibit as much homology with the three vertebrate and the two Lucina sequences (36-40%) as does the extracellular annelid globin of Lumbricus (32-38%); unexpectedly, the two annelid globins have only 36% homology with each other. Likewise, the mollusc sequences, including Lucina, Aplysia, and Scapharca, do not exhibit any more homology with each other than with the globins of distantly related groups. The echiuran Urechis globin has an unexpectedly high homology (45%) with Glycera globin. The globin of the insect Chironomus has substantial homologies (34-43%) across the board except for Trichostrongylus and Vitreoscilla. The plant globin of Lupinus has higher homologies with the human alpha and beta and lamprey chains (39-40%), Glycera (42%), Aplysia (40%), and Vitreoscilla (42%) than with Trichostrongylus. In fact, the nematode sequence appears to have most of the low homologies with all the representative globins selected. Apart from the appreciable homology with Lupinus globin, so does the hemoglobin of the bacterium Vitreoscilla. It should be kept in mind, as indicated in Table XII, that the crystal structures of Trichostrongylus, Vitreoscilla, and Lumbricus globins are not known, and thus may have some substantial differences from the "myoglobin fold."
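A percent homology score of the kind tabulated in Table XII counts identities plus conservative substitutions over the aligned positions that both sequences occupy. The sketch below illustrates that calculation under stated assumptions: the substitution groups are hypothetical placeholders, not the Bordo and Argos criteria; gaps are written as asterisks and skipped; and the two short sequences are invented for the example.

```python
# Hypothetical conservative-substitution groups, for illustration only.
# These are NOT the Bordo and Argos criteria used for Table XII.
CONSERVATIVE_GROUPS = [
    set("ILVM"),
    set("FYW"),
    set("KRH"),
    set("DE"),
    set("NQ"),
    set("ST"),
    set("AG"),
]


def is_conservative(a, b):
    """True if two different residues fall in the same illustrative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_homology(seq1, seq2, gap="*"):
    """Percent of compared positions that are identical or conservatively replaced.

    Positions where either aligned sequence has a gap are skipped
    (an assumption; the treatment of gaps is not stated in the text).
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = conserved = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
        elif is_conservative(a, b):
            conserved += 1
    if compared == 0:
        raise ValueError("no positions to compare")
    return 100.0 * (identical + conserved) / compared


# Invented aligned fragments: 8 compared positions,
# 5 identities and 2 conservative replacements (S/T and L/I).
score = percent_homology("SLSAAE*DL", "SLTPAEKDI")
print(score)
```

Counting identities and conservative replacements separately keeps the two contributions visible, so a stricter or looser substitution table can be swapped in without changing the rest of the calculation.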
Table XII. Percent Homology Between Lucina Globins II and III and Selected Vertebrate and Nonvertebrate Globins

        Humα^a  Humβ^a  Petr^a  Glyc^a  Lumb    Sca^a   Apl^a   Ure^a   Ctt^a   Trich   Lup^a   Vitr
LucII   34      37      36      38      34      44      40      36      39      25      33      31
LucIII  37      35      37      36      32      41      35      38      39      27      30      28
Humα^a          65      53      39      38      39      37      32      40      27      39      34
Humβ^a                  46      39      38      38      38      37      38      28      39      31
Petr^a                          40      37      41      53      36      39      33      40      31
Glyc^a                                  36      37      41      34      34      28      42      36
Lumb                                            35      40      34      34      28      31      31
Sca^a                                                   43      40      39      31      32      36
Apl^a                                                           35      41      31      40      33
Ure^a                                                                   41      26      36      33
Ctt^a                                                                           28      39      29
Trich                                                                                   22      28
Lup^a                                                                                           42

^a The crystal structure of the hemoglobin is known and shows it to have the conventional "myoglobin fold."
Since the homology scores in Table XII among the globins whose crystal structures are known to have the myoglobin fold do not fall below 32%, it is likely that this number reflects the minimum requirement for the maintenance of the secondary structure considered to be characteristic of oxygen-binding hemoglobins. Because the foregoing sequences are compared only over ca. 140 positions, this requirement is equivalent to a total of some 45 identities and conservative replacements. Thus, the 34 positions in the globin interior that, according to Template I of Bashford et al., are conserved, completely or conservatively, have to be complemented by approximately a dozen conserved residues at the surface of the globin in order to meet the minimum requirement for maintenance of the myoglobin fold.
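The arithmetic behind this estimate can be checked directly. The short sketch below uses only the figures quoted above (32%, ca. 140 positions, 34 interior positions); the variable names are illustrative.

```python
# Figures taken from the paragraph above; names are illustrative.
MIN_PERCENT_HOMOLOGY = 32   # lowest score among globins with a known myoglobin fold
COMPARED_POSITIONS = 140    # approximate number of aligned positions compared
INTERIOR_POSITIONS = 34     # interior positions in Template I of Bashford et al.

# Identities plus conservative replacements implied by the minimum score.
min_conserved = round(MIN_PERCENT_HOMOLOGY / 100 * COMPARED_POSITIONS)

# Conserved residues that must therefore come from the globin surface.
surface_needed = min_conserved - INTERIOR_POSITIONS

print(min_conserved, surface_needed)
```

The result, 45 conserved positions of which 11 lie outside the interior set, agrees with the "some 45" and "approximately a dozen" given in the text.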
ACKNOWLEDGMENTS

This work was supported in part by National Institutes of Health Grants DK 30382 (to D.A.W.) and DK 38674 (to S.N.V.) and National Science Foundation Grant DCB 90-17722 (to J.B.W.). J.B.W. is a Research Career Awardee 1-K6-733 of the U.S. Public Health Service National Heart, Lung and Blood Institute. J.H.J. and M.S.S. were supported by National Institutes of Health Training Grants T32 HL07624 and T32 HL07602, respectively. The Macromolecular Facility at Wayne State University is supported in part by the School of Medicine, the Center for Molecular Biology, and the Department of Biochemistry, and is directed by June Snow.
REFERENCES

Arents, G., and Love, W. E. (1990). J. Mol. Biol. 210, 149-161.
Bashford, D., Chothia, C., and Lesk, A. M. (1987). J. Mol. Biol. 196, 199-216.
Bolognesi, M., Onesti, S., Gatti, G., Coda, A., Ascenzi, P., and Brunori, M. (1989). J. Mol. Biol. 205, 529-544.
Bolognesi, M., Coda, A., Frigerio, F., Gatti, G., Ascenzi, P., and Brunori, M. (1990). J. Mol. Biol. 213, 621-625.
Bolognesi, M., Frigerio, F., Lionetti, C., Rizzi, M., Ascenzi, P., and Brunori, M. (1991). In Structure and Function of Invertebrate Oxygen Carriers (Vinogradov, S. N., and Kapp, O. H., eds.), Springer-Verlag, New York, pp. 163-172.
Bonaventura, C., and Bonaventura, J. (1983). In The Mollusca, Vol. 2, Academic Press, New York, pp. 1-50.
Bordo, D., and Argos, P. (1990). J. Mol. Biol. 211, 975-988.
Bordo, D., and Argos, P. (1991). J. Mol. Biol. 217, 721-729.
Bonnet, A. S., and Laursen, R. A. (1977). FEBS Lett. 73, 201-203.
Chan, M. M. S. (1984). Beckman Chromatogr. 5, 2-5.
Childress, J. J., and Fisher, C. R. (1992). Oceanogr. Mar. Biol. Ann. Rev. 30, 35-46.
Como, P. F., and Thompson, E. O. P. (1980). Austr. J. Biol. Sci. 33, 653-664.
Crestfield, A. M., Moore, S., and Stein, W. H. (1963). J. Biol. Chem. 238, 622-627.
Fermi, G., and Perutz, M. F. (1981). Atlas of Molecular Structures in Biology 2: Haemoglobin and Myoglobin, Clarendon Press, Oxford.
Fisher, C. R. (1990). Rev. Aquatic Sci. 2, 399-436.
Fisher, W. K., Gilbert, A. T., and Thompson, E. O. P. (1984). Austr. J. Biol. Sci. 37, 191-203.
Frenkel, M. J., Dopheide, T. A. A., Wagland, B. M., and Ward, C. W. (1992). Mol. Biochem. Parasit. 50, 27-36.
Furuta, H., and Kajita, A. (1983). Biochemistry 22, 917-922.
Furuta, H., and Kajita, A. (1986). In Invertebrate Oxygen Carriers (Linzen, B., ed.), Springer-Verlag, Berlin, pp. 117-121.
Furuta, H., and Kajita, A. (1991). In Structure and Function of Invertebrate Oxygen Carriers (Vinogradov, S. N., and Kapp, O. H., eds.), Springer-Verlag, New York, pp. 257-260.
Garey, J. R., and Riggs, A. F. (1986). J. Biol. Chem. 261, 16,446-16,450.
Gilbert, A. T., and Thompson, E. O. P. (1985). Austr. J. Biol. Sci. 38, 221-236.
Goodman, M., Braunitzer, G., Kleinschmidt, T., and Aschauer, H. (1983). Z. Physiol. Chem. 354, 205-217.
Gross, E., and Witkop, B. (1962). J. Biol. Chem. 237, 1856-1860.
Henrickson, R. I., and Meredith, S. C. (1984). Anal. Biochem. 136, 65-74.
Hockenhull-Johnson, J. D., Stern, M. S., Martin, P., Dass, C., Desiderio, D. M., Wittenberg, J. B., Vinogradov, S. N., and Walz, D. A. (1991). J. Prot. Chem. 10, 609-622.
Honzatko, R. B., Hendrickson, W. A., and Love, W. E. (1985). J. Mol. Biol. 184, 147-164.
Imamura, T., Baldwin, T. O., and Riggs, A. F. (1972). J. Biol. Chem. 247, 2785-2797.
Jones, M. L. (ed.) (1985). Bull. Biol. Soc. Wash. 6, 1-547.
Kemling, N., Kraus, D. W., Wittenberg, J. B., Vinogradov, S. N., Walz, D. A., Hockenhull-Johnson, J. D., Edwards, B. F. P., and Martin, P. (1991). J. Mol. Biol. 222, 463-464.
Konieczny, A., Jensen, E. O., Marcker, K. A., and Legocki, A. B. (1987). Mol. Biol. Rep. 12, 61-66.
Kraus, D. W., and Wittenberg, J. B. (1990). J. Biol. Chem. 265, 16,043-16,053.
Kraus, D. W., Wittenberg, J. B., Lu, J. F., and Peisach, J. (1990). J. Biol. Chem. 265, 16,054-16,059.
Lesk, A. M., and Chothia, C. (1980). J. Mol. Biol. 136, 225-270.
Li, S. L., and Riggs, A. F. (1970). J. Biol. Chem. 245, 6149-6152.
Mann, R. G., Fisher, W. K., Gilbert, A. T., and Thompson, E. O. P. (1986). Austr. J. Biol. Sci. 39, 109-115.
Nagel, R. L. (1985). In Blood Cells of Marine Invertebrates (W. D. Cohen, ed.), Alan R. Liss, New York, pp. 227-247.
Pastore, A., Lesk, A. M., Bolognesi, M., and Onesti, S. (1988). Proteins Struc. Func. Genet. 4, 240-250.
Petruzzelli, R., Goffredo, B. M., Barra, D., Bossa, F., Boffi, A., Verzili, D., Ascoli, F., and Chiancone, E. (1985). FEBS Lett. 184, 328-332.
Petruzzelli, R., Boffi, A., Barra, D., Bossa, F., Ascoli, F., and Chiancone, E. (1989). FEBS Lett. 259, 133-136.
Read, K. R. H. (1966). In Physiology of Mollusca (Wilbur, K. M., and Yonge, C. M., eds.), Vol. 2, Academic Press, New York, pp. 209-232.
Riggs, A. F., Riggs, C. K., Lin, R. J., and Domdey, H. (1986). In Invertebrate Oxygen Carriers (Linzen, B., ed.), Springer-Verlag, Berlin, pp. 473-476.
Riggs, C. K., and Riggs, A. F. (1990). In Invertebrate Dioxygen Carriers (Preaux, G., and Lontie, R., eds.), Leuven University Press, Leuven, pp. 57-60.
Royer, W. E., Love, W. E., and Fenderson, F. F. (1985). Nature 316, 277-280.
Royer, W. E., Hendrickson, W. A., and Love, W. E. (1989). J. Biol. Chem. 264, 31,052-31,062.
Shishikura, F., Snow, J. W., Gotoh, T., Vinogradov, S. N., and Walz, D. A. (1987). J. Biol. Chem. 262, 3123-3130.
Southward, E. C. (1987). In Microbes in the Sea (Sleigh, M. A., ed.), Ellis Horwood, Chichester, pp. 84-116.
Suzuki, T., Takagi, T., and Shikama, K. (1981). Biochim. Biophys. Acta 669, 79-83.
Suzuki, T. (1986). J. Biol. Chem. 261, 3692-3699.
Suzuki, T., Takagi, T., and Ohta, S. (1989a). Biochem. J. 260, 177-182.
Suzuki, T., Takagi, T., and Ohta, S. (1989b). Biochim. Biophys. Acta 999, 254-259.
Suzuki, T., and Furukohri, T. (1990). J. Prot. Chem. 9, 69-73.
Takagi, T., Tobita, M., and Shikama, K. (1983). Biochim. Biophys. Acta 745, 32-36.
Takagi, T., Iida, S., Matsuoka, A., and Shikama, K. (1984). J. Mol. Biol. 180, 1179-1184.
Teale, F. W. J. (1959). Biochim. Biophys. Acta 35, 543.
Tentori, L., Vivaldi, G., Carta, S., Antonini, A., and Brunori, M. (1973). Int. J. Pep. Prot. Res. 5, 182-200.
Terwilliger, R. C., and Terwilliger, N. B. (1985). Comp. Biochem. Physiol. 81B, 255-261.
Titchen, D. A., Glenn, W. K., Nassif, N., Thompson, A. R., and Thompson, E. O. P. (1991). Biochim. Biophys. Acta 1089, 61-67.
Wakabayashi, S., Matsubara, H., and Webster, D. A. (1986). Nature 322, 481-483.
Wellner, D., Panneerselvam, C., and Horecker, B. L. (1990). Proc. Natl. Acad. Sci. U.S.A. 87, 1947-1949.
Wittenberg, J. (1985). Bull. Biol. Soc. Washington 6, 301-310.
Wittenberg, J. B., and Kraus, D. W. (1991). In Structure and Function of Invertebrate Oxygen Carriers (Vinogradov, S. N., and Kapp, O. H., eds.), Springer-Verlag, New York, pp. 323-330.
Wittenberg, J. B., and Wittenberg, B. A. (1990). Annu. Rev. Biophys. Biophys. Chem. 19, 217-235.
Yegorov, Ts. A., Kazakov, V. K., Shakhparonov, M. I., and Feigina, M. Yu. (1980). Bioorg. Khim. 6, 349-364.