Springer 2006
Plant Molecular Biology (2006) 60:293–319 DOI 10.1007/s11103-005-4109-7
Genome-wide analysis and experimentation of plant serine/threonine/tyrosine-specific protein kinases
Parvathi Rudrabhatla, Mamatha M. Reddy and Ram Rajasekharan*
Department of Biochemistry, Indian Institute of Science, Bangalore 560012, India (*author for correspondence; e-mail [email protected])
Received 22 June 2005; accepted in revised form 17 October 2005
Key words: abiotic stress, dual-specificity protein kinase, molecular modeling, phylogeny, tyrosine kinase in plants
Abstract
Protein tyrosine phosphorylation plays an important role in cell growth, development and oncogenesis. No classical protein tyrosine kinase has hitherto been cloned from plants. Do protein tyrosine kinases exist in plants? To address this question, we performed a genomic survey of protein tyrosine kinase motifs in plants using the tyrosine phosphorylation motifs delineated in animal systems. The Arabidopsis thaliana genome encodes 57 different protein kinases that have tyrosine kinase motifs. The animal non-receptor tyrosine kinases SRC, ABL, LYN, FES, SEK, KIN and RAS have a structural relationship with putative plant tyrosine kinases. In an extended analysis, animal receptor and non-receptor kinases, Raf and Ras kinases, mixed lineage kinases and plant serine/threonine/tyrosine (STY) protein kinases form a well-supported group sharing a common origin within the superfamily of STY kinases. We report that plants lack bona fide tyrosine kinases, which raises the intriguing possibility that tyrosine phosphorylation in plants is carried out by dual-specificity STY protein kinases. The distribution pattern of STY protein kinase families on Arabidopsis chromosomes indicates that this gene family is partly a consequence of duplication and reshuffling of the Arabidopsis genome and of the generation of tandem repeats. The genome-wide analysis is supported by the functional expression and characterization of At2g24360 and by phosphoproteomics of Arabidopsis. Evidence for tyrosine-phosphorylated proteins is provided by alkaline hydrolysis, anti-phosphotyrosine immunoblotting, phosphoamino acid analysis and peptide mass fingerprinting. These results constitute the first comprehensive genome-wide survey and tyrosine phosphoproteome analysis of plant STY protein kinases.
Introduction
Protein kinases catalyze ATP-dependent phosphorylation of serine, threonine and tyrosine residues on target proteins. Traditionally, protein kinases have been divided into two nonoverlapping families: the protein tyrosine kinases and the serine/threonine kinases (Hanks et al., 1988; Hanks and Quinn, 1991). These two distinct families were originally identified based on conserved amino acid motifs within their catalytic domains (Hanks et al., 1988). Because of their apparently absolute specificity for their respective substrate residues, protein kinases were believed to phosphorylate exclusively either tyrosine residues or serine/threonine residues. The discovery of protein kinases capable of phosphorylating all three hydroxyl amino acids, termed the dual-specificity kinases, has forced a re-evaluation of protein kinase specificity (Lindberg et al., 1992). Phosphorylation of tyrosine in animal proteins acts as an on–off switch in numerous pathways that regulate growth, differentiation and oncogenesis (Hunter, 1987). On the other hand, no classical protein tyrosine kinase has been cloned from plants. Despite the absence of these kinases, dual-specificity kinases have been reported in plants. Arabidopsis ADK1, APK1, ATN1 (Hirayama and Oka, 1992; Ali et al., 1994; Tregear et al., 1996) and soybean GmPK6 (Feng et al., 1993) are the known dual-specificity kinases in plants. Peanut STY protein kinase has been shown to have an important role in abiotic stress response and seed development (Rudrabhatla and Rajasekharan, 2002). In addition, peanut STY protein kinase has been shown to be regulated by tyrosine phosphorylation (Rudrabhatla and Rajasekharan, 2003, 2004). However, bona fide protein tyrosine phosphatases (PTPs) have been characterized from Arabidopsis and other species (Gupta et al., 1998; Xu et al., 1998; Fordham-Skelton et al., 1999).
The size of the catalytic domain of protein kinases varies from 250 to 300 amino acid residues. The catalytic domains are not conserved uniformly, but consist of alternating regions of high and low conservation. Eleven major conserved sub-domains are evident, of which sub-domains VI and VIII are of particular interest because they contain residues that are specifically conserved in either the serine/threonine (S/T) or the tyrosine kinases, and may play a role in recognition of the correct hydroxyl amino acid. The consensus DLRAAN or DLAARN found in sub-domain VI is a strong indicator of tyrosine specificity, whereas the consensus DLKPEN is an indicator of S/T specificity. Similarly, sub-domain VIII shows a well-conserved tyrosine-specific consensus, P(I/V)(K/R)W(T/M)APE, and a more poorly conserved S/T-specific consensus, G(T/S)XX(Y/F)XAPE. The consensus CW(X)6RPXF of sub-domain XI also confers specificity to tyrosine kinases. The elucidation of the complete genome sequence of the model plant Arabidopsis has allowed us to perform a comprehensive survey of tyrosine protein kinase motifs in this organism.
Limited evidence for tyrosine phosphorylation in plant proteins prompted us to look for protein tyrosine kinase activity in Arabidopsis. Phosphotyrosine residues are stable at alkaline pH, but phosphoserine and phosphothreonine are not (Bourassa et al., 1988; Noiman and Shaul, 1995). We have demonstrated alkali-resistant phosphoproteins in Arabidopsis. Furthermore, the identity of tyrosine-phosphorylated proteins has been confirmed by phosphoamino acid analysis, anti-p-Tyr immunoblot analysis and peptide mass fingerprinting.
Methods
Sequence retrieval, alignment, and comparison
Gene, protein, EST, and cDNA sequences were identified by searching public databases available at NCBI (http://www.ncbi.nlm.nih.gov), The Institute for Genomic Research (TIGR; http://www.tigr.org/tdb/agi/), and MIPS (http://mips.gsf.de/proj/thal/; Schoof et al., 2002) with the BLAST algorithms (Altschul et al., 1990, 1997). The tyrosine kinase catalytic domain signature was retrieved from SPRINTS (http://www.bioinf.man.ac.uk/cgi-bin/dbbrowser/sprint/searchprintss.cgi). Sequences were aligned using the MULTALIN program (http://prodes.toulouse.inra.fr/multalin/), and pairwise comparisons were performed with ClustalW (Thompson et al., 1994). The TYRKINASE fingerprint provides a signature for the catalytic domain of tyrosine kinases, distinguishing it from that of S/T kinases. The fingerprint was derived from an initial alignment of seven sequences (Hanks et al., 1988); its motifs correspond mainly to sub-domains VI, VIII (cf. PROSITE pattern PROTEIN_KINASE_TYR (PS00109)) and XI (Hanks, 1987; Hanks et al., 1988; Tan and Spudich, 1990). All sequence alignments and calculations of sequence identities were performed with ClustalW (Thompson et al., 1994). Sequences were initially subjected to BLASTP (Altschul et al., 1997) searches at NCBI. ClustalW was used to generate multiple sequence alignments. The non-redundant protein sequence database was searched; high-scoring sequences were retrieved and placed into the alignment. In some instances, sequences were used as queries in BLASTP searches at NCBI to discover their degree of similarity, and in the PlantsP database (http://plantsp.sdsc.edu) to discover the standard Arabidopsis identification number (TIGR version 2.0). Alignments were done with the entire protein sequences or after partitioning into distinct domains.
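The identity figures reported from such pairwise alignments can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length with '-' as the gap character, and it counts identity only over columns in which neither sequence has a gap. ClustalW may use a different denominator, so the function name `percent_identity` and this convention are illustrative assumptions, not the program's own definition.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are skipped,
    so the denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy aligned fragments, not real kinase sequences.
    print(percent_identity("ACDE-", "ACDFG"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different numbers for the same alignment, so identity values are comparable only when the same convention is used throughout.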
Assessment of the quality of genome project gene predictions
The nucleotide sequence corresponding to each protein was submitted to analysis by two gene prediction programs: GENSCAN (Burge and Karlin, 1997) and GENMARK (Lukashin and Borodovsky, 1998). The amino acid sequence predicted by these gene-finding programs was used as a query in a BLASTP search of the non-redundant database. The annotated sequence appeared as a high-scoring hit, and the alignment of the two sequences was compared. Discrepancies, such as amino acid residues present in one sequence but missing in the other, were investigated further by using nucleotide sequences present in the NCBI-EST Arabidopsis database as an experimental reference. The predicted nucleic acid coding sequence was used as a query in a BLASTN search of the EST database. Strong hits to Arabidopsis sequences were examined further. The EST sequence was then translated, and its amino acid sequence was aligned with both the annotated amino acid sequence and the predicted amino acid sequence. The errors in the annotated protein sequences were corrected, and the corrected sequences were used for phylogenetic analysis.
Phylogenetic tree inference
Multiple sequence alignments constructed as described above were subjected to “bootstrap resampling”. In brief, this entails randomly removing columns of data in the multiple sequence alignment and replacing them with replicated columns from elsewhere in the alignment, so that the alignment size is not altered. These bootstrap replicate alignments were then used to construct phylogenetic trees by the neighbor-joining method (Saitou and Nei, 1987). Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 2.1 software (http://www.megasoftware.net/; Kumar et al., 2001). The phylogenetic trees were visualized using TREEVIEW (Page, 1996). The results were analyzed using the bootstrap method (1000 replicates) to provide confidence levels for the tree topology (Felsenstein, 1996). “Consensus” trees summarizing the topologies found among the bootstrap replicate trees are presented.
The tree topology presented is that generated by neighbor-joining, with bootstrap support at critical nodes indicated as percentages.
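The resampling step described above can be sketched as follows. This is an illustrative sketch, not the MEGA implementation: it draws alignment columns with replacement so that each replicate has the same dimensions as the original, and it computes support for a clade as the percentage of replicates in which that clade appears. The function names and the representation of a tree as a set of clades (each a frozenset of taxon names) are assumptions made for the example; tree building itself is left out.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment: list of equal-length strings (one per sequence).
    Columns are sampled with replacement, so the replicate has the
    same number of rows and columns as the input.
    """
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[j] for j in picks) for row in alignment]


def support_percent(clade, replicate_clades):
    """Percentage of replicate trees that contain the given clade.

    replicate_clades: one set of frozensets per replicate tree
    (a hypothetical, simplified tree representation).
    """
    clade = frozenset(clade)
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for a reproducible illustration
    toy = ["ACDEFG", "ACDEYG", "TCDQFG"]  # toy alignment, not real data
    for _ in range(3):
        print(bootstrap_columns(toy, rng))
```

In a full analysis each replicate would be passed to a tree-building method such as neighbor-joining, and `support_percent` would be evaluated for every clade of the reported tree.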
Examination of conserved protein domains
Conserved protein domains were examined by using the CDD at NCBI (http://www.ncbi.nih.gov/Structure/cdd/cdd.shtml) in conjunction with BLAST searches and the Pfam database (Bateman et al., 2000).
Computer modeling studies
Modeling of the structure of STY family protein kinases was based on the X-ray structures that produced the best E value when the protein was searched with BLAST against the Protein Data Bank (PDB) database, using the SwissPDB Viewer package (Guex and Peitsch, 1997; http://www.expasy.ch/swissmod/SWISS-MODEL.html). The molecular modeling method used was ProMod II (Guex and Peitsch, 1997). Three-dimensional models of STY protein kinase family members were predicted using the coordinates of the templates shown in Table 3. The sequences of the kinase domains of the STY proteins were aligned with the sequences of the homologous tyrosine kinases using the advanced BLAST program (http://www.ncbi.nlm.nih.gov/BLAST/). Refinement of side chains and terminal chains was done using the Molecular Operating Environment (MOE) software package (Version 2001.01, Chemical Computing Group, Montreal, Canada). The generated model was then energy minimized in SYBYL using the AMBER force field (Weiner et al., 1984) by moving side chains alone, to relieve short contacts at the interprotomer interfaces. The quality of the three-dimensional models was evaluated using PROCHECK and Prosa II version 3.0 (Sippl, 1993). The models were improved by an iterative sequence-structure alignment procedure, finally yielding the sequence alignment between the STY protein kinase domain and the homologous structures. Three-dimensional models were visualized with RasMol (Sayle and Milner-White, 1995).
Gene expression analysis
The Genevestigator online search tool Meta-Analyzer (http://www.genevestigator.ethz.ch; Zimmermann et al., 2004) was used to retrieve microarray expression data. The analysis is reported as a heat map in which a color spectrum defines the relative expression of each gene.
Plant growth
Arabidopsis thaliana, ecotype Columbia, was used in this study. Seeds were surface sterilized for
5 min in 5% (w/v) sodium hypochlorite, 0.01% (v/v) Tween 20, rinsed five times with sterile water and plated on half-strength Murashige and Skoog medium (Murashige and Skoog, 1962). To ensure homogeneous germination, plated seeds were kept at 4 °C for 3 d and then transferred to a controlled environment cabinet at 23 °C for germination under a 16-h light/8-h dark cycle. In all our experiments we used 23- to 30-day-old plants.
Subcloning and expression of STY13 (At2g24360) in Escherichia coli
The open reading frame of the STY13 protein kinase was subcloned into the pET21d expression vector (Novagen, U.S.A.) at the EcoRI and SalI restriction sites. The construct was expressed in Escherichia coli BL21 (Rosette) by inducing with 0.5 mM IPTG for 4 h. The recombinant protein was used for the assays.
In vitro kinase assays
The autophosphorylation assay was performed by incubating the proteins in 50 µl of reaction mixture (50 mM Tris–HCl, pH 7.5 and 10 mM MgCl2) in the presence of 25 µM [γ-32P]ATP (3000 dpm pmol⁻¹) at 30 °C for the indicated time intervals. The substrate phosphorylation assay mixture contained 10 µg of Histone III-S (Sigma) as exogenous substrate in addition to the autophosphorylation reaction components. The reaction was stopped by the addition of SDS-PAGE loading buffer, the phosphorylated products were separated by SDS-PAGE, and the labeled proteins were detected by autoradiography.
Alkali treatment
Proteins in the soluble fraction (100 µg) were phosphorylated in vitro in the presence of 10 µCi nmol⁻¹ of [γ-32P]ATP for 15 min, and the reaction was stopped with SDS-PAGE loading buffer and heated for 5 min at 60 °C. Samples were analyzed by 10% (w/v) SDS-PAGE and proteins were blotted onto a PVDF membrane at 50 V for 5 h. Membranes were divided into two identical fractions that contained the same samples. One membrane was incubated in 1 M KOH at 56 °C for 90 min, and the control membrane was maintained in water under the same conditions as the KOH-treated membrane.
Membranes were independently washed twice (5 min each time) in 100 ml (each time) of buffer (10 mM Tris–HCl, pH 7.4, and 150 mM NaCl), followed by two washes in 1 M Tris–HCl, pH 7.0, and distilled water. Membranes were dried and exposed to autoradiography for 48 h at −80 °C.
Assay of tyrosine phosphorylation
The soluble proteins from Arabidopsis were run on 10% SDS-PAGE, transferred to nitrocellulose and incubated with anti-p-Tyr monoclonal antibody (Sigma Chemical Company). After three washes with PBS containing 0.05% Tween 20, the filters were incubated with peroxidase-coupled secondary antibodies and developed using 3,3′-diaminobenzidine as a chromogenic substrate.
Phosphoamino acid analysis
The purified STY protein kinase was labeled in vitro with [γ-32P]ATP as described above and electroblotted onto a PVDF membrane. After autoradiography, radioactive bands of interest were excised and hydrolyzed in 200 µl of 6 N HCl for 2 h at 110 °C. The hydrolysate was dried in a Speed-Vac concentrator and resuspended in 20 µl of water containing 1 mg ml⁻¹ of each of the phosphoamino acid markers phosphoserine, phosphothreonine and phosphotyrosine (Sigma). Two microliters of the hydrolysate were analyzed by ascending silica thin-layer chromatography (Merck) using a solvent system containing a mixture of ethanol and ammonia (3.5:1.6, v/v) (Munoz and Marshall, 1990). The position of the phosphoamino acid markers was detected by ninhydrin staining of the silica-TLC plate (0.25% ninhydrin in acetone). The plate was then exposed for autoradiography to locate the position of the 32P-labeled amino acids.
In-gel proteolytic digestion, mass spectrometry, database searching and sequence analysis
Gels were stained using Coomassie brilliant blue R-250 or silver nitrate.
Proteins were excised from the gel, diced finely, washed in 100 mM ammonium bicarbonate (NH4HCO3), dehydrated in acetonitrile (ACN), and dried in a vacuum centrifuge (SpeedVac Plus SC110A; Savant) as described (Shevchenko et al., 1996). A minimal volume of 100 mM NH4HCO3 containing 10 mM DTT was added, and the gel fragments were incubated for 60 min at 56 °C and then for 30 min in the dark in 55 mM iodoacetamide in 100 mM NH4HCO3. The gel pieces were washed with 100 mM NH4HCO3, dehydrated in ACN, and dried completely in the vacuum centrifuge. The gel pieces were reswollen in 50 mM NH4HCO3 containing trypsin (sequencing grade, modified; 13 ng/µl; Promega) and incubated at 37 °C overnight. The peptides were then extracted, dissolved in 20 µl of 5% (v/v) formic acid and applied to the MALDI-TOF target plate by the dried droplet method using α-cyano-4-hydroxycinnamic acid as matrix. The mass spectra were obtained using a MALDI-TOF mass spectrometer (REFLEX II from Bruker Daltonics and Voyager-DE-STR from Perseptive Biosystems Inc.). The spectra were annotated with the program “m/z” from Proteometrics and internally calibrated using either trypsin autodigestion peptides (842.51 Da and 2211.11 Da) or ACTH (18–39). The latest versions of the NCBI non-redundant database were searched with the resulting peptide mass lists, using the search engines MASCOT (http://www.matrixscience.com) and ProFound (http://prowl.rockefeller.edu/cgi-bin/ProFound).
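The principle behind peptide mass fingerprinting can be illustrated with a minimal sketch: a candidate protein is digested in silico with trypsin, the singly protonated monoisotopic mass of each peptide is computed, and observed peaks are matched within a tolerance. The sketch is not the MASCOT or ProFound scoring algorithm. The cleavage rule (after K or R, not before P), the 100 ppm tolerance, the default carbamidomethylation of cysteine (reflecting the iodoacetamide step above) and all function names are assumptions made for illustration; the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056      # H2O added to the residue sum of a free peptide
PROTON = 1.00728      # charge carrier for [M+H]+
CAM = 57.02146        # carbamidomethylation of Cys by iodoacetamide
PHOSPHO = 79.96633    # HPO3 added per phosphorylated residue


def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def mh_plus(peptide, n_phospho=0, alkylated=True):
    """Monoisotopic [M+H]+ of a peptide, optionally phosphorylated."""
    mass = sum(MONO[aa] for aa in peptide) + WATER + PROTON
    if alkylated:
        mass += CAM * peptide.count("C")
    return mass + PHOSPHO * n_phospho


def match_peaks(observed, peptides, tol_ppm=100.0):
    """Pair each observed m/z with peptides whose [M+H]+ lies within tol_ppm."""
    matches = []
    for mz in observed:
        for pep in peptides:
            theo = mh_plus(pep)
            if abs(mz - theo) / theo * 1e6 <= tol_ppm:
                matches.append((mz, pep))
    return matches


if __name__ == "__main__":
    peptides = tryptic_digest("MKAARPLK")  # toy sequence, not a real protein
    for pep in peptides:
        print(pep, round(mh_plus(pep), 4))
```

A phosphorylated peptide would appear about 79.966 Da above its unmodified mass, which is how `n_phospho` enters the calculation; real search engines additionally consider missed cleavages, other modifications and a statistical score.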
Results and discussion
Structural, functional and evolutionary relationship of serine/threonine/tyrosine kinase family from Arabidopsis
Consensus serine/threonine and tyrosine kinase motifs in protein kinases are depicted in Table 1. Out of the 11 sub-domains present in protein kinases, sub-domain VI confers serine/threonine specificity, and sub-domains VIII and XI confer tyrosine kinase specificity. The consensus CW(X)6RPXF of sub-domain XI is conserved among biochemically characterized tyrosine kinases from mammals, fruitfly and Dictyostelium.
The protein tyrosine kinases FES (Katzen et al., 1991), LYN (Rider et al., 1994; Yi et al., 1991), CSK (Cloutier and Veillette, 1996; Shekhtman et al., 2001), growth factor receptors (Yarden et al., 1986; Holtrich et al., 1991), Janus kinases (Takahashi and Shirasawa, 1994), guanylyl cyclase (Yang et al., 1995; Garg et al., 2002), KIN (Morgan and Greenwald, 1993), and mixed lineage kinase (Liu et al., 2000) from human, mouse, rat, fruitfly and worm have the motif CW(X)6RPXF conserved in sub-domain XI. Dictyostelium protein tyrosine kinases (Tan and Spudich, 1990; Nuckolls et al., 1996) also retain this motif in sub-domain XI. Having confirmed that all the characterized tyrosine kinases have the CW(X)6RPXF sequence motif conserved in sub-domain XI, we performed an extensive search of all the available databases to obtain plant sequences containing this motif. Repetitive database searches with the CW(X)6RPXF sequence motif revealed a total of 80 Arabidopsis protein kinases. After individual protein sequences were identified, we eliminated duplications and contaminating sequences through careful comparisons with theoretical cDNA and genomic DNA sequences in the AGI genomic database. In this manner, the BLAST search revealed a total of 57 catalogued A. thaliana protein kinases for which the complete deduced catalytic domain sequences are available (Table 2). An AGI gene code was not available for six protein sequences (STY47–51 and STY57), and these sequences are designated NA (non-annotated). Myosin light chain protein (gi: 9802560) was identified as a false positive with this motif. We also performed extensive database analysis in TAIR, TIGR, MIPS and PlantsP. Table 2 lists the AGI gene code, PlantsP gene code, BAC, expression and molecular weight of the predicted Arabidopsis STY protein kinases. All 57 protein sequences have been verified to have 11 kinase sub-domains.
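A motif scan of this kind can be sketched with a regular expression. In the example below the consensus CW(X)6RPXF is written as `CW.{6}RP.F`, and records whose sequence is an exact duplicate of one already seen are skipped. The sketch makes several assumptions: the record names and sequences are toy data, exact string identity stands in for the more careful cDNA and genomic comparisons described above, and the search is not restricted to sub-domain XI.

```python
import re

# CW(X)6RPXF: C, W, any six residues, R, P, any residue, F.
TYR_XI = re.compile(r"CW.{6}RP.F")


def find_motif_hits(records):
    """Return {name: start_index} for unique sequences containing the motif.

    records: iterable of (name, sequence) pairs.
    Sequences identical to an earlier record are treated as duplicates.
    """
    seen = set()
    hits = {}
    for name, seq in records:
        seq = seq.upper()
        if seq in seen:
            continue  # same sequence already catalogued under another name
        seen.add(seq)
        match = TYR_XI.search(seq)
        if match:
            hits[name] = match.start()
    return hits


if __name__ == "__main__":
    toy_records = [  # hypothetical sequences for illustration only
        ("toyA", "AAACWABCDEFRPGFAAA"),
        ("toyA_dup", "AAACWABCDEFRPGFAAA"),
        ("toyB", "AAACWABCDERPGFAAA"),   # only five residues after CW
        ("toyC", "CWQQQQQQRPLF"),
    ]
    print(find_motif_hits(toy_records))
```

Because a short pattern can match outside the kinase domain, hits from such a scan still need manual inspection, as the myosin light chain false positive illustrates.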
Sequence annotations were subjected to the quality assessment procedure as detailed in
Table 1. Serine/threonine and tyrosine protein kinase motifs.
Protein kinase | VI B | VIII | XI
Ser/Thr kinase | DLKXXN | G(T/S)XX(Y/F)XAPE | XX(X)6RXXX
Ser/Thr/Tyr kinase | DLKSDN | GTYRWMAPE | CW(X)6RPXF
Tyr kinase | DL(R/A)A(A/R)N | XP(I/V)(K/R)W(T/M)APE | CW(X)6RPXF
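The motif combinations in Table 1 can be turned into a simple decision rule, sketched below under stated assumptions. The helper `consensus_to_regex` converts the table's notation into regular expressions, where X is any residue, (X)6 is six arbitrary residues and (A/B) is a choice. The function `classify` then labels a sequence by which sub-domain VI B and XI motifs it contains. Both function names, the rule ordering and the toy sequences are hypothetical, and the search is not anchored to the actual sub-domain positions, so this is an illustration of the logic rather than the procedure used for Table 2.

```python
import re


def consensus_to_regex(motif):
    """Convert consensus notation such as CW(X)6RPXF into a regex string."""
    out = re.sub(r"\(X\)(\d+)", r".{\1}", motif)           # (X)6  -> .{6}
    out = re.sub(r"\(([A-Z](?:/[A-Z])+)\)",                 # (T/S) -> [TS]
                 lambda m: "[" + m.group(1).replace("/", "") + "]", out)
    return out.replace("X", ".")                            # X     -> .


# Motifs taken from Table 1 (sub-domains VI B and XI).
REGEX = {
    "VIB_ser_thr": re.compile(consensus_to_regex("DLKXXN")),
    "VIB_tyr": re.compile(consensus_to_regex("DL(R/A)A(A/R)N")),
    "XI_tyr": re.compile(consensus_to_regex("CW(X)6RPXF")),
}


def classify(seq):
    """Label a sequence by the motif combination it carries (simplified rule)."""
    has = {name: bool(rx.search(seq)) for name, rx in REGEX.items()}
    if has["VIB_tyr"]:
        return "Tyr kinase"
    if has["VIB_ser_thr"] and has["XI_tyr"]:
        return "Ser/Thr/Tyr kinase"
    if has["VIB_ser_thr"]:
        return "Ser/Thr kinase"
    return "unclassified"


if __name__ == "__main__":
    toy = {  # hypothetical fragments, not real kinase sequences
        "toy_sty": "MMDLKSDNAAAGTYRWMAPEAAACWSSDPEKRPSFAA",
        "toy_st": "MMDLKPENAAAGTGSYLAPEAAA",
        "toy_y": "MMDLAARNAAAPIKWTAPEAAACWAADPEKRPTFAA",
    }
    for name, seq in toy.items():
        print(name, classify(seq))
```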
Table 2. Characteristics of predicted Arabidopsis Serine/Threonine/Tyrosine protein kinases.
STY1 STY2 STY3 STY4 STY5 STY6 STY7 STY8 STY9 STY10 STY11 STY12 STY13 STY14 STY15 STY16 STY17 STY18 STY19 STY20 STY21 STY22 STY23 STY24 STY25 STY26 STY27 STY28 STY29 STY30 STY31 STY32 STY33 STY34 STY35 STY36 STY37 STY38 STY39 STY40 STY41 STY42 STY43 STY44 STY45 STY46 STY47 STY48 STY49 STY50 STY51 STY52 STY53 STY54 STY55
GenBank GI | Accession | AGI Gene (a) | PlantsP (b) | Genomic locus (c) | EST (d) | Mol Wt.
15232679 15226883 22326737 15232197 15241070 15223025 15219796 7488198 15219183 18390931 22329643 18415205 8400528 15219417 15240070 15235845 22329194 15237684 15219517 15240630 15232181 15237443 18411024 15230755 15242791 15230296 8408889 15230753 15225139 15232680 15233846 18424175 15230168 7940280 15238163 15242848 18408017 22327668 7488198 9719730 18416060 15224375 15223409 15220773 15230295 18420244 12321912 22652534 7485274 25518342 2505884 15240597 15233574 15232131 15229398
At3g46920 At2g35050 At5g11850 At3g27560 At5g01850 At1g14000 At1g04700 At2g17700 At1g16270 At1g08720 At1g18160 At4g18950 At2g24360 At1g79570 At5g66710 At4g31170 At4g35780 At5g58520 At1g73660 At5g50180 At3g59830 At5g40540 At3g58760 At3g06640 At5g57610 At3g50730 At1g67890 At3g06620 At2g31800 At3g46930 At4g24480 At5g58950 At3g24720 At1g62400 At5g41730 At5g03730 At1g64300 At5g49470 At1g08735 At1g18160 At4g23050 At2g43850 At1g01450 At1g62400 At3g50720 At4g38470 NA NA NA NA NA At5g50000 At4g14780 At3g01490 At3g63260
NP_190276 NP_181050 NP_196746 NP_189393 NP_195805 NP_172853 NP_171964 NP_179361 NP_173077 NP_563824 NP_173254 NP_567568 NP_565568 NP_178075 NP_201472 NP_194846 NP_195303 NP_200660 NP_177507 NP_199829 NP_191542 NP_198870 NP_567074 NP_187316 NP_200569 NP_190642 NP_564913 NP_187314 NP_180739 NP_190277 NP_194179 NP_568893 NP_189116 AAF70839 NP_198988 NP_195993 NP_564829 NP_199758 T00726 AAF97832 NP_567676 NP_181913 NP_171652 NP_176430 NP_190641 NP_568041 AAG50991 AAN03743 T08864 B86145 CAA73313 NP_199811 NP_193214 NP_186798 NP_191885
21747 21586 21971 21214 21942 – 21068 – 21431 21410 21128 21836 21646 21429 22062 21896 21916 21343 21462 22020 21266 21998 – 21998 22036 21759 21477 21691 21523 21748 21870 21346 21224 21504 21327 21949 21034 22014 – 21128 21849 21516 21120 21504 21758 21920 – – – – – 22019 21929 21929 21789
T6H20.50 F19I3.28 F14F18.20 MMJ24.11 T20L15.120 F7A19.9 T1G11.5 T17A5.2 F3O9.7 F22O13.20 T10O22.13 F13C5.120 T28I24.9 T8K14.1 MSN2.10 F6E21.90 F4B14.1 MQJ2.14 F25P22.8 K6A12.4 F24G16.100 MNF13.60 T20N10.110 F5E6.3 MUA2.19 F18B3.10 T23K23.26 F5E6.5 F20M17.16 T6H20.40 K19M22.20 K19M22.20 K7P8.1 F24O1.13 MUF8.1 F17C15.150 F15H21.13 K7J8.16 F22O13.21 T10F20.16 F7H19.240 F18O19.4 F22L4.1 F18O19.4 T3A5.100 F20M13.30 T8E24.12 – – T22A6.310 – K9P8.18 FCAALL.308 F4P13.4 F16M2.110
2 6 1 2 3 2 1 1 7 6 3 2 10 15 – 6 1 3 4 2 – 5 4 – 2 – 7 2 4 1 9 9 – 3 – 3 2 1 – 5 6 3 2 3 3 – – – – 3 – 2 1 3
129010.36 139704.80 97881.60 40096.14 37588.54 49335.24 118006.22 61509.08 126780.34 103588.24 108159.84 52617.71 46001.55 137242.56 46011.83 46083.57 64696.27 66226.33 112205.46 38969.04 53894.71 39660.74 54020.62 85593.35 117444.16 42327.97 85217.42 86054.92 55314.60 53822.80 58856.24 58856.24 33294.33 39189.22 81868.60 90305.89 82612.38 54547.91
Remarks
MAP3K delta-1-like PK PB1 domain/GmPK6-like PK MAP3K delta-1-like PK ATN1-like PK ATN-like PK Ankyrin repeat domain PK Similarity to GmPK6 Peanut STY-like PK PB1/GmPK6-like PK MAP3K delta-1/EDR1-like PK MAP3K/CTR1-like PK Ankyrin repeat domain PK Peanut STY-like PK Putative PK ATN1-like PK Peanut STY-like PK Peanut STY-like PK ATN-1like PK CTR1-like PK ATN1-like PK ATN1-like PK ATN1-like PK Ankyrin repeat domain kinase CTR1 related PK Putative PK ATN1-like PK MAP3K delta-1 PK CTR1-like PK Ankyrin repeat domain PK GmPK6-like PK GmPK6-like PK GmPK6-like PK Putative PK GmPK6-like PK Light sensory PK Protein kinase CTR1 Light sensory PK Ankyrin repeat domain PK EDR1-like PK 108159.84 CTR1-/ MAP3K delta-1-like PK MAP-3K-like PK 53216.06 Ankyrin repeat domain PK 53117.53 Light sensory PK 56640 GmPK6-like PK 43103.98 ATN-1-like PK 64811.26 Peanut STY-like PK 85470 CTR1/EDR1 related PK – Ankyrin repeat domain PK 39533 Peanut STY-like PK 107994.21 Tomato PK CTR1 Light sensory PK 42728.95 ATMRK1 related PK 40724.66 ATMRK1 related PK 46049.31 ATMRK1 related PK 42582.45 ATMRK1 related PK
Table 2. (Continued)
Name | GenBank GI | Accession | AGI Gene (a) | PlantsP (b) | Genomic locus (c) | EST (d) | Mol Wt. | Remarks
STY56 | 18403507 | NP_566716 | At3g22750 | 21206 | MWI23.12 | 5 | 42714.76 | ATMRK1 related PK
STY57 | 21554375 | AAM63482 | | | | | 391 aa | ATMRK1 PK
(a) Systematic designation given to the gene by the Arabidopsis Genome Initiative (2000).
(b) The PlantsP plant phosphorylation database (Gribskov et al., 2001) identification number is shown for all plant kinase sequences.
(c) Designation of gene locus on annotated bacterial artificial chromosome. NA denotes the STY kinases that are not annotated in the AGI database.
(d) Gene expression status is based on cloned cDNA or presence of a cognate EST or full-length cDNA.
“Methods”. For several kinases, the protein sequence predicted from genomic DNA by GENSCAN differed from the annotated sequence in the public database. STY7, STY11 and STY36 had two additional amino acids towards the N-terminus in the annotated sequence with respect to the predicted sequence in the GenBank. STY12 and STY19 revealed 12 and 32 additional amino acids at the N-terminus, respectively, as compared with the protein sequence predicted by GENSCAN. STY25 had 53 additional amino acids at the C-terminus in the annotated sequence with respect to the GENSCAN-predicted sequence. STY38 had an additional 287 amino acids at the N-terminus in the GENSCAN-predicted sequence compared with the annotated sequence. However, the annotated sequences were found to be correct for STY7, STY11, STY25 and STY36. The 32 N-terminal amino acids in the annotated STY19 sequence were not predicted by GENSCAN, and no EST counterpart was available in the database for this region. For the additional 287 N-terminal amino acids of the GENSCAN-predicted STY38 sequence, we found an EST match in the database (gi|8680157|dbj|AV520630.1|AV520630). The confirmed sequences were used for phylogenetic analyses. Multiple sequence alignment indicated that the catalytic domain has all 11 conserved sub-domains of the protein kinases (data not shown). Sub-domain XI has the consensus protein tyrosine kinase motif CW(X)6RPXF (Supplemental Figure S1). However, sub-domain VI of all the sequences showed the KXXN motif, indicative of serine/threonine specificity. None of the kinases had the tyrosine kinase consensus motif RAA or AAR in sub-domain VIB. Thus, the catalytic domains of all the catalogued Arabidopsis kinases
have motifs for serine/threonine specificity in sub-domain VIB and a tyrosine kinase motif in sub-domain XI. These results suggest that all the kinases belong to the dual-specificity kinase family. In the complete genome of Arabidopsis, we could not find protein kinases that confer tyrosine kinase specificity alone. Hence, we have tentatively named these protein sequences STY (serine/threonine/tyrosine) protein kinases. The characteristics of Arabidopsis STY protein kinases are depicted in Table 2. To examine the protein relationships of Arabidopsis STY protein kinases, a topographic cladogram was constructed. Casein kinase 1 (a serine/threonine kinase) was used as an outgroup to delineate the true class of the STY protein kinase family. To examine the relationship between genes in more detail, kinase domain sequences were used to determine the genetic distances and to construct phylogenetic trees. The complete protein sequences as well as the kinase catalytic domains of all 57 Arabidopsis kinases were used to construct dendrograms. We did not observe any significant difference between the clustering patterns of the dendrograms constructed with the entire protein sequences and with the kinase catalytic domains. A neighbor-joining tree was constructed with the full-length protein sequences following the alignments (Figure 1). The dendrogram of STY protein kinases suggests that the kinases are mainly clustered into four groups. Group I is further divided into four families: Family 1.1 (ATN1-like kinases), Family 1.2 (peanut STY-related kinases), Family 1.3 (soybean GmPK6-like kinases) and Family 1.4 (ATMRK1-like kinases). Group II mainly consists of MAP3K/CTR1/EDR1 protein kinases. However, family 2.1 of group II also contains GmPK6-like protein kinases. Group II is divided into three families: Family 2.1 (PB1 domain/GmPK6/EDR1/MAP3K-like
Figure 1. Phylogenetic analysis of predicted Arabidopsis STY kinase family. STY protein kinases from Arabidopsis were aligned by ClustalW program (Supplemental Figure S1). The tree and bootstrap analyses were performed using the neighbor-joining algorithm implemented in MEGA software (Kumar et al., 2001). Casein kinase 1 (protein serine/threonine kinase) was used as an outgroup to perceive the true class of serine/threonine/tyrosine kinase family. The numbers at the nodes of branches of the tree are bootstrap values. Bootstrap values from 1000 replicates were used to assess the robustness of the trees. Branch lengths are indicated at the branches. The lengths of the branches are proportional to the degree of divergence and thus correspond to the statistical significance of the phylogeny between the protein sequences. Family 1.1, 1.2 and 1.3, shaded in grey, have the consensus TYRWMAPE in the sub-domain VIII in addition to tyrosine kinase consensus CW(X)6RPXF. Arabidopsis Genome Initiative (AGI) genome codes are given for all the proteins. NA indicates the protein sequences that are not annotated in the AGI.
kinases), Family 2.2 (PAS domain/MAP3K/CTR1/EDR1-like kinases) and family 2.3 (MAP3K/CTR1/EDR1-like kinases). Group III consists of protein kinases containing ankyrin domain repeat motifs. These kinases are related to Medicago sativa ankyrin kinase, MsAPK1 (Chinchilla et al., 2003). Group IV includes light sensory kinases that are related to Ceratodon purpureus phytochrome/kinase. C. purpureus light sensory kinase has both phytochrome and protein kinase domains. However, the protein kinases of group IV do not have a phytochrome domain. A similar classification is found in the PlantsP database (Gribskov et al., 2001). That classification uses the entire sequence; sequences that share domains outside of the kinase catalytic domain should therefore cluster together before sequences that have only the catalytic domain in common. All plant protein kinases (1286) were filtered to remove those lacking a relatively complete (90%) protein kinase domain. Class II of the PlantsP kinase classification includes the ATN1/CTR1/EDR1/GmPK6-like protein kinases. PlantsP protein ID numbers are given in Table 2. STY8, STY23, STY39, STY47–52 and STY57 are not represented in the PlantsP classification. A similar kind of bioinformatic analysis revealed that less than 4% of Arabidopsis kinases are tyrosine-specific kinases (Carpi et al., 2002). Arabidopsis STY protein kinases are highly homologous to each other. Pair-wise analysis with the full protein sequences indicated the overall identities and similarities. High identities are found between STY21 and STY42 (81%), STY39 and STY41 (78%), STY7 and STY33 (78%), STY2 and STY33 (74%), STY20 and STY4 (72%), and STY9 and STY14 (71%); such high identities may indicate a similar function. The most divergent pairs in this family, each with 9% identity, are STY3 and STY35, STY3 and STY37, and STY19 and STY37. Interestingly, the kinases of group I, family 1.1 (ATN1 protein kinase family), family 1.2 (peanut STY-related kinases) and family 1.3 (soybean GmPK6-like kinases) have the tyrosine kinase consensus sequence RWMAPE in sub-domain VIII in addition to the CW(X)6RPXF consensus sequence in sub-domain XI, suggesting that these kinases are evolutionarily closer to the protein tyrosine kinase family. Arabidopsis ATN1 (Tregear et al., 1996) and soybean GmPK6 (Feng et al., 1993) are known dual-specificity kinases that have not been characterized further, and their function remains unknown. These genes belong to a multigene family. Peanut STY protein kinase has been shown to be involved in cold- and salt-stress responses and is developmentally regulated. It is known to autophosphorylate predominantly at tyrosine (Rudrabhatla and Rajasekharan, 2002, 2003). The genes of family 1.2 that are related to peanut STY protein kinase (STY8, STY13, STY16, STY17, STY18, STY46 and STY49) may have a role in cold and salt signaling. Family 1.4 protein kinases belong to the ATMRK1 family. ATMRK1 kinases are similar to mixed lineage kinases and Raf protein kinases (Ichimura et al., 1997). Group II mostly consists of mixed lineage kinases and CTR1/EDR1-related protein kinases. CTR1 protein kinases are most closely related to the Raf protein kinase family from animal systems. Protein kinase Raf is strategically located in the “RAS-MAP-kinase signal transduction pathway”. The family of Raf protein kinases is involved in cellular processes that regulate proliferation, differentiation and apoptosis (Yuryev and Wennogle, 1998).
EDR1 genes encode mixed lineage kinases similar to CTR1 and regulate defense responses in a wide range of crop species (Tang and Innes, 2002). Mixed lineage kinases have been shown to be involved in the defense response (Innes, 2001). Thus, the predicted STY protein kinases of group II may have a role in defense signaling. Alfalfa ankyrin kinase is involved in the osmotic stress response (Chinchilla et al., 2003).
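The pair-wise identities quoted above (for example 81% between STY21 and STY42) are the kind of figure that can be computed from two already-aligned sequences. The following is a minimal sketch only: the text does not state which identity definition or alignment program was used for these numbers, so the choice to ignore gapped columns is an assumption, and the two fragments are hypothetical toy strings rather than real STY sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    definitions (e.g. dividing by the full alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned fragments, for illustration only.
frag_1 = "HRDLKSSNLLV-DENW"
frag_2 = "HRDLKPENLLIDDKNW"
print(round(percent_identity(frag_1, frag_2), 1))
```

Applied to every pair of the 57 sequences, a function of this kind would yield an identity matrix from which the most similar and most divergent pairs can be read off.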
Figure 2. Schematic representation of the domains of the STY protein kinase family. The tyrosine kinase catalytic domain is shown for all the protein kinases. The other domains are as follows: ACT domain (gnl|CDD|17058), PB1 domain (gnl|CDD|3981), PAS domain (gnl|CDD|5369) and ankyrin repeat domain (gnl|CDD|14848).
Thus the kinases of group III that have the ankyrin domain repeat motif might have a role in the osmotic stress response. The protein kinases of group IV are similar to the Ceratodon purpureus phytochrome/kinase, suggesting a possible role in light sensing. These kinases are classified under family 2.1.1 of PlantsP. Family 2.1 kinases belong to the GmPK6/EDR1/MAP3K-related protein kinases that have a conserved PB1 (Phox and Bem1p; gnl|CDD|3981) domain. Similarly, the PAS domain is conserved in family 2.2 protein kinases.

Gene structure and subcellular distribution of Arabidopsis STY protein kinases

The high degree of functional diversification among the protein kinases is made possible by their ability to interact with large numbers of cellular proteins. These interactions are mediated through additional subunits or domains of the kinase that are regulatory or act as protein-interaction modules. The functions of these non-catalytic domains thus suggest the biological roles of the kinase and the specificity of the proteins or other ligands that bind to their specific domains. The protein tyrosine kinase catalytic domain is present in all 57 Arabidopsis STY protein kinases (Figure 2). In addition, other interesting domains are conserved among these kinases. A noteworthy feature of the clustering of the STY protein kinase family is that the members within each of the families tend to have similar functional domains. The functional domains conserved in STY family protein kinases include the PAS, PB1 and ACT domains and the ankyrin repeat motif (Figure 2). The PB1 (Phox and Bem1p; gnl|CDD|3981) domain is conserved in the predicted STY protein kinases of group II (family 2.1). This domain is present in many eukaryotic cytoplasmic signaling proteins (Zhou et al., 1995). The domain adopts a β-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is
necessary for PB1 domain function. The octicosapeptide repeat (OPR) motif is known to occur in an isoform of protein kinase C (PKC). This divalent cation binding domain suggests an influence of divalent cations in mitogenic signaling. The STY17 protein kinase of group I (family 1.2) has a conserved ACT domain (gnl|CDD|17058). ACT domains generally have a regulatory role. They are linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid, leading to regulation of the linked enzyme (Aravind and Koonin, 1999). The PAS domain (gnl|CDD|5369) is conserved among the group II (family 2.2) protein kinases that belong to the MAP3K group. PAS domains have been found to bind ligands and to act as sensors for light and oxygen in signal transduction (Ponting and Aravind, 1997). Group III STY protein kinases have ankyrin repeat motifs. Ankyrin repeats (gnl|CDD|14848) mediate protein–protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20. ANK repeats may occur in combination with other types of domains. The structural repeat unit contains two antiparallel helices and a β-hairpin; repeats are stacked in a superhelical arrangement; this alignment contains four consecutive repeats (Bennet and Chen, 2001). A total of 57 STY protein kinases are predicted in the Arabidopsis genome, of which 10 genes do not have ESTs in the database (Table 2). The hydropathy plots (Kyte and Doolittle, 1982) of all the predicted STY protein kinases suggested that they are localized in the cytosol (data not shown).

Chromosomal distribution of Arabidopsis STY protein kinases

Fifty-seven Arabidopsis STY protein kinases are distributed among all five chromosomes (Figure 3). The regions that contain no STY protein kinases are the short arms of chromosomes II and IV.
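The hydropathy analysis mentioned above (Kyte and Doolittle, 1982) can be sketched as a sliding-window average over the published per-residue hydropathy scale. This is an illustrative sketch, not the procedure actually used for the STY kinases: the window of 19 residues and the 1.6 cut-off are the commonly quoted convention for flagging candidate membrane-spanning segments and are assumptions here, as are the function names.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full-length window along the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    unknown = set(seq) - set(KD_SCALE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    scores = [KD_SCALE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def has_candidate_tm_segment(seq: str, window: int = 19,
                             threshold: float = 1.6) -> bool:
    """True if any window exceeds the (assumed) membrane-span threshold."""
    return any(v > threshold for v in hydropathy_profile(seq, window))


# Hypothetical toy sequences: a hydrophobic stretch versus a charged one.
print(has_candidate_tm_segment("L" * 25))
print(has_candidate_tm_segment("DEKR" * 6))
```

A sequence with no window above the threshold is consistent with, but does not prove, a soluble (for example cytosolic) protein; dedicated localization predictors would be needed for a firmer call.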
Chromosome I has 14 genes with 8 genes (STY6, STY7, STY9, STY10, STY11, STY39, STY40 and STY43) located on the upper arm of the chromosome within the region spanning less than 6 cM. All the genes except STY43 are organized in the same transcriptional orientation. The lower arm of the chromosome I has six genes (STY14, STY19, STY27, STY34, STY37 and
Figure 3. Distribution of STY protein kinases on Arabidopsis chromosomes. The genes encoding STY protein kinases are distributed among all five chromosomes. Chromosomes (I–V) are depicted by the vertical bars. The darkened circles indicate centromeres. Arrows next to the gene names show the direction of transcription. Horizontal bars indicate the location of each of the STY genes, and their physical position is given in centiMorgans.
STY44) distributed in a region of less than 7 cM. Four genes (STY27, STY34, STY37 and STY44) are in the same transcriptional orientation, and the remaining two genes, STY14 and STY19, are adjacent to the telomere in the opposite transcriptional orientation. Chromosome II has the fewest genes (STY2, STY8, STY13, STY29 and STY42). Chromosome III has 13 genes of this family, and one gene cluster is organized in tandem in the same transcriptional orientation (STY28 and STY24). Furthermore, STY24 and STY28 are present in the same sub-group, and pair-wise analysis with the full protein sequences indicated 74% sequence homology in this gene cluster. In addition, sequence homology also exists in the N-terminal non-kinase domain in this gene cluster. These results suggest that the two genes arose by gene duplication and that they may have similar or overlapping functions. Seven genes are located on the lower arm of chromosome III. STY26 and STY45 are organized in tandem in the same transcriptional orientation. STY26 and STY45 cluster together in family 1.1, suggesting that these genes arose by gene duplication. Seven genes (STY12, STY16, STY17, STY31, STY41, STY46 and STY53) are located on chromosome
IV. Chromosome V has 12 genes; 3 of them are located on the upper arm. STY5 is located at the end of the upper arm of chromosome V. The lower arm of chromosome V has the largest number of genes (nine STY protein kinase genes). STY25 and STY18 are present as one gene cluster in tandem in the same transcriptional orientation. STY15 is located at the end of the lower arm of chromosome V.

Homology modeling of STY family of protein kinases

Homology models of the Arabidopsis STY protein kinases were built using the templates that produced the best E value in a BLAST search of each protein against the Protein Data Bank database. Interestingly, the templates that had high homology with the STY protein kinases are the crystal structures of bona fide animal tyrosine kinases (Table 3). Three-dimensional models of the protein kinase domain of the major groups of the STY protein kinase family were generated using SWISS-MODEL (Guex and Peitsch, 1997) and visualized with RasMol (Sayle and Milner-White, 1995). The motif
Table 3. Details of the templates of crystal structures used for prediction of STY protein kinases.
STY, Family, PDB code (Å), Template description
STY5, Family 1.1
STY17, Family 1.2
STY32, Family 1.3
STY52, Family 1.4
STY1, Family 2.1
STY27, Family 2.2
STY3, Family 2.3
STY12, Group III
STY43, Group IV
1m52.pdb (2.6 A) 1opj.pdb (1.65 A) 1iep.pdb (2.1 A) 1FPU.pdb (2.4 A) 1IEP.pdb (2.1 A) 1k9A.pdb (2.5 A) 1qpc.pdb (1.6 A) 1qpe.pdb (2.00 A) 1qcf.pdb (2.00 A) 3lck.pdb (1.7 A) 1qjo.pdb (2.4 A) 1m52.pdb (2.6 A) 1opj.pdb (1.75 A) 1fpu.pdb (2.4 A) 1k2p.pdb (2.1 A) 1irk.pdb (2.1 A) 1qaq.pdb (2.7 A) 1k3a.pdb (2.1 A) 1m7n.pdb (2.7 A) 1opj.pdb (1.75 A) 1iep.pdb (2.1 A) 1m52.pdb (2.6 A) 1opj.pdb (1.75 A) 1iep.pdb (2.1 A) 1k2p.pdb (2.1 A) 1qjo.pdb (2.1 A) 1aqw.pdb (2.1 A) 1k9a.pdb (2.5 A) 1h26.pdb (2.24 A) 1m14.pdb (2.6 A) 1m17.pdb (2.6 A)
C-ABL kinase domain in complex with STI-571 Structural basis of auto inhibition of C-ABL kinase C-ABL kinase domain in complex with STI-571 ABL kinase domain in complex with small molecule inhibitor C-ABL kinase domain in complex with STI-571 Carboxy terminal SRC kinase Lymphocyte-specific kinase in complex with non-selective inhibitor Lymphocyte-specific kinase in complex with non-selective inhibitor HCK I complex with SRC kinase specific inhibitor LCK activated form (autophosphorylated on tyr394) FGFR2 tyrosine kinase domain C-ABL kinase domain in complex with PD173955 Structural basis of auto inhibition of C-ABL kinase C-ABL kinase domain in complex with small molecule inhibitor Burtons tyrosine kinase domain Insulin Receptor kinase insulin receptor kinase in couple with bisubstrate inhibitor Insulin-like growth factor receptor Unactivated apo Insulin-like growth factor receptor Structural basis of auto inhibition of C-ABL kinase C-ABL kinase domain in complex with STI-571 C-ABL kinase domain in complex with PD173955 Structural basis of auto inhibition of C-ABL kinase C-ABL kinase domain in complex with STI-571 Burtons tyrosine kinase domain Burtons tyrosine kinase domain Fibroblast Growth Factor Receptor 1 Carboxy terminal SRC kinase CDK2/cyclin in complex with 11-residue from p53 Epidermal growth factor receptor Epidermal growth factor receptor domain with 4-anilinoquinozoline with inhibitor erlotinib Human CDK2 kinase complex with CKSHS1 CDK2 complex with 11-residue recruitment peptide from retinoblastoma associated protein
1buh.pdb (2.6 A) 1h25.pdb
CW(X)6RPXF is conserved among the predicted STY protein kinase molecules and the templates used for prediction of the models (data not shown). The molecular models of STY protein kinase family members are predicted based on the structural alignment of α-helices and β-sheets. Three-dimensional models of STY protein kinase family members were predicted using the coordinates of the templates given in Table 3. All the predicted structures of STY protein kinase molecules resemble the typical structure of the protein kinase catalytic domain, consisting of a small lobe, which is involved in ATP binding and orientation, and a large lobe that provides sites for substrate recognition and catalysis (Morgan and De Bondt, 1994). The small lobe represents the N-terminus of the molecule and consists predominantly of β-strands, while the large lobe comprises the C-terminus of the molecule and includes several α-helices. The A-loop helix packs between the N and C lobes. A similar structure is conserved among all the groups of the STY protein kinase family (Figure 4).

Prediction of Plant STY protein kinases

Having studied the Arabidopsis STY protein kinases, we performed BLAST analysis using CW(X)6RPXF against all the available plant sequences in the database. We obtained 14 Dictyostelium protein kinases and 11 rice protein kinases. In addition, we could retrieve STY protein kinases from soybean, tomato, wheat, barley, alfalfa and beech. All the protein sequences
Figure 4. Three-dimensional model of STY protein kinases. Three-dimensional structure of the putative STY protein kinase family members was predicted with the Swiss-model program (Guex and Peitsch, 1997; http://www.expasy.ch/swissmod/SWISS-MODEL.html). The molecular modeling method used was ProMod II (Guex and Peitsch, 1997). Three-dimensional models of STY protein kinase family members were predicted using the coordinates of templates as shown in Table 3. Letters N and C indicate N- and C-termini of the structure, respectively. Arrows in blue indicate boundaries of the STY protein kinase activation domain (A-loop), which lies between conserved amino acids Asp and Glu. ATP-binding loop is shown in green. α-helices and β-strands are colored in magenta and yellow, respectively. Three-dimensional models are visualized by RasMol (Sayle and Milner-White, 1995).
contain conserved eleven sub-domains for protein kinases. Multiple sequence alignment revealed consensus serine/threonine kinase motif in sub-
domain VI and tyrosine kinase consensus CW(X)6RPXF in sub-domain XI (Figure 5). Arabidopsis STY protein kinases from each family
(Figure 1) were included along with the other plant species to construct the cladogram. The topographic cladogram of the predicted plant STY protein kinases is depicted in Figure 5. The dendrogram revealed that the plant STY protein kinases are clustered in a manner similar to that observed for Arabidopsis (Figure 5). Plant STY protein kinases are divided into nine main clusters: peanut STY-related kinases (cluster a), ATN1-like protein kinases (cluster b), GmPK6-like protein kinases (cluster c), ATMRK1-related kinases (cluster d), Dictyostelium SHK1-like kinases (cluster e), CTR1/EDR1-like kinases (cluster f), ankyrin domain repeat protein kinases (cluster g), Dictyostelium PTKs (cluster h) and Dictyostelium PI4K1-related kinases (cluster i). Dictyostelium RCK1 and STY35 were not clustered into any group. The protein sequences of clusters a, b and c have the conserved protein tyrosine kinase motif TYRWMAPE in sub-domain VIII in addition to CW(X)6RPXF in sub-domain XI. The PlantsP database has 668 rice protein kinase sequences (Gribskov et al., 2001). In order to understand the rice protein kinases in more detail, we searched the PlantsP database, which revealed a similar annotation of these kinases. The two rice kinases with larger branch lengths were analyzed in the PlantsP database. The rice kinase (gi: 13129438) was classified as a putative receptor-like kinase. Cluster a includes peanut STY-related protein kinases. Four rice protein kinases and one wheat protein kinase are included in cluster a. Peanut STY protein kinase is a well-known dual-specificity protein kinase that has been shown to autophosphorylate on tyrosine and to phosphorylate the substrate histone on threonine. This kinase belongs to a multigene family and is developmentally regulated in seeds. It has been shown to have a role in cold and salt signaling (Rudrabhatla and Rajasekharan, 2002). Cluster b includes one rice protein kinase related to the Arabidopsis ATN1 kinase.
ATN1 is a dual specificity protein kinase and belongs to multigene family (Tregear et al., 1996). Cluster c includes two beech kinases that are related to soybean GmPK6 protein kinase. GmPK6 is reported to have dual specificity, and is shown to belong to multigene family (Feng et al., 1993). Cluster d includes two rice protein kinases related to ATMRK1 protein kinase. ATMRK1 kinases are similar to mixed lineage kinases and
Raf-protein kinases (Ichimura et al., 1997). Cluster e includes Dictyostelium SHK1 kinases. Cluster f includes CTR1/EDR1-related protein kinases from rice, tomato, wheat and Arabidopsis. EDR1 gene encodes a putative MAP kinase kinase kinase similar to CTR1, a negative regulator of ethylene responses in Arabidopsis (Frye et al., 2001). Putative orthologs of EDR1 are present in monocots such as rice and barley, indicating that EDR1 may regulate defense responses in a wide range of crop species. Function of Raf-like MAPKKKs is yet to be discovered (Mizoguchi et al., 1996). Cluster g includes two rice protein kinases with ankyrin repeat domain and alfalfa ankyrin kinase (Chinchilla et al., 2003). Cluster h includes Dictyostelium protein tyrosine kinases. Dictyostelium protein kinases are developmentally regulated (Tan and Spudich, 1990; Nuckolls et al., 1996). Cluster i includes slime mold PI-4K kinase and rice protein kinase (gi|28829504). Dictyostelium Pats1 (gi|27466900) and rice protein kinase (gi|28812101) were not clustered with any of the above members. Rice kinase (gi|131294438) was also not clustered with any of the above members. Based on the PlantsP database we annotated it as putative receptor like kinase. Relationship of the tyrosine kinases between plants and animals The protein tyrosine kinase phosphorylation motif CW(X)6RPXF was used to perform BLAST analysis against all the known animal sequences. A very large number of protein kinases belonging to protein tyrosine kinase family were obtained from rat, mice, human, fruit fly and worm. Nonreceptor tyrosine kinases (LYN, FYN, SRC, FES, YES) and receptor tyrosine kinases (RET, ROS, RYK, KIN IRK) have the consensus CW(X)6RPXF motif. In addition, guanylate cyclase, janus kinases and mixed lineage kinases also have the conserved CW(X)6RPXF motif. 
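The consensus motifs used in this survey translate naturally into pattern searches. The sketch below shows one way to scan a protein sequence for CW(X)6RPXF (sub-domain XI) and RWMAPE (sub-domain VIII) with regular expressions, reading X as "any residue". It is an illustration only: the searches described in the text were run with BLAST, not with regular expressions, and the toy sequence and function name are hypothetical.

```python
import re

# X in the consensus is taken to mean any single amino acid (assumption:
# sequences are written in one-letter upper-case code).
MOTIFS = {
    "CW(X)6RPXF": re.compile(r"CW[A-Z]{6}RP[A-Z]F"),
    "RWMAPE": re.compile(r"RWMAPE"),
}


def find_motifs(seq: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each motif in the sequence."""
    seq = seq.upper()
    return {
        name: [m.start() + 1 for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy fragment carrying both motifs; not a real kinase sequence.
toy = "MKTYRWMAPEAAGGCWQADPSKRPSFLLK"
print(find_motifs(toy))
```

An exact pattern match is stricter than a BLAST hit, so a scan like this is best treated as a quick check on candidate sequences rather than as a replacement for the similarity search.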
As large numbers of protein tyrosine kinases are available in animals, we have used representative kinases of each family from all these organisms towards the construction of phylogeny tree. The multiple sequence alignment revealed the presence of consensus tyrosine kinase motif CW(X)6RPXF in all the animal sequences. Tyrosine kinase affinities were identified in the sub-domain V, VIII and IX. However, the main difference between the
Figure 5. Phylogenetic analysis of predicted plant STY protein kinase family. STY protein kinases from all plant species were aligned by ClustalW program. The tree and bootstrap analyses were performed using the neighbor joining algorithm implemented in MEGA software. The numbers at the nodes of branches of the tree are bootstrap values. Bootstrap values from 1000 replicates were used to assess the robustness of the trees. The lengths of the branches are proportional to the degree of divergence and thus correspond to the statistical significance of the phylogeny between the protein sequences. Clusters a, b and c, shaded in grey, have the consensus TYRWMAPE in the sub-domain VIII in addition to protein tyrosine kinase consensus CW(X)6RPXF.
plant and animal tyrosine kinases is in the consensus motif conferring the tyrosine and serine/ threonine specificity in the sub-domain VIB (Supplemental Figure S2). Multiple sequence alignment revealed that the sub-domain VIB has tyrosine kinase consensus ARR/RAA in animals and serine/threonine consensus KXXN in plants (Supplemental Figure S2). None of the plant sequences had protein tyrosine kinase motif (ARR/RAA) in
sub-domain VIB, further confirming the finding that plants lack typical protein tyrosine kinases that are found in animals. However, Ras and Raf family of kinases have consensus serine/threonine motif KXXN conserved in sub-domain VI (Supplemental Figure S2). To determine the relationship between plant STY protein kinases and animal tyrosine kinases a dendrogram was constructed with a few representative sequences (Figure 6). Dendrogram is mainly divided into eight groups. More elaborate details of clustering pattern of genes and bootstrap values are provided in the Supplemental Figure S3. Group I consists of non-receptor tyrosine kinases such as LYN, SRC, FYN, ABL and FES from rat, human, worm and mice (Goddard et al., 1986; Katzen et al., 1991; Okada et al., 1991; Yi et al., 1991; Rider et al., 1994; Lamers et al., 1999; McCafferty et al., 2002). Group II consists of receptor tyrosine kinases RET, ROS, RYK, KIN (Wilks and Kurban, 1988; Morgan and Greenwald, 1993), and growth factor receptors from
Figure 6. Relationship of STY protein kinases between plants and animals. Few representative STY protein kinases from plants and animals were aligned by ClustalW program. The gene names, their corresponding numbers and bootstrap values are represented in Supplemental Figure S3. Group I – Non-receptor tyrosine kinases, Group II – Receptor tyrosine kinases, Group III – Ras/Raf related protein kinases, Group IV – Plant STY protein kinases, Group V – janus kinases, Group VI – Mixed lineage kinases, Group VII – Guanylate cyclase and Group VIII – Ankyrin/TNFRSF kinases.
human, mice and fruit fly (Yarden et al., 1986; Chan et al., 1988; Yee et al., 1993; Binari and Perrimon, 1994). Group III consists of Ras/Raf-related protein kinases from mice, rat and worm (Lazar et al., 2002; Roy and Therrien, 2002). Interestingly, Ras/Raf family protein kinases have the KXXN motif in sub-domain VIB, similar to plant protein kinases. Group IV consists entirely of plant STY protein kinases. Interestingly, there are no animal sequences between any of the plant kinases of this group, suggesting that the predicted plant STY protein kinases form a well-supported clade. Group V consists of JAK protein kinases from mice, fruit fly and rat (Takahashi and Shirasawa, 1994; Frank et al., 2002). Group VI consists of mixed lineage kinases from Drosophila (Liu et al., 2000; Gotoh et al., 2001). MLKs are involved in differentiation, proliferation and stress response. Group VII includes guanylate cyclases from worm, mice, fruit fly, human and rat (Yang et al., 1995; Garg et al., 2002). Group VIII includes ankyrin protein kinases and TNFRSF kinases from human, rat, fruit fly and worm (Stanger et al., 1995; Liu et al., 2000; Gotoh et al., 2001).

Expression analysis of STY protein kinases

The Genevestigator online search tool Meta-Analyzer was used to analyze the Arabidopsis Affymetrix microarray chip experiments (http://www.genevestigator.ethz.ch; Zimmermann et al., 2004). Genes are grouped based on their relative expression during a specific growth stage, in a particular organ or following an environmental stimulus. The experiments are described in detail on the Genevestigator website. The expression profile of STY protein kinases is summarized in Table 4, and the corresponding heat maps and color codes are provided in Supplemental Figures S4, S5 and S6. Microarray data are available for 47 STY protein kinases.
Twelve kinases are highly expressed in stamen, out of which 9 (STY7, STY21, STY25, STY32, STY33, STY35, STY37, STY53 and STY56) are expressed exclusively in stamen. The expression of STY14, STY41 and STY45 is seen only in seeds. The third group includes kinases whose transcripts are ubiquitous, but they exhibited a peak expression pattern at a particular growth stage. Seven genes (STY1, STY8, STY10, STY24, STY28, STY36 and STY52) are associated with the callus and STY1,
STY3, STY15, STY23, STY36 and STY55 are expressed in cell suspension cultures. However, their expression is also seen in other organs. Low expression of different kinases can be seen in different organs, which could be clearly appreciated in the Supplemental Figure S4. The expression of different STY kinases is spread through developmental stages; however they can be grouped into four categories. First category includes genes (STY17, STY19, STY22, STY23, STY24, STY26, STY41, STY45, STY54 and STY55) that are expressed during germination and early seedling stage. Second category includes the genes expressed during early flowering stage and includes STY2, STY5, STY6, STY10, STY12, STY15, STY16, STY23, STY30 and STY35; STY2, STY30 and STY35 continue to express through later stages of flowering. Genes from third category are expressed specifically in late flowering stage; among these STY7, STY21, STY32, STY33, STY35, STY37 and STY56 are shown to be highly expressed in stamen suggesting that they could play a role in pollen development. The expression of STY7, STY29, STY32 and STY53 is also consistent with pollen transcriptome analysis (Honys and Twell, 2003). The fourth category includes STY1, STY8, STY11, STY13, STY14, STY15, STY25, STY26, STY36, STY45 and STY54 that are highly expressed during silique formation. Other genes are expressed throughout the lifecycle of the plant with varied expression during different developmental stages (Supplemental Figure S5). STY protein kinase genes are also analyzed for upregulation or downregulation when they are exposed to different stress factors. We have included hormones, senescence, sucrose and abiotic stresses in our analysis and summarized in Table 4. Effect of biotic and other stress factors on STY kinase gene expression could be viewed from Supplemental Figure S6. 
Salicylic acid (SA) upregulates eight genes (STY1, STY5, STY6, STY12, STY13, STY15, STY16 and STY31), suggesting that they are probably associated with systemic acquired resistance (Ryals et al., 1995). A large number of genes are expressed during senescence. Ethylene and methyl jasmonate (MJ) induce the upregulation of six to seven STY kinase genes. IAA upregulates the expression of STY22, 24, 27, 38 and 54, and GA3 induces the expression of STY7, 18, 24, 28 and 34. Abiotic stresses (cold, salt
Table 4. Expression analysis of STY protein kinases. AGI Code
Gene
Expression in growth stage
Ubiquitously expressed in plant organ
Up regulated during stress condition
At3g46920 At2g35050 At5g11850 At3g27560 At5g01850 At1g14000 At1g04700 At2g17700 At1g16270 At1g08720 At1g18160 At1g18950 At2g24360 At1g79570 At5g66710 At4g31170 At4g35780 At5g58520 At1g73660 At5g50180 At3g59830 At5g40540 At3g58760 At3g06640 At5g57610 At3g50730 At1g67890 At3g06620 At2g31800 At3g46930 At4g24480 At5g58950 At3g24720 At1g62400 At5g41730 At5g03730 At1g64300 At5g49470 At1g08735 At1g18160 At4g23050 At2g43850 At1g01450 At1g62400 At3g50720 At4g38470 NA Ankyrin NA NA NA At5g50000 At4g14780
STY1 STY2 STY3 STY4 STY5 STY6 STY7 STY8 STY9 STY10 STY11 STY12 STY13 STY14 STY15 STY16 STY17 STY18 STY19 STY20 STY21 STY22 STY23 STY24 STY25 STY26 STY27 STY28 STY29 STY30 STY31 STY32 STY33 STY34 STY35 STY36 STY37 STY38 STY39 STY40 STY41 STY42 STY43 STY44 STY45 STY46 STY47 STY48 STY49 STY50 STY51 STY52 STY53
8 5, 6, 6b 1b 1b, 8 5 1b, 5 6b 8 NA 1b, 5, 6b 1, 1d, 6b, 8 5 6b, 8 8 1, 5, 6, 8 1b, 6b 0 6b 1 1b, 1d 6b 0 0, 1b, 1c, 6b 1b, 0 8 8, 0 1b, 1c, 1d 1b 6b 1b, 5, 8 1, 1b 6b 6b 1b, 1c 6b, 5 8 6b 1b, 1c, 1d NA 1, 1d, 6b, 8 0 6b 1, 6 1b, 1c 8, 0 1, 1b, 1c NA NA NA NA NA 1, 1b, 1c, 1d 8
cell suspension, seed, callus node, cauline & senescent leaf cell suspension, roots lateral root, senescent leaf senescent leaf senescent leaf, stamen, roots stamen callus, carpel, seed NA callus, cauline & senescent leaf senescent leaf adult leaf senescent leaf seed cell suspension senescent leaf petal, seed, senescent leaf, roots node cauline, senescent leaf pedicel, carpel, petal stamen seed, roots cell suspension seed, shoot, callus, roots stamen pedicel roots callus, lateral root stamen, senescent & cauline leaf senescent leaf lateral root stamen stamen cauline leaf stamen seed, leaf, callus, cell suspension stamen roots NA senescent leaf seed senescent leaf lateral root cauline leaf seed radicle NA NA NA NA NA pedicel, callus stamen
SA
MJ, SA, cold SA GA3
PCD, salt PCD sucrose, cold, salt, SA SA PCD SA PCD, SA PCD, ABA GA3
ethylene, salt IAA, cold GA3, MJ, cold, IAA ABA, osmotic, salt Light PCD, IAA, ABA GA3, MJ, cold PCD Cold, MJ ABA, osmotic, salt, SA PCD GA3 ethylene ethylene ethylene PCD, IAA, ABA
Down regulated during stress condition
SA
PCD
Cold PCD PCD, SA, zeatin
PCD
osmotic, cold, salt
ABA, osmotic, salt, MJ PCD
heat, osmotic ethylene, MJ, PCD
PCD, ethylene
SA, sucrose
Table 4. (Continued) AGI Code
Gene
Expression in growth stage
Ubiquitously expressed in plant organ
Up regulated during stress condition
At3g01490 At3g63260 At3g22750 AtMRK1
STY54 STY55 STY56 STY57
0, 8 1 6b NA
petal, seed, petiole roots, cell suspension,senescent leaf stamen NA
PCD, IAA
Down regulated during stress condition
PCD, ethylene
The gene expression data were retrieved from the Genevestigator website using the Meta-Analyzer search tool. The growth stages are indicated as follows: stage 0 represents 1.0–5.9 days; stage 1 represents 6.0–13.9 days; stage 1b represents 14.0–17.9 days; stage 1c represents 18.0–20.9 days; stage 1d represents 21.0–24.9 days; stage 5 represents 25.0–28.9 days; stage 6 represents 29.0–35.9 days; stage 6b represents 36.0–44.9 days and stage 8 represents 45.0–50.0 days. Abbreviations: SA – salicylic acid, MJ – methyl jasmonate, GA3 – gibberellic acid, PCD – programmed cell death, ABA – abscisic acid, IAA – indole acetic acid, NA – not available.
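Table 4 identifies each kinase by its AGI code, and the chromosome number is embedded in that code (the digit after "At"). As a small illustration of how the per-chromosome gene counts discussed earlier can be tallied from such a list, the sketch below parses a handful of AGI codes taken from Table 4. The function names are hypothetical, and restricting the pattern to nuclear chromosomes 1 to 5 is an assumption made for this example.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# AGI locus code: "At", chromosome digit, "g", five-digit locus number.
# Assumption: only nuclear chromosomes 1-5 are of interest here.
AGI_PATTERN = re.compile(r"^At([1-5])g(\d{5})$", re.IGNORECASE)


def chromosome_of(agi_code: str) -> Optional[int]:
    """Return the chromosome number encoded in an AGI code, or None."""
    match = AGI_PATTERN.match(agi_code.strip())
    return int(match.group(1)) if match else None


def tally_by_chromosome(agi_codes: Iterable[str]) -> Counter:
    """Count genes per chromosome, skipping entries such as 'NA'."""
    chromosomes = (chromosome_of(code) for code in agi_codes)
    return Counter(c for c in chromosomes if c is not None)


# A few codes from Table 4 (STY13, STY1, STY2, STY3, STY6) plus one 'NA' entry.
sample = ["At2g24360", "At3g46920", "At2g35050", "At5g11850", "NA", "At1g14000"]
print(tally_by_chromosome(sample))
```

Because the locus number increases along each chromosome, sorting codes within a chromosome by that number also gives a rough ordering of the genes along it.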
ABA, osmotic and heat) induce high levels of expression of at least 14 STY protein kinases. Downregulation of STY protein kinases is minimal compared with their upregulation. However, SA suppresses the expression of STY7, STY20 and STY45 to a great extent. Abiotic factors downregulate the STY18, 34 and 45 kinase genes. The STY15, 21 and 25 kinase genes are downregulated during senescence. Zeatin treatment downregulates STY20 kinase gene expression. The detailed expression pattern of STY kinase genes under different stress conditions is shown in Supplemental Figure S6. The expression data suggest that different members of the STY protein kinase family may play roles in different plant processes. These data should be further strengthened by mutant analysis in order to assign physiological roles to these kinases. Determining the biochemical and physiological functions of this large family of STY protein kinases poses an important challenge for the future.

Physiological functions of STY protein kinases

Studies on protein tyrosine phosphorylation in plants are still in their infancy. Assigning functions to the Arabidopsis proteins will have to await reports detailing the structural analysis of the kinases and in vivo functional studies. The clustering of the proteins reflects their function. The clusters with high bootstrap support documented in this study represent groups of sequences with highly similar structures, which might be expected to serve similar functions. A systematic study of these clusters to guide the
isolation of knockout lines and the design of biochemical experiments should allow the plant research community to canvass, rapidly and efficiently, the diversity of functional capacities that the collection of clusters represents. Among group I (ATN1/GmPK6/STY/ATMRK1) Arabidopsis STY protein kinases, ATN1 and soybean GmPK6 are not functionally characterized. Peanut STY protein kinase is developmentally regulated in seed and is induced by cold and salt stresses. Peanut STY-related protein kinases (STY8, STY13, STY16, STY17, STY18, STY46 and STY49) might be candidates for testing in abiotic stress signaling and seed development. ATMRK1 kinases are similar to mixed lineage kinases and Raf protein kinases (Ichimura et al., 1997). Arabidopsis STY protein kinases of group II are related to mixed lineage kinases/CTR1/EDR1 protein kinases. CTR1 protein kinases are most closely related to the Raf protein kinase family from animal systems. The family of Raf protein kinases is involved in cellular processes that regulate proliferation, differentiation and apoptosis (Yuryev and Wennogle, 1998). EDR1 genes encode mixed lineage kinases similar to CTR1 and regulate defense responses in a wide range of crop species (Tang and Innes, 2002). Mixed lineage kinases have been shown to be involved in the defense response (Innes, 2001). Thus, the predicted STY protein kinases of group II may have a role in defense signaling. Several sequences of Arabidopsis and rice have marked similarity to CTR1 and related kinases, and might be logical candidates for testing in ethylene signaling. Mixed
lineage kinases are cell cycle regulated genes. Analyses of transcript levels show that expression of the MAP3K gene is regulated by developmental processes (etiolation/de-etiolation) and by wounding (Asai et al., 2002). It is possible to predict a similar function for Arabidopsis protein kinases that are related to MAP3Kd. Dictyostelium protein tyrosine kinases are known to have an important role in spore development (Tan and Spudich, 1990). Alfalfa ankyrin kinase (apk1) is induced by osmotic stress. We could predict a similar function for the protein kinases related to alfalfa ankyrin kinase from Arabidopsis (STY6, STY12, STY21, STY23, STY29, STY42 and STY49) and rice (gi|14209542 and gi|22296378). Arabidopsis protein kinases of group IV, which are related to the C. purpureus light sensory kinase, might have a role in light signaling. Animal protein tyrosine kinases are involved in differentiation, proliferation and stress response. Mixed lineage kinases are activated by stress and are involved in the regulation of actin organization (Gotoh et al., 2001). The structural and evolutionary relationship of plant STY protein
kinases with classical animal protein tyrosine kinases suggests a possible role for these proteins in stress and development.

STY protein kinases are structurally different from MAP kinase kinases

MAP kinase kinases (MAPKKs) are the sensu stricto dual-specificity kinases found in plants and have been shown to be involved in stress signaling. These kinases are structurally different from the STY family of protein kinases. MAPKKs activate MAPKs by phosphorylating both the threonine and the tyrosine residue of the TXY motif of MAPKs (Zhang and Klessig, 2001). MAPKKs are themselves activated by phosphorylation of two conserved serine or threonine residues between kinase subdomains VII and VIII. In most MAPKKs, these conserved amino acids form the motif S/TXXXS/T. Compared with mammalian MAPKs, all plant MAPKs have the highest homology to the ERK subfamily. Almost all isolated MAPKs have the TEY motif as the dual phosphorylation site.
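The two motifs described above can be written as simple patterns. The following is a minimal sketch, assuming toy activation-loop strings that are invented for illustration and are not sequences from this study; the names `MAPK_TXY`, `MAPKK_SITE`, `toy_mapk_loop` and `toy_mapkk_loop` are hypothetical.

```python
import re

# TXY: threonine, any residue, tyrosine (TEY is the common plant MAPK case).
MAPK_TXY = re.compile(r"T.Y")
# S/TXXXS/T: two Ser/Thr residues separated by exactly three residues.
MAPKK_SITE = re.compile(r"[ST].{3}[ST]")

# Hypothetical strings, made up only to exercise the patterns.
toy_mapk_loop = "GFMTEYVATRWY"
toy_mapkk_loop = "DFGVSAQLVNSLANTF"

txy_hit = MAPK_TXY.search(toy_mapk_loop)
site_hit = MAPKK_SITE.search(toy_mapkk_loop)

# Report each hit with its 1-based start position.
print(txy_hit.group(), txy_hit.start() + 1)
print(site_hit.group(), site_hit.start() + 1)
```

A pattern match of this kind only flags candidate sites; it says nothing about whether a residue is actually phosphorylated.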
Figure 7. Functional characterization of STY13 (At2g24360). (A) SDS-PAGE (12%) analysis of expressed protein. Lanes 1 and 2 represent uninduced and induced E. coli cells harboring pET21d vector. Lanes 3 and 4 indicate uninduced and induced E. coli cells harboring pET21d_STY13. (B) Time course of autophosphorylation of STY13 protein kinase. (C) Phosphoamino acid analysis of autophosphorylated STY13 protein kinase. The positions of origin, phosphoserine, phosphothreonine and phosphotyrosine are indicated. (D) Histone III-S (10 μg) was subjected to phosphorylation in the presence of STY13 and other reaction components. The phosphorylated products were separated by 12% SDS-PAGE and visualized by autoradiography.
Evidence for the tyrosine phosphorylated proteins from Arabidopsis

To provide experimental support for the bioinformatic analysis, we sub-cloned the open reading frame of STY13 (At2g24360, obtained from the Arabidopsis Biological Resource Center, The Ohio State University, USA) into pET21d at the EcoRI and SalI restriction sites. The resulting construct was expressed in E. coli by induction with 0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG), and the recombinant protein appeared as a 46 kD band on 12% SDS-PAGE (Figure 7A). A time-dependent increase in autophosphorylation was observed with the recombinant STY13 kinase (Figure 7B). The autophosphorylated protein was transferred onto a polyvinylidene difluoride (PVDF) membrane; the band corresponding to the radioactive protein kinase was excised and hydrolyzed with 6 N HCl. The resulting hydrolysate was analyzed for phosphoamino acids by thin layer chromatography followed by autoradiography. The analysis revealed that tyrosine is preferentially phosphorylated. Phosphorylation was also observed on serine and threonine, confirming the dual specificity of STY13 kinase (Figure 7C). The protein kinase also phosphorylated Histone III-S in a substrate phosphorylation reaction (Figure 7D).
Phosphotyrosine residues are stable at alkaline pH, but phosphoserine and phosphothreonine are not (Bourassa et al., 1988; Noiman and Shaul, 1995). [γ-32P]ATP-phosphorylated proteins from the cell lysate of Arabidopsis were transferred to PVDF membranes, treated with KOH, and autoradiographed. The membrane that was not treated with KOH served as a control and showed a complex pattern of bands, some with high-intensity signals (Figure 8A). After alkali treatment, much of the 32P label was removed, as shown by the decrease in signal intensity of the bands (Figure 8B). Several bands disappeared from the membrane, whereas proteins corresponding to 37.5 and 46 kD (hereafter referred to as P37.5 and P46) remained. Protein gels were blotted onto PVDF membrane and immunostained with an anti-phosphotyrosine monoclonal antibody. Interestingly, the P37.5 and P46 proteins from Arabidopsis cross-reacted with the anti-phosphotyrosine antibody (Figure 8C). Using the RC20 anti-phosphotyrosine antibody in combination with enhanced chemiluminescence detection, at least four tyrosine-phosphorylated proteins have been observed in Arabidopsis wild-type cell suspension culture (Barizza et al., 1999) and hypocotyls (Huang et al., 2003). When 2 mM phosphotyrosine was
Figure 8. Effect of alkali treatment on the soluble fractions of Arabidopsis. The 32P-labeled proteins were separated by 10% SDS-PAGE and electroblotted onto PVDF membranes. Membranes were left without treatment (A) or treated with 1 M KOH (B) and bands were visualized by autoradiography. The positions of the molecular mass markers are indicated along the left side. The results shown are representative of four separate experiments. Arabidopsis seedling proteins were separated by SDS-PAGE, electroblotted onto nitrocellulose membranes, and immunodetected using the anti-p-Tyr antibody. Proteins were assayed directly with anti-p-Tyr antibodies (C) or analyzed with anti-p-Tyr antibody pretreated with phosphotyrosine (D). TLC analysis of phosphoamino acids of in vitro 32P-labeled proteins of P46 (lane 1) and P37.5 (lane 2) from Arabidopsis seedlings (E).
added to the immunodetection buffer, all of the bands detected with the p-Tyr antibody disappeared (Figure 8D), indicating specific binding of the antibody to phosphorylated tyrosine residues in P37.5 and P46. Phosphoamino acid analysis of the blotted proteins that remained labeled after treatment with KOH revealed not only phosphothreonine and phosphoserine but also phosphotyrosine, which was the most abundant phosphoamino acid (Figure 8E). We then performed peptide mass fingerprinting to identify the tyrosine phosphorylated proteins. Protein bands from the Arabidopsis cell extract that were detected on the anti-p-Tyr immunoblot and in the phosphorylation assay were subjected to in-gel digestion with trypsin and to peptide mass fingerprint analysis. Table 5 summarizes this study for the P37.5 protein. For P37.5, ten peptides matched the ATN1-like protein kinase (At5g01850), with 46% sequence coverage, by MALDI-TOF MS at 50 ppm mass accuracy (Table 5). At5g01850 (STY5) belongs to family 1.1 (Figure 1). Similarly, for P46, 10 peptides matched the peanut STY-like protein kinase (At2g24360), with 42% sequence coverage, by MALDI-TOF MS (Table 6). At2g24360 (STY13) belongs to family 1.2 (Figure 1). Prediction of protein tyrosine phosphorylation sites was performed with
NetPhos (www.cbs.dtu.dk/services/NetPhos/). The predicted tyrosine phosphorylation sites of STY5 are Y181, Y189 and Y202, and those of STY13 are Y23, Y108 and Y290. The tyrosine kinase consensus TYRWMAPE of subdomain VIII and CW(X)6RPXF of subdomain XI are conserved in STY5 and STY13 (Tables 5 and 6). STY13 has 75.4% identity to peanut STY protein kinase. Y181 of STY5 and Y290 of STY13 correspond to the consensus tyrosine of the TYRWMAPE motif of the activation loop. Sequence alignment of STY5 and STY13 with peanut STY kinase reveals that these tyrosine residues are conserved. In conclusion, this study is the first comprehensive survey of STY protein kinases from plants. The data presented here report the occurrence of serine/threonine/tyrosine kinases in Arabidopsis. We have shown evidence indicating the presence of protein tyrosine kinase activity and found relatively high levels of phosphotyrosine in the soluble proteins of Arabidopsis. The proteins identified after alkaline hydrolysis in Arabidopsis belong to group I (ATN1/GmPK6/STY/ATMRK1) and are phosphorylated on serine, threonine and tyrosine, corroborating the genome analysis. This raises the intriguing possibility that protein tyrosine phosphorylation is carried out by dual
Table 5. Identification of P37.5 from Figure 8 as a protein similar to Arabidopsis ATN1-like kinase.

Masses submitted  Masses matched  Start  End  Peptide
1779.10           1778.09           1     16  MSSDDTIEESLLVDPK
1419.2            1418.19          50     62  GSKPDQQSSLESR
974.1             973.55           73     80  VQHHNLVK
2272.5            2271.49         105    128  YLTSIRPQLLHLPLALSFALDIAR
1455.3            1454.39         129    141  ALHCLHANGIIHR
1650.1            1649.09         142    155  DLKPDNLLLTENHK
1835.6            1834.59         167    182  EESVTEMMTAETGTYR
1947.1            1946.09         223    240  MPFEGMSNLQAAYAAAFK
2673.1            2672.09         278    301  LLNEFLLTLTPPPPQPLPETATNR
1162.4            1169.34         306    315  AITEFSIRPK

  1 MSSDDTIEES LLVDPKLLFI GSKIGEGAHG KVYQGRYGRQ IVAIKVVNRG
 51 SKPDQQSSLE SRFVREVNMM SRVQHHNLVK FIGACKDPLM VIVTELLPGM
101 SLRKYLTSIR PQLLHLPLAL SFALDIARAL HCLHANGIIH RDLKPDNLLL
151 TENHKSVKLA DFGLAREESV TEMMTAETGT YRWMAPELYS TVTLRQGEKK
201 HYNNKVDVYS FGIVLWELLT NRMPFEGMSN LQAAYAAAFK QERPVMPEGI
251 SPSLAFIVQS CWVEDPNMRP SFSQIIRLLN EFLLTLTPPP PQPLPETATN
301 RTNGRAITEF SIRPKGKFAF IRQLFAAKRN INS

Peptide mass fingerprint analysis of the P37.5 protein (Figure 8) matches 10 peptide masses with Arabidopsis ATN1-like protein kinase (At5g01850; STY5), which belongs to Family 1.1 of the Arabidopsis STY protein kinase family (Figure 1). The total coverage is 46% of the protein. Peptides covered in ATN1-like kinase are shown in bold letters. The tyrosine kinase consensus TYRWMAPE of subdomain VIII and CW(X)6RPXF of subdomain XI are underlined. The tyrosine phosphorylation sites predicted by NetPhos (www.cbs.dtu.dk/services/NetPhos/) are 181, 189 and 202.
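The peptide boundaries and the coverage figure in Table 5 can be checked with a short calculation. The sketch below assumes the common trypsin rule (cleavage after K or R unless the next residue is P) and no missed cleavages, neither of which is stated in the text; the function and constant names are illustrative. It digests residues 101 to 182 of the STY5 sequence and computes coverage from the Start and End columns.

```python
def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal remainder, if any
    return peptides


def coverage_percent(ranges, length):
    """Percentage of residues covered by 1-based inclusive (start, end) ranges."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / length


# Residues 101-182 of STY5, copied from the sequence in Table 5.
STY5_FRAGMENT = (
    "SLRKYLTSIRPQLLHLPLALSFALDIARALHCLHANGIIHR"
    "DLKPDNLLLTENHKSVKLADFGLAREESVTEMMTAETGTYR"
)
# Start and End columns of Table 5; the sequence has 333 residues.
STY5_RANGES = [(1, 16), (50, 62), (73, 80), (105, 128), (129, 141),
               (142, 155), (167, 182), (223, 240), (278, 301), (306, 315)]
STY5_LENGTH = 333

fragment_peptides = tryptic_digest(STY5_FRAGMENT)
sty5_coverage = coverage_percent(STY5_RANGES, STY5_LENGTH)
print(fragment_peptides)
print(round(sty5_coverage, 1))
```

Under these assumptions the digest of the fragment reproduces the four Table 5 peptides that fall inside it (positions 105 to 182), and the listed ranges cover 156 of 333 residues, about 46.8%, in line with the reported 46%.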
Table 6. Identification of P46 from Figure 8 as a protein similar to peanut STY-like protein kinase from Arabidopsis.

Masses submitted  Masses matched  Start  End  Peptide
3148.31           3147.29           7     32  FNVLAVGNHHNNDNNYYAFTQEFYQK
1690.53           1689.49          75     89  HYSLSVGQSVFRPGR
1955.70           1954.67          90    107  VTHALNDDALAQALMDTR
2086.81           2085.79         108    124  YPTEGLTNYDEWTIDLR
1409.37           1408.36         126    139  LNMGPAFAQGAFGK
2271.09           2270.08         165    183  AQFMEQQFQQEVSMLANLK
1076.52           1075.49         255    264  SDNLLISADK
1812.31           1811.29         276    291  IEVQTEGMTPETGTYR
2184.10           2183.09         342    361  GVRPTVPNDCLPVLSDIMTR
1991.31           1990.30         362    378  CWDANPEVRPCFVEVVK

  1 MLEGAKFNVL AVGNHHNNDN NYYAFTQEFY QKLNEGSNMS MESMQTRSVS
 51 MSVDNSSVGS SDALIGHPGL KPVRHYSLSV GQSVFRPGRV THALNDDALA
101 QALMDTRYPT EGLTNYDEWT IDLRKLNMGP AFAQGAFGKL YKGTYNGEDV
151 AIKILERPEN SPEKAQFMEQ QFQQEVSMLA NLKHPNIVRF IGACRKPMVW
201 CIVTEYAKGG SVRQFLTRRQ NRAVPLKLAV KQALDVARGM AYVHGRNFIH
251 RDLKSDNLLI SADKSIKIAD FGVARIEVQT EGMTPETGTY RWMAPEMIQH
301 RAYNQKVDVY SFGIVLWELI TGLLPFQNMT AVQAAFAVVN RGVRPTVPND
351 CLPVLSDIMT RCWDANPEVR PCFVEVVKLL EAAETEIMTT ARKARFRCCL
401 SQPMTID

Peptide mass fingerprint analysis of the P46 protein (Figure 8) matches 10 peptide masses with peanut STY-like protein kinase from Arabidopsis (At2g24360; STY13), which belongs to Family 1.2 of the Arabidopsis STY protein kinase family (Figure 1). The total coverage is 42% of the protein. Peptides covered in peanut STY-like kinase are shown in bold letters. The tyrosine kinase consensus TYRWMAPE of subdomain VIII and CW(X)6RPXF of subdomain XI are underlined. The tyrosine phosphorylation sites predicted by NetPhos (www.cbs.dtu.dk/services/NetPhos/) are 23, 108 and 290.
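The positions of the two consensus motifs in Tables 5 and 6 can be located with a pattern search. The following is a minimal sketch, assuming that CW(X)6RPXF can be read as the regular expression `CW.{6}RP.F`; the fragments are residues 151 to 300 of STY5 and 251 to 400 of STY13 copied from the tables, and the helper `motif_start` and the constant names are illustrative rather than part of the original analysis.

```python
import re

# Fragments copied from Tables 5 and 6; each offset is the 1-based
# position of the first residue of the fragment in the full sequence.
STY5_FRAGMENT = (
    "TENHKSVKLADFGLAREESVTEMMTAETGTYRWMAPELYSTVTLRQGEKK"
    "HYNNKVDVYSFGIVLWELLTNRMPFEGMSNLQAAYAAAFKQERPVMPEGI"
    "SPSLAFIVQSCWVEDPNMRPSFSQIIRLLNEFLLTLTPPPPQPLPETATN"
)
STY5_OFFSET = 151
STY13_FRAGMENT = (
    "RDLKSDNLLISADKSIKIADFGVARIEVQTEGMTPETGTYRWMAPEMIQH"
    "RAYNQKVDVYSFGIVLWELITGLLPFQNMTAVQAAFAVVNRGVRPTVPND"
    "CLPVLSDIMTRCWDANPEVRPCFVEVVKLLEAAETEIMTTARKARFRCCL"
)
STY13_OFFSET = 251

SUBDOMAIN_VIII = re.compile(r"TYRWMAPE")   # activation-loop consensus
SUBDOMAIN_XI = re.compile(r"CW.{6}RP.F")   # assumed reading of CW(X)6RPXF


def motif_start(pattern, fragment, offset):
    """Return the 1-based start of the first match, or None if absent."""
    match = pattern.search(fragment)
    return None if match is None else match.start() + offset


for name, fragment, offset in (("STY5", STY5_FRAGMENT, STY5_OFFSET),
                               ("STY13", STY13_FRAGMENT, STY13_OFFSET)):
    viii = motif_start(SUBDOMAIN_VIII, fragment, offset)
    xi = motif_start(SUBDOMAIN_XI, fragment, offset)
    # The consensus tyrosine is the second residue of TYRWMAPE.
    print(name, "subdomain VIII Tyr:", viii + 1, "subdomain XI start:", xi)
```

With these fragments the consensus tyrosine falls at position 181 in STY5 and 290 in STY13, matching the predicted sites listed in the table notes.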
specificity protein kinases in plants. Earlier attempts to clone classical tyrosine kinases resulted only in the identification of dual-specificity protein kinases (Hirayama and Oka, 1992; Ali et al., 1994; Tregear et al., 1996). Do plants have bona fide tyrosine kinases? Until a classical tyrosine kinase is cloned from plants, we believe that tyrosine phosphorylation is carried out by serine/threonine/tyrosine kinases. The sequences with dual specificity have been shown to play an important role in abiotic stress response, ethylene signaling and defense signaling, which increases our understanding of the specific role of protein tyrosine kinases in many aspects of plant biology and should prove valuable for future biotechnology applications. The phylogenetic analysis and tyrosine phosphoproteomics of predicted plant serine/threonine/tyrosine kinases presented here should aid in understanding signaling by the plant tyrosine kinase family, an otherwise poorly understood area of plant biology.

Acknowledgements

MMR is a recipient of a Council of Scientific and Industrial Research Senior Research Fellowship (CSIR-SRF). This research is supported by a grant from the Department of Science and Technology, New Delhi, India.
References

Ali, N., Halfter, U. and Chua, N.H. 1994. Cloning and biochemical characterization of a plant protein kinase that phosphorylates serine, threonine and tyrosine. J. Biol. Chem. 269: 31626–31629.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403–410.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.
Arabidopsis Genome Initiative 2000. Analysis of the genome of the flowering plant Arabidopsis thaliana. Nature 408: 820–826.
Aravind, L. and Koonin, E.V. 1999. Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches. J. Mol. Biol. 287: 1023–1040.
Asai, T., Tena, G., Plotnikova, J., Willmann, M.R., Chiu, W.L., Gomez-Gomez, L., Boller, T., Ausubel, F.M. and Sheen, J. 2002. MAP kinase signalling cascade in Arabidopsis innate immunity. Nature 415: 977–983.
Barizza, E., Lo Schiavo, F., Terzi, M. and Filippini, F. 1999. Evidence suggesting protein tyrosine phosphorylation in plants depends on the developmental conditions. FEBS Lett. 447: 191–194.
Bateman, A., Birney, E., Durbin, R., Eddy, S.R., Howe, K.L. and Sonnhammer, E.L. 2000. The Pfam protein families database. Nucleic Acids Res. 28: 263–266.
Bennett, V. and Chen, L. 2001. Ankyrins and cellular targeting of diverse membrane proteins to physiological sites. Curr. Opin. Cell Biol. 13: 61–67.
Binari, R. and Perrimon, N. 1994. Stripe-specific regulation of pair-rule genes by hopscotch, a putative Jak family tyrosine kinase in Drosophila. Genes Dev. 8: 300–312.
Bourassa, C., Chapdelaine, A., Roberts, K.D. and Chevalier, S. 1988. Enhancement of the detection of alkali-resistant phosphoproteins in polyacrylamide gels. Anal. Biochem. 169: 356–362.
Burge, C. and Karlin, S. 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268: 78–94.
Carpi, A., Maira, G.D., Vedovato, M., Rossi, V., Naccari, T., Floriduz, M., Terzi, M. and Filippini, F. 2002. Comparative proteome bioinformatics: identification of a whole complement of putative protein tyrosine kinases in the model flowering plant Arabidopsis thaliana. Proteomics 2: 1494–1503.
Chan, A.M., King, H.W., Deakin, E.A., Tempest, P.R., Hilkens, J., Kroezen, V., Edwards, D.R., Wills, A.J., Brookes, P. and Cooper, C.S. 1988. Characterization of the mouse met proto-oncogene. Oncogene 2: 593–599.
Chinchilla, D., Merchan, F., Megias, M., Kondorosi, A., Sousa, C. and Crespi, M. 2003. Ankyrin protein kinases: a novel type of plant kinase gene whose expression is induced by osmotic stress. Plant Mol. Biol. 51: 555–566.
Cloutier, J.F. and Veillette, A. 1996. Association of inhibitory tyrosine protein kinase p50csk with protein tyrosine phosphatase PEP in T cells and other hemopoietic cells. EMBO J. 15: 4909–4918.
Felsenstein, J. 1996. Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. Methods Enzymol. 266: 418–426.
Feng, X.H., Zhao, Y., Bottino, P.J. and Kung, S.D. 1993. Cloning and characterization of a novel member of protein kinase family from soybean. Biochim. Biophys. Acta 1172: 200–204.
Fordham-Skelton, A.P., Skipsey, M., Evans, I.M., Edwards, R. and Gatehouse, J.A. 1999. Higher plant tyrosine-specific protein phosphatases (PTPs) contain novel amino-terminal domains: expression during embryogenesis. Plant Mol. Biol. 39: 593–605.
Frank, G.D., Saito, S., Motley, E.D., Sasaki, T., Ohba, M., Kuroki, T., Inagami, T. and Eguchi, S. 2002. Requirement of Ca2+ and PKCdelta for Janus kinase 2 activation by angiotensin II: involvement of PYK2. Mol. Endocrinol. 16: 367–377.
Frye, C.A., Tang, D. and Innes, R.W. 2001. Negative regulation of defense responses in plants by a conserved MAPKK kinase. Proc. Natl. Acad. Sci. USA 98: 373–378.
Garg, R., Oliver, P.M., Maeda, N. and Pandey, K.N. 2002. Genomic structure, organization, and promoter region analysis of murine guanylyl cyclase/atrial natriuretic peptide receptor-A gene. Gene 291: 123–133.
Goddard, J.M., Weiland, J.J. and Capecchi, M.R. 1986. Isolation and characterization of Caenorhabditis elegans DNA sequences homologous to the v-abl oncogene. Proc. Natl. Acad. Sci. USA 83: 2172–2176.
Gotoh, I., Adachi, M. and Nishida, E. 2001. Identification and characterization of a novel MAP kinase kinase, MLTK. J. Biol. Chem. 276: 4276–4286.
Gribskov, M., Fana, F., Harper, J., Hope, D.A., Harmon, A.C., Smith, D.W., Tax, F.E. and Zhang, G. 2001. PlantsP: a functional genomics database for plant phosphorylation. Nucleic Acids Res. 29: 111–113.
Guex, N. and Peitsch, M.C. 1997. SWISS-MODEL and the Swiss-Pdbviewer: an environment for comparative protein modelling. Electrophoresis 18: 2714–2723.
Gupta, R., Huang, Y., Kieber, J. and Luan, S. 1998. Identification of a dual-specificity protein phosphatase that inactivates a MAP kinase from Arabidopsis. Plant J. 16: 581–589.
Hanks, S.K. 1987. Homology probing identification of cDNA clones encoding members of the protein-serine kinase family. Proc. Natl. Acad. Sci. USA 84: 388–392.
Hanks, S.K. and Quinn, A.M. 1991. Protein kinase catalytic domain sequence database: identification of conserved features of primary structure and classification of family members. Methods Enzymol. 200: 38–62.
Hanks, S.K., Quinn, A.M. and Hunter, T. 1988. The protein kinase family: conserved features and deduced phylogeny of the catalytic domains. Science 241: 42–51.
Hirayama, T. and Oka, A. 1992. Novel protein kinase of Arabidopsis thaliana (APK1) that phosphorylates tyrosine, serine and threonine. Plant Mol. Biol. 20: 653–662.
Holtrich, U., Brauninger, A., Strebhardt, K. and Rubsamen-Waigmann, H. 1991. Two additional protein-tyrosine kinases expressed in human lung: fourth member of the fibroblast growth factor receptor family and an intracellular protein-tyrosine kinase. Proc. Natl. Acad. Sci. USA 88: 10411–10415.
Honys, D. and Twell, D. 2003. Comparative analysis of the Arabidopsis pollen transcriptome. Plant Physiol. 132: 640–652.
Huang, H.J., Lin, Y.M., Huang, D.D., Takahashi, T. and Sugiyama, M. 2003. Protein tyrosine phosphorylation during phytohormone-stimulated cell proliferation in Arabidopsis hypocotyls. Plant Cell Physiol. 44: 770–775.
Hunter, T. 1987. A thousand and one protein kinases. Cell 50: 823–829.
Ichimura, K., Mizoguchi, T. and Shinozaki, K. 1997. ATMRK1, an Arabidopsis protein kinase related to mammal mixed-lineage kinases and Raf protein kinases. Plant Sci. 130: 171–179.
Innes, R.W. 2001. Mapping out the roles of MAP kinases in plant defense. Trends Plant Sci. 6: 392–394.
Katzen, A.L., Montarras, D., Jackson, J., Paulson, R.F., Kornberg, T. and Bishop, J.M. 1991. A gene related to the proto-oncogene fps/fes is expressed at diverse times during the life cycle of Drosophila melanogaster. Mol. Cell Biol. 11: 226–239.
Kumar, S., Tamura, K., Jakobsen, I.B. and Nei, M. 2001. MEGA2: Molecular Evolutionary Genetics Analysis Software. Arizona State University, Tempe.
Kyte, J. and Doolittle, R. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105–132.
Lamers, M.B., Antson, A.A., Hubbard, R.E., Scott, R.K. and Williams, D.H. 1999. Structure of the protein tyrosine kinase domain of C-terminal Src kinase (CSK) in complex with staurosporine. J. Mol. Biol. 285: 713–725.
Lazar, S., Galiani, D. and Dekel, N. 2002. cAMP-Dependent PKA negatively regulates polyadenylation of c-mos mRNA in rat oocytes. Mol. Endocrinol. 16: 331–341.
Lindberg, R.A., Quinn, A.M. and Hunter, T. 1992. Dual-specificity protein kinases: will any hydroxyl do? Trends Biochem. Sci. 17: 114–119.
Liu, T.C., Huang, C.J., Chu, Y.C., Wei, C.C., Chou, C.C., Chou, M.Y., Chou, C.K. and Yang, J.J. 2000. Cloning and expression of ZAK, a mixed lineage kinase-like protein containing a leucine-zipper and a sterile-alpha motif. Biochem. Biophys. Res. Commun. 274: 811–816.
Lukashin, A.V. and Borodovsky, M. 1998. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res. 26: 1107–1115.
McCafferty, D.M., Craig, A.W., Senis, Y.A. and Greer, P.A. 2002. Absence of Fer protein-tyrosine kinase exacerbates leukocyte recruitment in response to endotoxin. J. Immunol. 168: 4930–4935.
Mizoguchi, T., Irie, K., Hirayama, T., Hayashida, N., Yamaguchi-Shinozaki, K., Matsumoto, K. and Shinozaki, K. 1996. A gene encoding a mitogen-activated protein kinase kinase kinase is induced simultaneously with genes for a mitogen-activated protein kinase and an S6 ribosomal protein kinase by touch, cold, and water stress in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 93: 765–769.
Morgan, D.O. and De Bondt, H.L. 1994. Protein kinase regulation: insights from crystal structure analysis. Curr. Opin. Cell Biol. 6: 239–246.
Morgan, W.R. and Greenwald, I. 1993. Two novel transmembrane protein tyrosine kinases expressed during Caenorhabditis elegans hypodermal development. Mol. Cell Biol. 13: 7133–7143.
Munoz, G.E. and Marshall, S.H. 1994. Selective phosphoamino acid enrichment by organic solvent fractionation. Biotechniques 17: 1044–1046.
Murashige, T. and Skoog, F. 1962. A revised medium for rapid growth and bioassays with tobacco tissue cultures. Physiol. Plant. 15: 473–497.
Noiman, S. and Shaul, Y. 1995. Detection of histidine-phosphoproteins in animal tissues. FEBS Lett. 364: 63–66.
Nuckolls, G.H., Osherov, N., Loomis, W.F. and Spudich, J.A. 1996. The Dictyostelium dual-specificity kinase splA is essential for spore differentiation. Development 122: 3295–3305.
Okada, M., Nada, S., Yamanashi, Y., Yamamoto, T. and Nakagawa, H. 1991. CSK: a protein-tyrosine kinase involved in regulation of src family kinases. J. Biol. Chem. 266: 24249–24252.
Page, R.D.M. 1996. TREEVIEW: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12: 357–358.
Ponting, C.P. and Aravind, L. 1997. PAS: a multifunctional domain family comes to light. Curr. Biol. 7: 74–77.
Rider, L.G., Raben, N., Miller, L. and Jelsema, C. 1994. The cDNAs encoding two forms of the LYN protein tyrosine kinase are expressed in rat mast cells and human myeloid cells. Gene 138: 219–222.
Roy, F. and Therrien, M. 2002. MAP kinase module: the Ksr connection. Curr. Biol. 12: 325–327.
Rudrabhatla, P. and Rajasekharan, R. 2002. Developmentally regulated dual-specificity kinase that is induced by abiotic stresses. Plant Physiol. 130: 380–390.
Rudrabhatla, P. and Rajasekharan, R. 2003. Mutational analysis of stress-responsive peanut dual specificity kinase: identification of tyrosine residues involved in protein kinase activity. J. Biol. Chem. 278: 17328–17335.
Rudrabhatla, P. and Rajasekharan, R. 2004. Functional characterization of peanut STY kinase: molecular docking and inhibition kinetics with tyrosine kinase inhibitors. Biochemistry 43: 12123–12132.
Ryals, J., Lawton, K.A., Delaney, T.P., Friedrich, L., Kessmann, H., Neuenschwander, U., Uknes, S., Vernooij, B. and Weymann, K. 1995. Signal transduction in systemic acquired resistance. Proc. Natl. Acad. Sci. USA 92: 4202–4205.
Saitou, N. and Nei, M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406–425.
Sayle, R. and Milner-White, E.J. 1995. RasMol: biomolecular graphics for all. Trends Biochem. Sci. 20: 374.
Schoof, H., Zaccaria, P., Gundlach, H., Lemcke, K., Rudd, S., Kolesov, G., Arnold, R., Mewes, H.W. and Mayer, K.F.X. 2002. MIPS Arabidopsis thaliana database (MAtDB): an integrated biological knowledge resource based on the first complete plant genome. Nucleic Acids Res. 30: 91–93.
Shekhtman, A., Ghose, R., Wang, D., Cole, P.A. and Cowburn, D. 2001. Novel mechanism of regulation of the non-receptor tyrosine kinase Csk: insights from NMR mapping studies and site-directed mutagenesis. J. Mol. Biol. 314: 129–138.
Shevchenko, A., Jensen, O.N., Podtelejnikov, A.V., Sagliocco, F., Wilm, M., Vorm, O., Mortensen, P., Shevchenko, A., Boucherie, H. and Mann, M. 1996. Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two-dimensional gels. Proc. Natl. Acad. Sci. USA 93: 14440–14445.
Sippl, M.J. 1993. Assessment of the CASP4 fold recognition category. Proteins 17: 355–362.
Stanger, B.Z., Leder, P., Lee, T.H., Kim, E. and Seed, B. 1995. RIP: a novel protein containing a death domain that interacts with Fas/APO-1 (CD95) in yeast and causes cell death. Cell 81: 513–523.
Takahashi, T. and Shirasawa, T. 1994. Molecular cloning of rat JAK3, a novel member of the JAK family of protein tyrosine kinases. FEBS Lett. 342: 124–128.
Tan, J.L. and Spudich, J.A. 1990. Developmentally regulated protein-tyrosine kinase genes in Dictyostelium discoideum. Mol. Cell Biol. 10: 3578–3583.
Tang, D. and Innes, R.W. 2002. Overexpression of a kinase-deficient form of the EDR1 gene enhances powdery mildew resistance and ethylene-induced senescence in Arabidopsis. Plant J. 32: 975–983.
Thompson, J.D., Higgins, D.G. and Gibson, T.J. 1994. ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–4680.
Tregear, J.W., Jouannic, S., Schwebel-Dugue, N. and Kreis, M. 1996. An unusual protein kinase displaying characteristics of both the serine/threonine kinase and tyrosine families is encoded by the Arabidopsis thaliana gene ATN1. Plant Sci. 117: 107–119.
Weiner, S.J., Kollman, P.A., Case, D.A., Singh, U.C., Ghio, C., Profeta, A.S. and Weiner, P. 1984. A new force field for molecular mechanical simulation of nucleic acids and proteins. J. Am. Chem. Soc. 106: 765–784.
Wilks, A.F. and Kurban, R.R. 1988. Isolation and structural analysis of murine c-fes cDNA clones. Oncogene 3: 289–294.
Xu, Q., Fu, H., Gupta, R. and Luan, S. 1998. Molecular characterization of a tyrosine-specific protein phosphatase encoded by a stress-responsive gene in Arabidopsis. Plant Cell 10: 849–857.
Yang, R.B., Foster, D.C., Garbers, D.L. and Fulle, H.J. 1995. Two membrane forms of guanylyl cyclase found in the eye. Proc. Natl. Acad. Sci. USA 92: 602–606.
Yarden, Y., Escobedo, J.A., Kuang, W.J., Yang-Feng, T.L., Daniel, T.O., Tremble, P.M., Chen, E.Y., Ando, M.E., Harkins, R.N., Francke, U., Fried, V.A., Ullrich, A. and Williams, L.T. 1986. Structure of the receptor for platelet-derived growth factor helps define a family of closely related growth factor receptors. Nature 323: 226–232.
Yee, K., Bishop, T.R., Mather, C. and Zon, L.I. 1993. Isolation of a novel receptor tyrosine kinase cDNA expressed by developing erythroid progenitors. Blood 82: 1335–1343.
Yi, T.L., Bolen, J.B. and Ihle, J.N. 1991. Hematopoietic cells express two forms of lyn kinase differing by 21 amino acids in the amino terminus. Mol. Cell Biol. 11: 2391–2398.
Yuryev, A. and Wennogle, L.P. 1998. The RAF family: an expanding network of post-translational controls and protein–protein interactions. Cell Res. 8: 81–98.
Zhang, S. and Klessig, D.F. 2001. MAPK cascades in plant defense signaling. Trends Plant Sci. 6: 520–527.
Zimmermann, P., Hirsch-Hoffman, M., Hennig, L. and Gruissem, W. 2004. GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox. Plant Physiol. 136: 2621–2632.
Zhou, G., Bao, Z.Q. and Dixon, J.E. 1995. Components of a new human protein kinase signal transduction pathway. J. Biol. Chem. 270: 12665–12669.