Online ISSN 2092-9293 Print ISSN 1976-9571
Genes & Genomics https://doi.org/10.1007/s13258-018-0709-x
RESEARCH ARTICLE
The complete mitochondrial genome of Vanessa indica and phylogenetic analyses of the family Nymphalidae
Youxue Lu1 · Naiyi Liu1 · Liuxiang Xu1 · Jie Fang1 · Shuyan Wang1
Received: 24 January 2018 / Accepted: 29 May 2018
© The Genetics Society of Korea and Springer Nature B.V. 2018
Abstract
Vanessa indica is a small butterfly that has received little molecular or biological study. It belongs to the family Nymphalidae (Lepidoptera: Papilionoidea), the largest group of butterflies, whose members are nearly ubiquitous. However, after more than a century of taxonomic and molecular studies, there is no consensus on the classification of the family, and the phylogenetic relationships within Nymphalidae remain controversial. Our first objective was to sequence and characterize the complete mitochondrial genome of V. indica; our main objective was to reconstruct the phylogenetic relationships among the members of Nymphalidae. The mitochondrial genomic DNA (mtDNA) of V. indica was extracted and amplified by polymerase chain reaction. The complete mitochondrial sequence was assembled with the SeqMan program, then annotated and characterized. The phylogenetic analyses were based on the 13 protein-coding genes (PCGs) of 95 nymphalid mitogenomes downloaded from GenBank, using both the maximum likelihood method and Bayesian inference to cross-check the results. The complete mitogenome was a circular molecule of 15,191 bp consisting of 13 PCGs, two ribosomal RNA genes (16S rRNA and 12S rRNA), 22 transfer RNA (tRNA) genes, and an A + T-rich region (D-loop). The nucleotide composition of the genome was highly biased towards A + T, which accounted for 80.0% of the nucleotides. All the tRNAs had putative secondary structures characteristic of mitochondrial tRNAs, except tRNASer(AGN). All the PCGs started with ATN codons, except cytochrome c oxidase subunit 1 (COX1), which started with an unusual CGA codon. Four genes had incomplete stop codons: COX1 terminated with an atypical TT, and three other genes terminated with a single T. The A + T-rich region of 326 bp contained repetitive sequences, including an ATAGA motif, a 19-bp poly-T stretch, and two microsatellite-like (TA)8 regions. The phylogenetic analyses consistently placed Biblidinae as a sister cluster to Heliconiinae and Calinaginae as a sister clade to Satyrinae. Moreover, the phylogenetic tree identified Libytheinae as a monophyletic group within Nymphalidae. In summary, the complete mitogenome of V. indica was 15,191 bp long and showed the mitochondrial characteristics common to lepidopteran species, adding to the mitochondrial data available for nymphalid species. The phylogenetic analysis revealed classifications and relationships that differ from those previously described. These results should be useful for further understanding the evolutionary biology of Nymphalidae.
Keywords Vanessa indica · Mitochondrial genome · Nymphalidae · Phylogenetic analyses
Youxue Lu and Naiyi Liu have contributed equally to this work.
Electronic supplementary material The online version of this article (https://doi.org/10.1007/s13258-018-0709-x) contains supplementary material, which is available to authorized users.
* Jie Fang · * Shuyan Wang
1 School of Life Science, Anhui University, Hefei, China

Introduction
The family Nymphalidae (Lepidoptera: Papilionoidea) is the largest group of butterflies that are ubiquitous throughout the world except in Antarctica (Shi et al. 2015). It contains about 6000 species that have been placed into about 542 genera. Currently, these genera are placed into 12 subfamilies and 40 tribes (Nymphalidae Systematics Group 2016). They are usually medium to large sized butterflies, with striking colors, specialized insect–plant interaction, and a deep-rooted evolutionary history probably
90 million years old (Shields 1989; Wahlberg et al. 2009). Traditional taxonomy classifies organisms according to their morphological features and physiological behavior. Because Nymphalidae contains diverse subfamilies and tribes, the systematic relationships within the family are complex and have been evaluated using both morphological characteristics and molecular data. As a result, after more than a century of taxonomic and molecular study, there is still no consensus on the classification of this family, and the phylogenetic relationships of nymphalids remain controversial (Freitas et al. 2004; Wahlberg and Brower 2008).
Mitochondria, the oxygen-processing factories of eukaryotic cells, have their own genome, which encodes genes involved in oxidative phosphorylation (Chen et al. 2014). The insect mitogenome is a closed, circular duplex molecule ranging in size from 13 to 20 kb (Chai et al. 2012). It contains 13 protein-coding genes (PCGs), two ribosomal RNA-coding genes (rRNAs), 22 transfer RNA-coding genes (tRNAs), and an A + T-rich displacement loop (D-loop) control region (Wolstenholme 1992; Boore 1999; Cameron 2014). With the development of sequencing technology, mitochondrial genomes and genes are often used as molecular markers in studies of comparative and evolutionary genomics, molecular evolution, phylogenetics, and population genetics because of their maternal inheritance, compact structure, lack of genetic recombination, and relatively fast evolutionary rate (Boore 1999; Nardi et al. 2003; Arunkumar et al. 2006). Nevertheless, only approximately 110 mitogenomes of nymphalid species had been deposited in GenBank as of October 11, 2017, and the phylogenetic relationships within and among the nymphalid subgroups have not been fully resolved (Freitas et al. 2004).
At present, the systematics of Nymphalidae acknowledged by most taxonomists includes 12 subfamilies: Heliconiinae, Nymphalinae, Limenitidinae, Charaxinae, Apaturinae, Calinaginae, Satyrinae, Cyrestidinae, Pseudergolinae, Danainae, Biblidinae, and Libytheinae (Zhang et al. 2008). The butterflies of Libytheinae are morphologically one of the most unusual groups of Lepidoptera owing to their snout, and their relationship with other butterfly groups remains unclear (Kawahara 2009). Traditionally, they are considered the basal group of Nymphalidae (Wahlberg et al. 2003). Vanessa indica is a small but beautiful butterfly, and past research on it has concentrated mainly on its feeding habits (Ômura and Honda 2003, 2005). In this study, we determined the complete mitogenome sequence of V. indica and described it by comparison with other lepidopteran mitogenomes, particularly those of nymphalid species. Moreover, we used the concatenated nucleotide sequences of the 13 PCGs to reconstruct the phylogenetic relationships among the members of Nymphalidae.
Materials and methods
Sample collection and DNA extraction
The specimen of V. indica was collected in July 2015 from Yaoluoping Nature Reserve (30°59′2.25″N, 116°4′59.40″E), Anhui Province, China. The sample was photographed and identified by a taxonomist. Genomic DNA was extracted from ethanol-fixed tissues by the standard phenol/chloroform method, and the DNA samples were stored at −20 °C for subsequent PCR amplification.
Primer design, PCR amplification, and sequencing
Using the sequence information of universal primers for V. indica available on the NCBI website (https://www.ncbi.nlm.nih.gov/), several pairs of specific primers were designed to amplify the entire mitochondrial genome (Supplementary 1) (Kambhampati and Smith 1995). All PCRs were performed on a Veriti 96 Well Thermal Cycler (Applied Biosystems, Foster City, CA). Each PCR was performed in a 50 µL reaction volume containing 31.75 µL sterile deionized water, 5.0 µL 10× PCR Buffer (Mg2+ free), 4.0 µL dNTP mixture (2.5 mM each), 4.0 µL MgCl2 (25 mM), 2.0 µL of each primer (10 µM), 1.0 µL DNA template, and 0.25 µL Takara Taq DNA polymerase (5 U/µL) (Shiga, Japan). The cycling conditions were as follows: initial denaturation at 94 °C for 4 min, followed by 35 cycles of 94 °C for 40 s (denaturation), 50–62 °C for 50 s (annealing), and 72 °C for 1–5 min (extension), and a final extension at 72 °C for 10 min. Each amplified product (5 µL) was checked by 1% agarose gel electrophoresis to validate the amplification efficiency. The amplified fragments were purified using the Axygen agarose-out kit (California, USA) and sent to Sangon Biotech Company (Shanghai, China) for sequencing with the same primers used for PCR amplification.
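The reaction mix described above can be checked with simple dilution arithmetic. The following is a minimal sketch, not part of the original protocol: it assumes that "each primer" means one forward and one reverse primer per reaction, and the variable and function names are illustrative only. Under that assumption the listed volumes add up to the stated 50 µL, and the final concentrations follow from C1·V1 = C2·V2.

```python
from fractions import Fraction as F

# Component volumes in µL for one reaction, as listed in the text.
# Assumption: "each primer" = one forward + one reverse primer.
volumes = {
    "water": F("31.75"),
    "10x buffer": F("5.0"),
    "dNTP mixture": F("4.0"),
    "MgCl2": F("4.0"),
    "forward primer": F("2.0"),
    "reverse primer": F("2.0"),
    "DNA template": F("1.0"),
    "Taq polymerase": F("0.25"),
}
total = sum(volumes.values())  # total reaction volume in µL


def final_conc(stock, volume, total_volume):
    """Dilution rule C1*V1 = C2*V2; result is in the unit of the stock."""
    return stock * volume / total_volume


dntp_mM = final_conc(F("2.5"), volumes["dNTP mixture"], total)  # per dNTP
mgcl2_mM = final_conc(F(25), volumes["MgCl2"], total)
primer_uM = final_conc(F(10), volumes["forward primer"], total)
taq_units = F(5) * volumes["Taq polymerase"]  # 5 U/µL stock

print(f"total volume: {float(total)} µL")
print(f"each dNTP: {float(dntp_mM)} mM, MgCl2: {float(mgcl2_mM)} mM")
print(f"each primer: {float(primer_uM)} µM, Taq: {float(taq_units)} U")
```

Exact fractions are used instead of floats so that the volume total can be compared with 50 without rounding concerns.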
Mitochondrial genome annotation
The complete mitochondrial genome sequences of V. indica were examined and assembled with the SeqMan program (DNASTAR 5.0) (Swindell and Plasterer 1997) and subsequently corrected manually. tRNA genes with standard cloverleaf secondary structures were identified with the tRNAscan-SE search server v.1.21 (Lowe and Eddy 1997). tRNA genes that could not be detected by tRNAscan-SE were identified by sequence similarity to the tRNAs of other nymphalids. The rRNA genes were identified by alignment with published nymphalid mitochondrial genome sequences from the NCBI database. The PCGs were determined by sequence similarity in the alignment, and their start and stop codons were identified using an open reading frame (ORF) finder. The resulting sequences and codon usage were analyzed with MEGA 5.0 (Tamura et al. 2011) to confirm the protein-coding genes.
Phylogenetic analyses
The 13 PCGs of 95 nymphalid mitogenomes were downloaded from GenBank as references to reconstruct the phylogenetic relationships within Nymphalidae, and the 13 PCGs of four mitogenomes from Hesperiidae were selected as outgroup taxa. The nucleotide sequences of the 13 concatenated PCGs were aligned with MEGA 5.0 (Tamura et al. 2011). The phylogenetic analyses were conducted using the maximum likelihood (ML) method in RAxML 7.03 (Stamatakis et al. 2008) and Bayesian inference (BI) in MrBayes v.3.0b4 (Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003). According to the Akaike information criterion, the GTR + I + G model was optimal for the nucleotide alignment (Akaike 1974). The BI analysis consisted of two simultaneous runs of 3,000,000 generations each, sampled every 100 generations. Each run had four chains, one cold and three heated. Once the MCMC chains had converged (average standard deviation of split frequencies (SD) < 0.01; potential scale reduction factor (PSRF) ≈ 1), the first 25% of the sampled generations were discarded as burn-in. The resulting phylogenetic trees were visualized in FigTree v1.4 (http://tree.bio.ed.ac.uk/software/figtree/).

Results
Mitogenome organization of V. indica
The V. indica mitogenome (GenBank accession number: MG736927) is 15,191 bp long (Table 1) and contains the complete set of 37 genes (13 protein-coding genes, 22 transfer RNA genes, and two ribosomal RNA genes) found in the ancestral insect mitogenome (Fig. 1), together with an A + T-rich region. Of these, 14 genes are transcribed from the light strand (L-strand), whereas the remaining 23 genes and the A + T-rich region are encoded on the heavy strand (H-strand). The overall gene arrangement (Fig. 1) was essentially identical to that of other published Nymphalidae mitogenomes. Further sequence analysis showed that the arrangement of three tRNAs (tRNAMet, tRNAIle, tRNAGln) in V. indica differed from that found in ancestral insects. The overall base composition of the V. indica mitogenome was 40.1% A, 39.9% T, 12.3% C, and 7.6% G, giving a highly biased A + T content (80.0%). The overall AT-skew and CG-skew were 0.003 and 0.236, respectively. The nucleotide frequency of the H-strand and L-strand was T > A > C > G and A > T > C > G, respectively, showing a relatively strong AT bias. The nucleotide skew statistics for the H-strand (AT-skew = −0.220, CG-skew = 0.153) and L-strand (AT-skew = 0.111, CG-skew = 0.341) showed slight T or A skews and a moderate C skew.

Table 1 Nucleotide composition and skews of V. indica mitochondrial DNA by regions
Gene             A (%)   T (%)   C (%)   G (%)   A + T (%)   AT-skew   CG-skew
ATP6             34.1    42.5    14.7    8.7     76.6        −0.110    0.256
ATP8             42.3    48.8    6.5     2.4     91.1        −0.073    0.461
COX1             32.8    39.3    14.9    13.0    72.1        −0.090    0.068
COX2             36.2    40.5    13.1    10.2    76.7        −0.056    0.124
COX3             33.3    39.7    15.0    12.0    73.0        −0.088    0.111
ND1              48.1    30.0    14.6    7.3     78.1        0.232     0.333
ND2              34.7    49.5    10.1    5.7     84.2        −0.176    0.278
ND3              35.3    45.7    12.0    7.0     81.0        −0.128    0.263
ND4              46.6    33.2    13.5    6.6     79.8        0.168     0.343
ND4L             51.4    31.9    11.8    4.9     83.3        0.234     0.413
ND5              50.3    31.5    12.0    6.2     81.8        0.230     0.319
ND6              35.4    49.8    9.5     5.3     85.2        −0.169    0.284
Cytb             33.7    41.6    15.0    9.7     75.3        −0.105    0.215
12SrRNA          40.7    44.2    10.1    5.0     84.9        −0.041    0.338
16SrRNA          39.8    44.5    10.5    5.1     84.3        −0.056    0.346
Control region   46.0    48.5    4.3     1.2     94.5        −0.026    0.564
L-strand         45.5    36.4    12.2    6.0     81.9        0.111     0.341
H-strand         35.6    42.9    12.4    9.1     78.5        −0.220    0.153
Total            40.1    39.9    12.3    7.6     80.0        0.003     0.236
AT skew = (A% − T%)/(A% + T%); CG skew = (C% − G%)/(C% + G%)

Fig. 1 Map of the circular V. indica mitogenome. Gene names on the outer side of the line indicate genes located on the H-strand, whereas the others are located on the L-strand. Color codes for the different genes are listed on the map
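The whole-genome skew values can be rechecked from the base frequencies in the Total row of Table 1. The sketch below is illustrative only. It uses the AT-skew formula given with the table and assumes, as an inference from the tabulated values rather than a statement from the authors, that the CG-skew column is computed as (C − G)/(C + G). The conventional GC-skew is then the same number with the opposite sign.

```python
def skew(x: float, y: float) -> float:
    """Generic composition skew: (x - y) / (x + y)."""
    return (x - y) / (x + y)


# Whole-genome base frequencies (%) from the "Total" row of Table 1.
total = {"A": 40.1, "T": 39.9, "C": 12.3, "G": 7.6}

at_content = total["A"] + total["T"]
at_skew = skew(total["A"], total["T"])
cg_skew = skew(total["C"], total["G"])  # assumed convention of the table
gc_skew = skew(total["G"], total["C"])  # conventional GC-skew

print(f"A+T content: {at_content:.1f}%")
print(f"AT-skew: {at_skew:.4f}")
print(f"CG-skew: {cg_skew:.3f}")
print(f"GC-skew: {gc_skew:.3f}")
```

With these inputs the AT-skew is about 0.0025, which agrees with the tabulated 0.003 to within the rounding of the one-decimal percentages, and the CG-skew reproduces the tabulated 0.236.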
Protein coding genes
The 13 PCGs of the V. indica mitogenome comprised 3714 codons in total (excluding termination codons) and ranged in length from 168 bp (ATP8) to 1731 bp (ND5). Nine of the 13 PCGs (ND2, COX1, COX2, ATP8, ATP6, COX3, ND3, ND6 and Cytb) were encoded on the H-strand, and the remaining four (ND5, ND4, ND4L and ND1) on the L-strand. The relative synonymous codon usage (RSCU) in the mitogenome of V. indica is shown in Fig. 2 and Table 3. As in other lepidopterans (Cameron and Whiting 2008), all 13 PCGs in the V. indica mitogenome have ATN as their start codon, except for the COX1 gene, which uses CGA instead (Table 2). Many prior studies have reported that the start codon of the COX1 gene is often TTG, ACG, TTA or TTAG (Yamauchi et al. 2002; Kim et al. 2011).
Nine of the 13 PCGs had the complete stop codon TAA, one (COX1) used the truncated stop codon TT, and the other three (COX2, ND4 and ND5) had the incomplete termination codon T (Table 2). Codon usage analysis of the V. indica mitogenome (Table 3; Fig. 2) identified UUU, UUA, AUU, AUA, and AAU as the five most frequently used codons, each used more than 200 times.

Table 2 Characteristics of V. indica mitochondrial DNA genome
Gene name       Coding strand   Start position   End position   Intergenic nucleotides   Overlapping nucleotides   Size (bp)   Start codon   Stop codon
tRNAMet         H               1                68             -                        -                         68          -             -
tRNAIle         H               69               133            -                        -                         65          -             -
tRNAGln         L               131              199            -                        3                         69          -             -
ND2             H               245              1258           45                       -                         1014        ATT           TAA
tRNATrp         H               1257             1325           -                        2                         69          -             -
tRNACys         L               1318             1379           -                        8                         62          -             -
tRNATyr         L               1379             1443           -                        1                         65          -             -
COX1            H               1448             2979           4                        -                         1532        CGA           TT
tRNALeu(UUR)    H               2979             3045           -                        -                         67          -             -
COX2            H               3046             3724           -                        -                         680         ATG           T
tRNALys         H               3722             3792           -                        3                         71          -             -
tRNAAsp         H               3792             3857           -                        1                         66          -             -
ATP8            H               3858             4025           -                        -                         168         ATC           TAA
ATP6            H               4019             4696           -                        7                         678         ATG           TAA
COX3            H               4696             5484           -                        1                         789         ATG           TAA
tRNAGly         H               5487             5553           2                        -                         67          -             -
ND3             H               5551             5907           -                        3                         357         ATA           TAA
tRNAAla         H               5914             5978           6                        -                         65          -             -
tRNAArg         H               5978             6039           -                        1                         62          -             -
tRNAAsn         H               6040             6105           -                        -                         66          -             -
tRNASer(AGN)    H               6104             6163           -                        2                         60          -             -
tRNAGlu         H               6213             6279           49                       -                         67          -             -
tRNAPhe         L               6289             6354           9                        -                         66          -             -
ND5             L               6354             8085           -                        1                         1731        ATT           T
tRNAHis         L               8086             8153           -                        -                         68          -             -
ND4             L               8154             9492           -                        -                         1339        ATG           T
ND4L            L               9494             9781           1                        -                         287         ATA           TAA
tRNAThr         H               9790             9854           8                        -                         65          -             -
tRNAPro         L               9855             9919           -                        -                         65          -             -
ND6             H               9922             10,449         2                        -                         528         ATC           TAA
Cytb            H               10,456           11,607         6                        -                         1152        ATG           TAA
tRNASer(UCN)    H               11,606           11,670         -                        2                         65          -             -
ND1             L               11,669           12,625         -                        2                         957         ATG           TAA
tRNALeu(CUN)    L               12,630           12,696         4                        -                         67          -             -
16SrRNA         L               12,698           14,027         1                        -                         1330        -             -
tRNAVal         L               14,026           14,089         -                        2                         64          -             -
12SrRNA         L               14,090           14,868         -                        -                         779         -             -
D-loop          H               14,866           15,191         -                        3                         326         -             -
Intergenic and overlapping nucleotides are listed against the downstream gene of each adjacent pair
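The start and stop codon pattern summarized above can be tallied directly from the codon columns of Table 2. The sketch below is an illustration, not the authors' annotation procedure. The pairing of each codon with its gene follows the order in which the PCGs and the start/stop codons appear in the table, which is an assumption of this example, and the helper names are hypothetical.

```python
from collections import Counter

# (gene, start codon, stop codon), read from Table 2 in gene order.
PCGS = [
    ("ND2", "ATT", "TAA"), ("COX1", "CGA", "TT"), ("COX2", "ATG", "T"),
    ("ATP8", "ATC", "TAA"), ("ATP6", "ATG", "TAA"), ("COX3", "ATG", "TAA"),
    ("ND3", "ATA", "TAA"), ("ND5", "ATT", "T"), ("ND4", "ATG", "T"),
    ("ND4L", "ATA", "TAA"), ("ND6", "ATC", "TAA"), ("Cytb", "ATG", "TAA"),
    ("ND1", "ATG", "TAA"),
]


def is_atn(codon: str) -> bool:
    """True for the ATN start codons (ATA, ATC, ATG, ATT)."""
    return len(codon) == 3 and codon[:2] == "AT" and codon[2] in "ACGT"


def stop_type(stop: str) -> str:
    """Label a stop signal as complete (TAA/TAG) or incomplete (T, TT, ...)."""
    return "complete" if stop in {"TAA", "TAG"} else "incomplete"


non_atn = [gene for gene, start, _ in PCGS if not is_atn(start)]
stop_counts = Counter(stop for _, _, stop in PCGS)
incomplete = [gene for gene, _, stop in PCGS if stop_type(stop) == "incomplete"]

print("Non-ATN start codons:", non_atn)
print("Stop codon counts:", dict(stop_counts))
print("Genes with incomplete stop codons:", incomplete)
```

Read this way, the table is consistent with the text: COX1 is the only gene without an ATN start, nine genes end in TAA, COX1 ends in TT, and COX2, ND4 and ND5 end in a single T.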
Transfer RNA and rRNA genes
The predicted secondary structures of the tRNAs are shown in Fig. 3. The tRNA genes ranged in length from 60 bp (tRNASer(AGN)) to 71 bp (tRNALys). Fourteen tRNA genes were located on the H-strand and eight on the L-strand. The canonical cloverleaf secondary structure was found in all the tRNA genes except tRNASer(AGN), which lacked the dihydrouridine (DHU) arm, as has been observed in several insects (Lavrov et al. 2000).
Two rRNA genes were identified on the L-strand of the V. indica mitogenome: the 16S rRNA gene, located between tRNALeu(CUN) and tRNAVal, and the 12S rRNA gene, located between tRNAVal and the control region. The 16S rRNA and 12S rRNA genes were 1330 bp and 779 bp long, respectively. The A + T contents were 84.3% for 16S rRNA and 84.9% for 12S rRNA.

Fig. 2 a Codon distribution in the V. indica mitogenome. Numbers on the left refer to the total number of codons. Codon families are provided on the X-axis. b The RSCU in the mitogenome of V. indica. Codon families are provided on the X-axis

Table 3 Codon usage in the protein coding genes of V. indica mitogenome
Codon    Count   RSCU    Codon    Count   RSCU    Codon    Count   RSCU    Codon    Count   RSCU
UUU(F)   351     1.84    UCU(S)   103     2.48    UAU(Y)   167     1.84    UGU(C)   32      1.83
UUC(F)   30      0.16    UCC(S)   17      0.41    UAC(Y)   15      0.16    UGC(C)   3       0.17
UUA(L)   441     4.85    UCA(S)   93      2.24    UAA(*)   0       0       UGA(W)   89      1.87
UUG(L)   20      0.22    UCG(S)   4       0.1     UAG(*)   0       0       UGG(W)   6       0.13
CUU(L)   44      0.48    CCU(P)   57      1.88    CAU(H)   53      1.54    CGU(R)   19      1.43
CUC(L)   4       0.04    CCC(P)   18      0.6     CAC(H)   16      0.46    CGC(R)   0       0
CUA(L)   36      0.4     CCA(P)   46      1.52    CAA(Q)   58      1.93    CGA(R)   34      2.57
CUG(L)   1       0.01    CCG(P)   0       0       CAG(Q)   2       0.07    CGG(R)   0       0
AUU(I)   424     1.84    ACU(T)   78      1.97    AAU(N)   233     1.81    AGU(S)   33      0.8
AUC(I)   37      0.16    ACC(T)   9       0.23    AAC(N)   25      0.19    AGC(S)   1       0.02
AUA(M)   250     1.82    ACA(T)   69      1.75    AAA(K)   81      1.56    AGA(S)   80      1.93
AUG(M)   25      0.18    ACG(T)   2       0.05    AAG(K)   23      0.44    AGG(S)   1       0.02
GUU(V)   79      2.32    GCU(A)   73      2.47    GAU(D)   61      1.88    GGU(G)   65      1.33
GUC(V)   0       0       GCC(A)   8       0.27    GAC(D)   4       0.12    GGC(G)   2       0.04
GUA(V)   52      1.53    GCA(A)   36      1.22    GAA(E)   59      1.69    GGA(G)   101     2.07
GUG(V)   5       0.15    GCG(A)   1       0.03    GAG(E)   11      0.31    GGG(G)   27      0.55
A total of 3714 codons were analysed, excluding all initiation and termination codons. RSCU, relative synonymous codon usage
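The RSCU values in Table 3 can be reproduced from the codon counts. The sketch below uses the standard definition of RSCU (the observed count of a codon divided by the mean count of all synonymous codons in its family). It is a worked illustration on two families from Table 3, not the MEGA implementation the authors used, and it assumes the six leucine codons are treated as a single family, which the tabulated values appear to support.

```python
def rscu(counts: dict) -> dict:
    """RSCU per codon: observed count divided by the family's mean count."""
    total = sum(counts.values())
    n = len(counts)
    if total == 0:
        return {codon: 0.0 for codon in counts}
    return {codon: n * k / total for codon, k in counts.items()}


# Codon counts copied from Table 3.
phe = {"UUU": 351, "UUC": 30}
leu = {"UUA": 441, "UUG": 20, "CUU": 44, "CUC": 4, "CUA": 36, "CUG": 1}

for family in (phe, leu):
    for codon, value in rscu(family).items():
        print(f"{codon}: {value:.2f}")
```

A value above 1 means the codon is used more often than expected under equal usage of its synonyms. The RSCU values within a family always sum to the number of codons in that family, which is a quick consistency check on any row of the table.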
Control region
The A + T-rich region of V. indica spans 326 bp (nt 14,866–15,191) and is located between the 12S rRNA and tRNAMet genes. It has the highest A + T content (94.5%) of any region in the V. indica mitogenome. Some conserved features found in other lepidopteran mitogenomes were also present in the A + T-rich region of V. indica. In particular, the conserved motif 'ATAGA + poly-T' was present, which is considered the origin of N-strand DNA replication (Jiang 2009; Taanman 1999).
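Motifs of this kind (an ATAGA block followed by a poly-T stretch, and the (TA)n microsatellite-like repeats mentioned in the abstract) can be located with regular expressions. The sketch below is a hedged illustration: the sequence is a hypothetical toy string constructed for this example, not the V. indica control region, and the minimum repeat number of eight is simply chosen to match the (TA)8 elements reported.

```python
import re

# Hypothetical toy sequence, NOT the real V. indica A + T-rich region.
# It is built to contain ATAGA + 19 T's and two separate (TA)8 repeats.
toy = (
    "AATTA" + "ATAGA" + "T" * 19
    + "AAATT" + "TA" * 8
    + "CAATT" + "TA" * 8 + "AT"
)

# ATAGA immediately followed by a run of T's; group 1 captures the run.
motif = re.search(r"ATAGA(T+)", toy)
poly_t_len = len(motif.group(1))

# Microsatellite-like elements: at least eight consecutive TA units.
ta_repeats = [len(m.group(0)) // 2 for m in re.finditer(r"(?:TA){8,}", toy)]

print("ATAGA starts at index", motif.start())
print("poly-T length:", poly_t_len)
print("TA repeat units per element:", ta_repeats)
```

On a real sequence the same patterns could be applied to the region between the 12S rRNA and tRNAMet genes; the strand and orientation of the region would need to be checked first, since the motif is usually reported on one specific strand.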
Non-coding and overlapping regions
The mitogenome of V. indica contained 12 intergenic spacers, ranging from 1 to 49 bp in length and totalling 137 bp. Half of these spacers were short (1–5 bp), whereas the others were longer (6–49 bp). The longest intergenic spacer (49 bp) was located between tRNASer(AGN) and tRNAGlu. Gene overlaps ranged in size from 1 to 8 bp. The longest overlap (8 bp) was located between tRNATrp and tRNACys (Table 2); a 7-bp overlap between ATP8 and ATP6 contained the ATGATAA motif; and the overlaps at the remaining 14 positions were shorter than 4 bp.
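Spacer and overlap lengths follow directly from the gene coordinates in Table 2. The sketch below is illustrative (the function name is hypothetical): for two adjacent genes it subtracts the end position of the upstream gene from the start position of the downstream gene, so a positive result is an intergenic spacer and a negative result is an overlap.

```python
def gap(upstream_end: int, downstream_start: int) -> int:
    """Positive: spacer length (bp); negative: overlap length; 0: abutting."""
    return downstream_start - upstream_end - 1


# End/start coordinates of selected adjacent gene pairs from Table 2.
pairs = {
    ("tRNAGln", "ND2"): gap(199, 245),
    ("tRNATrp", "tRNACys"): gap(1325, 1318),
    ("ATP8", "ATP6"): gap(4025, 4019),
    ("tRNASer(AGN)", "tRNAGlu"): gap(6163, 6213),
    ("tRNAHis", "ND4"): gap(8153, 8154),
}

for (upstream, downstream), value in pairs.items():
    if value > 0:
        kind = f"{value} bp spacer"
    elif value < 0:
        kind = f"{-value} bp overlap"
    else:
        kind = "abutting"
    print(f"{upstream} / {downstream}: {kind}")
```

These pairs reproduce the values quoted in the text: the 49-bp spacer between tRNASer(AGN) and tRNAGlu, the 8-bp overlap between tRNATrp and tRNACys, and the 7-bp overlap between ATP8 and ATP6.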
Phylogenetic analysis
Figure 4 shows the results of the two phylogenetic analyses (BI and ML) based on the nucleotide dataset of the 13 PCGs; both analyses yielded the same topology. The tree was constructed using mitogenome sequences from eleven subfamilies of Nymphalidae, namely Heliconiinae, Biblidinae, Limenitidinae, Apaturinae, Nymphalinae, Cyrestinae, Calinaginae, Satyrinae, Charaxinae, Danainae, and Libytheinae. Four clades (Heliconiine, Nymphaline, Satyrine, and Libytheine) with high support were identified. The Heliconiine clade (ML bootstrap percentage, MLBP = 99%; Bayesian posterior probability, BPP = 100%) comprises three subfamilies: Limenitidinae, Heliconiinae, and Biblidinae. This is the first time that the subfamily Heliconiinae (MLBP = 100%; BPP = 100%) has been observed to cluster with Biblidinae (MLBP = 85%; BPP = 100%). The Nymphaline clade (MLBP = 99%; BPP = 100%), which was sister to the Heliconiine clade in our study, also contained three subfamilies: Apaturinae, Nymphalinae, and Cyrestinae. The subfamily Apaturinae (MLBP = 100%; BPP = 100%) was the sister group of the subfamily Nymphalinae (MLBP = 100%; BPP = 100%). Vanessa indica clustered with Kallima inachus, Hypolimnas bolina, Issoria lathonia, and the species of the genus Junonia included in this analysis, forming the tribe Nymphalini. The Satyrine clade (MLBP = 100%; BPP = 100%) consisted of four subfamilies: Calinaginae, Satyrinae, Charaxinae, and Danainae. The tribe Calinagini was composed only of Calinaga davidis. The subfamily Danainae constituted the tribe Danaini and was identified as the sister taxon of Satyrini (MLBP = 100%; BPP = 100%). Our analyses revealed that the Libytheine clade (MLBP = 100%; BPP = 100%) was a monophyletic group in the phylogenetic tree of Nymphalidae and was sister to all the other subfamilies of Nymphalidae.

Fig. 3 Putative secondary structures of the tRNAs of V. indica

Fig. 4 Phylogeny of V. indica. Phylogenetic tree inferred from the nucleotide sequences of the 13 PCGs in the mitogenome. The numbers on the branches indicate posterior probabilities (BI) and bootstrap values (ML). The tree was rooted with the sequences of the outgroup species belonging to the family Hesperiidae
Discussion
This is the first report characterizing the complete mitogenome of V. indica. The V. indica mitogenome had the common lepidopteran gene order without any gene rearrangement (Liu et al. 2017b). Because of asymmetrical mutation pressure on the mitochondrial genome, the nucleotide composition is typically skewed in opposite directions on each strand (Perna and Kocher 1995; Hassasin et al. 2005). The nucleotide composition of the V. indica mitogenome was biased towards A and T (A + T content was 80.0%), with a slightly positive AT-skew (0.003) based on the formula AT skew = (A − T)/(A + T) (Perna and Kocher 1995). This indicates the occurrence of slightly more As than Ts, as in some other lepidopteran species, such as Scirpophaga incertulas (0.131) (Ma et al. 2016). The CG-skew was 0.236 in V. indica, which corresponds to a GC-skew of −0.236 and is consistent with the negative GC-skew typical of lepidopteran insects.
The 12 intergenic spacer regions ranged in size from 1 to 49 bp, with a total length of 137 bp in V. indica; the largest spacer was located between tRNASer(AGN) and tRNAGlu. This intergenic region might be the result of partial duplication of mtDNA and subsequent random loss of the duplicated genes due to replication errors, according to the duplication/random loss model (Boore 2000; Macey et al. 1998; Moritz et al. 1987). Adjacent genes overlapped in 16 regions (42 bp overall), and the 7-bp overlap (ATGATAA) between ATP8 and ATP6 observed in this study has commonly been observed in other species of Lepidoptera (Liu et al. 2008; Zhu et al. 2013), such as Cnaphalocrocis medinalis (Chai et al. 2012), Actias selene (Liu et al. 2012) and Damora sagana (Liu et al. 2017a). Both overlapping and intergenic regions are also present in the mitogenomes of other nymphalid butterflies.
Except for the COX1 gene, all the protein-coding genes of V. indica started with typical ATN codons. Because there is no regular start codon after the last stop codon upstream of the COX1 ORF, the COX1 gene must use an atypical start site (Liu et al. 2008). The sequence alignment revealed that the ORF of the COX1 gene starts at a CGA codon for arginine, as found in other lepidopteran insects (Huang et al. 2016; Shen et al. 2015; Zou et al. 2017). Incomplete stop codons, as observed in the present study, are widespread in insect mitogenomes; the corresponding transcripts can be processed into mature RNA by precise endonucleolytic cleavage that uses the tRNA secondary structures as recognition signals. The single T is then completed to a functional stop codon (TAA) by post-transcriptional polyadenylation during mRNA maturation (Ojala et al. 1980, 1981). From the perspective of evolutionary strategy, partial stop codons are proposed to minimize intergenic spacers and gene overlaps (Hao et al. 2012; Kang et al. 2017).
The A + T content strongly affects the use of degenerate synonymous sites in protein-coding genes and relaxes the pressure on them, probably through more frequent use of NNA and NNU codons (Cheng et al. 2016). In this study, the RSCU values of most codons ending in U or A were greater than 1, as is predominantly the case in nymphalid butterflies. In the A + T-rich region of V. indica, there was a conserved 'ATAGA' motif followed by a 19-bp poly-T stretch, a structural feature common to many other lepidopteran insects (Chai et al. 2012; Fan et al. 2016; Sivasankaran et al. 2017). Furthermore, two microsatellite-like (TA)8 elements were detected in the A + T-rich region of V. indica.
At present, the systematics of Nymphalidae acknowledged by most taxonomists includes 12 subfamilies: Heliconiinae, Nymphalinae, Limenitidinae, Charaxinae, Apaturinae, Calinaginae, Satyrinae, Cyrestidinae, Pseudergolinae, Danainae, Biblidinae, and Libytheinae (Zhang et al. 2008).
The butterflies of Libytheinae are morphologically one of the most unusual groups of Lepidoptera owing to their snout, and their relationship with other butterfly groups remains unclear (Kawahara 2009). Traditionally, they are considered the basal group of Nymphalidae (Wahlberg et al. 2003). In this study, both the ML and BI trees indicated that Libythea celtis is sister to all the other nymphalid species, standing at the base of the entire nymphalid tree. This provides molecular evidence for clarifying the taxonomic status of Libytheinae (Brower 2000). Stichophthalma louisa was originally assigned to the subfamily Morphinae in another taxonomic system (Peña et al. 2011). In our phylogenetic tree, S. louisa clustered first with the species of Satyrinae. The higher classification of Satyrinae has long been controversial. This is also the first time that Calinaginae has clustered with Satyrinae, forming the Satyrine clade, whereas it has been reported to cluster with Charaxinae in previous studies (Zhang et al. 2008; Wahlberg et al. 2009).
Many previous studies have suggested that Biblidinae could be a sister group of Limenitidinae and Apaturinae (Wahlberg et al. 2003, 2009), whereas our phylogenetic analyses revealed that Biblidinae clustered with Heliconiinae and Acraeidae. This topology has not been proposed previously (Freitas et al. 2004; Nylin et al. 2014; Zhang et al. 2008). The systematics of Nymphalidae as reviewed by Chinese taxonomists has been based mainly on morphological characteristics (Chou 1994, 1998; Li and Zhu 1992). In addition, Chou (1994) classified Satyrinae and Danainae as the families Satyridae and Danaidae, respectively. Our phylogenetic analysis revealed that Satyrinae and Danainae clustered together as subfamilies of Nymphalidae. Further evaluation of the phylogenetic relationships within Nymphalidae and among true butterflies will require a larger number of complete mitogenome sequences that encompass more of the taxonomic diversity.
Acknowledgements We wish to express our appreciation to Yun-he Wu, University of China Academy of Science, and Wen-bo Li, Anhui University, for assistance during the experiments and for corrections and comments on this manuscript. This study was sponsored by the Undergraduate student innovation and entrepreneurship training projects of Anhui University (J10118516034). The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.
Compliance with ethical standards
Conflict of interest Youxue Lu declares that she has no conflict of interest. Naiyi Liu declares that she has no conflict of interest. Liuxiang Xu declares that he has no conflict of interest. Jie Fang declares that he has no conflict of interest. Shuyan Wang declares that she has no conflict of interest.
Ethical approval The article does not contain any studies with human subjects or animals performed by any of the authors.
References Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19:716–723 Arunkumar KP, Metta M, Nagaraju J (2006) Molecular phylogeny of silkmoths reveals the origin of domesticated silkmoth, Bombyx mori from Chinese Bombyx mandarina and paternal inheritance of Antheraea proylei mitochondrial DNA. Mol Phylogenet Evol 40:419–427 Boore JL (1999) Animal mitochondrial genomes. Nucleic Acids Res 27:1767–1780 Boore JL (2000) The duplication/random loss model for gene rearrangement exemplified by mitochondrial genomes of deuterostome animals. In: Sankoff D, Nadeau JH (eds) Comparative genomics. Kluwer Academic Publishers, Dordrecht, pp 133–216 Brower AV (2000) Phylogenetic relationships among the Nymphalidae (Lepidoptera) inferred from partial sequences of the wingless gene. Proc R Soc Lond B: Biol Sci 267:1201–1211 Cameron SL (2014) Insect mitochondrial genomics: implications for evolution and phylogeny. Annu Rev Entomol 59:95–117
Cameron SL, Whiting MF (2008) The complete mitochondrial genome of the tobacco hornworm, Manduca sexta (Insecta: Lepidoptera: Sphingidae), and an examination of mitochondrial gene variability within butterflies and moths. Gene 408:112–113
Chai HN, Du YZ, Zha BP (2012) Characterization of the complete mitochondrial genomes of Cnaphalocrocis medinalis and Chilo suppressalis (Lepidoptera: Pyralidae). Int J Biol Sci 8:561–579
Chen MM, Li Y, Chen M, Wang H, Li Q, Xia RX, Zeng CY, Li YP, Liu YQ, Qin L (2014) Complete mitochondrial genome of the atlas moth, Attacus atlas (Lepidoptera: Saturniidae) and the phylogenetic relationship of Saturniidae species. Gene 545:95–101
Cheng XF, Zhang LP, Yu DN, Storey KB, Zhang JY (2016) The complete mitochondrial genomes of four cockroaches (Insecta: Blattodea) and phylogenetic analyses within cockroaches. Gene 586:115–122
Chou I (1994) Monograph of Chinese butterflies (in Chinese). Henan Scientific and Technological Publishing House, Zhengzhou
Chou I (1998) Classification and identification of Chinese butterflies (in Chinese). Henan Scientific and Technological Publishing House, Zhengzhou
Fan C, Xu C, Li JL, Lei Y, Gao Y, Xu CR, Wang RJ (2016) Complete mitochondrial genome of a satyrid butterfly, Ninguta schrenkii (Lepidoptera: Nymphalidae). Mitochondrial DNA A DNA Mapp Seq Anal 27:80–81
Freitas AVL, Brown KS, Schultz T (2004) Phylogeny of the Nymphalidae (Lepidoptera). Syst Biol 53:363–383
Hao JS, Sun QQ, Zhao HB, Sun XY, Gai YH, Yang Q (2012) The complete mitochondrial genome of Ctenoptilum vasava (Lepidoptera: Hesperiidae: Pyrginae) and its phylogenetic implication. Comp Funct Genom 2012:1–13
Hassanin A, Léger N, Deutsch J (2005) Evidence for multiple reversals of asymmetric mutational constraints during the evolution of the mitochondrial genome of metazoan, and consequences for phylogenetic inferences. Syst Biol 54:277–298
Huang ZH, Dai PF, Zhao GF (2016) The complete mitochondrial genome of Heliconius pachinus (Insecta: Lepidoptera: Nymphalidae). Mitochondrial DNA A DNA Mapp Seq Anal 27:1251–1252
Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–755
Jiang ST (2009) Characterization of the complete mitochondrial genome of the giant silkworm moth, Eriogyna pyretorum (Lepidoptera: Saturniidae). Int J Biol Sci 5:351–365
Kambhampati S, Smith PT (1995) PCR primers for the amplification of four insect mitochondrial gene fragments. Insect Mol Biol 4:233–236
Kang XC, Hu YQ, Hu J, Hu LQ, Wang F, Liu DB (2017) The mitochondrial genome of the lepidopteran host cadaver (Thitarodes sp.) of Ophiocordyceps sinensis and related phylogenetic analysis. Gene 598:32–42
Kawahara AY (2009) Phylogeny of snout butterflies (Lepidoptera: Nymphalidae: Libytheinae): combining evidence from the morphology of extant, fossil, and recently extinct taxa. Cladistics 25:263–278
Kim MJ, Kang AR, Jeong HC, Kim KG, Kim I (2011) Reconstructing intraordinal relationships in Lepidoptera using mitochondrial genome data with the description of two newly sequenced lycaenids, Spindasis takanonis and Protantigius superans (Lepidoptera: Lycaenidae). Mol Phylogenet Evol 61:436–445
Lavrov DV, Brown WM, Boore JL (2000) A novel type of RNA editing occurs in the mitochondrial tRNAs of the centipede Lithobius forficatus. Proc Natl Acad Sci USA 97:13738–13742
Li CL, Zhu BY (1992) The profile of butterflies in China (in Chinese). Shanghai Yuandong Press, Shanghai
Liu Y, Li Y, Pan M, Dai F, Zhu X, Lu C, Xiang Z (2008) The complete mitochondrial genome of the Chinese oak silkmoth, Antheraea pernyi (Lepidoptera: Saturniidae). Acta Biochim et Biophys Sin 40:693–703
Liu QN, Zhu BJ, Dai LS, Wei GQ, Liu CL (2012) The complete mitochondrial genome of the wild silkworm moth, Actias selene. Gene 505:291–299
Liu NY, Li N, Yang PY, Sun CQ, Fang J, Wang SY (2017a) The complete mitochondrial genome of Damora sagana and phylogenetic analyses of the family Nymphalidae. Genes Genom. https://doi.org/10.1007/s13258-017-0614-8
Liu QN, Xin ZZ, Zhu XY, Chai XY, Zhao XM, Zhou CL, Tang BP (2017b) A transfer RNA gene rearrangement in the lepidopteran mitochondrial genome. Biochem Biophys Res Commun 489:149–154
Lowe TM, Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25:955–964
Ma HF, Zheng XX, Peng MH, Bian H, Chen MM, Liu YQ, Jiang XF, Qin L (2016) Complete mitochondrial genome of the meadow moth, Loxostege sticticalis (Lepidoptera: Pyraloidea: Crambidae), compared to other Pyraloidea moths. J Asia-Pac Entomol 19:697–706
Macey JR, Schulte JA, Larson A, Papenfuss TJ (1998) Tandem duplication via light strand synthesis may provide a precursor for mitochondrial genomic rearrangement. Mol Biol Evol 15:71–75
Moritz C, Dowling TE, Brown WM (1987) Evolution of animal mitochondrial DNA: relevance for population biology and systematics. Annu Rev Ecol Syst 18:269–292
Nardi F, Spinsanti G, Boore JL, Carapelli A, Dallai R, Frati F (2003) Hexapod origins: monophyletic or paraphyletic? Science 299:1887–1889
Nylin S, Slove J, Janz N (2014) Host plant utilization, host range oscillations and diversification in nymphalid butterflies: a phylogenetic investigation. Evolution 68:105–124
Nymphalidae Systematics Group (2016) The subfamily Danainae. http://www.nymphalidae.net/http://www.Nymphalidae/Danainae/Danainae.htm. Accessed 20 Sept 2016
Ojala D, Merkel C, Gelfand R, Attardi G (1980) The tRNA genes punctuate the reading of genetic information in human mitochondrial DNA. Cell 22:393–403
Ojala D, Montoya J, Attardi G (1981) tRNA punctuation model of RNA processing in human mitochondria. Nature 290:470–474
Ômura H, Honda K (2003) Feeding responses of adult butterflies, Nymphalis xanthomelas, Kaniska canace and Vanessa indica, to components in tree sap and rotting fruits: synergistic effects of ethanol and acetic acid on sugar responsiveness. J Insect Physiol 49:1031–1038
Ômura H, Honda K (2005) Priority of color over scent during flower visitation by adult Vanessa indica butterflies. Oecologia 142:588–596
Peña C, Nylin S, Wahlberg N (2011) The radiation of Satyrini butterflies (Nymphalidae: Satyrinae): a challenge for phylogenetic methods. Zool J Linn Soc 161:64–87
Perna NT, Kocher TD (1995) Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J Mol Evol 41:353–359
Ronquist F, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574
Shen J, Cong Q, Grishin NV (2015) The complete mitochondrial genome of Papilio glaucus and its phylogenetic implications. Meta Gene 5:68–83
Shi QH, Sun XY, Wang YL, Hao JS, Yang Q (2015) Morphological characters are compatible with mitogenomic data in resolving the phylogeny of nymphalid butterflies (Lepidoptera: Papilionoidea: Nymphalidae). PLoS ONE 10:e0124349
Shields O (1989) World numbers of butterflies. J Lepid Soc 43:178–183
Sivasankaran K, Mathew P, Anand S, Ceasar SA, Mariapackiam S, Ignacimuthu S (2017) Complete mitochondrial genome sequence of fruit-piercing moth Eudocima phalonia (Linnaeus, 1763) (Lepidoptera: Noctuoidea). Genom Data 14:66–81
Stamatakis A, Hoover P, Rougemont J (2008) A rapid bootstrap algorithm for the RAxML web servers. Syst Biol 57:758–771
Swindell SR, Plasterer TN (1997) Seqman. Contig assembly. Methods Mol Biol 70:75
Taanman JW (1999) The mitochondrial genome: structure, transcription, translation and replication. Biochim Biophys Acta 1410:103–123
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 28:2731–2739
Wahlberg N, Brower AVZ (2008) Nymphalinae Rafinesque 1815. The tree of life web project. http://tolweb.org/Nymphalinae/12195/2016.02.14. Accessed 14 Feb 2016
Wahlberg N, Weingartner E, Nylin S (2003) Towards a better understanding of the higher systematics of Nymphalidae (Lepidoptera: Papilionoidea). Mol Phylogenet Evol 28:473–484
Wahlberg N, Leneveu J, Kodandaramaiah U, Peña C, Nylin S, Freitas AV, Brower AV (2009) Nymphalid butterflies diversify following near demise at the Cretaceous/Tertiary boundary. Proc R Soc B 276:1–8
Wolstenholme DR (1992) Animal mitochondrial DNA: structure and evolution. Int Rev Cytol 141:173–216
Yamauchi A, Nakashima T, Tokuriki N, Hosokawa M, Nogami H, Arioka S, Urabe I, Yomo T (2002) Evolvability of random polypeptides through functional selection within a small library. Protein Eng 15:619
Zhang M, Cao T, Jin K, Ren Z, Guo Y, Shi J, Zhong Y, Ma E (2008) Estimating divergence times among subfamilies in Nymphalidae. Sci Bull 53:2652–2658
Zhu BJ, Liu QN, Dai LS, Wang L, Sun Y, Lin KZ, Wei GQ, Liu CL (2013) Characterization of the complete mitochondrial genome of Diaphania pyloalis (Lepidoptera: Pyralididae). Gene 527:283–291
Zou ZW, Min Q, Cheng SY, Xin TR, Xia B (2017) The complete mitochondrial genome of Thitarodes sejilaensis (Lepidoptera: Hepialidae), a host insect of Ophiocordyceps sinensis and its implication in taxonomic revision of Hepialus adopted in China. Gene 601:44–55