Chromosomal and mitochondrial diversity in Melitaea didyma complex (Lepidoptera, Nymphalidae): eleven deeply diverged DNA barcode groups in one non-monophyletic species?

Abstract It is generally accepted that cases of species’ polyphyly in COI trees arising as a result of deep intraspecific divergence are negligible, and the detected cases reflect misidentifications or/and methodological errors. Here we studied the problem of species’ non-monophyly through chromosomal and molecular analysis of butterfly taxa close to Melitaea didyma (Esper, 1779) (Lepidoptera, Nymphalidae). We found absence or low interspecific chromosome number variation and presence of intraspecific variation, therefore we conclude that in this group, chromosome numbers have relatively low value as taxonomic markers. Despite low karyotype variability, the group was found to have unexpectedly high mitochondrial haplotype diversity. These haplotypes were clustered in 23 highly diverged haplogroups. Twelve of these haplogroups are associated with nine traditionally recognized and morphologically distinct species Melitaea chitralensis Moore, 1901, Melitaea deserticola Oberthür, 1909, Melitaea didymoides Eversmann, 1847, Melitaea gina Higgins, 1941, Melitaea interrupta Colenati, 1846, Melitaea latonigena Eversmann, 1847, Melitaea mixta Evans, 1912, Melitaea saxatilis Christoph, 1873 and Melitaea sutschana Staudinger, 1892. The rest of the haplogroups (11 lineages) belong to a well-known west-palaearctic species Melitaea didyma. The last species is particularly unusual in the haplotypes we obtained. First, it is clearly polyphyletic with respect to COI gene. Second, the differentiation in COI gene between these mostly allopatric (but in few cases sympatric) eleven lineages is extremely high (up to 7.4%), i.e. much deeper than the “standard” DNA barcode species threshold (2.7–3%). This level of divergence normally could correspond not even to different species, but to different genera. Despite this divergence, the bearers of these haplogroups were found to be morphologically indistinguishable and, most importantly, to share absolutely the same ecological niches, i.e. demonstrating the pattern which is hardly compatible with hypothesis of multiple cryptic species. Most likely such a profound irregularity in barcodes is caused by reasons other than speciation and represents an extraordinary example of intra-species barcode variability. Given the deep level of genetic differentiation between the lineages, we assume that there was a long period (up to 5.0 My) of allopatric differentiation when the lineages were separated by geographic or/and ecological barriers and evolved in late Pliocene and Pleistocene refugia of north Africa, the Iberian and Balkan Peninsulas, the Middle East and Central Asia. We discuss the refugia-within-refugia concept as a mechanism explaining the presence of additional diverged minor haplogroups within the areas of the major haplogroups. We also provide the first record of Melitaea gina in Azerbaijan and the record of Melitaea didyma turkestanica as a new taxon for Russia and Europe.


Introduction
The Melitaea didyma (Esper, 1779) species complex, a group of taxa close to M. didyma (Bryk 1940, Higgins 1941, Kolesnichenko 1999, Kolesnichenko et al. 2011) is widely distributed in the Palaearctic region. This complex exhibits a high level of individual and seasonal variation, although distinction between described taxa and between different populations in wing pattern is often unclear (Higgins 1941, 1955, Lvovsky and Morgun 2007, Oorschot and Coutsis 2014. Simultaneously these butterflies are similar in male and female genitalia structure (Higgins 1941).
The significant reviews of this complex were published by Bryk (1940), Higgins (1941Higgins ( , 1955, Kolesnichenko (Kolesnichenko 1999, Kolesnichenko et al. 2011), Tuzov and Churkin (2000. More recently the whole genus Melitaea Fabricius, 1807 was revised by Oorschot and Coutsis (2014). However, available cytogenetic (Lukhtanov and Kuznetsova 1989), morphological (Lvovsky and Morgun 2007, Kolesnichenko et al. 2011, Oorschot and Coutsis 2014 and molecular (Wahlberg and Zimmermann 2000, Lukhtanov et al. 2009, Dincă et al. 2015 data show that the M. didyma species complex requires a more detailed study. Combination of molecular and cytogenetic methods is a useful tool for detecting cryptic species  and can be a good addition to morphological analysis for ordering complex taxonomic structures (Lukhtanov et al. 2016). In our previous paper we applied analysis of DNA barcodes to demonstrate that M. didyma complex is a monophyletic group and is represented by multiple deeply diverged mitochondrial DNA haplogroups (Pazhenkova et al. 2015).

Material and methods
We studied standard COI barcodes (658-bp 5' segment of mitochondrial cytochrome oxidase subunit I). We obtained COI sequences from 121 specimens collected in Armenia, Bulgaria, Georgia, Greece, Iran, Israel, Kazakhstan, Kyrgyzstan, Russia, Slovenia, Syria and Turkey. DNA was extracted from a single leg removed from each voucher specimen.
Legs from 21 specimens were processed at Department of Karyosystematics of Zoological Institute of the Russian Academy of Sciences. Primers and PCR protocol are given in our previous publications , Pazhenkova et al. 2015. Sequencing of double-stranded product was carried out at the Research Resource Center for Molecular and Cell Technologies of St. Petersburg State University. Legs from 100 specimens of Melitaea were processed at the Canadian Centre for DNA Barcoding (CCDB, Biodiversity Institute of Ontario, University of Guelph) using their standard high-throughput protocol described by deWaard et al. (2008). The set of voucher specimens of butterflies is kept in the Zoological Institute of the Russian Academy of Science (St. Petersburg).
The analysis involved 265 COI sequences (including outgroup) (Suppl. material 1). Among them there were 144 published sequences (Wahlberg and Zimmermann 2000, Vila and Bjorklund 2004, Leneveu et al. 2009, Lukhtanov et al. 2009, Dincă 2011, Hausmann et al. 2011, Ashfaq et al. 2013, Pazhenkova et al. 2015 collected from GenBank. Within the studied samples, we are not completely sure of the identity of M. chitralensis specimens (their barcodes were obtained from GenBank) because we were not able to check these vouchers and used the identification of these samples accepted in Ashfaq et al. (2013). According to Kolesnichenko (1999), M. chitralensis is a member of the M. ala subgroup, but the analysed samples clearly clustered with M. mixta. Therefore, we can not exclude the possibility that these samples represent a north Pakistani population close to M. mixta, but not a true M. chitralensis.
Karyotypes were obtained from fresh adult males and processed as previously described (Vershinina et al. 2015). Briefly, gonads were removed from abdomen and placed to freshly prepared fixative (3:1; 96% ethanol and glacial acetic acid) directly after capturing butterfly in the field. Testes were stored in the fixative for 1 month at +4°C. Then the gonads were stained in 2% acetic orcein for 7-10 days at +18-20°C. Haploid chromosome numbers (n) were counted in meiotic metaphase I (MI) and metaphase II (MII).

Karyotype
The haploid chromosome number n=28 was found in prometaphase I, MI and MII cells of seven studied individuals (Table 1, Fig. 1). All chromosome elements formed a gradient size row. The karyotype contained no exceptionally large or small chromosomes.

Chromosome number variation
The genus Melitaea is known to be characterized by relatively low interspecific chromosome number variation. The representatives of basal clades (see phylogeny in Leneveu et al. 2009 (Federley 1938, de Lesse 1960, Robinson 1971, Larsen 1975, Hesselbarth et al. 1995. These haploid numbers are modal ones not only for Melitaea, but also for the family Nymphalidae and for the order Lepidoptera in whole (Robinson 1971, Lukhtanov 2000, 2014. Most likely, one of them (probably, n=31, see Lukhtanov 2014) represents an ancestral lepidopteran condition preserved in the basal lineages of Melitaea.
The younger lineages, the M. fergana Staudinger, 1882 and M. didyma species groups, were found to possess lower chromosome numbers varying from n=27 to n=29-30. Within the M. fergana species group, M. athene Staudinger, 1881, the only karyologically studied species, was found to have n=29 (with n=30 as a rare intraindividual variation) (Lukhtanov and Kuznetsova 1989). The species-rich M. didyma group consists of three complexes: a complex of taxa close to M. ala, a complex of taxa close to M. persea and a complex of taxa close to M. didyma. Within these complexes the following chromosome numbers were found: n=29 in M. ala (Lukhtanov and Kuznetsova 1989), n=27 in M. persea (de Lesse 1960) and different numbers from n=27 to n=29-30 in species of the M. didyma complex (Table 3).
Together with M. deserticola (n=29, Larsen 1975), M. gina occupies a basal position within the M. didyma complex (Fig. 6). Therefore analysis of M. gina was crucially important for understanding chromosome number evolution in this complex. Our study revealed M. gina to have n=28, a number previously observed in M. didyma from Italy (de Lesse 1960) and M. didyma neera from the Kazakh Altai (Lukhtanov and Kuznetsova 1989). Taking into account absence or relatively low level of interspecific chromosome number variation in the M. didyma complex and presence of intraspecific variation (Table 3), we conclude that in this group chromosome numbers have relatively low value as taxonomic markers (but see: Lukhtanov and Kuznetsova 1989).

DNA barcode haplogroups and problem of non-monophyletic species
Despite low level of chromosome number variability, the M. didyma complex was found to have unexpectedly high level of mitochondrial haplotype diversity. These haplotypes were clustered in 23 highly diverged haplogroups (Fig. 2). 12 of these haplogroups are associated with nine traditionally recognized and morphologically distinct species The rest of the haplogroups belong to the well-known west-palearctic species M. didyma. Despite intrapopulation and seasonal variability, this species is very homogenous with respect to morphology, including the structure of genitalia, a character which is most useful for species separation in Melitaea (Suschkin 1913, Higgins 1941, Oorschot and Coutsis 2014. In accordance with this homogeinity, in the recent revision (Oorschot and Coutsis 2014) all populations of this species, except for Central Asian populations, were considered as members of the same subspecies M. didyma didyma. The populations from Central Asia were treated by Oorschot and Coutsis (2014) as a separate subspecies M. didyma turkestanica.
If we follow the opinion of experts in Melitaea taxonomy (Kolesnichenko et al. 2011, Oorschot andCoutsis 2014) and accept the traditional taxonomic treatment of the species M. didyma, we should acknowledge that this species is particularly unusual in the haplotypes we obtained. First, it is clearly polyphyletic with respect to COI gene, and  (Figs 2-6). Second, the number of distinct COI lineages within M. didyma is unusually high (11 lineages) and their genetic differentiation is extreme. The majority of these haplogroups are allopatric, but some of them have sympatric (neera/neera1, turkestanica/ turkestanica2, turkestanica/turkestanica3, turkestanica/turkestanica4) or partially sympatric (neera/turkestanica, occidentalis/didyma) distribution. The mean uncorrected pairwise distances between the lineages is up to 7.4% if the lineages turkestanica3 and turkestanica4 are considered ( Table 2). The lineages turkestanica3 and turkestanica4 are the most diverged lineages of M. didyma. Together with gina2, on the tree (Fig. 2)   but with a very low support (0.54). However, even if the lineages gina2, turkestanica3 and turkestanica4 are not considered, the distances between M. didyma groupings remains high, up to 4.1% between turkestanica2 and liliputana, i.e. much deeper than the "standard" DNA barcode species threshold (2.7-3%) (Hebert et al. 2003, Lukhtanov et al. 2016).
There are two theoretically possible explanations for this pattern. First, M. didyma sensu auctorum can be a mix of multiple species that mostly have allopatric distribution ranges, but some of them are sympatric. Second, the recovered haplogroups (at least the allopatric ones) can represent highly diverged intraspecific lineages. Of course, a combination of the first and the second hypotheses is possible, and a part of the haplogroups could represent different species, and another part of the haplogroups could represent intraspecific variations.
In our opinion, the second hypothesis seems to be more plausible. There are the following arguments for the second scenario. First, no morphological differences between the bearers of these haplogroups are known (except for lighter, more yellowish wing colour in the three M. didyma turkestanica haplogroups as compared with other haplogroups). The second (and the most convincing) argument is based on our field obseravtion of butterfly habitats and ecological preferences. In ecology the competitive exclusion principle, also known as Gause's law is one of the most important rule (Gause 1934, Hardin 1960 ). This was not a case for sympatric haplogroups neera/ neera2, turkestanica/turkestanica, turkestanica/turkestanica3 and turkestanica/turkestan-ica4 (Fig. 8). The bearers of these haplogroups were not only morphologically identical, but also were found to fly exactly syntopically and synchronously. This pattern is hardly compatible with non-conspecifity of these haplogroups.
M. didyma neera and M. didyma turkestanica are differentiated ecologically (Pazhenkova et al. 2015), however, there was no ecological separation between bearers of the neera and turkestanica haplogroups in cases of their sympatry. In Samara and Aktobe, where the haplogroup neera was predominant, both haplogroups were found in M. didyma neera biotope (steppe), and in Karabiryuk where the haplogroup turkestanica was predominant, both haplogroups were found in M. didyma turkestanica biotope (desert) (Fig. 8). This pattern corresponds more to a result of haplotype introgression than to co-habitation of two ecologically differentiated species.
Interestingly, the haplogroup turkestanica2 is not related to the haplogroup turkestanica and is a derivative from West-European haplogroup didyma. This pattern can be treated as a result of ancient introgression. Generally, footprints of ancient and more recent introgression are both an evidence for transparency of boundaries between M. didyma populations.
The mega-analysis of species-level para-and polyphyly in DNA barcode gene trees was recently conducted by using a huge data set (4977 species and 41,583 specimens of European Lepidoptera) (Mutanen et al. 2016), however without in-depth-analyses of particular cases. This study resulted in conclusion that cases of species' polyphyly in COI trees arising as a result of deep intraspecific divergence were negligible, and the detected cases reflected misidentifications or/and methodological errors. Despite this, our analysis demonstrates that species-level polyphyly in DNA barcode based on deep intraspecific divergence may be a real phenomenon.

Distribution ranges and phylogeography
The M. didyma complex consists of at least 23 COI haplogroups, the majority of which demonstrated a strict attachment to particular geographic ranges: chitralensis (north Pa-kistan); deserticola (north Africa, Israel, Jordan, Lebanon, Syria); didyma (west Europe); didymoides (Asian Russia, Mongolia, North China); gina (W Iran, Azerbaijan); interrupta (Caucasus, NE Turkey); latonigena (Asian Russia, north-east Kazakhstan, Mongolia, north-west China); liliputana (Armenia, Turkey, Syria, Lebanon, Israel); mauretanica (south Spain); mixta (Tajikistan, Kyrgyzstan, Uzbekistan, Pakistan, Afghanistan); neera (east Europe, north Caucasus, west Siberia, north Kazakhstan); occidentalis (Spain); protaeoccidentis (north Africa); saxatilis (north Iran); sutschana (Russian Far East, Korea, north-east China) and turkestanica (Kazakhstan, Kyrgyzstan, Uzbekistan, Tajikistan, west China). With few exceptions (e.g. deserticola/protaeoccidentis, deserticola/liliputana), the ranges of these haplogroups do not overlap substantially (Fig. 7), and we hypothesize that mitochondrial diversity was formed in allopatry. Given the deep level of genetic differentiation between the lineages, we assume that there was a long period of allopatric differentiation when the lineages were separated by geographic or/and ecological barriers. Under generally accepted maximum 2.3% (Brower 1994) and minimum 1.5% uncorrected pairwise distance per million years (Quek et al. 2004) for COI sequence of various arthropod taxa, this period can be estimated to be as long as 0.5-5.0 My. In our opinion, this is an evidence that each of these haplogroups evolved in one of the main west-palaearctic late Pliocene and Pleistocene refugia in north Africa (protaeoccidentis, deserticola), the Iberian Peninsula (occidentalis, mauretanica), the Balkan Peninsula (neera), the Middle East (liliputana, saxatilis, gina) and Central Asia (turkestanica, mixta, chitralensis). The presence of additional diverged minor haplogroups neera2, turkestanica2, turkestanica3, turkestanica4, gina2, which could originate allopatrically in small isolated spots, but currently exist in secondary sympatry with major haplogroups neera, turkestanica and gina, agrees well with the refugia-within-refugia concept (Gòmez andLunt 2007, Karaiskou et al. 2014). Interestingly, the area of the most diverged haplogroup turkestanica3 is close to the area of the recently described subspecies M. didyma carminea (Kolesnichenko et al. 2011).

Taxonomic interpretation
We tentatively suggest interpreting the main clusters discovered within M. didyma sensu stricto (M. didyma didyma, M. didyma mauretanica, M. didyma occidentalis, M. didyma protaeoccidentis, M. didyma liliputana, M. didyma neera and M. didyma turkestanica) as subspecies because each of them has its own distribution range and is distinct with respect to mtDNA (i.e. represents by a monophyletic lineage or a combination of two or three monophyletic lineages). As a result we propose the following classification: