﻿Karyotype characteristics and gene COI sequences of Chironomusbonus Shilova et Dzhvarsheishvili, 1974 (Diptera, Chironomidae) from the South Caucasus (Republic of Georgia, Paravani river)

﻿Abstract


Introduction
Ch. plumosus has a very wide distribution range from Western Europe to the Far East (Linevich and Sokolova 1983). However, karyological analysis has shown that Ch. borokensis and Chironomus sp. prope agilis replace Ch. plumosus in Eastern Siberia and the Far East (Kiknadze et al. 1996;Golygina et al. 2003). As indicated before, the karyotype study is the only reliable method for recognizing species in this group.
The karyotype of Ch. bonus has been described by Kerkis et al. (1989) and Kiknadze et al. (1991a). A short communication about the Ch. bonus karyotype was presented by Belyanina (1983). Some information on the karyotype and external morphology of Ch. bonus from Bulgaria was given by Michailova (1994). The biggest DNA databases, GenBank and BOLD, do not contain any DNA data on Ch. bonus, including sequences of the COI gene.
The aim of the work is to present the description of the karyotype and gene COI sequences of Ch. bonus from the South Caucasus, as well as to compare the karyotype characteristics and DNA data of Ch. bonus with the corresponding information available for other species of the Ch. plumosus group.

Methods
For both DNA and karyological studies, we used fourth-instar larvae of Ch. bonus. We collected larvae from a particular site in the Republic of Georgia (South Caucasus): 18.07.17, 41°19.305'N, 43°45.563'E, Ninotsminda district in the region of Samtskhe-Javakheti, one of the branches of the Paravani river, just 0.6 km north of Saghamo settlement, altitude of ca. 2000 m a.s.l. The maximum depth of the river is about 1 m, and the salinity of the water is about 40 ppm. The collection site is marked on the map with a dark circle (Fig. 1). The geographic division of the Caucasus follows Gvozdetsky (1963). The head capsules and bodies of six larvae were slide-mounted in Faure-Berlese medium. The specimens have been deposited at the Tembotov Institute of Ecology of the Mountain Territories RAS in Nalchik, Russia. We studied the karyotype of all six larvae from the Caucasus region.
For karyological study, we fixed the larvae in an ethanol-glacial acetic acid solution (3:1). The preparations of the chromosomes were made using the ethanol-orcein technique (see Dyomin and Ilyinskaya 1988;Dyomin and Shobanov 1990). The banding sequences were designated as per the accepted convention, specifying the abbreviated name of the species, the symbol of the chromosome arm, and sequence number, as h'bonA1, h'bonB1, etc. (Keyl 1962;Wülker and Klötzli 1973).
We performed the identification of chromosome banding sequences for arms A, E, and F using photomaps by Kiknadze et al. (1991aKiknadze et al. ( , 2016 in the system of Keyl (1962) and chromosome mapping for arms C and D as per Kiknadze et al. (1991aKiknadze et al. ( , 2016 in the system of Dévai et al. (1989). The chromosome preparations were studied using a Carl Zeiss Axio Imager A2 microscope.

DNA extraction, amplification and sequencing
We used four karyologically studied larvae of Ch. bonus for further DNA extraction. DNA was extracted from the larvae and preserved in 96% ethanol using a Diatom DNA Prep 100 kit (Izogen Laboratory Ltd, Moscow, Russia) according to the manufacturer's protocol. DNA extraction was performed on vacuum-dried samples without prior homogenization. Samples were incubated in a lysis buffer at a temperature of 55.5 °C for 16 h. After the extraction, the head capsules were retrieved for dry mounting. The resulting DNA solutions were stored at -18 °C. The amplification of the mitochondrial COI gene was conducted using the MasterMix X5 kit (Dialat Ltd, Moscow).
To amplify the mitochondrial COI gene's barcoding region, primers 911 (5´-TTTCTACAAATCATAAAGATATTGG-3´) and 912 (5´-TAAACTTCAGGGTGACCAAAAAATCA-3´) (Folmer et al. 1994) were used. PCR was performed in a 25-µL reaction volume. The amplification profile consisted of an initial step of 95 °C for 5 min, followed by 45 cycles of 95 °C for 30 s, 50 °C for 30 s, and 72 °C for 50 s, and finally an 8-min extension step at 72 °C, a final elongation at 72 °C (8 min), and final storage at 4 °C. The resulting PCR products were purified by precipitation in a 0.15 M CH 3 COONa solution in 90% ethanol and then rinsed with 70% ethanol. The results were visualized by 1.5% agarose gel electrophoresis with ethidium bromide.
Purified PCR products were sequenced in both directions. DNA sequencing of the COI gene was performed according to Sanger using the BigDye Terminator v3.1 commercial kit (ThermoFisher) and the ABI 3130×l genetic analyzer (ThermoFisher) at Syntol JSC (Moscow, Russia). The GenBank accession numbers of the three sequences obtained in this study are MZ014021, MZ014022, and MZ014023.
We found some COI gene data in both the GenBank and BOLD databases only for seven species of the Ch. plumosus group out of 14. We used in our study COI gene sequences from both the aforementioned databases for Ch. balatonicus, Ch. muratensis, Chironomus sp. prope agilis, Ch. borokensis, Ch. usenicus, Ch. entis, and Ch. plumosus, with available data for species with Holarctic and Nearctic distributions. The most abundant data on the COI gene are available for Ch. plumosus (GenBank and BOLD -66 and 138 sequences, respectively) and Ch. entis (GenBank and BOLD -339 and 13 sequences, respectively). DNA sequences of Ch. plumosus obtained from material collected from both Western and Eastern Europe, the Middle East, the Far East, and Northern America were included into the analysis. Concerning Ch. entis, available DNA sequences are more uniform and were obtained from material collected almost exclusively from Northern America (Canada). In cases when a large number of sequences were available from the same region, we used no more than 5-6 sequences with different haplotypes to avoid overloading the phylogenetic tree.
We conducted the alignment of COI sequences with MUSCLE with a genetic code of "invertebrate mitochondrial" packaged in MEGA 6 (Tamura et al. 2013). The pairwise sequence distances (Tables 1-4) consisting of the estimated number of base substitutions per site using MEGA 6 and the K2P model (Kimura 1980) were calculated. The analysis involved 61 nucleotide sequences. The codon positions included were 1 st +2 nd +3 rd +Noncoding. All positions containing gaps and missing data were eliminated. There was a total of 579 positions in the final data set.
We conducted the estimation of phylogenetic relationships in BEAST V1.10.4 (Suchard et al. 2018) by the Bayesian Markov-chain Monte-Carlo (MCMC) method, using the HKY+G substitution model as selected in MEGA 6. The determination of Table 1. Estimates of evolutionary divergence between sequences of Palearctic Ch. plumosus cluster. The number of base substitutions per site (%) from between sequences are shown. Analyses were conducted using the Kimura 2-parameter model (Kimura 1980).   (Tamura et al. 2013) was performed. The strict clock as a clock model and the Yule process as a speciation model were used. We run MCMC for 10.000.000 iterations and 1000 iterations of burn in. Our analysis involved 61 nucleotide sequences, and we eliminated all positions with less than 95% site coverage. There were 579 positions in the final data set. We used the COI sequence of Pagastiella orophila (Genbank accession number JN265047.1) as an outgroup.
We also tried to get average estimates of divergence time between different branches and clusters that appear on the obtained phylogenetic tree (Figs 3, 4). The age of the most recent common ancestors (TMRCAs) for DNA clades was estimated in BEAST V1.10.4 (Suchard et al. 2018) by the MCMC method, using the HKY+G substitution model as selected in MEGA 6. We used a strict clock as a clock model and a constant size as a coalescent model, with the same calibration point assumed by Cranston et al. (2012). The time estimate of 36 million years ago (Mya) for the root node of the divergence between Table 2. Estimates of evolutionary divergence between sequences of Nearctic Ch. entis-Ch. plumosus cluster. The number of base substitutions per site (%) from between sequences are shown. Analyses were conducted using the Kimura 2-parameter model (Kimura 1980).   Pagastiella orophila and all Chironomus species was used as a calibration point. We ran MCMC for 10.000.000 iterations and 1000 iterations of burn in. Tracer v1.7.1 was used to examine the BEAST log file and ESSs for each parameter, which were all > 200. Recent research demonstrates that the range of divergence rates of the COI gene sequence in insects varies from 1.5% to 2.3% per 1 Mya (Jamnongluk et al. 2003;Stevens et al. 2006 etc.). In the study of tenebrionid beetles, Papadopoulou et al. (2010) obtained a divergence rate of 3.54% per 1 Mya for the COI gene (2.69% when combined with the 16S rRNA gene) under the preferred partitioning scheme and substitution model selected using Bayes factors. In our study, we used for calculations of divergence time these three commonly assumed mutation rates: 1.5%, 2.3%, and 3.54%. We calculated TMRCAs for the nodes 1-9 of the phylogenetic tree (Fig. 3). The obtained values are given in Table 5.

Results
Based on morphological and chromosomal characters, we identified the larvae belonging to the genus Chironomus at the studied site as Ch. bonus. The morphology of Ch. bonus larvae from the South Caucasus is similar to that previously described for this species by Kiknadze et al. (1991b).

Karyotype of Ch. bonus from the South Caucasus
The diploid number of chromosomes in the Ch. bonus karyotype is 2n = 8 plus the B-chromosome. Such a picture for the C. bonus karyotype is based on the almost constant presence of an additional B-chromosome in the karyotype of each larva. The chromosome arm combinations are AB, CD, EF, and G (the "thummi" cytocomplex) (Fig. 2). The chromosomes AB and CD are metacentric, EF is submetacentric, and G is telocentric. Arm G homologues are paired in the nucleolus and Balbiani rings (BRs) regions. The centromeric bands are easily identifiable. There is one nucleolus and two BRs on the arm G, and one BR is present on the arm B.

Banding sequences and chromosomal polymorphism of Ch. bonus from the South Caucasus
The karyotype of Ch. bonus from the South Caucasus is monomorphic. The banding sequences of all the chromosome arms of Ch. bonus are identical to those of Ch. plumosus.

Results of phylogenetic analysis of COI gene sequences of Ch. bonus and estimated ages of the most recent common ancestors (TMRCAs) for DNA clades
Overall, we successfully obtained three complete COI gene sequences of Ch. bonus from six larvae from the South Caucasus. (MZ014021.1: 627 bp, base composition is 25.99% A, 36.84% T, 16.91% G, and 20.26% C; MZ014022.1: 658 bp, base composition is 26.59% A, 36.17% T, 16.57% G, and 20.67% C; MZ014023.1: 650 bp, base composition is 27.08% A, 35.38% T, 16.77% G, and 20.77% C). Each of the three sequences had a different haplotype. This is the first DNA data obtained for Ch. bonus.
The resulting phylogenetic tree (Fig. 3) represents a very complex pattern, with several obvious clusters with rather high probabilities. We conditionally named them the Palearctic Ch. plumosus cluster, the Far Eastern Ch. borokensis-Ch. suwai cluster, and the Nearctic Ch. entis-Ch. plumosus cluster.
The Palearctic Ch. plumosus cluster (Figs 3, 4), is formed mostly by Ch. plumosus sequences from Western and Eastern Europe (UK, Sweden, Poland, Montenegro, and the European part of Russia). The only available sequences of Ch. usenicus from Russia (Saratov Terr.) and, surprisingly, sequences of Ch. bonus obtained in this study, are also included in this cluster. It is formed by sequences obtained from material identified through both karyological and morphological analyses (all Ch. usenicus and Ch. bonus sequences, together with a few Ch. plumosus ones, i.e., JN016830.1, JN016829.1, AB740262.1, AB740263.1), and we therefore named it the Palearctic Ch. plumosus cluster.
The Far Eastern Ch. borokensis-Ch. suwai cluster mostly formed by Ch. plumosus sequences from the Far East (China, South Korea, and Japan) and a sequence of Ch. borokensis from Russia. We named this branch as the Far Eastern Ch. borokensis-Ch. suwai cluster because this particular Ch. borokensis sequence (AB740261.1) was obtained from the material identified through karyological analysis (Kondo et al. 2016). According to the BOLD database, Ch. plumosus sequences from the Far East were obtained from specimens identified only through morphological analysis. Perhaps the observed picture is an error in species identification, which can happen quite often when chromosomal analysis is not involved, and at least some of these Ch. plumosus specimens from the Far East could actually be Ch. borokensis. On the other hand, at least two Japanese sequences that we used in our study from Lake Suwa could be Ch. suwai. We assume this because Lake Suwa is the type locality for the species. According to Golygina et al. (2003), the karyotype of Ch. suwai is closely related to that of Ch. borokensis as indicated by many common banding patterns, but it differs by the much smaller size of the centromeric bands. Also, Golygina et al. (2003) suppose that the true Ch. plumosus does not occur in Japan.
Almost the same pattern is observed in the Nearctic Ch. entis-Ch. plumosus cluster, consisting equally of Ch. plumosus and Ch. entis sequences. According to the data from Proulx et al. (2013), just two sequences of Ch. entis (KF278213.1 and KF278212.1) and three sequences of Ch. plumosus (GBDPC133-14/KF278209.1, GBDPC138-14/KF278210.1 and GBDPC144-14/KF278216.1) from this cluster were obtained from material identified through karyological analysis. Except for these sequences, it is most likely an error in species identification as well, and at least some of the Ch. plumosus sequences presented in BOLD from Northern America could actually be Ch. entis and vice versa. Also, in this cluster, there are no Ch. plumosus or Ch. entis sequences from the Palearctic region. Given all this data, we named this cluster the Nearctic Ch. entis-Ch. plumosus cluster.
In addition to the above-mentioned obvious clusters of the tree (Fig. 3), there is a fourth ambiguous cluster formed by Ch. balatonicus, Ch. muratensis, Chironomus sp. prope agilis, and all Ch. plumosus sequences from Finland. We know from the BOLD database that these Ch. plumosus sequences were obtained from adults identified just through morphological analysis. Perhaps this pattern is an error in species identification, and these Finnish specimens could actually belong to other already known or even previously undescribed species.

Genetic distances
Calculated pairwise sequence distances (Tables 1-4) consisting of the estimated number of base substitutions per site using the K2P model (Kimura 1980) show an interesting pattern. Proulx et al. (2013), who used genetic, morphological, and karyological information to discriminate Chironomus species from Canada, showed that intraspecific K2P distances for Chironomus species characterized by the COI gene ranged from zero to 3%. These values also could be used as a reference for distinguishing Chironomus species in the present work, but data on the COI gene should be complemented with other methods. In our study, the distances between the Palearctic Ch. plumosus cluster's sequences are less than 3% and range from 0 to 2.595% (Table 1). If data on the Iranian Ch. plumosus are removed, the distances between all other sequences in this cluster are less than 1%, ranging from 0 to 0.914%. Distances between the sequences of Ch. bonus obtained in this study are very small, varying from 0.182% to 0.364%. The sequences of Ch. usenicus and of several individuals of Ch. plumosus from Russia (Saratov Terr.), Sweden, and Montenegro are closest to those of Ch. bonus in terms of distances.
Almost the same pattern is observed in the Nearctic Ch. entis-Ch. plumosus cluster, where the distances between the sequences are also lower than the 3% range, varying from 0 to 1.285% (Table 2).
In the Far Eastern Ch. borokensis-Ch. suwai cluster, the distances between the sequences are also lower than the 3% range, varying from 0 to 2.217% (Table 3). If we disregard Ch. plumosus sequence MN750315.1 from China, Hongze, Jiangsu, the distances between all other sequences in this cluster are significantly less, reaching only 1.839%.
At the same time, the average distances between the various clusters exceed the 3% threshold. The distance between Palearctic Ch. plumosus and Far Eastern Ch. borokensis-Ch. suwai clusters is 3.55%. The distance between Palearctic Ch. plumosus and Nearctic Ch. entis-Ch. plumosus clusters is 3.75%. Finally, the distance between Far Eastern Ch. borokensis-Ch. suwai and Nearctic Ch. entis-Ch. plumosus clusters is 5.98%.
In the fourth cluster, which contains Ch. plumosus sequences from Finland, the distances between the sequences are generally higher than the 3% range (Table 4). Interestingly, the distances between the sequence of Chironomus sp. prope agilis and all other sequences are pretty high, varying from 4.123 to 8.357%. On the other hand, analogous distances in the case of Ch. muratensis are also fairly high, varying from 4.119 to 8.115%. However, the distances between the sequence of Ch. balatonicus and most of the Finnish sequences of Ch. plumosus are also high enough, varying from 3.372 to 3.555%. At the same time, the distance between the sequence of Ch. balatonicus and one Finnish sequence of Ch. plumosus from Regio aboensis (LEFIJ3948-16) is just 0.547%, which is much lower than the 3% range, and we therefore can assume that this Ch. plumosus sequence could actually belong to Ch. balatonicus. Moreover, the distances between the three other Ch. plumosus sequences from Finland, i.e., that from Regio aboensis (LEFIJ3947-16) and two from Satakunta (CHIFI299-16 and CHIFI298-16), are relatively high, varying from 3.555 to 3.939%. At the same time, the distance between the two latter sequences is just 1.099%, which is lower than the 3% threshold. Considering the tree topology (Fig. 3) and genetic distances between the sequences, we can suggest that a particular sequence from Regio aboensis, on the one hand, and another two sequences from Satakunta, on the other hand, belong to two different, possibly previously undescribed species. This assumption is quite possible because Michailova (2001) found in Finland (Lake Arima and Lokka Reservoir) two unknown karyotypes similar to those of the Ch. plumosus group. She proposed that at least one of these karyotypes could correspond to Ch. coaetaneus Hirvenoja, 1998(Hirvenoja 1998, which may be related to Ch. plumosus. Some sequences in the Palearctic Ch. plumosus cluster initially were not complete, and it was hard to make a good comparison. But still, we found a small number of substitutions that distinguish the sequences of Ch. bonus from other sequences in the Palearctic Ch. plumosus cluster. Overall, we found six substitutions of that kind ( Table 5). Four of them are nonsynonymous substitutions, and the remaining two are synonymous ones. Among them, there is a single unique 340-position substitution that was found in all three sequences of Ch. bonus. All other substitutions are also found in certain sequences from other clusters. Only this unique substitution clearly distinguishes Ch. bonus from other species in our entire data set.

Discussion
Studied larvae of Ch. bonus have a monomorphic karyotype, with its details similar to those previously described for this species by Kiknadze et al. (1991a). Following Proulx et al. (2013), we can conclude that the genetic distances between observed Palearctic Ch. plumosus, Far Eastern Ch. borokensis-Ch. suwai, and Nearctic Ch. entis-Ch. plumosus clusters exceed the 3% range. This result leads us to some interesting conclusions about the level of divergence between the Palearctic and Nearctic populations of Ch. plumosus.
Our calculations show that the distance (3.75%) between the Palearctic and Nearctic sequences of Ch. plumosus exceeds the 3.0% range for Chironomus species. One can say that since the divergence time of 2.88-1.72 Mya ( Fig. 3; Table 6, node 3), the Nearctic populations of Ch. plumosus have already become a separate species.
We can propose two possible explanations for the observed pattern within the Palearctic Ch. plumosus cluster (Fig. 4), which also included the Ch. bonus sequences obtained during this study. The first explanation is similar to that earlier suggested by Guryev and Blinov (2002), who found that populations of Ch. entis and Ch. plumosus did not group together on the trees based on the mitochondrial cytb gene according to their species affiliation. They suggested that it could result from interspecific hybridization followed by recurrent crosses. Consequently, the offspring inherited mtDNA of one of the parental species. In this case, even an insignificant selective advantage of this mtDNA is able to lead to the rapid fixation of the new haplotype in the population. Later, Polukonova et al. (2009) in the work where they studied the COI sequences of cytologically identified Ch. usenicus, also inclined to this explanation when some Ch. usenicus and Ch. plumosus COI gene sequences were almost identical. In addition, Proulx et al. (2013) reported that the COI The observed pattern also can be explained by a relatively recent separation of the two species, with Ch. plumosus being a parental species to Ch. bonus. The COI gene sequences of these species are therefore very similar, with a very low number of new substitutions in the Ch. bonus lineage. However, we discovered a number of substitutions that clearly distinguish Ch. bonus from Ch. usenicus and Ch. plumosus from European populations (Table 5).
We can assume that the Ch. plumosus group originated from the common ancestor during the Pliocene of 5.75-3.43 Mya. However, since we have certain DNA data only for seven species of the Ch. plumosus group out of 14, this temporary estimate could change in the future in favor of the older age. At the same time, the obtained age of the most recent common ancestor of the Ch. plumosus group corresponds rather well to the estimations by Demin and Polukonova (2008) (5.8-3.7 Mya), despite the substantially lower amount of data available for those authors.
We can be more confident about the age of the most recent common ancestors of species constituting the Palearctic Ch. plumosus, Far Eastern Ch. borokensis-Ch. suwai, and Nearctic Ch. entis-Ch. plumosus clusters. It is possible that the age of the Palearctic Ch. plumosus, Far Eastern Ch. borokensis-Ch. suwai, and Nearctic Ch. entis-Ch. plumosus clusters is 0.638-0.378, 0.906-0.539 and 1.288-0.759 million years (Myr) respectively. The age of European populations of Ch. plumosus is approximately 0.517-0.307 Myr. We therefore suggest that observed clusters have arisen relatively recently in the Middle Pleistocene sub-epoch.
We concluded that the most recent common ancestor of the Ch. plumosus group originated in the Pliocene epoch (5.3-2.58 Mya). It is known that this epoch is characterized by the appearance of a new type of biome, the first true grasslands, due to the retreat of the forests associated with the gradual cooling of the climate that began in the previous epochs. True grasslands and Serengeti-like communities of grazing animals probably did not appear until the Late Miocene in the New World and the Pliocene in the Old World (ca. 5 Mya) (Pärtel 2005).
Due to the heterogeneity of the landscapes, new stagnant water bodies became increasingly abundant. In contrast to lowland rivers, which usually have similar environmental parameters, each of these stagnant water bodies was often characterized by a unique combination of size, shape, depth, temperature profile, mineralization level etc. This variation in environmental parameters could easily lead to differences in breeding time between various populations or individuals that can potentially lead to reproductive isolation and the emergence of new species. We suggest that the species divergence in this group could have been caused by invasion of their common ancestor into newly originated water bodies.

Data availability statement
The data (Figs