Repetitive sequences: the hidden diversity of heterochromatin in prochilodontid fish

Abstract
The structure and organization of repetitive elements in fish genomes are still relatively poorly understood, although most of these elements are believed to be located in heterochromatic regions. Repetitive elements are considered essential in evolutionary processes as hotspots for mutations and chromosomal rearrangements, among other functions, thus providing new genomic alternatives and regulatory sites for gene expression. The present study sought to characterize repetitive DNA sequences in the genomes of Semaprochilodus insignis (Jardine & Schomburgk, 1841) and Semaprochilodus taeniurus (Valenciennes, 1817) and to identify regions of conserved syntenic blocks in this genome fraction of three species of Prochilodontidae (Semaprochilodus insignis, Semaprochilodus taeniurus, and Prochilodus lineatus (Valenciennes, 1836)) by cross-FISH using Cot-1 DNA (renaturation kinetics) probes. We found that the repetitive fractions of the genomes of Semaprochilodus insignis and Semaprochilodus taeniurus have significant amounts of conserved syntenic blocks in hybridization sites, but with low degrees of similarity between them and the genome of Prochilodus lineatus, especially in relation to B chromosomes. The cloning and sequencing of the repetitive genomic elements of Semaprochilodus insignis and Semaprochilodus taeniurus using Cot-1 DNA identified 48 fragments that displayed high similarity with repetitive sequences deposited in public DNA databases and classified as microsatellites, transposons, and retrotransposons. The repetitive fractions of the Semaprochilodus insignis and Semaprochilodus taeniurus genomes exhibited high degrees of conserved syntenic blocks in terms of both the structures and locations of hybridization sites, but a low degree of similarity with the syntenic blocks of the Prochilodus lineatus genome. Future comparative analyses of other Prochilodontidae species will be needed to advance our understanding of the organization and evolution of the genomes in this group of fish.


Introduction
Multiple copies of DNA sequences, known as "repetitive DNA", compose large portions of eukaryotic genomes. Repetitive DNA is generally divided into two groups: (1) tandem repeats, which include DNA satellites, minisatellites, and microsatellites; and (2) dispersed repeats composed of transposable elements (TEs) (Timberlake 1978, Charlesworth 1994, Jurka 2005). Other gene families with sequence repetitions are also regarded as repetitive DNA, such as the genes encoding ribosomal RNA (rRNA) (Long 1980). While the structure and organization of this genome fraction are still poorly understood in fish, most of these non-coding repetitive sequences appear to be located in heterochromatic regions (Fischer et al. 2004).
Repetitive sequences were largely considered to be "junk", "selfish", or "parasitic" DNA (Doolittle and Sapienza 1980, Orgel and Crick 1980, Nowak 1994) because no functions were known for them in the genome. With ever-increasing volumes of genomic information, however, these repetitive sequences are now known to play larger roles in the structural and functional evolution of the genome (Shapiro and von Sternberg 2005, Biémont and Vieira 2006). Indeed, repetitive sequences are now known to be involved in chromosomal rearrangements and to be responsible for significant proportions of the karyotypic variations observed in many groups (Kidwell 2002, Schneider et al. 2013). In Prochilodontidae, centromeric heterochromatin regions have been observed in all 54 chromosomes in all of the species analyzed, as well as in the B chromosomes of Prochilodus lineatus (Valenciennes, 1836) (Pauls and Bertollo 1980, Oliveira et al. 2003, Hatanaka et al. 2002, Terencio et al. 2012). However, Semaprochilodus insignis (Jardine & Schomburgk, 1841) has additional heterochromatic blocks in the terminal regions of the first metacentric pair, while Semaprochilodus taeniurus (Valenciennes, 1817) has large bitelomeric markings in metacentric pairs 2 and 3.
The ZZ/ZW sex chromosome system may have originated through an in cis process of heterochromatin accumulation that differentiated into the W chromosome, with consequent recombination restrictions starting with the first chromosome pair (Terencio et al. 2012a).
The phylogenetic biogeography of the Prochilodontidae indicates that the family dates back to at least approximately 12 million years ago, with higher level intrafamilial cladogenic events also dating to at least that time period; these dates are congruent with data from the fossil record for more encompassing groups within the Characiformes (Sivasundar and Bermingham 2001, Castro and Vari 2004). Phylogenies constructed from morphological (information from osteological and soft anatomical systems) and molecular (ATPase, D-loop, ND4 and COI) characters demonstrate that Prochilodus is the sister group to the clade formed by Ichthyoelephas plus Semaprochilodus (Turner et al. 2004). It is believed that heterochromatic regions play important roles in the differentiation of this fish group, despite the relatively stable karyotypic macrostructures of the Prochilodontidae. The genetic composition of these regions is still only poorly understood, however, and the only firm information available concerns the presence of large amounts of repetitive DNA sequences in the B chromosomes of P. lineatus (Camacho and Beukeboom 2000) and in the W sex chromosome of S. taeniurus (Terencio et al. 2012b). These sequences were identified and classified as microsatellites, transposons, and retrotransposons in the latter species.
Heterochromatic regions are essential to evolutionary processes because of their ability to propagate and influence genes (Grewal and Jia 2007). The present study therefore sought to characterize the moderately to highly repetitive DNA sequences in S. insignis and S. taeniurus by cloning and sequencing them, and to identify conserved syntenic blocks of this fraction in three species of the family Prochilodontidae (S. insignis, S. taeniurus, and P. lineatus) using cross-FISH techniques with Cot-1 DNA probes.

Materials and methods
Ten specimens of S. insignis (six females and four males) and 12 specimens of S. taeniurus (seven females and five males) were examined cytogenetically. These fish were captured with the authorization of ICMBio SISBIO 10609-1/2007 at the confluence of the Negro and Solimões Rivers (AM) and at the Amazonas and Tapajós (PA) Rivers. Five P. lineatus (two females and three males) were captured from the Tibagi River (PR). The fish were anesthetized in ice-cold water and were sacrificed. Voucher specimens were deposited in the INPA Animal Genetics Laboratory fish collection (10034, 10037, 10047 and 10696). Chromosome preparations were obtained from anterior kidney cells using an in vivo colchicine treatment (Bertollo et al. 1978). Institutional abbreviations: UFAM, Federal University of Amazonas; INPA, National Institute of Amazonian Research; UEPG, State University of Ponta Grossa.

Isolation of repetitive DNA via re-association kinetics
Enriched samples containing repetitive DNA sequences from S. insignis and S. taeniurus were constructed based on the renaturation kinetics of Cot-1 DNA (DNA enriched for highly and moderately repetitive DNA sequences) according to the protocol described by Zwick et al. (2010) and recently adapted by Ferreira and Martins (2008). DNA samples (50 μl of 100-500 ng/μl of DNA in 0.3 M NaCl) were autoclaved (121 °C) for 5 minutes (min) to obtain fragments ranging from 100 to 2000 base pairs. Next, the DNA was denatured at 95 °C for 10 min, placed on ice for 10 seconds (s) and subsequently placed at 65 °C for 1 min for re-annealing. The samples were incubated at 37 °C for 8 min with 1 U of S1 nuclease to permit the digestion of single-stranded DNA. The repetitive portion of this DNA was recovered by freezing in liquid nitrogen, and the DNA was extracted using phenol-chloroform. The resulting DNA fragments were used as probes for fluorescence in situ hybridization, cloned and sequenced.

Microscopy/Image Processing
Hybridized chromosomes were analyzed using an Olympus BX51 epifluorescence microscope, and the images were captured with a digital camera (Olympus DP71) using the Image-Pro MC 6.3 software.

Cloning and sequencing of repetitive sequences
One microgram of the Cot-1 DNA products was cloned using a pMOS Blunt-ended PCR Cloning Kit (GE Healthcare), purified using the GFX PCR Purification Kit (GE Healthcare) and sequenced using the Big Dye Kit (Applied Biosystems) in an ABI 3130 genetic analyzer. Sequence alignment was performed using Clustal W (Thompson et al. 1994), which is included in the BioEdit 7.0 software program (Hall 1999). Each clone was used as a query in BLASTN (Basic Local Alignment Search Tool nucleotide) searches against the NCBI nucleotide collection (http://www.ncbi.nlm.nih.gov) and in searches against the Repbase database (Jurka et al. 2005) at the Genetic Information Research Institute (GIRI) (http://www.girinst.org/repbase/) using CENSOR software (Kohany et al. 2006).
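As an illustration only, the kind of perfect tandem repeat that underlies a microsatellite classification can be flagged with a short script. The sketch below is not the procedure used in this study, in which clones were classified through BLASTN and CENSOR searches; the function name find_microsatellites, its thresholds, and the example sequence are hypothetical, and the AATTT motif is used only as a familiar example.

```python
import re


def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    Hypothetical helper for illustration; the thresholds are assumptions,
    not values taken from the study.
    """
    seq = seq.upper()
    # Lazy unit match so the shortest repeating unit is reported first,
    # followed by at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # Made-up sequence: five copies of AATTT flanked by unrelated bases.
    example = "GG" + "AATTT" * 5 + "CC"
    for start, unit, copies in find_microsatellites(example):
        print(f"start={start} unit={unit} copies={copies}")
```

A pattern scan of this kind detects only uninterrupted repeats; degenerate microsatellites and transposable elements still require similarity searches against curated databases such as Repbase.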

Results
Hybridization of the S. insignis Cot-1 DNA probe to its own chromosomes demonstrated that the repetitive elements of its genome were located in the centromeric regions of all chromosomes, as well as in the terminal region of several chromosomes (Fig. 1a and b). The cross-hybridization of the S. taeniurus Cot-1 DNA probe to the chromosomes of S. insignis revealed markers in the centromeric region, although they were smaller than those observed using species-specific probes (Fig. 1c). Additionally, no terminal markers were observed in S. insignis using the heterologous probe, indicating that this species has chromosome pairs (Fig. 1d, arrowheads) that carry species-specific repetitive sequences not shared with S. taeniurus (Fig. 1a, b, c and d).
Hybridization of the S. taeniurus Cot-1 DNA probe to its own chromosomes likewise revealed that repetitive sequences were abundant in the genome of this species and located in various regions (e.g., centromeric, interstitial, and terminal) of the entire chromosome complement (Fig. 1e and g). Cross-FISH reactions were performed using the S. insignis Cot-1 DNA probe and demonstrated the presence of conserved syntenic blocks in several chromosomal regions (Fig. 1f). S. taeniurus also displayed species-specific repetitive DNA sites located in the centromeric and terminal regions of 14 chromosomes (Fig. 1h, arrows), with no observed hybridizations of the S. insignis Cot-1 DNA probe to these same regions (Fig. 1e, f, g, and h).
The Cot-1 DNA probes of both S. insignis and S. taeniurus displayed positive hybridization signals in the terminal regions of the entire complement of P. lineatus chromosomes. The supernumerary (i.e., B) chromosomes (Fig. 1l, arrowheads) revealed hybridization signals only with the S. taeniurus Cot-1 DNA probe. The same marker pattern seen on one of the B chromosomes was also observed on the autosomal chromosomes, while only one of the chromosome arms exhibited hybridization signals shared with the other B chromosome (Fig. 1i, j, k and l).
Figure 1. Cot-1 DNA fraction hybridization in three species of Prochilodontidae. a S. insignis chromosomes counterstained with DAPI b Cot-1 DNA from the S. insignis genome hybridized to its own chromosomes c Cot-1 DNA from the S. taeniurus genome hybridized to S. insignis chromosomes d Double-FISH of the Cot-1 DNA fraction e S. taeniurus chromosomes counterstained with DAPI f Cot-1 DNA from the S. taeniurus genome hybridized to its own chromosomes g Cot-1 DNA from the S. insignis genome hybridized to the S. taeniurus chromosomes h Double-FISH of the Cot-1 DNA fraction i P. lineatus chromosomes counterstained with DAPI j Cot-1 DNA from the S. insignis genome hybridized to P. lineatus chromosomes k Cot-1 DNA from the S. taeniurus genome hybridized to P. lineatus chromosomes l Double-FISH of the Cot-1 DNA fraction.
Cloning and sequencing the repetitive genome elements obtained from S. insignis and S. taeniurus Cot-1 DNA identified 48 DNA fragments of varying sizes (GenBank: JX848379-JX848393). Of the repetitive DNA diversity sampled (Cot-1 DNA) from S. insignis, 71% displayed high similarity to microsatellites, 17% to DNA transposons, and 10% to retrotransposons (Table 1); of the repetitive sequences sampled from S. taeniurus, 75% displayed high similarity to microsatellites, 5% to transposons, and 15% to retrotransposons (Table 2).
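Class percentages of this kind are simple proportions of the annotated clones. A minimal sketch of such a tally follows, assuming each clone has already been assigned a single class label from its best database hit; the function name class_percentages and the example labels are hypothetical and do not reproduce the data in Tables 1 and 2.

```python
from collections import Counter


def class_percentages(labels):
    """Return the percentage of clones assigned to each repeat class.

    `labels` holds one class name per sequenced clone (hypothetical input).
    """
    if not labels:
        raise ValueError("at least one classified clone is required")
    counts = Counter(labels)
    total = sum(counts.values())
    # Round to one decimal place for reporting.
    return {cls: round(100 * n / total, 1) for cls, n in counts.items()}


if __name__ == "__main__":
    # Invented labels for illustration only.
    clones = ["microsatellite"] * 3 + ["DNA transposon"]
    print(class_percentages(clones))
```

Clones with no significant database hit would need their own label (for example "unclassified") for the percentages to sum to 100.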
The sequences described in the present study may play an evolutionary role in the genomes of S. insignis and S. taeniurus, as one of the sequences identified in the genome of S. taeniurus (Ste17) displayed 82% homology with a retrotransposon called SINE3, identified by Kapitonov and Jurka (2003) as originating from 5S rRNA. As other species of the Prochilodontidae family have only one pair of chromosomes carrying these ribosomal sites, this information strengthens the hypothesis that the multiple 5S rDNA sites observed in S. insignis and S. taeniurus are pseudogenes (or repetitive sequences) derived from 5S rDNA (Terencio et al. 2012a). In Gymnotus paraguensis (Albert & Crampton, 2003), the multiplication of 5S rDNA gene clusters might have been caused by the involvement of transposable elements because the NTS has high identity (90%) with a Tc1-like transposon (Silva et al. 2011).
We were also able to identify sequences in S. insignis that exhibited high similarity with the transposable element Helitron. In maize, this TE seems to continually produce new non-autonomous elements responsible for the duplicative insertion of gene segments at new locations and for the unprecedented genomic diversity of this species (Morgante et al. 2005). Intact Helitron elements were identified in the sex-determining region of the sex chromosomes of the platyfish Xiphophorus maculatus, suggesting that TEs are still active in the genome of platyfish and related species, where they may have roles in the evolution of sex chromosomes and other genomic regions (Zhou et al. 2003).
Tc1/mariner elements (isolated from the genomes of S. insignis and S. taeniurus) form the most widespread superfamily of DNA transposons and can be found in fungi, plants, ciliates, and animals (including nematodes, arthropods, fish, frogs, and humans). Most of the transposon copies isolated from vertebrates are clearly inactive remnants of once active transposons that were inactivated by mutations, but only after successfully colonizing their genomes (Plasterk et al. 2009, Ivics et al. 2006). The retroelement Rex1 was also detected in the repetitive fraction of the genomes of S. insignis and S. taeniurus. The Rex family has been widely studied in fish, and a number of different lineages have been described in this group (Volff et al. 2000), where they are known to be scattered or grouped into conspicuous clusters in the chromosomes of Neotropical cichlids (Gross et al. 2009, Teixeira et al. 2009, Oliveira et al. 1999). These elements display compartmentalized distributions in some autosomes and show clear signals along the full lengths of W chromosomes in S. taeniurus (Terencio et al. 2012).
The Line2 element was detected only in the repetitive fraction of the S. taeniurus genome. This repetitive sequence may be present in the S. insignis genome but simply not sampled in our study, or alternatively, it may have been eliminated from the genome of this species. FISH showed that Line2 sequences are organized in small clusters dispersed over all of the chromosomes of Oreochromis niloticus (Linnaeus, 1758), but with higher concentrations near chromosome ends (Oliveira et al. 1999). Line elements in mammals appear clustered in the G-banding regions of the chromosomes, and on the sex chromosomes in some cases (Wichman et al. 1992, Fischer et al. 2004).

Repetitive DNA organization in chromosomes
Repetitive DNA sequences comprising mostly the heterochromatic portions of the genome were observed using the C-banding technique. Previous studies using this technique (Feldberg et al. 1987, Terencio et al. 2012b) revealed that repetitive DNA sequence fractions in the genomes of fish of the family Prochilodontidae are not abundant and are located mainly in the centromeric regions of the chromosomes and, less frequently, in the terminal regions of the long arms of some chromosome pairs. Large heterochromatic blocks can be observed, however, in the supernumerary chromosomes (i.e., the B chromosomes) of Prochilodus spp. (Voltolin et al. 2011) and in the W sex chromosome of S. taeniurus (Feldberg et al. 1987, Terencio et al. 2012a). Fluorescence in situ hybridization using species-specific probes of the repetitive fractions of the genomes partially confirmed the heterochromatic pattern demonstrated with the C-banding technique in both S. insignis and S. taeniurus, and positive signals were detected in the centromeric regions of all of their chromosomes. Markers were also observed in the terminal regions of some chromosomes, confirming that repetitive DNAs are also present in this area, although heterochromatin was not observed. Repetitive sequences located outside of heterochromatic regions are believed to significantly influence genome evolution, particularly by controlling and regulating gene activities, and genome sequencing has frequently revealed short and truncated copies of repetitive sequences in euchromatic genomic regions (Fischer et al. 2002, Biémont and Vieira 2006, Timberlake 1978, Yuan and Wessler 2011, Torres et al. 2011). This observation does not necessarily indicate that these repetitive sequences are constitutively expressed, however, since they tend to be silenced and undergo subsequent molecular deterioration. In other words, these sequences become inactive and progressively accumulate mutations, insertions, and deletions at neutral rates until they completely lose their identities or are lost from the host genome (Fernández-Medina et al. 2012). The presence of repetitive DNAs in euchromatic regions has been observed in many groups, such as insects, fish (Teixeira et al. 2009), and lizards (Pokorná et al. 2011), and these TEs have acquired structural/regulatory functions so that their accumulation in euchromatic regions may lend advantages to the host genome.
Cross-hybridizations of S. insignis and S. taeniurus showed patterns similar to those observed in homologous hybridization, which suggests that this portion of the genome has been conserved throughout evolution, perhaps due to a functional role. However, these species also have species-specific centromeric and terminal sites not identified by heterologous hybridization.
Cross-FISH using S. insignis and S. taeniurus Cot-1 DNA probes revealed hybridization signals in the subterminal regions of P. lineatus, in contrast to the heterochromatic pattern revealed by the C-banding technique, in which heterochromatin blocks are primarily observed in the centromeric region (Pauls and Bertollo 1990, Venere et al. 1990, Cavallaro et al. 2000, Voltolin et al. 2011). These data indicate that the repetitive fraction of the centromeric heterochromatin of P. lineatus is different from that of the other two species examined (S. insignis and S. taeniurus) and that shared repetitive sequences are located on the subtelomeric portions of their chromosomes. This same pattern was observed in three species of Prochilodus using (AATTT)n microsatellite (Hatanaka et al. 2002) and W-specific probes (Terencio et al. 2012b).
A common karyotypic feature of species belonging to the genus Prochilodus is the presence of supernumerary chromosomes (B chromosomes). Many studies of B chromosomes have indicated that these supernumerary chromosomes are rich in repetitive sequences and, in certain cases, may contain a number of functional genes (Camacho 2005, Ruiz-Estévez et al. 2012). The P. lineatus population analyzed in the present study displayed two B chromosomes with distinct hybridization sites. The S. insignis Cot-1 DNA probe did not hybridize to the B chromosomes, possibly because the repetitive fraction of the S. insignis genome is not shared with the B chromosomes of P. lineatus. One possible hypothesis explaining this result would be that these repetitive sequences have undergone rapid differentiation and evolution in the genome of S. insignis, resulting in a loss of homology with sequences on the B chromosomes of P. lineatus. Alternatively, because the species belong to different genera, the B chromosomes found in each species could have different origins. The S. taeniurus Cot-1 DNA probe, in contrast, produced positive signals, revealing sequence sharing with the P. lineatus B chromosomes. A number of studies have suggested that B chromosomes can influence sex determination in fish (Noleto et al. 2012, Yoshida et al. 2012), although no relationship between the occurrence of B chromosomes and sex determination has been observed in P. lineatus. The B chromosomes detected in P. lineatus were recently shown to display positive signals when hybridized with the W-specific probe, indicating the sharing of repetitive DNA families between these two species (Terencio et al. 2012a).

Conclusions
Results from DNA sequencing indicated that the genomes of S. insignis and S. taeniurus comprise different classes of repetitive sequences that may have played important roles in their evolution. The repetitive fractions of the S. insignis and S. taeniurus genomes also exhibit high degrees of conserved syntenic blocks in terms of both the structure and location of hybridization sites. However, the genomes of both S. insignis and S. taeniurus shared few syntenic blocks with the P. lineatus genome, especially with regard to the B chromosomes, and the origin of this situation has not yet been elucidated.

Authors' contributions
MLT, CHS and MCG collected the samples, collaborated on all cytogenetic procedures, undertook the bibliographic review and coordinated the writing of the manuscript. EJC, VN, RFA, MCA and MRV participated in the development of the laboratory techniques, prepared the W-specific probe for Semaprochilodus and reviewed the manuscript. EF coordinated the study and reviewed the manuscript. All authors read and approved the final manuscript.