Karyotype asymmetry: again, how to measure and what to measure?

Abstract
One of the most popular, cheap and widely used approaches in comparative cytogenetics – especially by botanists – is that concerning intrachromosomal and interchromosomal karyotype asymmetry. Currently, there is no clear indication of which method, among the many reported in the literature, is the most adequate to infer karyotype asymmetry (especially intrachromosomal), above all in view of the criticisms recently levelled at the most recently published proposal. This work presents a critical review of the methods proposed so far for the estimation of karyotype asymmetry, using both artificial and real chromosome datasets. It is shown once again that the concept of karyotype asymmetry comprises two kinds of estimation: interchromosomal and intrachromosomal asymmetry. For the first one, the use of the Coefficient of Variation of Chromosome Length, a powerful statistical parameter, is here confirmed. For the second one, the most appropriate parameter is the new Mean Centromeric Asymmetry, where the Centromeric Asymmetry of each chromosome in a complement is easily obtained by calculating the difference between the relative lengths of the long arm and the short arm. The Coefficient of Variation of Centromeric Index, strongly criticized in recent literature, is an additional karyological parameter, not properly connected with karyotype asymmetry. This work shows definitively what to measure, and how, to correctly infer karyotype asymmetry, by proposing to couple two already known parameters in a new way. Hopefully, it will be the basic future reference for all those scientists dealing with cytotaxonomy.


Introduction
Cytotaxonomy is a branch of cytogenetics devoted to the comparative study of karyological features for systematic and evolutionary purposes (Siljak-Yakovlev and Peruzzi 2012). Today, several kinds of data can be obtained from chromosome studies: chromosome number, karyotype structure, karyotype asymmetry, chromosome banding, FISH, GISH and chromosome painting (Stace 2000, Levin 2002, Graphodatsky et al. 2011, Guerra 2012). Among them, one of the most popular, cheap and widely used approaches – especially by botanists – is that concerning karyotype asymmetry. The concept of karyotype asymmetry, i.e. a karyotype marked by the predominance of chromosomes with terminal/subterminal centromeres (intrachromosomal asymmetry) and highly heterogeneous chromosome sizes (interchromosomal asymmetry), was first developed by Levitsky (1931). Later, Stebbins (1971), in his masterpiece "Chromosomal evolution in higher plants", proposed a quali-quantitative method for classifying karyotype asymmetry into twelve categories, obtained by combining four classes (from 1 to 4), defined according to the increasing proportion of chromosomes with arm ratio <2:1, with three classes (from A to C), defined according to the increasing ratio between the largest and the smallest chromosome in a complement.
Concerning interchromosomal asymmetry, which is due to heterogeneity among chromosome sizes in a complement, other researchers proposed quantitative estimation methods in the following years. This is the case of the Rec index (Greilhuber and Speta 1976, Venora et al. 2002), the A2 index (Romero Zarco 1986), the R ratio (Siljak-Yakovlev 1996) and CVCL (Lavania and Srivastava 1992, Watanabe et al. 1999, Paszko 2006). The latter, the Coefficient of Variation of Chromosome Length, is a statistically correct parameter and is able to capture even small variation among chromosome sizes in a complement. Hence, the estimation method for interchromosomal asymmetry does not need to be discussed further here. More complex and debated is the quantitative estimation of intrachromosomal asymmetry, which is due to centromere position. To address this issue, the first quantitative index proposed was the TF% of Huziwara (1962), soon followed by the AsK% of Arano (1963). Further proposals were the AsI% of Arano and Saito (1980), Syi (Greilhuber and Speta 1976, Venora et al. 2002), A1 (Romero Zarco 1986), CG (Lavania and Srivastava 1992), A (Watanabe et al. 1999) and CVCI (Paszko 2006). The latter, the Coefficient of Variation of Centromeric Index, was claimed by Paszko (2006) to be the only parameter with a statistical foundation. However, her proposal was recently strongly criticized by Zuo and Yuan (2011), who showed that CVCI is not able to capture and quantitatively express the original meaning of karyotype asymmetry (i.e. the prevalence of telocentric-subtelocentric chromosomes), but only quantifies the relative variation (heterogeneity) among centromere positions in a karyotype. Hence, the problem of a correct estimation of intrachromosomal asymmetry is still open.
Finally, a few authors tried to combine the two kinds of asymmetry in a single index, such as Lavania and Srivastava (1992) with DI and Paszko (2006) with AI. However, both these indices were strongly criticized, by Paszko (2006) and Peruzzi et al. (2009) respectively, and their use should be firmly discouraged. The aim of this review is to critically analyze the methods proposed for estimating intrachromosomal asymmetry and to put forward a suitable new estimator, which should be: 1) strictly quantitative, 2) statistically correct, 3) not a dispersion or variability index.

Which kind of basic measures were used, and differently combined, for intrachromosomal asymmetry estimation?
Fundamentally, the basic measures used in every method proposed so far are the lengths of the long arm (L) and the short arm (S) of each chromosome in a complement. Karyotypes in which these measures are not applicable (for instance those with holocentric chromosomes or those with very small chromosomes, 1 µm or less) are not suitable for the estimation of intrachromosomal asymmetry at all. For all the others (the majority), typically L > 0, S ≥ 0 and L ≥ S. The extremes of variation are S = L (i.e. chromosomes with a perfectly median centromere) and S = 0 (i.e. chromosomes with a perfectly terminal centromere). These two variables were combined by researchers in various ways:

L/S
also called arm ratio (r), it was used for instance in the widely known chromosome nomenclature proposed by Levan et al. (1964). Its values can range from 1 (if S = L) to +∞ (the limit for S = 0).

S/L
first proposed by Battaglia (1955), it is the reciprocal of the arm ratio. Its values can range from 1 (if S = L) to 0 (if S = 0). It is fundamentally used also in Syi = (Mean S length / Mean L length) × 100 (Greilhuber and Speta 1976, Venora et al. 2002).

S/(L+S)
also called centromeric index, it is the proportion of the short arm with respect to the whole chromosome. Its values can range from 0.5 (if S = L) to 0 (if S = 0). It is fundamentally used in TF% = (Total length of S in a chromosome set / Total length of a chromosome set) × 100 (Huziwara 1962), CG = (Median S length / Median (L+S) length) × 100 (Lavania and Srivastava 1992), and CVCI (Paszko 2006).

Please note that the difference between the two relative arm lengths, (L-S)/(L+S), can also be expressed as 2L/(L+S) - 1 or as 1 - 2S/(L+S).
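The combinations of L and S discussed in this section can be compared side by side with a minimal Python sketch. The function name `arm_measures` and the example arm lengths are hypothetical and serve only as illustration; the constraint L > 0 and 0 ≤ S ≤ L is assumed.

```python
import math


def arm_measures(long_arm, short_arm):
    """Return the basic arm-length combinations for one chromosome.

    Assumes long_arm > 0 and 0 <= short_arm <= long_arm.
    """
    if long_arm <= 0 or short_arm < 0 or short_arm > long_arm:
        raise ValueError("expected L > 0 and 0 <= S <= L")
    total = long_arm + short_arm
    return {
        # The arm ratio has no finite value when S = 0; infinity marks the limit.
        "L/S": long_arm / short_arm if short_arm > 0 else math.inf,
        "S/L": short_arm / long_arm,
        "S/(L+S)": short_arm / total,
        "L/(L+S)": long_arm / total,
        "(L-S)/(L+S)": (long_arm - short_arm) / total,
    }


# Hypothetical extremes: perfectly median and perfectly terminal centromere.
median_chrom = arm_measures(5.0, 5.0)
terminal_chrom = arm_measures(10.0, 0.0)

for label, values in (("median", median_chrom), ("terminal", terminal_chrom)):
    print(label, values)
```

Only the three proportions L/(L+S), S/(L+S) and (L-S)/(L+S) stay within a bounded, linear scale at both extremes, whereas L/S diverges as S approaches 0.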
Given that L/(L+S) and S/(L+S) are the only parameters which are formally correct on descriptive statistical grounds (they are both proportions, or relative lengths), and given their peculiar complementary relationship, the only parameter well suited to capture the mean intrachromosomal asymmetry in a karyotype is A, proposed by Watanabe et al. (1999). It is noteworthy that these authors already stressed that their method is preferable to others "because it usually follows a normal distribution". Indeed, given an artificial dataset of chromosomes with normal distribution (mean = median), only the estimators L/(L+S), S/(L+S) and their difference (L-S)/(L+S) are able to correctly describe these features (Table 1). However, it must also be noted that all the other estimators of intrachromosomal karyotype asymmetry proposed in the literature (Syi, TF%, CG, AsK%, A1), albeit not statistically correct, are highly correlated with A, with absolute correlation coefficients typically above |r| = 0.9 (p < 0.01) (Paszko 2006, Peruzzi et al. 2009).
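The mean = median argument can be checked numerically. The sketch below builds a hypothetical set of 11 artificial chromosomes (total length 10, short arm decreasing evenly from 5.0 to 0.0); these values are an assumption chosen for illustration and are not those of Table 1. The set is evenly spaced rather than normally distributed, but it is symmetric, so mean and median coincide for any estimator that is linear in centromere position.

```python
from statistics import mean, median

# Hypothetical artificial set of (L, S) pairs: S = 5.0, 4.5, ..., 0.0 and L = 10 - S.
chromosomes = [(10 - s / 2, s / 2) for s in range(10, -1, -1)]

# Estimators taken from the text; L/S is left out because it is undefined for S = 0.
estimators = {
    "S/L": lambda L, S: S / L,
    "S/(L+S)": lambda L, S: S / (L + S),
    "L/(L+S)": lambda L, S: L / (L + S),
    "(L-S)/(L+S)": lambda L, S: (L - S) / (L + S),
}

# For each estimator, store (mean, median) over the whole artificial set.
summary = {}
for name, estimator in estimators.items():
    values = [estimator(L, S) for L, S in chromosomes]
    summary[name] = (mean(values), median(values))

for name, (avg, med) in summary.items():
    print(f"{name:12s} mean={avg:.4f} median={med:.4f}")
```

On this set the three proportions return equal mean and median (0.5 for (L-S)/(L+S)), while S/L gives a mean of about 0.40 against a median of about 0.33, because the ratio of the two arms is not linear in centromere position.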

How to compare karyotype asymmetry among individuals, populations, species etc.?
Let us return to karyotype asymmetry as a whole, with its two components: interchromosomal and intrachromosomal. Concerning interchromosomal asymmetry, as explained above, the main point is to measure how much the chromosome lengths in a complement differ from each other, and CVCL (Paszko 2006) is perfectly suited to this purpose. Like all coefficients of variation, it is the ratio between the standard deviation and the mean of a sample (i.e. a relative dispersion index) × 100 (Sokal and Rohlf 1981). Typically, this parameter ranges from 0 (no variation) to 100 or more (in exceptionally heterogeneous samples, where the standard deviation can be higher than the mean). Concerning intrachromosomal asymmetry, CVCI should not be used, for the reasons explained above. Indeed, it is actually a measure of intrachromosomal heterogeneity, which does not necessarily mean asymmetry in the original sense given by Levitsky (1931) and Stebbins (1971). Among the others, as shown above, the statistically best suited parameter is A (Watanabe et al. 1999), ranging from 0 (perfectly symmetric) to 1 (perfectly asymmetric).
Table 1. Comparison of different estimators of intrachromosomal asymmetry on a set of 11 artificial chromosomes with gradually increasing asymmetry, from perfectly median (on the left) to perfectly terminal (on the right) centromeres. Mean values are reported in the last column on the right. L/S was excluded because no real value is obtained when S = 0.
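The two estimators retained in this section can be written compactly as follows. The notation is an assumption introduced here for illustration: n is the number of chromosomes in the complement, L_i and S_i are the long and short arm lengths of chromosome i, and s_CL and x̄_CL are the standard deviation and the mean of the chromosome lengths L_i + S_i. A is taken as the mean of the per-chromosome difference (L-S)/(L+S), as the text suggests.

```latex
\[
CV_{CL} = \frac{s_{CL}}{\bar{x}_{CL}} \times 100,
\qquad
A = \frac{1}{n}\sum_{i=1}^{n}\frac{L_i - S_i}{L_i + S_i}
\]
```

If every chromosome has L_i = S_i, each term of the sum is 0 and A = 0; if every chromosome has S_i = 0, each term is 1 and A = 1, which matches the range given in the text.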
Since the two kinds of asymmetry express different concepts, it is not desirable to combine them in a single value. On the contrary, as argued for the first time by Romero Zarco (1986) and then by Peruzzi et al. (2009), the best way of representing karyotype asymmetry relationships among organisms is by means of two-dimensional scatter plots, where the two asymmetry estimators are placed on the x and y axes and each point represents a sample. Up to the present day, this has been done with the pairs of parameters A1 and A2 (Romero Zarco 1986) or CVCI and CVCL (Paszko 2006, Peruzzi et al. 2009); note that CVCL = A2 × 100.
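As an illustration of the scatter-plot approach, the following Python sketch computes one (x, y) point per sample, with the coefficient of variation of chromosome length on x and the mean of (L-S)/(L+S) on y. The sample names and arm lengths are hypothetical, the choice of this particular pair of axes follows the estimators recommended in the text, and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev

# Hypothetical samples: name -> list of (long arm, short arm) lengths.
samples = {
    "sample_A": [(3.0, 3.0), (2.8, 2.6), (2.5, 2.5)],
    "sample_B": [(5.0, 1.0), (3.0, 0.5), (1.5, 0.2)],
}


def asymmetry_point(arms):
    """Return (interchromosomal, intrachromosomal) asymmetry for one sample."""
    lengths = [L + S for L, S in arms]
    # x axis: coefficient of variation of chromosome length (sample SD assumed).
    x = stdev(lengths) / mean(lengths) * 100
    # y axis: mean of (L - S) / (L + S), from 0 (symmetric) to 1 (asymmetric).
    y = mean((L - S) / (L + S) for L, S in arms)
    return x, y


points = {name: asymmetry_point(arms) for name, arms in samples.items()}
for name, (x, y) in points.items():
    print(f"{name}: x={x:.1f}, y={y:.3f}")
```

The resulting coordinates can be passed to any plotting tool; keeping the two values as separate axes, rather than merging them into one index, preserves the distinction between the two kinds of asymmetry.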