A Mitochondrial DNA Phylogeny of Extant Species of the Genus Trachemys with Resulting Taxonomic Implications
ABSTRACT
The phylogenetic relationships among taxa within the emydid genus Trachemys have largely remained unresolved. A 768-basepair fragment of ND4, as well as the histidine, serine, and leucine tRNAs, were sequenced from 18 of the 26 extant species and subspecies of Trachemys. The aligned sequences were analyzed using maximum parsimony, maximum likelihood, and Bayesian methods. The results support the taxonomy of the genus as proposed by Seidel.
The genus Trachemys is a speciose group of turtles in the family Emydidae. Species of this genus are spread throughout North, Central, and South America as well as the Caribbean Islands. Most members of this genus were historically placed into the ambiguously defined T. scripta complex, which has been attributed to the fact that few members of this genus are sympatric (Seidel 2002). More recent studies have argued that some, if not many, of these are likely to actually be species rather than subspecies (Stephens and Wiens 2003). If the species designation of many of the subspecies is correct, then the interspecific relationships within this genus are largely unresolved, and a more comprehensive phylogenetic analysis of the genus is needed to resolve these issues (Seidel et al. 1999; Seidel 2002). This is especially true in the case of the T. scripta complex (Seidel et al. 1999; Stephens and Wiens 2003). We use the taxonomy proposed by Seidel (2002) to avoid confusion among historical species and subspecies.
In this study, mitochondrial DNA sequence data from the NADH dehydrogenase subunit 4 (ND4) region and flanking tRNAs of 52 individuals representing 18 of the 26 extant species and subspecies in Trachemys were analyzed by maximum parsimony, maximum likelihood, and Bayesian methods. Our explicit goal was to provide an mtDNA phylogeny that includes sequence data for a majority of the currently described taxa. Particular emphasis was given to the North American species group, specifically the relationships and validity of Trachemys gaigeae.
Methods
Blood samples were collected from wild caught, pet trade, and zoo animals by various individuals (mainly MRJF, DES, and James Dixon; for a list of specimens, see Appendix 1). Remaining blood and/or DNA samples are in the MRJ Forstner Frozen Tissue Collection at Texas State University San Marcos.
Blood was isolated from each individual and stored in blood storage buffer (100 mM Tris pH 8.0, 100 mM Na2EDTA, 10 mM NaCl, and 1% SDS) at −80°C until needed. DNA was extracted from blood using the proteinase K protocol of Maniatis et al. (1982), as modified by Hillis and Davis (1986). The primers used in polymerase chain reaction amplification were obtained from Arevalo et al. (1994). The primers ND4 and leucine were chosen because they show a high degree of conservation within turtle sequences. Additionally, this region has been shown to be phylogenetically informative in squamates (Arevalo et al. 1994; Forstner et al. 1995). A 992-basepair fragment of mtDNA was amplified by these primers and contained the last 768 bases of the ND4 gene and the tRNAs histidine, serine, and leucine. Sequencing reactions were performed using the Applied Biosystems (ABI) dideoxy termination cycle sequencing kit in conjunction with an ABI 373A automated sequencer.
All sequences were aligned using MacClade 4 (Maddison and Maddison 2003). Identical sequences from individuals of the same species were collapsed into a single sequence, again using MacClade. This resulted in a data set of 54 individual sequences from 20 taxa. All sequences used in this analysis were accessioned into NCBI GenBank (see Appendix 1). A partition homogeneity test was conducted using PAUP* 4b10 (Swofford 2002) to determine whether it would be necessary to partition the tRNAs and the protein-coding fragment of ND4. Modeltest 3.5 (Posada and Crandall 1998) was used to determine the appropriate model of sequence evolution for this data set under the Akaike Information Criterion (AIC; Posada and Buckley 2004) with 4 different outgroup arrangements. The outgroups tested were Testudo kleinmanni only; Testudo and Pseudemys texana; Testudo, Heosemys, Sacalia, and Callagur; and finally Pseudemys, Testudo, Heosemys, Sacalia, and Callagur. Neighbor joining analyses were conducted using Maximum Likelihood Estimate (MLE) distance settings corresponding to the results of the model selection process for each outgroup arrangement, and the results were compared to ascertain the sensitivity of the data to outgroup selection. All 4 outgroup arrangements resulted in the selection of the same model in Modeltest 3.5 (GTR + G) and produced analogous neighbor joining topologies using MLE distances. Thus, the data set was not sensitive to outgroup selection, and a single outgroup arrangement was chosen (Testudo kleinmanni and Pseudemys texana), providing a distantly related taxon as well as a proximal sister taxon within the same family.
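The collapsing of identical sequences within a species can be illustrated with a minimal Python sketch. This is not the MacClade procedure itself; the function name `collapse_identical` and the toy records are hypothetical, and the sketch assumes that sequences are already aligned and that "identical" means an exact string match after case normalization.

```python
from collections import OrderedDict


def collapse_identical(records):
    """Keep one representative per (taxon, sequence) pair.

    records: iterable of (sample_id, taxon, aligned_sequence) tuples.
    Returns a list of (representative_id, taxon, sequence, n_collapsed).
    Identical sequences are merged only within the same taxon.
    """
    groups = OrderedDict()
    for sample_id, taxon, seq in records:
        key = (taxon, seq.upper())
        groups.setdefault(key, []).append(sample_id)
    return [(ids[0], taxon, seq, len(ids))
            for (taxon, seq), ids in groups.items()]


# Hypothetical toy records, NOT sequences from this study.
records = [
    ("s1", "T. gaigeae", "ACGTAC"),
    ("s2", "T. gaigeae", "acgtac"),          # same haplotype as s1
    ("s3", "T. gaigeae", "ACGTAT"),          # differs at the last site
    ("s4", "T. scripta elegans", "ACGTAC"),  # same bases, different taxon
]
collapsed = collapse_identical(records)
for rep_id, taxon, seq, n in collapsed:
    print(rep_id, taxon, seq, n)
```

In this sketch, s1 and s2 collapse into one entry, while s4 is kept separately because collapsing is restricted to individuals of the same taxon.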
The model selected by Modeltest (GTR + G) was then used in maximum likelihood analysis of the dataset in PAUP*. The parameter estimates from Modeltest were used in this analysis. The resulting ML topology was bootstrapped (1000 replicates) to evaluate support of the relationships proposed.
MrModeltest was used to select the most appropriate model under AIC (GTR+G) for Bayesian analysis in MrBayes (Huelsenbeck and Ronquist 2001). An MCMC analysis was then conducted in MrBayes under this single best-fit GTR+G model. This analysis was run for 1 × 10^6 generations, sampling every 100 generations, with 1 cold and 3 hot chains. A burn-in of 300 samples (sumt burnin = 300) was determined to be appropriate from stabilization of a log-likelihood plot, and posterior probabilities for the resulting topology were calculated using PAUP*.
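The relationship between run length, sampling interval, and burn-in can be checked with simple arithmetic. The sketch below is illustrative only; the helper `mcmc_samples` is a hypothetical name, and it assumes that the number of stored samples equals generations divided by the sampling interval, ignoring any sample recorded for the starting state.

```python
def mcmc_samples(generations, sample_every,
                 burnin_samples=None, burnin_fraction=None):
    """Return (total_samples, burnin_samples, retained_samples)."""
    if burnin_samples is None and burnin_fraction is None:
        raise ValueError("give burnin_samples or burnin_fraction")
    total = generations // sample_every
    if burnin_samples is None:
        burnin_samples = int(total * burnin_fraction)
    return total, burnin_samples, total - burnin_samples


# Single-model run: 1 x 10^6 generations, sampled every 100, burn-in of 300.
single = mcmc_samples(1_000_000, 100, burnin_samples=300)
print(single)  # (10000, 300, 9700)

# A run of 3 x 10^6 generations sampled every 1000 with 25% discarded.
partitioned = mcmc_samples(3_000_000, 1000, burnin_fraction=0.25)
print(partitioned)  # (3000, 750, 2250)
```

Under these assumptions, the burn-in of 300 samples corresponds to the first 3% of the single-model run.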
A partitioned Bayesian analysis was also conducted using MrBayes. The data set was divided into 4 partitions: one for each codon position in the protein-coding ND4 portion and a fourth containing the tRNAs. Each partition was independently run through MrModeltest, and the best model for each partition was selected by AIC. The selected model and parameter estimates for each partition were then input into MrBayes. Six chains (5 hot, 1 cold) were run for 3 × 10^6 generations, sampling every 1000 generations. The first 25% of the samples were discarded, equivalent to a burn-in of 750 samples. Posterior probabilities for the resulting topology were calculated using PAUP*.
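The four-way partitioning scheme can be sketched as site lists, as follows. Several things here are assumptions rather than details from the study: the function name `codon_partitions` is hypothetical, the ND4 block is assumed to occupy the first columns of the alignment without frame-shifting gaps, the first column is assumed to be a first codon position, and the 992-bp amplified fragment length is used only as a stand-in for the alignment length, which is not stated.

```python
def codon_partitions(nd4_length, alignment_length, frame_start=1):
    """Return 1-based site lists for codon positions 1-3 and the tRNA block.

    frame_start is the site number of a first codon position (assumed).
    Sites beyond nd4_length are assigned to the tRNA partition.
    """
    parts = {"pos1": [], "pos2": [], "pos3": [], "tRNA": []}
    for site in range(1, alignment_length + 1):
        if site <= nd4_length:
            pos = (site - frame_start) % 3 + 1
            parts[f"pos{pos}"].append(site)
        else:
            parts["tRNA"].append(site)
    return parts


# 768 ND4 sites (from the Methods); 992 is a stand-in total length.
parts = codon_partitions(nd4_length=768, alignment_length=992)
for name, sites in parts.items():
    print(name, len(sites), sites[:3])
```

With 768 coding sites, each codon-position partition holds 256 sites under these assumptions; a different reading frame would only change `frame_start`.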
Parsimony analyses were conducted using PAUP*. The most parsimonious tree for the dataset was found using a full heuristic search with simple stepwise addition and tree bisection-reconnection (TBR). The result was then subjected to a nonparametric bootstrap as implemented in PAUP*, for 1000 replications with 10 TBR steps each, and the resulting 50% consensus topology was retained.
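The resampling step behind a nonparametric bootstrap can be sketched in a few lines of Python. This is not PAUP*'s implementation: it shows only how pseudoreplicate alignments are drawn by sampling columns with replacement. The tree search on each pseudoreplicate and the consensus step are omitted, and the toy alignment and the name `bootstrap_alignment` are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Draw one pseudoreplicate by resampling columns with replacement.

    alignment: dict mapping taxon name to an aligned sequence string.
    The same column indices are applied to every taxon, so each
    resampled column is an intact column of the original alignment.
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols)
            for taxon, seq in alignment.items()}


rng = random.Random(42)  # fixed seed so the sketch is repeatable
toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACGA"}  # hypothetical
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["A"]))
```

In a full analysis, a tree would be estimated from each of the 1000 pseudoreplicates, and only groupings recovered in at least 50% of them would appear in the consensus topology.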
Results
The result of the partition homogeneity test was not significant (p = 0.15); therefore, partitioning of the data set was not required. Modeltest selected GTR+G as the most appropriate single model for the dataset. Base frequencies for A, C, G, and T were 0.3513, 0.2635, 0.1305, and 0.2547, respectively. The rate variation followed a gamma distribution with a shape parameter of 0.4655, and there were 4 rate categories and 6 substitution types. For the partitioned dataset, MrModeltest selected the GTR model for the first codon position, HKY+I for the second position, and GTR+G for the third position partition. HKY+G was selected for the tRNA partition.
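For readers unfamiliar with AIC-based model selection, the following Python sketch shows how candidate models are ranked. The log-likelihood values are HYPOTHETICAL and chosen only for illustration; they are not output from Modeltest or MrModeltest for this data set. Parameter counts are the standard numbers of free substitution-model parameters with branch lengths excluded. The reported base frequencies are included only to check that they sum to 1.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Free substitution-model parameters, branch lengths excluded.
N_PARAMS = {"JC": 0, "HKY": 4, "HKY+G": 5, "GTR": 8, "GTR+G": 9}

# HYPOTHETICAL log-likelihoods for illustration only.
LNL = {"JC": -5200.0, "HKY": -5050.0, "HKY+G": -4990.0,
       "GTR": -5020.0, "GTR+G": -4975.0}

scores = {model: aic(LNL[model], N_PARAMS[model]) for model in LNL}
best = min(scores, key=scores.get)
print(best, scores[best])

# Base frequencies reported in the Results (A, C, G, T).
base_freqs = [0.3513, 0.2635, 0.1305, 0.2547]
print(round(sum(base_freqs), 4))
```

Because the four base frequencies are constrained to sum to 1, only three of them count as free parameters, which is why GTR+G has 9 free parameters here (5 relative rates, 3 frequencies, and 1 gamma shape).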
The results of the ML (Fig. 1) and Bayesian (Fig. 2) analyses were generally congruent with each other and with the taxonomy of Seidel (2002). Both topologies supported the significance of T. gaigeae, T. emolli, T. taylori, T. yaquia, T. dorbigni, T. terrapen, and T. decussata lineages. The results of both analyses also showed clearly resolved North American (T. scripta scripta, T. scripta troostii, T. scripta elegans, and T. gaigeae), Meso-American (T. emolli, T. taylori, T. venusta venusta, T. venusta cataspila, T. venusta grayi, T. callirostris callirostris, T. callirostris chichiriviche, T. yaquia, and T. dorbigni), and West Indian (T. decorata, T. stejnegeri stejnegeri, T. stejnegeri vicina, T. terrapen, T. decussata decussata, and T. decussata angusta) monophyletic units.



Citation: Chelonian Conservation and Biology 7, 1; 10.2744/CCB-0692.1



Discussion
While the three main monophyletic lineages (North American, Meso-American, and West Indian) apparent in the results of these analyses are generally consistent with the results of other studies (Seidel 2002; Stephens and Wiens 2003), there are some incongruences regarding the relationships among some species.
The analysis of Stephens and Wiens (2003) placed T. gaigeae in a clade with species from South America and Mexico, while our analysis places this taxon as more closely related to the North American T. scripta complex, and as part of the monophyletic North American lineage. Our placement of T. gaigeae is strongly supported by both the MP and ML bootstrap values and Bayesian posterior probabilities from both partitioned and nonpartitioned analyses (Figs. 1 and 2).
Together with the concept of the evolutionarily significant unit (Ryder 1986; Moritz 1994), which in some cases is the equivalent of a “species” (Moritz 1994), our analysis supports the species status of T. gaigeae as proposed by several authors (Weaver and Rose 1967; Ward 1984; Seidel et al. 1999; Seidel 2002). Our intention here, however, is to recognize this lineage as unique and worthy of treatment as a unit for conservation, rather than contribute to the overabundance of literature arguing the appropriate criteria for species definition.
Our study failed to resolve the T. venusta and T. callirostris species complexes of Seidel (2002). However, the lack of phylogenetic resolution does not provide an inherent default hypothesis, and therefore Seidel's taxonomy is provisionally retained as we feel that this makes the most use of all available data. These ambiguous relationships may eventually be resolved as more data are collected and analyzed.
In conclusion, it appears that when mtDNA data are considered, the taxonomy of Trachemys proposed by Seidel (2002) is the most reasonable for the genus. The proposed species status of T. gaigeae (Weaver and Rose 1967; Ward 1984; Seidel et al. 1999; Seidel 2002) is also supported by our data. In our evaluation of the specific status for this taxon, we have sought to use historical evaluations in conjunction with supported results from our current mtDNA hypothesis. In our support for T. gaigeae, we explicitly acknowledge our failure to more broadly evaluate the remaining potential evolutionarily significant units within this genus (Moritz 1994). This decision was made in keeping with the recent voucher paper (Lehn et al. 2007) in which we agree that significant systematic decisions should not be completed in the absence of traditional voucher specimens. We would still suggest, however, that the proposed taxonomy of Seidel (2002) represents the best current working taxonomy of Trachemys. This taxonomic arrangement does the most to preserve the diversity contained within the genus by recognizing diagnosable lineages as unique.

Figure 1. Bootstrap consensus of the maximum parsimony and maximum likelihood analyses of the ND4-leucine tRNA region of mitochondrial DNA in Trachemys. ML bootstrap support values are shown above supported branches, and MP bootstrap values are shown below. Major regional clades are illustrated to the right of taxon names.

Figure 2. Results of Bayesian analyses of the ND4-leucine tRNA region of mitochondrial DNA in Trachemys. Posterior probabilities from the analysis using a single model are shown above supported branches, and posterior probabilities from the partitioned analysis are shown below. Regional clades are illustrated to the right of taxon names.