Estimating Species Trees: Practical and Theoretical Aspects


Recent computational and modeling advances have produced methods for estimating species trees directly, avoiding the problems and limitations of the traditional phylogenetic paradigm, in which an estimated gene tree is equated with the history of species divergence. The overarching goal of the volume is to increase the visibility and use of these new methods across the entire phylogenetic community by addressing several challenges: (i) a firm understanding of the theoretical underpinnings of the methodology, (ii) empirical examples demonstrating the utility of the methodology as well as its limitations, and (iii) attention to the technical aspects of the actual software implementation of the methodology. As such, this volume is poised not only to become the quintessential guide for training the next generation of researchers but also to be instrumental in ushering in a new phylogenetic paradigm for the 21st century.


L. Lacey Knowles, Ph.D., is an Associate Professor and Associate Curator at the Museum of Zoology at the University of Michigan. Her research areas include speciation, sexual selection, phylogeography, and evolutionary radiations. Dr. Knowles was recently awarded a three-year grant by the National Science Foundation, titled "Population genetics of species delimitation: Methodology and application of a unified approach to inferring species boundaries."

Laura S. Kubatko, Ph.D., is an Associate Professor in the Department of Statistics and the Department of Evolution, Ecology, and Organismal Biology at The Ohio State University. Her research interests are in statistical genetics, including the estimation of phylogenetic trees from nucleotide sequence data, linkage and QTL analysis, and the analysis of microarray data. She recently became an Associate Editor for the journal Systematic Biology and was elected to the Council of the Society of Systematic Biologists, beginning in 2008.




Chapter 1 Estimating Species Trees: An Introduction to Concepts and Models (L. Lacey Knowles and Laura S. Kubatko).

1.1 Introduction.

1.1.1 Different Tree Types and Their Relationship to Phylogeny.

1.2 The Relationship Between Gene Trees and Species Trees.

1.2.1 Evolutionary Mechanisms for Gene Tree Discord.

1.2.2 The Coalescent Process and Gene Tree Distributions.

1.2.3 Phylogenetic Extensions of the Coalescent Model.

1.3 The Relationship Between Sequence Data and Gene Trees.

1.3.1 Modeling DNA Sequence Evolution along a Gene Tree.

1.4 Statistical Inference of Species Trees.

1.4.1 ML.

1.4.2 Bayesian Analysis.

1.5 Collecting DNA Sequence Data.

1.6 Conclusions.


Chapter 2 Bayesian Estimation of Species Trees: A Practical Guide to Optimal Sampling and Analysis (Santiago Castillo-Ramírez, Liang Liu, Dennis Pearl and Scott V. Edwards).

2.1 Introduction.

2.1.1 Empirical Examples Using BEST.

2.2 Factors Influencing Confidence in Estimated Species Trees Using BEST.

2.2.1 Simulation Protocol.

2.2.2 Results of Simulations on Number and Length of Loci.

2.2.3 Multifactorial Prediction of Confidence in Species Trees.

2.2.4 Effect of the Number of Alleles Sampled per Locus on Species Tree Estimation.

2.2.5 Effect of Recombination on Species Tree Inference.

2.3 Some Tips on Running the BEST MCMC Algorithm.

2.4 Conclusions and Challenges.



Chapter 3 Reconstructing Concordance Trees and Testing the Coalescent Model from Genome-Wide Data Sets (Cécile Ané).

3.1 Introduction.

3.2 BCA: Background.

3.2.1 Sharing of Information across Gene Trees.

3.2.2 How to Choose the A Priori Level of Discordance α.

3.2.3 The Choice of an Infinite α in BCA.

3.2.4 A Nonparametric Prior Distribution on Gene Trees.

3.3 Genomic Support versus Statistical Support.

3.4 Comparing CFs of Contradicting Clades for Reconstructing the Dominant History.

3.5 Testing the Hypothesis That All Discordance is Due to ILS.

3.6 Species Tree Reconstruction from CFs.

3.7 The Challenge of Determining Loci on Whole-Genome Alignments.

3.7.1 The Assumption of Homogeneous, Unlinked Loci for GT/ST Reconstruction.

3.7.2 Detecting Recombination Breakpoints for GT/ST Reconstruction.

3.7.3 A Minimum Description Length (MDL) Information Criterion.

3.7.4 Comparisons with Other Partitioning Criteria.



Chapter 4 Probabilities of Gene Tree Topologies with Intraspecific Sampling Given a Species Tree (James H. Degnan).

4.1 Introduction.

4.2 Background and Terminology.

4.2.1 Incomplete Lineage Sorting.

4.2.2 Notation.

4.3 Gene Tree Topology Probabilities—Theory.

4.3.1 Enumerating Coalescent Histories.

4.3.2 The Probability of a Coalescent History.

4.3.3 Probability Mass Function for Gene Tree Topologies.

4.4 Gene Tree Topology Probabilities—Examples.

4.4.1 Enumeration of Coalescent Histories.

4.4.2 Calculation of Probabilities of Coalescent Histories.

4.5 Applications.

4.5.1 Probabilities of Multilabeled Trees.

4.5.2 Probability of Monophyletic Concordance.

4.5.3 AGTs.

4.6 Conclusions.


Appendix: Using Coal.

Using the Software.

Setting Up Species Tree Branch Lengths.

Chapter 5 Inference of Parsimonious Species Trees from Multilocus Data by Minimizing Deep Coalescences (Cuong Than and Luay Nakhleh).

5.1 Introduction.

5.2 Trees, Clusters, and the Compatibility Graph.

5.3 Valid Coalescent Histories, Extra Lineages, and the MDC Criterion.

5.4 Exact Algorithms for the MDC Problem.

5.4.1 An ILP Algorithm.

5.4.2 A DP Algorithm.

5.5 Handling Special Cases.

5.5.1 Multiple Individuals per Species.

5.5.2 Nonbinary Trees.

5.6 Performance of MDC.

5.7 Inference from the Clusters of the Gene Trees.

5.8 Using PhyloNet.

5.8.1 Using PhyloNet to Count Valid Coalescent Histories.

5.8.2 Using PhyloNet to Infer Species Trees Under MDC.

5.9 Conclusions.



Chapter 6 Accommodating Hybridization in a Multilocus Phylogenetic Framework (Laura S. Kubatko and Chen Meng).

6.1 Introduction.

6.2 Methods for Detecting Hybridization in the Presence of Incomplete Lineage Sorting.

6.3 A Phylogenetic Model for Hybridization in the Presence of Incomplete Lineage Sorting.

6.3.1 Estimation and Testing for the Hybridization Parameters: Gene Tree Data.

6.3.2 Estimation and Testing for the Hybridization Parameters: Sequence Data.

6.3.3 Comparison of Hybrid Species Phylogenies Using Gene Tree Data.

6.4 Application: Hybridization in the Heliconius Butterflies.

6.4.1 Estimation and Testing for the Hybridization Parameters: Application to the Estimated Gene Trees in Heliconius.

6.4.2 Estimation and Testing for the Hybridization Parameters: Application to Sequence Data in Heliconius.

6.4.3 Comparison of Hybrid Species Phylogenies for the Heliconius Gene Tree Data.

6.5 Conclusions and Future Directions.



Chapter 7 The Influence of Hybrid Zones on Species Tree Inference in Manakins (Robb T. Brumfield and Matthew D. Carling).

7.1 Introduction.

7.2 The Manacus Manakins.

7.2.1 Distribution.

7.2.2 Hybrid Zone between M. vitellinus and M. candei.

7.2.3 Two Contact Zones between M. vitellinus and M. manacus.

7.2.4 Inferring a Manacus Species Tree.

7.3 Is Introgression Across the Hybrid Zones Influencing the Species Tree Inference?

7.4 Conclusions.



Chapter 8 Summarizing Gene Tree Incongruence at Multiple Phylogenetic Depths (Karen A. Cranston).

8.1 Introduction.

8.2 Sample Data: Rice, Flies, and Yeast.

8.3 Bayesian Inference of Gene Trees.

8.4 Detecting Convergence Across Hundreds of Genes.

8.5 A Note on Combining Trees.

8.6 BCA.

8.7 gsi.

8.8 Triplet Analysis.

8.9 Missing Data.

8.10 Genomic Distribution of Gene Tree Incongruence.

8.11 Visualization of Gene Tree Incongruence.

8.12 Concluding Remarks.



Chapter 9 Species Tree Estimation for Complex Divergence Histories: A Case Study in Neodiprion Sawflies (Catherine R. Linnen).

9.1 Introduction.

9.2 Study System: Neodiprion Sawflies.

9.3 Sampling Strategy.

9.4 Determining the Source of Mitonuclear Discordance.

9.5 Approaches for Species Tree Estimation.

9.5.1 Concatenation with Monophyly Constraints (CMC).

9.5.2 Minimize Deep Coalescences (MDC).

9.5.3 Shallowest Divergences (SD).

9.5.4 Bayesian Estimation of Species Trees (BEST).

9.6 Comparison of Species Tree Estimates.

9.7 Comparison of Gene Trees to Species Trees.

9.8 Conclusions and Future Directions.


Chapter 10 Sampling Strategies for Species Tree Estimation (L. Lacey Knowles).

10.1 Introduction.

10.2 Information Content in DNA Sequences for Species Tree Inference.

10.3 Why Phylogenetic History Dictates Appropriate Sampling Strategy.

10.4 Properties of the Data That Impact Sampling Decisions.

10.5 Making Informed Decisions about Sampling Strategies.

10.5.1 Where Does the Initial Species Tree Come from?

10.5.2 Is There Consistency in the Estimated Species Tree Given the Data?

10.6 Summary.



Chapter 11 Developing Nuclear Sequences for Species Tree Estimation in Nonmodel Organisms: Insights from a Case Study of Bottae's Pocket Gopher, Thomomys bottae (Natalia M. Belfiore).

11.1 Introduction.

11.2 Pocket Gophers.

11.3 Marker Generation Approach and Methodological Comments.

11.3.1 Library Construction.

11.3.2 Subtraction of High-Copy-Number Regions.

11.3.3 Locus Characterization by Genomic Approaches.

11.3.4 Primer Design Experiments.

11.3.5 Locus Evaluation for Inclusion in the Study.

11.3.6 Variation within the Library Construction Species.

11.3.7 Inclusion of Loci and Data Generation within the Genus.

11.4 Data Management and Analysis.

11.4.1 Handling Data and Choosing Analysis Programs.

11.4.2 Phylogenetic Analysis.

11.5 Conclusions.



Chapter 12 Estimating Species Relationships and Taxon Distinctiveness in Sistrurus Rattlesnakes Using Multilocus Data (Laura S. Kubatko and H. Lisle Gibbs).

12.1 Introduction.

12.1.1 Sistrurus Rattlesnakes.

12.2 Analysis of Species and Subspecific Relationships.

12.2.1 Estimation of the Species Phylogeny.

12.2.2 Distinctiveness of Subspecies.

12.2.3 Phased versus Unphased Data.

12.3 Species Tree Estimation.

12.3.1 Estimation Using Gene Trees as Data.

12.3.2 Estimation Using Sequences as Data.

12.4 Distinctiveness Among Species and Subspecies.

12.4.1 Phased Data.

12.4.2 Unphased Data and the Effect of Sample Size.

12.5 Evolutionary and Conservation Implications.

12.6 Conclusions.





"I do not disagree, and I therefore hope that this book will stimulate theoreticians to become involved in exploring specific topics in more depth and encourage empiricists to test the available methods by using them in their own data analyses. In this way, phylogenetics should move closer to its ultimate practical goal of producing accurate evolutionary histories for all known organisms." (Systematic Biology, 15 June 2011)