We’re all related, and I don’t just mean us humans, though that’s most definitely true! All living things on Earth can trace their descent back to a common ancestor. Any smaller group of species can also trace its ancestry back to a common ancestor, often a much more recent one.
Given that we can’t go back in time and see how species evolved, how can we figure out how they are related to one another? In this article, we’ll look at the basic methods and logic used to build phylogenetic trees, or trees that represent the evolutionary history and relationships of a group of organisms.
What is a Phylogenetic Tree?
When we draw a phylogenetic tree, we are representing our best hypothesis about how a set of species (or other groups) evolved from a common ancestor. As we’ll explore further in the article on building trees, this hypothesis is based on information we’ve collected about our set of species – things like their physical features and the DNA sequences of their genes.
In a phylogenetic tree, the species or groups of interest are found at the tips of lines referred to as the tree’s branches. For example, the phylogenetic tree below represents relationships between five species, A, B, C, D, and E, which are positioned at the ends of the branches.
The pattern in which the branches connect represents our understanding of how the species in the tree evolved from a series of common ancestors. Each branch point (also called an internal node) represents a divergence event, or splitting apart of a single group into two descendant groups.
At each branch point lies the most recent common ancestor of all the groups descended from that branch point.
For instance, at the branch point giving rise to species A and B, we would find the most recent common ancestor of those two species. At the branch point right above the root of the tree, we would find the most recent common ancestor of all the species in the tree (A, B, C, D, E).
Each horizontal line in our tree represents a series of ancestors, leading up to the species at its end. For instance, the line leading up to species E represents the species’ ancestors since it diverged from the other species in the tree. Similarly, the root represents a series of ancestors leading up to the most recent common ancestor of all the species in the tree.
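To make the idea of branch points and most recent common ancestors concrete, here is a minimal sketch in Python. It assumes a tree can be written as nested tuples, with strings as tips. The topology is partly hypothetical: the text only tells us that A and B share a branch point and that E diverged from all the other species, so the placement of C and D is an assumption made for illustration, as are the function names.

```python
# Nested tuples stand in for branch points; strings are the tips.
# Assumed topology: the text only fixes A and B as sisters and E as the
# earliest split. Where C and D attach is made up for this sketch.
TREE = (((("A", "B"), "C"), "D"), "E")


def tips(node):
    """Return the set of tip labels found at or below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found


def mrca(node, targets):
    """Return the smallest subtree that still contains every target tip."""
    targets = set(targets)
    if not targets <= tips(node):
        raise ValueError("target not found below this node")
    if isinstance(node, str):
        return node
    for child in node:
        if targets <= tips(child):
            return mrca(child, targets)  # all targets sit on one side: go deeper
    return node  # targets are split across children: this is the branch point


print(mrca(TREE, {"A", "B"}))                 # ('A', 'B')
print(sorted(tips(mrca(TREE, {"B", "D"}))))   # ['A', 'B', 'C', 'D']
```

Asking for the common ancestor of A and E returns the whole tree, which matches the statement that the branch point just above the root is the most recent common ancestor of every species shown.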
How to Read an Evolutionary Tree
Understanding a phylogeny is a lot like reading a family tree. The root of the tree represents the ancestral lineage, and the tips of the branches represent the descendants of that ancestor. As you move from the root to the tips, you are moving forward in time.
When a lineage splits (speciation), it is represented as branching on a phylogeny. When a speciation event occurs, a single ancestral lineage gives rise to two or more daughter lineages.
Phylogenies trace patterns of shared ancestry between lineages. Each lineage has a part of its history that is unique to it alone and parts that are shared with other lineages.
Similarly, each lineage has ancestors that are unique to that lineage and ancestors that are shared with other lineages (common ancestors).
A clade is a grouping that includes a common ancestor and all the descendants (living and extinct) of that ancestor. Using a phylogeny, it is easy to tell if a group of lineages forms a clade. Imagine clipping a single branch off the phylogeny: all of the organisms on that pruned branch make up a clade.
Clades are nested within one another, forming a nested hierarchy. A clade may include many thousands of species or just a few. Some examples of clades at different levels are marked on the phylogenies below. Notice how clades are nested within larger clades.
So far, we’ve said that the tips of a phylogeny represent descendant lineages. Depending on how many branches of the tree you are including, however, the descendants at the tips might be different populations of a species, different species, or different clades, each composed of many species.
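The "clip a branch" test for clades, and the claim that clades form a nested hierarchy, can be sketched in a few lines of Python. The tree below is a hypothetical five-tip example written as nested tuples, and the helper names are assumptions for illustration. Every branch you could clip corresponds to one set of tips.

```python
# Hypothetical example tree: nested tuples are branch points, strings are tips.
TREE = (((("A", "B"), "C"), "D"), "E")


def collect_clades(node, found):
    """Record the tip set of every branch at or below node; return node's tip set."""
    if isinstance(node, str):
        members = frozenset([node])
    else:
        members = frozenset().union(*(collect_clades(child, found) for child in node))
    found.append(members)
    return members


def nested_or_disjoint(a, b):
    """Two clades either contain one another or share no tips at all."""
    return a <= b or b <= a or not (a & b)


all_clades = []
collect_clades(TREE, all_clades)

for clade in all_clades:
    print(sorted(clade))

# Every pair of clades passes the nesting check.
print(all(nested_or_disjoint(a, b) for a in all_clades for b in all_clades))
```

Note that a set such as {B, C} never appears in the list, because no single clipped branch in this tree carries B and C without also carrying A.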
Types of Groupings
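- A group is monophyletic when it includes the most recent common ancestor of all the organisms in the group and all the descendants of that ancestor. Only monophyletic groups are clades.
- A group is paraphyletic if it includes the most recent common ancestor but excludes one or more of that ancestor’s descendants.
- A group is polyphyletic when it excludes the most recent common ancestor of its members.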
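As a rough illustration of the monophyletic case, the sketch below checks whether a set of tips is exactly what hangs off one branch point. The tree and the function names are hypothetical. The check only separates monophyletic from non-monophyletic groups. Telling paraphyletic from polyphyletic also depends on whether the common ancestor is counted as a member, which tip labels alone do not record.

```python
# Hypothetical example tree: nested tuples are branch points, strings are tips.
TREE = (((("A", "B"), "C"), "D"), "E")


def tips(node):
    """Return the set of tip labels found at or below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found


def is_monophyletic(tree, group):
    """True if group is exactly the set of tips descending from one branch point."""
    group = set(group)
    node = tree
    # Walk down to the smallest subtree that still contains the whole group.
    while not isinstance(node, str):
        narrower = [child for child in node if group <= tips(child)]
        if not narrower:
            break
        node = narrower[0]
    return tips(node) == group


print(is_monophyletic(TREE, {"A", "B", "C"}))  # True
print(is_monophyletic(TREE, {"A", "B", "D"}))  # False: C is a descendant left out
```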
Types of Phylogenetic Tree
The two main types of phylogenetic trees are cladograms and phylograms. Cladograms do not have scaled branches, so their branch lengths carry no information about the amount of time or change between branch points, while phylograms do have scaled branches. Both cladograms and phylograms can be rooted or unrooted.
#1. Cladograms.
Cladograms are not scaled, meaning that the spacing between branch points is arbitrary and does not represent the actual amount of time or evolutionary change between them. Because they show only the branching pattern, cladograms are often used to depict hypothesized evolutionary relationships relatively quickly.
#2. Phylograms.
A phylogram is a type of phylogenetic tree that represents the evolutionary relationships among organisms by showing both the branching pattern and the amount of evolutionary divergence. Phylograms are scaled, which means that the branch lengths are proportional to the amount of evolutionary divergence.
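One way to see the difference between the two tree types is to store a small phylogram as a table of branches with lengths, then strip the lengths to get the matching cladogram. This is a sketch only: the node names and branch lengths below are invented for illustration, and the function names are assumptions.

```python
# child -> (parent, branch length). All names and lengths are invented.
PHYLOGRAM = {
    "A": ("n1", 0.1),
    "B": ("n1", 0.3),
    "n1": ("root", 0.2),
    "C": ("root", 0.6),
}


def path_to_root(tree, name):
    """Map each ancestor of name (and name itself) to its distance from name."""
    distance = 0.0
    seen = {name: 0.0}
    while name in tree:
        parent, length = tree[name]
        distance += length
        seen[parent] = distance
        name = parent
    return seen


def divergence(tree, x, y):
    """Sum of branch lengths along the path joining tips x and y."""
    up_x, up_y = path_to_root(tree, x), path_to_root(tree, y)
    # The shared ancestor with the smallest total is the most recent one.
    return min(up_x[n] + up_y[n] for n in up_x if n in up_y)


def as_cladogram(tree):
    """Drop the branch lengths and keep only the branching pattern."""
    return {child: parent for child, (parent, _length) in tree.items()}


print(round(divergence(PHYLOGRAM, "A", "B"), 3))
print(round(divergence(PHYLOGRAM, "A", "C"), 3))
print(as_cladogram(PHYLOGRAM))
```

The cladogram version still says that A and B share the branch point `n1`, but it can no longer say how much divergence separates them.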
Other Types
#3. Rooted trees.
Rooted trees are trees that have a specified root node, which represents the common ancestor of all the organisms in the tree.
#4. Unrooted trees.
Unrooted trees do not have a specified root node and show only the branching pattern of the evolutionary relationships among taxa or OTUs, without any information about their common ancestor.
How to Make a Phylogenetic Tree
A phylogenetic tree may be built using morphological (body shape), biochemical, behavioral, or molecular features of species or other groups. In building a tree, we organize species into nested groups based on shared derived traits (traits different from those of the group’s ancestor).
Different types of data, such as nuclear and mitochondrial gene sequences, ribosomal RNA sequences, protein sequences, structural features, types of organs, and fossil studies, are used to create a phylogenetic tree.
These data are used to identify homology (similarity due to common ancestry) among living organisms, whether plants or animals. For example, all humans have large brains and body hair, traits inherited from our ancestors. Likewise, all mammals produce milk from their mammary glands.
Phylogenetic trees are often built using the principle of parsimony, which says that the most likely pattern is the one that requires the fewest evolutionary changes.
For example, body hair in present-day humans is assumed to exist because their ancestors had body hair, rather than because multiple groups of organisms each developed it independently.
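The parsimony idea can be made concrete by counting the minimum number of trait changes a tree requires. The sketch below uses Fitch's small-parsimony counting method, which is one standard way to do this and is not necessarily what any particular tree-building program uses. The species, the 0/1 hair coding, and both candidate trees are assumptions chosen for illustration, and the function only handles strictly two-way branch points.

```python
# Trait coding is an assumption for this sketch: 1 = body hair, 0 = no body hair.
HAIR = {"human": 1, "mouse": 1, "platypus": 1, "lizard": 0, "frog": 0}

# Two candidate trees as nested pairs. GROUPED keeps the hairy species together.
GROUPED = (((("human", "mouse"), "platypus"), "lizard"), "frog")
SCATTERED = ((("human", "lizard"), ("mouse", "frog")), "platypus")


def fitch(node, states):
    """Return (possible ancestral states, minimum number of changes) for node."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node  # assumes every branch point has exactly two children
    left_states, left_changes = fitch(left, states)
    right_states, right_changes = fitch(right, states)
    shared = left_states & right_states
    if shared:
        # Both sides can agree on a state, so no extra change is needed here.
        return shared, left_changes + right_changes
    # The sides disagree: at least one change happened just below this node.
    return left_states | right_states, left_changes + right_changes + 1


print(fitch(GROUPED, HAIR)[1])    # 1
print(fitch(SCATTERED, HAIR)[1])  # 2
```

Under these assumptions the grouped tree needs a single origin of hair, while the scattered tree needs two changes, so parsimony favors the grouped tree.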
Importance of Phylogenetic Trees
Phylogenetic trees are important tools for organizing knowledge of biological diversity, and they communicate hypothesized evolutionary relationships among nested groups of taxa (monophyletic groups) that are supported by shared traits known as synapomorphies.
Given the increasing use of phylogenies across the biological sciences, it is now essential that biology students learn what tree diagrams do (and do not) communicate.
Developing “tree thinking” skills also has other benefits. Most importantly, trees provide an efficient structure for organizing knowledge of biodiversity and allow one to develop an accurate, nonprogressive conception of the totality of evolutionary history.
It is therefore important for all aspiring biologists to develop the skills and knowledge needed to understand phylogenetic trees and their place in modern evolutionary theory.
Limitations of Phylogenetic Analysis
Although phylogenetic trees produced on the basis of sequenced genes or genomic data in different species can provide evolutionary insight, these analyses have important limitations.
Most importantly, the trees that they generate are not necessarily correct: they do not necessarily represent the evolutionary history of the included taxa accurately.
As with any scientific result, they are subject to falsification by further study (e.g., gathering of additional data, analyzing the existing data with improved methods).
The data on which they are based may be noisy; the analysis can be confounded by genetic recombination, horizontal gene transfer, hybridisation between species that were not nearest neighbors on the tree before hybridisation takes place, convergent evolution, and conserved sequences.
Also, there are problems in basing an analysis on a single type of character, such as a single gene or protein, or only on morphological analysis, because trees constructed from another, unrelated data source often differ from the first. Great care is therefore needed in inferring phylogenetic relationships among species.
This is most true of genetic material that is subject to lateral gene transfer and recombination, where different haplotype blocks can have different histories.
In these types of analysis, the output tree of a phylogenetic analysis of a single gene is an estimate of the gene’s phylogeny (i.e. a gene tree) and not the phylogeny of the taxa (i.e. species tree) from which these characters were sampled, though ideally, both should be very close.
For this reason, serious phylogenetic studies generally use a combination of genes that come from different genomic sources (e.g., from mitochondrial or plastid vs. nuclear genomes), or genes that would be expected to evolve under different selective regimes, so that homoplasy (false homology) would be unlikely to result from natural selection.
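The observation that a gene tree can disagree with the species tree can be quantified by comparing which groupings each tree contains. The sketch below counts the clades found in one rooted tree but not the other, in the spirit of the Robinson-Foulds distance. Both trees are hypothetical, and the function names are assumptions for illustration.

```python
# Hypothetical trees over the same four taxa, written as nested tuples.
SPECIES_TREE = ((("A", "B"), "C"), "D")
GENE_TREE = ((("A", "C"), "B"), "D")


def clade_sets(tree):
    """Return every multi-tip clade in the tree as a frozenset of tip labels."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(visit(child) for child in node))
        found.add(members)
        return members

    visit(tree)
    return found


def clade_difference(tree_a, tree_b):
    """Count the clades that appear in exactly one of the two trees."""
    return len(clade_sets(tree_a) ^ clade_sets(tree_b))


print(clade_difference(SPECIES_TREE, GENE_TREE))     # 2
print(clade_difference(SPECIES_TREE, SPECIES_TREE))  # 0
```

Here the two trees agree on the larger groupings and differ only in whether A pairs with B or with C, which is the kind of local conflict that combining several independent genes is meant to resolve.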
When extinct species are included as terminal nodes in an analysis (rather than, for example, to constrain internal nodes), they are considered not to represent direct ancestors of any extant species. Extinct species do not typically contain high-quality DNA.
The range of useful DNA materials has expanded with advances in extraction and sequencing technologies. Development of technologies able to infer sequences from smaller fragments, or from spatial patterns of DNA degradation products, would further expand the range of DNA considered useful.
Phylogenetic trees can also be inferred from a range of other data types, including morphology, the presence or absence of particular types of genes, insertion and deletion events and any other observation thought to contain an evolutionary signal.
Phylogenetic networks are used when bifurcating trees are not suitable, due to these complications which suggest a more reticulate evolutionary history of the organisms sampled.