
When DNA Speaks of Nature — the Use of Genetic Data in Biodiversity Research

The biodiversity we see around us is the product of millions of years of evolution on Earth. To understand how animals and plants evolved or came to exist on this planet, one can make use of the deoxyribonucleic acid (DNA) molecules present in the cells of most living organisms. DNA carries biological instructions that are inherited from one generation to the next, accumulating changes along the way, a process known as “descent with modification”. The ability to read this record of history stored in DNA is an essential part of a biologist’s toolkit. It allows us to understand the tree of life: the origin of species, the process of species formation, the evolution of form and function, the relationships between organisms, and their responses (evolutionary adaptations) to environmental change.

Certain mutations in genes, that is, changes in DNA sequences, can translate into changes in the proteins they encode; for enzymes, the resulting variants are known as allozymes. In the 1970s, researchers used allozyme analysis, which detects protein variants by their electric charge, as a proxy for studying genetic variation. By the 1980s, over a thousand animal species had been screened at tens of allozyme loci, which allowed comparisons across different groups. One study found that cheetahs in South Africa had very low genetic variation because of a drastic reduction in their population size in recent history. This loss of individuals, and the consequent inbreeding among close relatives, contributed to their low genetic diversity.


A timeline of
Photo Credit: Jahnavi Joshi

The development of Sanger sequencing in the late 1970s and the polymerase chain reaction (PCR) in the mid-1980s revolutionized molecular biology by allowing us to make copies of DNA from small quantities of sample and to determine the order of the units that make up a DNA sequence. Using these techniques to generate DNA sequences across multiple individuals and species allowed scientists to view the genetic material directly, detect mutations and thus measure genetic variation. In one of the earliest studies using DNA sequence data, researchers compared a single gene across species to build an evolutionary tree that classified living organisms into three broad groups. Since then, multiple genes and non-protein-coding DNA sequences have been combined and analyzed to establish relationships between species. For example, a large global evolutionary tree of flowering plants has helped us understand the characteristics that enabled some groups to expand into novel environments. DNA sequence data have been especially useful in identifying morphologically cryptic species, which cannot be told apart by their external appearance.

Apart from revealing the evolutionary relationships between species, DNA sequences are also used to understand how genetic variation is distributed geographically within a species or among closely related species. Since the 1970s and 80s, maternally inherited genes have been widely used in such studies. They have revealed, for example, that populations of several marine species diverged between the Atlantic and Pacific Oceans with the closure of the Isthmus of Panama. Such maternally inherited markers have also been used to study the social structure of animals; they have shown that female humpback whales follow specific migration routes over generations, and that these routes differ across ocean basins. In addition to gene sequences, non-protein-coding regions of DNA such as microsatellites have been widely used to understand relationships between individuals within a species. A recent study examined the effects of forest fragmentation on the genetic connectivity of four mammals in central India using DNA obtained from fecal samples. The microsatellite data showed that anthropogenic factors had varying impacts on species depending on their biology, with the largest impact on tigers, followed by leopards, sloth bears and jungle cats.

Rapid advances in sequencing technology now allow us to go beyond sequencing individual genes to sequencing large stretches of DNA across an organism’s entire genetic material. These genomic approaches use parallel sequencing to generate hundreds of gigabases of DNA sequence data, which brings analytical challenges: the data demand high computational power and sophisticated mathematical models. Many of these techniques can also use trace amounts of DNA from the natural environment, which allows researchers to quickly survey the biodiversity of poorly studied regions and taxa. For example, thousands of DNA sequences from soil samples enabled scientists to estimate the invertebrate diversity of a remote island in New Zealand. High-resolution genome data can help researchers tell apart closely related species when traditional genetic markers fail, as has been done for cichlid fish in Lake Victoria, Africa. Technological advances have also allowed researchers to use poor-quality DNA, as in the case of eastern lowland gorillas, where old museum specimens helped scientists understand the genetic effects of severe population declines in recent history.


Bharti Dharapuram is a postdoctoral researcher at the CSIR-Centre for Cellular and Molecular Biology. She is interested in processes driving patterns of species distribution and genetic diversity, especially in poorly studied terrestrial and marine invertebrates.

Jahnavi Joshi is an Assistant Professor at the CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India. She studies systematics, biogeography, diversification, and community assembly in Asian tropical forests, primarily using arthropods as a model system.

This series is an initiative by the Nature Conservation Foundation (NCF), under their programme ‘Nature Communications’ to encourage nature content in all Indian languages. To know more about birds and nature, join The Flock.