One of the great joys of being a scientist is that we get to spend our lives exploring the aspects of the natural world that most intrigue and excite us. However, the equally great frustration of being a researcher is that our curiosity and passion invariably outstrip the resources available for our explorations. It often feels like we spend the bulk of our creative energy begging for money, and when this is declined — as it often is — it can be crushing. What keeps us going is the conviction that what we are doing, and what we have not yet found a way to do, is interesting and important and worth pursuing.
The primary focus of my research is the evolution of genome size in animals. Genome size is the amount of DNA in one copy of the chromosome set of a species, generally measured in terms of the number of base pairs (bp) or in mass (in picograms, or 10⁻¹² g). What makes this an intriguing topic of research is the enormous variability that exists across species: in animals, genome sizes range more than 7,000-fold. Think about that for a moment. Some animals have 7,000 times more DNA in their cells than others. Even within vertebrates, there is huge diversity at the genomic level: the largest (lungfish) is 350 times larger than the smallest (pufferfish). Or consider amphibians, which range about 120-fold from the smallest in some frogs to the largest in a few aquatic salamanders.
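For readers who want to move between the two units, here is a minimal sketch of the conversion. It assumes the commonly used approximation that 1 pg of DNA corresponds to roughly 978 million base pairs; the function names are my own and purely illustrative, and the example value is the roughly 3.2 billion bp human genome.

```python
# Assumption: 1 pg of DNA is approximately 0.978 billion base pairs.
# This is a widely used approximation, not an exact constant.
BP_PER_PG = 0.978e9


def pg_to_bp(picograms: float) -> float:
    """Convert a genome size in picograms to an approximate base-pair count."""
    return picograms * BP_PER_PG


def bp_to_pg(base_pairs: float) -> float:
    """Convert a genome size in base pairs to an approximate mass in picograms."""
    return base_pairs / BP_PER_PG


if __name__ == "__main__":
    human_bp = 3.2e9  # approximate human genome size in base pairs
    print(f"{bp_to_pg(human_bp):.2f} pg")  # about 3.27 pg
```

Under this approximation, a genome size quoted in picograms and one quoted in billions of base pairs are nearly the same number, which is why the two units are often used interchangeably in casual discussion.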
The human genome contains about 3.2 billion base pairs. In the simplest terms, one might expect this to be the largest genome of all — humans are the most complicated organisms (right?) and that should require the most genes (right?) which in turn means more DNA (right?). This was indeed the assumption when researchers began assessing genome sizes in the late 1940s — before the structure of DNA was elucidated, and even before it had been established that DNA is the hereditary molecule. At this time it was reported that the amount of DNA in a species’ cells is mostly constant (thus, genome size is also called “C-value”). This itself was suggested to indicate that DNA, and not protein, serves as the molecular basis of inheritance. However, it was also obvious by 1951 that the amount of DNA varies dramatically among species, and that the “complexity” of an animal and its genome size are decoupled. There are, it was discovered, salamanders with 40x more DNA per genome than in humans. This made no sense. DNA amount is constant within species because it is what genes are made of, and yet more complicated organisms (which presumably require more genes) may have substantially less DNA in their genomes than simpler organisms. This became known as the “C-value paradox” in the early 1970s.
It was not long before the apparent “paradox” was resolved: most DNA in animal and plant genomes is not genes (it is “non-coding DNA”). This means that genome size need not be related to the number of protein-coding genes, and that there is no reason to expect more complex animals to have more DNA in their genomes. However, this raised many new questions: What is this non-coding DNA? Where does it come from? How does it increase or decrease in amount in different genomes? Does it have any effect on the organism? Does it have any function? Why do some species have so much of it and others so little?
Despite several decades of research, most of these questions remain at best only partially answered. This is where my lab’s research comes in. We are interested in genome size diversity across all animals, in its effects on organism biology, and in the factors ranging in scale from individual DNA elements to ecological properties that accentuate or constrain amounts of DNA in the genomes of different species.
One thing that has become clear over the past several decades is that genome size is not randomly distributed across taxa. Some, like birds, all seem to have relatively small genomes. Others, like salamanders, all have large genomes. The quantity of DNA also relates to important features such as cell size and cell division rate, such that large genomes are found in cells that are big and divide slowly. Because all animals are made of cells, this means that any feature relating to cell size or cell division rate could be indirectly related to genome size. Body size is an obvious possibility, at least when cell numbers are held mostly constant. Metabolic rate is another possibility, because the larger a cell gets, the lower its relative surface area is, and this can influence gas exchange. Developmental rate is yet another, because slower individual cell divisions can add up to protracted development overall.
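The point about relative surface area can be made concrete with a little arithmetic. The sketch below assumes an idealized spherical cell, which real cells are not, and the radii are hypothetical values chosen only for illustration. For a sphere, surface area divided by volume simplifies to 3 divided by the radius, so doubling the radius halves the relative surface area.

```python
import math


def surface_to_volume(radius: float) -> float:
    """Surface-area-to-volume ratio of a sphere (idealized cell) of given radius."""
    surface = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return surface / volume  # algebraically equal to 3 / radius


if __name__ == "__main__":
    # Hypothetical cell radii in micrometres, for illustration only.
    for radius in (5.0, 10.0, 20.0):
        ratio = surface_to_volume(radius)
        print(f"radius {radius} um -> ratio {ratio:.3f} per um")
```

The ratio falls steadily as the radius grows, which is the geometric reason a bigger cell has proportionally less membrane through which to exchange gases.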
We have found that body size is correlated with genome size not only in some invertebrates like flatworms and copepod crustaceans, but also within specific groups of vertebrates like rodents, bats, and birds. Inverse relationships between genome size and metabolic rate have been reported in both mammals and birds, and in particular it has been argued that flight imposes a constraint on genome size due to its high metabolic demands. This latter idea has been around for several years, but it has recently become the subject of renewed interest and some intriguing new discoveries. For example, my colleague Chris Organ has used fossil cell size measurements to reveal that theropod dinosaurs (the lineage from which birds evolved) already had somewhat reduced genome sizes relative to other lineages before birds evolved, and that pterosaurs (the first vertebrates to evolve flight) also had small genomes. One of my students has been working on flight in birds, and showed that wing parameters associated with flight ability are related to genome size as well. We have also found recently that hummingbirds have the smallest genomes among birds (this isn’t published yet, but we’re writing the paper as we speak).
In terms of development, we have found in insects like lady beetles and vinegar flies that larger genomes are associated with slower overall development. Similar correlations have been known for some time in amphibians. What is more interesting is the pattern that we see with regard to metamorphosis, which represents a period of rapid and extreme physical reorganization. Groups with intensive metamorphosis, like frogs living in deserts that complete their life cycle quickly during wet seasons, have very small genomes (smaller than birds). Others, like aquatic salamanders that have lost the ability to metamorphose, have some of the largest genomes among animals. This also seems to apply to the major lineages of insects. Orders exhibiting complete metamorphosis (“holometabolous development”) appear almost never to exceed about 2 billion base pairs, whereas some without complete metamorphosis (“hemimetabolous development”) can be very large — there are grasshoppers with 5x more DNA than in humans.
Although genome size has been investigated for more than 60 years, some of these trends are only now coming to light. One reason is that we are focusing on the “big picture” now. Another reason is that we have technology that allows us to estimate genome sizes for large numbers of species. To give one example, an undergraduate student and I produced new data for more than 300 species of moths last summer alone. Previously, only 50 moth species had been analyzed (almost all of them in a pilot study I did a few years ago). Of course, this is a minuscule fraction of the 180,000 or so described species in the order, but it’s infinitely better than no information at all. Various students of mine have begun filling other major gaps, including in mammals, birds, insects, worms, and molluscs, but a huge amount of work remains just to get a basic picture of genomic diversity and its significance.
Over the upcoming series of posts, I will highlight some of the projects that I am very interested in undertaking, but which are on indefinite hold due to lack of funds. (It’s not that I haven’t tried, but granting agencies tend not to like this kind of large-scale “discovery” science as compared to the testing of very focused hypotheses.) There are several reasons why I think it is worth doing this. First, most members of the public get only snippets of what goes on in research labs, most often provided by news reports. The raw curiosity that drives basic research is not often conveyed, particularly when projects are first conceived (vs. once they’re completed and published). Second, this is the stuff that gets me out of bed in the morning, and I hope that others can share in the excitement that my students and I feel when we think about, and try to answer, these fundamental questions about the diversity of life. Third, I believe it is useful for people to grasp the frustration that every scientist lives with when he or she feels that there are great ideas collecting dust for simple lack of funds. Finally, it provides an opportunity to talk about some intriguing animal groups from a perspective that most people haven’t considered. In that sense, it should be an interesting exercise in thinking about the wondrous biological diversity that surrounds us.
In the meantime, you are welcome to explore the Animal Genome Size Database to get a sense of the tremendous diversity — and glaring gaps in our knowledge — that drive my research program.
excellent post. I work in the testing hypothesis side of research, which is exploratory, but can be constricting in focus at times.