Concept contrasts.

Many science blogs have a regular series on particular subjects. Thus far I have not done anything like that, but I think the “X vs. Y” pieces could make for a useful series. I shall dub it “Concept contrasts”, and present the first three in the following compendium.

I will update this list whenever a new entry in the series is posted.


Inter-lineage selection versus "just in case".

I still want to grant the benefit of the doubt to my fellow biologists who have recently made statements about non-coding DNA being potentially useful in the future. Natural selection does not work this way: it cannot retain something for a possible future use, because it is simply the differential survival and reproduction of entities based on heritable differences. In the most common case, this means individual organisms within populations surviving or dying, and leaving more or fewer offspring, in a non-random manner under given conditions due to heritable trait differences. However, the general principle of natural selection is not restricted to this level, and is a logical consequence in any circumstance in which there is differential survival and reproduction based on inherited variation. There can be selection within the genome among transposons, for example, and some authors also argue that selection can take place among species (as differential speciation and extinction).

The most straightforward way of thinking about natural selection is to imagine that a certain genetic trait is either beneficial or detrimental to an organism, such that it is passed on either more or less commonly to subsequent generations. However, there can be higher-order selection as well, in which some lineages persist longer or branch off to form additional daughter lineages more often than others for non-random reasons. This higher-order sorting is not why the traits involved originated, nor why they are maintained from one generation to the next, but it could explain why lineages with those traits are more common or last longer than others.
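To make the idea of higher-order sorting concrete, here is a minimal sketch in Python. It is a toy calculation, not a model taken from any study: the function name, the rates, and the starting counts are all invented for illustration. It tracks only the expected number of lineages, assuming each lineage branches and goes extinct at fixed per-step rates.

```python
def expected_lineages(n0, branching_rate, extinction_rate, steps):
    """Expected number of lineages after `steps` time steps.

    Each step, the expected count is multiplied by
    (1 + branching_rate - extinction_rate). All values are hypothetical.
    """
    n = float(n0)
    for _ in range(steps):
        n *= 1.0 + branching_rate - extinction_rate
    return n


# Two kinds of lineage that branch equally often; the trait only
# lowers the extinction rate slightly (assumed numbers).
with_trait = expected_lineages(100, branching_rate=0.05, extinction_rate=0.03, steps=200)
without_trait = expected_lineages(100, branching_rate=0.05, extinction_rate=0.04, steps=200)

# Fraction of all surviving lineages that carry the trait.
share_with_trait = with_trait / (with_trait + without_trait)
print(f"Share of lineages carrying the trait: {share_with_trait:.2f}")
```

Both kinds of lineage start out equally common, yet lineages with the trait end up as the large majority. Nothing in the calculation says anything about why the trait arose or why it is kept within any one population, which is exactly the gap that the shorter-term explanations have to fill.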

As an example, consider sex. Sexual reproduction involves the recombination of genes, which has two important effects: 1) it allows beneficial mutations to spread more easily in a population, and 2) it prevents the ratchet-like accumulation of deleterious mutations at multiple loci. What this means is that sexual lineages can be expected to evolve more quickly and to last longer than asexual lineages. So, when we look around, we expect to see more sexual lineages than asexual ones, and indeed that is what we see (at least in animals). Sex did not evolve so that lineages would have greater evolutionary potential or would survive for a longer time, but that is nevertheless a significant effect when considering the distribution of biological diversity. However, sexual reproduction is also costly: you pass on only half your genes, you produce “wasteful” males, you have to find a mate, and so on. We therefore also need to consider immediate benefits that keep the trait around long enough for us to even notice the higher-order effects.

Now back to “junk DNA”. It may be that over the long term, lineages with more non-coding DNA are more flexible and can diverge more often, or that they are more resilient to environmental change and will last longer than those with less DNA. If this is so, then this might explain why we see lineages with lots of non-coding DNA — because those lineages persisted while others disappeared. We would still have to explain the origin of the non-coding DNA and the reason it persists over the shorter term though. There are several possibilities. One, non-coding DNA is beneficial to the organism in some way. Lots of ideas have been proposed for this over the last half century. Two, non-coding DNA could be neutral and is simply not eliminated by selection. Three, non-coding DNA is slightly detrimental, but selection has been too weak (e.g., if populations are small) or mutation too strong (e.g., continual transposable element insertions) for it to be deleted. In any of these situations, it could be possible for non-coding DNA to persist long enough to be co-opted (by chance mutations and subsequent selection) or to have impacts on lineage diversification and/or lifespan.

The problem with this lineage-level explanation is that species with small genomes are much more common than ones with large genomes, and large-genomed species seem to be more sensitive to environmental challenges. So, the most likely scenario is that mutational mechanisms affect DNA amount from the bottom up, while selection comes into play from the top down in terms of effects on cell size and selection against disruptions of genes. On balance, some lineages end up with large amounts of non-coding DNA, and in some cases this is co-opted into functions like regulation or structure.
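The bottom-up versus top-down balance can be illustrated with a deliberately crude sketch. This is my own toy arithmetic, not a published model: the function name, the units, and every number are assumptions. Mutation adds a fixed amount of DNA each step, and a loss term proportional to current genome size stands in for selection against large cells and gene disruption.

```python
def genome_size_trajectory(start, gain_per_step, loss_fraction, steps):
    """Toy genome size over time, in arbitrary units.

    gain_per_step: DNA added by mutation each step (bottom up).
    loss_fraction: share of the genome removed each step, a crude
                   stand-in for top-down selection (hypothetical).
    """
    size = float(start)
    sizes = [size]
    for _ in range(steps):
        size = size + gain_per_step - loss_fraction * size
        sizes.append(size)
    return sizes


# Same mutational input, different tolerance for extra DNA (assumed values).
tolerant = genome_size_trajectory(1.0, gain_per_step=0.1, loss_fraction=0.01, steps=2000)
intolerant = genome_size_trajectory(1.0, gain_per_step=0.1, loss_fraction=0.10, steps=2000)

print(f"Tolerant lineage settles near {tolerant[-1]:.2f}")
print(f"Intolerant lineage settles near {intolerant[-1]:.2f}")
```

In this sketch each lineage settles where gain and loss balance, at gain_per_step divided by loss_fraction. The lineage with the large genome does not have it so that anything can happen later; it has it because the same input met weaker opposition.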

It certainly could be that some people are thinking about this from a reasonable perspective based on multiple levels of selection and time scales and are just being sloppy in their descriptions of the net processes. Or maybe they really do think that “junk DNA” is kept because it might become useful. Either way, we need to steer clear of simplified soundbites that obfuscate more than enlighten.


"Because" versus "so that".

I want to make a quick point about how evolution works and how it does not. The reason is that two stories about non-coding DNA posted today include a major misconception about evolution. Unfortunately, this is a misconception attributed in the articles to biologists, so I can only imagine what the state of comprehension is among non-scientists.

The distinction is between “because” and “so that”. In evolution, things evolve “because,” meaning that there are causes and effects that can be identified. Why are some strains of bacteria resistant to antibiotics? Because a mutation occurred that happened to be beneficial under the conditions of antibiotic treatment, and it became common in the population over the course of several generations. By contrast, things do not evolve “so that”. Bacteria do not experience mutations so that they will become resistant to antibiotic agents.
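The “because” in the antibiotic example can be written out as simple arithmetic. The sketch below is a standard textbook-style recursion for two competing types, with made-up numbers: the function name, the starting frequency, and the fitness values are all assumptions for illustration. The variant is already present at a tiny frequency in both runs; only the conditions differ.

```python
def resistant_frequency(p0, w_resistant, w_sensitive, generations):
    """Frequency of a resistant variant over time under constant selection.

    p0: starting frequency of the variant.
    w_resistant, w_sensitive: relative survival/reproduction of each
    type under the given conditions (hypothetical values).
    """
    p = p0
    freqs = [p]
    for _ in range(generations):
        mean_fitness = p * w_resistant + (1 - p) * w_sensitive
        p = p * w_resistant / mean_fitness
        freqs.append(p)
    return freqs


# The mutation exists at one in a million either way.
with_antibiotic = resistant_frequency(1e-6, w_resistant=1.0, w_sensitive=0.5, generations=30)
without_antibiotic = resistant_frequency(1e-6, w_resistant=1.0, w_sensitive=1.0, generations=30)

print(f"With antibiotic, after 30 generations: {with_antibiotic[-1]:.4f}")
print(f"Without antibiotic, after 30 generations: {without_antibiotic[-1]:.6f}")
```

With the antibiotic present the variant goes from one in a million to nearly the whole population; without it, the frequency does not move. The mutation did not occur in anticipation of the drug. It became common because, once the drug was there, its carriers left more descendants.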

Why is there so much non-coding DNA? Because transposable elements spread, or because there are accidental duplications that are not eliminated by selection, or because of the interaction of some other mutational processes and their consequences (or lack thereof). So much non-coding DNA did not evolve so that it might someday be useful, or so that it could be co-opted when needed, or so that evolution would have more potential in the form of genetic raw materials.

So why, then, do we see quotes like these?

Wired, “One Scientist’s Junk Is a Creationist’s Treasure”:

“I’ve stopped using the term [‘junk’],” Collins said. “Think about it the way you think about stuff you keep in your basement. Stuff you might need some time. Go down, rummage around, pull it out if you might need it.”

Reuters, “Human instruction book not so simple: studies”:

“It is not the sort of clutter that you get rid of without consequences because you might need it. Evolution may need it,” [Collins] said.

That little extra padding might be just what an animal needs to adapt to some unforeseen circumstance, the researchers said. “They may become useful in the future,” Birney said.

The latter quote, by Ewan Birney, illustrates the problem that can arise when a detailed, nuanced discussion is summarized into a short soundbite. I know this from experience, and I suspect that this is what has happened here, given how his very reasonable interpretation is paraphrased in New Scientist, “‘Junk’ DNA makes compulsive reading”:

Birney says that the additional switches may be mutations that appear by accident and then generate new slugs of RNA, but because they are produced randomly, most are evolutionarily neutral ‘passengers’ in the genome. There might be rare occasions, however, when a new RNA does confer an advantage.

Collins, on the other hand, seems to have said his bit to two different reporters, so I strain to give him the benefit of the doubt on this one. When I began this blog, I did not think I would be pointing out obvious misconceptions about evolution, genomes, and DNA as propagated by the likes of Collins or Nature. But here we are.


Effect versus function.

There has been quite a bit of discussion in the media recently about discoveries of [indirect evidence for] functions in [small portions of] non-coding DNA. Unfortunately, the parts in square brackets are often omitted. It is also the case that many reports overlook the important distinction between effect and function, leaving readers with the impression that non-coding DNA can only be either totally insignificant or vitally important.

Here is the relevant part of the Merriam-Webster Dictionary entry on function:

“The action for which a person or thing is specially fitted, used, or responsible or for which a thing exists.”

And on effect:

“Something that is produced by an agent or cause; something that follows immediately from an antecedent; a resultant condition.”

In other words, a function fulfills a specific role to produce a positive result, with a close fit between cause and outcome shaped by either design (in human technology) or natural selection (in biological systems). Effects are also the outcome of identifiable causes, but they can be positive, neutral, or negative and may be generated directly or indirectly by the causal mechanism. Thus, it is not possible to have a function without any effects, but something can exert an effect — perhaps a very important one — without this constituting a function.

Consider an example. The immune system of the body has a clear function: to defend against pathogens. Viruses likewise have functions, but this only makes sense if one considers the issue from the perspective of the viruses themselves and not of their hosts. Specifically, parts of the virus function in allowing them to circumvent the host’s immunity and to usurp its replication machinery. Viruses do, however, have effects on hosts — usually negative, but apparently sometimes indirectly positive.

The genomes of eukaryotes consist of many types of DNA sequences. The exons that encode proteins make up a small percentage (less than 2% in humans), and the rest is non-coding DNA of various sorts: introns, pseudogenes, satellite DNA, and especially transposable elements (also called TEs, transposons, or mobile elements). The latter represent a diverse set of sequences that are capable of moving about and duplicating in the genome independently of the normal replication process. In this sense, they are often considered “parasites” of the “host” genome. Overall, TEs also make up the largest portion of non-coding DNA in the genomes analyzed so far (at least 45% in humans), although the particular types, abundances, and levels of activity of TEs vary among species.

Some TEs have evidently been co-opted (exapted) to perform functions at the host level, meaning that they have moved from being parasites to integrated participants in the functioning of the genome. This includes regulating genes, involvement in the genetic cutting-and-splicing mechanism of the vertebrate immune system, and perhaps cellular stress response. On the other hand, many diseases can result from mutations caused by the insertion of a TE into an existing gene. From the perspective of the host, TEs can have different effects depending on the context: some TEs are functional but some are detrimental. The large majority, however, have not been shown to fall into either category.

Nevertheless, a lack of evidence for either function or harm does not mean that TEs are without effects. It is well known that the total amount of DNA (genome size) is linked to cell size, cell division rate, metabolic rate, and developmental rate. In other words, a large genome is typically found in large, slowly dividing cells within an organism displaying a low metabolic rate and sluggish development. Conversely, organisms with high metabolic rate or rapid development tend to have small genomes. To the extent that total DNA content directly affects cell size and division, these can be considered effects — by their presence in the aggregate — of non-coding DNA elements.

Is slowing down metabolism or delaying development a function? Some authors think so, but most would argue that these are effects that are tolerated by the organism because they are not overly detrimental. That is, parasites spread within the genome and individually may have little or no effect (and no function), but in sum may have substantial effects on the cell and organism. The amount of accumulation would depend on the tolerance of the organism based on its biology. For example, it is unlikely that a mammal with a high metabolic rate could have a genome the size of a salamander’s.

The point of this discussion is to note that seeking functions for non-coding DNA is an interesting area of research, but that even if most sequences are not functional, they can still be important from a biological perspective. Similarly, one would not invoke function for hosts to explain the existence of viruses, nor would one dismiss viruses as unimportant if functions were never found at the host level. One would, however, focus considerable attention on explaining how viruses spread, why some are more virulent than others, and how they exert their effects.