I imagine that every practicing scientist has experienced, in one form or another, the tendency of many non-scientists to expect all research to be directly beneficial to human health and well-being. I used to respond facetiously to these kinds of expectations when expressed by friends or family members, with something along the lines of “My work has absolutely no practical applications to human welfare whatsoever”.
Of course, this is not true. Genome size is becoming very relevant to fields of inquiry that are likely to have major significance for medicine. Notably, genome size data provide an important indication of the cost and difficulty of sequencing a given genome, and thus represent a prime criterion in the choice of sequencing targets. As an example, I performed a genome size estimate for Biomphalaria glabrata, a planorbid snail that serves as an intermediate host for the trematode flatworm Schistosoma mansoni, which causes the debilitating disease known as schistosomiasis. The genome of B. glabrata is one of the smallest so far reported for a gastropod, and is now being sequenced (along with that of S. mansoni).
More recently, Jenner and Wills (2007) made explicit mention of genome size as an important factor in deciding on the next set of models for evo-devo studies. Discoveries regarding the fundamental genetic underpinnings of development have obvious implications for medical science and here, too, genome size is becoming increasingly seen as important. As they put it,
Whole-genome sequences are an increasingly important resource for many biological disciplines, including evo–devo [15, 49, 50]. However, financial and technical constraints mean that there is currently a preference for species with small genomes. This compounds the bias that is already introduced by the big six. First, putatively general conclusions about genome evolution might actually be specific to those smaller genomes that have been fully sequenced. For example, when focusing only on sequenced genomes, a close correspondence between genome size and gene number in eukaryotes is observed. The C-value paradox becomes apparent only when genome-size data from non-sequenced genomes is included [51]. Second, there are important genetic, morphological, physiological and ecological correlates of genome size in a range of animals and plants [51, 52]. Some correlates seem ubiquitous in animals and plants, such as those between genome size and cell size, body size and the inverse of developmental rate [52]. Others are group specific: genome size correlates mostly with metabolic rate in homeotherms, but with developmental type and ecology in amphibians [53], and is positively correlated with egg size in copepods, plethodontid salamanders and fishes [51, 52, 54]. Studying these correlated traits in phylogenetically disparate taxa could illuminate the relationships between small genome size and rapid development, as well as the evolution of strongly cell-lineage-dependent development in taxa such as tunicates and nematodes, and the partial fragmentation of their Hox clusters [55, 56].
References 51, 52, and 53 in that paragraph are papers of mine, so again I am forced to admit that my work may have some practical application after all.
My main focus is on genome size diversity in eukaryotes, which mostly means differences among species in the abundance of noncoding DNA. In bacteria, most of the genome is composed of protein-coding genes, so unlike in eukaryotes there is a very strong correlation between genome size and gene number. Genome size is generally small in parasites and endosymbionts and larger in free-living species (probably because population bottlenecks and relaxed selection on gene function result in gene loss by deletion bias in bacteria associated with hosts [Mira et al. 2001]).
But this observation is not the link between genome size and human health that I had in mind for this post. In this month’s issue of Antimicrobial Agents and Chemotherapy, Steven Projan argues that genome size is associated with the evolution of antibiotic resistance in bacteria. In Dr. Projan’s own words,
It is observed here that the ability of a given bacterium to evolve toward a multidrug resistance phenotype is a function of genome size. In Table 1, a number of examples are provided, but even an expanded analysis shows that this observation holds true. That is, the larger the genome the greater the propensity of a bacterium to display multidrug resistance phenotypes and the smaller the genome the less likely it is that antibacterial resistance will emerge and disseminate within that species. What is proposed here is that, just as there is a continuum of genome sizes among bacteria, there is a continuum in the ability or propensity of a bacterium to become “multidrug resistant” and that continuum is reflected in the size of the genome. This is not to say that we do not observe resistance to certain agents even in organisms with the smallest genomes (macrolide resistance appears in virtually every pathogen at some level). There is probably a solid biological reason for this observation; organisms with larger genomes are more adaptable to environmental changes because they have more (genetic) information to draw upon. It appears that organisms with smaller genomes have become more “specialized,” residing in particular environmental niches (Treponema pallidum and the Chlamydiae are cases in point), and their lack of versatility in adapting to different environments is also manifest in an inability to develop mechanisms for coping with antibiotics. Indeed, we have learned that virtually each and every time a bacterium either acquires a novel resistance determinant or a mutant strain arises with decreased susceptibility to an antibacterial drug, the bacterium experiences a “fitness burden.” With time, compensatory mutations are selected in which the bacterium accumulates mutations that allow for something like wild-type growth in a strain that is now phenotypically resistant (e.g., topA mutations in gyrB mutant strains). 
Bacteria with larger genomes simply have a greater opportunity to develop these compensatory mutations. It must be emphasized that it does not matter whether we are discussing the acquisition of a novel resistance gene as opposed to a mutation that alters the target or results in up-regulation of an efflux pump. The accumulating evidence tells us that all require some form of adaptation. Another consequence of this phenomenon is that antibiotic cycling in health care settings is unlikely to result in a reversion of the local microflora to susceptibility as the compensatory mutations “lock in” the resistance phenotype.
He continues by noting, “I and several of those I have discussed this observation with were perplexed that it had not previously been articulated. Although to be fair, others have suggested it is a trivial, if not nonsensical, observation and worthy only of cocktail party conversation… in fact, I believe that this is an important guide as to where and which organisms we actually need novel antibacterial agents for.” Projan blames an overemphasis on individual organisms with small genomes for the overlooking of this potentially important pattern. In other words, it is the sort of thing that can only be applied to human health research if one takes a broad view of genomic diversity.
As much fun as it is to study genome size for purely academic reasons, it seems it actually may be good for us too.
From my limited IT support experience – the newer sequencing technologies like 454 produce raw images of the plate. As such, it does not matter how large the genome is – the cost of a sequencing run is the same whether the genome is 5Mbp or 500Mbp. The analysis of that genome can still be priced according to genome size, however.
Certainly, the cost of sequencing can be expected to continue to plummet. However, so far what you’re describing applies mostly to bacteria and archaea which have tiny genomes. Animal genome sizes range from 30Mb to 130,000Mb. Vertebrate genome sizes alone vary from about 400Mb to 130,000Mb. Flowering plants range from 60Mb to 124,000Mb. For the time being, genome size remains an important (and more or less required) datum in sequencing proposals. It is also the case that a large and (presumably) highly repetitive genome sequence will be more challenging to assemble than a simple, mostly single-copy genome. Not surprisingly, mammals (~3,000Mb) are at the top end of the range of genome sizes sequenced so far, which provides a very biased view of genome size if this is the only source of data considered.
“As much fun as it is to study genome size for purely academic reasons, it seems it actually may be good for us too.”
D’oh! I wanted to get into a field that was as completely divorced from human health as possible. Maybe if I focus on invertebrates and ignore the human side as much as possible it will just go away.