When historians look back at this turning of the millennium, they will note
that the major scientific breakthrough of the era was the characterization in
ultimate detail of the genetic instructions that shape a human being. The Human
Genome Project (which aims to map every gene and spell out, letter by
letter, the literal thread of life, DNA) will affect just about every
branch of biology. The complete DNA sequencing of more and more organisms,
including humans, will answer many important questions, such as how organisms
evolved, whether synthetic life will ever be possible, and how to treat a wide
range of medical disorders.
The Human Genome Project is generating an amount of data unprecedented in
biology. A simple list of the units of DNA, called bases, that make up the
human genome would fill 200 telephone books, even without annotations
describing what those DNA sequences do. A working draft of 90 percent of the
total human DNA sequence was in hand by the spring of 2000, and the full
sequence is expected in 2003. But that will be merely a skeleton that will
require many layers of annotation to give it meaning. The payoff from the
reference work will come from understanding the proteins encoded by the genes.
Proteins not only make up the structural bulk of the human body but also
include the enzymes that carry out the biochemical reactions of life. They are
composed of units called amino acids linked together in a long string; each
string folds in a way that determines the function of a protein. The order of
the amino acids is set by the DNA base sequence of the gene that encodes a
given protein, through intermediaries called RNA; genes that actively make RNA
are said to be “expressed.”
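The path from DNA bases to an amino acid string described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of what a cell actually does: it assumes the input is the coding strand of a gene with no introns, that reading starts at the first base, and that the standard genetic code applies. The function names `transcribe` and `translate` and the sample sequence are hypothetical, chosen only for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon. Amino acids use one-letter abbreviations.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon ends the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCCTGGTAA"  # hypothetical 12-base sequence, for illustration only
    print(translate(transcribe(gene)))
```

The lookup itself is trivial, which is the point the article goes on to make: writing down the amino acid string is easy, whereas predicting how that string folds is not.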
The Human Genome Project seeks not just to elucidate all the proteins
produced within a human but also to comprehend how the genes that encode the
proteins are expressed, how the DNA sequences of those genes stack up against
comparable genes of other species, how genes vary within our species and how
DNA sequences translate into observable characteristics. Layers of information
built on top of the DNA sequence will reveal the knowledge embedded in the DNA.
These data will fuel advances in biology for at least the next century. In a
virtuous cycle, the more we learn, the more we will be able to extrapolate,
hypothesize and understand.
By 2050 we believe that genomics will be able to answer the following major
questions:
• Will the three-dimensional structures of proteins be
predictable from their amino acid sequences?
The six billion bases of the human genome are thought to encode
approximately 100,000 proteins. Although the sequence of amino acids in a
protein can be translated in a simple step from the DNA sequence of a gene, we
cannot currently elucidate the shape of a protein on purely theoretical
grounds, and determining structures experimentally can be quite laborious.
Still, a protein’s structure is conserved (maintained fairly
constantly throughout evolution) much more than its amino acid sequence
is. Many different amino acid sequences can lead to proteins of similar shapes,
so we can infer the structures of various proteins by studying a representative
subset of proteins in detail.
Recently an international group of structural biologists has begun a
Protein Structure Initiative to coordinate their work. Structural biologists
“solve” the shapes of proteins either by making very pure crystals
of a given protein and then bombarding the crystals with x-rays, or by
subjecting a particular protein to nuclear magnetic resonance (NMR) analysis.
Both techniques are time-consuming and expensive. The consortium intends to get
the most information out of each new structure by using existing knowledge
about related structures to group proteins into families that are most likely
to share the same architectural features. Then the members of the consortium
plan to target representatives of each family for examination by painstaking
physical techniques.
As the catalogue of solved structures swells and scientists develop more
refined schemes for grouping structures into a compendium of basic shapes,
biochemists will increasingly be able to use computers to model the structures
of newly discovered, or even wholly invented, proteins. Structural
biologists project that a total of about 1,000 basic protein-folding motifs
exist; current models suggest that solving just 3,000 to 5,000 selected
structures, beyond the ones already known, could allow researchers to deduce
the structures of new proteins routinely. With structural biologists solving
more than 1,000 protein structures every year and with their progress
accelerating, they should be able to complete the inventory not long after the
human genome itself is sequenced.
• Will synthetic life-forms be produced?
Whereas structural biologists work to group proteins into categories for the
practical aim of solving structures efficiently, the fact that proteins are so
amenable to classification reverberates with biological meaning. It reflects
how life on the earth evolved and opens the door to questions central to
understanding the phenomenon of life itself. Is there a set of proteins common
to all organisms? What are the biochemical processes required for life?
Already, with several fully sequenced genomes available, mostly from
bacteria, scientists have started to take inventories of genes conserved
among these organisms, guided by the grand question of what constitutes life,
at least at the level of a single cell.
If, within a few years, investigators can expect to amass a tidy directory
of the gene products (RNA as well as proteins) required for life,
they may well be able to make a new organism from scratch by stringing DNA
bases together into an invented genome coding for invented products. If this
invented genome crafts a cell around itself and the cell reproduces reliably,
the exercise would prove that we had deciphered the basic mechanisms of life.
Such an experiment would also raise safety, ethical and theological issues that
cannot be neglected.
The human genome is contained in 23 pairs of chromosomes, which lie in the
nucleus of every cell in the body. Each chromosome consists of a DNA double
helix that is wrapped around spoollike proteins called histones. The
DNA-histone complexes are then coiled and double-coiled to yield chromosomes.
The ultimate aim of the Human Genome Project is to understand the
proteins that are encoded by the DNA. When a gene is “on,” the cell
uses a process called transcription to copy the gene’s DNA into a
single-stranded molecule called messenger RNA (mRNA), which leaves the nucleus
to associate with a series of large protein structures called ribosomes. The
ribosomes then translate the mRNA into the chain of amino acids that makes up
the encoded protein. The new protein (here a receptor destined for the
cell membrane) goes through several folding steps in a sequence that
researchers are just beginning to understand.
• Will we be able to build a computer model of a cell that
contains all the components, identifies all the biochemical interactions and
makes accurate predictions about the consequences of any stimulus given to that
cell?
In the past 50 years, a single gene or a single protein often dominated a
biologist’s research. In the next 50 years, researchers will shift to
studying integrated functions among many genes, the web of interactions among
gene pathways and how outside influences affect the system.
Of course, biologists have long endeavored to describe how components of a
cell interact: how molecules called transcription factors bind to specific
scraps of DNA to control gene expression, for example, or how insulin binds to
its receptor on the surface of a muscle cell and triggers a cascade of
reactions in the cell that ultimately boosts the number of glucose transporters
in the cell membrane. But the genome project will spark similar analyses for
thousands of genes and cell components at a time. Within the next half-century,
with all genes identified and all possible cellular interactions and reactions
charted, pharmacologists developing a drug or toxicologists trying to predict
whether a substance is poisonous may well turn to computer models of cells to
answer their questions.
• Will the details of how genes determine mammalian development
become clear?
Being able to model a single cell will be impressive, but to understand
fully the life-forms we are most familiar with, we will plainly have to
consider additional levels of complexity. We will have to examine how genes and
their products behave in place and time, that is, in different parts of
the body and in a body that changes over a life span. Developmental biologists
have started to monitor how pools of gene products vary as tissues develop, in
an attempt to find products that define stages of development. Now scientists
are devising so-called expression arrays that survey thousands of gene products
at a time, charting which ones turn on or off and which ones fluctuate in
intensity of expression. Techniques such as these highlight many good
candidates for genes that direct development and establish the animal body
plan.
As in the past, model organisms (like the fruit fly Drosophila,
the nematode Caenorhabditis elegans and the mouse) will remain the
central workhorses in developmental biology. With the genome sequences of C.
elegans and Drosophila complete, the full human sequence on
the way by 2003 and the mouse’s likely within four to five years,
sequence comparisons will become more commonplace and thorough and will give
biologists many clues about where to look for the driving forces that fashion a
whole animal. Many more complete genomes representing diverse branches of the
evolutionary tree will be derived as the cost of sequencing decreases.
So far developmental biologists have striven to find signals that are
universally important in establishing an animal’s body plan, the
arrangement of its limbs and organs. In time, they will also describe the
variations (in gene sequence and perhaps in gene regulation) that
generate the striking diversity of forms among different species. By comparing
species, we will learn how genetic circuits have been modified to carry out
distinct programs so that almost equivalent networks of genes fashion, for
example, small furry legs in mice and arms with opposable digits in humans.
• Will understanding the human genome transform preventive,
diagnostic and therapeutic medicine?
Molecular biology has long held out the promise of transforming medicine
from a matter of serendipity to a rational pursuit grounded in a fundamental
understanding of the mechanisms of life. Its findings have begun to infiltrate
the practice of medicine; genomics will hasten the advance. Within 50 years, we
expect comprehensive genomics-based health care to be the norm in the U.S.
We will understand the molecular foundation of diseases, be able to prevent
them in many cases, and design accurate, individual therapies for illnesses.
The tree of life illustrates the current view of the relationships among all
living things, including humans. Once the DNA sequence of the human genome is
known, scientists will be able to compare the information to that produced by
efforts to sequence the genomes of other species, yielding a fuller
understanding of how life on the earth evolved.
In the next decade, genetic tests will routinely predict individual
susceptibility to disease. One intention of the Human Genome Project is to
identify common genetic variations. Once a list of variants is compiled,
epidemiological studies will tease out how particular variations correlate with
risk for disease. When the genome is completely open to us, such studies will
reveal the roles of genes that contribute weakly to diseases on their own but
that also interact with other genes and environmental influences such as diet,
infection and prenatal exposure to affect health. By 2010 to 2020, gene therapy
should also become a common treatment, at least for a small set of conditions.
Within 20 years, novel drugs will be available that derive from a detailed
molecular understanding of common illnesses such as diabetes and high blood
pressure. The drugs will target molecules logically and therefore be potent
without significant side effects. Drugs such as those for cancer will routinely
be matched to a patient’s likely response, as predicted by molecular
fingerprinting. Diagnoses of many conditions will be much more thorough and
specific than they are now. For example, a patient who learns that he has high
cholesterol will also know which genes are responsible, what effect the high
cholesterol is likely to have, and what diet and pharmacological measures will
work best for him.
By 2050 many potential diseases will be cured at the molecular level before
they arise, although large inequities worldwide in access to these advances
will continue to stir tensions. When people become sick, gene therapies and
drug therapies will home in on individual genes, as they exist in individual
people, making for precise, customized treatment. The average life span will
reach 90 to 95 years, and a detailed understanding of human aging genes will
spur efforts to expand the maximum length of human life.
• Will we reconstruct accurately the history of human
populations?
Despite what may seem like great diversity in our species, studies from the
past decade show that the human species is more homogeneous than many others;
as a group, we display less variation than chimps do. Among humans, the same
genetic variations tend to be found across all population groups, and only a
small fraction of the total variation (between 10 and 15 percent) can be
related to differences between groups. This has led some population biologists
to the conclusion that not so long ago the human species was composed of a
small group, perhaps 10,000 individuals, and that human populations dispersed
over the earth only recently. Most genetic variation predated that time.
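As a rough illustration of what it means for a fraction of variation to be "related to differences between groups," the sketch below computes a fixation index, F_ST = (H_T - H_S) / H_T, for a single two-allele site. The article does not name the statistic behind its 10 to 15 percent figure, so the choice of F_ST is an assumption, as are the equal group sizes and the allele frequencies, which are invented for the example rather than taken from any data set.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a two-allele site with frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(group_freqs: list[float]) -> float:
    """Share of total heterozygosity attributable to between-group differences.

    Assumes equally sized groups and a single two-allele site.
    """
    if not group_freqs:
        raise ValueError("need at least one group frequency")
    n = len(group_freqs)
    # H_S: average heterozygosity within groups.
    h_s = sum(expected_heterozygosity(p) for p in group_freqs) / n
    # H_T: heterozygosity of the pooled population.
    h_t = expected_heterozygosity(sum(group_freqs) / n)
    if h_t == 0.0:  # no variation at all, so nothing to partition
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies in two groups, for illustration only.
    share = fst([0.33, 0.67])
    print(f"between-group share of variation: {share:.1%}")
```

Even when the two groups differ substantially in allele frequency at a site, most of the variation in this calculation remains within each group, which is the pattern the paragraph above describes.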
Armed with techniques for analyzing DNA, population geneticists have for the
past 20 years been able to address anthropological questions with unprecedented
clarity. Demographic events such as migrations, population bottlenecks and
expansions alter gene frequencies, leaving a detailed and comprehensive record
of events in human history. Genetic data have bolstered the view that modern
humans originated relatively recently, perhaps 100,000 to 200,000 years ago, in
Africa, and dispersed gradually into the rest of the world. Anthropologists have
used DNA data to test cultural traditions about the origins of groups of humans,
such as Gypsies and Jews, to track the migration into the South Pacific islands
and the Americas, and to glean insights into the spread of populations in Europe,
among other examples. As DNA sequence data become increasingly easy to
accumulate, relationships among groups of people will become clearer, revealing
histories of intermingling as well as periods of separation and migration. Race
and ethnicity will prove to be largely social and cultural ideas; sharp,
scientifically based boundaries between groups will be found to be nonexistent.
By 2050, then, we will know much more than we do now about human
populations, but a question remains: How much can be known? Human beings have
mated with enough abandon that probably no one family tree will be the unique
solution accounting for all human history. In fact, the history of human
populations will emerge not as a tree but as a trellis where lineages often
meet and mingle after intervals of separation. Still, in 50 years, we will know
how much ambiguity remains in our reconstructed history.
• Will we be able to reconstruct the major steps in the
evolution of life on the earth?
Molecular sequences have been indispensable tools for drawing taxonomies
since the 1960s. To a large extent, DNA sequence data have already exposed the
record of 3.5 billion years of evolution, sorting living things into three
domains (Archaea, single-celled organisms of ancient origin; Bacteria; and
Eukarya, organisms whose cells have a nucleus) and revealing the
branching patterns of hundreds of kingdoms and divisions. One aspect of
inheritance has complicated the hope of assigning all living things to branches
in a single tree of life. In many cases, different genes suggest different
family histories for the same organisms; this reflects the fact that DNA
isn’t always inherited in the straightforward way, parent to offspring,
with a more or less predictable rate of mutation marking the passage of time.
Genes sometimes hop across large evolutionary gaps. Examples of this are
mitochondria and chloroplasts, the energy-producing organelles of animals and
plants, both of which contain their own genetic material and descended from
bacteria that were evidently swallowed whole by eukaryotic cells.
This kind of “lateral gene transfer” appears to have been common
enough in the history of life that comparing genes among species will not
yield a single, universal family tree. As with human lineages, a more apt
analogy for the history of life will be a net or a trellis, where separated
lines diverge and join again, rather than a tree, where branches never merge.
In 50 years, we will fill in many details about the history of life,
although we might not fully understand how the first self-replicating organism
came about. We will learn when and how, for instance, various lineages
invented, adopted or adapted genes to acquire new sets of biochemical reactions
or different body plans. The gene-based perspective of life will have taken
hold so deeply among scientists that the basic unit they consider will very
likely no longer be an organism or a species but a gene. They will chart which
genes have traveled together for how long in which genomes. Scientists will
also address the question that has dogged people since Charles Darwin’s
day: What makes us human? What distinguishes us as a species?
Undoubtedly, many other questions will arise over the next 50 years as well.
As in any fertile scientific field, the data will fuel new hypotheses.
Paradoxically, as it grows in importance, genomics itself may not even be a
common concept in 50 years, as it radiates into many other fields and
ultimately becomes absorbed as part of the infrastructure of all biomedicine.
• How will individuals, families and society respond to this
explosion in knowledge about our genetic heritage?
This social question, unlike the preceding scientific, technological and
medical ones, does not come down to a yes-or-no answer. Genetic information and
technology will afford great opportunities to improve health and to alleviate
suffering. But any powerful technology comes with risks, and the more powerful
the technology, the greater the risks. In the case of genetics, people of ill
will today use genetic arguments to try to justify bigoted views about
different racial and ethnic groups. As technology to analyze DNA has become
increasingly widespread, insurers and employers have used the information to
deny workers access to health care and jobs. How we will come to terms with the
explosion of genetic information remains an open question.
Finally, will antitechnology movements be quieted by all the revelations of
genetic science? Although we have enumerated so many questions to which we
argue the answer will be yes, this is one where the answer will probably be no.
The tension between scientific advances and the desire to return to a simple
and more “natural” lifestyle will probably intensify as genomics
seeps into more and more of our daily lives. The challenge will be to maintain
a healthy balance and to shoulder collectively the responsibility for ensuring
that the advances arising from genomics are not put to ill use.