The Recipe for Life
For all the diversity of the world's five and a half billion people,
full of creativity and contradictions, the machinery of every human mind
and body is built and run with fewer than 100,000 kinds of protein molecules.
And for each of these proteins, we can imagine a single corresponding gene
(though there is sometimes some redundancy) whose job it is to ensure an
adequate and timely supply. In a material sense, then, all of the subtlety
of our species, all of our art and science, is ultimately accounted for
by a surprisingly small set of discrete genetic instructions. More surprising
still, the differences between two unrelated individuals, between the man
next door and Mozart, may reflect a mere handful of differences in their
genomic recipes -- perhaps one altered word in five hundred. We are far
more alike than we are different. At the same time, there is room for near-infinite
variety.
It is no overstatement to say that to decode our 100,000 genes in
some fundamental way would be an epochal step toward unraveling the manifold
mysteries of life.
"THE BOOK OF LIFE"
What is the Human Genome Project?
Begun in 1990, the U.S. Human Genome Project is a 13-year international
effort coordinated by the U.S. Department of Energy and the National Institutes
of Health. Originally conceived as a 15-year project, it has been accelerated
by rapid technological advances to an expected completion date of
2003. Project goals are to:
identify all of the estimated 80,000 genes in human
DNA;
determine the complete sequences of the 3 billion chemical bases
that make up human DNA;
store this information in databases;
develop tools for data analysis; and
address the ethical, legal, and social issues (ELSI) that may arise
from the project.
According to Ari Patrinos, DOE Associate Director for Biological
and Environmental Research, "Although we have as our primary goal the finished
'Book of Life' by the end of 2003, we also want the working draft to be as
useful as possible."
The Basics
The complete set of instructions for making an organism is called
its genome. It contains the master blueprint for all cellular
structures and activities for the lifetime of the cell or organism.
Found in every nucleus of a person's many trillions of cells, the
human genome consists of tightly coiled threads of deoxyribonucleic
acid (DNA) and associated protein molecules, organized
into structures called chromosomes.
What's a genome? And why is it important?
A genome is all the DNA in an organism, including its genes. Genes
carry information for making all the proteins required by all organisms.
These proteins determine, among other things, how the organism looks, how
well its body metabolizes food or fights infection, and sometimes even
how it behaves.
DNA is made up of four similar chemicals (called bases and abbreviated
A, T, C, and G) that are repeated millions or billions of times throughout
a genome. The human genome, for example, has 3 billion pairs of bases.
The particular order of As, Ts, Cs, and Gs is extremely important. The
order underlies all of life's diversity, even dictating whether an organism
is human or another species such as yeast, rice, or fruit fly, all of which
have their own genomes and are themselves the focus of genome projects.
Because all organisms are related through similarities in DNA sequences,
insights gained from nonhuman genomes often lead to new knowledge about
human biology.
Some definitions
The human genome is the full complement of genetic material
in a human cell. (Despite five and a half billion variations on a theme,
the differences from one genome to the next are minute; hence, we hear
about the human genome -- as if there were only one.) The genome,
in turn, is distributed among 23 pairs of chromosomes, which, in
each of us, have been replicated and re-replicated since the fusion of
sperm and egg that marked our conception. The source of our personal uniqueness,
our full genome, is therefore preserved in each of our body's several trillion
cells. At a more basic level, the genome is DNA, deoxyribonucleic acid,
a natural polymer built up of repeating nucleotides, each consisting
of a simple sugar, a phosphate group, and one of four nitrogenous bases.
The hierarchy of structure from chromosome to nucleotide is shown in Some
DNA details. In the chromosomes, two DNA strands are twisted together into
an entwined spiral -- the famous double helix -- held together by weak
bonds between complementary bases, adenine (A) in one strand to thymine
(T) in the other, and cytosine to guanine (C-G). In the language of molecular
genetics, each of these linkages constitutes a base pair. All told,
if we count only one of each pair of chromosomes, the human genome comprises
about three billion base pairs.
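The pairing rule just described (A with T, C with G) can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical, chosen only for illustration, and strand direction is ignored for simplicity.

```python
# Base-pairing rules as stated in the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand (direction ignored)."""
    return "".join(PAIRS[base] for base in strand.upper())


def is_base_paired(strand_a: str, strand_b: str) -> bool:
    """True when every position of the two strands forms an A-T or C-G pair."""
    return (len(strand_a) == len(strand_b)
            and complement(strand_a) == strand_b.upper())


print(complement("GATTACA"))                 # CTAATGT
print(is_base_paired("GATTACA", "CTAATGT"))  # True
```

Because each base has exactly one partner, knowing one strand is enough to reconstruct the other, which is the property the following paragraphs rely on.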
The specificity of these base-pair linkages underlies all that is
wonderful about DNA. First, replication becomes straightforward. Unzipping
the double helix provides unambiguous templates for the synthesis of daughter
molecules: One helix begets two with near-perfect fidelity. Second, by
a similar template-based process, depicted in From genes to proteins, a
means is also available for producing a DNA-like messenger to the cell
cytoplasm. There, this messenger RNA, the faithful complement of
a particular DNA segment, directs the synthesis of a particular protein.
Many subtleties are entailed in the synthesis of proteins, but in a schematic
sense, the process is elegantly simple.
Every protein is made up of one or more polypeptide
chains, each a series of (typically) several hundred molecules known as
amino acids, linked by so-called peptide bonds. Remarkably,
only 20 different kinds of amino acids suffice as the building blocks for
all human proteins. The synthesis of a protein chain, then, is simply a
matter of specifying a particular sequence of amino acids. This is the
role of the messenger RNA. (The same nitrogenous bases are at work in RNA
as in DNA, except that uracil takes the place of the DNA base thymine.)
Each linear sequence of three bases (both in RNA and in DNA) corresponds
uniquely to a single amino acid. The RNA sequence AAU thus dictates that
the amino acid asparagine should be added to a polypeptide chain, GCA specifies
alanine -- and so on. A segment of the chromosomal DNA that directs the
synthesis of a single type of protein constitutes a single gene.
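As a rough sketch of the two steps described above, the following Python code copies a DNA template into messenger RNA and then reads the RNA three bases at a time. It is an illustration under stated assumptions, not a model of the cell's machinery: the function names and the six-base template are hypothetical, strand direction is ignored, amino acids are written in their standard one-letter codes (N for asparagine, A for alanine), and the table is the standard genetic code, in which a few triplets (marked "*") signal the end of a chain instead of naming an amino acid.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter amino acid codes, with codons ordered
# UUU, UUC, UUA, UUG, UCU, ... GGG. "*" marks a stop signal.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(template_dna: str) -> str:
    """Build the RNA complement of a DNA template; uracil (U) pairs with A."""
    pairing = {"A": "U", "T": "A", "C": "G", "G": "C"}
    return "".join(pairing[base] for base in template_dna.upper())


def translate(mrna: str) -> str:
    """Read the RNA three bases at a time until a stop signal or the end."""
    chain = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        chain.append(amino_acid)
    return "".join(chain)


mrna = transcribe("TTACGT")
print(mrna)             # AAUGCA
print(translate(mrna))  # NA  (asparagine, then alanine)
```

The table also shows that the mapping runs one way only: each triplet names a single amino acid, but several different triplets can name the same one.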
DNA
If unwound and tied together, the strands of DNA would stretch more than 5 feet but
would be only about 80 billionths of an inch (roughly 2 nanometers) wide. For each organism,
the components of these slender threads encode all the information necessary
for building and maintaining life, from simple bacteria to remarkably complex
human beings. Understanding how DNA performs this function requires some
knowledge of its structure and organization.
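The length quoted above can be checked with a short back-of-the-envelope calculation in Python. It uses the 3-billion-base-pair figure from this text; the spacing of 0.34 nanometers between neighboring base pairs is an assumption brought in from outside the text (the commonly cited value for the usual form of the double helix), and the function name is hypothetical.

```python
BASE_PAIRS_ONE_COPY = 3_000_000_000  # genome size given in the text
RISE_PER_BASE_PAIR_NM = 0.34         # assumption: spacing between stacked base pairs
NM_PER_FOOT = 0.3048e9               # 1 foot = 0.3048 m = 0.3048e9 nm


def dna_length_feet(base_pairs: int) -> float:
    """End-to-end length of fully stretched DNA, in feet."""
    return base_pairs * RISE_PER_BASE_PAIR_NM / NM_PER_FOOT


one_copy = dna_length_feet(BASE_PAIRS_ONE_COPY)
two_copies = dna_length_feet(2 * BASE_PAIRS_ONE_COPY)
print(f"one copy of the genome:     {one_copy:.1f} ft")    # about 3.3 ft
print(f"two copies (one from each parent): {two_copies:.1f} ft")  # about 6.7 ft
```

Under these assumptions a cell carrying two copies of the genome holds a little under 7 feet of DNA, consistent with the "more than 5 feet" stated above.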
In humans, as in other higher organisms, a DNA molecule consists
of two strands that wrap around each other to resemble a twisted ladder
whose sides, made of sugar and phosphate molecules, are connected by rungs
of nitrogen-containing chemicals called bases. Each strand is a linear
arrangement of repeating similar units called nucleotides, which are each
composed of one sugar, one phosphate, and a nitrogenous base. Four different
bases are present in DNA: adenine (A), thymine (T), cytosine (C), and guanine
(G). The particular order of the bases arranged along the sugar-phosphate
backbone is called the DNA sequence; the sequence specifies
the exact genetic instructions required to create a particular organism
with its own unique traits.
The two DNA strands are held together by weak bonds between the bases
on each strand, forming base pairs (bp). Genome size is usually stated
as the total number of base pairs; the human genome contains roughly 3
billion bp.
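To give a feel for what 3 billion bp means as raw information, here is a small Python calculation. It assumes nothing beyond the four-letter alphabet and the genome size given above; treating each position as an independent choice among four bases is a simplification made only for this estimate.

```python
import math

BASES = "ATCG"
GENOME_SIZE_BP = 3_000_000_000  # figure given in the text

# Four possible bases per position means log2(4) = 2 bits per base pair,
# since the second base of each pair is fixed by the first.
bits_per_base_pair = math.log2(len(BASES))
total_bits = GENOME_SIZE_BP * bits_per_base_pair
megabytes = total_bits / 8 / 1_000_000

print(bits_per_base_pair)  # 2.0
print(megabytes)           # 750.0
```

On this estimate, one copy of the sequence corresponds to roughly 750 megabytes of uncompressed data.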
Each time a cell divides into two daughter cells, its full genome
is duplicated; for humans and other complex organisms, this duplication
occurs in the nucleus. During cell division the DNA molecule unwinds
and the weak bonds between the base pairs break, allowing the strands to
separate. Each strand directs the synthesis of a complementary new strand,
with free nucleotides matching up with their complementary bases on each
of the separated strands. Strict base-pairing rules are adhered to; adenine
will pair only with thymine (an A-T pair) and cytosine with guanine (a
C-G pair). Each daughter cell receives one old and one new DNA strand.
The cell's adherence to these base-pairing rules ensures that each new strand
is an exact copy of the old strand it replaces. This minimizes the incidence of
errors (mutations) that may greatly affect the resulting organism or its
offspring.
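The duplication process described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of the enzymes involved: the names and the eight-base example are hypothetical, a double-stranded molecule is represented as a pair of strings, and strand direction and copying errors are ignored.

```python
# Strict pairing rules from the text: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize(template: str) -> str:
    """Build a new strand by matching free nucleotides to the template."""
    return "".join(PAIRS[base] for base in template)


def replicate(duplex: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate the two strands and pair each old strand with a new partner."""
    old_a, old_b = duplex
    daughter_1 = (old_a, synthesize(old_a))  # old strand A + new partner
    daughter_2 = (synthesize(old_b), old_b)  # new partner + old strand B
    return daughter_1, daughter_2


parent = ("ATGCCGTA", "TACGGCAT")
daughter_1, daughter_2 = replicate(parent)
print(daughter_1)  # ('ATGCCGTA', 'TACGGCAT')
print(daughter_2)  # ('ATGCCGTA', 'TACGGCAT')
```

Each daughter molecule keeps one strand of the parent and gains one newly built strand, and both end up identical to the parent because every base admits only one partner.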
What are some practical benefits to learning about DNA?
Knowledge about the effects of DNA variations
among individuals can lead to revolutionary new ways to diagnose, treat,
and someday prevent the thousands of disorders that affect us.
Besides providing clues to understanding human biology, learning
about nonhuman organisms' DNA sequences can lead to an understanding of
their natural capabilities that can be applied toward solving challenges
in health care, energy sources, and environmental cleanup.
