Introducing the Human Genome



"All of DNA is a twisted rope ladder let down from heaven to draw us up from the abyss of not-being." -- Jonathan Weiner

The Recipe for Life

For all the diversity of the world's five and a half billion people, full of creativity and contradictions, the machinery of every human mind and body is built and run with fewer than 100,000 kinds of protein molecules. And for each of these proteins, we can imagine a single corresponding gene (though there is sometimes some redundancy) whose job it is to ensure an adequate and timely supply. In a material sense, then, all of the subtlety of our species, all of our art and science, is ultimately accounted for by a surprisingly small set of discrete genetic instructions. More surprising still, the differences between two unrelated individuals, between the man next door and Mozart, may reflect a mere handful of differences in their genomic recipes -- perhaps one altered word in five hundred. We are far more alike than we are different. At the same time, there is room for near-infinite variety. 

It is no overstatement to say that to decode our 100,000 genes in some fundamental way would be an epochal step toward unraveling the manifold mysteries of life.


What is the Human Genome Project?

Begun in 1990, the U.S. Human Genome Project is a 13-year international effort coordinated by the U.S. Department of Energy and the National Institutes of Health. Originally conceived as a 15-year project, it has been accelerated by rapid technological advances and is now expected to be completed in 2003. Project goals are to:

identify all the estimated 80,000 genes in human DNA;
determine the complete sequences of the 3 billion chemical bases that make up human DNA;
store this information in databases;
develop tools for data analysis; and 
address the ethical, legal, and social issues (ELSI) that may arise from the project.

According to Ari Patrinos, DOE Associate Director for Biological and Environmental Research, "Although we have as our primary goal the finished 'Book of Life' by the end of 2003, we also want the working draft to be as useful as possible."

The Basics

The complete set of instructions for making an organism is called its genome. It contains the master blueprint for all cellular structures and activities for the lifetime of the cell or organism. Found in every nucleus of a person's many trillions of cells, the human genome consists of tightly coiled threads of deoxyribonucleic acid (DNA) and associated protein molecules, organized into structures called chromosomes.

What's a genome? And why is it important?

A genome is all the DNA in an organism, including its genes. Genes carry information for making all the proteins required by all organisms. These proteins determine, among other things, how the organism looks, how well its body metabolizes food or fights infection, and sometimes even how it behaves.

DNA is made up of four similar chemicals (called bases and abbreviated A, T, C, and G) that are repeated millions or billions of times throughout a genome. The human genome, for example, has 3 billion pairs of bases. The particular order of As, Ts, Cs, and Gs is extremely important. The order underlies all of life's diversity, even dictating whether an organism is human or another species such as yeast, rice, or fruit fly, all of which have their own genomes and are themselves the focus of genome projects. Because all organisms are related through similarities in DNA sequences, insights gained from nonhuman genomes often lead to new knowledge about human biology.

Some definitions

The human genome is the full complement of genetic material in a human cell. (Despite five and a half billion variations on a theme, the differences from one genome to the next are minute; hence, we hear about the human genome -- as if there were only one.) The genome, in turn, is distributed among 23 pairs of chromosomes, which, in each of us, have been replicated and re-replicated since the fusion of sperm and egg that marked our conception. The source of our personal uniqueness, our full genome, is therefore preserved in each of our body's several trillion cells. At a more basic level, the genome is DNA, deoxyribonucleic acid, a natural polymer built up of repeating nucleotides, each consisting of a simple sugar, a phosphate group, and one of four nitrogenous bases. The hierarchy of structure from chromosome to nucleotide is shown in Some DNA details. In the chromosomes, two DNA strands are twisted together into an entwined spiral -- the famous double helix -- held together by weak bonds between complementary bases, adenine (A) in one strand to thymine (T) in the other, and cytosine to guanine (C-G). In the language of molecular genetics, each of these linkages constitutes a base pair. All told, if we count only one of each pair of chromosomes, the human genome comprises about three billion base pairs.

The specificity of these base-pair linkages underlies all that is wonderful about DNA. First, replication becomes straightforward. Unzipping the double helix provides unambiguous templates for the synthesis of daughter molecules: One helix begets two with near-perfect fidelity. Second, by a similar template-based process, depicted in From genes to proteins, a means is also available for producing a DNA-like messenger to the cell cytoplasm. There, this messenger RNA, the faithful complement of a particular DNA segment, directs the synthesis of a particular protein. Many subtleties are entailed in the synthesis of proteins, but in a schematic sense, the process is elegantly simple. 

Every protein is made up of one or more polypeptide chains, each a series of (typically) several hundred molecules known as amino acids, linked by so-called peptide bonds. Remarkably, only 20 different kinds of amino acids suffice as the building blocks for all human proteins. The synthesis of a protein chain, then, is simply a matter of specifying a particular sequence of amino acids. This is the role of the messenger RNA. (The same nitrogenous bases are at work in RNA as in DNA, except that uracil takes the place of the DNA base thymine.) Each linear sequence of three bases (both in RNA and in DNA) corresponds uniquely to a single amino acid. The RNA sequence AAU thus dictates that the amino acid asparagine should be added to a polypeptide chain, GCA specifies alanine -- and so on. A segment of the chromosomal DNA that directs the synthesis of a single type of protein constitutes a single gene.
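The two template-based steps described above, copying a DNA segment into messenger RNA and reading that RNA three bases at a time, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the example template strand, and the deliberately partial codon table are assumptions made for this example (the full genetic code has 64 three-base entries), and the template is written in the order in which it is read.

```python
# Complement rules for copying a DNA template into messenger RNA.
# Uracil (U) takes the place of thymine in RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# A deliberately partial codon table (illustrative only).
# None marks a stop signal that ends the polypeptide chain.
CODON_TABLE = {
    "AUG": "Met",  # methionine
    "AAU": "Asn",  # asparagine (example from the text)
    "GCA": "Ala",  # alanine (example from the text)
    "UGG": "Trp",  # tryptophan
    "UAA": None,   # stop
}


def transcribe(template_dna):
    """Build the messenger RNA complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template_dna.upper())


def translate(mrna):
    """Read the mRNA three bases at a time and list the amino acids."""
    chain = []
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete final triplet
    for i in range(0, usable, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError if outside the partial table
        if amino_acid is None:
            break  # stop signal reached
        chain.append(amino_acid)
    return chain


template = "TACTTACGTACCATT"  # hypothetical template strand
mrna = transcribe(template)
print(mrna)             # AUGAAUGCAUGGUAA
print(translate(mrna))  # ['Met', 'Asn', 'Ala', 'Trp']
```

The lookup table is the whole of the "code": once the sequence of bases is known, the sequence of amino acids follows mechanically.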


If unwound and tied together, the strands of DNA in a single cell would stretch more than 5 feet but would be only about 80 billionths of an inch (roughly 2 nanometers) wide. For each organism, the components of these slender threads encode all the information necessary for building and maintaining life, from simple bacteria to remarkably complex human beings. Understanding how DNA performs this function requires some knowledge of its structure and organization.
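As a rough check on the length quoted above, the arithmetic can be worked through in Python. Two figures here are assumptions not stated in the text: a spacing of about 0.34 nanometers between successive base pairs along the double helix (the usual textbook value), and two copies of the 3-billion-base-pair genome per cell, one from each parent.

```python
BASE_PAIRS_PER_GENOME_COPY = 3_000_000_000  # figure given in the text
COPIES_PER_CELL = 2            # assumption: one copy from each parent
RISE_PER_BASE_PAIR_NM = 0.34   # assumption: textbook spacing per base pair
METERS_PER_FOOT = 0.3048


def dna_length_feet(base_pairs):
    """End-to-end length of a stretch of double helix, in feet."""
    meters = base_pairs * RISE_PER_BASE_PAIR_NM * 1e-9
    return meters / METERS_PER_FOOT


length = dna_length_feet(BASE_PAIRS_PER_GENOME_COPY * COPIES_PER_CELL)
print(f"{length:.1f} feet")  # 6.7 feet
```

Under these assumptions the total comes to a little over 2 meters, consistent with "more than 5 feet."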

In humans, as in other higher organisms, a DNA molecule consists of two strands that wrap around each other to resemble a twisted ladder whose sides, made of sugar and phosphate molecules, are connected by rungs of nitrogen-containing chemicals called bases. Each strand is a linear arrangement of repeating similar units called nucleotides, which are each composed of one sugar, one phosphate, and a nitrogenous base. Four different bases are present in DNA: adenine (A), thymine (T), cytosine (C), and guanine (G). The particular order of the bases arranged along the sugar-phosphate backbone is called the DNA sequence; the sequence specifies the exact genetic instructions required to create a particular organism with its own unique traits.

The two DNA strands are held together by weak bonds between the bases on each strand, forming base pairs (bp). Genome size is usually stated as the total number of base pairs; the human genome contains roughly 3 billion bp.

Each time a cell divides into two daughter cells, its full genome is duplicated; for humans and other complex organisms, this duplication occurs in the nucleus. During cell division the DNA molecule unwinds and the weak bonds between the base pairs break, allowing the strands to separate. Each strand directs the synthesis of a complementary new strand, with free nucleotides matching up with their complementary bases on each of the separated strands. Strict base-pairing rules are adhered to; adenine will pair only with thymine (an A-T pair) and cytosine with guanine (a C-G pair). Each daughter cell receives one old and one new DNA strand. The cell's adherence to these base-pairing rules ensures that the new strand is an exact copy of the old one. This minimizes the incidence of errors (mutations) that may greatly affect the resulting organism or its offspring.
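The replication step lends itself to a short Python sketch. The function names and the example sequence are hypothetical, and the model is simplified: each strand is a plain string, and the two strands of a molecule are written aligned base by base, ignoring their opposite chemical orientation.

```python
# Strict base-pairing rules: A pairs only with T, and C only with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the strand that pairs, base by base, with the given one."""
    return "".join(PAIR[base] for base in strand)


def replicate(molecule):
    """Separate the two strands and let each direct a new partner.

    Each daughter molecule keeps one old strand and gains one new strand.
    """
    old_first, old_second = molecule
    daughter_1 = (old_first, complement(old_first))
    daughter_2 = (complement(old_second), old_second)
    return daughter_1, daughter_2


first_strand = "ATGCCGTA"  # hypothetical sequence
parent = (first_strand, complement(first_strand))
daughter_1, daughter_2 = replicate(parent)

print(parent)      # ('ATGCCGTA', 'TACGGCAT')
print(daughter_1)  # ('ATGCCGTA', 'TACGGCAT')
print(daughter_2)  # ('ATGCCGTA', 'TACGGCAT')
```

Because each base has exactly one permitted partner, both daughter molecules come out identical to the parent, which is the fidelity the pairing rules provide.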

What are some practical benefits to learning about DNA?

Knowledge about the effects of DNA variations among individuals can lead to revolutionary new ways to diagnose, treat, and someday prevent the thousands of disorders that affect us.

Besides providing clues to understanding human biology, learning about nonhuman organisms' DNA sequences can lead to an understanding of their natural capabilities that can be applied toward solving challenges in health care, energy sources, and environmental cleanup.