Unit 5. Biotechnology

BIOCHEMISTRY FOR CITIZENS

Wonders of modern genetics

• Lynch syndrome is an inherited genetic defect whose carriers have a 50% chance of having bowel cancer by the age of 70, and an 80% lifetime risk, compared to the normal risk of less than 5%. A simple genetic test can detect this flaw, and insurance companies now cover earlier and more frequent colonoscopies aimed at finding such cancers before they spread. A number of other cancer-causing or cancer-predisposing genes can be found, allowing appropriate precautions and surveillance.

• Couples planning to have children can learn whether certain combinations of genes might result in serious birth defects, and they can learn from genetic testing of placental blood very early in pregnancy whether such eventualities have come to pass.

• Owners of mixed-breed dogs can now learn the breeds of recent ancestors, and owners of pure-bred and designer dogs can confirm that their heritage is as advertised.

• For less than $100, anyone can send off a DNA sample and learn the status of many genes that might affect their health, including disease susceptibility (diabetes, arthritis, coronary heart disease) and intolerance to specific foods and drugs. They also learn some just-for-fun things that they might already know, like whether they have attached ear lobes or wet ear wax. The correctness of these genetic "predictions" is likely to increase their confidence in the predictions about other conditions, which are otherwise undetectable.

• Before long, having full genetic sequences for all patients will be affordable—the cost is already below $1000 and should get down to $100 within a few years. Full sequences will probably become a routine part of our health records, and we cannot yet predict the full range of their use.

The seemingly wondrous things that are being made possible by modern genetics all grow out of a small number of crucial developments that began with cloning in the 1970s. Cloning, rapid DNA sequencing, the polymerase chain reaction, and powerful computer algorithms for comparing DNA sequences have led rapidly to floods of data about our genes.

Tools of biotechnology

Cloning and PCR: making workable quantities of identical DNAs

Cloning first allowed scientists to make large enough quantities of specific genes to be able to sequence them, that is, to learn the exact sequence of As, Ts, Gs, and Cs in them. Chemists rarely work with a single molecule; they need large numbers of identical molecules (a pure sample of a substance, such as a gene) in order to do the chemistry needed to sequence it. Cloning gave them the necessary quantities of identical DNA pieces.

The figure below summarizes cloning. DNA that contains desired genes (in this slide, from bacteria) is chopped into small pieces, all different. The pieces are introduced into carrier DNA molecules called vectors, which can be inserted into easy-to-grow cells in such a way that each cell usually gets no more than one piece. Cells are then spread out so sparsely that each cell can divide in isolation from others. Each dividing group, called a clone, provides a larger quantity of the original DNA piece that was introduced into it.




Note that this method also represents a basic operation in genetic engineering. Each clone of cells has some new DNA sequence within it, a sequence that has become part of its genetic heritage. This has led to the goal, by now realized many times, of using easily cultured organisms to produce the products of genes from other organisms. Bacteria have been used to make and secrete human proteins like insulin. Insulin genes have also been inserted into mammalian livestock so that their milk contains useful proteins that can be easily purified.

A purely chemical method has replaced cloning in many applications. It is called the polymerase chain reaction, or PCR, and produces many identical copies of a DNA by the direct action of DNA-replicating enzymes. Scientists refer to this as amplifying DNA, and the process can start from a sample that contains only a few molecules of the desired DNA, even if it is mixed with much extraneous DNA.

Note that, in developing many of these methods, scientists have turned many enzymes into laboratory tools for such tasks as replicating DNA and cutting it at specific sites.

The figure below summarizes PCR. To start, one needs to know a little bit of sequence: short DNA sequences called primers that match short sequences at each end of the DNA to be amplified. The enzymes that replicate the desired DNA use primers as starting points. By repeatedly replicating, heating to separate new double strands into single strands, cooling to allow primers to find their targets, and allowing replication again, the process amplifies the DNA between the primers without amplifying anything else.



This chart shows the arithmetic of PCR, and the explosive amplification of the desired DNA, until undesired DNA is no more than a minor impurity.
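
The arithmetic can also be sketched in a few lines of Python. This is only an idealized model, under assumptions of mine: every cycle exactly doubles the desired DNA, the extraneous DNA is never copied, and the starting numbers (10 desired molecules among a million others) are hypothetical. The function names are my own, not part of any standard tool.

```python
def pcr_copies(initial_copies, cycles):
    """Ideal PCR: each cycle doubles the number of desired DNA molecules."""
    return initial_copies * 2 ** cycles


def desired_fraction(initial_copies, extraneous_copies, cycles):
    """Fraction of all DNA molecules that are the desired DNA after PCR.

    Assumption: extraneous DNA has no primer sites, so its count stays fixed.
    """
    desired = pcr_copies(initial_copies, cycles)
    return desired / (desired + extraneous_copies)


# Hypothetical mixture: 10 desired molecules hidden among 1,000,000 others.
for cycles in (0, 10, 20, 30):
    copies = pcr_copies(10, cycles)
    fraction = desired_fraction(10, 1_000_000, cycles)
    print(cycles, copies, round(fraction, 6))
```

Under these assumptions, the desired DNA starts as a trace and, after 30 cycles, makes up nearly all of the sample. Real reactions fall short of perfect doubling, so treat the numbers as an upper limit.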



DNA sequencing: reading the code

The DNA whose sequence is desired is allowed to replicate under conditions that frequently cause replication to stop at a particular base, say T. The result is a mixture of all possible complementary fragments that end in T. The same kind of procedure produces three other mixtures: all possible complements that end in A, in C, and in G. Each mixture is subjected to a procedure called electrophoresis, which separates the fragments by length, in bases, so that each length can be seen as a band. If A is, say, the sixth base of the DNA being sequenced, a complementary fragment of length 6 will end in T. Detection of a band of length 6 in the T sample therefore identifies the sixth base pair.
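
The logic of this method can be imitated with a toy program. It is a sketch, not a model of the real chemistry: the template sequence is hypothetical, the function names are my own, and I ignore the fact that the two strands of DNA run in opposite directions, so that base 1 of the copy simply pairs with base 1 of the template.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(template):
    """Return the base-by-base complement of the template."""
    return "".join(COMPLEMENT[base] for base in template)


def terminated_lengths(template):
    """For each base, list the lengths of copied fragments that end in it."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(complement_strand(template), start=1):
        lanes[base].append(length)
    return lanes


def read_bands(lanes):
    """Rebuild the copied strand by reading bands from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


template = "GATTCAGGCT"  # hypothetical template; its sixth base is A
lanes = terminated_lengths(template)
print(lanes["T"])                # band lengths seen in the T mixture
print(read_bands(lanes))         # the copied (complementary) strand
print(complement_strand(read_bands(lanes)) == template)
```

Because the sixth base of this template is A, the T mixture contains a fragment of length 6, just as described above. Reading all four mixtures from the shortest band to the longest gives the whole complementary strand, and from it the original sequence.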



In a fast automated method, all possible fragments are made in a single experiment, but the fragments ending in each base carry a different color. So if fragments ending in T are red, and a red band shows that there is a red fragment 140 bases in length, then the 140th base is T. Automated instruments can take a DNA sample, amplify regions of interest, sequence them, and put the sequence out as a computer file ready for analysis.



Sequence searching and comparison

The genetic revolution had to await another revolution: the advent of computers fast enough to search through sequences billions of base-pairs long, and to align and compare such vast sequences. The result is that whole genomes can be compared, in order to establish evolutionary relationships, or to find the genetic basis for medical conditions that occur more frequently in certain ethnic groups or regions of the world.
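
To give a feel for what "searching" means here, the sketch below slides a short sequence along a longer one and reports where it fits best. It is a deliberately naive illustration with made-up sequences and function names of my own; the programs actually used on whole genomes rely on far faster methods and also allow for inserted and deleted bases, which this sketch does not.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_match(query, genome):
    """Slide query along genome; return (start, identity) of the best window.

    Ties go to the earliest window. No gaps are allowed.
    """
    best_start, best_score = -1, -1.0
    for start in range(len(genome) - len(query) + 1):
        score = identity(query, genome[start:start + len(query)])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


genome = "TTGACCGTAAGCTTAGGC"          # hypothetical stretch of DNA
print(best_match("GTAAGC", genome))    # an exact copy of part of the genome
print(best_match("GTATGC", genome))    # the same piece with one base changed
```

Even this crude search finds the right location when one base differs, which is the essence of comparing a gene from one person, breed, or species with its counterpart in another.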

•••••

I hope that this very sketchy look at some of the basic tools of biotechnology gives you a general picture of what kind of work goes on to produce detailed knowledge of gene sequences. Now let's look at some uses to which this information is being put.

Medical genomics

Many of our traits, from obvious ones, like height and skin color, to subtle ones, like cancer susceptibility, arise from variations in several or many genes, rather than just one. This makes correlating genes with specific medical conditions a tricky job. Other conditions, like Lynch syndrome mentioned above, and certain breast cancers, result from mutation in a single gene. Recognizing such one-gene conditions, and responding to them appropriately, is among the first successes of the use of genomic information to understand disease.

Cancer is particularly hard to study because, out of all the people that are exposed to carcinogens, and out of all the people who have genes that predispose them to cancer, only a modest percentage will get cancer. Furthermore, not all who get cancer will get the same type. Even in a single tumor, there can be varied populations of cells that became cancerous by different sequences of events. (To me, the remarkable thing about a condition that raises the chance of bowel cancer by age 70 to 50% is that 50% of these people are carrying the defect, but do not get cancer by age 70—predisposition does not mean predestination.)

Finding specific genetic mutations that might be causes of disease is made easier by the use of DNA markers. A marker is simply a DNA sequence that varies considerably (has numerous versions) in a population, lies near a gene of interest so that it is always inherited along with the gene, and is easy to detect. Unique mutations to the gene will often be associated with unique variants of the marker, so the marker tells you which version of the gene is present, with very little sequencing. Large numbers of markers can be checked cheaply, to survey for many possible mutations.

For example, two genes, called BRCA1 and BRCA2 (full names: breast cancer susceptibility gene, type 1 or 2), code for proteins that are involved in DNA repair, and normally protect cells from DNA damage, which might lead to cancer. Mutations in the BRCA genes remove some of this protection. Relatively simple genetic tests (using cheek scrapes) can identify markers associated with variants of BRCA genes that are linked to cancer, and thus warn a person that more cancer surveillance is a good idea.

Any group of people or animals who have a high frequency of specific diseases or medical conditions is potentially a good source of information about genes that might underlie the conditions. A marker can often be found that is common in the group but rare elsewhere, and this can lead to determining just what gene is involved in the condition.

By donating a cheek scrape, and without ever becoming a lab animal, a pure-bred dog can be a most useful research subject. Many breeds have specific ailments, including types of cancer, dysplasias of joints, and respiratory problems. Scientists can look for versions of markers characteristic of a breed, and they might find that these same markers are associated with a gene that causes the breed's specific ailments. The related gene in humans might also be associated with a similar ailment.

It takes sampling from many dogs of the same breed to find a strong statistical correlation between a particular version of a marker and a disease. When a research group working on this problem needed greater numbers of a breed than were available, they thought that they might add to their sample size by including the most closely related breed. How do you know which breed is closest to your current subject breed? They realized that the markers could tell them: the breeds with the largest number of corresponding versions of markers were the closest relatives. Thus they could expand sample sizes while adding as little additional genetic variation as possible.
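
The idea of "largest number of corresponding versions" is simple enough to sketch in code. Everything specific here is invented for illustration: the breed names, the marker names (M1 to M5), and the versions are placeholders, not real data, and real studies use many more markers and proper statistics.

```python
def shared_markers(profile_a, profile_b):
    """Count the markers for which two breeds carry the same version."""
    return sum(
        1
        for marker, version in profile_a.items()
        if profile_b.get(marker) == version
    )


def closest_breed(subject, profiles):
    """Return the other breed that shares the most marker versions."""
    others = [name for name in profiles if name != subject]
    return max(
        others,
        key=lambda name: shared_markers(profiles[subject], profiles[name]),
    )


# Hypothetical profiles: the most common version of each marker in a breed.
PROFILES = {
    "breed_A": {"M1": "a", "M2": "c", "M3": "b", "M4": "a", "M5": "d"},
    "breed_B": {"M1": "a", "M2": "c", "M3": "b", "M4": "a", "M5": "a"},
    "breed_C": {"M1": "b", "M2": "c", "M3": "a", "M4": "c", "M5": "d"},
}

print(closest_breed("breed_A", PROFILES))
print(shared_markers(PROFILES["breed_A"], PROFILES["breed_B"]))
```

In this made-up example, breed_B matches breed_A at four of five markers and breed_C at only two, so breed_B would be the one to add to the study.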

These researchers also had another flash of insight, and with it, they started the industry of identifying dog breeds from cheek swabs, now used to confirm the heritage of pure and designer breeds (like golden doodles) and to learn something about the ancestry of mixed-breed dogs. That's how I know that our website mascot Darwin is part Airedale (but less than 50%), with detectable genetic contributions from Rhodesian ridgeback and Belgian shepherd.

Note that a correlation between a particular marker and a disease does not prove that the associated gene is actually the cause. But it's a good suggestion about where to look for causes.

Genetic engineering

The article "Are Engineered Foods Evil?" (Scientific American, September 2013, sent by email) discusses the controversy about foods from genetically modified organisms (called GMO foods). It provides some background about how the genetic engineering is done. I hope that the information above, in the section Tools of biotechnology, makes the article more understandable.