DNA :: Age of the deciphering of the Homo sapiens genome ::
“Plastics.” When a family friend whispered this word to Dustin Hoffman’s character in the 1967 film The Graduate, he was advocating not just a novel career choice but an entirely different way of life. If that movie were made today, in the age of the deciphering of the Homo sapiens genome, the magic word might well be “bioinformatics.” Corporate and government-led scientists have already compiled the three
gigabytes of paired A’s, C’s, T’s and G’s that spell
out the Homo sapiens genetic code, a quantity of information that could fill more
than 2,000 standard computer diskettes. But that is just the initial trickle of
the flood of information to be tapped from the Homo sapiens genome. Researchers are
generating gigantic databases containing the details of when and in which
tissues of the body various genes are turned on, the shapes of the proteins the
genes encode, how the proteins interact with one another and the role those
interactions play in disease. Add to the mix the data pouring in about the
genomes of so-called model organisms such as fruit flies and mice, and you have
what Gene Myers, Jr., vice president of informatics research at Celera
Genomics in [...]

“For the next two to three years, the amount of information will be phenomenal, and everyone will be overwhelmed by it,” Myers predicts. “The race and competition will be who can mine it best. There will be such a wealth of riches.”

A whole host of companies are vying for their share of the gold. Jason Reed
of the investment banking firm Oscar Gruss & Son in [...]

The reason drug companies are so willing to line up and pay for such services, or to develop their own expensive resources in-house, is that bioinformatics offers the prospect of finding better drug targets earlier in the drug development process. This efficiency could trim the number of potential therapeutics moving through a company’s clinical testing pipeline, significantly decreasing overall costs. It could also create extra profits for drug companies by whittling the time it takes to research and develop a drug, thus lengthening the time a drug is on the market before its patent expires.

“Assume I’m a pharmaceutical company and somebody can get [my] drug to the market one year sooner,” explains Stelios Papadopoulos, managing director of health care at the New York investment banking firm SG Cowen. “It could mean you could grab maybe $500 million in sales you would not have recovered.”

Before any financial windfalls can occur, however, bioinformatics companies must contend with the current plethora of genomic data while constantly refining their technology, research approaches and business models. They must also focus on the real challenge and opportunity: finding out how all the shards of information relate to one another and making sense of the big picture.

“Methods have evolved to the point that you can generate lots of
information,” comments Michael R. Fannon, vice president and chief
information officer of Human Genome Sciences, also in [...]

Divining that importance is the job of bioinformatics. The field got its
start in the early 1980s with a database called GenBank, which was originated
by the U.S. Department of Energy to hold the short stretches of DNA sequence
that scientists were just beginning to obtain from a range of organisms. In the
early days of GenBank a roomful of technicians sat at keyboards consisting of
only the four letters A, C, T and G, tediously entering the DNA-sequence
information published in academic journals. As the years went on, new protocols
enabled researchers to dial up GenBank and dump in their sequence data
directly, and the administration of GenBank was transferred to the National
Institutes of Health’s National Center for Biotechnology Information (NCBI).

Once the Human Genome Project (HGP) officially got off the ground in 1990, the volume of DNA-sequence data in GenBank began to grow exponentially. With the introduction in the 1990s of high-throughput sequencing, an approach using robotics, automated DNA-sequencing machines and computers, additions to GenBank skyrocketed. GenBank held the sequence data on more than seven billion units of DNA by July 2000.

Around the time the HGP was taking off, private companies started parallel
sequencing projects and established huge proprietary databases of their own.
Today companies such as Incyte Genomics in [...]

But GenBank and its corporate cousins are only part of the bioinformatics picture. Other public and private databases contain information on gene expression (when and where genes are turned on), tiny genetic differences among individuals called single-nucleotide polymorphisms (SNPs), the structures of various proteins, and maps of how proteins interact.

Mixing and Matching

One of the most basic operations in bioinformatics involves searching for similarities, or homologies, between a newly sequenced piece of DNA and previously sequenced DNA segments from various organisms. Finding near-matches allows researchers to predict the type of protein the new sequence encodes. This not only yields leads for drug targets early in drug development but also weeds out many targets that would have turned out to be dead ends.

A popular set of software programs for comparing DNA sequences is BLAST (for Basic Local Alignment Search Tool), which first emerged in 1990. BLAST is part of a suite of DNA- and protein-sequence search tools accessible in various customized versions from many database providers or directly through NCBI. NCBI also offers Entrez, a so-called metasearch tool that covers most of NCBI’s databases, including those housing three-dimensional protein structures, the complete genomes of organisms such as yeast, and references to scientific journals that back up the database entries.

An early example of the utility of bioinformatics is cathepsin K, an enzyme
that might turn out to be an important target for treating osteoporosis, a
crippling disease caused by the breakdown of bone. In 1993 researchers at
SmithKline Beecham, based in [...]

Human Genome Sciences scientists sequenced the sample and conducted database homology searches to look for matches that would give them a clue to the proteins that the sample’s gene sequences encoded. Once they found near-matches for the sequences, they carried out further analyses and discovered that one sequence in particular was overexpressed by the osteoclast cells and that it matched those of a previously identified class of molecules: cathepsins.

For SmithKline Beecham, that exercise in bioinformatics yielded in just weeks a promising drug target that standard laboratory experiments could not have found without years and a pinch of luck. Company researchers are now trying to find a potential drug that blocks the cathepsin K target.

Searches for compounds that bind to and have the desired effect on drug targets still take place mainly in a biochemist’s traditional “wet” lab, where evaluations for activity, toxicity and absorption can take years. But with new bioinformatics tools and growing amounts of data on protein structures and biomolecular pathways, some researchers say, this aspect of drug development will also shift to computers, in what they term “in silico” biology.
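
To make the idea of a homology search concrete, here is a minimal sketch of local sequence alignment, the kind of comparison that tools such as BLAST are built to perform quickly. It uses the classic Smith-Waterman dynamic-programming method rather than BLAST's own faster heuristic, so it should be read as an illustration of the principle and not as how BLAST or any company's pipeline actually works. The scoring values, the function names (smith_waterman, rank_hits) and the toy sequences are all assumptions made for this example.

```python
def smith_waterman(query, subject, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_query, aligned_subject) for the best local alignment.

    Scoring values are illustrative assumptions, not standard BLAST parameters.
    """
    rows, cols = len(query) + 1, len(subject) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; a cell never drops below zero, which is what
    # makes the alignment local rather than end-to-end.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if query[i - 1] == subject[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + pair,  # align the two letters
                score[i - 1][j] + gap,       # gap in the subject
                score[i][j - 1] + gap,       # gap in the query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        pair = match if query[i - 1] == subject[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + pair:
            top.append(query[i - 1])
            bottom.append(subject[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(query[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(subject[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


def rank_hits(query, database):
    """Score the query against every named sequence; best matches come first."""
    hits = [(smith_waterman(query, seq)[0], name) for name, seq in database.items()]
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    # Hypothetical toy "database"; real entries would come from GenBank or similar.
    toy_db = {"known_gene_a": "GGACGTGG", "known_gene_b": "CCCCCC"}
    for hit_score, name in rank_hits("TTACGTTT", toy_db):
        print(name, hit_score)
```

In this toy run the new sequence shares the stretch ACGT with known_gene_a, so that entry ranks first. An exhaustive comparison like this becomes slow on databases holding billions of bases, which is the practical reason heuristic tools are used for real searches.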