Several companies have sprouted up to provide bioinformatics tools

An article added by Donis F. on 11/27/2007



Unprecedented fanfare greeted the June 26, 2000 announcement that scientists had completed a draft of the human genome sequence. The truth is, however, that figuring out the order of the letters in our genetic alphabet was the easy part. Now comes the hard part: deciphering the meaning of the genetic instruction book.

The next stage goes by a deceptively prosaic name: annotation. Strictly speaking, “annotation” comprises everything that can be known about a gene: where it works, what it does and how it interacts with fellow genes. Right now, scientists often use the term simply to signify the first step: gene finding. That means discovering which parts of a stretch of DNA belong to a gene and distinguishing them from the other 96 percent or so that have no known function, often called junk DNA.

Several companies have sprouted up to provide bioinformatics tools, software and services. Their success, though, may hinge on a peaceful spot south of England’s University of Cambridge. It is home to the Sanger Center, the U.K. partner in the publicly funded Human Genome Project (HGP) consortium, and the European Bioinformatics Institute (EBI), Europe’s equivalent of the National Center for Biotechnology Information (NCBI) at the National Institutes of Health. Sanger and EBI are collaborating on the Ensembl project, which consists of computer programs for genome analysis and the public database of human DNA sequences. New DNA sequences arrive in bits and pieces; automated routines scan the sequences, looking for patterns typically found in genes. “One of the important things about Ensembl is that we’re completely open, so you can see all our data, absolutely everything,” says EBI’s Ewan Birney.
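To make the idea of scanning for gene-like patterns concrete, here is a minimal sketch in Python. It is not Ensembl's method, which the article does not describe. It assumes the simplest possible pattern, an open reading frame (a start codon followed in the same reading frame by a stop codon), and looks only at the forward strand. The function name `find_orfs` and the `min_codons` threshold are illustrative choices, not part of any real tool.

```python
# Toy pattern scan: report open reading frames (ORFs) on the forward strand.
# Assumption: an ORF is an ATG followed, in the same frame, by a stop codon.
# Real gene finders rely on much richer statistical models than this.

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return sorted (start, end) pairs; end is exclusive and includes the stop codon."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] != START:
                i += 3
                continue
            # Walk codon by codon until a stop codon or the end of the sequence.
            j = i + 3
            while j + 3 <= len(dna) and dna[j:j + 3] not in STOPS:
                j += 3
            if j + 3 <= len(dna):
                # An in-frame stop codon was found at position j.
                if (j - i) // 3 >= min_codons:
                    orfs.append((i, j + 3))
                i = j + 3
            else:
                # No in-frame stop codon remains, so this frame is finished.
                break
    return sorted(orfs)


example = "CCATGAAACCCGGGTAGCC"
for start, end in find_orfs(example):
    print(start, end, example[start:end])
```

Even this toy version hints at why automated annotation needs correction: any stretch that happens to begin with ATG and end with a stop codon is reported, whether or not it belongs to a real gene.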

No matter how talented their algorithms, however, computers can’t get all the genes, and they can’t get them all right. Many additions and corrections, plus the all-important information about how genes are regulated and what they do, are tasks for human curators. That problem may be solved for Ensembl by a distributed computing system under development by Lincoln Stein of the Cold Spring Harbor Laboratory on Long Island, N.Y. The plan is to provide human annotation (corrections, suggestions and research findings from scientists around the world) layered on top of Ensembl’s automatic annotation. Stein’s Distributed Sequence Annotation System, DAS for short, borrows an approach from Napster, the controversial software that allows people to swap music files over the Internet.

The plan is that different labs will publish their own annotations (on dedicated servers) according to specifications of some commonly accepted map of the genome, like Ensembl’s. “Then the browser application would be able to go out onto the Web, find out what’s there and bring it all into an integrated view so that you could see in a graphical way what different people had to say about a region of the genome,” Stein explains. In this way DAS may solve a huge problem that plagues biology databases: the lack of a standard format for archiving and presenting data, which, among other disadvantages, makes it impossible to search across them and compare contents.
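The layering idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the DAS protocol itself: it assumes each lab's annotations are already in memory as simple records keyed to coordinates on one shared reference map, and it skips the network step entirely. The `Annotation` record, the `integrated_view` function and the source names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    source: str  # which lab or server published it (hypothetical names)
    start: int   # first position on the shared reference map
    end: int     # last position, inclusive
    note: str    # what the source has to say about this stretch


def integrated_view(layers, region_start, region_end):
    """Collect annotations from every layer that overlap the requested region."""
    hits = [
        ann
        for layer in layers
        for ann in layer
        if ann.start <= region_end and ann.end >= region_start
    ]
    # Sort by position so that the layers interleave along the genome.
    return sorted(hits, key=lambda a: (a.start, a.end, a.source))


# Made-up data: one automatic layer plus two independent lab layers.
automatic = [
    Annotation("reference-auto", 100, 900, "predicted gene"),
    Annotation("reference-auto", 5000, 5600, "predicted gene"),
]
lab_a = [Annotation("lab-a", 150, 300, "exon boundary corrected")]
lab_b = [
    Annotation("lab-b", 850, 1200, "possible regulatory region"),
    Annotation("lab-b", 7000, 7100, "repeat element"),
]

view = integrated_view([automatic, lab_a, lab_b], 1, 1000)
for ann in view:
    print(ann.start, ann.end, ann.source, ann.note)
```

The merge works only because every record uses the same coordinate system, which is why the scheme depends on a commonly accepted map of the genome.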

The DAS model is not universally beloved. NCBI director David Lipman is concerned that the human annotations may be full of rubbish because they will not be peer-reviewed. Stein acknowledges the possibility but hopes that good annotation will drive out bad. He is more concerned about whether the spirit of volunteerism will flag when faced with personnel changes and the vagaries of funding. Keeping a lab’s Web server running and up-to-date is a long-term commitment.

In contrast to the well-publicized rivalry between the HGP and the privately owned Celera Genomics in sequencing the genome, many bioinformatics firms don’t regard Ensembl as an organization to beat. In fact, several commercial players endorse collaboration; financial opportunity will come from using the data in a unique way. James I. Garrels, president of Proteome in Beverly, Mass., expects to partner with and provide help to public-domain efforts to amass a basic description of each gene, its protein and a few of the protein’s key properties. But Proteome also believes that nothing beats the vast and versatile human brain for making sense of the vast and versatile human genome. The company’s researchers scour the literature, concentrating on proteins (the product most genes make), and since 1995 have built protein databases on three model organisms: the roundworm Caenorhabditis elegans and two species of yeast. Now they are adding data on the human, mouse and rat genomes. The company’s niche will be integrating all that information. “That’s not the type of effort contemplated in the public domain,” Garrels points out.

Proteome’s strength is likely to lie in its customers’ ability to compare sequences across species. Because evolution has conserved a great many genes and used them over and over, such comparisons are a rich source of hints: a human gene whose job is currently a mystery will often be nearly identical to one present in other species.
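A rough sense of how such a comparison yields a hint can be had from the short Python sketch below. It is not Proteome's method, which the article does not describe. It assumes the sequences have already been aligned to equal length and uses plain percent identity as the similarity score. The sequences, species labels and function names are invented for illustration.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def closest_relative(query, candidates):
    """Return the label of the candidate sequence most similar to the query."""
    return max(candidates, key=lambda name: percent_identity(query, candidates[name]))


# Invented sequences: a gene of unknown function and two annotated relatives.
mystery_gene = "ATGGCCAAGT"
annotated = {
    "mouse": "ATGGCCAAGA",
    "yeast": "ATGACCTAGA",
}

for species, seq in annotated.items():
    print(species, percent_identity(mystery_gene, seq))
print("closest:", closest_relative(mystery_gene, annotated))
```

If the closest relative already has a known function, that function becomes a working guess for the mystery gene, to be confirmed in the lab.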

Randy Scott, president of Incyte Genomics in Palo Alto, Calif., is another fan of sharing the load. Besides, “there’s plenty of ways to make money,” Scott declares. “We assume there are going to be broadly annotated databases available in the public domain, and the sooner we can get there, the faster Incyte can focus on downstream, on how we take that information to create new levels of information.” For instance, the company has picked a group of genes it believes will be important for diagnostics and other applications and is concentrating its annotation efforts on them. It also has databases that permit some cross-species comparisons.

Given Ensembl’s open-source code, distributed annotation and determination to stay free, comparisons to the free Linux computer operating system (which may someday challenge Microsoft Windows’s supremacy) are natural. But the parallel doesn’t go very far. Thinking of public and commercial annotation products as rivals misses the point, observers say. In the words of Sean Eddy of Washington University, who is working on DAS: “The human genome is too big for anybody to look at alone. We’re going to have to figure out ways for the public and private sectors to work collaboratively rather than competitively.”

