MotifNetwork Offers New Tool for Research

Beckman Institute researcher Eric Jakobsson unveiled a new computational tool called MotifNetwork at a bioinformatics conference last fall.

Beckman Institute faculty member Eric Jakobsson and collaborators Gloria Rendon of NCSA, Jeff Tilson of the Renaissance Computing Institute, and Beckman Institute graduate student Mao-Feng Ger announced a new, sustainable, terascale bioinformatics computing environment at the IEEE 7th International Conference on Bioinformatics and Bioengineering in October, and the announcement made an impact.

The Boston conference was the occasion for the first public announcement of MotifNetwork, a computational tool for creating functional domain architectures for protein sequences. The announcement came in the form of three talks and a paper by Jakobsson and his collaborators. Their presentation was so impressive that it earned the Best Applications Paper award out of more than 600 papers at the meeting.

Jakobsson, who is Director of the National Center for Design of Biomimetic Nanoconductors at Beckman, said that MotifNetwork has already demonstrated its value as "a high performance computing environment for translating genomic information into a form that dramatically enhances the ability to infer the functional significance of sequence information."

Jakobsson said the present and future development of MotifNetwork for research and medical purposes is similar in some ways to the creation of the Web browser.

"What the Web browser did is it transformed information and it transformed the ease of access to that information in a way that just made it much, much easier and more intuitive for many people to use," Jakobsson said. "So this is a similar kind of transforming operation on sequence data that makes it in many ways more useful."

Using information from data storehouses such as InterPro and the National Center for Biotechnology Information, MotifNetwork offers a comprehensive approach to classifying proteins at the level of the single domain rather than the whole protein.

"That's why this is so powerful," Jakobsson said. "You get more information by looking at the protein as a collection of functions as opposed to trying to describe it globally as a single unit, and you get more information yet by cataloging the domain composition of all proteins, as opposed to just some proteins. This is because domain composition leads to insights about protein-protein interactions, since proteins interact with each other at the single domain level. These are really domain-domain interactions. This restructuring of the data is of tremendous power."

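The reasoning in that quote, that cataloging domain composition points toward protein-protein interactions, can be illustrated with a small sketch. This is not MotifNetwork's code or data format; the protein names, domain labels, and the list of interacting domain pairs below are all hypothetical, chosen only to show the idea of treating each protein as a collection of domains.

```python
from collections import defaultdict
from itertools import product

# Hypothetical domain architectures: protein name -> ordered domain labels.
# These are illustrative values, not output from MotifNetwork.
architectures = {
    "protA": ["SH3", "SH2", "Kinase"],
    "protB": ["ProlineRich", "PH"],
    "protC": ["Kinase"],
}

# Hypothetical domain pairs assumed (for this sketch) to interact.
domain_pairs = {frozenset(("SH3", "ProlineRich"))}


def index_by_domain(archs):
    """Map each domain label to the set of proteins that contain it."""
    index = defaultdict(set)
    for protein, domains in archs.items():
        for domain in domains:
            index[domain].add(protein)
    return index


def candidate_interactions(archs, pairs):
    """Propose protein pairs whose architectures contain an interacting domain pair."""
    index = index_by_domain(archs)
    candidates = set()
    for pair in pairs:
        members = sorted(pair)
        # A one-element pair means a domain that interacts with itself.
        first, second = members[0], members[-1]
        for p, q in product(index.get(first, ()), index.get(second, ())):
            if p != q:
                candidates.add(frozenset((p, q)))
    return candidates
```

Calling `candidate_interactions(architectures, domain_pairs)` on this toy data proposes a single candidate pair, protA and protB, because one carries the SH3 domain and the other the ProlineRich domain. A whole-protein description would not expose that link; the domain-level catalog does.
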
Jakobsson said the most important immediate impact of MotifNetwork is what the project adds in the area of the annotation of genes and proteins.

"Just running existing sequences, even if they've already been annotated, through MotifNetwork gives you additional information about the functions," he said. "Sometimes it provides annotations for proteins that were previously not annotated, where the previous annotation programs just didn't pick up anything. Beyond that it lays the foundation for considering protein-protein interactions with much greater facility."

Jakobsson said MotifNetwork has already been used to process several complete proteomes.

"The proteomes we've done so far are Human, Mouse, Honey Bee, Yeast, Cow, Mosquito, and 10 strains of E. coli. We have also processed all the proteins in the Protein Data Bank," he said. "What that means is that we have put all of the protein sequences in each organism, or in the case of the Protein Data Bank, all the proteins of known structure, through the workflow and produced at the other end the functional domain architecture of all of those proteins."

Jakobsson said the computational tool is proving useful at the nanomedicine center he directs. Researchers there are using the MotifNetwork tools and the databases it has produced to understand the control of transport through epithelial tissue in the airway, an important topic for those studying cystic fibrosis. In addition, a cancer research team headed by William Kaufman at the University of North Carolina is using MotifNetwork to study the protein and gene networks that govern DNA repair, with the aim of intervening in those networks in tumors to inhibit or even reverse tumor growth.

"Some of the MotifNetwork products are uniquely equipped to help them understand what would be the most useful targets in those networks for any drug therapies," Jakobsson said.

Future plans are for MotifNetwork to become a comprehensive, petascale bioinformatics computing environment. Jakobsson expects that it will contribute new information about the function of protein domains and extend the concept of the functional module from functional domains and proteins to groupings of domains and proteins that interact with each other. He said that will help researchers to understand functional modules "at the next level of organization that make up networks and pathways that give rise to phenotypes, give rise to the overall behavior of cells, and ultimately of organisms."

One MotifNetwork project involves single nucleotide polymorphisms that have been identified in humans, many of which have been associated with disease.

"We are annotating each of the functional domains as to whether or not they contain single nucleotide polymorphisms and what they are," Jakobsson said. "That really is the foundation for the development of personalized medicine."
