Undergraduate Research | George R. Brown School of Engineering | Rice University

George R. Brown School of Engineering at Rice University Science and Engineering Website Rice University About | Academics | Alumni | Departments | News | Calendar | Contacts | SE Web | Home

Undergraduate Research in Engineering at Rice

Paul Murphy, Senior Computer Science student
Computational Biology Problems with Probabilistic Roadmap Techniques: Energetic Considerations
with Dr. Lydia Kavraki

Probabilistic roadmap techniques from the field of algorithmic robotics promise to become important in the field of computational biology. Problems such as protein folding and drug-docking can be formulated in such ways that these techniques can be applied to them. One fundamental question involved in this formulation, however, is what collision function to use. Most often, the answer has been a collision function related to the energy of the system. Thus, computations of energy and searches biased towards low-energy regions of configuration space are important. The energy of a system is a fundamental value in biological computations. Two important problems in computational biology show that this is the case. The drug-docking problem is generally constructed as follows. Given a receptor and a conformation for it, design a ligand that interacts strongly with the receptor in that conformation. This problem is always reduced to scoring how well a set of candidate ligands interact with the active site of the receptor. The protein-folding problem is generally constructed in this way. Given a sequence or several sequences of amino acids that code for a protein, predict the configurations that the protein will adopt after it is translated. Each problem requires that some configuration space be scored in some way. The energy of the system is reasonable to use, because it affects how long a system will stay in a particular configuration.

The actual energy of a system is determined by Schrodingers equation. However, because of the complexity of solving that equation, imperfect parameterizations are generally used. These parameterizations involve terms that represent: bond length, bond angles, torsional angles, van der Waals-London interactions, electrostatic interactions, hydrogen bonding and other effects. However, acceleration of these computations can be achieved using data structures from computational geometry such orthogonal range structures, or Delaunay triangulations. Much of my summer was spent creating reusable classes involved in the computation of a Delaunay triangulation of a set of points.

To make conformational searches more efficient, it is worthwhile to bias the random sampling involved to low energy regions of space. Examining the local minima of the terms involved in energy calculations is one way of doing so. Assuming that bond lengths and angles do not change is one such simplification that we make when sampling that stems from this idea. Similarly, when generating random conformations, we would like to restrict the range of torsional angles sampled, and to bias the placement of atoms towards low-energy positions as determined by the van der Waals and hydrogen bonding effects. We are still in the process of implementing these biased searches

In my work, I investigated data structures and algorithms that allowed for efficient representation of macromolecules and for efficient calculations of pertinent functions of those molecules, such as their energy. The representation of articulated arms from robotics was a fitting structure. I built upon the code developed by another member of the Kavraki group, Ming Zhang, to add such capabilities as reading data from Tripos' Mol2 file format, and incorporating the definitive bond information available in those files. Having improved this data structure, I created functions that computed the van der Waals-London energy of the structure, and sought ways to improve the speed of their evaluation. At this point I investigated ways of using the Delaunay triangulation to accelerate these calculations. Preliminary tests that I ran showed the possibility that that structure would offer increased speed. Unfortunately, my initial attempts to build a fully-dynamic triangulation were too ambitious, and I was unable to complete that code. At that point, I scaled back my efforts towards the fully-dynamic structure and looked at other triangulation software. I also, at that point, began work on generating random configurations for use in random path planning algorithms. However, I found that purely random configurations were most often very unfavorable energetically. Thus, I began looking into ways of biasing the search towards more favorable regions of configuration space, but, as school was resuming by this point, was not able to implement any of the heuristics I had thought of. I am now seeking to unify several development branches of the code that I had worked on into one package that will be usable by others or me in the future.

Department of Computer Science
Back to Undergraduate Research page

SE Web About the School | Academics & Research | Departments | What's New
Alumni | Calendar | Contacts | SE Web | Duncan Hall Conference Room Reservations

George R. Brown School of Engineering, Rice University   grbsoe@rice.edu