Jana Sperschneider
PhD Project
Supervisors: Prof Amitava Datta and Prof Michael Wise
RNA pseudoknots: a challenging twist
There are two types of nucleic acids in the living cell: deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). For nearly a century, RNA was seen solely as the passive carrier of genetic information from DNA to proteins. In the 1980s, the field was upended by the discovery that RNA can act as a catalyst. Numerous non-coding RNAs have since been discovered, and the prevailing view among scientists is that RNA is an extremely versatile molecule whose range of functions and structures rivals that of proteins.
Because laboratory techniques for determining RNA structure are intricate, biologists need computational structure prediction methods. Several robust and efficient algorithms for RNA secondary structure prediction are available, but for ease of computation they neglect crossing structure elements, so-called pseudoknots. Over the last decade, however, pseudoknots have turned out to be of great biological relevance. RNA pseudoknots perform essential functions as part of the cellular transcription machinery, in regulatory processes and in viral replication. Consequently, prediction of these functional units in RNA sequences has important implications for antiviral drug design.
From a theoretical point of view, general pseudoknot prediction is hard: it has been shown to be an NP-complete problem. Most practical methods reported in the literature suffer from long running times and low accuracy on longer sequences.
Figure 1: Three different views of a simple hairpin type RNA pseudoknot.
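The crossing property that defines a pseudoknot can be made concrete with a short check. The following is a minimal sketch, assuming the structure is given in extended dot-bracket notation in which `()` and `[]` mark two families of base pairs; the function names are illustrative and not taken from any existing tool.

```python
from itertools import combinations


def parse_pairs(structure):
    """Parse dot-bracket notation using () and [] into sorted (i, j) base pairs."""
    closer_to_opener = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closer_to_opener:
            stack = stacks[closer_to_opener[ch]]
            if not stack:
                raise ValueError(f"unbalanced '{ch}' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character '{ch}' at position {pos}")
    if any(stacks.values()):
        raise ValueError("unclosed bracket in structure")
    return sorted(pairs)


def crossing_pairs(pairs):
    """Return every two base pairs (i, j) and (k, l) that satisfy i < k < j < l."""
    return [
        (p, q)
        for p, q in combinations(sorted(pairs), 2)
        if p[0] < q[0] < p[1] < q[1]
    ]


def has_pseudoknot(structure):
    """A structure contains a pseudoknot if at least two base pairs cross."""
    return bool(crossing_pairs(parse_pairs(structure)))
```

For example, `has_pseudoknot("((..))")` is `False` for a plain hairpin, while `has_pseudoknot("((..[[..))..]]")` is `True` because the square-bracket stem crosses the round-bracket stem.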
Proposed Study
Given their biological relevance and abundance, pseudoknots should no longer be neglected in RNA structure prediction for ease of computation. Demand for structure prediction that includes pseudoknots is very high; however, most existing methods are inefficient and unreliable for longer sequences. This study makes two main proposals.
First, the majority of algorithms incorporate pseudoknots into an algorithmic framework designed for secondary structure prediction. Generally, this leads to long running times and the use of a simplified energy model. In contrast, this study proposes a different and much more efficient heuristic route: identify pseudoknot candidates as a first step and subsequently verify them against a sophisticated energy model. After pseudoknot detection, the remaining sequence can be folded using state-of-the-art secondary structure prediction methods.
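The detect-then-verify idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the method developed in this project: stems are found by brute-force scanning, candidates are restricted to the simple hairpin (H-type) arrangement, and `placeholder_energy` is a hypothetical stand-in (one unit per base pair) for the sophisticated energy model the text calls for.

```python
# Watson-Crick and G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_len=3, min_loop=3):
    """Enumerate stems as (i, j, n): base i+k pairs with base j-k for k < n."""
    length = len(seq)
    stems = []
    for i in range(length):
        for j in range(i + 1, length):
            # Start only at the outermost pair of a helix.
            if i > 0 and j + 1 < length and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            n = 0
            # Extend inwards while bases pair and at least min_loop bases stay enclosed.
            while i + n < j - n - min_loop and (seq[i + n], seq[j - n]) in PAIRS:
                n += 1
            if n >= min_len:
                stems.append((i, j, n))
    return stems


def h_type_candidates(stems):
    """Combine two stems whose strands interleave in the H-type order S1, S2, S1', S2'."""
    candidates = []
    for s1 in stems:
        i1, j1, n1 = s1
        for s2 in stems:
            i2, j2, n2 = s2
            # The 5' strand of S2 lies inside the loop enclosed by S1.
            in_loop = i1 + n1 <= i2 and i2 + n2 <= j1 - n1 + 1
            # The 3' strand of S2 lies downstream of S1.
            downstream = j2 - n2 + 1 > j1
            if in_loop and downstream:
                candidates.append((s1, s2))
    return candidates


def placeholder_energy(s1, s2):
    """Hypothetical score: -1 per base pair. Not a real thermodynamic model."""
    return -(s1[2] + s2[2])


def verify(candidates, energy_fn, threshold):
    """Keep candidates whose energy is at or below the threshold, best first."""
    kept = [c for c in candidates if energy_fn(*c) <= threshold]
    return sorted(kept, key=lambda c: energy_fn(*c))


def flanking_segments(seq, s1, s2):
    """Return the sequence before and after the pseudoknot for separate folding."""
    return seq[: s1[0]], seq[s2[1] + 1:]
```

On the constructed test sequence `GGGGAACACAACCCCAAAUGUG`, `h_type_candidates(find_stems(seq))` includes the pair of stems `((0, 14, 4), (6, 21, 4))`. The segments returned by `flanking_segments` would then be handed to a conventional secondary structure predictor of the reader's choice.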
Second, additive energy models are used successfully in RNA secondary structure prediction. For pseudoknots, however, this principle fails: a sophisticated pseudoknot energy model has to take stem-loop correlations into account. This is crucial for successful prediction, yet no method in the literature employs such a model efficiently. The main goal of this study is to develop and incorporate an advanced pseudoknot energy model, resulting in a rapid and reliable pseudoknot prediction method.
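The difference between the two kinds of model can be written schematically. The notation below is an assumption introduced for illustration only: it takes the common H-type picture with two stems S1 and S2 and two loops L1 and L2, in which L1 spans S2 and L2 spans S1, and it does not reproduce the parameterisation of any published model.

```latex
% Additive model: every structural element contributes independently.
\Delta G_{\text{additive}}
  = \Delta G(S_1) + \Delta G(S_2) + \Delta G(L_1) + \Delta G(L_2)

% Correlated model: each loop term is conditioned on the stem it has to span.
\Delta G_{\text{correlated}}
  = \Delta G(S_1) + \Delta G(S_2)
  + \Delta G(L_1 \mid S_2) + \Delta G(L_2 \mid S_1)
```

In the additive form a loop of a given length always costs the same; in the correlated form the same loop can be favourable across a short stem and strained or infeasible across a long one.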
From a practical point of view, the goal of this project is to search a wide range of viral genomes for pseudoknots and, where they are found, record their locations. As structure is strongly related to function, computational prediction can provide the basis for laboratory experiments on detected pseudoknots. Most methods in the literature are tested only on short sequence stretches containing known pseudoknots. In contrast, the benefit of this approach will be demonstrated by applying it to long sequences and ultimately viral genomes.
Selected References
Brierley, I., Pennell, S., and Gilbert, R.J. (2007). Viral RNA pseudoknots: versatile motifs in gene expression and replication. Nat Rev Microbiol, 5(8): 598-610.
Staple, D.W. and Butcher, S.E. (2005). Pseudoknots: RNA structures with diverse functions. PLoS Biol, 3(6): e213.
Sperschneider, J. and Datta, A. (2008). KnotSeeker: Heuristic pseudoknot detection in long RNA sequences. RNA, 14(4): 630-640.
Huang, C.H., Lu, C.L., and Chiu, H.T. (2005). A heuristic approach for detecting RNA H-type pseudoknots. Bioinformatics, 21(17): 3501-3508.
Chen, S.J. (2008). RNA folding: conformational statistics, folding kinetics, and ion electrostatics. Annu Rev Biophys, 37: 197-214.
Cao, S. and Chen, S.J. (2006). Predicting RNA pseudoknot folding thermodynamics. Nucleic Acids Res, 34(9): 2634-2652.
Gultyaev, A.P., van Batenburg, F.H., and Pleij, C.W. (1999). An approximation of loop free energy values of RNA H-pseudoknots. RNA, 5(5): 609-617.

