Main Research Areas and Current Projects

RNA pseudoknots: a challenging twist

RNA is a nucleic acid that has recently attracted attention from research groups worldwide because of its newly discovered functions and roles. It is no longer seen solely as a carrier of genetic information from DNA to proteins. RNA rivals proteins in the diversity of its functions and structures: it adopts diverse three-dimensional folds and can act as a catalyst. It is an extremely versatile molecule that takes part in many processes, including translational regulation, intron splicing, gene expression and cell regulation. Novel noncoding RNAs are discovered continuously, and the exciting RNA world is far from fully explored.

The foundation of RNA structure formation is the contiguous pairing of complementary bases through stable hydrogen bonds, which results in helical stem regions. Stems provide the basis for various structure elements, e.g. single-stranded regions, hairpin loops, interior loops, bulge loops and multiloops. The set of structure elements for an RNA sequence is referred to as its secondary structure. It is important to note that secondary structure elements are, by definition, non-crossing (nested): any two base pairs either lie side by side or one encloses the other.
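The nested property can be made concrete with a short sketch. It assumes the widely used dot-bracket notation, which is not introduced in the text above: "(" and ")" mark the two partners of a base pair and "." marks an unpaired base. The function name parse_dot_bracket is hypothetical and chosen for illustration only. Because the pairs are nested, a simple stack is enough to recover them.

```python
def parse_dot_bracket(structure):
    """Return the base pairs (i, j), 0-based and sorted, of a dot-bracket string.

    Assumes plain dot-bracket notation with '(', ')' and '.' only,
    i.e. a nested (pseudoknot-free) secondary structure.
    """
    open_positions = []  # stack of positions of unmatched '('
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            # The most recent unmatched '(' is the partner: this is the nesting rule.
            pairs.append((open_positions.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# Two stems separated by an interior loop, closed by a hairpin loop.
print(parse_dot_bracket("((..((...))..))"))
# [(0, 14), (1, 13), (4, 10), (5, 9)]
```

A single stack suffices precisely because no two pairs cross; structures with crossing pairs cannot be written with one bracket type.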

Broadly speaking, the model behind computational RNA secondary structure prediction is free energy minimization, which is commonly solved by dynamic programming. Well-established algorithms take O(n^3) time and O(n^2) space for a sequence of length n (see mfold or Vienna RNAfold).
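To illustrate where the cubic time and quadratic space come from, here is a minimal sketch of a Nussinov-style dynamic program. Note that it is a deliberately simplified stand-in: it maximizes the number of nested base pairs rather than minimizing free energy, and it is not the algorithm implemented in mfold or RNAfold. The function names, the allowed pairs (Watson-Crick plus G-U wobble) and the minimum hairpin loop size of 3 are assumptions made for this example.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style recursion).

    best[i][j] holds the optimum for the subsequence seq[i..j]:
    an O(n^2) table, each entry filled in O(n) time, hence O(n^3) overall.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    # Fill the table by increasing subsequence span; short spans stay 0.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j is unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))  # 3: G-C, G-C and G-U around the AAA loop
```

The three nested loops over span, i and k account for the O(n^3) running time, and the table best accounts for the O(n^2) space. Energy-based methods use the same table-filling idea with a more detailed scoring of loops and stacked pairs.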

Figure 1: RNA structure with non-crossing secondary structure elements, i.e. hairpin loops (H), interior loops (I), bulge loops (B) and multiloops (M). Stem regions are indicated by grey areas.

Things become intricate when crossing, i.e. non-nested, structure elements come into play: the so-called pseudoknots. Pseudoknots commonly occur in RNA and perform essential functions as part of the cellular transcription machinery and regulatory processes. Pseudoknots are found in most viruses, and the prediction of these structures in RNA molecules therefore has important implications for antiviral drug design.
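The crossing condition can be stated precisely: two base pairs (i, j) and (k, l) cross if i < k < j < l. The following sketch checks a list of base pairs for such crossings. It is a simple quadratic-time illustration with hypothetical function names and 0-based positions, not one of the prediction algorithms discussed here; it only tests a given structure and does not predict one.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return all pairs of base pairs ((i, j), (k, l)) with i < k < j < l."""
    # Normalize each pair to (smaller, larger) and sort by opening position.
    normalized = sorted(tuple(sorted(pair)) for pair in pairs)
    crossings = []
    for (i, j), (k, l) in combinations(normalized, 2):
        if i < k < j < l:  # (k, l) opens inside (i, j) but closes outside it
            crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(pairs):
    """True if at least two base pairs cross, i.e. the structure is not nested."""
    return bool(crossing_pairs(pairs))


# A nested structure: every pair encloses or sits beside every other pair.
nested = [(0, 14), (1, 13), (4, 10), (5, 9)]
# Two stems that cross each other: the second stem opens inside the
# first stem's loop and closes downstream of the first stem.
crossing = [(0, 10), (1, 9), (5, 15), (6, 14)]

print(has_pseudoknot(nested))    # False
print(has_pseudoknot(crossing))  # True
```

In the second example each pair of the first stem crosses each pair of the second stem, which is exactly the situation a single nested recursion cannot represent.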

From a computer scientist's point of view, including a tertiary structure element like the pseudoknot dampens the optimism about solving the RNA structure prediction problem. It has been shown that general prediction of pseudoknots is an NP-complete problem. Practical dynamic programming algorithms exist only for restricted classes of pseudoknots and are computationally very expensive. Depending on the generality of the pseudoknots they can predict, they require O(n^6), O(n^5), or O(n^4) time, which is impractical for long sequences.

Our research focuses on the development of novel and practical computational tools for detecting and predicting pseudoknot structures (PhD project).

Figure 2: A simple RNA pseudoknot structure with its two crossing stems.