ExpaRNA-P: Simultaneous exact pattern matching and folding of RNAs

Christina Otto, Mathias Möhl, Steffen Heyne, Mika Amit, Gad M. Landau, Rolf Backofen, Sebastian Will

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Identifying sequence-structure motifs common to two RNAs can substantially speed up the comparison of structural RNAs. The core algorithm of the existing approach, ExpaRNA, solves this problem for given input structures. However, such structures are rarely known; moreover, predicting them computationally is no remedy, since single-sequence structure prediction is highly unreliable.

Results: The novel algorithm ExpaRNA-P computes exactly matching sequence-structure motifs in entire Boltzmann-distributed structure ensembles of two RNAs; thereby we match and fold RNAs simultaneously, analogous to the well-known "simultaneous alignment and folding" of RNAs. While this implies much higher flexibility compared to ExpaRNA, ExpaRNA-P has the same very low complexity (quadratic in time and space), which is enabled by its novel structure ensemble-based sparsification. Furthermore, we devise a generalized chaining algorithm to compute compatible subsets of ExpaRNA-P's sequence-structure motifs. We use the best chain as anchor constraints for the sequence-structure alignment tool LocARNA, which results in the very fast RNA alignment approach ExpLoc-P. ExpLoc-P is benchmarked in several variants and against state-of-the-art approaches. In particular, we formally introduce and evaluate strict and relaxed variants of the problem; the latter makes the approach sensitive to compensatory mutations. Across a benchmark set of typical non-coding RNAs, ExpLoc-P has similar accuracy to LocARNA but is four times faster (in both variants), and it achieves a speed-up of over 30-fold for the longest benchmark sequences (≈400 nt). Finally, different ExpLoc-P variants enable tailoring of the method to specific application scenarios. ExpaRNA-P and ExpLoc-P are distributed as part of the LocARNA package. The source code is freely available.

Conclusions: ExpaRNA-P's novel ensemble-based sparsification reduces its complexity to quadratic time and space. Thereby, ExpLoc-P significantly speeds up sequence-structure alignment while maintaining the alignment quality. Different ExpLoc-P variants support a wide range of applications.
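The chaining step mentioned in the abstract can be illustrated with a deliberately simplified sketch. It assumes each match is a plain ungapped segment pair, whereas the motifs in the paper are structural and its chaining algorithm is generalized accordingly, so this is not the authors' method. The names `Match` and `best_chain` are hypothetical, and scoring a match by its length is an assumption made only for the example. The sketch selects a maximum-score set of matches that are compatible, meaning they neither overlap nor cross in either sequence.

```python
from typing import List, NamedTuple, Tuple


class Match(NamedTuple):
    """Hypothetical ungapped exact match between two RNAs."""
    start_a: int   # first matched position in RNA A (0-based)
    start_b: int   # first matched position in RNA B (0-based)
    length: int    # number of consecutively matched positions
    score: float   # weight of the match (assumed: its length)


def best_chain(matches: List[Match]) -> Tuple[float, List[Match]]:
    """Return the best total score and one optimal chain of compatible matches.

    Two matches are compatible when one ends before the other starts
    in both sequences. Runs in O(n^2) time for n matches.
    """
    if not matches:
        return 0.0, []

    # Any valid predecessor starts earlier in A, so sorting by start_a
    # guarantees predecessors are processed first.
    ordered = sorted(matches, key=lambda m: (m.start_a, m.start_b))
    best = [m.score for m in ordered]   # best chain score ending at match i
    prev = [-1] * len(ordered)          # back-pointer for traceback

    for i, cur in enumerate(ordered):
        for j in range(i):
            p = ordered[j]
            ends_before = (p.start_a + p.length <= cur.start_a
                           and p.start_b + p.length <= cur.start_b)
            if ends_before and best[j] + cur.score > best[i]:
                best[i] = best[j] + cur.score
                prev[i] = j

    # Trace back from the best-scoring chain end.
    k = max(range(len(ordered)), key=best.__getitem__)
    total = best[k]
    chain = []
    while k != -1:
        chain.append(ordered[k])
        k = prev[k]
    chain.reverse()
    return total, chain


example = [
    Match(0, 0, 3, 3.0),
    Match(2, 5, 4, 4.0),   # overlaps the first match in A, crosses later ones in B
    Match(4, 4, 2, 2.0),
    Match(7, 7, 3, 3.0),
]
total, chain = best_chain(example)
```

In this example the second match is left out of the chain, because keeping it would conflict with every other match. A chain chosen this way is the kind of result that could be handed to an aligner as anchor constraints, which is how the abstract describes the best chain being used.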

Original language: English
Article number: 404
Journal: BMC Bioinformatics
Volume: 15
Issue number: 1
State: Published - 31 Dec 2014

Bibliographical note

Publisher Copyright:
© Otto et al.; licensee BioMed Central.

Keywords

  • RNA bioinformatics
  • Sparsification
  • Structure-based comparison of RNA

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
