Background: Identifying sequence-structure motifs common to two RNAs can speed up the comparison of structural RNAs substantially. The core algorithm of the existing approach solves this problem for input structures. However, such structures are rarely known; moreover, predicting them computationally is no remedy, since single-sequence structure prediction is highly unreliable.

Results: The novel algorithm computes exactly matching sequence-structure motifs in entire Boltzmann-distributed structure ensembles of two RNAs; thereby we match and fold RNAs simultaneously, analogous to the well-known "simultaneous alignment and folding" of RNAs. While this implies much higher flexibility compared to, has the same very low complexity (quadratic in time and space), which is enabled by its novel structure ensemble-based sparsification. Furthermore, we devise a generalized chaining algorithm to compute compatible subsets of 's sequence-structure motifs. Resulting in the very fast RNA alignment approach, we utilize the best chain as anchor constraints for the sequence-structure alignment tool. is benchmarked in several variants and versus state-of-the-art approaches. In particular, we formally introduce and evaluate strict and relaxed variants of the problem; the latter makes the approach sensitive to compensatory mutations. Across a benchmark set of typical non-coding RNAs, has similar accuracy to but is four times faster (in both variants), while it achieves a speed-up of over 30-fold for the longest benchmark sequences (≈400 nt). Finally, different variants enable tailoring of the method to specific application scenarios. and are distributed as part of the package. The source code is freely available at.

Conclusions: 's novel ensemble-based sparsification reduces its complexity to quadratic time and space. Thereby, significantly speeds up sequence-structure alignment while maintaining the alignment quality. Different variants support a wide range of applications.
State: Published - 31 Dec 2014
Bibliographical note
Funding Information:
This work was partially supported by the German Research Foundation (BA 2168/3-3 and MO 2402/1-1), the German Federal Ministry of Education and Research (BMBF, grant 0316165A e:Bio RNAsys to RB), the National Science Foundation (Award 0904246 to GML), the Israel Science Foundation (grant 347/09 to GML), and the United States-Israel Binational Science Foundation (BSF) and DFG (grant 2008217 to GML). We thank the anonymous reviewers for their help to improve the paper. Finally, we acknowledge support from the German Research Foundation (DFG) and Leipzig University within the program of Open Access Publishing.
© Otto et al.; licensee BioMed Central.
Keywords
- RNA bioinformatics
- Structure-based comparison of RNA
ASJC Scopus subject areas
- Structural Biology
- Molecular Biology
- Computer Science Applications
- Applied Mathematics