Abstract
Background: Identifying sequence-structure motifs common to two RNAs can substantially speed up the comparison of structural RNAs. The core algorithm of the existing approach solves this problem for given input structures. However, such structures are rarely known, and predicting them computationally is no remedy, since structure prediction from a single sequence is highly unreliable.

Results: The novel algorithm computes exactly matching sequence-structure motifs in the entire Boltzmann-distributed structure ensembles of two RNAs; it thereby matches and folds RNAs simultaneously, analogous to the well-known "simultaneous alignment and folding" of RNAs. Although this implies much higher flexibility than the existing approach, the novel algorithm has the same very low complexity (quadratic in time and space), which is enabled by its novel structure ensemble-based sparsification. Furthermore, we devise a generalized chaining algorithm to compute compatible subsets of the algorithm's sequence-structure motifs. We use the best chain as anchor constraints for a sequence-structure alignment tool, which results in a very fast RNA alignment approach. This alignment approach is benchmarked in several variants and against state-of-the-art approaches. In particular, we formally introduce and evaluate strict and relaxed variants of the problem; the latter makes the approach sensitive to compensatory mutations. Across a benchmark set of typical non-coding RNAs, the alignment approach achieves accuracy similar to that of the compared alignment tool but is four times faster (in both variants), and it achieves a speed-up of more than 30-fold for the longest benchmark sequences (≈400 nt). Finally, different variants enable tailoring of the method to specific application scenarios. The matching algorithm and the alignment approach are distributed as part of a software package. The source code is freely available.

Conclusions: The novel algorithm's ensemble-based sparsification reduces its complexity to quadratic time and space. Thereby, the alignment approach significantly speeds up sequence-structure alignment while maintaining alignment quality. Different variants support a wide range of applications.
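The Results mention a chaining step that selects compatible subsets of motifs. The abstract does not describe the generalized chaining algorithm itself, so the following is only a minimal sketch of the basic idea, under strong simplifying assumptions. Each motif is reduced to one contiguous interval per sequence (real sequence-structure motifs are not contiguous and can nest). Two motifs count as compatible when they are ordered consistently and do not overlap in either sequence. The names `Motif` and `best_chain` are hypothetical and are not taken from the published software.

```python
from typing import List, NamedTuple, Tuple


class Motif(NamedTuple):
    # Hypothetical simplified motif: one contiguous interval in each RNA.
    a_start: int   # first position in sequence A
    a_end: int     # last position in sequence A (inclusive)
    b_start: int   # first position in sequence B
    b_end: int     # last position in sequence B (inclusive)
    score: float   # weight of the motif, e.g. its size


def best_chain(motifs: List[Motif]) -> Tuple[float, List[Motif]]:
    """Return the highest-scoring chain of collinear, non-overlapping motifs.

    Simple quadratic dynamic programme over the motifs; this is a textbook
    chaining scheme, not the paper's generalized chaining algorithm.
    """
    if not motifs:
        return 0.0, []

    # Any motif that can precede another one ends earlier in sequence A,
    # so sorting by end position yields a valid processing order.
    order = sorted(motifs, key=lambda m: (m.a_end, m.b_end))
    n = len(order)
    best = [m.score for m in order]   # best chain score ending at motif j
    prev = [-1] * n                   # predecessor index for traceback

    for j in range(n):
        for i in range(j):
            precedes = (order[i].a_end < order[j].a_start
                        and order[i].b_end < order[j].b_start)
            if precedes and best[i] + order[j].score > best[j]:
                best[j] = best[i] + order[j].score
                prev[j] = i

    # Trace back from the best chain end.
    k = max(range(n), key=lambda idx: best[idx])
    total = best[k]
    chain = []
    while k != -1:
        chain.append(order[k])
        k = prev[k]
    chain.reverse()
    return total, chain
```

In an anchor-constrained alignment, the motifs of the returned chain would then fix which positions of the two RNAs must be aligned to each other, which is what restricts the search space of the subsequent alignment.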
Original language | English |
---|---|
Article number | 404 |
Journal | BMC Bioinformatics |
Volume | 15 |
Issue number | 1 |
DOIs | |
State | Published - 31 Dec 2014 |
Bibliographical note
Publisher Copyright: © Otto et al.; licensee BioMed Central.
Keywords
- RNA bioinformatics
- Sparsification
- Structure-based comparison of RNA
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Applied Mathematics