Abstract
Motivation: The nucleosome DNA positioning pattern is known to be one of the weakest (highly degenerate) patterns. The alignment procedure that has been developed recently for the extraction of such a pattern is based on a statistical matching of the sequences, and its success depends on the pattern/background ratio in the individual sequences and in the generated pattern. The heuristic nature of the method and the distinctive properties of the pattern raise the question of the efficiency and sensitivity of the procedure. This paper presents a method of verification for this multiple sequence alignment algorithm.

Results: To verify the applicability of the multiple alignment approach, we constructed a set of sequences carrying a hidden pattern. The pattern was represented by weak ('signal') oscillations in the occurrence of AA and TT dinucleotides along otherwise random sequences. Only a few dinucleotides of any given 145-base-long sequence would correspond to the signal, appearing in about the same phase within the simulated periodic pattern. The novelty of our simulation approach is that we simulated the database as a whole, as opposed to simulating each sequence separately. The correlation between the hidden pattern and a sequence from the database is negligible on average, but our statistical multicycle alignment procedure produced a pattern with attributes very close to the simulated ones. The accuracy of the procedure was tested and calibrated. The presence in a typical sequence of as few as three dinucleotides corresponding to the signal is sufficient to generate (detect) the pattern hidden in a collection of 204 sequences.

Availability: The programs for the multiple sequence alignment algorithm and the database simulation are available from the authors free of charge. Requests should be accompanied by a 3.5″ diskette.

Contact: E-mail: [email protected].
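As an illustration of the database-level simulation described in the Results, the following is a minimal Python sketch and not the authors' program. It builds 204 random sequences of 145 bases and then scatters signal AA/TT dinucleotides over the database as a whole, so that a sequence carries about three of them on average. The function name `simulate_database`, the 10-base period, the raised-cosine position weights, and the equal treatment of AA and TT are all assumptions made for the example; the abstract does not specify them.

```python
import math
import random


def simulate_database(n_seqs=204, seq_len=145, mean_signal=3, period=10, seed=0):
    """Hypothetical sketch: random sequences with a weak periodic AA/TT signal.

    The period and the cosine weighting are illustrative assumptions,
    not values taken from the paper.
    """
    rng = random.Random(seed)

    # Background: uniformly random bases.
    seqs = [[rng.choice("ACGT") for _ in range(seq_len)] for _ in range(n_seqs)]

    # Positions where a dinucleotide can start, weighted by a raised cosine
    # so that signal dinucleotides cluster around the same phase.
    positions = list(range(seq_len - 1))
    weights = [1.0 + math.cos(2 * math.pi * p / period) for p in positions]

    # Distribute the signal over the database as a whole: each signal
    # dinucleotide picks a random sequence and a phase-weighted position.
    planted = []
    for _ in range(n_seqs * mean_signal):
        s = rng.randrange(n_seqs)
        p = rng.choices(positions, weights=weights, k=1)[0]
        dinuc = rng.choice(("AA", "TT"))
        seqs[s][p] = dinuc[0]
        seqs[s][p + 1] = dinuc[1]
        planted.append((s, p, dinuc))

    return ["".join(s) for s in seqs], planted


if __name__ == "__main__":
    database, planted = simulate_database()
    print(len(database), "sequences,", len(planted), "signal dinucleotides planted")
```

Because the signal is assigned per database rather than per sequence, individual sequences receive a variable number of signal dinucleotides (some none at all), and later placements may overwrite earlier ones. This keeps the per-sequence correlation with the hidden pattern low while the pattern remains present in the collection as a whole.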
Original language | English |
---|---|
Pages (from-to) | 383-389 |
Number of pages | 7 |
Journal | Bioinformatics |
Volume | 12 |
Issue number | 5 |
State | Published - 1996 |
Externally published | Yes |
Bibliographical note
Funding Information: The authors are thankful to E. Kolker for kindly providing the original program of spectral analysis. A.B. is supported by the 'B. de Rothschild Fund for the Advancement of Science in Israel' and the National Laboratory for Bioinformatics and DNA Sequencing of the Israel Council for Higher Education. I.I. is supported by an L. Bein WIS scholarship.
ASJC Scopus subject areas
- Statistics and Probability
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics