Intuitively, the complexity of a given DNA sequence is related to the number of various superimposed biological messages it contains. Here we assess the expectation that in nucleosome DNA sequences of lower linguistic complexity, the nucleosome DNA positioning pattern would be more pronounced than in those of higher linguistic complexity. The nucleosome DNA positioning pattern is one of the weakest (highly degenerate) sequence patterns. It has been extracted recently by specially designed multiple alignment procedures. We applied the most sensitive of these procedures to nearly equal subsets of a nucleosome database separated according to linguistic complexity. The pattern extracted from the subset of the simpler nucleosome sequences not only possesses all major attributes of the known nucleosomal pattern, but is substantially stronger with respect to amplitude in comparison with the total database. This result constitutes the first demonstration that a weak pattern can be significantly enhanced by selective treatment of a lower complexity subset of the sequence ensemble under consideration.
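As a rough illustration of the separation step described above, the following sketch scores each sequence by linguistic complexity and splits a collection into two nearly equal halves. It assumes the commonly used vocabulary-usage definition (the product, over word lengths k, of the ratio of observed distinct k-mers to the maximum possible number). The abstract does not state the exact formula or word-length range the authors used, so the function names, the `max_k` parameter, and the median cutoff are hypothetical choices made for illustration only.

```python
from statistics import median


def linguistic_complexity(seq, max_k=6, alphabet_size=4):
    """Product over k = 1..max_k of (observed distinct k-mers / maximum possible).

    Assumed definition; the maximum possible count for a sequence of length L is
    min(alphabet_size**k, L - k + 1).
    """
    seq = seq.upper()
    lc = 1.0
    for k in range(1, max_k + 1):
        n_windows = len(seq) - k + 1
        if n_windows <= 0:
            break  # sequence is shorter than the word length
        observed = len({seq[i:i + k] for i in range(n_windows)})
        possible = min(alphabet_size ** k, n_windows)
        lc *= observed / possible
    return lc


def split_by_complexity(seqs, max_k=6):
    """Split sequences at the median score (a hypothetical cutoff choice)."""
    scored = [(linguistic_complexity(s, max_k), s) for s in seqs]
    cutoff = median(score for score, _ in scored)
    simple = [s for score, s in scored if score <= cutoff]
    complex_ = [s for score, s in scored if score > cutoff]
    return simple, complex_
```

A call such as `split_by_complexity(sequences)` returns the lower-complexity and higher-complexity subsets, which could then be passed separately to whatever alignment procedure is in use.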
Bibliographical note
Funding Information:
The authors are thankful to E. Kolker, who kindly provided an original program for spectral analysis, and to Dr E. Shpigelman for his version of a program for the linguistic complexity calculations. The authors would also like to express their gratitude to Drs S. Brunak and H. Herzel for their invaluable suggestions during the preparation of the manuscript. A.B. was supported by the Bat Sheva de Rothschild Fund for the Advancement of Science in Israel and the National Laboratory for Bioinformatics and DNA Sequencing of the Israel Council for Higher Education and is supported by the Danish National Research Foundation. I.I. was supported by a L. Bein WIS scholarship. K.S. received the Clarice D. Kaufmann Scholarship to the 28th Dr Bessie F. Lawrence International Summer Science Institute.