Sequence complexity and DNA curvature

Andrei Gabrielian, Alexander Bolshoy

Research output: Contribution to journal › Article › peer-review


A linguistic complexity measure was applied to the complete genomes of HIV-1, Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Mycoplasma genitalium, and to long human and yeast genomic fragments. Complexity values averaged over entire genomic sequences were compared, as were predicted average values of intrinsic DNA curvature. We found that both the most curved and the least complex fragments are located preferentially in non-coding parts of the genome. Analysis of the locations of the most curved and the simplest regions in bacteria showed that the low-complexity segments are preferentially located in close proximity to the highly curved sequences, which are, in turn, placed 100 to 200 bases upstream of the start of the nearest coding sequence. We conclude that the parallel analysis of sequence complexity and DNA curvature might provide important information about the sequence-structure-function relationship in genomes.

Original language: English
Pages (from-to): 263-274
Number of pages: 12
Journal: Computers and Chemistry
Issue number: 3-4
State: Published - 15 Jun 1999

Bibliographical note

Funding Information:
The authors would like to express their gratitude to Drs. A. Konopka, E.N. Trifonov, and D. Landsman for their invaluable suggestions during the preparation of the manuscript. Dr. Birgit An der Lan provided excellent editorial assistance. A.B. was supported by the Danish National Research Foundation and NCBI Scientific Visitors Program at the National Library of Medicine, NIH.


  • Complexity
  • DNA curvature
  • Linguistic analysis
  • Nucleotide composition

ASJC Scopus subject areas

  • Applied Microbiology and Biotechnology
  • General Chemical Engineering
  • Biotechnology

