Biochemical and Biophysical Research Communications, Vol.344, No.2, 612-616, 2006
An investigation of genomic base distribution
While veritable oceans of ink have been spilled over the base distributions within genes, the literature is virtually silent on large-scale intra-genomic base distribution. To address this issue, we have examined approximately 3400 chromosomal sequences from approximately 2000 entire genomes, including DNA and RNA, single- and double-stranded, coding and non-coding genomes. For each sequence the mean, variance, skewness, and kurtosis for each base were computed along with the genome base composition. The main findings are: (1) there is no simple relationship between these statistics and the base composition of the genome, (2) in non-viral genomes, base distribution is non-uniform, (3) base distribution in non-eukaryotic genomes obeys a number of simple rules, (4) these rules are not dependent on the presence of coding sequences, (5) bacterial genomes in particular are unusually compliant with these rules, and (6) eukaryotes have a unique pattern of base distribution. © 2006 Elsevier Inc. All rights reserved.
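
The abstract does not say how the four statistics were defined, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes each statistic is a moment of the positions at which a given base occurs along the sequence, with positions rescaled to the interval (0, 1), and that kurtosis is reported as excess kurtosis. The function name `base_position_stats` and the rescaling are hypothetical choices made for this example.

```python
from collections import defaultdict


def base_position_stats(sequence):
    """Per-base composition and positional moments for one sequence.

    Assumption (not from the abstract): the statistics describe where
    each base sits along the sequence, with positions rescaled to (0, 1).
    """
    seq = sequence.upper()
    n = len(seq)
    positions = defaultdict(list)
    for i, base in enumerate(seq):
        # Midpoint rescaling keeps every position strictly inside (0, 1).
        positions[base].append((i + 0.5) / n)

    stats = {}
    for base, xs in positions.items():
        k = len(xs)
        mean = sum(xs) / k
        # Population (biased) central moments of the positions.
        m2 = sum((x - mean) ** 2 for x in xs) / k
        m3 = sum((x - mean) ** 3 for x in xs) / k
        m4 = sum((x - mean) ** 4 for x in xs) / k
        stats[base] = {
            "composition": k / n,  # fraction of the sequence made of this base
            "mean": mean,
            "variance": m2,
            # Moments are undefined when all positions coincide; report 0.0.
            "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
            # Excess kurtosis (a normal distribution would score 0).
            "kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,
        }
    return stats


if __name__ == "__main__":
    for base, values in sorted(base_position_stats("ACGT" * 250).items()):
        print(base, values)
```

Under these assumptions, a base scattered uniformly along a sequence would give a mean near 0.5, a variance near 1/12, a skewness near 0, and an excess kurtosis near -1.2, so departures from those values are one possible way to quantify non-uniform distribution.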