Arabidopsis thaliana has a relatively small genome of approximately 130 Mb containing about 10% repetitive DNA. Genome sequencing studies reveal a gene-rich genome, predicted to contain approximately 25 000 genes spaced on average every 4.5 kb. Between 10 to 20% of the predicted genes occur as clusters of related genes, indicating that local sequence duplication and subsequent divergence generates a significant proportion of gene families. In addition to gene families, repetitive sequences comprise individual and small clusters of two to three retroelements and other classes of smaller repeats. The clustering of highly repetitive elements is a striking feature of the A. thaliana genome emerging from sequence and other analyses.