In this paper we give a short review of the problems of homoplasy and collision in AFLP (amplified fragment length polymorphism) and describe a software tool that we developed to illustrate them. AFLP is a DNA fingerprinting technique that produces profiles of bands by separating DNA fragments by length on a gel or microcapillary system. The profiles are usually interpreted as binary band presence/absence patterns. We focus on two major problems. (1) Within a profile, two or more fragments of the same length but of different genomic origin may have been selected, colliding into a single band. This collision problem, akin to the birthday problem, can be surprisingly large. (2) In a pair of profiles, two equally long fragments of different genomic origin may have been selected, appearing as identical bands in the two profiles. This is called homoplasy. Both problems are quantified by modeling AFLP as random sampling of fragment lengths. AFLP may be used in phylogenetic studies to estimate the pairwise genetic similarity of individuals. Because of homoplasy, similarity coefficients such as the Dice and Jaccard coefficients overestimate the true genetic similarity, and the bias increases with the number of bands per profile. We describe corrected estimators that do not suffer from this bias. The ideas are illustrated using the new software tool. Data from studies on Arabidopsis and tomato serve as examples. Finally, we make some recommendations with respect to the use of AFLP.
- fragment length distributions
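
The birthday-problem analogy for collisions, and the Dice and Jaccard coefficients mentioned above, can be made concrete with a minimal Python sketch. It is not the software tool described in this paper. It assumes, purely for illustration, that fragment lengths are drawn uniformly from a fixed number of scorable lengths, whereas real fragment length distributions are not uniform. The function names and the example numbers (451 lengths, 80 fragments) are hypothetical. The similarity functions are the standard uncorrected coefficients, not the corrected estimators.

```python
from math import prod


def prob_any_collision(n_fragments: int, n_lengths: int) -> float:
    """Probability that at least two fragments share a length.

    Assumption (not from the paper): lengths are uniform over n_lengths.
    This is the classical birthday-problem formula.
    """
    if n_fragments > n_lengths:
        return 1.0
    return 1.0 - prod(1.0 - i / n_lengths for i in range(n_fragments))


def expected_bands(n_fragments: int, n_lengths: int) -> float:
    """Expected number of distinct bands under the same uniform assumption."""
    return n_lengths * (1.0 - (1.0 - 1.0 / n_lengths) ** n_fragments)


def dice(bands_a: set, bands_b: set) -> float:
    """Uncorrected Dice coefficient: 2 * shared / (bands in A + bands in B)."""
    total = len(bands_a) + len(bands_b)
    return 2.0 * len(bands_a & bands_b) / total if total else 0.0


def jaccard(bands_a: set, bands_b: set) -> float:
    """Uncorrected Jaccard coefficient: shared / bands present in either."""
    union = len(bands_a | bands_b)
    return len(bands_a & bands_b) / union if union else 0.0


if __name__ == "__main__":
    # Hypothetical setting: 451 scorable lengths, 80 fragments per profile.
    n_lengths, n_fragments = 451, 80
    print("P(at least one collision):", prob_any_collision(n_fragments, n_lengths))
    print("Expected distinct bands:  ", expected_bands(n_fragments, n_lengths))

    # Two toy profiles, each given as a set of observed band lengths.
    profile_a = {101, 120, 133, 150}
    profile_b = {120, 133, 170, 180}
    print("Dice:   ", dice(profile_a, profile_b))
    print("Jaccard:", jaccard(profile_a, profile_b))
```

Under this simplified model the expected number of distinct bands falls short of the number of fragments, and the shortfall grows as more fragments are sampled. This is the same effect as the collision problem, although the quantitative results depend on the actual fragment length distribution.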