genetic marker
genetic marker, any alteration in a sequence of nucleic acids or other genetic trait that can be readily detected and used to identify individuals, populations, or species or to identify genes involved in inherited disease. Genetic markers consist primarily of polymorphisms, which are discontinuous genetic variations that divide individuals of a population into distinct forms (e.g., blood type A versus blood type B in the ABO system, or blond hair versus red hair). Genetic markers play a key role in genetic mapping, specifically in identifying the positions of different alleles that are located close to one another on the same chromosome and tend to be inherited together. Such linkage groups can be used to identify unknown genes that influence disease risk. Technological advances, especially in DNA sequencing, have greatly increased the catalogue of variable sites in the human genome.
Multiple types of polymorphisms serve as genetic markers, including single nucleotide polymorphisms (SNPs), simple sequence length polymorphisms (SSLPs), and restriction fragment length polymorphisms (RFLPs). SSLPs are repeat-sequence variations and include minisatellites (variable number of tandem repeats, or VNTRs) and microsatellites (short tandem repeats, or STRs, also called simple tandem repeats). Insertions/deletions (indels) are another type of genetic marker.
In the human genome, the most common types of markers are SNPs, STRs, and indels. SNPs affect only one of the basic building blocks (adenine [A], guanine [G], thymine [T], or cytosine [C]) in a DNA segment. For example, at a genomic location with the sequence ACCTGA in most individuals, some persons may carry ACGTGA instead. The third position in this example would be considered an SNP, since either a C or a G allele can occur at the variable position. Because every individual inherits one copy of DNA from each parent, every person has two copies of each such location. As a result, in the above example, three genotypes are possible: homozygous CC (two copies of the C allele at the variable position), heterozygous CG (one C and one G allele), and homozygous GG (two G alleles). The three genotype groups can be used as “exposure” categories to assess associations with an outcome of interest in a genetic epidemiology setting. Should such an association be identified, researchers may investigate the marked genomic region further to identify the particular DNA sequence in that region that has a direct biological effect on the outcome of interest.
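The genotype reasoning for the ACCTGA/ACGTGA example can be sketched in a few lines of Python. This is only an illustration: the function name snp_genotypes and the variable names are hypothetical, and the sketch assumes two equal-length sequences and treats a genotype as an unordered pair of alleles.

```python
from itertools import combinations_with_replacement

def snp_genotypes(alleles):
    """Return every unordered pair of alleles (one inherited from each parent).

    Hypothetical helper for illustration; not part of any genetics library.
    """
    return ["".join(pair) for pair in combinations_with_replacement(sorted(alleles), 2)]

# The two versions of the sequence from the example in the text.
common_sequence = "ACCTGA"
variant_sequence = "ACGTGA"

# Locate the positions (0-based) at which the two sequences differ.
variable_positions = [
    i for i, (a, b) in enumerate(zip(common_sequence, variant_sequence)) if a != b
]

# Collect the alleles observed at the single variable position.
position = variable_positions[0]
alleles = {common_sequence[position], variant_sequence[position]}

genotypes = snp_genotypes(alleles)
print(f"SNP at position {position + 1}: alleles {sorted(alleles)}, genotypes {genotypes}")
```

Under these assumptions the sketch reports the third position as variable, with alleles C and G and the three genotypes CC, CG, and GG.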
STRs are markers in which a piece of sequence is repeated several times in a row, and the number of repeats (considered an allele) is variable within and across individuals. For example, a CCT pattern may be repeated up to 10 times, such that individuals in the population may have genotypes at that locus (chromosomal location) representing any combination of two repeat alleles of sizes 1 to 10 repeats. With 10 possible alleles, there are 10 homozygous and 45 heterozygous combinations, or 10(10 + 1)/2 = 55 different possible genotypes. Indels are polymorphisms in which a piece of DNA sequence exists in some versions (insertion allele) and is deleted in others (deletion allele) in the population.
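The count of 55 STR genotypes follows from the general rule that n alleles give n(n + 1)/2 unordered genotypes: n homozygous pairs plus n(n − 1)/2 heterozygous pairs. The following Python sketch checks the formula by brute-force enumeration; the function name count_genotypes is hypothetical, and the sketch assumes that every repeat count from 1 to 10 occurs as an allele.

```python
from itertools import combinations_with_replacement

def count_genotypes(n_alleles):
    """Number of unordered allele pairs for a locus with n_alleles alleles."""
    return n_alleles * (n_alleles + 1) // 2

# Alleles are repeat counts: 1 repeat of CCT up to 10 repeats of CCT.
repeat_alleles = range(1, 11)

# Enumerate every genotype as an unordered pair of repeat counts.
genotypes = list(combinations_with_replacement(repeat_alleles, 2))
homozygous = [g for g in genotypes if g[0] == g[1]]
heterozygous = [g for g in genotypes if g[0] != g[1]]

print(len(homozygous), len(heterozygous), len(genotypes))
print(count_genotypes(len(repeat_alleles)))
```

The enumeration and the closed-form count agree, which is a quick way to confirm the arithmetic for any other number of alleles.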