Wednesday, July 31, 2019

Interpreting GEDmatch AYPR results

If a child inherits the same allele from both parents at a given location, so that the paternal and maternal chromosomes match there, the child is said to be homozygous at that location. When parents are biologically related they can share strings of alleles inherited from common ancestors, and both can pass on the shared string, or part of it, to a child. The resulting homozygous segments are known as runs of homozygosity (ROH), and GEDmatch has a handy tool to calculate the extent of such segments in our DNA. This can be helpful when you don’t know who one or both of your biological parents are: a) having related parents makes it harder to sort DNA matches into genetic networks and to predict relationships to those matches, so it is best to be forewarned, and b) it gives you useful information if you already know who one parent is.

The meaning of the total shared segments reported by the GEDmatch Are Your Parents Related? (AYPR) tool is not obvious. Some genetic genealogists have pointed out that multiplying the total by a factor of four approximates the amount of DNA shared by the child’s parents, from which we can infer some possible relationships between unknown parents (see Homozygosity on the ISOGG Wiki for blog links). I wanted to try to explain why, to myself and others.

The first column of figures in the table below will be familiar to genetic genealogists used to working with autosomal DNA statistics associated with the theoretical coefficient of relationship for different relationships (A). The last column of figures containing theoretical total homozygous autosomal segments of the child (H) is based on the theoretical coefficient of inbreeding for different relationships (half the relationship coefficient). I have tried to keep this explanation as simple as possible, assuming an otherwise outbred ancestry. I have listed my scientific sources at the end for those who want to know more about the underlying theory based on Mendelian genetics.

In the table, L is shorthand for the total length of the 22 autosomes in centiMorgans (cM). Because we have two of each chromosome, shared DNA percentages apply to double that length i.e. 2L. The theoretical proportion of a child’s autosomal genome length (L) that will be homozygous is the child’s inbreeding coefficient. We can convert these percentages to cM using the appropriate factor, 2L or L. The GEDmatch AYPR result corresponds to the figures in the last column (H).


| Degree of parental relationship with examples | Theoretical total shared autosomal DNA of parents (A) | Theoretical total homozygous autosomal segments of child (H) |
|---|---|---|
| First (parent/child, full siblings) | 50% x 2L = 100% x L | 25% x L |
| Second (half-siblings, uncle/niece, grandparent/grandchild, double first cousins) | 25% x 2L = 50% x L | 12.5% x L |
| Third (first cousins) | 12.5% x 2L = 25% x L | 6.25% x L |
| Fourth (first cousins once removed) | 6.25% x 2L = 12.5% x L | 3.125% x L |
| Fifth (second cousins) | 3.125% x 2L = 6.25% x L | 1.5625% x L |
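
The pattern in the table can be reproduced with a short Python sketch. It assumes, as the table's percentages suggest, that the parents' coefficient of relationship halves with each degree, and that the child's inbreeding coefficient is half of that. The function name `table_row` and the value used for L are illustrative assumptions only; 3500 cM is a round placeholder, not a figure taken from GEDmatch or from the sources listed at the end.

```python
from fractions import Fraction

def table_row(degree, autosome_length_cm):
    """Return (A, H) in cM for parents related at the given degree."""
    r = Fraction(1, 2) ** degree                      # parents' coefficient of relationship
    f = r / 2                                         # child's inbreeding coefficient
    shared_by_parents = r * 2 * autosome_length_cm    # column A: r x 2L
    homozygous_in_child = f * autosome_length_cm      # column H: F x L
    return shared_by_parents, homozygous_in_child

L_CM = 3500  # ASSUMPTION: illustrative round value for L, not from the article

labels = ["First", "Second", "Third", "Fourth", "Fifth"]
for degree, label in enumerate(labels, start=1):
    a, h = table_row(degree, L_CM)
    print(f"{label}: A = {float(a):.1f} cM, H = {float(h):.1f} cM, A / H = {a / h}")
```

Whatever value is chosen for L, the ratio A / H comes out as 4 in every row, which is the factor of four mentioned above.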


Consider the example of half-siblings M and F below, where their shared parent P has a section of DNA at a specific chromosomal location that I will call pq, the p half inherited from one of their parents and the q half from the other. Two out of four (50%) possible ways M and F can inherit DNA from P at this location result in them being identical by descent (IBD) at that location. I will call the IBD half i and the half each child inherits from their other parent j and k. One out of four (25%) possible ways a child C of M and F can inherit DNA from both parents at this location results in them being homozygous at that location.

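| Parent P (pq) passes to child M | Parent P (pq) passes to child F | IBD probability | Half-sibling parent M (ij) passes to child C | Half-sibling parent F (ik) passes to child C | ROH probability |
|---|---|---|---|---|---|
| p | p | 25% | i | i | 25% |
| q | q | 25% | i | k | |
| p | q | | j | i | |
| q | p | | j | k | |
| | | Total 50% | | | Total 25% |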

Half-siblings M and F share 50% x L (25% x 2L)

ROH segments of their child = 50% x 25% x L = 12.5% x L
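
The half-sibling figures can be checked by brute-force enumeration. The sketch below is only an illustration of the counting argument, not a model of real recombination: it treats a single location, assumes every inheritance path is equally likely, and tracks the shared parent's halves p and q directly rather than relabelling the IBD half as i. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def half_sibling_probabilities():
    """Enumerate one location for half-siblings M and F and their child C."""
    # Each genotype is (half from shared parent P, half from the other parent).
    m_options = [("p", "j"), ("q", "j")]   # M gets p or q from P, j from the other parent
    f_options = [("p", "k"), ("q", "k")]   # F gets p or q from P, k from the other parent

    ibd = roh = roh_trials = 0
    for m, f in product(m_options, f_options):    # 4 ways M and F inherit from P
        if m[0] == f[0]:                          # M and F received the same half of P
            ibd += 1
        for from_m, from_f in product(m, f):      # 4 ways C inherits from M and F
            roh_trials += 1
            if from_m == from_f:                  # C is homozygous by descent
                roh += 1

    p_ibd = Fraction(ibd, 4)
    p_roh = Fraction(roh, roh_trials)
    return p_ibd, p_roh, p_roh / p_ibd

p_ibd, p_roh, p_roh_given_ibd = half_sibling_probabilities()
print(f"M and F IBD at the location: {float(p_ibd):.1%}")
print(f"C homozygous, given M and F are IBD: {float(p_roh_given_ibd):.1%}")
print(f"C homozygous overall: {float(p_roh):.1%}")
```

The enumeration gives 1/2, 1/4 and 1/8 respectively, matching the 50%, 25% and 12.5% figures above.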

In the parent/child or full siblings scenarios, the parents share 100% x L (50% x 2L)

ROH segments of their child = 100% x 25% x L = 25% x L

The autosomal DNA shared by the child’s parents can be approximated as 4 x the child’s total homozygous segments reported by GEDmatch.
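
Applied in the other direction, the factor of four turns an AYPR total into a rough estimate of the parents' shared DNA. The sketch below is a hedged illustration only: the function names, the 220 cM input and the 3500 cM value for L are all hypothetical, and picking the "nearest" degree on a log scale is my own simplification for comparing against the theoretical averages, which real results can differ from considerably.

```python
import math

DEGREES = {1: "First", 2: "Second", 3: "Third", 4: "Fourth", 5: "Fifth"}

def parents_shared_cm(roh_total_cm):
    """Approximate cM shared by the parents as 4 x the child's ROH total."""
    return 4 * roh_total_cm

def nearest_degree(roh_total_cm, autosome_length_cm):
    """Return the degree whose theoretical H is closest on a log2 scale."""
    if roh_total_cm <= 0:
        return None  # a zero result does not rule out related parents
    observed_f = roh_total_cm / autosome_length_cm
    return min(
        DEGREES,
        key=lambda d: abs(math.log2(observed_f) - math.log2(0.5 ** d / 2)),
    )

L_CM = 3500       # ASSUMPTION: illustrative value for L, not from the article
roh_total = 220   # ASSUMPTION: made-up AYPR total in cM, for illustration only

degree = nearest_degree(roh_total, L_CM)
print(f"Estimated DNA shared by parents: {parents_shared_cm(roh_total)} cM")
print(f"Closest theoretical degree: {DEGREES[degree]}")
```

Because the table lists several relationships per degree, a result like this narrows the possibilities without identifying one specific relationship.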

Note that these figures are theoretical averages for random processes, not the ranges we observe in practice. Also note that a child's parents may still be related within a genealogical time frame if the GEDmatch AYPR result is zero, as they may not both pass on the same segments above the threshold used by GEDmatch to eliminate false positives.

If you need support dealing with an unexpected positive AYPR result please read the High ROH Infosheet prepared by genetic counselor Brianne Kirkpatrick.

Sources:

Franklin, I.R. “The distribution of the proportion of the genome which is homozygous by descent in inbred individuals.” Theoretical Population Biology 11, no. 1 (1977): 60–80.

Sund, K.L. and C.W. Rehder. “Detection and reporting of homozygosity associated with consanguinity in the clinical laboratory.” Human Heredity 77, no. 1–4 (2014): 217–24.

Sund, K.L. et al. “Regions of homozygosity identified by SNP microarray analysis aid in the diagnosis of autosomal recessive disease and incidentally detect parental blood relationships.” Genetics in Medicine 15, no. 1 (2013): 70–78.

Thompson, E.A. “Descent-Based Gene Mapping in Pedigrees and Populations.” In Handbook of Statistical Genomics, 4th edition. David J. Balding, Ida Moltke and John Marioni, editors. Hoboken, New Jersey: John Wiley & Sons, 2019. Chapter 20.

Weir, B.S. “Inbreeding.” In Encyclopedia of Biostatistics, 2nd edition. Peter Armitage and Theodore Colton, editors. Hoboken, New Jersey: John Wiley & Sons, 2005.

Wright, Sewall. “Coefficients of Inbreeding and Relationship.” The American Naturalist 56, no. 645 (1922): 330–38.