West Asian sources of the Eurasian component in Ethiopians: a reassessment
Ludovica Molinaro1∗, Francesco Montinaro1, Toomas Kivisild1,2, Luca Pagani1,3∗
1Estonian Biocentre, Institute of Genomics, University of Tartu, Estonia
2Department of Human Genetics, KU Leuven, Leuven, Belgium
3Department of Biology, University of Padova, Italy
∗To whom correspondence may be addressed: lu.molinaro8@gmail.com; lp.lucapagani@gmail.com
Summary
Previous genome-scale studies of populations living today in Ethiopia have found evidence of recent gene flow from a Eurasian source, dating to the last 3,000 years [1,2,3,4]. Haplotype-based [1] and genotype-based analyses of modern [2,4] and ancient (aDNA) [3,5] data have considered a Sardinia-like proxy [2], broadly Levantine [1,4] or Neolithic Levantine [3] populations as a range of possible sources for this gene flow. Given the ancient nature of this gene flow and the extent of population movements and replacements that affected West Asia in the last 3,000 years, aDNA evidence would seem to be the best proxy for determining the putative population source. We demonstrate, however, that the deeply divergent, autochthonous African component, which accounts for ∼50% of most contemporary Ethiopian genomes, affects the overall allele frequency spectrum to an extent that makes it hard to control for it and, at the same time, to discern between subtly different, yet important, Eurasian sources (such as Anatolian or Levant Neolithic ones). Here we re-assess patterns of allele sharing between the Eurasian component of Ethiopians (here called “NAF” for Non African) and ancient and modern proxies, after having extracted NAF from Ethiopians through ancestry deconvolution, and unveil a genomic signature compatible with population movements that affected the Mediterranean area and the Levant after the fall of the Minoan civilization.
bioRxiv preprint first posted online Jul. 8, 2019; doi: dx.doi.org/10.1101/694299. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Results and Discussion
To determine the most likely source of the Eurasian gene flow into the ancestral gene pool of present-day Ethiopians we have used a combination of ancestry deconvolution (AD) and allele sharing methods [6]. AD refers to analyses that determine the likeliest ancestry composition of genomes of individuals with mixed ancestry at fine haplotype resolution. These methods have allowed us to i) exploit high quality modern data and ii) harness the power of allele sharing tools on genetic fractions with no or reduced African contributions. Such a strategy, while potentially beneficial, introduces a novel source of bias, which we aimed to explore here. In particular, after AD of 120 Ethiopian genomes [7], we assigned each genomic SNP to one of the following four categories based on the method likelihoods (see Methods for further details): 1) confidently non African (NAF); 2) low confidence non African (X); 3) low confidence African (Y) and 4) confidently African (AF, consistently filtered out from our analyses). While basing our inference on the NAF component alone, we here demonstrate that the component X accounts for a minority of the genome and, when analysed together with NAF, does not qualitatively change the results. Furthermore, when joining together the confidently assigned NAF and AF components (to create “Joint” components) we recapitulate the signals of the global population (prior to ancestry deconvolution), showing that the X and Y components do not hold a considerable or peculiar genetic signature and hence ruling out, in this study, the role of ancestry deconvolution as a potential source of artifacts. For the sake of clarity, out of the four admixed Ethiopian populations available from Pagani et al. 2015 (Amhara, Oromo, Somali, Wolayta), we report results only on the NAF component of Amhara. Comparable results for the other three populations, which we chose not to lump into a heterogeneous Ethiopian super-population to emphasize potential population-specific peculiarities, are provided in Supplementary Information.
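The four-way SNP assignment described above can be illustrated with a minimal Python sketch. It assumes that the ancestry deconvolution step yields, for every SNP, a probability of non-African ancestry; the cut-offs `HIGH_CONF` and `LOW_CONF` and all function names are hypothetical placeholders chosen for illustration, since the actual criteria are given in the Methods and are not reproduced here.

```python
from collections import Counter

# Hypothetical cut-offs for illustration only; the real criteria are in the Methods.
HIGH_CONF = 0.9
LOW_CONF = 0.1

CATEGORIES = ("NAF", "X", "Y", "AF")


def classify_snp(p_non_african, high=HIGH_CONF, low=LOW_CONF):
    """Assign one SNP to NAF, X, Y or AF from its probability of non-African ancestry."""
    if not 0.0 <= p_non_african <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_non_african >= high:
        return "NAF"  # confidently non African
    if p_non_african >= 0.5:
        return "X"    # low confidence non African
    if p_non_african > low:
        return "Y"    # low confidence African
    return "AF"       # confidently African


def category_fractions(probabilities):
    """Return the fraction of SNPs that falls into each of the four categories."""
    labels = [classify_snp(p) for p in probabilities]
    if not labels:
        raise ValueError("at least one SNP is required")
    counts = Counter(labels)
    return {cat: counts.get(cat, 0) / len(labels) for cat in CATEGORIES}


def keep_for_analysis(labels, include_x=False):
    """Boolean mask: keep NAF only, or NAF together with X when include_x is True."""
    kept = {"NAF", "X"} if include_x else {"NAF"}
    return [label in kept for label in labels]
```

Under these assumptions, `category_fractions` gives a quick check of how much of the genome each component covers, and `keep_for_analysis(labels, include_x=True)` mirrors the comparison in which X is analysed together with NAF.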