FULL TEXT:
www.realfuture.org/GIST/Readings/Templeton%281998%29.pdf
EXCERPT FROM TEMPLETON
ALAN R. TEMPLETON
Human Races: A Genetic and Evolutionary Perspective
Race is generally used as a synonym for subspecies, which traditionally is a geographically circumscribed, genetically
differentiated population. Sometimes traits show independent patterns of geographical variation such that some combination
will distinguish most populations from all others. To avoid making "race" the equivalent of a local population,
minimal thresholds of differentiation are imposed. Human "races" are below the thresholds used in other species, so valid
traditional subspecies do not exist in humans. A "subspecies" can also be defined as a distinct evolutionary lineage within
a species. Genetic surveys and the analyses of DNA haplotype trees show that human "races" are not distinct lineages, and
that this is not due to recent admixture; human "races" are not and never were "pure." Instead, human evolution has been
and is characterized by many locally differentiated populations coexisting at any given time, but with sufficient genetic
contact to make all of humanity a single lineage sharing a common evolutionary fate. [race, subspecies, lineage, haplotype
tree, genetic differentiation]
The word race is rarely used in the modern, nonhuman
evolutionary literature because its meaning is
so ambiguous. When it is used, it is generally used
as a synonym for subspecies (Futuyma 1986:107-109),
but this concept also has no precise definition. The traditional
meaning of a subspecies is that of a geographically
circumscribed, genetically differentiated population (Smith
et al. 1997). The problem with this definition from an evolutionary
genetic perspective is that many traits and their
underlying polymorphic genes show independent patterns
of geographical variation (Futuyma 1986:108-109).
As a result, some combination of characters will distinguish
virtually every population from all others. There is
no clear limit to the number of races that can be recognized
under this concept, and indeed this notion of subspecies
quickly becomes indistinguishable from that of a local
population. One way around this difficulty is to place
minimal quantitative thresholds on the amount of genetic
differentiation that is required to recognize subspecies
(Smith et al. 1997). A second solution is to allow races or
subspecies to be defined only by the geographical patterns
found for particular "racial" traits or characters. A similar
problem is faced in defining species. For example, the biological
species concept focuses attention on characters related
to reproductive incompatibility as those important in
defining a species. These reproductive traits have priority
in defining a species when in conflict with other traits,
such as morphology (Mayr 1970). Unfortunately, there is
no such guidance at the subspecies level, although in practice
easily observed morphological traits (the very ones
deemed not important under the biological species concept)
are used. There is no evolutionary justification for
this dominance of easily observed morphological traits;
indeed, it merely arises from the sensory constraints of our
own species. Therefore, most evolutionary biologists reject
the notion that there are special "racial" traits.
Because of these difficulties, the modern evolutionary
perspective of a "subspecies" is that of a distinct evolutionary
lineage within a species (Shaffer and McKnight
1996) (although one should note that many current evolutionary
biologists completely deny the existence of any
meaningful definition of subspecies, as argued originally
by Wilson and Brown [1953]; see discussions in Futuyma
[1986:108-109] and Smith et al. [1997:13]). The
Endangered Species Act requires preservation of vertebrate
subspecies (Pennock and Dimmick 1997), and the
distinct evolutionary lineage definition has become the de
facto definition of a subspecies in much of conservation
biology (Amato and Gatesy 1994; Brownlow 1996; Legge
et al. 1996; Miththapala et al. 1996; Pennock and Dimmick
1997; Vogler 1994). This definition requires that a
subspecies be genetically differentiated due to barriers to
genetic exchange that have persisted for long periods of
time; that is, the subspecies must have historical continuity
in addition to current genetic differentiation. It cannot
be emphasized enough that genetic differentiation alone is
insufficient to define a subspecies. The additional requirement
of historical continuity is particularly important
because many traits should reflect the common evolutionary
history of the subspecies, and therefore in theory there
American Anthropologist 100(3):632-650. Copyright © 1999, American Anthropological Association
TEMPLETON / EVOLUTIONARY GENETICS OF RACE 633
is no need to prioritize the informative traits in defining
subspecies. Indeed, the best traits for identifying subspecies
are now simply those with the best phylogenetic resolution.
In this regard, advances in molecular genetics have
greatly augmented our ability to resolve genetic variation
and provide the best current resolution of recent evolutionary
histories (Avise 1994), thereby allowing the identification
of evolutionary lineages in an objective, explicit
fashion (Templeton 1994b, 1998a, 1998b; Templeton et
al. 1995).
The purpose of this paper is to examine the existence of
races in humans using an evolutionary genetic perspective.
The fundamental question is: Are human populations
genetically differentiated from one another in such a fashion
as to constitute either sharply genetically differentiated
populations or distinct evolutionary sublineages of
humanity? These questions will be answered with molecular
genetic data and through the application of the
same, explicit criteria used for the analyses of nonhuman
organisms. This last point is critical if the use of the word
race in humanity is to have any general biological validity.
This paper will not address the cultural, social, political,
and economic aspects of human "races."
Are Human "Races" Geographically
Circumscribed, Sharply Differentiated
Populations?
The validity of the traditional subspecies definition of
human races can be addressed by examining the patterns
and amount of genetic diversity found within and among
human populations. One common method of quantifying
the amount of within to among genetic diversity is through
the Fst statistic of Wright (1969) and some of its more
modern variants that have been designed specifically for
molecular data such as Kst (Hudson et al. 1992) or Nst
(Lynch and Crease 1990). Fst and related statistics range
from 0 (all the genetic diversity within a species is shared
equally by all populations with no genetic differences
among populations) to 1 (all the genetic diversity within a
species is found as fixed differences among populations
with no genetic diversity within populations). The Fst
value of humans (based on 16 populations from Africa,
Europe, Asia, the Americas, and the Australo-Pacific region)
is 0.156 (Barbujani et al. 1997), thereby indicating
that most human genetic diversity exists as differences
among individuals within populations, and only 15.6%
can be used to genetically differentiate the major human
"races." To put the human Fst value into perspective, humans
need to be compared to other species. Fst values for many
plants, invertebrates, and small-bodied vertebrates are
typically far larger than the human value, but most of these
organisms have poor dispersal abilities, so this is to be expected.
A more valid comparison would be the Fst values
of other large-bodied mammals with excellent dispersal
abilities. Figure 1 shows the values of Fst and related statistics
for several large-bodied mammals. As can be seen,
the human Fst value is one of the lowest, even though the
human geographical distribution is the greatest. A standard
criterion for a subspecies or race in the nonhuman literature
under the traditional definition of a subspecies as a
geographically circumscribed, sharply differentiated
population is to have Fst values of at least 0.25 to 0.30
(Smith et al. 1997). Hence, as judged by the criterion in the
nonhuman literature, the human Fst value is too small to
have taxonomic significance under the traditional subspecies
definition.
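To make the 0-to-1 range of Fst concrete, the following minimal Python sketch computes Fst for a single biallelic locus as (Ht - Hs)/Ht, where Ht is the heterozygosity expected from the pooled allele frequency and Hs is the mean within-population heterozygosity. This heterozygosity-based form, the equal weighting of populations, the function names, and the allele frequencies are all assumptions made for illustration; they are not the estimator or the data used by Barbujani et al. (1997). The threshold constant is simply the lower end of the 0.25 to 0.30 range quoted above.

```python
# Hypothetical illustration: Fst for one biallelic locus, populations weighted equally.
SUBSPECIES_FST_THRESHOLD = 0.25  # lower end of the 0.25-0.30 range quoted in the text


def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) for an allele at frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(freqs: list[float]) -> float:
    """Fst = (Ht - Hs) / Ht from per-population allele frequencies."""
    if len(freqs) < 2:
        raise ValueError("need at least two populations")
    p_bar = sum(freqs) / len(freqs)           # pooled allele frequency
    h_total = heterozygosity(p_bar)           # Ht: diversity in the pooled population
    if h_total == 0.0:
        raise ValueError("locus is monomorphic; Fst is undefined")
    h_within = sum(heterozygosity(p) for p in freqs) / len(freqs)  # Hs
    return (h_total - h_within) / h_total


def meets_traditional_threshold(fst: float) -> bool:
    """True if fst reaches the quantitative subspecies criterion."""
    return fst >= SUBSPECIES_FST_THRESHOLD


if __name__ == "__main__":
    print(fst_biallelic([0.5, 0.5]))   # identical frequencies: no differentiation
    print(fst_biallelic([0.0, 1.0]))   # fixed differences: complete differentiation
    print(meets_traditional_threshold(0.156))  # the human value cited in the text
```

With the made-up frequencies 0.3 and 0.7, the function returns roughly 0.16, which happens to be close to the human value and still falls short of the threshold.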
This does not mean that the low human Fst value is without
any evolutionary significance. Suppose for the moment
that the Fst values in humans truly reflect a balance
between gene flow versus local drift/selection and are not
due to isolated human lineages. One convenient method
for quantifying this balance is Nm, the product of local effective
population size (N) with m, the migration rate between
demes. Under the idealized population structure
known as the island model, the relationship between Fst
and Nm is (Wright 1969):
$$F_{st} = \frac{1}{1 + 4Nm} \qquad (1)$$
Most real populations do not fit an island model (which
assumes that gene flow is independent of geographical
distance). Nm is therefore not the actual number of individuals
exchanged per generation, but rather is an effective
number of migrating individuals per generation
relative to this simple, idealized model of population
structure. This allows comparisons across different species
in effective amounts of gene flow with respect to a
common standard. For the human Fst value of 0.156, Nm =
1.35. This result is consistent with the work of Santos et al.
(1997) who examined several human data sets with a variety
of statistical procedures and always obtained Nm > 1.
With Nm on the order of 1, massive movements of large
numbers of individuals are not needed to explain the level
of genetic differentiation observed in humans. Moreover,
Nm = 1.35 does not mean that precisely 1.35 effective individuals
migrate among the "races" every generation;
rather, this is the long-term average. Assuming a generation
time of 20 years, the levels of racial differentiation in
humanity could be explained by interchanging 1.35 effective
individuals every 20 years, or 13.5 every 200 years, or
135 every 2,000 years. Since humans often move as populations,
gene flow could be very sporadic on a time scale
measured in thousands to tens of thousands of years and
still yield an effective number of migrants of 1.35.
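The arithmetic in the preceding paragraphs can be checked with a short sketch. It assumes the island-model relation Fst = 1/(1 + 4Nm) and inverts it to recover Nm, then scales the per-generation figure to longer time spans using the 20-year generation time stated in the text. The function names are hypothetical and chosen only for this illustration.

```python
# Minimal sketch of the island-model arithmetic; function names are illustrative.

def fst_from_nm(nm: float) -> float:
    """Island model: Fst = 1 / (1 + 4*Nm)."""
    if nm < 0.0:
        raise ValueError("Nm cannot be negative")
    return 1.0 / (1.0 + 4.0 * nm)


def nm_from_fst(fst: float) -> float:
    """Invert the island-model relation to get the effective number of migrants."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("Fst must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


def migrants_over(nm: float, years: float, generation_years: float = 20.0) -> float:
    """Long-term average number of effective migrants over a span of years."""
    return nm * years / generation_years


if __name__ == "__main__":
    human_nm = nm_from_fst(0.156)
    print(round(human_nm, 2))            # 1.35
    print(migrants_over(1.35, 200))      # 13.5
    print(migrants_over(1.35, 2000))     # 135.0
```

Because Nm is a long-term average, the same total can be delivered in sporadic pulses; `migrants_over` only reports the average over the chosen span.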
An Nm value of 1.35 would insure that the population
evolves as a single evolutionary lineage over long periods
of time (Crow and Kimura 1970). Nevertheless, population
genetic theory also indicates that fluctuations around
634 AMERICAN ANTHROPOLOGIST • VOL. 100, No. 3 • SEPTEMBER 1998
<<SNIP>>
Figure 1. Fst (or Kst or Nst) values for various species of large-bodied mammals with excellent dispersal abilities. The figure shows Fst (or its
multiallelic analogue, Gst) values for African buffalo (Templeton and Georgiadis 1996), humans (Barbujani et al. 1997), bighorn sheep (Boyce
et al. 1997), elephants (Georgiadis et al. 1994), and white-tailed deer (Ellsworth et al. 1994); Kst values for waterbuck, impalas, wildebeest, and
Grant's gazelle (Arctander et al. 1996); and Nst values for coyotes (Lehman and Wayne 1991) and wolves (Wayne et al. 1992). The geographical
scale of the study is indicated by the species name. Values are given in order of size, with the human value indicated in black and nonhuman
values in gray.
an average Nm of order one is conducive both to the rapid
spread of selectively favored genes throughout the species
and to local population differentiation and adaptation
(Barton and Rouhani 1993). If anatomically modern traits
did indeed first evolve in Africa, the human Nm value implies
that such traits could rapidly spread throughout all of
humanity through gene flow if selectively favored even
though local populations could still display genetic differentiation
for other loci. Studies on nonhuman organisms
indicate that Nm values can be larger than those in humans
and yet the species can still display much local differentiation
and adaptation, as predicted by this theory. For example,
populations of Drosophila mercatorum on the slopes
versus the saddle of the Kohala mountains on the island of
Hawaii (a distance of 3 km) have an estimated Nm of between
4 and 8 (DeSalle et al. 1987). Nevertheless, these
populations show extreme differentiation and local adaptation
for the abnormal abdomen syndrome, a complex
polygenic suite of phenotypes that affects morphology,
developmental time, female fecundity, male sexual maturation,
and longevity in adaptively significant ways (Hollocher
and Templeton 1994; Hollocher et al. 1992; Templeton
et al. 1993; Templeton et al. 1989). Similarly,
garter snake populations in Lake Erie have an Nm value
between 2.7 and 37.6 among sites with populations that
differ greatly in the amount of melanism (King and
Lawson 1995, 1997; Lawson and King 1996). These examples
(and many more could have been given) clearly
show that Nm values higher than the estimated Nm value
for humans are still compatible with much local differentiation
across space even though the gene flow is sufficiently
high to ensure that the species as a whole evolves
as a single lineage over time.
The above discussion was predicated upon the assumption
that the human Fst value arose from the balance of
gene flow versus local drift and selection. Unfortunately,
[Figure 2 image omitted. Panel A: Ancient Origin Candelabra. Panel B: Recent Origin Candelabra With Replacement. Labels in the figure: Current Populations (Asians, Africans, Europeans); 100,000 Years Ago; Dispersal of Homo erectus out of Africa.]
Figure 2. Candelabra models of recent human evolution. Part A illustrates the ancient origin version of the candelabra model. Under this
hypothesis, the major human "races" split from one another at the time of dispersal of Homo erectus out of Africa. After that initial split, the
various "races" behaved as separate evolutionary lineages and independently evolved into their modern forms. Part B illustrates the recent origin
version of the candelabra model with replacement. Under this hypothesis, an initial candelabra existed as illustrated in part A. However,
anatomically modern humans then arose in Africa and dispersed out of Africa around 100,000 years ago. This second dispersal event was marked
by the complete genetic extinction of the earlier Homo erectus populations (indicated by the broken lineages in B) and by a split of these
anatomically modern humans into separate evolutionary lineages that then independently acquired their modern "racial" variation.
the Fst statistic per se cannot discriminate among potential
causes of genetic differentiation (Templeton 1998a). Although
human "races" do not satisfy the standard quantitative
criterion for being traditional subspecies (Smith et
al. 1997), this does not necessarily mean that races do not
exist in the evolutionary lineage sense. Under the lineage
concept of subspecies, all that is needed is sufficient genetic
differentiation to define the separate lineages. If the
lineages split only recently, the overall level of divergence
could be quite small. Therefore, the quantitative levels of
genetic diversity among human populations do not rule
out the possibility that human "races" are valid under the
evolutionary lineage definition of subspecies. The remainder
of this paper will focus upon this more modern
definition of subspecies.
Are Human "Races" Distinct Evolutionary
Lineages?
Models of Human Evolution and Human Races
When a biological race is defined as a distinct evolutionary
lineage within a species, the question of race can
only be answered in the context of the recent evolutionary
history of the species. The two dominant models of recent
human evolution during the last half of this century are the
[Figure 3 image omitted. Labels in the figure: Current Populations (Africans, Europeans); Dispersal of Homo erectus out of Africa.]
Figure 3. The trellis model of recent human evolution. Under this
hypothesis, Homo erectus dispersed out of Africa and established
populations in Africa and southern Eurasia, as indicated by the large
dots. These populations were interconnected by gene flow so that
there were no evolutionary sublineages of humanity or independent
evolution of the various "races." Arrows with heads on both ends
indicate gene flow among contemporaneous populations, and arrows
with single heads indicate lines of genetic descent.
candelabra (Figure 2) and trellis (Figure 3) models. Both
models accept the evolutionary origin of the genus Homo
in Africa and the spread of Homo erectus out of Africa a
million years ago or more. Candelabra models posit that
the major Old World geographical groups (Europeans,
sub-Saharan Africans, and Asians) split from one another
and since have had nearly independent evolutionary histories
(but perhaps with some subsequent admixture).
Therefore, the evolutionary relationships among Africans,
Europeans, and Asians can be portrayed as an evolutionary
tree—in this case with the topology of a candelabra
(Figure 2). The major human geographical populations
are portrayed as the branches on this candelabra and are
therefore valid "races" under the evolutionary lineage
definition. The ancient origin candelabra model regarded
the split between the major "races" as occurring with the
spread of Homo erectus (Figure 2A) followed by independent
evolution of each "race" into its modern form.
This version has been thoroughly discredited and has no
serious advocates today. However, a recent origin candelabra
model known as the out-of-Africa replacement hypothesis
(Figure 2B) has become widely accepted. Under
this model, anatomically modern humans evolved first in
Africa. Next, a small group of these anatomically modern
humans split off from the African population and colonized
Eurasia about 100,000 years ago, driving the Homo
erectus populations to complete genetic extinction everywhere
(the "replacement" part of the hypothesis). The ancient
(Figure 2A) and recent (Figure 2B) candelabra models
differ only in their temporal placement of the ancestral
node but share the same tree topology that portrays Africans,
Europeans, and Asians as distinct branches on an
evolutionary tree. It is this branching topology that defines
"races" under the evolutionary lineage definition, and not
the time since the common ancestral population. Hence,
human "races" are valid evolutionary lineages under
either candelabra model.
The trellis model (Figure 3) posits that Homo erectus
populations not only had the ability to move out of Africa
but also back in, resulting in recurrent genetic interchange
among Old World human populations (Lasker and Crews
1996; Wolpoff and Caspari 1997). It is also important to
note that, under the trellis model, the taxonomic designations
of Homo erectus and H. sapiens only have morphological
significance and do not imply reproductive isolation
as under the biological-species concept (Mayr 1970).
Therefore, anatomically modern traits could evolve anywhere
in the range of Homo erectus (which includes Africa)
and subsequently spread throughout all of humanity
by selection and gene flow. Hence, an African origin for
anatomically modern humans is compatible with both the
trellis and candelabra models. The two models do differ in
their interpretation of interpopulational genetic differences.
Populational genetic differences reflect the time of
divergence from a common ancestral population under the
candelabra models. With the trellis model, the genetic distances
reflect the amount of genetic interchange and not
time of divergence from an ancestor. However, the most
important distinction between the candelabra and trellis
models for the discussion at hand is that under the trellis
model there was no separation of humanity into evolutionary
lineages, and hence human "races" are not valid subspecies.
In summary, human "races" as evolutionary lineages
do exist under the candelabra models but do not exist
under the trellis model.
Although these two models are frequently presented as
mutually exclusive alternatives (Wolpoff and Caspari
1997), there is no biological reason why some human
populations could not be genetically differentiated because
they are historical lineages, while other populations are
differentiated because of recurrent but restricted gene
flow. Moreover, the genetic differences between any two
human populations may represent a mixture of both gene
flow and historical events. Much genetic evidence is
equally compatible with both models and hence is noninformative.
The emphasis in this paper will therefore be
upon data sets that discriminate between gene flow and
historical splits as non-mutually exclusive causes of differentiation
among human populations.
Genetic Diversity Levels within and among
Human Populations
Do the levels of genetic diversity found within and
among human "races" discriminate between evolutionary
lineage and genetic interchange models of recent
human evolution? As pointed out earlier, F statistics
and related measures of within to among diversity levels
do not discriminate per se. However, one conclusion
reached in that section has great relevance to the debate
over the validity of human races as evolutionary lineages;
namely, that the estimated gene flow levels in humans
are compatible with local differentiation across
geographical space even though the species as a whole
could evolve as a single lineage over time. Much skepticism
about the trellis model stems from the belief that a
delicate balance is required between gene flow (to insure
all humans are a common evolutionary lineage
over time) and local genetic drift/selection (to maintain
humans as a polytypic species at any given moment in
time) (Aiello 1997; Nei and Takezaki 1996). Indeed,
even proponents of the trellis model have argued that
only rarely can a species be polytypic under a trellis
model. For example, Wolpoff and Caspari (1997:282)
state that "the human pattern . . . of a widespread polytypic
species with many different ecological niches . . . is a
very rare one." However, polytypic species are not rare
(Futuyma 1986; Mayr 1970). Moreover, as illustrated
by the examples given earlier, polytypic species occur
over a broad range of values for Nm and are a robust evolutionary
outcome. There is no difficulty either in population
genetic theory or observation for the conclusion
that humans can be both a polytypic species and a single
evolutionary lineage.
Although F statistics are compatible with either model
of human evolution, the claim is made in much of the recent
literature that within "race" diversity levels support
the recent candelabra model. Africans have higher
amounts of genetic diversity than non-Africans for many
nuclear loci (Armour et al. 1996; Jorde et al. 1997; Perez-
Lezaun et al. 1997), mitochondrial DNA (mtDNA) (Comas
et al. 1997; Francalacci et al. 1996), and some regions
of Y-DNA (Hammer et al. 1997). These results are often
interpreted as supporting the recent candelabra model by
assuming that only a small number of individuals left Africa
to colonize Eurasia with little or no subsequent gene
flow. As a result, a bottleneck effect reduced the levels of
genetic variation in non-Africans. This interpretation of
genetic diversity also implies that at least Africans and
non-Africans are distinct evolutionary lineages and hence
are valid races. However, alternative explanations of diversity
levels exist. Under the neutral theory, the expected
heterozygosity for a DNA region (a standard measure of
genetic diversity) is given by:
$$\text{Heterozygosity} = 1 - \frac{1}{4N_e\mu + 1} \qquad (2)$$
where Ne is an effective size of the population and μ is the
mutation rate of the DNA region of interest. Equation (2)
reveals that differences in effective size can explain differences
in the level of genetic diversity. Africans are expected
to have higher genetic diversity simply because
their population sizes were larger during much of the last
million years (Harpending et al. 1996; Relethford and
Harpending 1994, 1995). Indeed, the patterns of genetic
diversity found in humans are more consistent with differences
in population sizes and growth rates than with differences
in population ages from presumed bottlenecks
(Harding et al. 1997; Perez-Lezaun et al. 1997). The danger
of using diversity levels as an indicator of population
age from a bottleneck is illustrated by the observation that
mitochondrial DNA diversity within Africa is higher in
food-producing populations than in hunter-gatherers
(Watson et al. 1996). By equating diversity to age, this
result would imply that agricultural peoples in Africa represent
the ancestral populations, whereas the hunter-gatherers
are the recent descendant populations. Such a
conclusion is not credible, and the diversity levels within
Africa are interpreted as reflecting effective size differences
(Watson et al. 1996).
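The dependence of equilibrium diversity on effective size can be sketched numerically. The snippet below assumes the standard neutral, infinite-alleles expectation for a nuclear locus, H = 1 - 1/(4*Ne*mu + 1), as the form of equation (2). The effective sizes and the mutation rate are made-up values chosen for illustration, not estimates for any human population, and the function name is hypothetical.

```python
# Hypothetical illustration of equilibrium heterozygosity under neutrality.

def expected_heterozygosity(ne: float, mu: float) -> float:
    """Equilibrium heterozygosity H = 1 - 1 / (4*Ne*mu + 1).

    ne: effective population size; mu: mutation rate per generation.
    Note that no time or population-age term appears in the formula.
    """
    if ne <= 0.0 or mu < 0.0:
        raise ValueError("ne must be positive and mu non-negative")
    theta = 4.0 * ne * mu
    return 1.0 - 1.0 / (theta + 1.0)


if __name__ == "__main__":
    mu = 1e-4  # assumed mutation rate, for illustration only
    for ne in (5_000, 10_000, 20_000):  # assumed effective sizes
        print(ne, round(expected_heterozygosity(ne, mu), 3))
```

Two populations of the same age but different effective sizes receive different expected diversity here, which is the point the text makes about reading diversity as a proxy for age.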
Note that equation (2) has no time component. The reason
is that equation (2) describes the diversity levels at
equilibrium. When the equilibrium is disturbed by bottlenecks
or rapid population growth, time enters as a factor
(Templeton 1997a). Fortunately, different causes of departure
from equilibrium can be discriminated. For example,
a bottleneck and split should affect all genetic systems.
However, nuclear DNA and mitochondrial DNA
show discordant patterns in humans, a result inconsistent
with the presumed population bottleneck and the sharing
by all genetic systems of a common demographic history
(Hey 1997; Jorde et al. 1995). One can also discriminate
by the patterns of diversity across genetic systems that differ
in mutation rate (Templeton 1997a). A bottleneck reduces
genetic variation, and temporal dependence enters
because mutation takes time to restore genetic diversity.
Hence, the longest lasting discrepancies in relative genetic
diversity levels are for low mutation rate systems.
Therefore, under the bottleneck hypothesis, Africans
should show the greatest excess in relative genetic diversity
for low mutation rate systems. However, the excess
genetic diversity in Africans is found with the high mutation
rate systems (Armour et al. 1996; Comas et al. 1997;
Francalacci et al. 1996; Jorde et al. 1997; Perez-Lezaun et
al. 1997), whereas the classic, low mutation rate systems
show comparable levels of genetic diversity (Bowcock et
al. 1994; Jorde et al. 1995), and a low polymorphic section
of Y-DNA shows greater levels of diversity in Europeans
than in Africans (Mitchell 1996). An alternative non-equilibrium
pattern can be generated by rapid population
growth which causes an increase—not a decrease—in
levels of genetic diversity. The high mutation rate systems
show the earliest and strongest response to increased
population size, which is consistent with the observed pattern.
Hence, the observed diversity patterns reflect human
population growth rather than population bottlenecks.
The within "race" genetic diversity levels do not support
the idea that Eurasians split off from Africans via a
small founder population, but they do not necessarily falsify
the notion that a Eurasian/African split occurred without
a bottleneck. Therefore, the within population genetic
diversity data are inconclusive on the status of Eurasians
and Africans as separate evolutionary lineages and
thereby valid races.
Genetic Distances and Evolutionary "Trees"
An alternative method to Fst of measuring the extent of
genetic differentiation among populations is to convert
the genetic differences into a genetic distance. There are
several genetic distance measures available, and sometimes
the biological conclusions are strongly dependent
upon the precise measure chosen (Perez-Lezaun et al.
1997). However, this problem will be ignored in this paper
because the relative distances among the major human
"races" appear robust to differing genetic distance measures
(Cavalli-Sforza 1997). Genetic distances in turn can
be converted into an evolutionary tree of populations by
various computer algorithms. Figure 4A shows such a
population tree (Cavalli-Sforza et al. 1996). This and most
other human genetic distance trees have the deepest divergence
between Africans and non-Africans, and this split is
commonly estimated to have occurred around 100,000
years ago (Cavalli-Sforza et al. 1996; Cavalli-Sforza
1997; Nei and Takezaki 1996). All this seems consistent
with the recent candelabra model, but non-zero genetic
distances can also arise and persist between interbreeding
populations with recurrent gene flow (Wright 1931, 1943,
1969). As shown by Slatkin (1991), recurrent gene flow
results in an average divergence time of gene lineages between
populations even when no population-level split
occurred and the divergence levels are at equilibrium and
thereby time invariant. Therefore, an apparent genetic
time of divergence does not necessarily imply a time of
population splitting—or any population split at all. Under
a trellis model, genetic distances reflect the patterns and
amounts of gene flow and not the age since some "separation"
or "split."
Fortunately, these two interpretations of genetic distance
can be distinguished. If human populations can truly
be represented as branches on an evolutionary tree, then
the resulting genetic distances should satisfy several constraints.
For example, under the candelabra model, all
non-African human populations "split" from the Africans
at the same time, and therefore all genetic distances between
African and non-African populations have the same
<<SNIP>>
[Figure 4 image omitted. Panels A and B. Labels recoverable from the figure: C. A. R. Pygmies, Zaire Pygmies, Europeans, Chinese, Melanesians.]
Figure 4. Genetic distances and recent human evolution. Part A
shows an evolutionary tree of human populations as estimated from
the genetic distance data given in Bowcock et al. (1991). Human
population evolution is depicted as a series of splits, and the numbers
on the left indicate the estimated times of divergence in thousands of
years. This figure is redrawn from Figure 2.4.4, Cavalli-Sforza et al.
(1996:91). Part B shows the same genetic distance data drawn with
the neighbor-joining method but without all the constraints of a tree.
This figure is redrawn from Figure 2.4.5, Cavalli-Sforza et al.
(1996:91).
expected value (Figure 4A). When genetic distances instead
reflect the amount of gene flow, "treeness" constraints
are no longer applicable. Because gene flow is
commonly restricted by geographical distance (Wright
1943), gene flow models are expected to yield a strong
positive relationship between geographical distance and
genetic distance. Figure 4B places the populations on a
two-dimensional plot in a manner that attempts to reflect
their genetic distances from one another, particularly
nearest-neighbor distances, while otherwise attempting to
minimize the total sum of branch lengths (formally, a
neighbor-joining dendrogram). Figure 4B uses the same
genetic distance data used to generate the tree in Figure
4A, but without imposing all the constraints of treeness
(Cavalli-Sforza et al. 1996). Note that Europeans fall between
Africans and Asians as predicted by their geographical
location, in contrast to the candelabra model
prediction of equal genetic distances of Europeans and
Asians to Africans. The computer programs used to generate
"trees" from genetic distance data will do so regardless
of what evolutionary factors generated the distances. It is
therefore the obligation of the users of such programs to
ensure that the genetic distance data have the properties of
treeness before representing their data as a tree. To present
trees that do not have the properties of treeness is analytically
indefensible, and worse, it is biologically misleading.
The failure of human genetic distances to fit treeness is
ubiquitous whenever tested (Bowcock et al. 1991;
Cavalli-Sforza et al. 1996; Nei and Roychoudhury 1974,
1982). Nevertheless, these same authors persist in presenting
the relationships of the major human "races" as an
evolutionary tree. Worse, many recent papers do not even
test for treeness. For example, Nei and Takezaki (1996)
give several trees for both old and new genetic data sets,
but not a single test of treeness is given or even mentioned.
However, the older data sets given in Nei and Takezaki
(1996) have long been known not to fit treeness (Nei and
Roychoudhury 1974, 1982). The newer data sets in Nei
and Takezaki (1996) were not tested for treeness in the
original papers, so a test will be given here using a standard
measure of treeness—the cophenetic correlation
(Rohlf 1993). Because the trees are themselves estimated
from the genetic distance data, a large, positive cophenetic
correlation is always expected and any correlation less
than 0.8 is regarded as a "poor" fit (Rohlf 1993). The cophenetic
correlations for the new data sets given in Nei and
Takezaki (1996) are 0.75 for the microsatellite data of
Bowcock et al. (1994), 0.69 for the microsatellite data of
Deka et al. (1995), 0.79 for the restriction fragment length
data of Mountain and Cavalli-Sforza (1994), and 0.45 for
the Alu insertion polymorphism data of Batzer et al.
(1994). Not one of the data sets fits treeness.
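The treeness test described above can be sketched in a few lines. This is a minimal illustration, not the procedure of Rohlf (1993) or of any cited study: it assumes the tree is built by UPGMA (average linkage), and the two small distance tables are hypothetical toy values chosen only to contrast tree-like distances with chain-like (stepping-stone) distances.

```python
from itertools import combinations
from math import sqrt


def upgma_cophenetic(dist):
    """Cluster with UPGMA and return the cophenetic distance of every pair.

    dist maps frozenset({a, b}) to the observed distance between a and b.
    The cophenetic distance of a pair is the height at which the two
    labels are first merged into the same cluster.
    """
    labels = sorted({x for pair in dist for x in pair})
    clusters = [frozenset([x]) for x in labels]
    coph = {}

    def avg(c1, c2):
        # Mean observed distance between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg(*p))
        height = avg(c1, c2)
        for a in c1:
            for b in c2:
                coph[frozenset((a, b))] = height
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return coph


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def cophenetic_correlation(dist):
    """Correlation between observed distances and tree-implied distances."""
    coph = upgma_cophenetic(dist)
    pairs = list(dist)
    return pearson([dist[p] for p in pairs], [coph[p] for p in pairs])


def table(rows):
    return {frozenset(pair): d for pair, d in rows}


# Hypothetical toy data: two tight pairs separated by one deep split.
tree_like = table([(("A", "B"), 2), (("C", "D"), 2), (("A", "C"), 6),
                   (("A", "D"), 6), (("B", "C"), 6), (("B", "D"), 6)])
# Hypothetical toy data: four populations along a chain, distance = steps apart.
chain = table([(("A", "B"), 1), (("B", "C"), 1), (("C", "D"), 1),
               (("A", "C"), 2), (("B", "D"), 2), (("A", "D"), 3)])

print(cophenetic_correlation(tree_like))
print(cophenetic_correlation(chain))
```

In this toy case the tree-like table is reproduced by its tree, whereas the chain, although its distances are perfectly orderly, falls below the 0.8 threshold quoted above. That is the kind of misfit expected when distances reflect gene flow along a chain rather than splits.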
In marked contrast, the genetic distance data fit well to a
restricted gene flow model. In their analyses of the older
data sets, Nei and Roychoudhury (1974, 1982) not only rejected
treeness, but showed that the deviations were those
expected from genetic interchange among the "races."
Similarly, Bowcock et al. (1991) not only rejected
treeness for their data, but also showed that their data fit
well to a model of "continuous admixture, in time, in
space, or in both: a chain of populations somewhat similar
to a stepping-stone model in which the ancestors of Europeans
are geographically intermediate between the two
extremes, Africans and Asians" (p. 841). The phrase "continuous
admixture" is an oxymoron, as will be evident
later, but in this case it is used as a synonym for recurrent
gene flow (Cavalli-Sforza, personal communication).
The "stepping-stone model" is a classic isolation by distance
model, so Bowcock et al. (1991) show an excellent
fit of their data to the recurrent gene flow model of isolation
by distance. Santos et al. (1997) analyzed several human
data sets with a variety of statistical procedures and
found that the pattern is one of isolation by distance with
high gene flow between geographically close populations.
Finally, Cavalli-Sforza et al. assembled a comprehensive
human data set and concluded that "the isolation-by-distance
models hold for long distances as well as for
short distances, and for large regions as well as for small
and relatively isolated populations" (1996:124). Figure 5
is a redrawing of one of the figures from Cavalli-Sforza et
al. (1996) that illustrates how well an isolation by distance
model fits the human data.
Figure 5. Genetic distances and isolation by geographical distance.
The global human genetic distances (the ordinate) are plotted against
geographical distance in miles (the abscissa). The circles indicate the
observed values, and the curved line is the theoretical expectation
under an isolation-by-distance model. This figure is redrawn from
Figure 2.9.2, Cavalli-Sforza et al. (1996:123).
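One common way to check for the positive relationship between geographical and genetic distance that isolation by distance predicts is a Mantel-style permutation test. The sketch below is an assumption about how such a check could be set up, not the analysis used by Cavalli-Sforza et al. or Santos et al.; the function name `mantel` and the five-population matrices are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mantel(geo, gen, n_perm=999, seed=0):
    """Correlate two distance matrices and test it by permuting populations.

    geo and gen are square symmetric matrices (lists of lists) that list
    the populations in the same order. Returns (correlation, one-sided p).
    """
    n = len(geo)
    pairs = list(combinations(range(n), 2))

    def corr(order):
        return pearson([geo[i][j] for i, j in pairs],
                       [gen[order[i]][order[j]] for i, j in pairs])

    order = list(range(n))
    observed = corr(order)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(order)
        if corr(order) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical toy data: five populations spaced 1,000 miles apart on a line,
# with genetic distance growing in step with geographical distance.
geo = [[abs(i - j) * 1000 for j in range(5)] for i in range(5)]
gen = [[abs(i - j) * 0.01 for j in range(5)] for i in range(5)]
r, p = mantel(geo, gen)
print(r, p)
```

Permuting population labels rather than individual matrix cells keeps the dependence among distances that share a population, which is why the test shuffles `order` instead of the flattened list.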
Given that there is no tested human genetic distance
data set consistent with treeness and that isolation by distance
fits the human data well, proponents of the recent
candelabra model have attempted to salvage the candelabra
model by postulating a complex set of "admixtures
between branches that had separated a long time before"
(Cavalli-Sforza et al. 1996:19). The key phrase in this proposal
is between branches that had separated a long time
before (Terrell and Stewart 1996). Admixture occurs
when genetic interchange is reestablished between populations
that had separated in the past and undergone genetic
divergence (i.e., the gene flow patterns have been
discontinuous). Proponents of the recent candelabra
model then attempt to reconcile the genetic distance data
with an admixture model that mimics some of the effects
(and the good fit) of recurrent gene flow. By invoking admixture
events as needed, human "races" can still be
treated as separate evolutionary lineages, but now with the
qualification that the "races" were purer in the past—the
paradigm of the "primitive isolate" (Terrell and Stewart
1996). However, even advocates of the recent candelabra
model acknowledge that these postulated admixture
events are "extremely specific" and "unrealistic" (Bowcock
et al. 1991:841).
For example, consider Melanesians and Africans. As
shown in Figure 4B, these two human populations have
nearly maximal genetic divergence within humanity as a
whole with respect to molecular markers. Moreover, note
that Europeans are closer to both Africans and to Melanesians
than are Africans to Melanesians (Figure 4B). However,
Melanesians and Africans share dark skin, hair texture,
and cranial-facial morphology (Cavalli-Sforza et al.
1996; Nei and Roychoudhury 1993)—the traits typically
used to classify people into races. One obvious conclusion
from this gross disparity between racially defining traits
and the molecular genetic data is that classifications based
on these "racial" traits have no evolutionary validity.
However, in order to salvage the racial types emerging
from the candelabra model, Nei and Roychoudhury
(1993) propose two dispersal events out of Africa. The
first group of people moved through the Middle East to
Northeast Asia and then moved southward to occupy
Southeast Asia. Later, a second group of humans migrated
out of Africa to the Indian subcontinent and then to Southeast
Asia, where admixture occurred with the earlier
Asian group. Nei and Roychoudhury then propose that the
resultant admixed population in Southeast Asia absorbed
most of its gene pool from the older Asian group, but "retained
the genes for dark skin, frizzled hair, etc. from Africans,
because of natural selection in tropical conditions"
(1993:937). This admixed group then moved out to the islands
of the Pacific and Australia. The part of this admixed
population that remained in Southeast Asia and India then
experienced additional admixture events involving the
older Asians and Europeans. This second round of admixture
wiped out most of the "African traits" in India and
Southeast Asia except for a few isolated subpopulations
(Nei and Roychoudhury 1993).
Nei and Roychoudhury argue that this complicated, ad
hoc scheme is more plausible than the hypothesis of
"independent evolution of African traits in this area"
(1993:938). However, no mention is made of the
trellis model interpretation in which these traits are not
"African" traits at all, but rather tropical adaptive traits
that are favored in human populations living in the appropriate
environment—populations that are not evolutionarily
independent because they were and are in genetic
contact. Moreover, even this complicated scheme of multiple
admixture and massive population movements still
does not explain the genetic distance data. Admixed populations
are expected to be intermediate in genetic distance
between the original parental populations, but Melanesians
are not intermediate between mainland Asian populations
and Africans (Figure 4B). This example shows that
although complex, multiple ad hoc admixture events are
invoked to reconcile the recent candelabra model with the
genetic distance data, they still fail to do so. In contrast,
isolation by distance fits the human data well and all that it
requires is that humans tend to mate primarily with others
born nearby but often outside one's own natal group
(Lasker and Crews 1996; Santos et al. 1997).
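The expectation that an admixed population is intermediate between its parental populations follows directly from how admixture mixes allele frequencies. The sketch below illustrates this under simplifying assumptions: the distance measure (mean absolute allele-frequency difference) is a stand-in rather than one of the genetic distances used in the cited studies, and all frequencies are hypothetical.

```python
def admix(p_a, p_b, m):
    """Allele frequencies after mixing a fraction m from A with 1 - m from B."""
    return [m * a + (1 - m) * b for a, b in zip(p_a, p_b)]


def distance(p, q):
    """Mean absolute allele-frequency difference across loci (a simple stand-in)."""
    return sum(abs(x - y) for x, y in zip(p, q)) / len(p)


# Hypothetical allele frequencies at three loci in two parental populations.
p_a = [0.9, 0.2, 0.5]
p_b = [0.1, 0.8, 0.3]
mix = admix(p_a, p_b, 0.25)  # 25% of the gene pool drawn from A

d_ab = distance(p_a, p_b)
print(distance(p_a, mix), distance(p_b, mix), d_ab)
# Under this measure the admixed population is closer to each parent than the
# parents are to each other, and its two distances sum to the parental distance.
```

A population that is as far from one putative parent as the other parent is, as the text reports for Melanesians relative to Africans, does not show this intermediate pattern.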
The hypothesis of admixture can be tested directly.
When admixture occurs between branches that have differentiated
under past isolation, genetic clines are set up
simultaneously for all differentiated loci. This results in a
strong geographical concordance in the clines for all genetic
systems, both neutral and selected. In contrast, isolation
by distance may result in geographical concordance
for systems under similar selective regimes (Endler 1977;
King and Lawson 1997), but otherwise no concordance is
expected. Hence, the lack of concordance of "African
traits" with molecular genetic distances is not surprising
under an isolation by distance model. The lack of concordance
in the geographical distribution of different elements
has been thoroughly and extensively documented
by others and has been one of the primary traditional arguments
against the biological validity of human races
(Cavalli-Sforza et al. 1996; Futuyma 1986). This lack of
concordance across genetic systems falsifies the hypothesis
of admixture of previously isolated branches and the
idea that "races" were "pure" in the past.
The genetic distance data are therefore informative
about the status of human "races" as evolutionary lineages.
Genetic distance analyses strongly and uniformly
indicate that human "races" cannot be represented as
branches on an evolutionary tree as under the candelabra
models, even by invoking ad hoc admixture events. Genetic
distances, when properly analyzed, undermine the
biological validity of human races as evolutionary lineages.
Haplotype Trees
The final type of genetic evidence to be considered is
that arising from phylogenetic reconstructions of the genetic
variation found in homologous regions of DNA that
show little or no recombination. All the homologous copies
of DNA in such a DNA region that are identical at
every nucleotide (or in practice, identical at all scored nucleotide
sites) constitute a single haplotype class. A mutation
at any site in this DNA region will usually create a
new haplotype that differs initially from its ancestral
haplotype by that single mutational change. As time proceeds,
some haplotypes can acquire multiple mutational
changes from their ancestral type. All the different copies
of a haplotype for each of the haplotypes in a species are
subject to mutation, resulting in a diversity of haplotypes
in the gene pool that vary in their mutational closeness to
one another. If there is little or no recombination in the
DNA region (as is the case for human mitochondrial DNA
or for small segments of nuclear DNA), the divergence of
haplotypes from one another reflects the order in which
mutations occurred in evolutionary history. When mutational
accumulation reflects evolutionary history, it is
possible to estimate a network that shows how mutational
changes transform one haplotype into another or from
some common ancestral haplotype. Such a network represents
an unrooted evolutionary tree of the haplotype variation
in that DNA region and is called a haplotype tree. In
some circumstances, the ancestral haplotype is known or
can be inferred, thereby providing a rooted haplotype tree.
In practice, haplotype trees are sometimes difficult to infer
from the mutational differences among a set of observed
haplotypes because the same mutation may have occurred
more than once, thereby destroying the relationship between
mutational state and evolutionary history, and/or
recombination may have scrambled up the DNA region so
thoroughly that accumulated mutational differences reflect
both evolutionary history and recombination in a
confounded fashion. When they can be estimated, haplotype
trees directly reflect only the evolutionary history of
the genetic diversity being monitored in the DNA region
under study. Haplotype trees are not necessarily evolutionary
trees of species nor of subpopulations within species.
For example, suppose a species is and always has
been completely randomly mating as a single population
and therefore has no subpopulation evolutionary history at
all; yet that same randomly mating species will have
haplotype trees for all homologous DNA regions that
show little or no recombination.
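The definitions in this paragraph can be made concrete with a small sketch. It assumes aligned sequences of equal length, no recurrent mutation, and no recombination, so that haplotypes one mutational step apart can simply be linked; real haplotype trees are estimated with more careful methods. The sequences and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_classes(sequences):
    """Count copies of each distinct sequence; each is one haplotype class."""
    return Counter(sequences)


def mutational_steps(h1, h2):
    """Number of scored sites at which two aligned haplotypes differ."""
    return sum(a != b for a, b in zip(h1, h2))


def one_step_links(haplotypes):
    """Pairs of haplotypes separated by a single mutational change."""
    return [(a, b) for a, b in combinations(sorted(haplotypes), 2)
            if mutational_steps(a, b) == 1]


# Hypothetical sample of five DNA copies scored at four sites.
sample = ["AAAA", "AAAA", "AAAT", "AATT", "CAAA"]
classes = haplotype_classes(sample)
print(classes)                  # four haplotype classes; "AAAA" sampled twice
print(one_step_links(classes))  # links forming an unrooted haplotype tree
```

Here the three one-step links connect the four haplotypes into a single unrooted network. Nothing in the computation refers to populations, which is the point made above: a haplotype tree exists even for a single randomly mating population.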
The publication of mitochondrial haplotype trees
(Cann et al. 1987; Vigilant et al. 1991) motivated much of
the current debate over recent human evolution. These and
subsequent papers (Stoneking 1997) on mitochondrial
haplotype trees make a threefold argument in favor of the
recent candelabra model: (1) all mtDNA types in current
human populations can be traced back to a single common
ancestor (mitochondrial "Eve"), (2) the root of the mitochondrial
tree is in Africa, and (3) the tree coalesces to its
common ancestral type about 200,000 years ago. Although
the original haplotype trees were estimated incorrectly
because of an improper use of a computer program
(Maddison 1991; Templeton 1992), this error is trivial in
light of the fact that these three arguments are noninformative
about the status of human populations as evolutionary
lineages and therefore do not discriminate between the
candelabra and trellis models (Templeton 1994a). Point
(1) is a universal for all models of human (and indeed, nonhuman)
evolution because all homologous segments of
DNA are expected to coalesce to a common ancestral
molecule under any model of evolution in a finite population
(Tavare et al. 1997). Indeed, haplotype trees would
not exist at all if this were not true. With respect to point
(2), the trellis model is compatible with any root location
occupied by humans at the time of coalescence, which includes
Africa. Because the bulk of humanity lived in Africa
hundreds of thousands of years ago (as previously
noted), an African root is the most likely result under the
trellis model. Argument (3) is based on the premise that
mitochondrial DNA can spread only when populations
expand geographically, so mitochondrial DNA either
spread with Homo erectus (a million years ago or more) or
with the presumed spread of anatomically modern humans
about 100,000 years ago (Cann et al. 1987; Stoneking
1997; Vigilant et al. 1991). This premise equates the
mitochondrial haplotype tree to a population tree. Haplotype
trees may or may not reflect population history (indeed,
as pointed out above, there may be no population
history at all), and this proposition needs to be tested rather
than assumed. In particular, when dealing with populations
that are exchanging genes (the premise of the trellis
model), a haplotype can spread geographically at any time
via gene flow. Hence, a coalescence time of 200,000 years
ago is compatible with either model of human evolution
(Templeton 1994a).
A fourth argument, not present in the original "Eve" papers
but related to mtDNA coalescence time, is that the human
population size at the time of coalescence was too
small to be compatible with the trellis model (Rogers
1997). Under neutrality, the expected coalescence time of
mtDNA is 2N_f generations, where N_f is the inbreeding effective
size of females. Assuming a coalescence time of
200,000 years ago and a generation length of 20 years
yields N_f = 5,000. More complicated coalescent models
yield different estimates, but all are on the order of thousands
for N_f (Rogers 1997). N_f is not the census size of females.
In general, effective sizes are much smaller than
census size. For example, in conservation biology it is
standard to assume that the effective size is only one-fifth
the census size for large-bodied mammals. This fivefold
correction factor from conservation biology assumes a
stable or declining census size, but when population sizes
are increasing, as seems to be the case for humans over the
past hundred thousand years or so, inbreeding effective
size can be orders of magnitude smaller than census size or
other effective sizes, such as the variance effective size
(Templeton 1980). Hence, a fivefold correction for inbreeding
effective size to census size is undoubtedly conservative
for recent human evolution. Moreover, the census
size should be doubled to include males. Thus, the
estimate of N_f = 5,000 implies a census size of 50,000 humans
or more. Also, coalescence time is not known to be
exactly 200,000 years but rather has a broad confidence
interval due to a lack of precise knowledge about the neutral
mutation rate and evolutionary stochasticity (Tavare
et al. 1997; Templeton 1993). Using the full range of ambiguity
given in Templeton (1993), population sizes up to
200,000 cannot be excluded. Moreover, since 1993, the
ambiguity on the mtDNA mutation rate has actually increased
(Arnason et al. 1996; Howell et al. 1996; Parsons
et al. 1997), taking the upper limits of the confidence range
close to a population size of 500,000. All of these calculations
depend upon the assumption of neutrality. Deleterious
mutations will cause this procedure to underestimate
effective size, and such mutations are known to exist (Hey
1997; Nachman et al. 1996; Templeton 1996). Therefore,
all of the above calculations are lower bounds given this
demonstrated violation of assumptions. More importantly,
even a single advantageous mutation occurring
anywhere within the mtDNA genome at any time during
the past few hundreds of thousands of years of human evolution
will make the effective size estimator quantitatively
meaningless (Rogers 1997). Given the broad confidence
ranges associated with this estimation procedure and its
extraordinary sensitivity to deviations from neutrality, it
is patent that the population size argument does not discriminate
among the alternatives.
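The arithmetic behind the point estimate in this paragraph can be written out explicitly. The sketch below only restates the calculation given in the text (expected coalescence time of 2N_f generations, a fivefold effective-to-census correction, and doubling to include males); the function and parameter names are hypothetical, and the result is the point estimate only, without the broad confidence range the text emphasizes.

```python
def female_effective_size(coalescence_years, generation_years):
    """Solve 2 * N_f generations = coalescence time for N_f."""
    generations = coalescence_years / generation_years
    return generations / 2


def implied_census_size(n_f, effective_to_census=5, sexes=2):
    """Scale the female effective size up to a census size of both sexes."""
    return n_f * effective_to_census * sexes


n_f = female_effective_size(200_000, 20)
print(n_f)                       # 5000.0
print(implied_census_size(n_f))  # 50000.0
```

Because the estimate is linear in the assumed coalescence time, any uncertainty in that time (for example, from the mutation rate) carries through proportionally to the implied census size.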
Fortunately, there is much information in haplotype
trees that can be used to test the hypothesis that human
"races" are evolutionary sublineages whose past purity
has been somewhat diminished by admixture. For example,
in order to reconcile the candelabra model with the genetic
distance data, it is necessary to regard Europeans as a
heavily admixed population (Bowcock et al. 1991;
Cavalli-Sforza et al. 1996). When admixture occurs,
haplotypes that differ by multiple mutational events with
no existing intermediate haplotypes should coexist in the
admixed population's gene pool (Manderscheid and Rogers
1996; Templeton et al. 1995). The detection of such
highly divergent haplotypes requires large sample sizes of
the presumed admixed population in order to have statistical
power. When large sample surveys have been performed
upon the presumed admixed European populations,
no highly divergent haplotypes or evidence for
admixture are observed for either mtDNA (Manderscheid
and Rogers 1996) or Y-DNA (Cooper et al. 1996). In contrast,
isolation by distance (the trellis model) produces
gene pools without strongly divergent haplotypes (i.e.,
most haplotypes differ by one or at most just a few mutational
steps from some other haplotype found in the same
population), as is observed.
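One simple way to look for the signature described above, divergent haplotypes coexisting with no intermediates, is to find the largest gap that must be crossed to connect all haplotypes sampled from one population. The sketch below does this with a minimum spanning tree over mutational-step distances. It is an illustrative summary statistic, not the method of Manderscheid and Rogers (1996) or Templeton et al. (1995), and the sequences are hypothetical.

```python
def steps(h1, h2):
    """Number of scored sites at which two aligned haplotypes differ."""
    return sum(a != b for a, b in zip(h1, h2))


def largest_gap(haplotypes):
    """Longest edge of a minimum spanning tree joining all haplotypes.

    A value of 1 or 2 means every haplotype is a few steps from another
    in the sample; a large value means two divergent groups coexist with
    no intermediates between them.
    """
    haps = sorted(set(haplotypes))
    in_tree = {haps[0]}
    gaps = []
    while len(in_tree) < len(haps):
        # Prim's algorithm: attach the closest haplotype not yet connected.
        d, nxt = min((steps(a, b), b)
                     for a in in_tree for b in haps if b not in in_tree)
        in_tree.add(nxt)
        gaps.append(d)
    return max(gaps) if gaps else 0


# Hypothetical samples scored at six sites.
single_lineage = ["AAAAAA", "AAAAAT", "AAAATT"]
two_lineages = ["AAAAAA", "AAAAAT", "CCCCAA", "CCCCAT"]
print(largest_gap(single_lineage))  # 1
print(largest_gap(two_lineages))    # 4
```

A nearest-neighbor distance alone would miss the second case, because each divergent haplotype still has a close relative within its own group; the spanning-tree gap captures the missing intermediates between the groups.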
The candelabra and trellis hypotheses are models of
how genes spread across geographical space and through
time, and hence a geographical analysis of haplotype trees
provides a direct test of these two models. Statistical techniques
exist that separate the influences of historical
events (such as population range expansions) from recurrent
events (such as gene flow with isolation by distance)
when there is adequate sampling both in terms of numbers
of individuals and of numbers and distribution of sampling
sites (Templeton et al. 1995). This statistical approach
first converts the haplotype tree into a nested statistical
design. The lowest level of analysis is the haplotypes
themselves, and the first level of nesting is created by
starting at the tips of the haplotype network and moving
one mutational step in, forming a union of any haplotypes
that are reached by such a single mutational step or that
converge upon a common node. This first set of "1-step
clades" (Templeton et al. 1987) on the tips of the haplotype
network is then pruned off and the process repeated
until all haplotypes are included in 1-step clades. Now one
has a tree of 1-step clades, and this tree can be nested into
"2-step clades" using exactly the same nesting rules, but
using 1-step clades instead of haplotypes as the base unit.
These nesting rules are used at successively higher levels
until the next level of nesting would place the entire original
haplotype tree into a single clade (for more details, see
Templeton and Sing 1993).
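The tip-inward nesting procedure described above can be sketched as follows. This is a simplified reading of the rules as stated in the text, not the full published algorithm: it assumes the network is a tree with no loops, and a node left stranded between pruned clades is placed in its own clade here, whereas the published rules (Templeton and Sing 1993) deal with such cases in more detail. The function names and the example network are hypothetical.

```python
def one_step_clades(adj):
    """Nest the nodes of a haplotype tree into clades, working in from the tips.

    adj maps each node to the set of nodes one mutational step away.
    """
    remaining = set(adj)
    clades = []
    while remaining:
        # Tips have at most one neighbor left in the unpruned part of the tree.
        tips = sorted(n for n in remaining if len(adj[n] & remaining) <= 1)
        if not tips:
            raise ValueError("network contains a loop; resolve it before nesting")
        groups = {}
        for tip in tips:
            inward = adj[tip] & remaining
            # Move one step in; tips that meet at the same node are united.
            anchor = min(inward) if inward else tip
            groups.setdefault(anchor, {anchor}).add(tip)
        new = sorted({frozenset(g) for g in groups.values()}, key=sorted)
        clades.extend(set(c) for c in new)
        for c in new:
            remaining -= c  # prune the clades just formed and repeat
    return clades


def collapse(adj, clades):
    """Tree whose nodes are clade indices; clades touch if any members do."""
    owner = {h: i for i, c in enumerate(clades) for h in c}
    tree = {i: set() for i in range(len(clades))}
    for h, nbrs in adj.items():
        for n in nbrs:
            if owner[h] != owner[n]:
                tree[owner[h]].add(owner[n])
    return tree


# Hypothetical haplotype tree: a chain A - B - C - D - E.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
         "D": {"C", "E"}, "E": {"D"}}
level1 = one_step_clades(chain)
print(level1)
print(one_step_clades(collapse(chain, level1)))
```

Applying `one_step_clades` to the collapsed tree gives the next nesting level in terms of clade indices. In this small example the second level already places everything in one clade, which is the stopping condition described in the text.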
The age of a higher order nesting clade has to be as old
or older than the clades nested within it. Thus, even in the
absence of a root for the haplotype tree, the nested design
provides relative age information. By studying how a series
of nested clades is distributed in space, it is therefore
possible to make inferences about how haplotype lineages
spread geographically through time. Moreover, the geographical
range of a clade relative to that of the other
clades it is nested with at the next higher level indicates
how far spatially a haplotype lineage can spread during the
time it takes to accumulate a single mutation. Hence, the
nested design based on the haplotype tree automatically
adds a temporal dimension to the spatial data gathered
with the sample of current haplotypes. It is therefore possible
to reconstruct the historical dynamics of the geographical
spread of haplotype lineages, with the dynamical
resolution being limited by the average amount of time
it takes a lineage to accumulate a single mutation. Moreover,
by making the analysis nested, no assumption of homogeneity
is being made about how lineages spread geographically
over time. That is, at one time or place,
haplotype lineages may have spread through gene flow restricted
by geographical distance; atanothertime orplace,
there may have been a rapid range expansion; and at yet
another time or place, all genetic interchange between two
geographical regions may have been severed. The nested
analysis does not exclude any of these possibilities a priori,
but rather regards all of them (or any mixture) as legitimate
factors influencing the movement of haplotype lineages
through time and space (Templeton et al. 1995). This
statistical approach therefore treats historical and recurrent
events as joint possibilities rather than as mutually exclusive
alternatives.
These different factors, however, leave different signatures
in the nested analyses. If gene flow restricted by isolation
by distance dominated during the place and time
when a certain subset of mutations occurred, then the older
clades defined by these mutations should be more widespread
and the younger but evolutionarily close clades
should be in the same general area as the older clades. This
expectation follows from the simple fact that under isolation
by distance, genes spread only a little every generation,
and the longer a gene lineage exists, the more generations
it has to spread geographically and to accumulate
additional mutations. If two geographical regions split
from one another (i.e., severed genetic interchange), then
the clades that mark those geographical regions and that
time of isolation would accumulate many mutational differences
but without movement into each other's space.
Finally, if a subset of the original population (containing
only a subset of the haplotype variation that existed at that
time) suddenly expanded into and colonized a new geographical
region, then the subset of haplotypes they carried
and the lineages derived from them would have widespread
geographical distributions for their frequency
relative to the population as a whole. Thus, gene flow and
different historical events leave distinct genetic-spatial
signatures in a nested analysis and are thereby distinguishable.
Moreover, the areas affected by these forces and
events can be inferred, as can their time relative to the
nested design of the haplotype tree.
The ability to discriminate the genetic signatures of
range expansions from recurrent but restricted gene flow
is critical to discriminating the candelabra from the trellis
models and thereby inferring the evolutionary validity of
race. The criteria used to identify range expansions in this
nested approach have been empirically validated by analyzing
data sets with strong prior evidence of range expansion
and were found to be accurate and not prone to false
positives (Templeton 1998a). Application of this statistical
approach to human mtDNA haplotype trees yields the
significant results summarized in Figure 6 (Templeton
1993, 1997b, 1998a).
As shown in Figure 6, human mtDNA yields a pattern
of isolation by distance between Africans and Eurasians
throughout the entire time period marked by mtDNA coalescence
(Templeton 1993, 1997b), thereby significantly
rejecting both the candelabra hypothesis of no gene flow
between Africans and non-Africans and the admixture
models used to reconcile the candelabra models with the
genetic distance data. Recurrent gene flow in this analysis
is relative to the time scale defined by the coalescence and
mutation rates of mtDNA, so gene flow among Old World
human populations could have been sporadic on a time
scale of several tens of thousands of years.
Figure 6 also reveals that range expansions played a significant
role in recent human evolution. Among the statistically
significant range expansions is a relatively recent
range expansion across Europe (Templeton 1993, 1997b),
an inference supported by other mtDNA data sets (Calafell
et al. 1996; Comas et al. 1997; Francalacci et al.
1996). A recent study on mtDNA isolated from a Neandertal
(Krings et al. 1997) is suggestive (but not conclusive
as the sample size is one) that Neandertals were replaced
in Europe. This inference is compatible with the
statistically significant European expansion shown in Figure
6, but further data are obviously needed to determine if
this recent European expansion event was also a replacement
event. The other recent expansions (into northern
Asia, the Pacific, and the Americas) appear to be range expansions
into previously unoccupied areas.
Figure 6. Statistically significant inferences from geographical analyses of human mtDNA haplotype trees. As far back as is observable with
mtDNA, there was gene flow restricted by isolation by distance in human populations living in Africa and southern Eurasia. More recent
statistically significant range expansion events are indicated by wide arrows. There were expansions into Europe, northern Asia, the Pacific, and
the Americas. Two arrows are indicated going into North America because this expansion either involved a colonization event with a large
number of people, an extended colonization, or at least two separate colonization events. The lines drawn through these arrows indicate that after
the colonization there was a significant reduction, perhaps cessation, of gene flow between Asia and North America. After the colonization of
North America, there were further significant expansions into the remainder of the Americas. After these expansion events, there is statistically
significant gene flow once again. Most of this postexpansion gene flow fits the expectations of isolation by distance, but some postexpansion gene
flow occurred through long-distance interchanges.
Label within Figure 6: "Restricted Gene Flow After Range Expansion Mostly Through Isolation by Distance But With Some Recurrent Long Distance Interchange."
Genetic interchange between Africans and Eurasians
over long periods of human evolutionary history is also
strongly suggested by a hemoglobin beta locus tree (Harding
et al. 1997). The coalescence of an autosomal gene is
expected to be about four times as old as that of mtDNA or
Y-DNA, and this seems to be the case for the beta locus
(Harding et al. 1997). Consequently, the patterns of widespread
gene flow across Africa and Asia observed with the
hemoglobin locus predate the hypothesized "replacement"
event of the recent candelabra model (Harding et al.
1997). Obviously, if such a replacement had occurred,
these earlier genetic signatures of gene flow should have
been obliterated.
To reinforce these conclusions, the hemoglobin beta locus
data of Harding et al. (1997) were subjected to a nested
clade analysis of geographical associations (Templeton et
al. 1995). First the estimated haplotype tree is converted
into a series of nested branches (clades) (Templeton et al.
1987; Templeton and Sing 1993). Figure 7 shows the hemoglobin
haplotype network of Harding et al. (1997),
along with the nested statistical design. Once the haplotype
tree has been converted into a nested statistical design,
the geographical data are quantified in two main
fashions (Templeton et al. 1995): the clade distance, Dc,
which measures the geographical range of a particular
clade; and the nested clade distance, Dn, which measures
how a particular clade is geographically distributed relative
to its closest evolutionary sister clades (i.e., clades in
the same next higher-level nesting category). Contrasts in
these distance measures between older and younger
clades are important in discriminating the potential causes
of geographical structuring of the genetic variation (Templeton et al.
1995), as discussed above. In this case, temporal
polarity is determined by an outgroup analysis that indicates
that haplotype B3 in Figure 7 is the root (Harding
et al. 1997) in addition to the polarity inherent in the nested
design itself. The statistical significance of the different
distance measures and the old-young contrasts are determined
by random permutation testing that simulates the
null hypothesis of a random geographical distribution for
all clades within a nesting category given the marginal
clade frequencies and sample sizes per locality. Figure 8
presents the results of this nested clade analysis of geographical
distributions.
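The two distance measures and the permutation test can be sketched as follows. The definitions used here are assumptions based on a reading of Templeton et al. (1995): Dc is taken as the mean distance of individuals in a clade from that clade's geographical center, and Dn as their mean distance from the center of the whole nesting category. Planar coordinates stand in for geographical locations, and all sample data and function names are hypothetical.

```python
import random
from math import hypot


def center(points):
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def mean_distance(points, origin):
    return sum(hypot(x - origin[0], y - origin[1]) for x, y in points) / len(points)


def clade_distances(samples, clade):
    """Return (Dc, Dn) for one clade within a nesting category.

    samples is a list of (clade_label, (x, y)), one entry per individual
    sampled in the nesting category.
    """
    own = [xy for c, xy in samples if c == clade]
    everyone = [xy for _, xy in samples]
    d_c = mean_distance(own, center(own))       # spread of the clade itself
    d_n = mean_distance(own, center(everyone))  # position relative to sister clades
    return d_c, d_n


def p_small_dc(samples, clade, n_perm=999, seed=0):
    """Permutation p-value that Dc is as small as observed, or smaller.

    Shuffling clade labels over individuals keeps the clade frequencies and
    the number of individuals per locality fixed, which matches the null
    hypothesis of a random geographical distribution described in the text.
    """
    rng = random.Random(seed)
    observed = clade_distances(samples, clade)[0]
    labels = [c for c, _ in samples]
    coords = [xy for _, xy in samples]
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(labels)
        if clade_distances(list(zip(labels, coords)), clade)[0] <= observed:
            hits += 1
    return hits / (n_perm + 1)


# Hypothetical sample: a young clade confined to one site, an old clade spread out.
samples = [("young", (0, 0))] * 5 + [("old", (x, 0)) for x in range(0, 70, 10)]
print(clade_distances(samples, "young"))
print(clade_distances(samples, "old"))
print(p_small_dc(samples, "young"))
```

The same permutation loop could be extended to Dn and to the old-versus-young contrasts; they are omitted here to keep the sketch short.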
Figure 7. The hemoglobin haplotype network of Harding et al. (1997), along with the nested statistical design. Haplotype designations are those
given in Harding et al. (1997). Nested groupings above the haplotype level are designated by "C-N," where C is the nesting level of the clade and
N is the number of a particular clade at a given nesting level. Boxes with thin lines nest together haplotypes into 1-step clades, and boxes with
thick lines nest together 1-step clades into 2-step clades.
<<SNIP>>
Figure 8. Results of the nested geographic analysis of the human beta
chain hemoglobin haplotypes. The nested design is given in Figure 7,
as are the haplotype and clade designations. Following the name or
number of any given clade are the clade and nested clade distances.
The oldest clade within a nested group is indicated by shading. The
average difference between the oldest and younger clades within a
nesting category (as determined by B3 being the root) for both
distance measures is given in the row below a dashed line labeled
"O-Y." A superscript S means that the distance measure is significantly
small at the 5% level, and SS, at the 1% level. Similarly, a
superscript L means that the distance measure is significantly large at
the 5% level, and LL, at the 1% level. At the bottom of the boxes that
indicate a nested set of clades in which one or more of the distance
measures is significantly large or small is a line indicating the biological
inference. The numbers refer to the sequence of questions in
the TRP key that the pattern generated, followed by the answer to the
final question in the TRP key. Following this answer is the biological
inference generated by use of the TRP key, where RE is range
expansion and IBD is recurrent gene flow restricted by isolation by
distance.
The statistically significant patterns shown in Figure 8
need to be interpreted biologically. In order to make inference
explicit and consistent, a detailed inference key is provided
as an appendix to Templeton et al. (1995) (hereafter
referred to as the TRP key). Templeton (1998a) gives an
empirical validation of this key. This key provides for the
objective and systematic identification of the distinct signatures
associated with isolation by distance, fragmentation,
and range expansion that were described qualitatively
above. Moreover, the key also identifies the
artifacts that can emerge from inadequate geographical
sampling. Consequently, not all rejections of the null hypotheses
can be interpreted biologically. Figure 8 shows
the resulting inferences.
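The significance superscripts in Figure 8 come from testing whether a clade is more geographically restricted, or more widespread, than expected by chance. The following is only a minimal sketch of that general idea, not the procedure or software used for the analysis: it assumes planar coordinates with Euclidean distances, a plain shuffling of clade labels over sampled individuals, and a toy data set. The function names, the sample, and the add-one p-value correction are all assumptions introduced for illustration.

```python
import math
import random

def clade_distance(points):
    """Mean Euclidean distance of a clade's sampled individuals from their centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)

def significance_label(locations, clades, target, n_perm=999, seed=0):
    """Shuffle clade labels over individuals; return '', 'S', 'SS', 'L' or 'LL'."""
    rng = random.Random(seed)

    def distance_for(labels):
        return clade_distance([loc for loc, c in zip(locations, labels) if c == target])

    observed = distance_for(clades)
    shuffled = list(clades)
    n_small = n_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted = distance_for(shuffled)
        n_small += permuted <= observed
        n_large += permuted >= observed

    # Add-one correction (an assumption) so that a p-value is never exactly zero.
    p_small = (n_small + 1) / (n_perm + 1)
    p_large = (n_large + 1) / (n_perm + 1)
    if p_small <= 0.01:
        return "SS"
    if p_small <= 0.05:
        return "S"
    if p_large <= 0.01:
        return "LL"
    if p_large <= 0.05:
        return "L"
    return ""

# Hypothetical toy sample: clade "A" is confined to one site, clade "B" to two others.
locations = [(0.0, 0.0)] * 6 + [(10.0, 0.0)] * 6 + [(20.0, 0.0)] * 6
clades = ["A"] * 6 + ["B"] * 12
print(significance_label(locations, clades, "A"))
```

A label produced this way is only a statistical statement. As the text notes, a biological reading still requires working through the inference key, and some significant results reflect inadequate sampling rather than history.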
In comparing the mtDNA (Figure 6) and hemoglobin
(Figure 8) inferences, it is important to keep two factors in
mind. First, these two haplotype trees are detecting events
on different time scales. In particular, the time depth of the
hemoglobin network has a 95% confidence interval of
400,000 to 1,300,000 years ago (Harding et al. 1997).
Once ultimate coalescence has occurred in a haplotype
tree, there is no information about previous events or evolutionary
forces. Therefore, the older events and forces
detected in the hemoglobin analysis would be completely
invisible to the mtDNA analysis. The oldest event detected
in the hemoglobin analysis is an out-of-Africa
range expansion found among 2-step clades as nested
within the entire haplotype tree, and which therefore must
have occurred close to the time depth of the entire tree.
This out-of-Africa expansion event is obviously too old to
be the one postulated by the recent candelabra model. Because
it spans the entire time depth of the hemoglobin
haplotype tree, there is no information at all about the pre-
expansion population. Hence, this old out-of-Africa expansion
could have been a colonization event of empty areas, a
replacement event, or a hybridization event in which
new migrants interbred with previous Eurasian inhabitants.
There is simply no way of knowing. After this expansion,
gene flow clearly occurred among Africans and
Eurasians as constrained by isolation by distance as
shown by the 1-step clades nested within both 2-step
clades (Figure 8). The mutations defining these mid-level
clades are expected to be > 200,000 years old (Harding et
al. 1997). Given that the mtDNA shows recurrent gene
flow with isolation by distance certainly for times
< 200,000 years ago (Figure 6), the two data sets jointly
imply a long time span of recurrent genetic contact among
the major Old World human populations.
An out-of-Asia expansion event is detected within
clade 1-5 in the hemoglobin analysis (Figure 8). One of the
critical mutations defining this expansion event (the mutation
on the branch between C3 and C2) has an estimated
age of 137,000 ± 81,500 years and a second critical mutation
(the one defining C7) of 69,000 ± 48,000 years
(Harding et al. 1997). If this out-of-Asia expansion is
older than 100,000, then it would be impossible for a complete
genetic replacement of the ancestral Asian population
to have occurred by Africans 100,000 years ago. If
this out-of-Asia expansion is younger than 100,000, then
there was genetic interchange between Asians and Africans,
and therefore no "split" between Africans and Eurasians
100,000 years ago. Another range expansion is
found within clade 1-2 (Figure 8), and the geographical
distribution of the haplotypes implies that this is an
out-of-Africa expansion. Because clade 1-2 includes the very
oldest haplotypes, this may simply be a reflection of the
old out-of-Africa range expansion detected among the 2-step
clades. However, the significant effect of the old
haplotype B3 may in this case be due in part to the nonsignificant
but widespread distribution of haplotype B1. The
mutation defining B1 has an estimated age of about
152,000 years, but its confidence interval spans virtually
the entire past 300,000 years (Harding et al. 1997). Hence,
the clade 1-2 inference may represent a more recent
out-of-Africa expansion occurring sometime in the last
300,000 years. Even if clade 1-2 represents a recent
out-of-Africa expansion event, it certainly is not a replacement
event. A true replacement event at about 100,000
years ago would have obliterated all evidence for older
gene flow; yet the clade 1-2 out-of-Africa event is nested
within a pattern of significant gene flow with isolation by
distance (clade 2-1). There is no obvious way to reconcile
the hemoglobin data with a recent out-of-Africa replacement
event.
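The dilemma posed above for the out-of-Asia expansion can be checked with simple arithmetic. The sketch below takes the two mutation ages quoted from Harding et al. (1997) and asks whether each range includes the 100,000-year date. Treating "±" as a symmetric interval around the estimate is an assumption made here for illustration, and the names are hypothetical.

```python
# Point estimates and "±" ranges as quoted in the text (Harding et al. 1997).
# Assumption: "±" is read as a symmetric interval around the estimate.
MUTATION_AGES = {
    "C3-C2 branch": (137_000, 81_500),
    "C7": (69_000, 48_000),
}
REPLACEMENT_DATE = 100_000  # years ago, the date discussed in the text

def age_interval(estimate, half_width):
    """Return the (youngest, oldest) age in years implied by estimate ± half_width."""
    return estimate - half_width, estimate + half_width

def straddles(interval, date):
    """True if the date lies strictly inside the interval."""
    youngest, oldest = interval
    return youngest < date < oldest

for name, (estimate, half_width) in MUTATION_AGES.items():
    interval = age_interval(estimate, half_width)
    print(name, interval, straddles(interval, REPLACEMENT_DATE))
```

Under this reading both ranges include 100,000 years, so the data cannot say on which side of that date the expansion falls. That is why the argument is framed as a dilemma: either possibility is incompatible with a complete replacement or a clean split at 100,000 years ago.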
The second factor to keep in mind when comparing the
mtDNA with the hemoglobin analysis is the level of dynamic
resolution. The nested clade analysis can only detect
population events and recurrent forces that are
marked by mutational changes (Templeton 1998a).
MtDNA is evolving much more rapidly than the hemoglobin
locus, and the attendant haplotype trees are far more
resolved for mtDNA than for hemoglobin. Consequently,
the hemoglobin analysis is on both an older and a coarser
time scale than the mtDNA. Therefore, the most recent
events and forces detected in the mtDNA analysis would
be invisible to the hemoglobin analysis. This explains why
the hemoglobin analysis does not detect the more recent
range expansions revealed by the mtDNA (Figure 6):
there are simply no or too few mutations in the hemoglobin
data to mark these recent expansion events. Hence, the
hemoglobin and mtDNA analyses are complementary,
not contradictory.
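The resolution argument can be illustrated with a toy calculation. If mutations on a lineage are assumed to arrive as a Poisson process, the chance that a time window leaves no mutational mark is exp(-rate × time). The two rates below are hypothetical values chosen only to contrast a fast-evolving and a slow-evolving region; they are not estimates for mtDNA or the hemoglobin locus, and the Poisson model itself is an assumption.

```python
import math

def prob_unmarked(rate_per_year, years):
    """Probability that no mutation occurs on a lineage over `years`,
    assuming mutations arrive as a Poisson process."""
    return math.exp(-rate_per_year * years)

# Hypothetical per-lineage mutation rates, NOT taken from the source.
FAST_RATE = 1e-4   # stands in for a rapidly evolving region
SLOW_RATE = 1e-6   # stands in for a slowly evolving region
WINDOW = 50_000    # years; an arbitrary "recent" window

fast = prob_unmarked(FAST_RATE, WINDOW)
slow = prob_unmarked(SLOW_RATE, WINDOW)
print(f"fast region unmarked: {fast:.3f}")
print(f"slow region unmarked: {slow:.3f}")
```

With these illustrative numbers the slow region almost always passes through the window without a new mutation, so a recent event would leave no trace in its haplotype tree, while the fast region is very likely to record it.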
Finally, genetic interchange between Africans and Eurasians
is additionally suggested by a nested clade analysis
of a Y-DNA haplotype tree (Hammer et al. 1998). Interestingly,
a range expansion out of Africa and into
Eurasia is detected in this nested analysis. However, in
light of the mtDNA and hemoglobin results, this expansion
was not a replacement event, at least for the maternal
demographic component. Following this out-of-Africa expansion,
the nested analysis reveals a pattern of significant
recurrent gene flow restricted by isolation by distance, including
interchange between African and Eurasian populations.
Moreover, there was a subsequent range expansion
out of Asia and into Africa, as was also detected in the
hemoglobin analysis. The Y-chromosome therefore
shows more evidence of long-range population movements
than the mtDNA. One possible explanation for this
pattern is that males dispersed more than females during
long-range population movements. However, both
mtDNA and Y-DNA show recurrent gene flow with isolation
by distance interconnecting African and Eurasian
populations, indicating that both males and females have
dispersed during short-range migrations. Regardless,
there is clearly genetic interchange between Africans and
Eurasians due to a mixture of gene flow mediated by isolation
by distance and population movements. No genetic
split between Africans and Eurasians is found in the Y-DNA,
as was also true for the mtDNA and hemoglobin
beta region.
Combined, the mtDNA, Y-DNA, and hemoglobin data
sets reveal that human evolution from about a million
years ago to the last tens of thousands of years has been
dominated by two evolutionary forces: (1) population
movements and associated range expansions (perhaps
with some local replacements, but definitely with no
global replacement within the last 100,000 years), and (2)
gene flow restricted by isolation by distance. The only evidence
for any split or fragmentation event in human evolutionary
history within this time frame is the one detected
with mtDNA (Figure 6) involving the colonization of the
Americas (Templeton 1998a). However, this colonization
either was due to multiple colonization events or involved
movements by large numbers of peoples (Templeton
1998a), resulting in extensive sharing of genetic polymorphisms
between New World and Old World human populations.
Moreover, the genetic isolation between the Old and
New Worlds was brief and no longer exists. Other than
this temporary fragmentation event, the major human
populations have been interconnected by gene flow (recurrent
at least on a time scale of the order of tens of thousands
of years) during the last one to two hundred thousand
years. Gene flow may have been more sporadic
earlier, but multiple genetic interchanges certainly occurred
among Old World populations > 200,000 years
ago. Hence, the haplotype analyses of geographical associations
strongly reject the existence of evolutionary
sublineages of humans, reject the separation of Eurasians
from Africans 100,000 years ago, and reject the idea of
"pure races" in the past. Thus, human "races" have no biological
validity under the evolutionary lineage definition
of subspecies.
Conclusions
The genetic data are consistently and strongly informative
about human races. Humans show only modest levels
of differentiation among populations when compared to
other large-bodied mammals, and this level of differentiation
is well below the usual threshold used to identify subspecies
(races) in nonhuman species. Hence, human races
do not exist under the traditional concept of a subspecies
as being a geographically circumscribed population
showing sharp genetic differentiation. A more modern
definition of race is that of a distinct evolutionary lineage
within a species. The genetic evidence strongly rejects the
existence of distinct evolutionary lineages within humans.
The widespread representation of human "races" as
branches on an intraspecific population tree is genetically
indefensible and biologically misleading, even when the
ancestral node is presented as being at 100,000 years ago.
<<SNIP>>