Linguistic aspects of the Aryan non-invasion theory
Dr. Koenraad ELST
_____________________
PART I
Summary
It is widely assumed that linguistics
has provided the clinching evidence for the Aryan invasion theory (AIT)
and for a non-Indian homeland of the Indo-European (IE) language family.
Defenders of an "Out of India" theory (OIT) of IE expansion unwittingly
confirm this impression by rejecting linguistics itself or its basic
paradigms, such as the concept of IE language family. However, old
linguistic props of the AIT, such as linguistic paleontology or
glottochronology, have lost their credibility. On closer inspection,
currently dominant theories turn out to be compatible with an out-of-India
scenario for IE expansion. In particular, substratum data are not in
conflict with an IE homeland in Haryana-Panjab. It would however be rash
to claim positive linguistic proof for the OIT. As a fairly soft type of
evidence, linguistic data are presently compatible with a variety of
scenarios.
1. Preliminary remarks
1.1. Invasion vs. immigration
The theory whose linguistic evidence we
are about to discuss is widely known as the "Aryan invasion
theory" (AIT). I will retain this term even though some scholars object to
it, preferring the term "immigration" to "invasion". They argue that the
latter term represents a long-abandoned theory of Aryan warrior bands
attacking and subjugating the peaceful Indus civilization. This dramatic
scenario, popularized by Sir Mortimer Wheeler, had white marauders from
the northwest enslave the black aboriginals, so that "Indra stands
accused" of destroying the Harappan civilization. Only the extremist
fringe of the Indian Dalit (ex-Untouchable) movement and its Afrocentric
allies in the US now insist on this black-and-white narrative (vide
Rajshekar 1987, Biswas 1995).
But for this once, I believe the
extremists have a point. North India's linguistic landscape leaves open
only two possible explanations: either Indo-Aryan was native, or it was
imported in an invasion. In fact, scratch any of these emphatic
"immigration" theorists and you'll find an old-school invasionist, for
they never fail to connect Aryan immigration with horses and spoked-wheel
chariots, i.e. factors of military superiority.
Immigration means a movement from one
country to another, without the connotation of conquest; invasion, by
contrast, implies conquest or at least the intention of conquest. To be
sure, invasion is not synonymous with military conquest; it may be that,
but it may also be demographic Unterwanderung. What makes an
immigration into an invasion is not the means used but the end achieved:
after an invasion, the former outsiders are not merely in, as in an
immigration, but they are also in charge. If the newcomers end up
imposing their (cultural, religious, linguistic) identity rather than
adopting the native identity, the result is the same as it would have been
in the case of a military conquest, viz. that outsiders have made the
country their own, and that natives who remain true to their identity
(such as Native Americans in the US) become strangers or second-class
citizens in their own country.
In the case of the hypothetical Aryan
invasion, the end result clearly is that North India got aryanized. The
language of the Aryans marginalized or replaced all others. In a popular
variant of the theory, they even reduced the natives to permanent
subjugation through the caste system. So, whether or not there was a
destructive Aryan conquest, the result was at any rate the humiliation of
native culture and the elimination of the native language in the larger
part of India. It is entirely reasonable to call this development an
"invasion" and to speak of the prevalent paradigm as the "Aryan invasion
theory".
As far as I can see, the supposedly
invading Aryans could only initiate a process of language replacement by a
scenario of elite dominance (that much is accepted by most
invasionists), which means that they first had to become the ruling class.
Could they have peacefully immigrated and then worked their way up in
society, somewhat like the Jews in pre-War Vienna or in New York? The
example given illustrates a necessary ingredient of peaceful immigration,
viz. linguistic adaptation: in spite of earning many positions of honour
and influence in society, the Jews never imposed their language like the
Aryans supposedly did, but became proficient in the native languages
instead. So how could these Aryan immigrants first peacefully integrate
into Harappan or post-Harappan society yet preserve their language and
later even impose it on their host society? Neither their numbers,
relative to the very numerous natives, nor their cultural level, as
illiterate cowherds relative to a literate civilization, gave them much of
an edge over the natives.
Therefore, the only plausible way for
them to wrest power from the natives must have been by their military
superiority, tried and tested in the process of an actual conquest.
Possibly there were some twists to the conquest scenario, making it more
complicated than a simple attack, e.g. some Harappan faction in a civil
war may have invited an Aryan mercenary army which, after doing its job,
overstayed its welcome and dethroned its employers. But at least some
kind of military showdown should necessarily have taken place. As things
now stand, the Aryan "immigration" theory necessarily implies the
hypothesis of military conquest.
1.2. The archaeological argument from
silence
In this paper, I will give a
sympathetic account of the prima facie arguments in favour of the
"Out of India" theory (OIT) of IE expansion. I am not sure that this
theory is correct, indeed I will argue that the linguistic body of
evidence is inconclusive, but I do believe that the theory deserves a
proper hearing. In the past, it didn't get one because the academic
establishment simply hadn't taken serious notice. Now that this has
changed for the better, it becomes clear that the all-important linguistic
aspect of the question has never been properly articulated by "Out of
India" theorists. The OIT invokes archaeological and textual evidence, but
doesn't speak the language of the IE linguists who thought up the AIT in
the first place. So now, I take it upon myself to show that the OIT need
not be linguistic nonsense.
But first, a glimpse of the
archaeological debate. In a recent paper, two prominent archaeologists,
Jim Shaffer and Diane Lichtenstein (1999), argue that there is absolutely
no archaeological indication of an Aryan immigration into northwestern
India during or after the decline of the Harappan city culture. It is odd
that the other participants in this debate pay so little attention to this
categorical finding, so at odds with the expectations of the AIT
orthodoxy, but so in line with majority opinion among Indian
archaeologists (e.g. Rao 1992, Lal 1998).
The absence of archaeological evidence
for the AIT is also admitted, with erudite reference to numerous recent
excavations and handy explanations of the types of evidence recognized in
archaeology, by outspoken invasionist Shereen Ratnagar (1999). It then
becomes her job to explain why the absence of material testimony of such a
momentous invasion need not rule out the possibility that the invasion
took place nonetheless. Thus, she mentions parallel cases of known yet
archaeologically unidentifiable invasions, e.g. the Goths in
late-imperial Rome or the Akkadians in southern Mesopotamia (Ratnagar
1999:222-223). So, in archaeology even more than elsewhere, we should not
make too much of an argumentum e silentio. To quote her own
conclusion: "We have found that the nature of material residues and the
units of analysis in archaeology do not match or fit the phenomenon we
wish to investigate, viz. Aryan migrations. The problem is exacerbated by
the strong possibility that simultaneous with migrations out of Eurasia
there were expansions out of established centres by
metallurgists/prospectors. Last, when we investigate pastoral land use in
the Eurasian steppe, we can make informed inferences about the nature of
Aryan emigration thence, which is a kind of movement very unlikely to have
had artefactual correlates." (1999:234)
It's against the stereotype of
overbearing macho invaders, but the Aryans secretively stole their way
into India, careful not to leave any traces.
1.3. Paradigmatic expectation as a
distortive factor
If the Aryan invasion does not stand
disproven by the absence of definite archaeological pointers, then neither
does an Aryan emigration from India. However, there is one difference.
Because several generations of archaeologists have been taught the AIT,
they have in their evaluation of new evidence tried to match it with the
AIT; in this, they have failed so far. However, it is unlikely that they
have explored the possibility of matching the new findings with the
reverse migration scenario. Psychologically, they must have been much
less predisposed to noticing possible connections between the data and an
out-of-India migration than the reverse.
This predisposition is also in evidence
in the debates over other types of evidence. Thus, in a recent internet
discussion about the genetic data, someone claimed that one study (unlike
many others) indicated an immigration of Caucasians into India for the 2nd
millennium BC. To be sure, archaeo-genetics is not sufficiently fine-tuned
yet to make that kind of chronological assertion, but even if we accept
this claim, it would only prove the AIT in the eyes of those who are
already conditioned by the AIT perspective. After all, a northwestern
influx into India in the 2nd millennium, while not in conflict with the
AIT, is not in conflict with the OIT either: the latter posits a
northwestern emigration in perhaps the 5th millennium BC, and has no
problem with occasional northwestern invasions in later centuries, such
as those of the Shakas, Hunas and Turks in the historic period.
Likewise, linguistic evidence cited in
favour of the AIT often turns out to be quite compatible with the OIT
scenario as well (as we shall see), but is never studied in that light
because so few people in the 20th century even thought of that
possibility. And today, even those who are aware of the OIT haven't
thought it through sufficiently to notice how known data may verify it.
1.4. The horse, argument from silence
In a recent paper, Hans Hock gives the
two arguments which have, all through the 1990s, kept me from giving
my unqualified support to the OIT. These are the dialectal distribution of
the branches of the IE language family, to be discussed below, and the
sparse presence of horses in Harappan culture. About the horse, he
summarizes the problem very well: "no archaeological evidence from
Harappan India has been presented that would indicate anything comparable
to the cultural and religious significance of the horse (...) which can be
observed in the traditions of the early IE peoples, including the Vedic
Aryas. On balance, then, the 'equine' evidence at this point is more
compatible with migration into India than with outward migration."
(1999:13)
B.B. Lal (1998:111) mentions finds of
true horse in Surkotada, Rupnagar, Kalibangan, Lothal, Mohenjo-Daro, and
terracotta images of the horse from Mohenjo-Daro and Nausharo. Many bones
of the related onager or half-ass have also been found, and one should not
discount the possibility that in some contexts, the term ashva
could refer to either species. Nevertheless, all this is still a bit
meagre to fulfil the expectation of a prominent place for the horse in an
"Aryan" culture. I agree with the OIT school that such paucity of horse
testimony may be explainable (cf. the absence of camel and cow
depictions, animals well-known to the Harappans, in contrast with the
popularity of the bull motif, though cows must abound when bulls are
around), but their case would be better served by more positive evidence.
On the other hand, the evidence is not
absolutely damaging to an Aryan Harappa hypothesis. Both outcomes remain
possible because other, reputedly Aryan sites are likewise poor in horses.
This is the case with the Bactria-Margiana Archaeological Complex,
surprisingly for those who interpret the BMAC as the culture of the
Indo-Aryans poised to invade India (Sergent 1997:161 ff.). It is also the
case for Hastinapura, a city dated by archaeologists at ca. 8th century
BC, when that part of India was very definitely Aryan (Thapar 1996:21).
So, the argument from near-silence regarding horse bones need not prove
absence of Aryans nor be fatal to the OIT, though it remains a weak point
in the OIT argumentation.
1.5. Evidence sweeping all before it
When evidence from archaeology and
Sanskrit text studies seems to contradict the AIT, we are usually
reassured that "there is of course the linguistic evidence" for this
invasion, or at least for the non-Indian origin of the IE family. Thus,
F.E. Pargiter (1962:302) had shown how the Puranas locate Aryan origins in
the Ganga basin and found "the earliest connexion of the Vedas to be with
the eastern region and not with the Panjab", but then he allowed the
unnamed linguistic evidence to overrule his own findings (1962:1): "We
know from the evidence of language that the Aryans entered India very
early". His solution is to relocate the point of entry of the Aryans from
the western Khyber pass to the eastern Himalaya: Kathmandu or thereabouts.
A common reaction among Indians against
this state of affairs is to dismiss linguistics altogether, calling it a
"pseudo‑science". Thus, N.S. Rajaram describes 19th-century comparative
and historical linguistics, which generated the AIT, as "a scholarly
discipline that had none of the checks and balances of a real science"
(1995:144), in which "a conjecture is turned into a hypothesis to be
later treated as a fact in support of a new theory" (1995:217).
Along the same lines, N.R. Waradpande
(1989:19-21) questions the very existence of an Indo-European language
family and rejects the genetic kinship model, arguing very briefly that
similarities between Greek and Sanskrit must be due to very early
borrowing. He argues that "the linguists have not been able to establish
that the similarities in the Aryan or Indo-European languages are genetic,
i.e. due to their having a common ancestry". Conversely, he also
(1993:14-15) rejects the separation of Indo-Aryan and Dravidian into
distinct language families, and alleges that "the view that the
South-Indian languages have an origin different from that of the
North-Indian languages is based on irresponsible, ignorant and motivated
utterances of a missionary" (meaning the 19th-century pioneer of
Dravidology, Bishop Robert Caldwell).
This rejection of linguistics by critics
of the AIT creates the impression that their own pet theory is not
resistant to the test of linguistics. Indeed, nothing has damaged their
credibility as much as this sweeping dismissal of a science praised in the
following terms by archaeologist David W. Anthony (1991:201-202): "It is
true that we can only work with relatively late IE daughter languages,
that we cannot hope to capture the full variability of PIE, and that
reconstructed semantic fields are more reliable than single terms. It is
also true that both the reconstructed terms and their meanings are
theories derived from systematic correspondences observed among the
daughter IE languages; no PIE term is known with absolute certainty.
Nevertheless, the rules that guide phonetic (and to a lesser extent,
semantic) reconstruction are more rigorous, have been more intensely
tested, and rest upon a more secure theoretical foundation than most of
the rules that guide interpretation in my own field of prehistoric
archaeology. Well-documented linguistic reconstructions of PIE are in many
cases more reliable than well-documented archaeological interpretations of
Copper Age material remains."
However, the fact that people fail to
address the linguistic evidence, preferring simply to excommunicate it
from the debate, does not by itself validate the prevalent interpretation
of this body of evidence. Rajaram's remark that scholars often treat mere
hypotheses (esp. those proposed by famous colleagues) as facts, as solid
data capable of overruling other hypotheses and even inconvenient new
data, is definitely valid for much of the humanities.
But then, while some linguists have
sometimes fallen short of the scientific standard by thus relying on
authority, it doesn't follow that linguistics is a pseudo-science. Nobody
can observe the Proto-Indo-Europeans live to verify hypotheses, yet
comparative IE linguistics does sometimes satisfy the requirement of
having predictions implicit in the theory verified by empirical
discoveries. Thus, some word forms reconstructed as the etyma of terms in
the Romance languages failed to show up in the classical Latin
vocabulary, but were finally discovered in the vulgar-Latin graffiti of
Pompeii. The most impressive example of this kind is probably the
identification of laryngeals, whose existence had been predicted in
abstracto decades earlier by Ferdinand de Saussure, in
newly-discovered texts in the Hittite language. We will get to see an
important sequel to the laryngeal verification below.
At the same time, some linguists are
aware that the AIT is just a successful theory, not a proven fact. One of
them told me that he had never bothered about a linguistic justification
for the AIT framework, because there was, after all, "the well‑known
archaeological evidence"! But for the rest, "the linguistic evidence" is
still the magic mantra to silence all doubts about the AIT. It is time we
take a look at it for ourselves.
2. The Indo-European landscape
2.1. Intuitive deductions from geography
There is, pace Misra 1992,
absolutely no reason to doubt the established refutation of the Indian
(and turn-of-the-19th-century European) belief that Sanskrit is the mother
of all IE languages, though Sanskrit remains in many respects closest to
PIE, as a standard textbook of IE testifies: "The distribution [of the two
stems as/s for "to be"] in Sanskrit is the oldest one" (Beekes
1990:37); "PIE had 8 cases, which Sanskrit still has" (Beekes 1990:122);
"PIE had no definite article. That is also true for Sanskrit and Latin,
and still for Russian. Other languages developed one" (Beekes 1990:125);
"[For the declensions] we ought to reconstruct the Proto-Indo-Iranian
first,... But we will do with the Sanskrit because we know that it has
preserved the essential information of the Proto-Indo-Iranian" (Beekes
1990:148); "While the accentuation systems of the other languages indicate
a total rupture, Sanskrit, and to a lesser extent Greek, seem to continue
the original IE situation" (Beekes 1990:187); "The root aorist... is still
frequent in Indo-Iranian, appears sporadically in Greek and Armenian, and
has disappeared elsewhere" (Beekes 1990:279).
All the same, Sanskrit has moved away
from PIE and the path can be mapped. Thus, you can explain Skt. jagâma
from PIE *gegoma as a palatalization of the initial velar (before
e/i) followed by the conflation of a/e/o to a, but
the reverse is not indicated and is close to impossible: palatalization is
a one-way process, attested in numerous languages on all continents
(including English, e.g. wicca > witch), while the opposite
shift is practically unknown. The kentum forms and the forms with
differentiated vowels as attested in Greek represent the original
situation, while the Sanskrit forms represent an innovation. This means
that Sanskrit is not PIE, that it has considerably evolved after
separating from the ancestor-languages of the other branches of IE.
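The role of rule ordering in this derivation can be made concrete with a small sketch. It is an illustration only, under loud simplifications: forms are given in plain ASCII, vowel length is ignored (so the output is jagama rather than jagâma), and the function names are hypothetical labels for the two changes named above, not a full statement of the sound laws.

```python
import re

# Assumption: simplified ASCII transcription, vowel length ignored.
PALATAL = {"k": "c", "g": "j"}


def palatalize(form: str) -> str:
    """Velars k, g become palatals c, j when a front vowel e or i follows."""
    return re.sub(r"[kg](?=[ei])", lambda m: PALATAL[m.group()], form)


def merge_vowels(form: str) -> str:
    """The vowels e and o fall together with a."""
    return re.sub(r"[eo]", "a", form)


def derive(form: str, rules) -> str:
    """Apply sound changes one after another, in the order given."""
    for rule in rules:
        form = rule(form)
    return form


# Palatalization first, vowel merger second: the order assumed in the text.
attested_order = derive("gegoma", [palatalize, merge_vowels])
# Vowel merger first: the front vowel that triggers palatalization is gone.
reversed_order = derive("gegoma", [merge_vowels, palatalize])

print(attested_order)  # jagama
print(reversed_order)  # gagama
```

Only the first ordering yields a palatal before a. Once the vowels have merged, nothing in the Sanskrit form itself tells the velar when to palatalize, which is why the differentiated vowels must be the older state.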
However, accepting the conventional
genealogical tree of the IE languages does not imply acceptance of their
conventional geography. When Sanskrit was dethroned in the 19th century
and the putative linguistic distance between PIE and Sanskrit
progressively increased, there was a parallel movement of the PIE homeland
away from India. Apart from linguistic considerations (chiefly linguistic
paleontology) and the political background (increased Eurocentrism at the
height of the colonial period), this was certainly also due to a more or
less conscious tendency to equate linguistic distance from PIE with
geographical distance from the Urheimat. That tendency has
persisted here and there all through the 20th century, e.g. Witold Manczak
(1992) deduces that the Urheimat must be in or near Poland from his
estimate that lexically, Polish is closest to PIE in that it is the IE
language with the fewest substratal borrowings.
Obviously, that type of reasoning must
be abandoned. It is perfectly possible for the most conservative language
to be spoken by a group of emigrants rather than by those who stayed
behind in the homeland. Indeed, according to the so-called Lateral Theory,
it is precisely in outlying settlement areas that the most conservative
forms will be found, while in the metropolis the language evolves faster.
That exactly is what the OIT posits regarding palatalization.
2.2. Kentum/satem
The first innovation acknowledged as
creating a distance between PIE and Sanskrit was the kentum > satem
shift. It was assumed, in my view correctly (pace Misra 1992), that
palatalization is a one‑way process transforming velars (k,g) into
palatals (c,j) but never the reverse; so that the velar or "kentum" forms
had to be the original and the palatal or "satem" forms the evolved
variants.
However, it would be erroneous to infer
from this that the homeland was in the kentum area. On the contrary, it is
altogether more likely that it was in what became satem territory, e.g. as
follows: India originally had the kentum form, the dialects which
emigrated first retained the kentum form and took it to the geographical
borderlands of the IE expanse (Europe, Anatolia, China), while the
last‑emigrated dialects (Armenian, Iranian) plus the staybehind
Indo‑Aryan languages had meanwhile adopted the satem form.
Moreover, the discovery of a small and
extinct kentum language inside India (Proto‑Bangani, with koto as
its word for "hundred"), surviving as a sizable substratum in the
Himalayan language Bangani, tends to support the hypothesis that the older
kentum form was originally present in India as well. This discovery was
made by the German linguist Claus Peter Zoller (1987, 1988, 1989). The
attempt by George van Driem and Suhnu R. Sharma (1996) to discredit Zoller
has been overruled by the findings made on the spot by Anvita Abbi (1998)
and her students. She has almost entirely confirmed Zoller's list of
kentum substratum words in Bangani. But as the trite phrase goes: this
calls for more research.
Zoller does not explain the presence of
a kentum language in India through an Indian Homeland Theory but as a
left-over of a pre-Vedic Indo-European immigration into India. He claims
that the local people have a tradition of their immigration from
Afghanistan. If they really lived in Afghanistan originally, their case
(and their nuisance value for the AIT) isn't too different from that of
the Tocharians, another kentum people showing up in unexpected quarters.
But if even the Vedic poets could not recall the invasion of their
grandfathers into India (Vedic literature doesn't mention it anywhere,
vide Elst 1999:164-171), what value should we attach to a tradition of
this mountain tribe about its own immigration many centuries ago? Could it
not rather be that they have interiorized what the school-going ones among
them picked up in standard textbooks of history, viz. the AIT model? Their
presence in Afghanistan or in Garhwal itself is at any rate highly
compatible with the OIT.
2.3. Indo-Hittite
Another element which increased the
distance between reconstructed PIE and Sanskrit dramatically was the
discovery of Hittite. Though Hittite displayed a very large intake of
lexical and other elements from non‑IE languages, some of its features
were deemed to be older than their Sanskrit counterparts, e.g. the Hittite
genus commune as opposed to Sanskrit's contrast between masculine
and feminine genders, or the much‑discussed laryngeal consonants. Outside
Hittite, some phonetic side-effects are the only trace of these supposed
laryngeals, e.g. Greek odont-, "tooth", shows a trace of an initial
H-, which Latin lost to yield dent-. Greek anêr, "man",
would come from *Hnr, whereas Sanskrit has nr/nara, only
preserving the laryngeal in the form of vowel-lengthening in a prefix, as
in sû-nara from su + *Hnara. In metre, we find traces of an
original laryngeal consonant marking a second syllable which was later
contracted with the preceding syllable: "In Indo-Iranian such forms are
often still disyllabic in the oldest poetry: bhâs, 'light', = /bhaas/
< /bheH-os/." (Beekes 1990:180)
This fact has gone unnoticed in all pro-OIT
writing so far. The laryngeal came in three varieties, which later
yielded the three vowels a/e/o, whose representatives in the Greek
alphabet happen to be derived from the three more or less laryngeal
consonants in Northwest-Semitic: aleph, he and ayn.
The laryngeal theory has been attacked
by both OIT and mainstream circles. Misra (1992:21) claims to have
"refuted" it, Décsy (1991:17) calls it "the infamous laryngeal theory".
When scholars claim proof of the laryngeals in Caucasian loan-words from
IE, Décsy (1992:14, w.ref. to Wagner 1984) counters that it is the other
way around: "Hittite lost its Indo-European character and acquired a large
number of Caucasian areal features in Anatolia. These Caucasian-type
features can not be regarded as ancient characteristics of the entire
PIE". Likewise Jonsson (1978:86), though accepting that the laryngeals may
offer a "more elegant explanation of certain cases of hiatus in Vedic, of
certain suffixal î's, û's", presents as "an acceptable
alternative" the scenario that the laryngeal in IE-inherited Anatolian
words "comes from the unknown non-IE language or languages that are
responsible for the major part of the [Anatolian] vocabulary".
But we need no dissident hypotheses
here: even in the dominant theory, there is no reason why the Urheimat
should be in the historical location of Hittite or at least outside India.
As the first emigrant dialect, Hittite could have taken from India some
linguistic features (genus commune, laryngeals) which were about to
disappear in the dialects emigrating only later or staying behind.
As for the shift from genus commune
to a differentiation of the "animate" category in masculine and feminine,
this has been used to illustrate a theory of fast-increasing complexity of
post-PIE grammar, which Zimmer (1990/2) interprets as a typical phenomenon
of Creole languages. He sees early IE as the language of a colluvies
gentium, a synthetic tribe of people from divergent ethnic
backgrounds, which developed its makeshift link language into a complex
language, with Hittite splitting off in an early stage of this evolution.
This is an interesting hypothesis, but so far the evidence for it is
lacking. Thus, there is no proof that the simpler verbal tense system of
Germanic and Hittite came first while the more elaborate tense system of
Aryan or Greek was a later evolution; more likely, the aorist which exists
in the latter two but not in the former two is a PIE tense which some
retained and some lost. The theories that PIE grammar was Hittite-like
simple and that PIE was a Creole developed by a colluvies gentium
are mutually supportive, but there is no outside proof for either. And if
there were, it would still not preclude northwestern India as the habitat
of this colluvies gentium.
2.4. Dialect distribution
One consideration which has always kept
me from simply declaring the AIT wrong concerns the geographical
distribution of the branches of the IE family. This argument has been
developed in some detail by Hans Hock, who explains (1999:13) that "the
early Indo-European languages exhibit linguistic alignments which cannot
be captured by a tree diagram, but which require a dialectological
approach that maps out a set of intersecting 'isoglosses' which define
areas with shared features (...) While there may be disagreements on some
of the details, Indo-Europeanists agree that these relationships reflect a
stage at which the different Indo-European languages were still just
dialects of the ancestral language and as such interacted with each other
in the same way as the dialects of modern languages."
Isoglosses, linguistic changes which are
common to several languages, indicate either that the change was imparted
by one language to its sisters, or that the languages have jointly
inherited or adopted it from a common source. Within the IE family, we
find isoglosses in languages which take or took geographically
neighbouring positions, e.g. in a straight Greece-to-India belt, the
Greek, Armenian, Iranian and some Dardic and western Indo-Aryan languages,
we see the shift s > h, e.g. Latin septem corresponding to
Greek hepta, Iranian hafta. In the same group, plus the
remaining Indo-Aryan languages, we see the "preterital augment": Greek
e-phere, Sanskrit a-bharat, "he/she/it carried". Does this mean
that the said languages formed a single branch for some time after the
disintegration of PIE unity, before fragmenting into the presently
distinct languages?
Not necessarily, for this group is
itself divided by separate developments which the member languages have in
common with non-member languages. Best known is the kentum/satem
divide: Greek belongs to the kentum group, while Armenian and Indo-Iranian
share with Baltic and Slavic the satem isogloss (as well as the related "ruki
rule", changing s to sh after r, u, k, i). So, as
between the dialects of any modern language, the IE languages share one
isogloss with this neighbour, another isogloss with another neighbour,
who in turn shares isoglosses with yet other neighbours.
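Why such a pattern resists a tree diagram can be sketched in a few lines. The following is a minimal illustration, not a dataset: it encodes only the three isoglosses just named, with branch membership simplified from the text (the partial s > h shift in some Dardic and western Indo-Aryan languages is left out), and the helper names are hypothetical. In a strict family tree, any two groups of languages must be either nested or disjoint; groups that overlap without nesting cannot both be branches.

```python
# Assumption: membership simplified from the examples in the text.
isoglosses = {
    "s>h": {"Greek", "Armenian", "Iranian"},
    "augment": {"Greek", "Armenian", "Iranian", "Indo-Aryan"},
    "satem": {"Armenian", "Iranian", "Indo-Aryan", "Baltic", "Slavic"},
}


def crosscut(a: set, b: set) -> bool:
    """True if two groups overlap but neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def crosscutting_pairs(table: dict) -> list:
    """List the pairs of isoglosses that no single tree could accommodate."""
    names = sorted(table)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if crosscut(table[x], table[y])
    ]


pairs = crosscutting_pairs(isoglosses)
print(pairs)
```

Under these assumptions the s > h group nests inside the augment group, but the satem group cuts across both: it takes in Baltic and Slavic while leaving out Greek.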
The key factor in Hock's argument seems
to be "neighbour": the remarkable phenomenon which should ultimately
support the AIT is that isoglosses are shared by neighbouring branches
of IE. Thus, the kentum languages form a continuous belt from Anatolia
through southern to western and northern Europe (with serious exceptions,
viz. Tocharian and proto-Bangani), and the satem isogloss likewise covers
a continuous territory from central Europe to India, only later
fragmented by the intrusion of Turkic. Hock provides (1999:15) a map
showing ten isoglosses in their distribution over the geographically
placed IE language groups, and we do note the geographical contiguity of
languages sharing an isogloss. Why is this important? "What is
interesting, and significant for present purposes, is the close
correspondence between the dialectological arrangement in Figure 2 (based
on the evidence of shared innovations) and the actual geographical
arrangement of the Indo-European languages in their earliest attested
stages. (...) the relative positions of the dialects can be mapped
straightforwardly into the actual geographical arrangement if (...) the
relative positions were generally maintained as the languages fanned out
over larger territory." (Hock 1999:16) In other words: the geographical
distribution of IE languages which actually exists happens to be the one
which would, at the stage when the proto-languages were dialects of PIE,
be best able to produce the actual distribution of isoglosses over the
languages.
So, the relative location of the
ancestor-languages in the PIE homeland was about the same as their
location at the dawn of history. This, Hock proposes, is best compatible
with a non-Indian homeland. And indeed, if the Homeland was in the Pontic
region, the dialect communities could spread out radially, with the
northwestern proto-Germanic tribe moving further northwest through what is
now Poland, the western proto-Celtic tribe moving further west, the
southwestern proto-Greek and proto-Albanian tribes moving further
southwest through the Balkans, the southeastern proto-Indo-Iranians
moving southeast, etc. (One reason given by the early Indo-Europeanists
for assuming such radial expansion is that they found little
inter-borrowing between IE language groups, indicating little mutual
contact, this in spite of plenty of Iranian loans found in Slavic, some
Celtic loans in Germanic, etc.) This way, while the distances grew bigger,
the relative location of the daughters of PIE vis-à-vis one another
remained the same.
If this is a bit too neat to match the
usual twists and turns of history, it is at least more likely than an
Indocentric variant of Hock's scenario would be: "To be able to account
for these dialectological relationships, the 'Out-of-India' approach
would have to assume, first, that these relationships reflect a stage of
dialectal diversity in a Proto-Indo-European ancestor language located
within India. While this assumption is not in itself improbable, it
has consequences which, to put it mildly, border on the improbable and
certainly would violate basic principles of simplicity. What would have to
be assumed is that the various Indo-European languages moved out of India
in such a manner that they maintained their relative position to each
other during and after the migration. However, given the bottle-neck
nature of the route(s) out of India, it would be immensely difficult to do
so." (Hock 1999:16-17, emphasis Hock's)
I believe there is a plausible and
entirely logical alternative. The geographical distribution of PIE
dialects in the PIE homeland is unrelated to the location of their
daughter languages; the isoglosses are the result of a twofold scenario,
part areal effect and part genealogical tree, as follows. In part, they
reflect successive migrations from the heartland where new linguistic
trends developed and affected only the dialects staying behind.
Gamkrelidze and Ivanov (1995:348-350) have built an impressive
reconstruction of such successive migrations on an extensive survey of
the linguistic material. To summarize:
1) Initially, there was a single PIE
language.
2) The first division of PIE yielded two
dialect groups, which will be called A and B. Originally they co-existed
in the same area, and influenced each other, but geographical separation
put an end to this interaction.
3) In zone A, one dialect split off,
probably by geographical separation (whether it was its own speakers or
those of the other dialects who emigrated from the Urheimat, is not yet at
issue), and went on to develop separately and become Anatolian.
4) The remainder of the A group acquired
the distinctive characteristics of the Tocharo-Italo-Celtic subgroup.
5) While the A remainder differentiated
into Italo-Celtic and Tocharian, the B group differentiated into a
"northern" or Balto-Slavic-Germanic and a "southern" or
Greek-Armenian-Aryan group; note that the kentum/satem divide only affects
the B group, and does not come in the way of other and more important
isoglosses distinguishing the northern group (with kentum Germanic and
predominantly satem Baltic and Slavic) from the southern group (with
kentum Greek and satem Armenian and Aryan).
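For readers who find a schematic helpful, the five steps above can be laid out as a nested structure. This is only a rough sketch of the summary given here, not of Gamkrelidze and Ivanov's own presentation: the dictionary layout and the names TREE, KENTUM_SATEM, leaves and classes_in are assumptions made for illustration, Balto-Slavic-Germanic is split into its three members following the parenthesis in step 5, and "predominantly satem" is simplified to plain "satem".

```python
# Illustrative sketch only: the nesting mirrors steps 1-5 as summarized in
# the text; the layout and all identifiers are assumptions of this sketch.
TREE = {
    "PIE": {
        "A": {
            "Anatolian": {},
            "Tocharo-Italo-Celtic": {"Italo-Celtic": {}, "Tocharian": {}},
        },
        "B": {
            "northern": {"Germanic": {}, "Baltic": {}, "Slavic": {}},
            "southern": {"Greek": {}, "Armenian": {}, "Aryan": {}},
        },
    }
}

# kentum/satem values for the B group, as stated in step 5
# (Baltic and Slavic are "predominantly" satem; simplified here).
KENTUM_SATEM = {
    "Germanic": "kentum",
    "Baltic": "satem",
    "Slavic": "satem",
    "Greek": "kentum",
    "Armenian": "satem",
    "Aryan": "satem",
}


def leaves(node):
    """Return the terminal branch names below a node, in listed order."""
    names = []
    for name, children in node.items():
        names.extend(leaves(children) if children else [name])
    return names


def classes_in(subgroup):
    """Return the set of kentum/satem values found in one B subgroup."""
    return {KENTUM_SATEM[b] for b in leaves(TREE["PIE"]["B"][subgroup])}


for subgroup in ("northern", "southern"):
    print(subgroup, sorted(classes_in(subgroup)))
```

Both the northern and the southern subgroup turn out to contain kentum as well as satem members, which is the sense in which the kentum/satem divide cuts across, rather than coincides with, the northern/southern division of the B group.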
The second part is that the isoglosses
not explainable by the former scenario are post-PIE areal effects, which
is why they affect historically neighbouring languages, regardless of
whether these had been neighbours when they were dialects of PIE.
Archaeologists (mostly assuming a North-Caspian homeland) have said that
the North-Central-European Corded Ware culture of ca. 3000 BC was a kind
of secondary homeland from which the Western branches of PIE spread, again
more or less radially, to their respective historical locations; the OIT
would allot that role of secondary western-IE homeland to the Kurgan
culture. In such a secondary homeland, IE-speaking communities would,
before their further dispersal, be close enough to allow for the
transmission of lexical innovations or common substratal borrowings
(e.g. beech, cfr. Latin fagus; or fish, cfr. Latin
piscis, unattested in eastern IE languages). Communities in truly
close interaction, at whichever stage of the development of IE, would also
develop grammatical isoglosses.
Hock (1999:14) himself unwittingly gives
at least one example which doesn't easily admit of a different
explanation: "The same group of dialects [Germanic, Baltic, Slavic] also
has merged the genitive and ablative cases into a single 'genitive' case.
But within the group, Germanic and Old Prussian agree on generalizing the
old genitive form (...) while Lithu-Latvian and Slavic favor the old
ablative".
But clearly, Old Prussian and Lithu-Latvian
lived in close proximity and separate from Germanic and Slavic for
centuries, as dialects of proto-Baltic, else they wouldn't have jointly
developed into the Baltic group, distinct in many lexical and grammatical
features from its neighbours. So, if the Baltic language bordering on the
Germanic territory happens to share the Germanic form, while the languages
bordering on Slavic happen to share the Slavic form, we are clearly faced
with a recent areal effect and not an heirloom from PIE days. The
conflation of cases has continued to take place in many IE languages in
the historical period, so the example under consideration may well date
to long after the fragmentation of PIE.
A second example mentioned by Hock may
be the split within the Anatolian group, with Luwian retaining a
distinction between velar and palatal but Hittite merging the two, just
like its Greek neighbour. Positing an areal influence at the stage of PIE
dialectal differentiation on top of an obviously existing areal influence
in the post-PIE period seems, in this context, like a "multiplication of
entities beyond necessity": neighbouring languages need not also have been
neighbours at the dialectal PIE stage in order to transmit innovations,
because their present or recent neighbourliness already allows for such
transmissions.
As far as I can see from Hock's
presentation, the twofold scenario outlined above is compatible with all
the linguistic developments mentioned by him. For now, I must confess that
after reading Hock's presentation, the linguistic problem which I have
always considered the most damaging to an Indocentric hypothesis, doesn't
look all that threatening anymore. The isoglosses discussed by him do not
necessitate the near-identity of the directional distribution pattern of
the PIE dialects with that of their present-day daughter languages, which
would indeed be hard to reconcile with an out-of-India hypothesis. But I
cannot as yet exclude that Hock's line of argument could be sharpened,
viz. by proving that certain isoglosses must date back to PIE
times, making it tougher to reconcile the distribution of isoglosses with
an Indian homeland hypothesis.
2.5. Distribution of large and small territories
Another aspect of geographical
distribution is the allocation of larger and smaller stretches of
territory to the different branches of the IE family. We find the Iranian
(covering the whole of Central Asia before 1000 AD) and Indo‑Aryan
branches each covering a territory as large as all the European branches
(at least in the pre‑colonial era) combined. We also find the Indo‑Aryan
branch by itself having, from antiquity till today, more speakers on the
Eurasian continent (now nearing 900 million) than all other branches
combined. This state of affairs could help us to see the Indo-Aryan branch
as the centre and the other branches as wayward satellites; but so far,
philologists have made exactly the opposite inference.
It is said that this is the typical
contrast between a homeland and its colony: a fragmented homeland where
languages have small territories, and a large but linguistically more
homogeneous colony (cfr. English, which shares its little home island with
some Celtic languages, but has much larger stretches of land in North
America and Australia all to itself, and with less dialect variation than
in Britain). By that criterion, it may be remarked at once, the Pontic
region too would soon be dismissed as an IE homeland candidate, for it has
been homogeneously Slavic for centuries, though it was more diverse in the
Greco-Roman period.
It is also argued that Indo‑Aryan must
be a late‑comer to India, for otherwise it would have been divided by now
in several subfamilies as distinct from each other as, say, Celtic from
Slavic.
To this last point, we must remark first
of all that the linguistic unity of Indo‑Aryan should not be exaggerated.
The difference between Bengali and Sindhi may well be bigger than that
between, say, any two of the Romance languages, especially if you consider
their colloquial rather than their high-brow (sanskritized) register.
Further, to the extent that Indo‑Aryan has preserved its unity, this may
be attributed to the following factors, which have played to a larger
extent and for longer periods in India than in Europe: a geographical
unity from Sindh to Bengal (a continuous riverine plain) facilitating
interaction between the regions, unlike the much more fragmented
geography of Europe; long‑time inclusion in common political units (e.g.
Maurya, Gupta and Moghul empires); and continuous inclusion in a common
cultural space with the common stabilizing influence of Sanskrit.
As for the high fragmentation of IE in
Europe when compared to its relative homogeneity in North India: from the
viewpoint of an Indian homeland hypothesis, the most important factor
explaining it is the way in which an emigration from India to Europe must
have taken place. Tribes left India and mixed with the non‑IE‑speaking
tribes of their respective corners of Central Asia and Europe. This
happens to be the fastest way of making two dialects of a single language
grow apart and develop distinctive new characteristics: make them mingle
with different foreign languages.
Thus, in the Romance family, we find
little difference between Catalan, Occitan and Italian, three languages
which have organically grown without much outside influence except for a
short period of Germanic influence which was common to them; by contrast,
Spanish and Rumanian have grown far apart (lexically, phonetically and
grammatically), and this is largely due to the fact that the former has
been influenced by Germanic and Arabic, while the latter was influenced by
Greek and Slavic. Similarly, under the impact of languages they
encountered (now mostly extinct and beyond the reach of our searchlight),
and whose speakers they absorbed, the dialects of the IE emigrants from
India differentiated much faster from each other than the dialects of
Indo‑Aryan.
To be sure, expanding Indo-Aryan
communities have likewise merged with communities speaking now-extinct
non-IE languages, but they remained continually in touch with neighbouring
speakers of "pure" Indo-Aryan, so that they maintained the original
standards of their language better. It is widely assumed that the Bhil
tribals of Gujarat and Madhya Pradesh originally spoke a non-IE language,
probably Nahali, yet: "No group of Bhils speak any but an Aryan tongue.
(...) it is unlikely that traces of a common non-Aryan substratum will
ever be uncovered in present-day Bhili dialects." (von Fürer-Haimendorf
1956:x, quoted in Kuiper 1962:50).
One can still witness this process
today: when tribals in Eastern and Central India switch over to Hindi,
they retain at most only a handful of words from their Austro-Asiatic or
Dravidian mother-tongues, because the influence of standard Hindi is
continually impressed upon them by the numerous native Hindi speakers
surrounding them (not to mention the media).
By contrast, upon arrival on the
North-European coasts, the speakers of proto-Germanic merged completely
with the at least equally numerous natives. Having covered greater
distances and in smaller numbers than the gradually expanding Indo-Aryan
agriculturalists in India, they lost touch with the language standards of
their fathers because they were not surrounded by a compact and
numerically overwhelming environment of fellow IE-speakers. This allowed a
far deeper impact of the native language upon their own, differentiating
it decisively from IE languages not influenced by the same substratum.
2.6. Go West
A seemingly common-sense objection to an
Indian homeland is that it implies an IE expansion almost entirely in one
direction: east to west, with the homeland lying in the far corner of the
ultimate IE settlement area rather than in the centre. Isn't this odd?
Well, no: it is the rule rather than the
exception. Chinese spread from the Yellow River basin southward, first
assimilating Central and then South China. Arabic spread from Arabia a
little northward and mostly westward. The circumstances in north and
south, or in east and west, are usually very different, making the
prospects of expansion very attractive on one side but quite uninteresting
on the other. Spanish and English could expand westward, in the Americas,
because of their steep technological-military edge over the natives; this
did not apply in the equation of forces to their east, in Europe.
Assuming the OIT with Panjab-Haryana as
the centre, we can safely surmise that a similar number of migrants went
southeast and northwest respectively, yet their destinies were quite different. The
first didn't have far to go: they colonized the rain forests of India's
interior, where soil and climate allowed for the settlement of large
populations on a relatively small surface. It was always easier to chop
down another stretch of forest and expand locally than to leave the
material security of interior India for a dangerous and probably pointless
mountain trek into China or a sea voyage to Indonesia. By contrast, the
second group going to Central Asia found itself challenged by more
uncomfortable conditions: a variable climate, large stretches of
relatively useless land, a crossroads location with hostile nomads or
migrating populations passing through. They had to cross far larger
distances in order to settle comfortably, mixing with many more people
along the way, thus losing their physical Indianness and linguistically
growing away from PIE fast and in different directions.
As an economic and demographic outpost
of India, Bactria was, along with Sogdia, a launching-pad for the most
ambitious migration in premodern history; the first Amerindians and
Austronesians covered even larger distances but settled empty lands, while
the Indo-Europeans assimilated large populations in a whole continent.
This followed (or rather, set) a pattern: recall how the Mongols conquered
this region, thence to conquer the Western half of Asia and Eastern
Europe; in the preceding centuries, the Turks; before that, the Iranians
or (pars pro toto) Scythians; and first of all, the Indo-Europeans.
Nichols 1997 (cfr. below) adds Kartvelian to this list, as one case of a
language spread westward through the Central-Asian "spread zone" but
entirely losing its foothold there, only to survive in a South-Caucasian
backwater; and points to the parallel westward movement of the
Finno-Ugrians from Siberia to Northeastern Europe. Until the eastward
expansion of Russia, Central Asia was subject to an over-arching dynamic
of east-to-west migration. This may have started as early as the end of
the Ice Age, when a depopulated Europe became hospitable again, and lasted
until the reversal of the demographic equation, when European population
pressures forced an eastward expansion.
3. Loans and substratum features
3.1. How to decide on the foreign origin of a word?
One widely accepted criterion for
deciding whether a word attested in ancient Sanskrit is IE or not, is the
presence of sound combinations which do not follow the standard pattern.
It is argued that a word in a given language cannot take just any shape,
e.g. a true English word cannot start with shl-, shm-, sht-.
Consequently, when a word does contain such irregular sounds, it must be
of foreign origin, in this case German or Yiddish loans like schnitzel,
schmuck, schlemiel. Likewise, a Sanskrit word cannot contain certain
sound combinations, which would mark it as a foreign loan.
However, there are problems with this
rule. Firstly, and invasionists should welcome this one, if a sound is too
strange, chances are that people will "domesticate" it into something more
manageable. This will result in a loan which differs in pronunciation
from its original form, but which is no longer recognizable as a loan by
the present criterion. Thus, in Sino-English, a boss or upper-class person
is called a taiban, Chinese for "big boss"; there is nothing
decisively un-English about this string of consonants and vowels. The one
feature of this Chinese word which could have marked it as un-English, is
its tones (tai fourth tone, ban third tone), but precisely
that typically foreign feature has been eliminated from the English usage
of the word. The same is true in Japanese, which has adopted hundreds of
Chinese words after stripping them of tones and other distinctively
Chinese phonetic characteristics. Likewise, Arabic has a number of sounds
and phonemic distinctions unknown in European languages, which are
systematically eliminated in the Arabic loans in these languages, e.g.
tariff from ta'rîfa with laryngeal 'ayn, or cheque
from sakk with emphatic saad.
Another point is: how do you decide what
the standard shape of a word in a given language should be? Witzel
(1999/1:364) calls bekanâTa "certainly a non-IA name", citing as
reason the retroflex T and the initial b-. It may be
conceded that the suffix -Ta is common in seemingly non-IA
ethnonyms (kîkaTa etc.), but the phonetic exceptionalism, by
contrast, cannot be accepted as a valid ground for excluding an IA
etymology. The dental/retroflex distinction must initially have been
merely allophonic, representing a single but phonetically unstable
phoneme; and at any rate, numerous purely IE words have acquired the
retroflex pronunciation, e.g. SaD, "six", or aSTa, "eight".
While b- may be rare in Old IA, there is no good reason to exclude
it altogether from the acceptable native sounds of the language. It is
also attested in bala, "strength", related to Greek bel-tiôn,
"better", and Latin de-bil-is, "off-strength", "weak", a connection
which Kuiper (1990:90) admits to be "attractive" though he would prefer to
"accept the absence of /b/ in the PIE consonant system", it being
otherwise only attested in the Celtic-Germanic-Slavic (hence probably
Euro-substratal) root *kob, "to fall".
What threatens to happen here, is that
the minority gets elbowed out by the majority, that the majoritarian forms
are imposed as the normative and only permissible forms. Compare with the
argument by Alexander Lehrman (1997:151) about accepting or excluding the
rare sequence "e + consonant" as a possibly legitimate root in Hittite:
"There is absolutely no reason why a lexical root of Proto-Indo-European
(or Proto-Indo-Hittite) cannot have the shape *eC-, except the wilful
imposition by the researching scholar of the inferred structure of a
majority of lexical roots on a minority of them." (emphasis mine) The
same openness to exceptions to the statistical rule is verifiable in other
languages, e.g. Chinese family names are, as a rule, monosyllabic (the
Mao in Mao Zedong), yet two-syllable names have also existed, though
now fallen into disuse (the Sima in Sima Qian). As a rule, Semitic
verbal roots have a "skeleton" of three consonants, yet a few with two or
four consonants also exist. Admittedly, both examples also illustrate a
tendency of the exception to disappear in favour of (or to conform itself
to) the majoritarian form; but their very existence still provides an
analogy for the existence of atypical minoritarian forms in IE, such as
the b- phoneme.
Another point is that there may be a
covert petitio principii at work here. Many assertions on what can
or cannot be done in Indo-Aryan are based on the assumption that Vedic
Sanskrit is more or less the mother of the whole IA group, it being the
language of the entry point whence the Aryan tribes populated a large part
of India. In an OIT scenario (e.g. Talageri 1993:145) of ancient Indian
history, Sanskrit need not be the mother of IA at all, there being IA
dialects developing alongside Vedic Sanskrit. Just as Vedic religion was
but one among several Indo-Aryan religious traditions, the traces of
which are found in the Puranas and Tantras, Vedic Sanskrit is but one
among a number of OIA dialects. The eastward expansion of Vedic culture
attested in the Atharva Veda, Shatapatha Brâhmana etc. may have vedicized
regions which were already IA-speaking though religiously and
linguistically non-Vedic.
Thus, the sh/S > s shift in
eastern Hindi and Bengali, e.g. subhâSa > subhâs, ghoSa > ghos, may
be due to substratum influence (cfr. the case of Kosala in the next
section), but then again, what is more ordinary than this inter-sibilant
shift in dialectal variation? Remember Semitic salâm/shalom, or
the Biblical test of pronouncing sibboleth/shibboleth. This could
be a substratum influence, but it could also simply be a spontaneous
variation in a non-Vedic dialect of IA. More generally, one should not
jump to conclusions of foreign origins without a positive indication.
Mere oddities may come into being without adstratal or substratal
influence (cfr. French phonetic oddities like nasalization or uvular
[r]); they are not proof enough that IA was an intruding language
replacing a native one.
3.2. River names in Panjab
If a word looks Sanskritic, it may still
be of foreign origin, but thoroughly assimilated. With historical
languages, the assimilation into Sanskrit sound patterns is
well-attested, e.g. Greek dekanos becoming drekkâNa, Altaic
turuk becoming turuSka, Arabic sultan becoming
suratrâNa, etc. Sometimes this phonetic adaptation gives rise to
folk-etymological reinterpretation, often with hypercorrect
modification of the word, e.g. the râNa, "king", in suratrâNa.
Such adaptation can also take place even without etymological
interpretation, just for reasons of "sounding right". Thus, it is often
said (e.g. Witzel 1999/1:358) that Yavana, vaguely "West-Asian", is
a hypersanskritic back-formation on Yona, Ionia, i.e. the name of
the Asian part of Greece. This principle underlies the Sanskrit looks of
many foreign loans in Sanskrit.
Witzel uses this phenomenon to explain
the Sanskrit looks of no less than 35 North-Indian river names: "Even a
brief look at this list indicates that in northern India, by and large,
only Sanskritic river names seem to survive". (1999/1:370) He quotes
Pinnow 1953 as observing that over 90% don't just look IA but "are
etymologically clear and generally have a meaning" in IA. He attributes
this unexpectedly large etymological transparency to "the ever-increasing
process of changing older names by popular etymology". This hypothesis of
a very thorough assimilation of foreign names with pseudo-etymology is a
possibility but quite unsubstantiated, a complicated explanation
satisfying AIT presumptions but not Occam's razor. It has no counterpart
in any other region of IE settlement, e.g. in Belgium most river names are
Celtic or pre-Celtic and make no sense at all in Dutch or French; yet in
their present forms no attempt is in evidence of semantically romanizing
or germanicizing them. In the US, there are plainly native river names
like Potomac, and plainly European ones like Hudson, but no
anglicized native names. So, most likely, the Sanskrit-looking river names
are simply Sanskrit.
This may be contrasted with the
situation farther east in the Ganga plain, where we do find many
Sanskrit-sounding names of rivers and regions which however do not have a
transparent etymology, e.g. kaushikî or koshala, apparently
linked to Tibeto-Burmese kosi, "water", and the name of the river
separating Koshala from Videha. In that case, we also see the ongoing
sanskritization: kaushikî evolved from kosikî (attested in
Pali), and koshala from kosala, which Witzel (1999/1:382)
considers as necessarily foreign loans because the sequence -os-
is "not allowed in Sanskrit". But while the phonetic assimilation can be
caught in the act, we can see no semantic domestication through folk
etymology at work. The name koshala doesn't mean anything in
Sanskrit, and that is a decisive difference with the Western hydronyms
gomatî, "the cow-rich one", or asiknî, "the dark one". While
the occurrence of some folk-etymological adaptation among the
Panjabi river names can in principle be conceded, it is highly unlikely to
be the explanation of all 35 names. Until proof to the contrary, the
evidence of the Northwest-Indian hydronyms goes in favour of the absence
of a non-IE substratum, hence of the OIT.
3.3. Exit Dravidian Harappa
The European branches of IE are all full
of substratum elements, mostly from extinct Old European languages. For
Germanic, this includes some 30% of the acknowledged "Germanic"
vocabulary, including such core lexical items as sheep and drink;
for Greek, it amounts to some 40% of the vocabulary. In both cases,
extinct branches of the IE family may have played a role along with non‑IE
languages (vide Jones-Bley and Huld 1996:109-180 for the Germanic case).
The branch least affected by foreign elements is Slavic, but this need not
be taken as proof of a South‑Russian homeland: in an Indian Urheimat
scenario, the way for Slavic would have been cleared by other IE
forerunners, and though these languages would absorb many Old‑European
elements as substratum features, they also eliminated the Old‑European
languages as such and prevented them from further influencing Slavic.
Even if we accept as non‑IE all the
elements in Sanskrit described as such by various scholars, the non‑IE
contribution is still smaller than in some of the European branches of IE,
which bear the undeniable marks of "Aryan" invasions followed by
linguistic assimilation of large native populations. Among the highest
estimates is the 5% to 9% of loans in Vedic Sanskrit proposed by Kuiper
1991:90-93, in his list of 383 "foreign words in the Rigvedic language". A
number of these words are certainly misplaced: some have no counterpart in
Dravidian or Munda, or when they do, there is often no reason to assume
that the direction of borrowing was into rather than out of Indo-Aryan.
To take up one example, the name of the
seer Agastya is a normal Sanskritic derivation of the tree name
agasti, "Agasti grandiflora" (Kuiper 1991:7 sees the derivation as a
case of totemism). This word is proposed to be a loanword, related to
Tamil akatti, acci, as if the invaders borrowed the name from
Dravidian natives. That non-Indian branches of IE do not have this word,
says nothing about its possible IE origins: they didn't need a word for a
tree that only exists in India, so they may have lost it after emigrating.
It is perfectly possible that the Tamil word was derived from Sanskrit
agasti, and by looking harder we just might discern an IE etymon for
it, e.g. Pirart (1998:542) links Agastya with Iranian gasta,
"foul-smelling, sin".
But let us accept that some 300 words in
Kuiper's list are indeed of non-IE origin. Even then, the old tendency to
impute Dravidian origins to IA words of unclear etymology must be
abandoned because the underlying assumption of a Dravidian-speaking
Harappan civilization has failed to get substantiated. Likewise, the
relative convergence of Indo-Aryan and Dravidian (as well as Munda and to
an extent Burushaski) in phonetic, lexical and grammatical features,
forming a pan-Indian linguistic zone (vide e.g. Abbi 1994), is no longer
explained as the substratal effect of an India-dominating Dravidian
culture.
That the Dravidians are not native to
their present habitat, had already been accepted: "Arguments in favour of
the South Indian peninsula being the original home of the Dravidian
language family, very popular with Tamil scholars at one time, cannot
resist the weight of the evidence, both archaeological and linguistic."
(Basham 1979:2)
Now, even Harappa is being lifted out of
their claimed heritage. Bernard Sergent (1997:129) and Michael Witzel
(1999/1:385) are among the latest experts to bid goodbye to the popular
assumption that Harappa was Dravidian-speaking. Indeed, the most important
shift in scholarly opinion in recent years is the realization that, when
all is said and done, there is really not a shred of evidence for the
identification of the Harappans as Dravidian, even though several
elaborate attempts at decipherment of the Indus script (Fairservis 1992,
Parpola 1994) have been based on it.
Some of the arguments classically used
against Vedic Harappa equally stand in the way of Dravidian Harappa, e.g.
like Vedic culture, the oldest attested Dravidian culture was not urban:
according to McAlpin (1979:181-182), the Dravidians "were almost certainly
transhumants practising both herding and agriculture, with herding the
more unbroken tradition".
Of course, in both cases, a
chronological shift placing them in the pre-urban pre-Harappan period
could solve this problem. More importantly, the Dravidian contribution
to the Indo-Aryan languages is not such as one would expect if Indo-Aryan
newcomers had incorporated a prestigious Dravidian-speaking city culture.
Even linguists eager to discover Dravidian words in IA are surprised to
find how small their harvest is: "Dravidian influence is less than has
been expected by specialists." (Wojtilla 1986:34)
Judging from the substratum of
place-names, Dravidians once were located along the northwestern coast (Sindh,
Gujarat, Maharashtra) in the southern reaches of the Harappan
civilization. Parpola (1994:170) points out the presence of a Dravidian
substratum, starting with the place-names: "palli, 'village'
(whence valli and modern -oli, -ol in Gujarat),
corresponding to South-Dravidian paLLi; and pâTa(ka) or
pâTi (whence vâTa, vâTi, etc., modern -vâDâ, vâD
etc. in Gujarat) as well as paTTana (Gujarati paTTan), all
originally 'pastoral village' from the Dravidian root paTu, 'to lie
down to sleep'. In addition to place-names, other linguistic evidence
suggests that Dravidian was formerly spoken in Maharashtra, Gujarat and,
less evidently, Sindh, all of which belonged to the Harappan realm. It
includes Dravidian structural features in the local Indo-Aryan languages
Marathi, Gujarati and Sindhi, such as the distinction between two forms of
the personal pronoun of the first person plural, indicating whether the
speaker includes the addressee(s) in the concept 'we' or not. Dravidian
loanwords are conspicuously numerous in the lower-class dialects of
Marathi." Add to this the cultural influence, e.g. the Dravidian system of
kinship (Witzel 1999/1:385).
So, that is how a Dravidian past
perpetuates itself along the presently IA-speaking coastline, but it is
conspicuous by its absence in the language and culture of Panjab and the
Hindi belt. The latter has far fewer Dravidian elements than the link
language Sanskrit, e.g. the Dravidian loan mîna, "fish", caught on
in Sanskrit but never in Hindi. There is no reason to assume a Dravidian
presence in North India at any time. The main part of the Harappan
civilization was definitely not Dravidian if we may judge by the
substratum evidence there, e.g. the lack of Dravidian hydronyms. There are
also no indications that South-Indian Dravidian culture is a continuation
of Harappan culture.
The Dravidians may have entered Sindh
through the Bolan Pass from Afghanistan (Samuel 1990:45), possibly as
late as the 3rd millennium BC (McAlpin 1979), though I am not aware of any
firm proof against their indigenous origins. Vedic culture was established
in the Panjab for quite some time before encountering Dravidian,
considering that the oldest layers of Vedic literature do not contain
loans from Dravidian: according to Witzel (1999/2:§1.1), "RV level 1 has
no Dravidian loans at all". Dravidian loans appear only gradually in the
next stages (i.e. when Indo-Aryan culture penetrates Dravidian territory)
and are typically terms used in commercial exchanges, indicating adstratum
rather than substratum influence. With that, Dravidian seems now to have
been eliminated from the shortlist of pretenders to the status of Harappan
high language.
3.4. Pre-IE substratum in Indo-Aryan: para-Munda
Unlike Dravidian, other languages seem
to have exerted an influence on Sanskrit since the earliest Vedic times:
chiefly a language exhibiting Austro-Asiatic features, hence provisionally
called para-Munda, not the mother but at least an aunt of the Munda
languages still spoken in Chhotanagpur. Where IA-Dravidian likenesses in
words without apparent IE etymology were hitherto often explained as
Dravidian substratum in IA, the favourite explanation now is that
Dravidian borrowed from IA what IA itself had borrowed from para-Munda,
e.g. mayûra, "peacock" was derived from Munda *mara and in
its turn yielded Tamil mayil. A second influence is attributed to
an unknown language, nonetheless discernible through consistent features,
and provisionally called Language X.
Indian non-invasionists strongly
dislike the alleged fondness of Western linguists for "ghost languages",
e.g. Talageri (1993:160) dismisses "purely hypothetical extinct languages"
thus: "We cannot proceed with these scholars into the twilight zone of
non-existent languages." But the simple fact remains that numerous
languages have died out, and that the ghost of some of them can be
seen at work in anomalous elements in existing languages. Thus, the first
Sumerologists noticed an un-Sumerian presence of remnants of an older
language typified by reduplicated final syllables, hence baptized "banana
language". Today, much more is known about a pre-Sumerian Ubaidic culture,
which has become considerably less ghostly.
In the para-Munda thesis, the
hypothetical para-Munda language seems to be the main influence, reaching
far northwest to and even beyond the entry point of the Vedic Aryans in
India, and definitely predominant in the whole Ganga basin. The word
gaGgâ itself has long been given an Austro-Asiatic etymology, esp.
linking it with southern Chinese kang/kiang/jiang, supposedly also
an Austro-Asiatic loan. The latter etymology has recently been abandoned,
with the pertinent proto-Austro-Asiatic root being reconstructed as *krang
and the Chinese word having a separate Sino-Tibetan origin (Zhang 1998).
Witzel (1999/1:388) now proposes to explain Ganga as "a folk
etymology for Munda *gand", meaning "river", a general meaning it
still has in some IA languages. The folk etymology would be a
reduplication of the root *gam/ga, "moving-moving", "swiftly
flowing", which only applies meaningfully to the river's upper course,
nearest to the Harappan population centres. But there is no decisive
reason why the folk etymology could not be the real one, nor why some
other IE etymology could not apply. (Experimentally: what about a
phonetically impeccable kinship with Middle Dutch konk-elen,
"twist and turn", related to English kink, "torsion"?)
In some cases, a Munda etymology is
supported by archaeological evidence. Rice cultivation was developed in
Southeast Asia (including South China), land of origin of the
Austro-Asiatic people, who brought it to the Indus region by the late-Harappan
age at the latest. Therefore, it is not far-fetched to derive Sanskrit
vrihi from Austro-Asiatic *vari, which exists in practically
the same form in Austronesian languages like Malagasy and Dayak, and
reappears even in Japanese (uru-chi), again pointing to
Southeast-Asia as the origin and propagator in all directions of both the
cultivation of rice and its name *vari.
All this goes to confirm that at least
linguistically, the Munda tribals are not "aboriginals" (with a
pseudo-native modern term, âdivâsîs), but carriers and importers of
Southeast-Asian culture. Witzel himself acknowledges that "Munda speakers
immigrated", as this should explain why in Colin Masica's list of
agricultural loans in Hindi (1979), which in conformity with the
invasionist paradigm is very generous in allotting non-IE origins to
Indo-Aryan words, Austro-Asiatic etymologies account for only 5.7%. In
borrowing so few Munda words, the Vedic Aryans clearly did not behave like
immigrants into Munda-speaking territory.
This paucity of Munda influence in the
agricultural vocabulary, soil-related par excellence, should also
caution us against reading Munda etymologies into the equally soil-bound
hydronyms, which are overwhelmingly Indo-Aryan from the kubhâ to
the yamunâ. Witzel (1999/1:374) diagnoses the usual Sanskritic
interpretations as artificial "popular etymology", but in most cases does
not produce convincing Munda alternatives. The one plausible Munda
etymology is for shutudrî (prefix plus *tu-, "to drift",
plus *da, "water", Witzel 1999/2:$1.4), if only because the Vedic
Aryans themselves showed their unfamiliarity with it by devising folk
etymologies like shata-drukâ, "hundred streams"; even there, the
step from -da to -drî, though possible, does not impress
itself as compelling.
Numerous words have wrongly or at least
prematurely been classified as foreign loans. Talageri (1993:169-170)
gives the examples of animal-names like khaDgin ("breaker",
rhinoceros), mâtaMga ("roaming at will", elephant), gaja
("trumpeter", elephant), which Suniti Kumar Chatterji had cited as loans
from Dravidian or Munda but which easily admit of an IE etymology.
Likewise, there may well be an IA explanation for terms commonly given
non-IE etyma, e.g. exotic-sounding ulûkhala, "mortar (for soma)",
may well be analysed, following Paul Thieme, into IA uru, "broad",
plus khala, "threshing-floor", or even khara, "rectangular
piece of earth for sacrifices" (with Greek cognate, eschara),
albeit with vulgar -l- pronunciation. The word mayûra,
"peacock", is often given a Dravidian or (by Witzel 1999/1:350) Munda
etymon, but Monier Monier-Williams (1899:789) already derived it from an
onomatopoeic IA root *mâ, "bleat", and the related words in non-IA
languages may very well be derived from IA forms (but in this case, the
suffix -ûr-, unknown in Indo-Aryan, pleads in favour of a foreign
origin).
As a rule, one should not allot
Dravidian or Munda origins to an IA word unless the etymon can actually be
pointed out (at least indirectly) in the purported source language. It is
therefore with great reservation that we should consider the list of
para-Munda words "in the RV, even if we cannot yet find etymologies" (Witzel
1999/2:$1.2). However, many hypothetical etyma which do not exist in Munda
in full, and which should at first sight be rejected, may be analysed as
composites with components which do exist in Munda.
The main pointer to a Munda connection
seems to be a list of prefixes, now no longer productive in the Munda
languages, and not recognized or used as prefixes by Vedic Sanskrit
speakers. Thus, the initial syllable of the ethnonym kî-kaTa seems
to be one in a series of non-IA and probably para-Munda prefixes ka/ke/ki
etc. (Witzel 1999/1:365), some of which look like the declension forms of
the definite article in Khasi, an Austro-Asiatic language in the
Northeast. On this basis, very common words become suspected loans from "para-Munda",
e.g. ku-mâra, "young man", a term not explainable in IE, but
plausibly related to a Munda word mar, "man" (Witzel 1999/2:$1.2).
Between Sanskrit karpâsa,
"cotton", and Munda ka-pas (cfr. Sumerian kapazum), it may
now be decided that the latter was first while the former, with its
typical cluster -rp-, is but a hypersanskritized loan. This also
fits in with the archaeological indications of textile-manufacturing
processes pioneered by the Southeast-Asians, and with an
already-established Austro-Asiatic etymon *pas (without the prefix)
for Chinese bu, "cotton cloth". Incidentally, this does not affect
the argument by Sethna that the appearance of this word in late-Vedic,
regardless of its provenance, should be synchronous with the appearance of
actual cotton cloth in the Panjab region, viz. in the mature Harappan
phase (implying that early Vedic predated the mature Harappan phase);
indeed, Sethna (1982:5) himself accepts the Austro-Asiatic etymology.
An interesting little idea suggested by
Witzel concerns an alleged alternation k/zero, e.g. in the Greek
rendering of the place-name and ethnonym Kamboja (eastern
Afghanistan) as Ambautai, apparently based on a native
pronunciation without k-. Citing Kuiper and others, Witzel
(1999/1:362) asserts that "an interchange k : zero 'points in the
direction of Munda'" though this "would be rather surprising at this
extreme western location". Indeed, it would mean that not just Indo-Aryan
but also other branches of Indo-Iranian have been influenced by Munda, for
Kam-boja seems to be an Iranian word, the latter part being the
de-aspirated Iranian equivalent of Skt. bhoja, "king" (Pirart
1998:542). At any rate, if the Mundas could penetrate India as far as the
Indus, they could reach Kamboja too.
But the interesting point here is that
the "interchange k : zero" is attested in IE vocabulary far to the west of
India and Afghanistan, e.g. English ape corresponding to Greek
kepos, Sanskrit kapi, "monkey", or Latin aper, "boar",
corresponding to Greek kapros. Gamkrelidze and Ivanov (1995:113,
435) have tried to explain this through a Semitic connection, with the
phonetic and physiological closeness, somewhere in the throat, of qof
and 'ayn. But if the origin of this alternation must be sought in
an Afghano-Munda connection, what does that say about the geographical
origin of English, Latin and Greek?
Given the location of the different
language groups in India, it is entirely reasonable that Munda influence
should appear in the easternmost branch of IE, viz. Indo-Aryan. If both IE
and Munda were native to India, we might expect Munda influence in the
whole IE family (though India is a big place with room for non-neighbouring
languages), but since Munda is an immigrant language, we should not be
surprised to find it influencing only the stay-behind IA branch of IE.
This merely indicates a relative chronology: first Indo-Aryan separated
from the other branches of IE when these left India, and then it came in
contact with para-Munda. So, if we accept the presence of para-Munda loans
in Vedic Sanskrit, we still need not accept that this is a native
substratum influence in a superimposed foreign language.
3.5. Pre-IE substratum in Indo-Aryan: language X
The mysterious language X has possibly
not left this earth without a trace, for it is tentatively claimed to be
connected with the nearly-vanished but known Kusunda language of Nepal (Witzel
1999/1:346). Masica (1979) had found no known etymologies for 31% of
agricultural and flora terms in Hindi, and Witzel credits these to
language X (1999/1:339). I would caution, with Talageri (1993:165 ff.),
against prematurely deciding on the non-IE origin of a word not having
parallels in other IE languages, especially in the case of terms for
indigenous flora and fauna. Though Sanskrit kukkura or Hindi
kuttâ, both "dog", have no IE cognates outside India, we cannot expect
the Aryans to have been ignorant of this animal and to have learned about
it from the Indian natives upon invading. Onomatopoeic or otherwise slang
formations just come into being and sometimes replace the original
standard terms, without implying foreign origin or a substratum effect.
The OIT has no objection to the
impression that Vedic Sanskrit has absorbed some foreign words, e.g. from
immigrants into their metropolis, just like the Romance languages borrowed
many Germanic words from the Gothic invaders. All that the OIT requires is
that this absorption should have taken place after the emigration
of the other branches of IE from India. Also, it is accepted that
substratal effects may have taken place during the Aryan "colonization"
of the non-Aryan lower Ganga plain, in which the western IE languages took
no part.
One discernible trait of this ghost
language X is claimed to be the "typical gemination of certain consonants"
(Witzel 1999/2:$1.1), e.g. in the name of the malla tribe/caste.
Often these geminates are visible upon first borrowing but are later
masked by hypersanskritic dissimilation, e.g. pippala becoming
pishpala, or guggulu becoming gulgulu (Witzel
1999/2:$2.4). However, the geminated -kk- in kukkura or the
-tt- in kuttâ, though atypical of the IE word pattern, can
perfectly well come into being as onomatopoeic formations within a purely IE
milieu: in imitating the sound of a dog, even IE-speakers need not have
assumed that barking sounds follow the IE pattern.
The assumption of a language X in North
India will be welcomed by many as the solution to the vexing question of
the origin of retroflexion in the Indian languages. Weak in Burushaski and
Munda, strong yet defective (never in initial position) in Dravidian,
strong in Indo-Aryan but unattested among its non-Indian sister-languages,
retroflexion in its origins is a puzzling phenomenon. So, language X as
the putative language of the influential Harappan metropolis, or as the
native substratum of the later metropolitan region, viz. Eastern Uttar
Pradesh and Bihar, might neatly fit an invasionist scenario for the
genesis of retroflexion in Indo-Aryan as well as its spread to all
corners of India.
Still, there is no positive
reason yet for locating the origin of retroflexion in this elusive
language X. An entirely internal origination of retroflexion within early
Indo-Aryan, which then imparted it to its neighbours, has always had its
defenders even among linguists working within the invasionist paradigm
(e.g. Hamp 1996). And consider the following possibility.
The Vedic hymns may well be somewhat
older than the language in which they have come down to us. We need not
exclude a phonetic evolution between the time of composition and the
time when the Veda was given its definitive shape, traditionally by
vyâsa, "compiler". Strictly speaking, it is not even impossible that a
hymn composed in a language phonetically close to PIE,
pre-proto-Indo-Iranian, subsequently underwent the kentum/satem shift and
the vowel shift from IE /a/e/o/ to Sanskrit /a/, somewhat like the
continuity of living Latin across centuries of phonetic change: Caesar
evolving from [kaisar] to [cezar] or [sezar], agnus (lamb) from [agnus]
to [anyus], cyclus from [küklus] to [ciklus] or [siklus],
descendere from [deskendere] satemized to [deshendere], the vowels
ae/oe/e coinciding as [e], etc. In the Middle Ages, Virgil's verses
were still recited, but with a different pronunciation, just as in China,
children memorized the Confucian Classics in the pronunciation of their
own day, without knowing what the ancient masters' own pronunciation must
have sounded like. Similarly, the Vedic hymns may well be older than the
language form in which they have been preserved till today.
A very modest application of this line
of thought is the hypothesis that the differentiation between dental and
retroflex or cerebral consonants was not yet present in the original
Vedic, and only developed by the time Sanskrit reached its classical form.
Deshpande (1979) argues that the cerebral sounds crept in when the centre
of Brahminical learning had shifted from Sapta-Sindhu to the Ganga basin,
where the Indo-Aryan dialects had developed the dental-cerebral
distinction. In that case, the Veda recension which we have today (the
mâNDûkeya and shâkalya recensions, which Deshpande dates to 700
BC), was established in Videha-Magadha (Bihar), where native speakers
imposed their pronunciation on the Veda.
Deshpande also mentions a Magadhan king
Shishunaga (5th century BC?) who prohibited the use of the retroflex
sounds T/Th/D/Dh/S/kS in his harem. But this seems to indicate that
retroflexion was an intrusive new trend in Magadha, not at all a native
tendency which was so strong and ingrained that it could impose itself on
the liturgical language. Something may be said for Kuiper's (1991:11-14)
rebuttal to Deshpande's thesis, viz. that mâNDûkeya's insistence
on retroflex pronunciation was a case of upholding ancient standards
against a new and degenerative trend, implying that retroflexion was
well-established by the time the Vedas were composed, and was being
neglected in the new, eastern metropolis. That puts us back at square one:
Munda (probably the main influence in Bihar) is clearly not the source of
retroflexion, and that elusive language X didn't have much lexical impact
on Vedic yet, making phonological influence even less likely. So whether
or not retroflexion was already present in Vedic, the search
for its origin continues.
3.6. The peculiar case of "Sindhu"
Among IA-looking river names, a case can
be made for surprising IE etymologies of names usually explained as loans.
In particular, sindhu might be an "Indo-Iranian coinage with the
meaning 'border river, ocean' and fits Paul Thieme's etymology from the IE
root *sidh, 'to divide'" (Witzel 1999/1:387). Now, if the Vedic Aryans
only entered India in the 2nd millennium BC, the name Sindhu cannot be
older than that.
According to Oleg Trubachov (1999),
elaborating on a thesis by Kretschmer (1944), Indo-Aryan was spoken in
Ukraine as late as the Hellenistic period, by two tribes known as the
Maiotes and the Sindoi, the latter also known by its
Scythian/Iranian-derived name Indoi and explicitly described by
Hesychius as "an Indian people". They seem to have used a word sinu,
from sindhu, for "river", a general meaning which it also has in
some Vedic verses. Trubachov lists a number of personal and place names
recorded by Greek authors (e.g. Kouphes for the Kuban river,
apparently a re-use of kubhâ, the Kabul river, Greek
Kophes), and concludes that the Maiotes and Sindoi spoke an Indo-Aryan
dialect, though often with -l- instead of -r-, as in king
Saulios, cfr. sûrya (just the opposite of Mitannic, where
palita, "grey", and pingala, "reddish", appear as parita
and pinkara) and with -pt- simplified to -tt- (so
that, just like in Mitannic, sapta appears as satta, a
feature described by Misra 1992 as "Middle IA").
Working within the AIT framework,
Kretschmer saw these Sindoi as a left-over of the Indo-Aryans in their
original homeland, and even as a splendid proof of the Pontic homeland
theory (Trubachov is less committed to any particular homeland
hypothesis). In that case, again, the name sindhu (and likewise
kubhâ) would be an Indo-Aryan word brought into India by the
Vedic-Aryan invaders.
However, Witzel himself (1999/2:$1.9)
notes that the Sumerians (who recorded a handful of words from "Meluhha"/Sindh,
which incidentally seem neither IA nor Dravidian) in the 3rd millennium
already knew the name sindhu as referring to the lower basin of the
Indus river, then the most accessible part of the Harappan civilization,
whence they imported "sinda" wood. If this is not a coincidental
look-alike, then either sindhu is a word of non-IE origin already
used by the non-IE Harappans, in which case the Pontic Sindoi were
migrants from India (demonstrating how earlier the Kurganites might have
migrated from India?); or sindhu was an IE word, and proves that
the Harappan civilization down to its coastline was already IA-speaking.
PART II