Did the Proto-Indo-Europeans borrow agricultural and cultural terms from a population that spoke something close to Proto-Semitic? Rasmus Bjørn has just published a new paper (paywalled) discussing 21 (Proto-)Indo-European words that have been suggested to be borrowed from Semitic or Afroasiatic more generally and argues that yes: there are enough terms in Proto-Indo-European and its daughters to posit the existence of a Semitoid “Old Balkanic” language bordering the PIE steppe homeland to the west.

A very exciting possibility! Unfortunately, there are some issues with the words that Bjørn compares. Let’s dive right in. The main question we’ll try to answer: do these Indo-European words really have close parallels in Semitic, and if so, is there convincing evidence that Semitic was the source and not the recipient language? (I’ve modified some of the transcriptions of reconstructed words to match conventions I’m more used to. (P)IE means that a reconstruction is reflected in several branches of Indo-European but is probably not Proto-Indo-European proper.)

The comparanda

  • PIE *h₂ster– ‘star’, PS *ʕaθtar– ‘deified morning star’ (Ishtar, Astarte, etc.). Aren Wilson-Wright wrote a 2016 book about the Semitic deity and has suggested before (probably also in the book) that this is a loanword from Indo-European. I’m inclined to agree that ‘star’ > ‘deified Venus’ is a more likely development than vice versa. With four more-or-less matching consonants and very similar meanings, I think a coincidence is unlikely in this case.
  • PIE *h₃or-(n-) ‘eagle’, PS *ġVrVn– ‘eagle’. I can’t find the alleged Arabic reflex ġaran- in Lane, which leaves just Akkadian urinn- (possibly a Sumerian loanword). If these words are related, the fact that *-n- is only present in a few of the Indo-European reflexes suggests that it was borrowed from Indo-European (or a third language family) into Semitic, not vice versa.
  • PIE *ḱer-(n-)(h/u-) ‘horn’, PS *ḳarn– ‘horn’. Bjørn cites the PS form as *ḳar-n-, but the *n is part of the root in Semitic. I don’t know what’s going on with “Tigre ḳär(n)“, but if it lacks the –n sometimes, I’m highly skeptical that this says anything about Proto-Semitic; all of Tigre’s closest relatives do have the n. The tentative derivation from Proto-Afroasiatic *ḳar– relies on “Omotic [ḳ]ar” and “Egyptian ḳr.ty (dual) ‘horns of the crown (of one of the manifestations of Amun)’”. Omotic isn’t a language; it’s a language family, and we need attested forms to judge the possible relationship. Moreover, Omotic has not been demonstrated to be Afroasiatic. As Marwan Kilani’s personal communication in a footnote points out, the Egyptian attestation is highly specific; if it’s related to the Indo-European word, it could perhaps be a borrowing from something like Greek (I have no idea when or where the word is attested, so this may be difficult). Without any indication that the Semitic –n is a suffix, it is again hard to see the PIE word which sometimes lacks it as a borrowing from Semitic.
  • PIE *guōu– ‘cow’ (I’ve also seen this as *gueh₃(-)u-). “[T]his is an item that is not attested in PS proper while being shared with the wider Northern Afro-Asiatic speech community”, i.e. Egyptian gw (referring to a certain kind of bull). The similar words in Northwest Caucasian, Northeast Caucasian, and Sumerian (and elsewhere, like Proto-Bantu gòmbè ‘cattle’) suggest a much wider cultural diffusion and/or onomatopoeia.
  • PIE *septm ‘seven’, PS *tsabʕ– ‘seven’. I greatly appreciate the informed PS reconstruction based on some Twitter discussions we had in the past. Bjørn cites the masculine stem, *tsabʕat-; to really make the comparison to PIE work we should probably add the absolute state ending and make it *tsabʕat-Vm. Is there some known PIE process that would get rid of the laryngeal in a form like *seph2tm? If so, the fact that we can understand the *t and *m as Semitic morphemes does make PS > PIE a good possibility, if this isn’t a coincidence.
  • PIE *(s)ueḱs ‘six’, PS *sidθ– (not “*sidt”) ‘six’. “On the surface not very compelling as a contact phenomenon directly between PIE and PS, but the sequential nature and the similarities that permeate the same group of languages as for the number seven nonetheless make the comparison worth entertaining.” The similarities for ‘seven’ mainly consisted of many languages having a sibilant at the beginning. Either way, the argument for both ‘six’ and ‘seven’ being borrowed from Semitic would be much stronger if PIE ‘six’ also ended in *-tm.
  • PIE *(H)oḱtoH ‘eight’, Proto-Berber *okkuz ‘four’ (sic; this should probably be *ăkkuẓ, Maarten Kossmann p.c.), (Proto?-)Kartvelian *otxo ‘four’. In the background here is the idea that the PIE numeral is a dual, either ending in the PIE dual suffix *-h1 (Bjørn thinks this unlikely) or something related to the PS dual suffix *na, making it ‘two fours’. The argument is that what looks like a coincidence for ‘eight’ individually may be significant given the pattern that ‘seven’ and ‘six’ also have relatives. We just heard the same argument for ‘six’, so where this isn’t circular, it all relies on ‘seven’. Note that ‘eight’ is not ‘two fours’ anywhere in Afroasiatic.
  • PIE *medhu- ‘sweet, mead’, PS *mtḳ ‘to be sweet’. “Likely comparanda in both NE Caucasian and Uralic point to a wanderwort, possibly of Afro-Asiatic provenance.” Bjørn cites these comparanda, neither of which has anything corresponding to the PS *ḳ. PIE *dh : PS *t also isn’t very convincing. Also, the word does not mean ‘sweet’ in PIE (that would be *sueh2d-), just ‘mead’ and/or ‘honey’; at least, that’s my understanding of it, but Bjørn has written more about this.
  • PIE *dh2p- ‘sacrifice, feast’, PS *ðabḥ- ‘sacrifice, slaughter’. The metathesis increases the chance of a coincidental match, but otherwise this one is nice. It would be annoying to bring up Zulu hlaba ‘to stab, slaughter, sacrifice’.
  • PIE *dhoHn- ‘grain’, PS *duḫn– ‘millet’. This one looks great! No notes. If related, the direction of borrowing is ambiguous.
  • PIE *gwrH-n- ‘quern, millstone’, PS *gurn- ‘threshing floor’. The PIE *-n- is normally taken to be a nominal suffix so the word can be related to *gwrh2-u- ‘heavy’, but Bjørn suggests folk etymology in PIE. That would also explain why the PIE laryngeal finds no counterpart in PS. Still, “the comparison between PIE and PS suffers from discontinuous semantics” (in other words: a quern is not a threshing floor).
  • PIE *kleh2-u- ‘lock, key, bolt’, PS *klʔ ‘to retain, detain’. As Bjørn writes, “[t]he semantic match is not immaculate”. PIE *h2 : PS *ʔ is not so intuitive either.
  • PIE *(s)teuros, *tauros (with *a!) ‘bull’, PS *θawr- ‘bull, ox’. “The European reflexes of *tauros are uniform to a degree that suggests a late (dialectal) distribution”. The originality of the Semitic form is based on Militarev & Kogan identifying Afroasiatic cognates, which are not presented.
  • (P)IE *ghaid- ‘goat kid’, PS *gady-. Pretty nice. As with ‘bull’, the form (*a!) and distribution suggest a late loanword. Bjørn also brings in Proto-Berber *a-ɣăyd, which matches the Indo-European forms even better (note that PB *ɣ probably corresponds to PS *ḳ, not *g).
  • (P)IE *lāp- ‘calf, cow’, PS *ʔalp– ‘bovine’. This one is piggybacking on the credentials of the previous two *a-nimals, which have similar distributions.
  • (P)IE *bhar-(s-) ‘grain, barley’, PS *bVrr- ‘grain, wheat’. Pretty good: *barr- with an *a is reflected in Hebrew, and the simplification of the *rr to *r is expected in Indo-European.
  • PIE *h2eǵro-s ‘field’. The Semitic is a bit of a mess here: a PS reconstruction *ḫagar- is based on a Ge’ez form that can’t descend from it (hagar with h) and an Aramaic form that doesn’t exist (haǧar with h and a ǧ that doesn’t exist in premodern Aramaic; haḡar doesn’t exist either). This last one appears to be based on a misinterpretation of Leslau’s note “Ar[abic] ([of] Dat[ina]) haǧar village in ruins”. As Ge’ez hagar means ‘city’ etc., not ‘field’ either, I don’t understand where this *ḫagar ‘arable field’ is coming from.
  • PIE *h2endh– ‘flower’, PS *ḥinṭ– ‘wheat’. The Semitic etymon is well attested, but the Indo-European one seems spurious (‘marshgrass’, ‘flower’, ‘arable field’, ‘soma plant’… are all of these related?). The formal correspondence is pretty nice, apart from PIE *dh : PS *ṭ.
  • PIE *ǵlh3(o)u- ‘sister-in-law’, PS *kall-at- ‘bride, daughter-in-law, sister-in-law’ (Arabic kannat- has that last meaning; thanks, Marijn!). Citing earlier publications of his, he states that “the term should … be considered a Wanderwort tied to marriage and alliance strategies defying linguistic and cultural barriers”. This sounds exciting but I find the forms pretty different.
  • (P)IE *h1is(h2)-u- ‘arrow’, PS *ḥVθ̣θ̣ (not “*ḥiθ̣w-”) ‘arrow’. The *w in Bjørn’s PS reconstruction must be based on Classical Arabic ḥað̣w-at- ‘small (headless) arrow used for practice’, ‘twig’. Without it, there’s hardly any resemblance between the IE and PS words.
  • (P)IE *peleḱu– ‘axe’, PS *plḳ ‘to split apart’. The semantics are nice but the *-e-e- vocalism would look as strange in PS as it does in Indo-European.

    Evaluation

    So what have we got?

    • ‘seven’ has the same meaning in both families, is formally similar, and has linguistic arguments supporting a borrowing from Semitoid to PIE.
    • ‘grain’/‘millet’ is semantically and formally very close, with no reason to see either family as the source.
    • ‘star’/‘Venus’ is formally very close, with the semantics making IE more likely as the source than Semitoid.
    • ‘eagle’ and ‘horn’ have formal reasons to see IE as the source, not the recipient (if the Semitic words are even related).
    • ‘six’, ‘mead’/‘sweet’, ‘quern’/‘threshing floor’, ‘bolt’/‘to detain’, ‘flower’/‘wheat’, ‘sister-in-law’, and ‘axe’/‘to split’ all have formal and/or semantic mismatches or problems increasing the chance that they just look similar by accident.
    • ‘cow’, ‘eight’, ‘field’, and ‘arrow’ lack a convincing Semitic counterpart. Bringing in other branches of Afroasiatic (which have massively different lexicons!) greatly increases the chance of a coincidental match, especially when we allow for diagonal comparisons like ‘eight’ : ‘four’ and ‘cow’ : ‘class of bull’.

    Most interestingly:

    • ‘bull’ and ‘grain, barley’/‘wheat’ both show a very close formal resemblance; allowing for metathesis, so do ‘calf’/‘bovine’ and ‘goat kid’, and maybe ‘sacrifice’. Most of these cannot go back to Proto-Indo-European due to the presence of an *a (rare or non-existent in PIE). Whether ‘sacrifice’ is PIE depends on the identification of possible reflexes in Hittite and Tocharian. Notably, the forms with *a are all limited to European languages, and these words all belong to the same, agricultural semantic field.

    Two strong examples and three weak ones isn’t a lot to base a whole account of European prehistory on, but I think this last category could point to post-PIE borrowings from Semitic or something close to it, which is a cool finding! For the rest, with just one word that is more likely to have been borrowed from Semitic into PIE than vice versa and one that could go either way, I don’t think there’s sufficient evidence to say that there are Semitoid loans in Proto-Indo-European proper. The two possible examples should be attributed to chance resemblance.

    Coincidence, really?

    I want to finish with a note on this last point, chance resemblance. Can it really be a coincidence that ‘seven’ is *septm in PIE and *tsabʕ-at-Vm in PS; that ‘grain’ is *dhoHn- in PIE and ‘millet’ is *duḫn– in PS; and so forth, if you want to include more examples? Well… yes. Depending on how many of the comparanda you find close enough to consider them related, we could just be dealing with the couple of words that end up looking similar and having similar meanings in any two languages you compare. In the case at hand, this risk of coincidence is increased because Bjørn isn’t very strict when identifying formal matches. For example, PIE had (at least) three laryngeals: guttural sounds of unknown realization, labeled *h1, *h2, and *h3. *H means “one of these three but we can’t tell which one”. PS, on the other hand, had six guttural sounds: uvular *ḫ and *ġ, pharyngeal *ḥ and *ʕ, and glottal *h and *ʔ. Bjørn is OK with any of these matching each other:

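    • PS *ḫ: PIE *h2 (*h2eǵro-s/*ḫagar-?), PIE *H (*dhoHn-/*duḫn–)
    • PS *ġ: PIE *h3 (*h₃or-(n-)/*ġVrVn–)
    • PS *ḥ: PIE *h1 (*h1is(h2)-u/*ḥiθ̣w-), PIE *h2 (*dh2p-/*ðabḥ-; *h2endh–/*ḥinṭ–)
    • PS *ʕ: PIE *h2 (*h₂ster-/*ʕaθtar–)
    • PS *h: PIE *h2 (*h2eǵro-s/*hagar-?)
    • PS *ʔ: PIE *h2 (*kleh2-u-/*klʔ)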

    It’s also fine for a laryngeal or guttural to be present in either language with nothing matching it in the other, as with *septm/*tsabʕ-, *(H)oḱtoH (is this a suffix?)/*okkuz, *lāp-/*ʔalp-, and *ǵlh3(o)u-/*kall-at-. That means that the number of forms that would count as a match goes up: PIE *dhoHn– would match all of the following:

    • *duḫn
    • *duġn
    • *duḥn
    • *duʕn
    • *duhn
    • *duʔn
    • *dunn

    Moreover, PIE has three series of stops: voiceless, voiced, and voiced aspirated. PS has similar triads of voiceless, voiced, and ejective stops, affricates, and fricatives. These, too, can mix and match:

    • PS voiceless (*T): PIE *T in *h₂ster-/*ʕaθtar-, *kleh2-u-/*klʔ, *(s)teuros~*tauros/*θawr-, *lāp-/*ʔalp–; PIE *D in *ǵlh3(o)u-/*kall-at-; PIE *Dh in *medhu-/*mtḳ
    • PS voiced (*D): PIE *T in *septm/*tsabʕ-, *(s)ueḱs/*sidθ-, *dh2p/*ðabḥ-; PIE *D in *dh2p/*ðabḥ-, *ghaid/*gady-, *h2eǵro-s/*ḫagar-; PIE *Dh in *dhoHn-/*duḫn-, *gwrH-n-/*gurn-, *ghaid-/*gady-, *bhar-(s-)/*bVrr-
    • PS ejective (*Ṭ): PIE *T in *ḱer-(n-)(h/u-)/*ḳarn-, *peleḱu-/*plḳ; PIE *Dh in *h2endh–/*ḥinṭ–

    The one correspondence Bjørn does not find is PIE voiced/PS ejective, which would have worked so well for the Glottalic Theory.

    So we can expand our list of acceptable PS matches for PIE *dhoHn-; this now includes:

    • *tuḫn
    • *tuġn
    • *tuḥn
    • *tuʕn
    • *tuhn
    • *tuʔn
    • *tunn
    • *duḫn
    • *duġn
    • *duḥn
    • *duʕn
    • *duhn
    • *duʔn
    • *dunn
    • *ṭuḫn
    • *ṭuġn
    • *ṭuḥn– (this root means ‘to grind’, as in tahini! Semantically close enough to match ‘grain’, right?)
    • *ṭuʕn
    • *ṭuhn
    • *ṭuʔn
    • *ṭunn

    We’ve increased the odds of getting a match by coincidence by a factor of 21, and have indeed found another match in the root *ṭḥn ‘to grind’.[1] So if we really want to consider how likely it is that these similarities between PIE and PS are coincidental, we should ask ourselves how likely it is for one match as nice as *dhoHn-/*duḫn– to occur by chance, and then multiply that chance by 21. Would we really expect this to happen through sheer chance? In my view: yes, we totally should.
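
For anyone who wants to check the counting, here is a minimal sketch in Python. It assumes three possible initial stops (voiceless, voiced, ejective) and seven options for the medial slot (the six PS gutturals, or no guttural at all, written as a geminate n). The function names and the per-form probability `p` are hypothetical and purely illustrative; `p` is an arbitrary placeholder, not an estimate, and the second function treats the candidate forms as independent, which is a simplification.

```python
from itertools import product

# Initial stops the loose matching allows for PIE *dh: voiceless, voiced, ejective.
INITIALS = ["t", "d", "ṭ"]
# The six PS gutturals that may stand in for the PIE laryngeal *H.
GUTTURALS = ["ḫ", "ġ", "ḥ", "ʕ", "h", "ʔ"]


def candidate_forms(initials=INITIALS, gutturals=GUTTURALS):
    """List every PS-style form that would count as a match for PIE *dhoHn-."""
    # "n" stands for the option with no guttural at all (e.g. *dunn).
    medials = list(gutturals) + ["n"]
    return [f"*{c}u{m}n" for c, m in product(initials, medials)]


def chance_of_any_match(p_single, n_forms):
    """Chance that at least one of n_forms independent candidates matches."""
    return 1 - (1 - p_single) ** n_forms


forms = candidate_forms()
print(len(forms))  # 3 initials x 7 medials = 21
print(forms)

p = 0.001  # arbitrary placeholder, NOT an estimate from any data
# For small p, multiplying by 21 is a close approximation of the exact value.
print(21 * p, chance_of_any_match(p, len(forms)))
```

The simple multiplication by 21 slightly overstates the exact figure, but for small per-form probabilities the difference is negligible, so the argument is unaffected.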

    [1] This is only made worse by allowing for metathesis of the second and third consonant: now we have 40 options. Allowing for an additional final consonant corresponding to nothing, as in *medhu-/*mtḳ, multiplies the chance by a factor of 27 or so, taking some root co-occurrence restrictions into account. That would give us 1080 potential matches, although these wouldn’t all look as nice as *dhoHn-/*duḫn-.
  • https://bnuyaminim.wordpress.com/2023/11/13/bjorn-old-european-afro-asiatic/
