Category Archives: Indo-European

Phantom linguistics publications

It is frustrating when one is alerted by catalogues to books on language that were never actually published.

Routledge’s Language Family Surveys series now covers most of the major language groups of the world. However, the announced volume on the Manchu-Tungusic languages, said to be edited by Alexander Vovin, never appeared even though it worked its way into the Helsinki University Library catalogue (on order) and Amazon. I hear that Vovin is still working on this, but it will appear from a different publisher.

Another phantom publication is Teach Yourself Yiddish, a book that was meant to appear in 2009 and compete with the new edition of rival Routledge’s Colloquial Yiddish, a book which very much exists. Supposedly authored by Chaim Nelsen and Barry Davis, Teach Yourself Yiddish never did appear, in spite of also being announced at Amazon complete with ISBN.

Various Turkic–Mongolic etymological observations

Preparing to study Mongolian from Krueger’s An Introduction to Classical (Literary) Mongolian (Wiesbaden: Harrassowitz, 3rd edition 1993), I’ve been re-reading the Routledge Language Family Surveys volume The Mongolic Languages ed. Juha Janhunen. Below are some musings on and follow-ups to trivia within.

Examples of some crucial [Khalkha] consonant contrasts: ad [at] ‘demon’ vs. at [aʰt] ‘castrated camel’; dal [taɮ] ‘seventy’ vs. tal [tʰaɮ] ‘steppe’.

So modern Mongolian is one of those languages that, instead of a voiced–unvoiced distinction in dentals that I could actually pronounce, has an aspirated–unaspirated distinction that I’ll never get down. That’s a damn shame.

[Turkic borrowings in Mongolic] often show a specialized meaning, whereas the native [Mongolic] words have a more general semantic profile, cf. e.g. Mongolic *xüsün ‘hair’ vs. * ‘hair of a horse’ ← Bulgharic kïlka = Common Turkic *kïl (qïl) ‘hair’.

The ordinary Chuvash word for ‘hair’ today is ҫӳҫ. However, for Russian конский волос ‘horsehair’, the Skvortsovs’ dictionary gives лаша хӗлӗхӗ. For Cv. хӗлӗх, Fedotov’s Этимологический словарь чувашского языка gives a wide array of Turkic cognates, but they are all glossed as ‘horsehair’, so it’s unclear to me on what grounds Claus Schönig in the passage I’ve quoted believes it ever meant ‘hair’ in general.

In the Common Turkic branch, rhotacism–lambdacism is generally absent, but it is occasionally observed in preconsonantal position, which makes the dating of certain loanwords problematic, cf. e.g. Mongolic *buxas ‘pregnant’ (from Common Turkic *bugaz id.) vs.‑ ‘to cut the throat’ (from either Bulgharic or Common Turkic, cf. Common Turkic *bogaz ‘throat’).

That Bulgar Turkic had a cognate word for ‘throat’ showing rhotacism is attested by Chuvash пыр id.

Mongolic ulus ← Common Turkic uluš (later replaced in most Turkic languages by a reborrowing from Mongolic).

There is an informative entry on Common Turkic *uluš/ulus on page 152 of Clauson’s A Dictionary of Pre-Thirteenth Century Turkish, which notes that the original Turkic form uluš seems to survive only in Karaim.

Mongolic *kerbish ‘brick’ ← Common Turkic *kärpič

The Common Turkic form is the source of Russian кирпич. It must say something about the material poverty of the Russians of old, and their fondness for wooden buildings, that they had to take the word for ‘brick’ from a population generally associated with yurts.

The early Kipchak source Codex Cumanicus exhibits [Mongolic] borrowings like abaɣa ‘uncle’, čïray ‘face’, ebäk ~ elpäk ‘very much’, yada‑ ‘to get tired’, qurulta ‘assembly, council’, manglay ‘forehead’, nögär ‘follower’, and qaburqa ‘rib’.

For what it’s worth, several of these are commonplace in Tatar as well, namely абый, чырай, бик, маңгай and кабырга.

Mongolic *köper > *köxer ‘proud’ > ‘happy’ vs. Turkic *küpez (> *kübez) ‘proud’, Mongolic *köperge > *köxerge ‘bridge’ vs. Turkic *köprüg (*köbrüg).

Of the first set of words here, I’m tempted to claim some connection to Tatar чибәр ‘beautiful’, with cognates in languages of the Volga region meaning ‘happy’. Could the k‑ of the Mongolic or Bulgar word cited above have shifted to an affricate before a front vowel in some other language that was the source of the Tatar? However, I don’t seem to own any etymological reference that describes this possibility. Äxmat’janov’s Татар теленең кыскача тарихи-этимологик сүзлеге suggests only that the Tatar is borrowed from a Mongolic cegeber ‘white, clean’.

For the second set of words, I’ve long suspected a connection to Greek γέφῡρα, but the entry in Clauson on page 690 mentions no connection between the Turkic and other language families (except the loan in Mongolic), noting only: morphologically Dev. N. fr. köpür‑ [‘to froth, to foam’] but with no obvious semantic connection. On Greek γέφῡρα, Beekes on page 269 of his Etymological Dictionary of Greek suggests that the word is borrowed from Hattic hammuruwa ‘beam’: all instances of the word in Homeric Greek mean ‘beam’, and the meaning ‘bridge’ is attested only later. However, if a meaning ‘bridge’ is attested for this word by the mid 1st millennium BC, would that not give plenty of time for it to be borrowed into an unknown Iranian language of Central Asia and then picked up by Turkic?

Perso-Arabic vocabulary in Tatar

The great thing about learning Tatar vocabulary is that, with a little effort at finding out the different spellings, you often get Farsi and Tajik vocabulary (and Arabic, Turkish, a lot of Caucasian languages…) for free. Here’s a list of just a few recent things I’ve acquired:

Tatar | Farsi | Tajik
игътибар ‘attention’ | اعتبار | ?
хөрмәт ‘respect’ | حرمت | хурмат
һөнәр ‘specialization, focus’ | هنر | ҳунар
дәрәҗә ‘rank, authority’ | درجه | дараҷа
табигать ‘nature’ | طبيعت | табиат
дәвам ‘duration’ | دوام ‘durability, endurance’ | давом ‘duration’
шигар ‘slogan’ | شعار | ?

There may well be Tajik cognates for the two missing items, but unfortunately I never managed to buy a Tajik-Russian dictionary, and I can’t figure these out with my Russian-Tajik dictionary.

New edition of Routledge’s Colloquial Albanian

One of the things that always made Albanian seem so mysterious to me in the 1990s and early millennium was the dearth of quality learning materials, a strange state of affairs considering that Albanian is the official language of a decent-sized European country. For a long time, the only introduction easy to purchase was Isa Zymberi’s entry in Routledge’s Colloquial series. However, its presentation of this rather daunting language was opaque, and it was based entirely on the dialect of Kosovo (presumably because it was the only place learners of Albanian could freely travel to during the Communist era).

Happily, Routledge remedied this last year by publishing a new version of Colloquial Albanian by Linda Mëniku and Héctor Campos. This is based on the standard language established in Albania proper after the war, treating the Gheg and Tosk dialects only in the last chapter. From my initial impressions after buying a copy in a Helsinki bookshop and flipping through it, this new version lays out more clearly the complex (often irregular) morphology of Albanian. There is no English-Albanian glossary and the amount of vocabulary presented is fairly small, but it seems a fine start and I look forward to working through it before a trip to the Western Balkans this summer.

Inscriptions in Nepal

On my recent trip to Nepal I came across two inscriptions of linguistic interest.

The first is an unusual inscription in Kathmandu’s Durbar Square. This was placed here by King Pratap Malla in the 17th century. The king was a linguaphile and this poem to the goddess Kali includes words from 15 scripts and languages. According to an article in the Nepali newspaper República these are Persian, Arabic, Maithili, Kiranti, Newari, Kayathinagar (the script then used in western Nepal), Devanagari, Gaudiya, Kashmiri, Sanskrit, two different Tibetan scripts, English and French.

You can clearly make out French l’hiver ‘winter’ and automne ‘autumn’ as well as English winter.

Sadly, a significant part of this inscription has already been effaced. Indeed, the same is happening to most of the inscriptions in Durbar Square, and in spite of its UNESCO World Heritage Site status nothing is being done to protect them.

The second interesting inscription is on the pillar that the Emperor Ashoka set up in the 3rd century BC in Lumbini, the birthplace of the Buddha. This Prakrit-language proclamation releasing Lumbini from tax obligations is written in the Brahmi script. The plaque standing in front of the pillar has a Latin transliteration and translations into English and Nepali.

Iranian from quincunx and back again

When I first became acquainted with Persian some years ago, two grammatical features seemed unusual to me from an Indo-European perspective. One was the ezafe construction, which I eventually learned was the product of contact with Caucasian languages. But the other was the formation of the present tense with a prefix me‑ (indicative) or be‑ (subjunctive) followed by the verb stem and personal endings. In his chapter ‘Dialectology and Topics’ in Routledge’s The Iranian Languages pp. 24–25, Gernot Windfuhr offers a fine summary of the changes that produced the modern Persian system of tenses, which not only clarifies the origin of me‑ and be‑, but shows that Persian has returned to the same five-member tense/aspect system that Iranian (like Greek) started off with.

The history of the parameters and axes of the verb systems from Old Iranian to Modern Iranian shows a cycle from a five-member quincunx to varying Middle Iranian systems back to a quincunx. The development is shown here with the example of Persian.

The inherited fundamental and primary verbal parameter of the Early Old Iranian system is triple aspect which intersects with the binary tense parameter of present and past (marked by the augment a‑). It is centered on the perfective aorist:

Early Old Iranian
Imperfective (“Present system”): Present PR; Past a-PR
Perfective (“Aorist system”): AOR
Resultive-stative (“Perfect system”): Present PF; Past (a-PF)

In time, this triple aspect system was reduced to forms of the “present” system, i.e. imperfective present and imperfective past, leaving only a few forms of the aorist and the perfect. With their loss, the highly complex inherited system was reduced to a single imperfective stem, distinguishing present vs. augmented imperfect: PR vs. a-PR.

Concomitantly, however, the vacated aorist and perfect ranges of the system were partially filled by the innovation of a new perfective system based on the adjectival completive participle in -tá plus the present and past copula, with both intransitive and transitive verbs.

In Middle Persian, the resulting four-member system of two imperfective and two perfective forms was extended by replacing the copula with the stative verb ēst‑ ‘to stand’. The outcome was a six-member system with a triple aspect axis and a binary tense axis:

Middle Persian
Imperfective: Present raw‑ (present); Past (a-raw‑) (imperfect, later lost)
Perfective: Present raft COP (preterit); Past raft būd COP (past preterit)
Resultive-stative: Present raft ēst‑ (perfect); Past raft ēstād COP (pluperfect)

In addition, the adverb hamē lit. ‘forever’ expressed ongoing and progressive action as well as continuing state, while its pendant bē (homophonous with the adverb ‘out, away’) expressed the singularity of an event in present and past and assumed inchoative or future connotation with the present stem.

In Early New Persian, (ha)mē‑ and bē‑ were continued, but the periphrastic resultative ēst‑ forms were replaced by extended forms based on the verbal adjective in -tag (< *-taka). bi‑ and mē‑ could still occur with these verb forms, and neither was obligatory. The core system in terms of frequency was the following:

Early New Persian
Imperfective: Present mē-raw‑; Past mē-raft‑
Perfective: Present bi-raw‑ (inchoat.-fut.); Past bi-raft‑ (singularity)
Unmarked: Present raw‑ (gen. present); Past raft‑ (gen. past)
Resultive-stative: Present raft-a COP; Past raft-a bud‑

Subsequently the system was restructured by the coalescence of the unmarked forms with the perfective forms by the fifteenth century.

  1. In the present, the perfective bi-form assumed distinct subjunctive function, alternating with the unmarked general present form, now opposed to the indicative present-future mē-form.
  2. In the past, the general unmarked form subsumed the function of the bi-form to express both general and perfective events, now opposed to the imperfective mē-past form. It thereby assumed the central role of an aorist in the resulting five-member system.

The core of the system became thus as follows, and has not changed since:

Pre-Modern, Indicative
Imperfective: Present mē-rav‑; Past mē-raft‑
Perfective: raft‑
Resultive-stative: Present raft-a COP; Past raft-a bud‑

The non-indicative sub-system developed in parallel to the indicative core, using the imperfect and past-perfect forms for irreal function, and using the present subjunctive of ‘to be’ for the perfect subjunctive:

Pre-Modern, Non-Indicative
Imperfective: Present bi-rav‑; Past mē-raft‑
Perfective: raft‑
Resultive-stative: Present raft-a bāš; Past raft-a bud‑

The lesser-known W. Sidney Allen

Any student of classical languages with a linguistics bent will delight in discovering W. Sidney Allen’s books Vox Latina and Vox Graeca, which reconstruct the pronunciation of Classical Latin and Greek, respectively. Cambridge University Press has published them in relatively cheap paperbacks. However, there are two more works by this scholar that don’t get anywhere near the attention they deserve, even though they are logical next steps.

The first is Accent and Rhythm: Prosodic Features of Latin and Greek (Cambridge University Press, 1973). Here W. Sidney Allen takes the linguistic reconstruction of Greek and Latin one step further from Vox Latina and Vox Graeca to encompass suprasegmental aspects of these languages. This book does demand a greater understanding of theory (whereas the earlier books expected little more than some knowledge of IPA), and it takes some work to apply Allen’s insights to one’s own enunciation.

The second book treats what is historically the third important classical language for Indo-European studies, Sanskrit. Allen’s Phonetics in Ancient India (Oxford University Press, 1953) was published years before Vox Latina and Vox Graeca, and is organized somewhat differently in that it is mainly a retelling of the already very detailed ancient Indian sources for Sanskrit pronunciation. However, Allen does engage in some detective work to clarify matters obscure in the ancient grammarians, such as the pronunciation of the visarga.

Xenophon comix

In one of the odder installments in a university press series, volume 16 of Odense University Classical Studies is a graphic novel adaptation of Book I of Xenophon’s Anabasis, where the original Greek text is paired with illustrations by Minna Winsløw. Were this somewhat longer (it is only 25 pages long, heavily abridging the text) and the Greek written with better calligraphy, I could see this motivating at least some students out there.


You can find this in a university library near you – or probably not – under the title ΑΝΑΒΑΣΙΣ (Odense Universitetsforlag, 1991) ISBN 8774928007.

A Uralic loanword in late Proto-Indo-European?

I may have come across such etymologies before, but as far as I remember, this is the first proposal I’ve seen of a Uralic loanword in Proto-Indo-European. In Ananta Śāstram: Indological and Linguistic Studies in Honour of Bertil Tikkanen ed. Klaus Karttunen (Helsinki: Finnish Oriental Society, 2010), Asko Parpola has this to say on the etymology of Finnish kaivaa ‘dig’:

The Finnish words kaiva-a ‘to dig’ and kaivo ‘digging, well, pit’ have cognates in Finnic languages, in Saami and the Volgaic and Permic languages. Ante Aikio has shown that Proto-Finno-Ugric *kajwa- can be regularly connected with Proto-Samoyedic käjwa ‘spade’, as the change *a > *ä took place in Samoyedic before a tautosyllabic palatal consonant, thereby settling an old problem, the history and material of which is fully discussed by Aikio. Hence the etymon is an archaic Uralic nomen verbum.

What I offer here is not a new etymology, but simply a reference to an old etymology proposed as early as 1920 that was not included in the indexes of etymologically treated Finnish words by Donner and Erämetsä, and so has escaped notice in SKES and SSA. K. F. Johansson had reconstructed an archaic Proto-Indo-European heteroclitic noun *kaiw-r̥-t (nom.) ~ *kaiwn̥n-eś (gen.) on the basis of Greek and Old Indo-Aryan. Hesychius records καίατα in the sense of ‘pits, excavations, trenches, ditches’ (ὀρύγματα) or ‘landslide chasms caused by earthquake’ (ἢ τὰ ὑπὸ σεισμῶν καταρραγέντα χωρία). The plural καίατα is supposed to stand for καίϝατα, from the singular καίϝαρ. Old Indo-Aryan kevaṭa- ‘pit’ is attested in a single occurrence in the oldest text, Rigveda, 6,45,7; Old Indo-Aryan e goes back to Proto-Aryan *ai and *rt has often become retroflex *ṭ. Pokorny accepts the comparison and reconstructs for Proto-Indo-European *kaiwr̥t *kaiwn̥-t. Thomas Burrow and Manfred Mayrhofer have considered the scanty evidence in both Old Indo-Aryan and Greek as too uncertain for the assumption of a PIE heterocliton. Still, Mayrhofer thinks it is possible that the words are related. Herbert Petersson also emphasizes that no trace of this etymon is found in other Indo-European languages — and Frisk points out that no corresponding PIE verbal root can be traced — while the root structure too, with a diphthong followed by -w-, also looks peculiar for PIE. Petersson therefore takes this to be one of the rare cases where Proto-Indo-European is likely to have borrowed from Proto-Finno-Ugric. Mayrhofer refers to Petersson’s suggestion as noteworthy but unconfirmed. However, the confirmed Uralic origin of kajwa- and the archaic appearance of the word on both sides gives new significance to Petersson’s hypothesis.

(The title of Parpola’s contribution to this volume is ‘New Etymologies for Some Finnish Words’, pp. 305–318. In quoting it here, I have slightly abridged the text and left out the parenthetical citations for the sake of readability.)