Perso-Arabic vocabulary in Tatar

The great thing about learning Tatar vocabulary is that, with a little effort at finding out the different spellings, you often get Farsi and Tajik vocabulary (and Arabic, Turkish, a lot of Caucasian languages…) for free. Here’s a list of just a few recent things I’ve acquired:

Tatar                           Farsi                          Tajik
игътибар ‘attention’            اعتبار
хөрмәт ‘respect’                حرمت                           хурмат
һөнәр ‘specialization, focus’   هنر                            ҳунар
дәрәҗә ‘rank, authority’        درجه                           дараҷа
табигать ‘nature’               طبيعت                          табиат
дәвам ‘duration’                دوام ‘durability, endurance’   давом ‘duration’
шигар ‘slogan’                  شعار

There may well be Tajik cognates for the two missing items, but unfortunately I never managed to buy a Tajik-Russian dictionary, and I can’t figure these out with my Russian-Tajik dictionary.

Inscriptions in Nepal

On my recent trip to Nepal I came across two inscriptions of linguistic interest.

The first is an unusual inscription in Kathmandu’s Durbar Square. This was placed here by King Pratap Malla in the 17th century. The king was a linguaphile and this poem to the goddess Kali includes words from 15 scripts and languages. According to an article in the Nepali newspaper República these are Persian, Arabic, Maithili, Kiranti, Newari, Kayathinagar (the script then used in western Nepal), Devanagari, Gaudiya, Kashmiri, Sanskrit, two different Tibetan scripts, English and French.

You can clearly make out French l’hiver ‘winter’ and automne ‘autumn’ as well as English winter.

Sadly, a significant part of this inscription has already been effaced. Indeed, the same is happening to most of the inscriptions in Durbar Square, and in spite of its UNESCO World Heritage Site status nothing is being done to protect them.

The second interesting inscription is on the pillar that the Emperor Ashoka set up in the 3rd century BC in Lumbini, the birthplace of the Buddha. This Prakrit-language proclamation releasing Lumbini from tax obligations is written in the Brahmi script. The plaque standing in front of the pillar has a Latin transliteration and translations into English and Nepali.

Iranian from quincunx and back again

When I first became acquainted with Persian some years ago, two grammatical features seemed unusual to me from an Indo-European perspective. One was the ezafe construction, which I eventually learned was the product of contact with Caucasian languages. But the other was the formation of the present tense with a prefix me‑ (indicative) or be‑ (subjunctive) followed by the verb stem and personal endings. In his chapter ‘Dialectology and Topics’ in Routledge’s The Iranian Languages pp. 24–25, Gernot Windfuhr offers a fine summary of the changes that produced the modern Persian system of tenses, which not only clarifies the origin of me‑ and be‑, but shows that Persian has returned to the same five-member tense/aspect system that Iranian (like Greek) started off with.

The history of the parameters and axes of the verb systems from Old Iranian to Modern Iranian shows a cycle from a five-member quincunx to varying Middle Iranian systems back to a quincunx. The development is shown here with the example of Persian.

The inherited fundamental and primary verbal parameter of the Early Old Iranian system is triple aspect which intersects with the binary tense parameter of present and past (marked by the augment a‑). It is centered on the perfective aorist:

Early Old Iranian
                     Present    Past
Imperfective         PR         a-PR       “Present system”
Perfective                AOR              “Aorist system”
Resultive-stative    PF         (a-PF)     “Perfect system”

In time, this triple aspect system was reduced to forms of the “present” system, i.e. imperfective present and imperfective past, leaving only a few forms of the aorist and the perfect. With their loss, the highly complex inherited system was reduced to a single imperfective stem, distinguishing present vs. augmented imperfect: PR vs. a-PR.

Concomitantly, however, the vacated aorist and perfect ranges of the system were partially filled by the innovation of a new perfective system based on the adjectival completive participle in -tá plus the present and past copula, with both intransitive and transitive verbs.

In Middle Persian, the resulting four-member system of two imperfective and two perfective forms was extended by replacing the copula with the stative verb ēst‑ ‘to stand’. The outcome was a six-member system with a triple aspect axis and a binary tense axis:

Middle Persian
                     Present        Past
Imperfective         raw‑           (a-raw‑)
                     present        imperfect (later lost)
Perfective           raft COP       raft būd COP
                     preterit       past preterit
Resultive-stative    raft ēst‑      raft ēstād COP
                     perfect        pluperfect

In addition, the adverb hamē lit. ‘forever’ expressed ongoing and progressive action as well as continuing state, while its pendant bē (homophonous with the adverb ‘out, away’) expressed the singularity of an event in present and past and assumed inchoative or future connotation with the present stem.

In Early New Persian, (ha)mē‑ and bē‑ were continued, but the periphrastic resultative ēst‑ forms were replaced by extended forms based on the verbal adjective in -tag (< *-taka). bi‑ and mē‑ could still occur with these verb forms, and neither was obligatory. The core system in terms of frequency was the following:

Early New Persian
                     Present          Past
Imperfective         mē-raw‑          mē-raft‑
Perfective           bi-raw‑          bi-raft‑
                     inchoat.-fut.    singularity
Unmarked             raw‑             raft‑
                     gen. present     gen. past
Resultive-stative    raft-a COP       raft-a bud‑

Subsequently the system was restructured by the coalescence of the unmarked forms with the perfective forms by the fifteenth century.

  1. In the present, the perfective bi-form assumed distinct subjunctive function, alternating with the unmarked general present form, now opposed to the indicative present-future mē-form.
  2. In the past, the general unmarked form subsumed the function of the bi-form to express both general and perfective events, now opposed to the imperfective mē-past form. It thereby assumed the central role of an aorist in the resulting five-member system.

The core of the system became thus as follows, and has not changed since:

Pre-Modern, Indicative
                     Present       Past
Imperfective         mē-rav‑       mē-raft‑
Perfective                  raft‑
Resultive-stative    raft-a COP    raft-a bud‑

The non-indicative sub-system developed in parallel to the indicative core, using the imperfect and past-perfect forms for irreal function, and using the present subjunctive of ‘to be’ for the perfect subjunctive:

Pre-Modern, Non-Indicative
                     Present       Past
Imperfective         bi-rav‑       mē-raft‑
Perfective                  raft‑
Resultive-stative    raft-a bāš    raft-a bud‑

The lesser-known W. Sidney Allen

Any student of classical languages with a linguistics bent will delight at discovering W. Sidney Allen’s books Vox Latina and Vox Graeca that reconstruct the pronunciation of Classical Latin and Greek, respectively. Cambridge University Press has published them in relatively cheap paperbacks. However, there are two more works by this scholar that don’t get anywhere near the attention they deserve, even though they are logical next steps.

The first is Accent and Rhythm: Prosodic Features of Latin and Greek (Cambridge University Press, 1973). Here W. Sidney Allen takes the linguistic reconstruction of Greek and Latin one step further from Vox Latina and Vox Graeca to encompass suprasegmental aspects of these languages. This book does demand a greater understanding of theory (whereas the earlier books expected little more than some knowledge of IPA), and it takes some work to apply Allen’s insights to one’s own enunciation.

The second book treats what is historically the third important classical language for Indo-European studies, Sanskrit. Allen’s Phonetics in Ancient India (Oxford University Press, 1953) was published years before Vox Latina and Vox Graeca, and is organized somewhat differently in that it is mainly a retelling of the already very detailed ancient Indian sources for Sanskrit pronunciation. However, Allen does engage in some detective work to clarify matters obscure in the ancient grammarians, such as the pronunciation of the visarga.

A Uralic loanword in late Proto-Indo-European?

I may have come across such etymologies before, but as far as I remember, this is the first proposal I’ve seen of a Uralic loanword in Proto-Indo-European. In Ananta Śāstram: Indological and Linguistic Studies in Honour of Bertil Tikkanen ed. Klaus Karttunen (Helsinki: Finnish Oriental Society, 2010), Asko Parpola has this to say on the etymology of Finnish kaivaa ‘dig’:

The Finnish words kaiva-a ‘to dig’ and kaivo ‘digging, well, pit’ have cognates in Finnic languages, in Saami and the Volgaic and Permic languages. Ante Aikio has shown that Proto-Finno-Ugric *kajwa- can be regularly connected with Proto-Samoyedic käjwa ‘spade’, as the change *a > *ä took place in Samoyedic before a tautosyllabic palatal consonant, thereby settling an old problem, the history and material of which is fully discussed by Aikio. Hence the etymon is an archaic Uralic nomen verbum.

What I offer here is not a new etymology, but simply a reference to an old etymology proposed as early as 1920 that was not included in the indexes of etymologically treated Finnish words by Donner and Erämetsä, and so has escaped notice in SKES and SSA. K. F. Johansson had reconstructed an archaic Proto-Indo-European heteroclitic noun *kaiw-r̥-t (nom.) ~ *kaiwn̥n-eś (gen.) on the basis of Greek and Old Indo-Aryan. Hesychius records καίατα in the sense of ‘pits, excavations, trenches, ditches’ (ὀρύγματα) or ‘landslide chasms caused by earthquake’ (ἢ τὰ ὑπὸ σεισμῶν καταρραγέντα χωρία). The plural καίατα is supposed to stand for καίϝατα, from the singular καίϝαρ. Old Indo-Aryan kevaṭa- ‘pit’ is attested in a single occurrence in the oldest text, Rigveda, 6,45,7; Old Indo-Aryan e goes back to Proto-Aryan *ai and *rt has often become retroflex *ṭ. Pokorny accepts the comparison and reconstructs for Proto-Indo-European *kaiwr̥t *kaiwn̥-t. Thomas Burrow and Manfred Mayrhofer have considered the scanty evidence in both Old Indo-Aryan and Greek as too uncertain for the assumption of a PIE heterocliton. Still, Mayrhofer thinks it is possible that the words are related. Herbert Petersson also emphasizes that no trace of this etymon is found in other Indo-European languages — and Frisk points out that no corresponding PIE verbal root can be traced — while the root structure too, with a diphthong followed by -w-, also looks peculiar for PIE. Petersson therefore takes this to be one of the rare cases where Proto-Indo-European is likely to have borrowed from Proto-Finno-Ugric. Mayrhofer refers to Petersson’s suggestion as noteworthy but unconfirmed. However, the confirmed Uralic origin of kajwa- and the archaic appearance of the word on both sides gives new significance to Petersson’s hypothesis.

(The title of Parpola’s contribution to this volume is ‘New Etymologies for Some Finnish Words’, pp. 305–318. In quoting it here, I have slightly abridged the text and left out the parenthetical citations for the sake of readability.)

Substrate speculations

Just to briefly mention two substrate hypotheses which I’ve come across in the last 24 hours:

  1. Theo Vennemann posits a Semitic substrate for Proto-Germanic, an encounter made possible by Phoenician colonization of the North Sea area. Among the supposed loanwords are the names of the Germanic gods Pol and Baldur, none other than the Semitic god Baal. Vennemann’s vast work on Semitic and Basque substrates in Europe seems to be politely tolerated but generally ignored by IEists, and I heard of this hypothesis from the popular press: John McWhorter’s Our Magnificent Bastard Tongue: The Untold Story of English. McWhorter does mention that there are serious objections to this theory, but in my opinion, even bringing it up at all risks leading impressionable laymen astray.
  2. Alexander Lubotsky’s article ‘The Indo-Iranian Substratum’ in Early Contacts between Uralic and Indo-European: Linguistic and Archaeological Considerations ed. Carpelan et al. (Helsinki: Finno-Ugrian Society, 2001) notes that the phonological peculiarities of non-Indo-European words in Indo-Iranian are the same for loanwords in Indo-Aryan specifically. The author writes: ‘In order to account for this fact, we are bound to assume that the language of the original population of the towns of Central Asia, where the Indo-Iranians must have arrived in the second millennium BCE, on the one hand, and the language spoken in Punjab, the homeland of the Indo-Aryans, on the other, were intimately related.’

Romani exonyms

In Romani: a linguistic introduction (Cambridge University Press, 2005), Yaron Matras gives several examples of how the Roma people have been very inventive with names for the countries and people encountered on their westward migration (pp. 26–27):

Characteristic of Romani is – alongside replications of nations’ self-ascription (e.g. sasitko ‘German’, njamco ‘German’, valšo ‘French’) – the widespread use of inherited or internal names for nations. Thus we find das ‘Slavs’ (cf. OIA dāsa- ‘slave’), a word play based on Greek sklavos; xoraxaj/koraxaj of unclear etymology, in the Balkans generally ‘Muslim, Turk’ and elsewhere ‘foreigner’ or ‘non-Rom’; gadžo ‘non-Rom’. Other inherited words for non-Rom include xalo (‘meagre, shabby’), also in the diminutive xaloro ‘Jew’, balamo and goro ‘Greek, non-Gypsy’; biboldo ‘Jew’ (‘unbaptised’), chindo ‘Jew’ (‘cut’ = ‘circumcised’), trušulo ‘Christian’ (cf. trušul ‘cross’), džut ‘Jew’ (possibly Iranian). Names attached to foreign countries by individual Romani groups often refer to incomprehensible speech, based on either lal- ‘dumb’ or čhib ‘tongue’: lallaro-temmen ‘Finland’ and lalero them ‘Bohemia’ (= ‘dumb land’), lalero ‘Lithuanian’, čibalo/čivalo meaning ‘Albania’ among Balkan Rom, ‘Bavaria’ among German Rom, and ‘Germany’ among Yugoslav Rom. More recently, barvale thema (lit. ‘rich countries’) has emerged as a designation for ‘western Europe’, lole thema (lit. ‘red countries’) for ‘eastern, communist Europe’.

Internal creations of place names are common mostly among the northwestern dialects of Romani. They are frequently either translations, or semantic or sound associations based on the original place names: nevo foro lit. ‘new town’ for ‘Neustadt’, xačerdino them lit. ‘burned country’ for ‘Brandenburg’, čovaxanjakro them lit. ‘witches’ country’ for ‘Hessen’ (German Hexen ‘witches’), kiralengro them lit. ‘cheese country’ for ‘Switzerland’, u baro rašaj lit. ‘the big priest’ for ‘Rome’, lulo piro lit. ‘red foot’ for ‘Redford’, baro foro lit. ‘big town’ for capital cities of various countries (Helsinki, Stockholm, Belgrade).

I remember thinking how cool it was that the Chuvash coined the name чул хула ‘stone city’ for Nižnyj Novgorod, once the closest large Russian settlement, and how disappointed I was to hear that it was no longer in use. I wonder how many of these Romani examples are still current.

Classical philology is dead in India

Sheldon Pollock, one of the most prominent scholars of Sanskrit literature today, has contributed a jeremiad entitled ‘Crisis in the Classics’ to the journal Social Research Vol. 78 No. 1 (Spring 2011) on the decline of classical philology in India. The article is available as a PDF and its 28 pages have so much good material that I can hardly decide what to quote here, but here’s the heart of Pollock’s observations:

Indeed, there have been no successors to any of the pre-independence generation of Sanskrit scholars, the sort who mastered their discipline and thought conceptually about it and wrote for an international audience: S. N. Dasgupta, S. K. De, Mysore Hiriyanna, P. V. Kane, S. Radhakrishnan, Venkata Raghavan, C. Kunhan Raja, V. S. Sukthankar, are the first in a long and distinguished list from across India (I leave aside the loss of the great tradition of pandit learning, which is now virtually extinct). There have been no major Sanskrit projects in India since the completion of the critical edition of the Ramayana at Baroda more than 30 years ago. All the great classical series (such as Anandasrama, Trivandrum, Gaekwad, Madras) have been more or less discontinued, and as a result the manuscripts in those collections are no longer being published. Indeed, there have been few new Indian editions of complex Sanskrit texts at all from among the scores of important manuscripts that lie unpublished in archives. In the area of hermeneutics (Mimamsa), for example, I know of no one in India today capable of editing works like those edited just a generation ago by P. N. Pattabhirama Sastry or S. Subrahmanya Sastry. (The same holds for many other areas of classical studies; with the death of A. N. Upadhye in 1975 and H. C. Bhayani in 2000, the editing of Prakrit and Apabhramsha works seems to have died too.) I have not encountered a single PhD dissertation on Sanskrit in India—and I have seen many—worthy of publication by a Western university press.

The situation is no different in the other classical languages, as I learned in the late 1990s when I organized a project on the histories of South Asian literary cultures (Pollock 2003). Our core group of colleagues was looking for others to join us who possessed a deep historical understanding of a regional language, conceptual skills, and the capacity to communicate their knowledge effectively. We were able to locate only four qualified scholars in India, and identified no one for a host of languages, including Assamese, Marathi, Newari, Oriya, and Panjabi.

I suspected as much when I visited university bookshops in India: almost no publications from the last 30 years, and heaps of decaying old editions that evidently no one wanted to buy. Online language discussions are so often overrun by Hindu fundamentalist claims that Sanskrit is a divine language and India’s literature the oldest and wisest in the world, but for all the prominence of such views on the internet, this heritage is neglected in India.

The dull conversations of Tajikistan

I’ve been travelling in Tajikistan for a few days now and I’m not liking it much as a linguist. It’s not because the people aren’t friendly; for not a single night have I lacked invitations for a place to dine and sleep comfortably. But conversations here tend to all be the same. The first repetitive response happens all over the former Soviet Union: Your name is Christopher, eh? Like Christopher Columbus/Christopher Lambert! I guess I’m used to that one, and I just laugh and pretend I haven’t heard it myriad times before.

But essentially all conversations devolve into this very quickly:

Tajik: Are you married?

Me: No, I am not married yet.

Tajik: How old are you?

Me: I am 29 years old.

Tajik: You need to get married! [The more good-humoured locals will at this point indicate the closest unmarried woman and propose I marry her]

Me: I don’t wish to get married yet.

Tajik: Why?

Me: Because I wish to travel and study and remain a free man.

After this they tend to grumble a fair bit — it does seem that some are appalled by what I said — and the conversation returns to marriage constantly. It would be nice to talk about something else and to perceive some element of culture. What happened to even fairly poor, rural locals knowing something about shashmaqâm or Persian classical poetry as travelers in Transoxania reported less than 20 years ago? It’s especially frustrating since I came here to learn Tajik, but there’s not enough of a variety of conversational topics to really expand my vocabulary.

Memsahib Hindi

In his textbook Teach Yourself Beginner’s Hindi Script Rupert Snell offers the following charming anecdote:

Legend has it that in the days of the Raj the British memsahibs, indifferent to real Hindi, would learn simple Hindi commands by assimilating them to English phrases: ‘There was a banker’ was to be interpreted by servants as representing दरवाज़ा बंद कर darvāzā band kar ‘Close the door’, and ‘There was a cold day’ meant दरवाज़ा खोल दे darvāzā khol de, ‘Open the door’. Thankfully, those days are long gone.

While a Google search on the first phrase gives no results, a search on the second comes up with numerous references to this phenomenon, but only as an urban myth. If this were really a commonplace among Brits in India, one would expect it to appear in some contemporary written form, perhaps something penned as advice to recent arrivals.