The Volga Bulgarian inscription of Tatarskoe Shapkino

The palatalization of Proto-Turkic /č/ to /č́/, followed by the weakening of the affricate’s initial stop to give /š́/ or /š/, is a notable areal feature extending from the Volga–Kama region into Kazakhstan. In the second volume of Róna-Tas and Berta’s Western Old Turkic (Harrassowitz, 2011), which reconstructs the ancestor of Volga Bulgarian and Chuvash on the basis of loanwords into Hungarian, the authors mention how the Tatars, whose own language would soon undergo the same evolution, encountered this change when it was already almost complete in Volga Bulgarian:

Important is the bilingual inscription of Tatar Šapkino. In the Arabic inscription containing Volga Bulgarian words, the name of the deceased lady is written as J̌eker, and should be read as /č́eker/, while on the other side of the same stone, the same name is written as Šeker. What was perceived as /č/ by the Volga Bulgars was heard by the Kipchak Tatars as /š/.
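The chain of changes described above can be sketched as a list of ordered rewrite rules. This is only an illustration: the three stages come from the text, while the helper name `apply_changes` and the idea of applying the rules to a bare segment are assumptions made for the example, not a formal analysis.

```python
# Illustrative sketch: the stages are taken from the text, the helper is hypothetical.
ACUTE = "\u0301"   # combining acute accent, marks palatalization here
C = "\u010d"       # č
S = "\u0161"       # š

# Ordered changes: each pair is (older segment, newer segment).
CHANGES = [
    (C, C + ACUTE),          # palatalization of the affricate
    (C + ACUTE, S + ACUTE),  # the affricate loses its initial stop
    (S + ACUTE, S),          # optional loss of palatalization
]

def apply_changes(segment, steps):
    """Apply the first `steps` changes in order; return every stage reached."""
    history = [segment]
    for old, new in CHANGES[:steps]:
        segment = segment.replace(old, new)
        history.append(segment)
    return history

# Print the full chain from the starting affricate to the final sibilant.
print(" > ".join(apply_changes(C, 3)))
```

Stopping after one or two steps gives the intermediate stages, which is roughly the situation the inscription records: one side of the stone reflects an earlier stage than the other.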

Tatarskoe Shapkino is a village in south-central Tatarstan. A description of the Arabic portion of this inscription can be found in Khakimzjanov’s Язык эпитафий волжских булгар (Moscow: Nauka, 1978) on pages 158–159:

هو الحى الذى لا يموت
هذه روضة مستورة
المطهرة الصَّالحة الصائـنة الطيفة
شكر الجى بنت عثمان البلفارؾ
الهم ارحمها رحمة واسعة توفيت
الى رحمة الله تعالى فى اليوم الرابع و العشريں

Huwa-l-xäjji-l-läzi lä jämutu wä küllü häjjin säjämutu. Haẕihi rawḍatu-l-mästüräti-l-muṭahhiräti-ṣ-ṣalixäti-ṣ-ṣa’inäti-ṭ-tajfäti Šäkär-älči bint Gos̱man äl-Bolɣari. Äl-lähummä ärxämha räxmätän wäsigätän. Tuwufijjat ilä-r-räxmäti-l-lahi tägali fi-l-jawmi-r-rabigi wä-l-gišrinä

He lives who does not die, but every living thing dies. This is the plot of the chaste, devout, pious, caring, compassionate Šeker-elči, daughter of Osman the Bulgarian. God, have mercy on her with your great mercy. She was entrusted to the mercy of God the Most-High on the twenty-fourth day.

A photograph of the Volga Bulgarian inscription of Tatarskoe Šapkino

The monument lies in the village cemetery and has dimensions of 160×60×23 cm. It has been inscribed in two languages: on the obverse there is an Arabic-language inscription written in relief in the Thuluth style of calligraphy, while on the reverse a Turkic text has been inscribed in the Bulgarian variant of the Kufic style. There is also relief writing on the sides of the monument.

A piece of pottery is lying nearby with writing on both sides (but it has not been successfully deciphered). This may give the date of the inscription in question.

Bartens’s history of Permian vowels

In my study of Udmurt and Komi, I have produced an English translation of the chapter on Permian vowels from Raija Bartens’s Permiläisten kielten rakenne ja kehitys (The Structure and Development of the Permian Languages, Helsinki: Finno-Ugrian Society, 2001). While Bartens’s book no longer represents the state of the art in Uralic linguistics, and in the years since Sándor Csúcs has shaken the field up with such publications as Die Rekonstruktion der permischen Grundsprache (Budapest: Akadémiai Kiadó, 2005), Permiläisten kielten rakenne ja kehitys does provide a helpful introduction to 20th-century work on Permian vocalism.

Names for ‘ladybug’ in Udmurt

The Диалектологический атлас удмуртского языка edited by R. I. Nasibullin et al. (Iževsk: R&C Dynamics, 2009) has a series of maps showing the distributions of the Udmurt names for various things across the area where the language is spoken. For most items, there are only a few variants, and in the case of borrowing, Russian loans are prevalent in the north of the Udmurt Republic while Tatar loans are prevalent in the south.

The word for ‘ladybug’ (Russian божья коровка) is a different story. The atlas lists 124 variants.

Some of these are very colourful: ӵужанай ‘maternal grandmother’, вӧйын нянь сиись ‘bread-and-butter eater’. A large number are formed with зор ‘rain’ (< Volga Bulgarian, cf. Chuvash ҫур ‘snow’). Nasibullin examines these names more closely in his article ‘“Божья коровка” в удмуртских говорах’ in the journal Иднакар (issue 2007-2).

Amusingly, after the myriad names for ‘ladybug’, the atlas documents only one name (with varying vocalism) for that most common pest on Earth, the cockroach: торокан/таракан/тӓрӓкӓн (cf. Russian таракан).

(If this kind of variation fascinates you, in North America, the various names for the family Armadillidiidae, which I grew up calling a roly poly, have also been mapped.)

New edition of Routledge’s Colloquial Albanian

The cover of Routledge’s Colloquial Albanian

One of the things that always made Albanian seem so mysterious to me in the 1990s and early millennium was the dearth of quality learning materials, a strange state of affairs considering that Albanian is the official language of a decent-sized European country. For a long time, the only introduction easy to purchase was Isa Zymberi’s entry in Routledge’s Colloquial series. However, its presentation of this rather daunting language was opaque, and it was based entirely on the dialect of Kosovo (presumably because it was the only place learners of Albanian could freely travel during the Communist era).

Happily, Routledge remedied this last year by publishing a new version of Colloquial Albanian by Linda Mëniku and Héctor Campos. This is based on the standard language established in Albania proper after the war, treating the Gheg and Tosk dialects only in the last chapter. From my initial impressions after buying a copy in a Helsinki bookshop and flipping through it, this new version lays out more clearly the complex (often irregular) morphology of Albanian. There is no English-Albanian glossary and the amount of vocabulary presented is fairly small, but it seems a fine start and I look forward to working through it before a trip to the Western Balkans this summer.

Austronesian is no longer alone?

Proposals of macrofamilies are interesting, especially when based on data only recently elicited from hitherto-unstudied languages. I’ve come across a paper by Juliette Blevins titled “A Long Lost Sister of Proto-Austronesian?” published in Oceanic Linguistics vol. 46 no. 1 (June 2007) that links two Andaman languages to Austronesian.

Mari and Udmurt children’s poetry

Textbooks of Finno-Ugrian languages written for foreign learners really like to give children’s poetry as translation exercises. Thus Марийский язык для всех presents the following from one Pet Pershut:

Кутко сӱан

Тыгыде кутко —
Шем кутко,
Йошкар кутко —
Ер кутко,
Сар кутко —
Сад кутко
Кеҥеж кечын сад мучко
Каеныт корно мучко,
Пурак веле тӱргалтын,
Изи йыҥгыр мӱгыралтын.
Орава да тарантас ден,
Шым гитар ден,
Шым шӱвыр ден,
Вич тӱмыр ден,
Волен, кӱзен,
Шудым пӱген,
Каеныт сӱаныш,
Рӱж миеныт йыраҥыш.

The ant wedding

Small ants,
black ants,
red ants,
lakeshore ants,
grey ants,
garden ants
They made their way
through the garden on a summer day,
carrying only crumbs,
singing a little song.
With carts and wagons,
with seven guitars,
with seven bagpipes,
with five drums,
they sang and danced,
and made merry,
They went on, they went up,
They bent down grain stalks,
They went to the wedding,
with a buzz they headed into the flower-bed.

The third chapter of the Udmurt textbook Марым, леся… gives a series of several poems by Alla Kuznetsova exemplifying the numerals just introduced. Here’s the one for ‘7’:

Сизьым туж тодмо мыным,
Сизьым нунал арняын:
Вордӥськон бере пуксён,
Вирнунал, покчиарня,
Крезьгуро удмуртарня,
Кӧснунал, арнянунал.

Seven things are very familiar to me,
The seven days of the week:
Monday then Tuesday,
Wednesday, Thursday,
Melodious Friday
Saturday, Sunday.

I don’t much care for this. Adult learners should not be treated like children. Sure, it may be a few chapters before a student is ready for it, but it would be more dignified to bring in selections from folk songs or simple selections from novels.

Inscriptions in Nepal

On my recent trip to Nepal I came across two inscriptions of linguistic interest.

The first is an unusual inscription in Kathmandu’s Durbar Square. This was placed here by King Pratap Malla in the 17th century. The king was a linguaphile and this poem to the goddess Kali includes words from 15 scripts and languages. According to an article in the Nepali newspaper República these are Persian, Arabic, Maithili, Kiranti, Newari, Kayathinagar (the script then used in western Nepal), Devanagari, Gaudiya, Kashmiri, Sanskrit, two different Tibetan scripts, English and French.

You can clearly make out French l’hiver ‘winter’ and automne ‘autumn’ as well as English winter.

Sadly, a significant part of this inscription has already been effaced. Indeed, the same is happening to most of the inscriptions in Durbar Square, and in spite of its UNESCO World Heritage Site status nothing is being done to protect them.

The second interesting inscription is on the pillar that the Emperor Ashoka set up in the 3rd century BC in Lumbini, the birthplace of the Buddha. This Prakrit-language proclamation releasing Lumbini from tax obligations is written in the Brahmi script. The plaque standing in front of the pillar has a Latin transliteration and translations into English and Nepali.

Close ẹ in the Common Turkic vowel system

Here I present a translation of pp. 18–23 of the Фонетика volume of the Сравнительно-историческая грамматика тюркских языков ed. E. R. Tenišev (Moscow: Nauka, 1984). This is a work which I appreciate more and more as time goes by, and I hope to bring further portions of it into English in future.

Close ẹ in the Common Turkic vowel system

This problem arises in the reconstruction of Proto-Turkic vocalism, and its solution depends on solving the question of how many cardinal vowels there were in Proto-Turkic: 8 or 9. Theoretically the following hypotheses are possible:

  1. the Proto-Turkic system had ä (wide) and ẹ (narrow);

  2. there was a qualitative and quantitative opposition of these vowels, i.e. ä versus ẹː (wide short versus narrow long);

  3. there was ä (short) and äː (long): this variation of the reconstruction is actually very similar to the second if one takes into account that phonetically a long vowel is usually more close than the corresponding short one;

  4. there were äː and ẹ (wide long and narrow short);

  5. there were ä, äː, and ẹː;

  6. at an early stage Proto-Turkic had ä̂, äː, ä and ệ, ẹː, ẹ, but in Common Turkic (with the exception of Chuvash) ä̂ and ệ fell together into ä̂, while ä and ẹ fell together into ä (cf. variant 5).

All of these variants have been discussed in specialist literature.[1]

First of all, one must observe that in the modern Turkic languages there are not two (open and close) but several phonetic variants of phonemes which can be presented in transcription as æ, ä, ɛ, e. Furthermore, in each specific system or type of system they have their own particular origin and status.

Thus in languages of the Kipchak type (Kazakh, Karakalpak and Nogay), where a comparatively regular raising of mid vowels occurred, the variant e was established, which could have originated in ɛ or ẹ. In Tatar and Bashkir, this e (< ä, ẹ) shifted to i, but in the affixal subsystem it is represented by ä (a front variant of a).[2] Tatar and Bashkir also developed a secondary ä from a in the environment of dorsal j, z, , ž, š, ǯ, č, ž and both ä fell together:

ä/ä in suffixes } ä
ä (vowel harmony variant of a)

In Turkish (taking its dialects into account) there resulted ɛ and e and even ä, e, , though in the literary language ɛ and e did not develop into independent phonemes.

In Turkmen a new opposition between ɛ and äː arose, whereas earlier , apparently through a stage ei, gave . The stage e was preserved in the Khorezm dialects of Uzbek: eːr ‘early’ (~ Turkmen iːr), and in Turkmen dialects (eːr ‘early’, eːl ‘country’, beːl ‘small of the back’, geːč ‘late’).

The opposition of long and short e (ɛ vs. ) can be found in Azeri, but now it is not quantitative (ä vs. ) but qualitative (äl ‘hand’ vs. el ‘country’; Azeri er ‘early’, bel ‘small of the back’, geǯ ‘late’ ~ Turkmen iːr ‘early’, biːl ‘small of the back’, giːč ‘late’).

Apart from this, in Azeri (and Turkmen) a shift ä > e took place in the environment of j, attested already in ancient languages, and also in rare instances of assimilation before a following i (ä > e): Azeri jet‑ ‘arrive, reach’, jer ‘earth’, cf. Old Turkic jetirü ‘until’ and jer ‘earth’ in the Brahmi texts, Azeri ešik ‘door’, Turkmen iːšik ‘door’[3], but Azeri dämir ‘iron’ (Turkish demir), gämi ‘boat’, where there is no influence from i.

In Yakut the quantitative and qualitative opposition between ɛ and changed into an opposition between ɛ and i͜e[4], i.e. between a relatively short vowel and a diphthong: än ‘you’ versus i͜en ‘width’. Furthermore, there is also a dialectal variation i ~ e (is‑ ~ es‑ ‘wade’, ilt‑ ~ elt‑ ‘lead’, iliː ~ eliː ‘hand’) and ä, e > i under the influence of j (> ǯ > č > s): sir‑ ‘reject’, sit‑ ‘reach, attain’, and also i of a following syllable: tirit‑ ‘sweat’ (< tär ‘sweat’), tiriː ‘leather, hide, skin’ (< *täriɣ), diriŋ ‘deep’ (< *däriŋ), timir ‘iron’ (< *tämir).

In Chuvash ɛ is of recent origin. It is a substitution for Tatar ä in loanwords and the front variant of the wide vowel in suffixes.

In Chuvash a and i correspond to the Common Turkic phoneme e (ɛ and ). Thus since a can be found instead of the mid variant of the vowel, i.e. ɛ > ä > a[5], and i is usually found instead of high (ẹː) or a diphthong, one could imagine that Chuvash reflects more accurately the ancient qualitative opposition between ä (äː) and e (ëː ?). For every case of a in Chuvash, at an earlier stage of Common Turkic there must have been ä (or äː), and wherever Chuvash has i earlier Common Turkic had e ().

In the remaining Turkic languages one must consider the qualitative opposition between ä and e to be lost and explain the various reflexes of these vowels as traces of a quantitative opposition, i.e.:

Azeri ä < ä, e; e in many cases < äː,
Turkmen e < ä, e; < äː,
Yakut ä < ä, e; i͜e < äː, , etc.

Analysing Chuvash examples, we find that Chuvash a reflects ä from Common Turkic and ä, and also from ä in some loanwords.

  1. Turkmen giːč ‘late’ (gẹːč), Turkish geč, Azeri g’eǯ, Yakut ki͜ehä ~ Chuvash kas’; Turkmen ber‑ ‘give’, Turkish dial. beːr‑, Yakut bi͜er‑ ~ Chuvash par‑; Turkmen iːn ‘width’, Yakut i͜en, Azeri en ~ Chuvash an; Turkmen iːn‑ ‘go down’, Azeri en‑ ~ Chuvash an‑[6];

  2. Turkmen ek‑ ‘sow’, Azeri äk‑ ~ Chuvash ak‑ (cf. Hungarian eke ‘plow’ < Bulgarian); Turkmen θeθ ‘voice’, Azeri säs ~ Chuvash sas̬ə;

  3. Turkish eš‑ ‘trot’ ~ Chuvash aš‑ (i.e. the shift e‑ > ä‑ > a‑ took place even in relatively late loans).

The reflex i in single-syllable roots in Chuvash is found instead of Common Turkic e, but also e < ä, including early and late loans:

  1. Turkmen eẟ‑, Azeri äz‑, Tatar iz‑ ~ Chuvash ir‑ ‘crush’; Azeri jet‑ ‘arrive’, Turkmen jet‑, Tuvan čeʰt‑, Tatar ǯ́it‑, Bashkir jët‑ ~ Chuvash s’it‑; Azeri g’äl‑ ‘come’, Turkmen gel‑, Tuvan, Yakut kel‑, Tatar kil‑ ~ Chuvash kil‑;

  2. Azeri sez‑ ‘feel’, Tatar siz‑, Bashkir hiẟ‑ ~ sis (< Tat.); Turkmen em ‘medicine’, Tat., Bashkir, Khakas im ~ Chuvash im (< Tat.); Kyrgyz, Altay, Tuvan er ‘use’, Bashkir ir (Tat. irlə̈) ~ Chuvash ir ‘use, gain’ (< Tat.).

One finds instances where Common Turkic e < ä (next to j) gives ə̈ in Chuvash (as in Bashkir): Turkmen, Turkish, Azeri jer ‘earth’, Kyrgyz ǯer, Tuvan čer, Tat. ǯ́ir, Yakut sir, Bashkir jə̈r ~ s’ə̈r ‘earth’; Turkish jen‑, Turkmen jeŋ‑, Kyrgyz ǯeŋ‑, Tat. ǯ́iŋ‑, Khakas čiŋ‑, Bashkir jə̈ŋ‑ ‘defeat’ ~ s’ə̈n‑.

In a number of Chuvash words the vowel i corresponds to Common Turkic (~ Turkmen , Yakut i͜e): Turkmen bäːš ‘5’, Yakut bi͜es, Turkish, Azeri beš, Tat., Bashkir biš ~ Chuvash pilə̈k; Turkmen, Yakut biːl ‘small of the back’, Azeri, Kyrgyz bel, Tat. bil, Khakas pil ~ Chuvash pilə̈k; Turkmen ir ‘early’, Azeri er ~ Chuvash ir.

According to Doerfer, the last example illustrates the assumption that Chuvash i goes back to Common Turkic e, as in Mari we find the word er ‘morning, early’ which was borrowed from Chuvash. Mari e represents an earlier stage of development (in Hill Mari e > i).

Turkic borrowings in Mari like el ‘country’ (~ ? Chuvash jal), en ‘most’, ertäš ‘go past’, pelčän ‘sow thistle (genus Sonchus)’, teŋə̈z ‘sea’, terə̈s ‘manure, fertilizer’, terke ‘plate’, keremet ‘evil spirit’, seŋäš ‘defeat’, s’erə̈p ‘heavy’ show that the raising of e > i involved words not from Ancient Chuvash but rather representing a general Turkic stock in the Middle Volga that goes back to a single source.

In Iranianized Uzbek dialects we find e (narrow) and æ (a very wide variant of the vowel e).

In Uyghur, which has the so-called i-umlaut, we find wide ä and e (e and ë) secondary in origin, originating from ä and under the influence of a following i.

Close and open variants of e (ẹ and ɛ) are apparently found in the language of the Yenisei runic inscriptions, as e is depicted by a special grapheme. A distinction was made between these two variants also in the texts in the Brahmi script. Worth noting are the Brahmi-Azeri parallels ket‑ ‘leave, go away’ ~ g’et‑ (Turkmen gider ‘he goes out’), keŋ ‘wide’ ~ g’en (Turkmen giːŋ), ber‑ ‘give’ ~ ver‑ (Yakut bi͜er‑), beš ‘5’ ~ beš (Yakut bi͜es), el ‘tribe’ ~ el (Turkmen iːl), which confirm that in a portion of words Azeri e reflects the quantity of the Proto-Turkic vowel.

E (as a variant of ä) before and after j is found in Turkish dialects, Azeri, the Brahmi texts, Yakut, etc. Cf. e.g. Brahmi jel ‘wind’ ~ Azeri jel; Brahmi jer ‘earth’ ~ Azeri jer ~ Yakut sir; Azeri jet‑ ‘arrive’ ~ Yakut sit‑; Azeri jerik ‘cravings of a pregnant woman’ ~ Yakut sir‑ ‘reject’, etc.

In Yakut this (short) narrowed to i (sir, sit‑).

Because combinatory and positional variation of the type ä ~ ɛ ~ e and e ~ i, and thus ä ~ ɛ ~ ~ i is typical of many modern-day Turkic languages and dialects, one can assume it also for earlier stages of their development. Nonetheless one cannot neglect the rich attestations of dialect mixing, reflected in many (if not all) Turkic vowel systems, cf. e.g. the systems of Chuvash, Khakas and West Siberian Tatar dialects. Both of these factors have led (including in the literary standards) to irregular correspondences: Turkmen lit. bäːš ‘5’, dial. beš, Azeri beš (< beːš) ~ Chuvash pilə̈k; Turkmen äːr ‘man’, Azeri är, Tat., Bashkir, Khakas ir ~ Chuvash ar (on the basis of the Chuvash and Azeri forms one can reconstruct *är); Turkmen mäːẟ ‘gland’, Turkish, Kyrgyz, Kumyk bez ~ Chuvash par (Azeri väz) (on the basis of the Chuvash form one can reconstruct *bär); Turkmen gäːt‑ ‘break off, away’, Turkish get‑ (gedik), Kyrgyz ket‑, Tat. kit‑ ~ Chuvash kat‑ (on the basis of the Turkish and Chuvash forms one can reconstruct *kaːt‑). Thus Turkmen äː corresponds to Common Turkic ä and ẹ̈ː.

Also noteworthy are correspondences between Chuvash and Common Turkic: Chuvash alək ‘gate, door’ (< *äːlik), cf. Turkish, Azeri ešik (where ä > e under the influence of a following i?), Tat., Bashkir išə̈k, Khakas dial. izə̈k, Khakas ə̈zə̈k; but Turkmen iːšik (< *eːšik); Chuvash at‑ ‘do’ (< *ät‑) (cf. Turkmen eder ‘he does’, where d < t after an initial long vowel), but Azeri et‑ points to a protoform *eːt‑; Chuvash ilt‑ ‘hear’ (< *elit‑), Turkmen, Azeri ešit‑, Turkish išit‑ (e > i under the influence of i), Tat., Bashkir išə̈t‑, Yakut ihit‑; Chuvash i, Azeri e point to Common Turkic *e.

Thus an ancient qualitative opposition of ä and e is reflected in the Chuvash system, where we have a < ä and i < e. Only traces remain of a quantitative opposition ẹː > i, cf. pilə̈k ‘5’. Significantly more frequently long and short ä are reflected as a: kas’ ‘evening’ (< *käːč < *kẹːč); ak‑ ‘sow’ (< äk‑).

The whole Common Turkic map is tainted with subsequent dialect mixing and positional-combinatorial variation of the vowels ä, e, i.

For showing the ancient quantitative opposition of e sounds, the Turkmen and Yakut data are the most reliable: in Turkmen iː < ẹː, as a rule, corresponds to the Yakut diphthong i͜e, for example: Turkmen giːč ‘late’ ~ Yakut ki͜ehä ‘evening’, giːŋ ‘wide’ ~ ki͜eŋ (but käŋä‑ ‘widen’), iːn ‘width’ ~ i͜en (but ? äŋäj‑ ‘spread out’), iːt ‘lead’ ~ si͜et‑ ‘take by the hand, by a leash or rope’ (but sätiː ~ si͜etiː ‘leading a blind person’).

The regular nature of these correspondences is undermined by such examples as Turkmen biːl ‘small of the back’ ~ Yakut biːl (and not i͜e), bäːš (and not biːš) ‘5’ ~ bi͜es (but bähis ‘fifth’).
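The correspondence and its exceptions can be checked mechanically over the word pairs quoted above. The following is a minimal sketch: the data are the forms cited in the text, while the helper `is_regular` and its simple substring test are assumptions made for illustration and ignore everything except the vowel in question.

```python
# Segments built from code points so that the comparison does not depend
# on how the combining characters were typed.
LONG_I = "i\u02d0"       # long i (i followed by the length mark)
DIPHTHONG = "i\u035ce"   # the Yakut diphthong written with a tie bar

# (Turkmen form, Yakut form, gloss) as cited in the text.
PAIRS = [
    ("g" + LONG_I + "č", "k" + DIPHTHONG + "hä", "late / evening"),
    ("g" + LONG_I + "ŋ", "k" + DIPHTHONG + "ŋ", "wide"),
    (LONG_I + "n", DIPHTHONG + "n", "width"),
    ("b" + LONG_I + "l", "b" + LONG_I + "l", "small of the back"),
]

def is_regular(turkmen, yakut):
    """True if Turkmen long i is matched by the Yakut diphthong."""
    return LONG_I in turkmen and DIPHTHONG in yakut

regular = [gloss for t, y, gloss in PAIRS if is_regular(t, y)]
exceptions = [gloss for t, y, gloss in PAIRS if not is_regular(t, y)]

print("regular:", regular)
print("exceptions:", exceptions)
```

A real comparison would need aligned segments rather than a substring test, but even this crude check separates the regular pairs from the one for ‘small of the back’, where Yakut shows a long vowel instead of the diphthong.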

The alternation of i͜e ~ iː possibly arose within Yakut, cf. also Yakut iːt ‘load a rifle’ ~ ? Turkmen et‑ ‘do’ (< *eːt‑, indicated also by the d in eder ‘he does’); Yakut tiːl ‘calf or colt nursing from an unrelated female’ ~ Kyrgyz tel.

As far as the correspondence bäːš ~ Yakut bi͜es is concerned, it is well known that numerals are often characterized by phonetic peculiarities due to their function in speech, such as emphatic gemination of consonants. It is also well known that in Turkmen dialects one also encounters the phonetic variants beːš, beš. Chuvash pilə̈k and Volga Bulgarian *bielim may also attest to the length and close character of in beːl ~ beš.

Thus the materials we have examined allow us to speak with a high level of probability of the existence in Proto-Turkic of short ä and long ẹː and of the combinatorial variation of ä (ɛ, e) in different phonetic environments at a late stage of the protolanguage and at various points in the history of the modern Turkic languages all the way to the present day.

[1] See e.g. Scherbak 1970, 28–33, which contains a detailed analysis of almost every proposed hypothesis, and Doerfer 1971, 240–247.

[2] If the shift ɛ > e > i had occurred in suffixes, then the variations of some affixes, e.g. ‑di and ‑də̈, could merge, as i in open final syllables tends to be lowered. Indeed, in the Kasimov dialect ä > i even in affixes: bir8in ‘he gave’, bə̈zdi ‘on us’.

[3] Azeri ešik possibly goes back to *eːšik, and not *äšik, though the latter reconstruction is suggested by Chuvash alək ‘door’.

[4] As pointed out by D’jakovskij (1971, 98–99), the second part of the i͜e diphthong is more close than short ɛ.

[5] The shift ä > a occurred after the shift of Common Turkic a > ao > o. Note that in the period of Permian-Bulgarian contacts (8th–9th centuries) Chuvash still retained ä in opposition with e: ban, bam ‘cheek’, but s’i̮l ‘storm’.

[6] Cf. Turkmen äːr ‘man’, Azeri är ~ Chuvash ar, where the long vowel in the Turkmen word points to äː in the protolanguage, see e.g. the reconstructions of Poppe (eːr) and Doerfer (ä̂r or är).

Iranian from quincunx and back again

When I first became acquainted with Persian some years ago, two grammatical features seemed unusual to me from an Indo-European perspective. One was the ezafe construction, which I eventually learned was the product of contact with Caucasian languages. But the other was the formation of the present tense with a prefix me‑ (indicative) or be‑ (subjunctive) followed by the verb stem and personal endings. In his chapter ‘Dialectology and Topics’ in Routledge’s The Iranian Languages pp. 24–25, Gernot Windfuhr offers a fine summary of the changes that produced the modern Persian system of tenses, which not only clarifies the origin of me‑ and be‑, but shows that Persian has returned to the same five-member tense/aspect system that Iranian (like Greek) started off with.

The history of the parameters and axes of the verb systems from Old Iranian to Modern Iranian shows a cycle from a five-member quincunx to varying Middle Iranian systems back to a quincunx. The development is shown here with the example of Persian.

The inherited fundamental and primary verbal parameter of the Early Old Iranian system is triple aspect which intersects with the binary tense parameter of present and past (marked by the augment a‑). It is centered on the perfective aorist:

Early Old Iranian
                    Present    Past
Imperfective        PR         a-PR       “Present system”
Perfective                AOR             “Aorist system”
Resultive-stative   PF         (a-PF)     “Perfect system”

In time, this triple aspect system was reduced to forms of the “present” system, i.e. imperfective present and imperfective past, leaving only a few forms of the aorist and the perfect. With their loss, the highly complex inherited system was reduced to a single imperfective stem, distinguishing present vs. augmented imperfect: PR vs. a-PR.

Concomitantly, however, the vacated aorist and perfect ranges of the system were partially filled by the innovation of a new perfective system based on the adjectival completive participle in -tá plus the present and past copula, with both intransitive and transitive verbs.

In Middle Persian, the resulting four-member system of two imperfective and two perfective forms was extended by replacing the copula with the stative verb ēst‑ ‘to stand’. The outcome was a six-member system with a triple aspect axis and a binary tense axis:

Middle Persian
                    Present      Past
Imperfective        raw‑         (a-raw‑)          present    imperfect (later lost)
Perfective          raft COP     raft būd COP      preterit   past preterit
Resultive-stative   raft ēst‑    raft ēstād COP    perfect    pluperfect

In addition, the adverb hamē lit. ‘forever’ expressed ongoing and progressive action as well as continuing state, while its pendant bē (homophonous with the adverb ‘out, away’) expressed the singularity of an event in present and past and assumed inchoative or future connotation with the present stem.

In Early New Persian, (ha)mē‑ and bē‑ were continued, but the periphrastic resultative ēst‑ forms were replaced by extended forms based on the verbal adjective in -tag (< *-taka). bi and mē could still occur with these verb forms, and neither was obligatory. The core system in terms of frequency was the following:

Early New Persian
                    Present       Past
Imperfective        mē-raw‑       mē-raft‑
Perfective          bi-raw‑       bi-raft‑       inchoat.-fut.   singularity
Unmarked            raw‑          raft‑          gen. present    gen. past
Resultive-stative   raft-a COP    raft-a bud‑

Subsequently the system was restructured by the coalescence of the unmarked forms with the perfective forms by the fifteenth century.

  1. In the present, the perfective bi-form assumed distinct subjunctive function, alternating with the unmarked general present form, now opposed to the indicative present-future mē-form.
  2. In the past, the general unmarked form subsumed the function of the bi-form to express both general and perfective events, now opposed to the imperfective mē-past form. It thereby assumed the central role of an aorist in the resulting five-member system.

The core of the system became thus as follows, and has not changed since:

Pre-Modern, Indicative
                    Present       Past
Imperfective        mē-rav‑       mē-raft‑
Perfective                 raft‑
Resultive-stative   raft-a COP    raft-a bud‑
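The restructuring from the eight-cell Early New Persian core to the five-member indicative system can be sketched as two operations on a small lookup table. This is an illustration only: the forms are those given in the tables above, while the dictionary layout, the helper name `restructure` and the label "aorist" for the central cell are assumptions made for the example.

```python
# Early New Persian core system: (aspect, tense) -> form.
# Plain hyphens are used in the forms for simplicity.
EARLY_NEW_PERSIAN = {
    ("imperfective", "present"): "mē-raw-",
    ("imperfective", "past"): "mē-raft-",
    ("perfective", "present"): "bi-raw-",
    ("perfective", "past"): "bi-raft-",
    ("unmarked", "present"): "raw-",
    ("unmarked", "past"): "raft-",
    ("resultative-stative", "present"): "raft-a COP",
    ("resultative-stative", "past"): "raft-a bud-",
}

def restructure(system):
    """Hypothetical helper: derive the indicative core after the coalescence."""
    indicative = dict(system)  # work on a copy, leave the input untouched
    # Step 1: bi-raw- and the unmarked present take on subjunctive function,
    # so both leave the indicative core.
    indicative.pop(("perfective", "present"))
    indicative.pop(("unmarked", "present"))
    # Step 2: unmarked raft- absorbs bi-raft- and becomes the central aorist.
    indicative.pop(("perfective", "past"))
    indicative[("perfective", "aorist")] = indicative.pop(("unmarked", "past"))
    return indicative

for (aspect, tense), form in restructure(EARLY_NEW_PERSIAN).items():
    print(aspect, tense, form)
```

The result has five members, two imperfective, two resultative-stative and one central perfective form, which is the quincunx shape the passage describes. The later spelling of the present stem with v rather than w is not modelled here.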

The non-indicative sub-system developed in parallel to the indicative core, using the imperfect and past-perfect forms for irreal function, and using the present subjunctive of ‘to be’ for the perfect subjunctive:

Pre-Modern, Non-Indicative
                    Present       Past
Imperfective        bi-rav‑       mē-raft‑
Perfective                 raft‑
Resultive-stative   raft-a bāš    raft-a bud‑