Tatar in Arabic script

Though I’ve often heard that there is a rich pre-1917 literature in Tatar that is no longer widely accessible because of the change of script, I probably wouldn’t have learned how to read Tatar in Arabic script had I not come across a couple of very useful guides. One, the more serious, is Гарәп язуы нигезендә татарча әлифба by Dž. G. Zainullin (Татарстан китап нәшрияты, 1989).

The other, a colourful children’s reader entitled الفبا (Alifba), was published by the Tatar diaspora in Berlin in 1918. I’ve scanned this and uploaded it as a PDF (18MB).

Any adaptation of the Arabic script to a Turkic language would have to indicate the frontness of the vowels in a word, but one solution for this that I wasn’t expecting is the use of certain Arabic emphatics to specify back-vowel words. However, this doesn’t hold for all cases – as one works through these books, exceptions pile on exceptions. All in all, this system is so bloody complicated that it’s no surprise that Tatar activists pine instead for the Latin script of the 1930s. Still, I am hoping that a knowledge of this script will let me discover some unjustly forgotten literature from the centuries before the October Revolution.

Mari uštə̑š ‘verst’ as a calque on Kipchak

All of the Kipchak languages except Karaim referred to the Russian verst as čaqïrïm, a derivation of the verb čaqïr- ‘to shout’, that is, a verst was seen as the distance a shout would carry.

In Mari, a word for verst is MariE W uštə̑š, for which Tscheremissisches Wörterbuch gives no etymology. One’s eye is then drawn to a verb on the same page, uštal kolten ‘I shout’, which would support deriving the Mari term in the same way as the Kipchak.

The odd thing is that this verb is attested with the meaning ‘shout’ in only one dialect in Tscheremissisches Wörterbuch, that of Krasnoufimsk in the Eastern Mari diaspora. Everywhere else, ueštaš (with an e that reduces and drops out dialectally) is met only in the meaning ‘to yawn’. In Etymologisches Wörterbuch des Tscheremissischen (Mari), Bereczki et al. reject the longstanding Uralic etymology for this word (some Ob-Ugric verbs for ‘yawn’) and instead propose a simple etymology from onomatopoeia: u, representing the sound one makes when shouting or yawning, followed by the denominal verb-forming suffix -Všt-.

The irregular correspondences in uštə̑š between the Mari dialects, along with the fact that not all dialects have the verb from which it is derived, suggest that this noun was calqued by a dialect in relatively close contact with Tatar and then mediated to the other dialects.

Mari šaške ‘mink’ borrowed even into South Kipchak

MariE šaške, MariW šäškə ‘mink’ and Finnish dial. häähkä id. have some kind of old relationship with Lithuanian šẽškas ‘polecat’. Whether it’s a Baltic > Uralic loan or vice versa doesn’t matter: the match is very old, and therefore we must assume that Chuvash šaškă ‘mink’ is a loan from Mari.

Moving on to the Volga Kipchak languages, we find an irregular initial correspondence in Tatar čäške ‘mink’, but one could suppose that we are dealing with the same word. Äxmatjanov’s Tatar etymological dictionary, at any rate, accepts a Mari etymology. And then the word is also found in Bashkir, as šäške.

Now, the most interesting aspect of all this is that the word is found in Kazakh. Though standard Kazakh has suw küzeni for ‘mink’, Radloff recorded a form čäške ‘some kind of aquatic animal’, and this must have been borrowed from Bashkir. When I first began studying the Volga–Kama region, I would compare features found there to Kazakh, and if they were present in the latter, assume that they were either from Proto-Kipchak or at least from outside the Volga–Kama area. However, at least some words have been borrowed from North Kipchak to South Kipchak, and ‘mink’ is another one.

With so much language learning, how does one ever publish anything?

A couple of years ago I quoted a statement from an introductory Altaic studies textbook that the continual language learning in this field means a lifelong commitment. It’s one thing to continually learn languages over one’s scholarly career to broaden one’s horizons, but lately it seems that so much language learning is imposed that I cannot ever actually finish a journal submission.

This is how things have gone so far:

  1. When I began my studies of Finno-Ugrian linguistics, my initial concern was just Mari, which struck me as the Uralic language with the most readily assimilable grammar, and Russian so that I could use the only decent textbook of Mari available at the time. (Of course I was learning Finnish too as a foreigner in Helsinki, and Saami, Erzya and Nenets as other coursework.)
  2. After a few months it became clear that one can hardly do anything with Mari without having real proficiency in Chuvash and Tatar.
  3. A few months after that, I saw that understanding the Turkic languages of the Volga–Kama area requires some knowledge of what they were like before they arrived in that part of the world. So, numerous references on the Turkic family in general were added to my reading list, and I had to learn a couple of other Turkic languages (I chose Turkish and Kazakh) to act as a sort of control group for Volga Kipchak.
  4. As the years went by, it became clear that I had not sufficiently considered the relationship of the Permian languages with Mari, so courses of Udmurt and Komi became obligatory before I could even dare to comment on the prehistory of Mari. The Ob-Ugrian languages are another area I should strengthen.

At the moment I’ve got a Mari-related research project that I would very much like to bring to publication, but I have the feeling that I will not have done my scholarly due diligence unless I get two more languages under my belt, namely Moksha Mordvin (Erzya Mordvin is not enough) and Ossetian. I’m very worried that the latter is going to lead to even more things to follow up on in Iranian. This could bog me down for years.

The low-hanging fruit in Uralic studies has long been taken. I think it virtually impossible now to publish a paper on Mari considering only that language and no others around it. To someone today, it seems incredible that in 1950 Thomas Sebeok was able to score another entry on his list of publications simply with a two-page article on how Mari family names or patronymics typically precede a person’s own name.

Do scholars who frequently publish simply say at some point, “OK, I’ve got enough data now and I am collecting no more”? Are they not scared that during the peer review process some possibly more knowledgeable scholar is going to condemn them for overlooking data from another language spoken far away but nonetheless essential to the subject?

Mari koma ‘otter, beaver’ and its Chuvash and Tatar analogues

The first prayer in Paasonen’s collection of Eastern Mari texts has several paragraphs of supplications for a fruitful hunt that really test one’s familiarity with Mari animal names: swans, martens, lynxes, bears, elk, etc. etc. One animal hitherto unfamiliar to me is koma, which Paasonen’s dictionary glosses as what should be two separate animals:

выдра / Otter; aus seinem Leder verfertigten die Tscheremissen vormals Mützen; боберъ J 88, выдра Tr. koma-jol Fuss(fell) des Otters (1038). koma-upš Otterfellmütze (1283). [Tschuw. Zol. xoma выдра, tat. kama.]

Tscheremissisches Wörterbuch also says [< Tschuw. / Tat.], though the presence of the initial velar should establish that koma was borrowed from Tatar and not Chuvash; Hill Mari ama ‘beaver’ is the Chuvash loan.

In Chuvash, Ashmarin lists only xoma, so the word is limited to the Viryal zone and it is probably a borrowing from Tatar. Fedotov’s etymological dictionary also lists a form xăma, but it is not clear where he got that from, because he cites only Ashmarin.

In Tatar, kama is part of the literary language and my Tatar-Russian dictionary defines it as выдра. Äkhmat’janov’s etymological dictionary draws a not very convincing comparison with Old Turkic kam ‘shaman’, but kama with the same meaning as the Tatar is found in Siberian Tatar, and Shor has kamnaɣïs.

So, the word appears to have been brought to the Volga–Kama area from elsewhere. However, that strange distribution within Turkic makes one want to look at other Siberian language families (though a cursory glance at a Ket dictionary shows nothing similar-looking under выдра and бобр).

As the definitions of this term in the various languages are very much bound up with the notion ‘fur-bearing animal’, one might link Ashmarin’s xumă ‘sable’ (from an earlier *kam-ïK?) to the Tatar word.

Tatarisms in Paasonen’s Eastern Mari texts

The Eastern Mari dialect represented in the texts gathered by Heikki Paasonen in April–July 1900 is, for the most part, not especially different from the Mari literary language. However, there are some interesting signs of contact with Tatar. Thus one naturally finds some loanwords like tarlau ‘burnt clearing’ and okaš ‘to read’.

In some places we find lack of accusative marking on the object when it is indefinite, e.g. in kajenə̑t urem dene šap ojlen ojlen koktə̑nat siɣarə̑m tul pə̑zə̑kten purlə̑nə̑t ‘they went along the street speaking loudly and with lit cigarettes in their mouths’. As tul ‘fire’ is the object here of pə̑zə̑ktaš ‘to set’, one would rather expect the accusative form tulə̑m. In fact, in Paasonen’s Eastern Mari dictionary, he cites from somewhere else the phrase pueš tulə̑m pə̑zə̑ktaš ‘to start a wood fire’ where the expected accusative appears. Similarly in pojan kupesβlak par kičken, trojka kičken koštə̑t ‘rich merchants ride with two horses hitched, with three horses hitched’ the two objects lack an accusative.

What I suppose is another copy of a Tatar model is the phrase šukume šagalme lij möŋgö ‘some time later [lit. a lot or a little time passing]’. The Mari question particle mo is a Tatar borrowing, but in this case it is used in a sense (‘or’) not often encountered in Mari non-interrogative sentences, and it doesn’t display labial harmony.

Chuvash reeds and stems

Eastern Mari has the word omə̑ž ‘reed’, which Paasonen’s dictionary notes is a borrowing from Chuvash. In the Skvortsovs’ Chuvash-Russian dictionary I found the source: Cv. xămăš ‘bulrush’. Fedotov’s etymological dictionary compares this to a wide variety of Turkic cognates such as Turkish kamış and Yakut xomus.

But a few lines above it, one finds an entry for a remarkably similar word: xămăl ‘stubble (of cereals)’. Fedotov compares this to Tatar and Bashkir qamïl ‘bulrush’.

These must be the same word, both going back to Proto-Turkic *kamïš ‘grass stalk (or the like)’ and showing the ‑š ~ ‑l distinction that divides the family in two. Apart from Chuvash, the ‑l variant has no cognates outside of Volga Kipchak, and thus can be regarded as a Volga Bulgarian loan into Tatar and Bashkir. The ‑š variant, on the other hand, must be a loan from Volga Kipchak into Chuvash.

An amusing bit of trivia: two distantly related languages trading cognates with different meanings.

On Chuvash śüś ‘hair’

In their 1983 paper on early Bulgarian loanwords in the Permian languages, Rédei & Róna-Tas derive Chuvash śüś ‘hair’ from Proto-Turkic by proposing an intermediate form that is not actually attested anywhere: PT yulči (Cf. Kashgari yulïč ‘goat’s hair’) > Chuv. *śevśi > śüś (p. 77).

Why should this not be considered a simple Tatar loan? A Proto-Turkic word for ‘hair’ as inherited by the Kipchak languages was *sač. Tatar now has čäč after the initial consonant was assimilated to the following č and then, because Tatar č is articulated with great palatalization, the vowel was fronted.

Early Tatar loans in Chuvash show Cv. o/u for Tatar a, and Cv. ś for Tatar č. However, Chuvash ś is just as palatalized as Tatar č (indeed, it can be argued that phonetically they are the same sound). Thus, Chuvash could have borrowed the word from Tatar in an intermediate form *čač, raised the vowel, and then fronted it on its own or within an areal context.

Deceptive Volga–Kama parallels

The Perso-Arabic word خبر khabar ‘tidings, news’ was firmly entrenched centuries ago in the Volga–Kama region, borrowed by Volga Bulgarian and reflected in Chuvash as xıpar, and borrowed into Tatar as xäbär. Mari got it from Chuvash as uβer, where the loss of initial x‑ is a regular phenomenon. When I began studying Udmurt, I learned the word ivor ‘news’ and, like many people probably, I simply assumed that it was part of this family of words. In fact, according to several publications by T. E. Uotila, the Udmurt word goes back to Proto-Permian (cf. Komi divor) and has no relationship to the Perso-Arabic loan. Such a loss of the initial velar would be unusual in Udmurt, after all.

Mari tukə̑m ‘family tree, stock, descent’ is another case where it’s important to always check the etymology instead of assuming it’s something one already knows. I had always thought this was from the Turkic root *doɣ- ‘to be born’, which is present in e.g. Tatar tuɣan ‘native’. Again, the phonetic mismatch should have clued me in: Turkic ‑ɣ‑ would not be reflected in Mari as ‑k‑ (and besides, Tatar had lost the voiced velar fricative from that root; the ‑ɣ in Tat. tuɣan belongs to the ‑GAn suffix). The Mari word actually comes from Tatar tokïm < Persian تخم tokhm ‘seed, egg, origin’, which bears only a coincidental resemblance to the Turkic root.

Shiro Hattori: the Japanese linguist with a Mishar wife

Although he mainly worked in other areas of historical linguistics, such as that of his native Japanese, Shiro Hattori (1908–1995) published several papers on Tatar, especially its vowel harmony. His articles “Phonological Interpretation of Tatar High Vowels” (published in Ural-Altaische Jahrbücher in 1975) and “Vowel Harmony or Consonant Harmony?” (in Documenta Barbarorum: Festschrift für Walther Heissig) allude to how he came to be interested in the language.

In the mid-1930s, Hattori stayed for two years in Manchuria. This was the time of the Japanese puppet state of Manchukuo, but in the two cited articles Hattori doesn’t explain what exactly he was doing there. There he first learned Kazan Tatar from Hossein Gäbdush, a poet and singer who had descended into utter poverty in Harbin. His second teacher was Hatib Halidi, head teacher of the Tatar school in Harbin – clearly the city was home to a sizable group of Tatars.

This stay introduced him not only to Tatar speakers in general but also to his future wife. Magira Mikhammedshakhovna Ageeva was a Mishar Tatar born in 1912 in the Krasnoslobodsky district of what is now Mordovia. In 1916 her family moved to Hailar in northern Manchuria. Thanks to living with her, Hattori was able to compare in unusually great detail the phonology of Kazan Tatar and Mishar Tatar.

Hattori was a fairly public figure in postwar Japan and readers with Japanese proficiency may be able to search for more information on this couple. I’d certainly like to read more about what must have been quite an experience, a Mishar Tatar living in Japan through World War II and an unusual mixed marriage. What language did they use at home? What did Japanese society think of her?