Andreev’s Chuvash textbook and what’s wrong with it

I wrote this review of I. A. Andreev’s Чувашский язык. Практический курс 3rd ed. (Cheboksary: Чувашское книжное издательство, 2011) ISBN 9785767018130 for a book-rating website, but I thought I should also post it here, where it is more likely to be read. While I do love to just rant about this and other poor learning resources, I think it would be helpful if this book’s flaws were known, so that others can avoid being too greatly disappointed. I remember how thrilled I was to discover the book nearly a decade ago, and how quickly my bubble was burst.

An anachronistic paired word in Chuvash

I have written here before about the use of paired words in the Volga–Kama languages to denote an entire class of things, e.g. Chuvash yïvăś-kurăk ‘vegetation’ < yïvăś ‘tree’ + kurăk ‘grass’.

An amusing consequence of this is a jarring anachronism if one of the items in the paired word construction was discovered or invented after the event being described. Consider the following from a Chuvash children’s text on the history of the Olympic games: Хӗҫ-пӑшаллӑ ҫынна Олимпие кӗме юраман ‘[In Ancient Greece] people bearing arms were not allowed into Olympia.’

The paired word here is xĕś-păşal ‘arms, weapons’, made up of xĕś ‘sword’ and păşal ‘rifle’. Obviously there were no rifles in Ancient Greece, but apparently the paired word has become so lexicalized that an author can legitimately use it in any historical context.

Tatar in Arabic script

Though I’ve often heard that there is a rich pre-1917 literature in Tatar that is no longer widely accessible because of the change of script, I probably wouldn’t have learned how to read Tatar in Arabic script had I not come across a couple of very useful guides. One, the more serious, is Гарәп язуы нигезендә татарча әлифба by Dž. G. Zainullin (Татарстан китап нәшрияты, 1989).

The other, a colourful children’s reader entitled الفبا (Alifba), was published by the Tatar diaspora in Berlin in 1918. I’ve scanned this and uploaded it as a PDF (18MB).

Any adaptation of the Arabic script to a Turkic language would have to indicate the frontness of the vowels in a word, but one solution for this that I wasn’t expecting is the use of certain Arabic emphatics to specify back vowel words. However, this doesn’t hold for all cases – as one works through these books, exceptions pile on exceptions. All in all, this system is so bloody complicated that it’s no surprise that Tatar activists pine instead for the Latin script of the 1930s. Still, I am hoping that a knowledge of this script will let me discover some unjustly forgotten literature from the centuries before the October Revolution.

Battle of the etymologists

The verb MariE püč́kam, MariW pəčkäm ‘cut off’ is funny. In the Uralisches etymologisches Wörterbuch (367) the word is derived from a supposed Proto-Uralic *pečkä‑ (päčkä‑) ‘to cut’ on the basis of North Saami bæsˈkedi‑ ‘cut hair or wool off’ and Mordvin E M pečke ‘cut off, chop off’. Bereczki upholds this etymology in his Etymologisches Wörterbuch des Tscheremissischen (Mari) without mentioning any alternatives.

On the other hand, Fedotov in his Этимологический словарь чувашского языка (I 409) etymologizes Chuvash păčkă ‘saw’ on the basis of Turkic – namely the widespread *pïčak/bičäk ‘knife’ – and claims (again without mentioning any alternative) that MariW pəčkäm is a borrowing from Chuvash. Who is right here?

There is only one Uralic etymology in Bereczki where *pe‑ gives MariE pü‑, namely püńč́ö ‘pine’ < *penčä (UEW 727). Otherwise pü- in Mari is normally from *pä‑, e.g. pükš ‘hazelnut’ < *päškз (UEW 726–7). However, if we assume that the Proto-Uralic form was *päčkä, that would conflict with the Mordvinic forms, as Moksha Mordvin usually preserves PU *ä and does not raise it to e. I suppose that is why the UEW placed a question mark before the Mordvinic forms.

Can derivational morphology settle the question? The frequentative of this verb is püč́keẟem, and a quick search of the Mari–English Dictionary shows that ‑eẟem is overwhelmingly found in inherited Uralic vocabulary (or at least pre-Chuvash borrowings), not Turkic loanwords. It is not exclusively so – note joɣeẟem ‘flow’ < Chuvash and tojeẟem ‘hide’ < Tatar – but I would think it probable that MariE püč́kem is inherited.

Ultimately, however, with the resemblance between the Proto-Turkic and Proto-Uralic forms, we might have to take the dreaded notion of “sound symbolism” into account here, something which usually makes me want to drop the question entirely, leaving it for someone else with a greater gift for linguistics.

The birth of Russian/Central Asian studies at Indiana University

A few months ago I read David C. Engerman’s Know Your Enemy: The Rise and Fall of America’s Soviet Experts (Oxford University Press, 2009), hoping it might have some details about the rise of Uralic and Altaic studies at Indiana University in Bloomington. As I wrote here, Engerman’s book was something of a disappointment, but a scholar at IU has drawn my attention to a recent paper by Blake Puckett, “Central Eurasian Studies at IU (the pre-Department Years)”. Here’s the abstract:

The Department of Central Eurasian Studies at Indiana University dates its origins to the Army Specialized Training Program conducted at IU starting in 1943. But the history of the Department from that beginning to its official emergence as a Department in 1966 is less well known. This paper follows the development of Central Eurasian Studies during this first twenty year period, tracing its interactions with both internal and external events. Relations between departments, the influence of individual personalities, governmental funding and world events all factor into the rise of a unique department at Indiana University one that traces its roots primarily neither to a geographic region nor to an academic discipline, but largely to an [imagined] family of languages. Particularly interesting are the connections between Linguistics as a field of study and broader efforts to promote language training and the understanding of various cultures and regions. The history also provides grounds to reflect on current concerns over the influence of DoD funding in the academy and the recurrent tensions within academia between the (practical) preparation of professionals and the advancement of (theoretical) knowledge.

There are many interesting details here of the sources of funding for these studies, how European-born linguists like Thomas Sebeok, Alo Raun and Felix Oinas ended up in the United States, and just a touch of academic scandal and intrigue.

More Chuvash and Mari at OpenStreetMap

I am drawing up a table of placename abbreviations from Ashmarin’s Chuvash dictionary along with their geographical coordinates, e.g. Урас-к. = д. Ураз-касы, Янтиковского района ЧАССР = 55.571, 47.7352. This will allow me to more easily map the distribution of some isoglosses that have interested me. For the most part, it has been very easy to link Ashmarin’s villages with contemporary ones, though there are a small number of villages which either no longer exist, or which were drastically renamed after the October Revolution.
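As a rough illustration of what such a table could look like in machine-readable form, here is a minimal Python sketch. The class name, the field names and the `feature_value` label are hypothetical; only the single Урас-к. row comes from the text above. It keeps each abbreviation with its coordinates and turns an entry into a GeoJSON point, which most mapping tools can display.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    abbreviation: str  # abbreviation as printed in the dictionary
    full_name: str     # expanded village name and district
    lat: float         # latitude in decimal degrees
    lon: float         # longitude in decimal degrees


# Only this one row is taken from the text; further rows would be added by hand.
PLACES = {
    p.abbreviation: p
    for p in [
        Place("Урас-к.", "д. Ураз-касы, Янтиковского района ЧАССР", 55.571, 47.7352),
    ]
}


def to_geojson_feature(place, feature_value):
    """Build a GeoJSON point for one village.

    feature_value is a hypothetical label for whatever isogloss is being
    mapped (for example, which variant of a word the village attests).
    """
    return {
        "type": "Feature",
        # GeoJSON stores coordinates as [longitude, latitude], in that order.
        "geometry": {"type": "Point", "coordinates": [place.lon, place.lat]},
        "properties": {
            "abbreviation": place.abbreviation,
            "name": place.full_name,
            "value": feature_value,
        },
    }


feature = to_geojson_feature(PLACES["Урас-к."], "variant-A")
print(json.dumps(feature, ensure_ascii=False, indent=2))
```

Keying the table on the abbreviation means a dictionary citation can be looked up directly, and villages that no longer exist can simply be left out until they are identified.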

In the course of doing this research, I’ve added the Chuvash names for several hundred villages in Chuvashia and in the Chuvash diaspora to OpenStreetMap (a project I am passionate about, as I described here). One of the strange things I’ve discovered is that Tatars and Bashkirs are more likely to recognize Chuvash than editors from Chuvashia. Very, very few villages in Chuvashia were marked with a Chuvash name on OSM when I began this project, but villages in Tatarstan and Bashkiria that historically had a Chuvash population were often marked with the Chuvash name alongside the Russian, Tatar or Bashkir name.

In two instances for Chuvash villages within Chuvashia, someone had specified the Chuvash name not with the name:cv tag but with the old_name tag, which just breaks my heart.

Many of the Chuvash placenames floating around the internet were drawn from the Chuvash Encyclopedia, an authoritative reference source. However, the Chuvash Encyclopedia was digitized at some early time when Chuvash fonts weren’t thought widely available. Thus, for the Chuvash letters ҫ, ӗ, ӑ, ӳ, the Chuvash Encyclopedia actually uses similar-looking Latin codepoints (from the Latin-1 Supplement and Latin Extended-A blocks of Unicode), not the Cyrillic ones. Because the names were copied and pasted elsewhere, this error persists in the Tatar-language Wikipedia and some OpenStreetMap points. I suppose I’ll have to write a script to automate correcting these on OSM.
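The core of such a correction script could be quite small. The following Python sketch is only an illustration under stated assumptions: the function names are made up, and the mapping assumes that the look-alikes in question are Latin Ç, Ĕ, Ă and Ÿ (upper and lower case), which should be checked against the real data. It does not talk to OSM at all; it only repairs a string.

```python
import unicodedata

# Latin look-alike -> Cyrillic Chuvash letter.
# Assumption: these eight codepoints are the ones that actually occur;
# verify against the data before running this on anything real.
LATIN_TO_CYRILLIC = str.maketrans({
    "\u00C7": "\u04AA",  # Ç -> Ҫ
    "\u00E7": "\u04AB",  # ç -> ҫ
    "\u0114": "\u04D6",  # Ĕ -> Ӗ
    "\u0115": "\u04D7",  # ĕ -> ӗ
    "\u0102": "\u04D0",  # Ă -> Ӑ
    "\u0103": "\u04D1",  # ă -> ӑ
    "\u0178": "\u04F2",  # Ÿ -> Ӳ
    "\u00FF": "\u04F3",  # ÿ -> ӳ
})


def has_cyrillic(text):
    """True if the string contains at least one character from the Cyrillic block."""
    return any("\u0400" <= ch <= "\u04FF" for ch in text)


def fix_chuvash_name(name):
    """Replace Latin look-alikes with Cyrillic letters in an otherwise Cyrillic name."""
    # Leave purely Latin-script names alone (e.g. a French name containing ç).
    if not has_cyrillic(name):
        return name
    # NFC first, so that "base letter + combining mark" becomes one codepoint.
    composed = unicodedata.normalize("NFC", name)
    return composed.translate(LATIN_TO_CYRILLIC)


broken = "\u00C7\u0115\u043D\u0115"  # renders like "Ҫӗнӗ" but uses Latin Ç and ĕ
fixed = fix_chuvash_name(broken)
print([hex(ord(ch)) for ch in fixed])  # ['0x4aa', '0x4d7', '0x43d', '0x4d7']
```

The Cyrillic check is a deliberately cautious design choice: a name consisting only of look-alike letters would be skipped, but that seems a better failure mode than rewriting genuinely Latin-script names.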

For the moment I am not so enthusiastic about adding Mari placenames, because existing Meadow Mari/Eastern Mari placenames are marked up variously with name:chm and name:mhr. I’ve never thought about the existence of three ISO 639-3 codes for Mari (chm for Mari in general and mhr for Meadow Mari/Eastern Mari, plus mrj for Hill Mari) as a problem before, but because OSM generates map tiles based on one and only one ISO 639 code, some Mari-language names will not be visible whichever code one chooses. I suppose this too will have to be automated with a script, however redundant it might seem to add both name:chm and name:mhr to every single point.
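The tag-mirroring step itself is simple to express. Here is a minimal Python sketch that works on a plain dictionary of an object's tags; the function name is hypothetical, and fetching and uploading the objects through the OSM API or an editor is deliberately left out.

```python
from typing import Dict


def mirror_mari_names(tags: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the tags with name:chm and name:mhr both filled in.

    If only one of the two is present, its value is copied to the other.
    If both are present (even with different values) or neither is,
    the tags are returned unchanged so that conflicts can be reviewed by hand.
    """
    result = dict(tags)
    chm = tags.get("name:chm")
    mhr = tags.get("name:mhr")
    if chm and not mhr:
        result["name:mhr"] = chm
    elif mhr and not chm:
        result["name:chm"] = mhr
    return result


# Placeholder values; a real object would carry actual Russian and Mari names.
example = {"place": "village", "name": "RUSSIAN NAME", "name:chm": "MARI NAME"}
print(mirror_mari_names(example))
```

Hill Mari names under name:mrj are not touched here, since copying them to the other two codes would mislabel the language.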

Mari uštə̑š ‘verst’ as a calque on Kipchak

All of the Kipchak languages except Karaim referred to the Russian verst as čaqïrïm, a derivation of the verb čaqïr- ‘to shout’, that is, a verst was seen as the distance a shout would carry.

In Mari, a word for verst is MariE W uštə̑š, for which Tscheremissisches Wörterbuch gives no etymology. One’s eye is then drawn to a verb on the same page, uštal kolten ‘I shout’, which would support deriving the Mari term in the same way as the Kipchak.

The odd thing is that this verb is attested with the meaning ‘shout’ in only one dialect in Tscheremissisches Wörterbuch, that of Krasnoufimsk in the Eastern Mari diaspora. Everywhere else, ueštaš (with an e that reduces and drops out dialectally) is met only in the meaning ‘to yawn’. In Etymologisches Wörterbuch des Tscheremissischen (Mari), Bereczki et al. reject the longstanding Uralic etymology for this word (some Ob-Ugric verbs for ‘yawn’) and instead propose a simple etymology from onomatopoeia: u, representing the sound one makes when shouting or yawning, followed by the denominal verb-forming suffix -Všt-.

The irregular correspondences in uštə̑š between the Mari dialects, along with the fact that not all dialects have the verb from which it is derived, suggest that this noun was calqued by a dialect in relatively close contact with Tatar and then mediated to the other dialects.

Uralic linguistics data on OpenStreetMap

I have used OpenStreetMap a great deal while travelling, and being a GPS anorak, I’ve added a great many previously unrecorded streets, shops and other points of interest. It has become my usual map reference, superior to Google Maps in its libre nature and its surprisingly richer coverage of certain areas.

While reading Paasonen’s Tscheremissischen Texte collected among the Mari of Bashkiria, I was curious where exactly these villages were. Paasonen describes them as centered around the small town of Čurajevo, 25 versts north of Birsk. That is roughly this map view.

Zoom in, and one will find that almost all of the villages that Paasonen lists still exist, at least nominally. There is also a village there named Oktyabr which, with some research, could probably be identified with one of Paasonen’s pre-revolutionary village names. The Mari names for these villages and one of the local rivers were missing, so I added them (OpenStreetMap allows one to specify different-language names for points by appending to the name key a colon followed by the ISO 639 code, so name:chm for Mari).

I’d like to see the Uralic/Altaic/etc. linguistics community add more of these details, not so much as a source of reliable toponymic data for scholarship – one still needs to mine archives – but at least to make it convenient for linguists to pull up the placenames they encounter in the old text collections and dictionaries. Just being able to see these Mari villages on the map makes the texts more enjoyable, and it elucidates some of Paasonen’s comments on inter-village communication.

The tangled etymology of tvorog

Chuvash turăx ‘sour milk’, according to Fedotov’s etymological dictionary, has a secure Turkic etymology, cf. Old Turkic tar ‘buttermilk’. This word in Chuvash or some other early Turkic language is probably the source of Common Slavonic *tvarogŭ (which was even further borrowed into German as Quark) as well as Hungarian túró. It is also claimed by Fedotov to be the origin of MariE torə̑k, MariW tarə̑k.

There is, however, another very similar Chuvash word both in meaning and in phonology that Fedotov does not connect with turăx, namely tăvara (Viryal tora) ‘small cheese’. (This is the source of MariE tuara, MariW tara ‘curd cake, cheese pastry, curd pancake’, Tatar tura ‘homemade cheese’.)

Looking at these two words together, I wondered if Cv. tăvara is the inherited Turkic word, while turăx is a reborrowing from Russian. Loss of a final voiced velar is a normal occurrence in Chuvash, cf. ura ‘leg’ < *aẟaɣ < *aẟak, so tăvara is to be expected.

While any connection goes unmentioned by Fedotov, both Chuvash words are treated together in Róna-Tas & Berta’s West Old Turkic under Hu. túró. In the course of their discussion, they write:

The WOT form toraɣ may be a denominal derivation from tor(ï) with the denominal suffix +rAk, which originally served as an intensifier and later a marker of the comparative degree. The voicing of the final -k is problematic here, because the suffix exists in Chuvash and is +rAx. It is, however, possible that the -x is a second intensifier in Chuvash, +rAk > +rAg > +rA > +rA+x.

Assuming reborrowing from Russian, could the unvoiced final velar in Chuvash turăx reflect the devoicing of final consonants in Russian after the loss of the yers? How old is that feature anyway?

The vocalism of the Turkic roots is problematic. Cv. tăvara assumes original first-syllable *-o/-u, but the word has been compared to Old Turkic tar with its original *a.

The ancient Indo-European comparanda τυρός ‘cheese’ and Avestan tūiri ‘cheeselike milk, whey’ (see Beekes Etymological Dictionary of Greek) make me wonder if this is a steppe Wanderwort.

The surprising origin of Kyrgyzstan’s Altyn Arashan

Back in 2008, while traveling in the Karakol region of Kyrgyzstan, I visited the Altyn Arashan hot springs (as described here), but I didn’t think anything about the place-name other than that it was a golden (altïn) something. Years later, while reading Juha Janhunen’s recently published presentation Mongolian (John Benjamins, 2012), I was surprised to find the meaning of the word, and to learn what a long path it took to Kyrgyzstan.

While speaking of Mongolian’s historical tendency to avoid r- at the beginning of a word by prepending an a-, Janhunen mentions Khalkha Mongolian arshaan ‘hot spring’, where this process has taken place. The Mongolian word in fact originated in Sanskrit rasāyana, a term of Indian traditional medicine.

Kyrgyzstan has historically had some Mongolian-speaking population, especially in this particular area. The Mongols also brought this word for ‘hot spring’ to Buryatia and Tuva. Some Indian loanwords in Mongolian came through Central Asian Iranian and Uyghur mediation, while others came through Tibetan mediation, though unfortunately I don’t have the references at home to determine by which route arshaan came.