MariE tolašem ~ talašem ‘try hard, strive’ < Tatar talaš

One of the frustrations of working with Tscheremissisches Wörterbuch is that some Mari items are labeled Tschuw. or Tat., but the exact source is not specified and sometimes one has to dig a little to determine the original Chuvash or Tatar word.

A case in point is MariE tolašem ~ talašem ‘sich bestreben, eilen, irgendwie zu tun versuchen’. This is marked as a Tatar loanword in TschWb, and the word is clearly of Turkic origin since it has a causative derivational form MariE tolaštarem ~ talaštarem. I turned to my dictionary of literary Kazan Tatar, the Татарско-русский словарь (Казань: Мәгариф, 2007), and found a phonetic match: талашу. However, the meanings ‘ссориться, скандалить, переругиваться’ of this verb and its derivational forms were not close enough to the Mari verb to satisfy.

If my Tatar dictionary doesn’t help for a Turkic loanword in Mari, the next stop is a Chuvash one. Ashmarin’s Thesaurus Linguae Tschuvaschorum contains a verb corresponding to the Tatar one and almost certainly a borrowing of it, namely tulaş, and the first meanings mentioned are the same as for the Tatar: ‘беситься, злиться, грызться’. However, buried deeper down in the entry is the meaning we’re looking for: ‘возиться, стараться’. This is an understandable extension of the Turkic root tal-, the basic meaning of which is ‘to force; to take by force’.

Thus Mari and Chuvash preserve a meaning of the Tatar word that seems to have died out among Kazan Tatars. Interestingly, Russian too borrowed this Tatar word dialectally and uses it in a similar sense, or at least it did in the 19th century: a verb талашиться ‘суетиться, толочься, метаться’ is attested from the Tambov region in the Толковый словарь Даля, compiled by Vladimir Ivanovich Dal’ and published in 1863–1866.

Incidentally, had I carefully examined the Mari–English Dictionary instead of basing myself solely on Tscheremissisches Wörterbuch, I could have figured out this etymology more quickly, because one of the meanings of MariE lit. толашаш is ‘to quarrel, to squabble, to bicker’, and that meaning is not found in TschWb. However, the Mari–English Dictionary, being a general literary-language reference and not a dialect dictionary, does not list the origin of the item, and I wonder if the word in that meaning was found only in Eastern Mari communities under heavy Tatar influence before the rise of the literary language, and only the meaning ‘try hard, strive’ is pan-Mari.

Andreev’s Chuvash textbook and what’s wrong with it

I wrote this review of I. A. Andreev’s Чувашский язык. Практический курс 3rd ed. (Cheboksary: Чувашское книжное издательство, 2011) ISBN 9785767018130 for a book-rating website, but I thought I should also post it here where it is probably more likely to be read. While I do love to just rant about this and other poor learning resources, I think it would be helpful if this book’s flaws were known, so that one can avoid being too greatly disappointed. I remember how thrilled I was to discover the book nearly a decade ago, and how quickly my bubble was burst.

An anachronistic paired word in Chuvash

I have written here before about the use of paired words in the Volga–Kama languages to denote an entire class of things, e.g. Chuvash yïvăś-kurăk ‘vegetation’ < yïvăś ‘tree’ + kurăk ‘grass’.

An amusing consequence of this is a jarring anachronism if one of the items in the paired word construction was discovered or invented after the event being described. Consider the following from a Chuvash children’s text on the history of the Olympic games: Хӗҫ-пӑшаллӑ ҫынна Олимпие кӗме юраман ‘[In Ancient Greece] people bearing arms were not allowed into Olympia.’

The paired word here is xĕś-păşal ‘arms, weapons’, made up of xĕś ‘sword’ and păşal ‘rifle’. Obviously there were no rifles in Ancient Greece, but apparently the paired word has become so lexicalized that an author can legitimately use it in any historical context.

Battle of the etymologists

The verb MariE püč́kam, MariW pəčkäm ‘cut off’ is funny. In the Uralisches etymologisches Wörterbuch (367) the word is derived from a supposed Proto-Uralic *pečkä‑ (päčkä‑) ‘to cut’ on the basis of North Saami bæsˈkedi‑ ‘cut hair or wool off’ and Mordvin E M pečke ‘cut off, chop off’. Bereczki upholds this etymology in his Etymologisches Wörterbuch des Tscheremissischen (Mari) without mentioning any alternatives.

On the other hand, Fedotov in his Этимологический словарь чувашского языка (I 409) etymologizes Chuvash păčkă ‘saw’ on the basis of Turkic – namely the widespread *pïčak/bičäk ‘knife’ – and claims (again without mentioning any alternative) that MariW pəčkäm is a borrowing from Chuvash. Who is right here?

There is only one Uralic etymology in Bereczki where *pe‑ gives MariE pü‑, namely püńč́ö ‘pine’ < *penčä (UEW 727). Otherwise pü- in Mari is normally from *pä‑, e.g. pükš ‘hazelnut’ < *päškз (UEW 726–7). However, if we assume that the Proto-Uralic form was *päčkä, that would conflict with the Mordvinic forms, as Moksha Mordvin usually preserves PU *ä and does not raise it to e. I suppose that is why the UEW placed a question mark before the Mordvinic forms.

Can derivational morphology settle the question? The frequentative of this verb is püč́keẟem, and a quick search of the Mari–English Dictionary shows that ‑eẟem is overwhelmingly found in inherited Uralic vocabulary (or at least pre-Chuvash borrowings), not Turkic loanwords. It is not exclusively so – note joɣeẟem ‘flow’ < Chuvash and tojeẟem ‘hide’ < Tatar – but I would think it probable that MariE püč́kam is inherited.

Ultimately, however, with the resemblance between the Proto-Turkic and Proto-Uralic forms, we might have to take the dreaded notion of “sound symbolism” into account here, something which usually makes me want to drop the question entirely, leaving it for someone else with a greater gift for linguistics.

More Chuvash and Mari at OpenStreetMap

I am drawing up a table of placename abbreviations from Ashmarin’s Chuvash dictionary along with their geographical coordinates, e.g. Урас-к. = д. Ураз-касы, Янтиковского района ЧАССР = 55.571, 47.7352. This will allow me to more easily map the distribution of some isoglosses that have interested me. For the most part, it has been very easy to link Ashmarin’s villages with contemporary ones, though there are a small number of villages which either no longer exist, or which were drastically renamed after the October Revolution.
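A table of this kind can be kept as plain text and parsed mechanically. The following is only a minimal sketch in Python, assuming each row has exactly the `abbreviation = description = latitude, longitude` shape of the example above; the names `Place` and `parse_entry` are hypothetical and invented for illustration.

```python
from typing import NamedTuple


class Place(NamedTuple):
    abbreviation: str   # Ashmarin's abbreviation, e.g. "Урас-к."
    description: str    # full village name and district
    lat: float          # assumed to be the first coordinate
    lon: float          # assumed to be the second coordinate


def parse_entry(line: str) -> Place:
    """Split one 'abbr = description = lat, lon' row into its fields."""
    abbreviation, description, coords = (part.strip() for part in line.split(" = "))
    lat, lon = (float(value) for value in coords.split(","))
    return Place(abbreviation, description, lat, lon)


entry = parse_entry("Урас-к. = д. Ураз-касы, Янтиковского района ЧАССР = 55.571, 47.7352")
gazetteer = {entry.abbreviation: entry}  # look up coordinates by abbreviation
```

Keying the table on the abbreviation means that any citation in the dictionary can be turned into a point on the map with a single lookup.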

In the course of doing this research, I’ve added the Chuvash names for several hundred villages in Chuvashia and in the Chuvash diaspora to OpenStreetMap (a project I am passionate about, as I described here). One of the strange things I’ve discovered is that Tatars and Bashkirs are more likely to recognize Chuvash than editors from Chuvashia. Very, very few villages in Chuvashia were marked with a Chuvash name on OSM when I began this project, but villages in Tatarstan and Bashkiria that historically had a Chuvash population were often marked with the Chuvash name alongside the Russian, Tatar or Bashkir name.

In two instances for Chuvash villages within Chuvashia, someone had specified the Chuvash name not with the name:cv tag but with the old_name tag, which just breaks my heart.

Many of the Chuvash placenames floating around the internet were drawn from the Chuvash Encyclopedia, an authoritative reference source. However, the Chuvash Encyclopedia was digitized at some early time when Chuvash fonts weren’t thought widely available. Thus, for the Chuvash letters ҫ, ӗ, ӑ, ӳ, the Chuvash Encyclopedia actually uses similar-looking Latin-script codepoints of Unicode, not those of the Cyrillic block. Because the names were copied and pasted elsewhere, this error persists in the Tatar-language Wikipedia and some OpenStreetMap points. I suppose I’ll have to write a script to automate correcting these on OSM.
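The core of such a correction script could be a simple character translation. This is a hedged sketch in Python rather than a finished tool: the exact set of Latin look-alikes is an assumption (ç, ĕ, ă, ÿ and their capitals), and the function name `fix_chuvash_letters` is hypothetical. It only transforms strings; fetching and uploading OSM data is left out entirely.

```python
import unicodedata

# Assumed Latin look-alikes mapped to the proper Cyrillic Chuvash letters.
LATIN_TO_CYRILLIC = {
    "\u00e7": "\u04ab",  # ç -> ҫ
    "\u00c7": "\u04aa",  # Ç -> Ҫ
    "\u0115": "\u04d7",  # ĕ -> ӗ
    "\u0114": "\u04d6",  # Ĕ -> Ӗ
    "\u0103": "\u04d1",  # ă -> ӑ
    "\u0102": "\u04d0",  # Ă -> Ӑ
    "\u00ff": "\u04f3",  # ÿ -> ӳ
    "\u0178": "\u04f2",  # Ÿ -> Ӳ
}
_TABLE = str.maketrans(LATIN_TO_CYRILLIC)


def fix_chuvash_letters(name: str) -> str:
    """Replace Latin look-alike letters with their Cyrillic counterparts."""
    # Compose base letter + combining mark first, so that sequences such as
    # 'e' followed by a combining breve are caught as well.
    composed = unicodedata.normalize("NFC", name)
    return composed.translate(_TABLE)


fixed = fix_chuvash_letters("\u00c7\u0115\u0440\u043f\u00ff")  # mixed-script spelling of Ҫӗрпӳ
```

Since the mapping would also rewrite legitimate Latin text (a French ç, for instance), it should only be applied to values known to be Chuvash, such as the contents of a name:cv tag.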

For the moment I am not so enthusiastic about adding Mari placenames, because existing Meadow Mari/Eastern Mari placenames are marked up variously with name:chm and name:mhr. I’ve never thought about the existence of three ISO 639-3 codes for Mari (Mari in general and Meadow Mari/Eastern Mari respectively, plus mrj for Hill Mari) as a problem before, but because OSM generates map tiles based on one and only one ISO 639 code, some Mari-language names will not be visible whichever code one chooses. I suppose this too will have to be automated with a script, however redundant it might seem to add both name:chm and name:mhr to every single point.
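The mirroring itself is trivial once a point’s tags are in hand. Below is a minimal sketch in Python that treats the tags as a plain dictionary; the function name `mirror_mari_names` is hypothetical, and reading or writing actual OSM data is deliberately not shown.

```python
MARI_KEYS = ("name:chm", "name:mhr")


def mirror_mari_names(tags: dict[str, str]) -> dict[str, str]:
    """Return a copy of tags with the Mari name present under both keys."""
    result = dict(tags)
    values = {tags[key] for key in MARI_KEYS if key in tags}
    # Copy only when there is exactly one distinct Mari name; if the two
    # tags already disagree, leave them for a human to sort out.
    if len(values) == 1:
        value = values.pop()
        for key in MARI_KEYS:
            result.setdefault(key, value)
    return result


tags = mirror_mari_names({"name": "Йошкар-Ола", "name:mhr": "Йошкар-Ола"})
```

Hill Mari names under name:mrj are left alone here, since they are a different language variety and should not be copied into the Meadow Mari tags.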

The tangled etymology of tvorog

Chuvash turăx ‘sour milk’, according to Fedotov’s etymological dictionary, has a secure Turkic etymology, cf. Old Turkic tar ‘buttermilk’. This word in Chuvash or some other early Turkic language is probably the source of Common Slavonic *tvarogŭ (which was even further borrowed into German as Quark) as well as Hungarian túró. It is also claimed by Fedotov to be the origin of MariE torə̑k, MariW tarə̑k.

There is, however, another very similar Chuvash word both in meaning and in phonology that Fedotov does not connect with turăx, namely tăvara (Viryal tora) ‘small cheese’. (This is the source of MariE tuara, MariW tara ‘curd cake, cheese pastry, curd pancake’, Tatar tura ‘homemade cheese’.)

Looking at these two words together, I wondered if Cv. tăvara is the inherited Turkic word, while turăx is a reborrowing from Russian. Loss of a final voiced velar is a normal occurrence in Chuvash, cf. ura ‘leg’ < *aẟaɣ < *aẟak, so tăvara is to be expected.

While any connection goes unmentioned by Fedotov, both Chuvash words are treated together in Róna-Tas & Berta’s West Old Turkic under Hu. túró. In the course of their discussion, they write:

The WOT form toraɣ may be a denominal derivation from tor(ï) with the denominal suffix +rAk, which originally served as an intensifier and later a marker of the comparative degree. The voicing of the final -k is problematic here, because the suffix exists in Chuvash and is +rAx. It is, however, possible that the -x is a second intensifier in Chuvash, +rAk > +rAg > +rA > +rA+x.

Assuming reborrowing from Russian, could the unvoiced final velar in Chuvash turăx reflect the devoicing of final consonants in Russian after the loss of the yers? How old is that feature anyway?

The vocalism of the Turkic roots is problematic. Cv. tăvara assumes original first-syllable *-o/-u, but the word has been compared to Old Turkic tar with its original *a.

The ancient Indo-European comparanda τυρός ‘cheese’ and Avestan tūiri ‘cheeselike milk, whey’ (see Beekes Etymological Dictionary of Greek) make me wonder if this is a steppe Wanderwort.

Mari šaške ‘mink’ borrowed even into South Kipchak

MariE šaške, MariW šäškə ‘mink’ and Finnish dial. häähkä id. have some kind of old relationship with Lithuanian šẽškas ‘polecat’. Whether it’s a Baltic > Uralic loan or vice versa doesn’t matter: the match is very old, and therefore we must assume that Chuvash šaškă ‘mink’ is a loan from Mari.

Moving on to the Volga Kipchak languages, we find an irregular initial correspondence in Tatar čäške ‘mink’, but one could suppose that we are dealing with the same word. Äxmatjanov’s Tatar etymological dictionary, at any rate, accepts a Mari etymology. And then the word is also found in Bashkir, as šäške.

Now, the most interesting aspect of all this is that the word is found in Kazakh. Though standard Kazakh has suw küzeni for ‘mink’, Radloff recorded a form čäške ‘some kind of aquatic animal’, and this must have been borrowed from Bashkir. When I first began studying the Volga–Kama region, I would compare features found there to Kazakh, and if they were present in the latter, assume that they were either from Proto-Kipchak or at least from outside the Volga–Kama area. However, at least some words have been borrowed from North Kipchak to South Kipchak, and ‘mink’ is another one.

With so much language learning, how does one ever publish anything?

A couple of years ago I quoted a statement from an introductory Altaic studies textbook that the continual language learning in this field means a lifelong commitment. It’s one thing to continually learn languages over one’s scholarly career to broaden one’s horizons, but lately it seems that so much language learning is imposed that I cannot ever actually finish a journal submission.

This is how things have gone so far:

  1. When I began my studies of Finno-Ugrian linguistics, my initial concern was just Mari, which struck me as the Uralic language with the most readily assimilable grammar, and Russian so that I could use the only decent textbook of Mari available at the time. (Of course I was learning Finnish too as a foreigner in Helsinki, and Saami, Erzya and Nenets as other coursework.)
  2. After a few months it became clear that one can hardly do anything with Mari without having real proficiency in Chuvash and Tatar.
  3. A few months after that, I saw that understanding the Turkic languages of the Volga–Kama area requires some knowledge of what they were like before they arrived in that part of the world. So, numerous references on the Turkic family in general were added to my reading list, and I had to learn a couple of other Turkic languages (I chose Turkish and Kazakh) to act as a sort of control group for Volga Kipchak.
  4. As the years went by, it became clear that I had not considered enough the relationship of the Permian languages with Mari, so courses of Udmurt and Komi became obligatory before I could even dare to comment on the prehistory of Mari. The Ob-Ugrian languages are another area I should strengthen.

At the moment I’ve got a Mari-related research project that I would very much like to bring to publication, but I have the feeling that I will not have done my scholarly due diligence unless I get two more languages under my belt, namely Moksha Mordvin (Erzya Mordvin is not enough) and Ossetian. I’m very worried that the latter is going to lead to even more things to follow up on in Iranian. This could bog me down for years.

The low-hanging fruit in Uralic studies has long been taken. I think it virtually impossible now to publish a paper on Mari considering only that language and no others around it. To someone today, it seems incredible that in 1950 Thomas Sebeok was able to score another entry on his list of publications simply with a two-page article on how Mari family names or patronymics typically precede a person’s own name.

Do scholars who frequently publish simply say at some point OK, I’ve got enough data now and I am collecting no more? Are they not scared that during the peer review process some possibly more knowledgeable scholar is going to condemn them for overlooking data from another language spoken far away but nonetheless essential to the subject?

Gemination in Viryal Chuvash as evidence of a substrate

The conference collection Volgan alueen kielikontaktit (Turku, 2002) has a paper by A. V. Emel’janova titled “Геминация — черта субстратной лексики языка-предшественника” that I’ve read a number of times over the years and have struggled with.

Emel’janova’s thesis is essentially that occurrences of gemination in the Malokaračin variety of the Viryal dialect provide evidence that the Chuvash there absorbed a Mari population, e.g. ačča ‘child’ (literary language ača), vălčča ‘roe’ (lit. vălča), śamkka ‘forehead’ (lit. śamka), etc. That is, under the influence of the presence of unvoiced medial consonants in the local Mari dialect, Chuvash medials became unvoiced, and Chuvash has a rule that unvoiced stops be pronounced as geminates.

What is especially interesting about Emel’janova’s data is several Chuvash words which don’t show gemination: kalča ‘shoots, young growth’, molča ‘bath’, xănča ‘when’, on čońa ‘at that time’, kĕnčele ‘bundle of fabric prepared for spinning’, śïnsančan ‘from people’, măntra ‘ball’.

Of these words without gemination, molča < баня and kĕnčele < кудель are early loanwords from Russian, and we know that there was a voiced d in the latter. Agyagási has identified măntra as a borrowing from the Late Gorodets population, so this may help to establish the presence of voiced stops in that language. Conversely, the word kukkăl’ ‘pie’ (lit. kukăl’), also in Agyagási’s Late Gorodets wordlist, shows gemination, so can we reconstruct a system of both voiced and unvoiced consonants for the Late Gorodets language?

But what confuses me about all this is the chronology. Bereczki’s view (Grundzüge der tscheremissischen Sprachgeschichte) was that the Volga Bulgars did not settle in the north of Chuvashia and absorb its Mari population until just before the Mongol invasion in the 13th century. Emel’janova on the other hand claims that the Bulgars took the north of Chuvashia and absorbed some of its Mari population already in the 9th and 10th centuries.

Furthermore, the word куделе was borrowed into Volga Bulgarian early, before the loss of nasal vowels in Old Russian (which probably occurred by the 11th century on the basis of the Ostromir Gospels). If the substrate population for this Chuvash dialect pronounced medial -č- as unvoiced in the Chuvash vocabulary they took on, and thus caused gemination, they should have done the same with kĕnčele, that is, unless the word was borrowed after Chuvash and the substrate came into contact. Therefore, if the substrate were Mari, then intensive Chuvash and Mari contact would have to be traced back to at least two centuries earlier. Do we want to do that?

Mari koma ‘otter, beaver’ and its Chuvash and Tatar analogues

The first prayer in Paasonen’s collection of Eastern Mari texts has several paragraphs of supplications for a fruitful hunt that really test one’s familiarity with Mari animal names: swans, martens, lynxes, bears, elk, etc. One animal hitherto unfamiliar to me is koma, which Paasonen’s dictionary glosses as what should be two separate animals:

выдра / Otter; aus seinem Leder verfertigten die Tscheremissen vormals Mützen; боберъ J 88, выдра Tr. koma-jol Fuss(fell) des Otters (1038). koma-upš Otterfellmütze (1283). [Tschuw. Zol. xoma выдра, tat. kama.]

Tscheremissisches Wörterbuch also says [< Tschuw. / Tat.], though the presence of the initial velar should establish that koma was borrowed from Tatar and not Chuvash; Hill Mari ama ‘beaver’ is the Chuvash loan.

In Chuvash, Ashmarin lists only xoma, so the word is limited to the Viryal zone and it is probably a borrowing from Tatar. Fedotov’s etymological dictionary also lists a form xăma, but it is not clear where he got that from, because he cites only Ashmarin.

In Tatar, kama is part of the literary language and my Tatar-Russian dictionary defines it as выдра. Äxmatjanov’s etymological dictionary draws a not very convincing comparison with Old Turkic kam ‘shaman’, but kama with the same meaning as the Tatar is found in Siberian Tatar, and Shor has kamnaɣïs.

So, the word appears to have been brought to the Volga–Kama area from elsewhere. However, that strange distribution within Turkic makes one want to look at other Siberian language families (though a cursory glance at a Ket dictionary shows nothing similar-looking under выдра and бобр).

As the definitions of this term in the various languages are very much bound up with the notion ‘fur-bearing animal’, one might link Ashmarin’s xumă ‘sable’ (from an earlier *kam-ïK?) to the Tatar word.