An overlooked Russian loan etymology in Chuvash and Mari

I was surprised to find Mari pajə̑rka, pə̑jə̑rka and Chuvash payarka ‘small amount’ in Agyagási’s set of shared Mari–Chuvash lexical material of unknown etymology (“Der sprachliche Nachlaß der Spät-Gorodec Bevölkerung in den tschuwaschischen und mariischen Mundarten”, 2000). The words seemed to me like straightforward borrowings of Russian поярок known from Dal’ and glossed ‘шерсть с ярки, первой стрижки, с овцы по первой осени’. When lambs are shorn for the first time, they produce a quite small amount of wool, and the example sentences that Beke’s dictionary gives for Mari pajə̑rka suggest the word was mainly applied to small amounts of material (wool/straw/bast), and so one could readily propose a semantic development ‘small bundle of wool’ → ‘small bundle of any material’ → ‘small thing’.

Agyagási’s ascription of the word to an unknown Middle Volga substrate had the consequence that, first of all, the word was overlooked in her later work Ранние русские заимствования тюркских языков Волго-Камского ареала, and secondly, the editors of Tscheremissisches Wörterbuch only marked these Mari items “[~ Tschuw.]”. I began writing up this Russian loan etymology with publication in mind, so I was rather disappointed to find the etymology had already been presented as an aside in Rédei & Róna-Tas’s 1983 paper “Early Bulgarian loanwords in the Permian languages”, though at the same time it is nice to see my hunch confirmed.

On p. 37 of the article, the authors discuss the mistaken comparison of Komi parga ‘in der Flachshechel zurückgebliebener flockenförmiger, reiner Abfall vom gehechelten Flachs’ with Chuvash parga ‘Büschel’. They write:

The Chuvash parga (Zolotnickij, Čuv.-russk sl.; Paasonen, Csuv. szój) ‘heap, bundle’ is a dialectal form: more exactly, the word is paŕga (Ašmarin IX, p. 117) and is the equivalent of the payărka of the literary language. This word exists in Cheremiss (pajə̑rka, pə̑jə̑rka, Räsänen, Tat. Lehnw., p. 88 Cher. ← Chuv., Etym. Wb., p. 378 Cher. → Chuv.), and also in Tatar (dial. payarka). These words are adoptions of the R poyarok ‘šerst’ jagnjat (pervoj strižki)’ (Vasmer III, p. 351) and the semantic development is ‘small heap of wool’ → ‘small heap, bundle’ (Cf. Cher. miž-pajə̑rka ‘ein wenig Wolle’).

Mari constellations

I have occasionally wondered if ancient Mari names for the constellations have been preserved – after all, so much folkloric terminology has already been lost over the 20th century – but I never looked into the matter. In his 2005 publication on G. F. Müller’s Mari wordlist, Oleg Sergeev points out some terms that had survived from at least the early 18th century until the compilation of the Словарь марийского языка in the 1980s and 1990s:

Müller | Meadow Mari | English
шорду шудеръ ‘звезда лось, которую астрономи называют’ | Шордышӱдыр šorẟə̑šüẟə̑r | North Star
машка шудеръ ‘медведь’ | Маскашӱдыр maskašüẟə̑r | Great Bear, Ursa Major
витвара шудеръ ‘кичиги’ | Вӱдварашӱдыр βüẟβarašüẟə̑r | Orion
сокта шудеръ ‘утяча гнездо’ | Шоктешӱдыр šoktešüẟə̑r | Pleiades

The word кичиги as a Russian name for the constellation Orion is known from V. I. Dal’s dialectal dictionary: Кичига, кичиги мн. сев. три-царя, сохатый сиб. кигачи южн. созвездие Орион, или пятизвездие, образующее пояс и меч его.

More unmarked loanwords in Tscheremissisches Wörterbuch

As I mentioned in an earlier post, while Tscheremissisches Wörterbuch generally notes if a Mari word has a Uralic etymology or is a loanword, and thus is a useful way to determine which Mari items are so far without an etymology, there are a few loanwords that were erroneously left unmarked. Here are some more:

  • MariE č́oman ‘large bast basket’ was linked by Wichmann (1903) to Cv. çuman id., although the presence of the Mari word only in Morko and Birsk in TschWb makes Tatar čuman or even Ru. dial. чуман a possibility as well.
  • MariE mŭl’o, mə̑l’e NW W mol’ə̑ ‘kind of fish’ is from Ru. dial. моль ‘самая мелкая рыбка, недавно выведшаяся’ and is listed in Savatkova (1969: 104).
  • MariE NW süsanem ‘tremble (e.g. from cold, fear)’ is from Cv. śüśen- and is listed in Fedotov (1990).
  • MariE šolap, šolop ‘roof gutter’ is from Ru. жёлоб id. I am not aware of this etymology being expressly mentioned anywhere before Veršinin’s dictionary of Mari dialects of Udmurtia and Tatarstan, but I would think this a pretty transparent borrowing.
  • MariE NW teŋgə̑ltöngəl ‘bench, chair’ is from either Tat. dial., Bashkir täŋkäl ‘табуретка’ or Cv. tenkel ‘скамейка, стул’.
  • MariE treńč́a NW trendzä W tranza ‘roof shingle’ is a borrowing of Ru. драница and listed in Savatkova (1969: 95).
  • MariW əŋgəžä ‘shoulder’ is a borrowing of Cv. ĕnse (< *ĕŋsä) ‘nape’, as recognized already by Räsänen (1920: 28). The word is not, however, listed in Fedotov’s Чувашско-марийские языковые взаимосвязи, and that may be why it was overlooked by the TschWb editors.
  • MariNW W tepenä ‘hole in oven or drying kiln’ was derived from Ru. теплина by Räsänen (1930). This etymology was overlooked by Savatkova (1969), however.
  • Also in Räsänen (1930) we find MariE sə̑lma NW W tsə̑lma ‘homemade trousers’ compared to Cv. śïrtma, śïtma, śïta ‘children’s cloth nappies; long underwear’, which merited marking the entry in TschWb with “[~ Tschuw.]”. This word would be more than a straightforward loan between Mari and Chuvash, as it surely must be related to Moksha Mordvin seŕmag, sermjaga ‘coarse undyed fabric’ and is probably of Iranian origin.
  • MariE toβro ‘of course, certainly’ is a clear borrowing of Ru. добро (before contact with Russian, ‑βr‑ was not a permitted cluster in Mari), and other instances of this Russian borrowing across the Mari dialects are listed in Savatkova (1969: 95).


Fedotov = Федотов, М. Р. 1990: Чувашско-марийские языковые взаимосвязи. Саранск: Издательство Саратовского университета, Саранский филиал.

Räsänen, Martti 1930: “Wortgeschichtliches zu den Sprachen der Wolga-Völker” – Finnisch-Ugrische Forschungen 36:1.

Räsänen, Martti 1920: Die tschuwassischen Lehnwörter im Tscheremissischen. Mémoires de la Société Finno-ougrienne 48. Helsinki: Société Finno-ougrienne.

Savatkova = Саваткова, А. А. 1969: Русские заимствования в марийском языке. Йошкар-Ола: Марийское книжное издательство.

Veršinin = Вершинин, В. И. 2011: Словарь марийских говоров Татарстана и Удмуртии. Йошкар-Ола

Wichmann, Yrjö 1903: Die tschuwassischen Lehnwörter in den permischen Sprachen. Mémoires de la Société Finno-ougrienne 21. Helsinki: Société Finno-ougrienne.

A fraudulent polyglot

Lately there has been a lot of European media coverage of eager polyglots (sometimes hyped as “hyperpolyglots”), but these newspaper or magazine articles are often centered around meetings or internet communities where people claiming some command of a foreign language are seeking others with whom they can converse, and so it’s obvious what skills they have (or don’t). A blog post by Russian travel guru Anton Krotov introduced me to Villi Melnikov, a now-deceased Russian who several years ago convinced journalists that he knew up to a hundred languages, and as they had no way to verify that, they simply accepted his claims uncritically.

Melnikov has his own article at the Russian Wikipedia, a section of which notes that some of the languages he claimed to speak don’t even exist, e.g. “Sumero-Akkadian” or a mysterious “Dkhurr-Wuemmt” (a Google search for the latter in the original Cyrillic spelling дхурр-вуэммт returns only results linked to Melnikov). Amusing is Krotov’s own account of a time that he met Melnikov and realized he was a flimflam artist:

Once, several years ago, when I had been impressed by the “talents” of Villi Melnikov, I happened to be among the same group of people with him and decided to carry out a small test of his talents. To test him, I used 1) a surah from the Quran, 2) a passage from Practical Free Travel [one of Krotov’s own books] in its Latvian translation, and 3) some text or other in Indonesian. A friend of Roman Pechenkin’s from Laos was also present. Villi enthusiastically provided a Russian translation of all the necessary phrases and passages, but there was just one thing: none of these translations matched the actual contents of the given texts. Villi had cleverly played a trick on everyone present, and many of them didn’t even realize the nature of the con. The woman from Laos was happy to hear him rambling in Lao, but she didn’t understand anything and answered, “I am not understand you”, after which the polyglot switched to English and everyone gathered there was satisfied.

The squirrel-less are guilty

Courtesy of Loanwords in the World’s Languages ed. Haspelmath & Tadmor (Mouton de Gruyter, 2009) comes one of the most amusing etymologies I’ve ever seen.

In his contribution on loanwords in Ket, Edward Vajda describes the historical background of how speakers of Yeniseic languages first encountered Russians through the Russian settlers’ demand for a tribute of furs. Due to these pressures, many Yeniseic speakers sought to avoid contact with Russians, and Russian loanwords from before the Soviet era are fairly limited. But the Ket also showed a predilection for coining their own words for new concepts from native Ket material instead of borrowing foreign words.

Vajda ultimately drops this little gem: The concept ‘guilty’ was interpreted in Ket as saʁan, derived from a combination of native Ket sa’q ‘squirrel’ with the case marker -an ‘without, lacking’, since someone without furs to pay their tax was ‘guilty’ or ‘at fault’ in a legalistic sense.

Unmarked loanwords in Tscheremissisches Wörterbuch

In Tscheremissisches Wörterbuch known loanwords in Mari are usually noted as such, e.g. “taɣa ‘Widder, Hammel, Schafbock’ [< Tschuw.]”, “pülẟarem ‘fordern, verlangen’ [< Tat.]”. By going through the dictionary and compiling a list of unetymologized words, I’ve been able to propose a few new etymologies that hopefully will be published eventually. However, one must tread cautiously, as a few loanwords are left unmarked even when they have long been recognized as such.

One of these is the Mari word for ‘frog, toad’, listed under the headword užaβa with a great deal of dialectal variation. This bears a striking resemblance to Russian жаба id. Indeed, I turned to Savatkova’s Русские заимствования в марийском языке, and the loanword is included in the great big Russian–Mari index at the back (namely on page 95).

MariE taɣarl’a ‘ein kleiner Vogel’ is a borrowing of Tat. täkärlek, as recognized already by Räsänen in his Die tatarischen Lehnwörter im Tscheremissischen of 1923, p. 65. The word may have come into Mari through Chuvash mediation on account of the voiced velar spirant if one supposes that Mari did not take it from a Tatar dialect that voiced the velar, but that would still have merited writing “[< Tschuw./Tat.]” next to this headword like with other doubtful items, such as purlo ‘gräulich’.

Mauritian words in J.M.G. Le Clézio’s La quarantaine

I recently finished J.M.G. Le Clézio’s 1994 novel La quarantaine, about two Mauritius-born brothers returning to their native land but stranded for two months on a neighbouring smaller island used as a quarantine station. Le Clézio’s French prose is straightforward, maybe disappointingly so if one has read other authors with a great flair for language.

The dialogue is also in standard French, with the exception of the single creole sentence Pour faire la guerre licien, napa bisoin fizi, bisoin coup de roce. I initially imagined that this sentence, while opaque to me, would be readily decipherable by native French speakers, as why else would Le Clézio dare to present it without any gloss in Standard French? In fact, the several French people I have presented this passage to stumbled on fizi, and only from the unexpected source of the Dictionnaire pratique du créole de Guadeloupe did I learn that this creole word goes back to fusil. The same French people also couldn’t identify licien, but several sources on the internet show that the word means ‘dog’, originating in le chien. Thus I suppose the full sentence in the novel would be ‘To fight a dog, one doesn’t need a firearm, one just needs to hit it with a stone.’

There are several individual Mauritian words that pop up throughout the book, however. Three refer to Indians who were employed by the colonial authorities as recruiters or overseers of coolie labour: arkottie and sirdar, encountered often in the book, and duffadar. A Google search for arkottie and duffadar shows that they are found mainly in 19th-century English publications and must have entered Mauritian French or Mauritian Creole from English, which makes sense considering that the labour was sourced from British India. The title sirdar was widely used through the Middle East and the Indian Subcontinent as a military or aristocratic rank.

Le Clézio’s main Indian character refers to Europeans as les grands mounes. This is presumably ‘the big people’, as in the creole of Réunion, (dë)moune ?< monde is frequently used as a replacement for Standard French hommes or gens.

There is also longaniste, a native sorcerer and healer, comparable to the sangoma of South Africa, and laffe-la-bou, a name for a venomous stonefish.

Finally, Le Clézio mentions astère used as a creole equivalent of maintenant. In an interesting comment thread on a blog about mauricianisms, the French Canadian linguist Marie-Lucie Tarpent notes that the word is ultimately a contraction of à cette heure, and present in Canada, too, as asteur(e), but the origin must have been a dialect of western France where the phrase was no longer analyzable.

In addition, Le Clézio mentions in passing that a dialect of the North Indian language Bhojpuri is still spoken on the island.

When I stayed in Madagascar in the company of the Russian hitchhiking club Academy of Free Travel several years ago, I was jealous that several people had got to see Mauritius on the way to Madagascar, and after reading La quarantaine I’m again intrigued by this island and its unusual cultural mix.

Article on hitherto unidentified Mari items in Pallas’s Vocabularia comparativa

Linguistica Uralica 2016:3 is out, and in it is my article “On some hitherto unidentified Mari items in the ‘Vocabularia comparativa’ of P. S. Pallas” (PDF). Here’s the abstract followed by the considerably more detailed Russian-language summary:

The “Linguarum Totius Orbis Vocabularia comparativa” of Peter Simon Pallas published in 1787–1789 is a prominent early record of the Mari language, containing Mari translations of 273 Russian headwords. This material has been examined by Thomas A. Sebeok in an ample commentary published in 1960, and by Alho Alhoniemi two decades later, but they were unable to identify all words. Using recent lexical resources on Mari and studies of the original manuscripts, the present contribution identifies further words and corrects some errors in earlier interpretations. The result is a more complete picture of Pallas and 18th-century Mari.

«Сравнительный словарь всех языков и наречий» П. С. Палласа, изданный в 1787–1789 годах, является выдающейся ранней записью марийского языка, содержащей марийские переводы 273 русских заглавных слов. Этот материал был исследован Т. А. Себеоком в его обширном комментарии, опубликованном в 1960 г., а затем А. Алхониеми, почти двадцать лет спустя. Оба ученых не смогли однако распознать всех марийских слов содержащихся в этом словаре. С помощью современных лексических источников по марийскому языку, а также благодаря изучению рукописных словарей являвшихся источником для Палласа, автор статьи расшифровал некоторые из ранее неидентифицированных слов, а именно: Ирла́ ‘боль’ = MariE (Большой Кильмез) irla ‘ворчать’; Шу́идабу́и ‘власть’ = MariE šüδə̑βuj ‘сотник’; Чюмышта́ ‘ростъ’ = MariE č́ə̑memčəmem ‘натянуть’; Шитешь ‘ростъ’ = MariE šə̑tem W Nw šətä ‘прорастать’; (Чумра)тырмышь ‘шаръ’ = MariE tə̑rtə̑štərtəš ‘шар’; Пыла́мирь ‘буря’ = MariE pulamə̑r ‘беспорядок, смута, раздор’; Садиги ‘паръ’ = MariE saδə̑γe ‘так, таким образом’; Муней ‘колъ’ = MariE (Большой Кильмез) munej ‘жаба’; Кунзя ‘судно’ = MariE (Малмыж) kunźə̑ ‘воз’; Чипталмаш ‘брань’ = MariE č́ə̑ptalaš ‘нападать’; Пилнышь ‘побѣда’ = MariE pə̑lnaš ‘слабеть’; Шурть ‘китъ’ = MariE šə̑rt Nw šərt ‘злой дух’; Всерсе ‘послѣ’ = MariE βarase ‘последний (только что появившийся)’; Умсысь ‘безъ (кромѣ)’ = MariE umsə̑z ‘безумный’. В статье также отмечено, что бяи ‘въ’ – это возможно удмуртское слово, ошибочно упомянутое как марийское. Результат настоящего исследования дает лучшее понимание словаря Палласа, а также марийского языка XVIII века, несмотря на то, что 17 марийских слов из словаря Палласа по-прежнему остаются неясными.

Contraction as a source of Meadow Mari a in an inherited Uralic word

In his article “The Finnic ‘secondary e-stems’ and Proto-Uralic vocalism”, published in the 2015 issue of Journal de la Société Finno-ougrienne, Ante Aikio presents a new set of related Uralic items involving Mari: [Proto-Uralic] *woja/i ‘wild (animal)’ || MariW wojǝr | Komi vej | KhVVj wajǝɣ (< PKh *wājǝɣ) | MsSo ūj (< PMs *ūj) (UEW: 553). The Mari word has not previously been included in this cognate set.

I had formerly noted down MariW βojə̑r, drawn from Tscheremissisches Wörterbuch, in my big collection of unetymologized Mari words, so now with Aikio’s observation I must strike it from the list. What is interesting, however, is that the word is apparently attested in literary Meadow Mari, as well, but under the form вар ‘wild, running wild (after confinement)’. If this is the same word, then the originally two-syllable word has undergone contraction, producing an initial-syllable /a/, not something one generally expects from inherited Uralic material.

I know from Oleg Sergeev’s description that Zemljanitsky’s dictionary, compiled in the 1870s, has воеръ ‘дикий’. Unfortunately, Zemljanitsky’s dictionary contains words drawn from both Hill Mari and Meadow Mari, and Sergeev fails to make clear if this particular item was accompanied by any indication as to its origin (as some entries in the dictionary do specify Hill or Meadow Mari). Thus, it is presently impossible to know whether an uncontracted MariE βojə̑r did exist until recently, without going through the challenging process of examining the original manuscript in situ. It is extremely urgent that the Mari manuscript dictionaries in Russian state collections be digitized.

(For information on Zemljanitsky’s dictionary and the presence of this item in it, see O. A. Sergeev’s article “Рукописный словарь марийского языка Земляницкого” in Советское финно-угроведение XXIV No. 4 (1988), pp. 292–295.)

The Albanian language in Kosovo

One of the great pleasures of this recent trip to Kosovo is that, now equipped with a decent reading knowledge of Albanian, I could make sense of all the signage around me. But for one wanting to turn a fairly passive knowledge of the Albanian language into an active one, Kosovo is a frustrating place. I didn’t have a chance to buy the earlier edition of Routledge’s Colloquial Albanian written by Isa Zymberi that is based on Kosovo speech, so I have been using a mixture of more general resources for the artificial standard created in Socialist Albania a few decades ago. Kosovars understand that perfectly fine, and when speaking to me they kindly adapt their speech to a more standard variety, but I cannot understand Kosovars talking among themselves and that makes for an awkward experience, especially when being able to follow many YouTube videos from Albania before the trip had so lifted my spirits.

Even bringing along a reference with details on Geg Albanian wasn’t as helpful as I expected: Martin Camaj’s Albanian Grammar with Exercises privileges Geg forms in the vocabulary, with Tosk/Standard Albanian forms following in parentheses. However, many of these Geg forms are not actually usable in Kosovo. Some are said by Kosovars to be foreign to Kosovo (with the person vaguely pointing west towards northern Albania or Montenegro). Others are dismissed as from the village – indeed, residents of Prishtina and Gjakova seem to have a haughty attitude to rural speech and take pains to speak in a different way, though one that is not necessarily any easier for a foreign learner.

(From where I write this now in northeastern Albania, the accent remains much the same, but lexically things are closer to what I would expect from my learning materials, and it’s a lot easier to get language immersion than among the more cosmopolitan Kosovars who are quick to show off their knowledge of German or English.)

Spelling quirks

It’s curious indeed that after Hoxha’s Albania chose Tosk as the basis for the standard language, the Albanian minorities in Montenegro, Kosovo, and Macedonia – Geg speakers all – so readily adopted this rather perverse standard. Virtually all texts are created in the standard language, invariably showing the Tosk rhotacism though it’s utterly foreign to these parts. Still, occasionally one sees mistakes made in the writing of Standard Albanian ë. In final position it is no longer pronounced in either colloquial Geg or Tosk, and therefore one sees it left out on some signs associated with rural contexts, e.g. blejm hekur for blejmë hekur ‘we buy scrap metal’.

The other misspelling comes from Geg’s preservation of nasal vowels when the standard language has reduced these to ë. Consider the storefront windows shown here, only a couple of hundred meters from each other in Gjakova. A cafe advertises ëmbëlsira ‘sweets’ but writes the initial-syllable vowel with a instead of the standard ë, while another, perhaps more upscale establishment shows the word spelled according to the standard orthography which is indeed the norm even in Kosovo.

[Image: a storefront reading “Ambelsira, espresso, kapuqino, makiato”]

[Image: shop window reading “Punëtoria e ëmbëlsirave ‘Dor’ Pasticeri. Punojmë me porosi bakllava, torte dhe ëmbëlsira sipas kërkesave tuaja”]