Mari /ŋ/ represented by Cyrillic <н>

In attestations of the Mari language from the 18th century, Mari /ŋ/ tends to be represented with the Cyrillic letter <н>. Many manuscripts represent MariE jeŋ ‘person’ as <ен>, for instance. For more examples, see Alhoniemi’s 1979 commentary on the Mari wordlist of P. S. Pallas.

A colleague of mine found this odd, as he would have expected the sequence <нг>. Yet denoting the sound [ŋ] in the same way as another single consonant has a long history. Consider Greek, where the sequence [ŋg] is always spelled <-γγ->. Also, a samoyedologist once told me of a foreign colleague (Japanese, if I recall correctly) who kept hearing Nenets /ŋ/ as /g/; his ears simply couldn’t pick up on the nasal property of the consonant.

But if historically /ŋ/ has been confused by other peoples with either /n/ or /g/, the question remains why these Russian (and Russia-resident German) wordlist compilers constantly denoted Mari /ŋ/ with the symbol for /n/ and never with the one for /g/. One reason for this may be that the compilers were already using Cyrillic <г> to represent Mari /ɣ/, which is a fricative, not a stop. Since the only other voiced velar sound in the language was a fricative, the velar nasal stop /ŋ/ was heard as the closest stop to it: /n/.

But in the neighbouring Udmurt language, where /g/ is a stop, not a fricative, 18th-century compilers still denoted /ŋ/ with the same symbol as /n/. D. G. Messerschmidt’s wordlist, which has been reprinted with a commentary by V. V. Napolskikh, has <Gurpuhn> for Udmurt dial. gurpuŋ ‘heron, stork’. (Note, however, how Messerschmidt denotes the sequence [ŋg] in <Ning-goron> for Udmurt dial. ńiŋgoron ‘woman’.)

So what else in these Uralic languages and in the native languages of these Russian and German compilers could have motivated the choice of the letter usually denoting /n/ and not the letter for /g/? Something worth thinking about.

Mari and Udmurt dictionaries for Kindle (with caveats)

There are fairly ample Mari-Russian and Udmurt-Russian dictionaries in the Goldendict format (get them here). And there is a toolchain that can convert a Goldendict dictionary to Mobipocket format for use on the Kindle and other e-readers. It works like this:

  1. Decompress the Goldendict file (.dsl.gz) with gzip.
  2. Pass the resulting .dsl file to dsl2mobi. This Ruby script will produce an HTML file (the dictionary data) and an .opf file (the metadata).
  3. Open the .opf file in a text editor, look down through the XML, and specify the title of the dictionary, as well as the input and output languages of the dictionary through their ISO 639 codes (udm or chm and ru in this case).
  4. Run the Windows application mobigen.exe (available from Mobipocket) on the .opf file. Using the option -c2 enables compression for a much smaller size. This will produce a .mobi file that can then be moved to the documents/dictionaries/ directory on the Kindle.
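
Steps 1 and 3 above are easy to script. What follows is only a rough sketch in Python, not part of the toolchain itself. The function names are my own, and the element names `DictionaryInLanguage` and `DictionaryOutLanguage` are an assumption about what the generated .opf file contains, so check them against your own file before relying on this. Steps 2 and 4 (dsl2mobi and mobigen.exe) still have to be run by hand.

```python
import gzip
import re
import shutil
from pathlib import Path


def decompress_dsl(src: Path) -> Path:
    """Step 1: unpack name.dsl.gz to name.dsl in the same directory."""
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src.name}")
    dst = src.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def _set_element(text: str, tag: str, value: str) -> str:
    # Replace the content of the first <tag>...</tag> pair, if there is one.
    pattern = re.compile(rf"(<{tag}>).*?(</{tag}>)", re.S)
    return pattern.sub(lambda m: m.group(1) + value + m.group(2), text, count=1)


def set_opf_languages(opf_text: str, in_lang: str, out_lang: str) -> str:
    """Step 3: fill in the ISO 639 codes of the headword and definition languages.

    ASSUMPTION: the .opf file marks them with <DictionaryInLanguage> and
    <DictionaryOutLanguage>; if yours uses other names, adjust the tags.
    """
    opf_text = _set_element(opf_text, "DictionaryInLanguage", in_lang)
    return _set_element(opf_text, "DictionaryOutLanguage", out_lang)
```

For the Udmurt dictionary one would call `set_opf_languages(text, "udm", "ru")` on the contents of the .opf file and write the result back, then run mobigen.exe with -c2 on it as in step 4.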

This worked for the Mari and Udmurt dictionaries. I can highlight certain words on my Kindle and an entry from the Mari or Udmurt dictionary will pop up. The only problem is that these particular two dictionaries don’t have much in the way of morphology, that is, they don’t have all the inflected forms of words.

The Udmurt dictionary is the more usable, as for verbs often the past participle is included separately, with a link to the main entry for the verb under its infinitive form (e.g. highlighting потэм will pop up a window with a link to потыны), but highlighting any other inflected form like потӥсько will result in an error that the word is not found. Because the Kindle allows one to highlight only whole words, not a subset of letters within them, one cannot even leave off, say, the definite suffix -ез of a word like ужез to get the entry for уж. Perhaps that is possible on other e-readers.

The Mari dictionary provides automatic lookup of words only if they are the infinitive for verbs or the nominative singular for nouns.

Besides trying to look up words by highlighting them in a text, there is also the option of simply opening the dictionary from the Kindle home screen and searching for the word one needs with the Kindle keyboard. The caveat here is that one can only search in a Latin transliteration based on Russian Cyrillic, and there does not appear to be a way to input the extended Cyrillic characters used by Mari and Udmurt.

One might ask what the point of creating these dictionaries is when there are not yet any e-books for these languages. There may be no editions of classic literature yet in e-book form (though I’m working on an e-book version of Chavain’s Elnet), but there are a number of Udmurt and Mari blogs from ethnofuturistically-minded people, and with Calibre one can convert the RSS feeds of these blogs to a format suitable for reading on the Kindle. It’s a good way to have language practice on the go, even if the minutiae of the writers’ lives are not always particularly interesting in themselves.

With so much language learning, how does one ever publish anything?

A couple of years ago I quoted a statement from an introductory Altaic studies textbook that the continual language learning in this field means a lifelong commitment. It’s one thing to continually learn languages over one’s scholarly career to broaden one’s horizons, but lately it seems that so much language learning is imposed that I cannot ever actually finish a journal submission.

This is how things have gone so far:

  1. When I began my studies of Finno-Ugrian linguistics, my initial concern was just Mari, which struck me as the Uralic language with the most readily assimilable grammar, and Russian so that I could use the only decent textbook of Mari available at the time. (Of course I was learning Finnish too as a foreigner in Helsinki, and Saami, Erzya and Nenets as other coursework.)
  2. After a few months it became clear that one can hardly do anything with Mari without having real proficiency in Chuvash and Tatar.
  3. A few months after that, I saw that understanding the Turkic languages of the Volga–Kama area requires some knowledge of what they were like before they arrived in that part of the world. So, numerous references on the Turkic family in general were added to my reading list, and I had to learn a couple of other Turkic languages (I chose Turkish and Kazakh) to act as a sort of control group for Volga Kipchak.
  4. As the years went by, it became clear that I had not considered enough the relationship of the Permian languages with Mari, so courses of Udmurt and Komi became obligatory before I could even dare to comment on the prehistory of Mari. The Ob-Ugrian languages are another area I should strengthen.

At the moment I’ve got a Mari-related research project that I would very much like to bring to publication, but I have the feeling that I will not have done my scholarly due diligence unless I get two more languages under my belt, namely Moksha Mordvin (Erzya Mordvin is not enough) and Ossetian. I’m very worried that the latter is going to lead to even more things to follow up on in Iranian. This could bog me down for years.

The low-hanging fruit in Uralic studies has long been taken. I think it virtually impossible now to publish a paper on Mari considering only that language and no others around it. To someone today, it seems incredible that in 1950 Thomas Sebeok was able to score another entry on his list of publications simply with a two-page article on how Mari family names or patronymics typically precede a person’s own name.

Do scholars who frequently publish simply say at some point, “OK, I’ve got enough data now and I am collecting no more”? Are they not scared that during the peer review process some possibly more knowledgeable scholar is going to condemn them for overlooking data from another language spoken far away but nonetheless essential to the subject?

Deceptive Volga-Kama parallels

The Perso-Arabic word خبر khabar ‘tidings, news’ was firmly entrenched centuries ago in the Volga-Kama region, borrowed by Volga Bulgarian and reflected in Chuvash as xıpar, and borrowed into Tatar as xäbär. Mari got it from Chuvash as uβer, where the loss of initial x‑ is a regular phenomenon. When I began studying Udmurt, I learned the word ivor ‘news’ and, like many people probably, I simply assumed that it was part of this family of words. In fact, according to several publications by T. E. Uotila, the Udmurt word goes back to Proto-Permian (cf. Komi divor) and has no relationship to the Perso-Arabic loan. Such a loss of the initial velar would be unusual in Udmurt, after all.

Mari tukə̑m ‘family tree, stock, descent’ is another case where it’s important to always check the etymology instead of assuming it’s something one already knows. I had always thought this was from the Turkic root *doɣ- ‘to be born’, which is present in e.g. Tatar tuɣan ‘native’. Again, the phonetic mismatch should have clued me in: Turkic ‑ɣ‑ would not be reflected in Mari as ‑k‑ (and besides, Tatar had lost the voiced velar fricative from that root; the ‑ɣ in Tat. tuɣan belongs to the ‑GAn suffix). The Mari word actually comes from Tatar tokïm < Persian تخم tokhm ‘seed, egg, origin’, which bears only a coincidental resemblance to the Turkic root.

Mysterious peoples in contact with Finno-Ugrian speakers

On the basis of a large number of apparently Iranian loanwords in the Finno-Ugrian languages of Russia, it’s easy to assume that in the north you had Uralic speakers and in the south you had Iranian speakers, and that’s all there was to it. In fact, these languages show signs of contact with mysterious populations that don’t fit a simplistic preconception of the area.

For a long time the Andronovo culture of the south Russian steppes and southwestern Siberia in 1800–1400 BC was identified as Iranian. However, Eugene Helimski argues for a reinterpretation of this in his article “The southern neighbours of Finno-Ugrians: Iranians or an extinct branch of Aryans (‘Andronovo Aryans’)?” in Finnisch-ugrische Sprachen in Kontakt (Maastricht, 1997) pp. 117–125. Noting that borrowings into Uralic often show “Proto-Aryan” or “Proto-Indo-Iranian” features in spite of being transmitted well after a distinct Iranian branch had arisen, Helimski says that the Andronovo language must be considered not Iranian but a third branch of Indo-Iranian. Thus alongside Indo-Aryan in the Indian Subcontinent, and Iranian further to the south or west, there was in the Andronovo area a branch distinct from both, a “Para-Iranian” which (Helimski supposes) would have died out in the first millennium BC.

The possibility of a hitherto unknown Indo-European language should excite anyone interested in the field. But that’s not all. Millennia later and further to the northwest, Uralic speakers met another, even more mysterious population.

After the Mongol invasion in the mid-13th century and resulting population movements, the ancestors of the Chuvash and the Mari came into contact, and with them was an unknown language that left a few dozen words behind in the Chuvash and Mari dialects before disappearing. These words have been investigated by Klára Agyagási in volume 7 of Folia Uralica Debreceniensia and in “Die Spuren der Sprache der Spät-Gorodec Bevölkerung in den tschuwaschischen und mariischen Mundarten” in Congressus nonus internationalis Fenno-ugristarum Pars IV (Tartu, 2001), pp. 35–39.

I am still grappling with Agyagási’s corpus of 64 words and don’t know what to think. Words like Cv. ĕmĕlke ‘shade’ ~ Meadow Mari ümə̑lka ‘shadow’, or Cv. kăsăya ‘titmouse’ ~ Meadow Mari kə̑sa id. don’t seem readily comparable to any language of the area, whether Indo-European or Uralic.

Udmurt names for the months of the year

Native-language names for months of the year often provide interesting ethnographic colour. Sadly the Mari and Chuvash indigenous names are no longer used and often entirely forgotten, but a student working with an Udmurt textbook will still be taught and drilled on native Udmurt names.

In an Udmurt-language contribution to the Festschrift for Lars-Gunnar Larsson Lapponicae investigationes et uralicae, Valentin Kelmakov has helpfully gathered dialectal variants of the months of the year, along with citations of texts to show what the literary language uses. Needless to say, Udmurts also often use the Russian names for the months, but here I summarize only the native Udmurt material that Kelmakov presents, and only those dialectal forms for which he can provide an etymology:

| Month | Literary language | Other, dialectal names |
|---|---|---|
| January | tolšor ‘midwinter’ | keźi̮ttoleź ‘cold month’ |
| February | tuli̮spal ‘towards spring’, kionśuan ‘wolf cavorting’ | ku̯akatoleź ‘crow month’ |
| March | oštoleź ‘cattle(-watering) month’, ku̯akatoleź ‘crow month’ | vöjtoleź ‘butter month’ |
| April | južtoleź ‘month of a thin crust of ice over the snow’, śöd kuaka vuon toleź ‘month when the black crow comes’ | ki̮źputoleź ‘birch month’, bi̮dǯ́i̮mnunal toleź ‘Easter month’ |
| May | ku̯artoleź ‘leaf month’ | guždortoleź ‘grass month’, gi̮ri̮ni̮ poton toleź ‘month one goes plowing’ |
| June | invožo, of uncertain origin, see below | lektoleź ‘harsh month’ |
| July | pöśtoleź ‘hot month’ | köstoleź ‘dry month’, turnantoleź ‘grass-mowing month’ |
| August | gudi̮rikoškon ‘month when thunderstorms stop’ | arantoleź ‘harvest month’ |
| September | ku̯aruśon ‘leaf-fall’, siźi̮ltoleź ‘red month’ | |
| October | końi̮vuon ‘squirrel-hunting month’ | vi̮lǯuktoleź ‘new porridge month’, pukrotoleź ‘Pokrov month’ |
| November | šurki̮nmon ‘river-freezing’ | jöki̮ntontoleź ‘month when ice comes’, pöjtoleź ‘wild animal(-hunting) month’ |
| December | tolsur ‘winter beer month (pagan drinking festival)’ | inmartoleź ‘God’s month/month of the gods’, nardugantoleź ‘winter solstice month’ |

The only month where the name is unclear in the literary language is June. Kelmakov writes:

According to F. I. Wiedemann’s 1880 word list, the word vožo means ‘rye blooming season’ (Zeit der Roggenblüthe), but today it is hard to say where that word originated from, whether from vož ‘green’, or perhaps from the word vožpoton ‘anger’. In some areas the Udmurts called malevolent minor gods vožo/invožo. These could do evil to people in winter (the vožo) and in summer (the invožo). These minor gods, it was said, didn’t approve of swimming in a river during the day, making noise, or drawing water with an iron tool. Anyone doing such would bring upon himself the anger of the invožo, who would rain down hail. In olden times the Udmurts strove to placate the minor gods that they recognized and live in harmony with them, so leading up to Trinity Sunday (known in Udmurt as ku̯arsur ‘flower celebration’) they would not go out to mow the grass with a scythe or reap with a sickle, and during the day they would keep their dogs locked up, while they themselves would hide under a canopy. To remember that this was the time when the vožo of the summer – the invožo – could become angry, the Udmurts apparently called the month of June invožo.

One finds here the same difficulty as with reviving the Mari names for the months of the year: sometimes a name refers to one month in one dialect, and to a different month in another dialect.

Scan of Lytkin’s Древнепермский язык available

In spite of continuing interest in the Old Permic language (the Old Permic script was added to Unicode in version 7.0 last year), the only substantial reference for it remains V. I. Lytkin’s Древнепермский язык of 1952. Since the book is either not under copyright or no one particularly cares, I have taken the liberty of making a print-quality (300 dpi) and cleaned-up scan of the book: PDF (10 MB). Even the fold-outs are scanned, though you’ll probably need to print them on A3-sized paper for them to be legible.

Bartens’s history of Permian vowels

In my study of Udmurt and Komi, I have produced an English translation of the chapter on Permian vowels from Raija Bartens’s Permiläisten kielten rakenne ja kehitys (The Structure and Development of the Permian Languages, Helsinki: Finno-Ugrian Society, 2001). While Bartens’s book no longer represents the state of the art in Uralic linguistics, and in the years since Sándor Csúcs has shaken the field up with such publications as Die Rekonstruktion der permischen Grundsprache (Budapest: Akadémiai Kiadó, 2005), Permiläisten kielten rakenne ja kehitys does provide a helpful introduction to 20th-century work on Permian vocalism.

Names for ‘ladybug’ in Udmurt

The Диалектологический атлас удмуртского языка edited by R. I. Nasibullin et al. (Iževsk: R&C Dynamics, 2009) has a series of maps showing the distributions of the Udmurt names for various things across the area where the language is spoken. For most items, there are only a few variants, and in the case of borrowing, Russian loans are prevalent in the north of the Udmurt Republic while Tatar loans are prevalent in the south.

The word for ‘ladybug’ (Russian божья коровка) is a different story. The atlas lists 124 variants.

Some of these are very colourful: ӵужанай ‘maternal grandmother’, вӧйын нянь сиись ‘bread-and-butter eater’. A large number are formed with зор ‘rain’ (< Volga Bulgarian, cf. Chuvash ҫур ‘snow’). Nasibullin examines these names more closely in his article ‘“Божья коровка” в удмуртских говорах’ in the journal Иднакар (issue 2007-2).

Amusingly, after the myriad names for ‘ladybug’, the atlas documents only one name (with varying vocalism) for that most common pest on Earth, the cockroach: торокан/таракан/тӓрӓкӓн (cf. Russian таракан).

(If this kind of variation fascinates you, in North America, the various names for the family Armadillidiidae, which I grew up calling a roly poly, have also been mapped.)

Mari and Udmurt children’s poetry

Textbooks of Finno-Ugrian languages written for foreign learners really like to give children’s poetry as translation exercises. Thus Марийский язык для всех presents the following from one Pet Pershut:

Кутко сӱан

Тыгыде кутко —
Шем кутко,
Йошкар кутко —
Ер кутко,
Сар кутко —
Сад кутко
Кеҥеж кечын сад мучко
Каеныт корно мучко,
Пурак веле тӱргалтын,
Изи йыҥгыр мӱгыралтын.
Орава да тарантас ден,
Шым гитар ден,
Шым шӱвыр ден,
Вич тӱмыр ден,
Волен, кӱзен,
Шудым пӱген,
Каеныт сӱаныш,
Рӱж миеныт йыраҥыш.

The ant wedding

Small ants,
black ants,
red ants,
lakeshore ants,
grey ants,
garden ants.
They made their way
through the garden on a summer day,
carrying only crumbs,
singing a little song.
With carts and wagons,
with seven guitars,
with seven bagpipes,
with five drums,
they sang and danced,
and made merry,
They went on, they went up,
They bent down grain stalks,
They went to the wedding,
with a buzz they headed into the flower-bed.

The third chapter of the Udmurt textbook Марым, леся… gives a series of several poems by Alla Kuznetsova exemplifying the numerals just introduced. Here’s the one for ‘7’:

Сизьым туж тодмо мыным,
Сизьым нунал арняын:
Вордӥськон бере пуксён,
Вирнунал, покчиарня,
Крезьгуро удмуртарня,
Кӧснунал, арнянунал.

Seven things are very familiar to me,
The seven days of the week:
Monday then Tuesday,
Wednesday, Thursday,
Melodious Friday
Saturday, Sunday.

I don’t much care for this. Adult learners should not be treated like children. Sure, it may be a few chapters before a student is ready for it, but it would be more dignified to bring in selections from folk songs or simple selections from novels.