An unexpected corpus: Russian version

Over at his blog Panchronica, Guillaume Jacques expresses his delight about The Jesus Film, that product of some American Protestant sect that has now been translated into an enormous number of languages, even ones for which written material is extremely scanty. It has certainly been of great help to me as I’ve learned Ossetian, and the existence of separate Albanian translations for Kosovo and the Republic of Albania will help foreign learners feel comfortable with both the Gheg and Tosk variants of that language.

While there is probably no other film so widely translated as The Jesus Film, for my own particular purposes I’ve been pleased to find something else, one whose story is less likely to be familiar to the viewer: the Soviet cartoon Трое из Простоквашино (“The Trio from Prostokvashino”) has been dubbed into a number of languages, mainly from Southern Russia and the Caucasus, for example:

  • Ossetian
  • Ingush
  • Lezgian
  • Karachay-Balkar (I was very surprised by how difficult this language is to understand; I thought I would be able to follow it pretty easily after learning Kipchak languages from further east)
  • Lak
  • Kumyk
  • Tatar (under the translated title Простоквашинодан өчәү)

Clicking the links in the sidebar, one can find one’s way to other cartoons in various languages of the former USSR. There’s even an entire playlist of Ossetian-dubbed cartoons.

MariE tolašem W talašem ‘try hard, strive’ < Tatar talaš

One of the frustrations of working with Tscheremissisches Wörterbuch is that some Mari items are labeled Tschuw. or Tat., but the exact source is not specified and sometimes one has to dig a little to determine the original Chuvash or Tatar word.

A case in point is MariE tolašem W talašem ‘sich bestreben, eilen, irgendwie zu tun versuchen’. This is marked as a Tatar loanword in TschWb, and the word is clearly of Turkic origin since it has a causative derivational form MariE tolaštarem W talaštarem. I turned to my dictionary of literary Kazan Tatar, the Татарско-русский словарь (Казань: Мәгариф, 2007), and found a phonetic match: талашу. However, the meanings ‘ссориться, скандалить, переругиваться’ of this verb and its derivational forms were not close enough to the Mari verb to satisfy.

If my Tatar dictionary doesn’t help for a Turkic loanword in Mari, the next stop is a Chuvash one. Ashmarin’s Thesaurus Linguae Tschuvaschorum contains a verb corresponding to the Tatar one and almost certainly a borrowing of it, namely tulaş, and the first meanings mentioned are the same as for the Tatar: ‘беситься, злиться, грызться’. However, buried deeper down in the entry is the meaning we’re looking for: возиться, стараться. This is an understandable extension of the Turkic root tal-, the basic meaning of which is ‘to force; to take by force’.

Thus Mari and Chuvash preserve a meaning of the Tatar word that seems to have died out among Kazan Tatars. Interestingly, Russian too borrowed this Tatar word dialectally and uses it in a similar sense, or at least it did in the 19th century: a verb талашиться ‘суетиться, толочься, метаться’ is attested from the Tambov region in the Толковый словарь Даля, compiled by Vladimir Ivanovich Dal’ and published in 1863–1866.

Incidentally, had I carefully examined the Mari–English Dictionary instead of basing myself solely on Tscheremissisches Wörterbuch, then I could have figured out this etymology more quickly, because one of the meanings of MariE lit. толашаш is ‘to quarrel, to squabble, to bicker’, and that meaning is not found in TschWb. However, the Mari–English Dictionary, being a general literary-language reference and not a dialect dictionary, does not list the origin of the item, and I wonder if the word in that meaning was found only in Eastern Mari communities under heavy Tatar influence before the rise of the literary language, and only the meaning ‘try hard, strive’ is pan-Mari.

More adventures in Latin American Spanish

Argentina was a rather surprising experience. In Spain, where I had learned Spanish, the stereotype of the Argentine in television and films must be based on people from Buenos Aires: one hears the same invariable accent with no hint of the immense variety that one would actually encounter in Argentina. As I cycled west across the country, I found the regional accents clearly changing every 300 km or so.

Once I reached the provinces of La Rioja and Santa Fe, I was shocked to discover that the dialect here had not experienced the shift of *y (and *ʎ > *y) to /ʒ/ like Rioplatense Spanish and the Argentinian stereotype. Instead, it was *r that had shifted to /ʒ/, while *y remained /y/. My first inkling of this was when rápido ‘fast’ was increasingly heard as [ʒapiðo], but it happened to instances of word-medial *r as well and took some getting used to in fast speech. A child came up to my wife and me at a campground and asked if we had seen a man in a [ɣoʒaʒoxa], and only after a minute of thought did I realize he was looking for someone wearing a gorra roja ‘red hat’. Weeks later, in Chile, while I was cycling on the motorway, another tourist stopped his car to ask me if he had missed the turnoff to [βiyaʒika], i.e. Villarrica. I laughed, thinking that he was lucky to have come across a non-local who could understand his question.

I have seen it claimed in several popular sources that the dialects of western Argentina are transitional to Chilean Spanish, but I didn’t find that to be the case at all. Not only does the shift of *r to /ʒ/ stop at the Andes, but the intonation of Chilean Spanish is vastly different. The Andes serve as a mighty wall. For the first week or so in Chile, I had to concentrate very hard to understand what people were saying, and I could sympathize with the many Spanish speakers who point to Chilean Spanish as the most difficult to understand of all the Latin American varieties. Fortunately, after that first week, my difficulties vanished and the local speech came to feel entirely normal.

I’m not quite able to determine what phonetic quirks set Chilean Spanish apart, and I’m not sure that if I hear this accent qua accent again in some other part of the world, I would be able to trace it to Chile. However, the Chilean colloquial lexicon is very sui generis, and I’m sure I’ll be able to immediately identify Chileans by the presence of certain words. People are very fond of the item ueyá/ueyón, which is not only a generic word for ‘thing’ rather like Philadelphian English jawn, but apparently even works as an exclamation and more. Chileans also tend to end sentences with po’h, a reduction of pues and a particle which has an exotic, non-Spanish air about it, as if it were something from an East Asian language.

Curiously, while Argentines accepted my use of vosotros without batting an eye, Chileans have been much more ready to make fun of me for it. They complain that the mere existence of such a form is silly, because Spain is the only place in the world where people say that. (Clearly Chileans never get to talk to a Spanish speaker from Western Sahara or Equatorial Guinea.) Once when having dinner with several upper-class and well-educated Chileans, I found tiresome the company of a writer-who-should-know-better who kept claiming that vosotros, and not the word itself as much as the grammatical form in general with its verb marking, was an innovation that appeared in Spain after the colonization of the New World; my appeal to Latin *‑atis etc. was dismissed because, as a foreigner, I surely cannot have any understanding of the history of the Spanish language.

Hopefully, after making my way through Uruguay, Argentina and especially Chile and finding it entirely possible to communicate with the locals (with perhaps a few days of acclimatization), I can now travel in the remaining countries of Latin America without fear. Still, it is always the variety of the language in the place where you first learn it that sounds the sweetest, and I am very much looking forward to passing through Madrid next month.

Adventures in Uruguayan and Argentinian Spanish

Except for a few very brief orders made at Mexican restaurants in North America, these last few days in Uruguay and the Entre Ríos province of Argentina were the first time I had ever spoken Spanish outside of Spain. All in all, what surprised me is how easy it was to communicate on both sides, in Uruguay at least. I could imagine someone who learned some particular regional variety of UK English having some problems in the American South, for example. Even when I used more recently coined colloquialisms common to Spain, rural Uruguayans understood me. I do find that a bit puzzling, since the Uruguayans to whom I spoke claimed to have virtually no contact with the Spanish of Spain: no music or films or television, and Latin America is a large enough market to sustain its own publishing without having to import any books from Spain. In Argentina, however, I’ve been forced to start adapting to their way of talking in certain contexts.

Over the years, other foreigners who learned Spanish in Spain have told me that going to Latin America would require avoiding vosotros and the verb coger ‘take’, but I find that an exaggeration. No one I met seems to mind the use of vosotros as the second person plural, and the indicative endings are so close to the vos forms used here that nobody would be confused by the morphology. While the verb coger has become an obscenity here, no one batted an eye when I used it in its Spanish meaning ‘to take’. Speaking with ceceo provoked no jokes at our expense.

The main aspects of pronunciation which required a brief moment of adaptation were the seseo and the pronunciation of *y/*ʎ as [ʒ]. Once I crossed the border into the Entre Ríos province of Argentina, I started to hear people dropping final /s/, a common development in varieties across Latin America. Otherwise, it feels like everyone here speaks “clearly”. The major differences found were naturally lexical ones:

  • For ‘tap, faucet’, grifo is understood, but apparently only canilla is used here.
  • For ‘tent’, carpa is used here, though tienda has generally been understood.
  • Uruguayans understand los aseos/los servicios for ‘toilet’, but they say el baño, and I’ve found that I have to use the latter in Argentina to be understood.
  • For ‘peanuts’, people here say maní instead of cacahuete, and Argentinians don’t even understand the latter (if the word is explained to them, they tend to laugh at it).
  • For the simple small-town eateries in Entre Ríos, everyone says comedor, which elsewhere means ‘dining room’. I wonder if my asking ¿Hay un restaurante por aquí? suggested that I wanted something posher than these little communities could boast.

Andreev’s Chuvash textbook and what’s wrong with it

I wrote this review of I. A. Andreev’s Чувашский язык. Практический курс 3rd ed. (Cheboksary: Чувашское книжное издательство, 2011) ISBN 9785767018130 for a book-rating website, but I thought I should also post it here where it is probably more likely to be read. While I do love to just rant about this and other poor learning resources, I think it would be helpful if this book’s flaws were known, so that one can avoid being too greatly disappointed. I remember how thrilled I was to discover the book nearly a decade ago, and how quickly my bubble was burst.

Mapleland and Thornybank

My Romania–Finland hitchhiking commute and a memorable cycle tour have often brought me through extreme southeastern Poland and western Ukraine. I have been struck by constantly encountering the same toponym, e.g.:

  • Jawornik in Poland, on the 892 road south of Sanok;
  • Yavoriv, in Ukraine just across the border from Poland, south of the Ukrainian town of Turka;
  • Yavor, also in that same part of Ukraine, but just north of Turka.

For a long time I would half-consciously mull over this word and think about derivations (e.g. some weird creation from *voriti), but I should have just searched for the term on the web: Common Slavic *(j)avor means ‘maple’. And the reason why I found no headword in Derksen’s Etymological Dictionary of the Slavic Inherited Lexicon is that, according to Pronk-Tiethoff’s The Germanic Loanwords in Proto-Slavic, the term was borrowed from Germanic. I wonder if that makes a case for the Slavic Urheimat, which is supposed to have been in this general area, not reaching down to the Carpathians: why would the Slavs borrow a name for a tree that evidently was so distinguishing a feature of their landscape?

Another riddle from the same part of Europe remains slightly unsolved for me. For a long time, on the basis of the Romanian town of Târnaveni and the Bulgarian city of Veliko Tarnovo, I again, without thinking too deeply about the matter, thought it might be some contraction of *trgŭ novŭ ‘new market’, a sensible name for a place acting as a commercial centre. However, in the Romanian case, the town was actually named after the Târnava River, and one doesn’t often name rivers after markets. Plus, the Bulgarian town should be seen as containing the adjectival ending *‑ovo. Then, at some point I passed through the Polish town of Tarnobrzeg and realized that the common element here is Common Slavic *trŭnŭ ‘thorn’. So, these are areas with thorny banks, which the Polish toponym would seem to express clearly.

But my knowledge of Polish dialectology is scanty. The word ‘thorn’ in standard Polish is cierń. With a place-name like Tarnobrzeg, does this mean that the southeastern Polish dialects had a different development of early Slavic syllabic *r (or sequences of *r and a yer), one that led to a non-front vowel that wasn’t affected by the shift *t > c before front vowels? Interestingly, the Polish Wikipedia article for Tarnobrzeg speaks of a relationship to śliwa tarnina ‘blackthorn, sloe, Prunus spinosa’, and here we have a Poland-wide term with the unchanged consonant.

PIE roots as a mnemonic device in Farsi spelling

Persian roots in which a silent vāv must be written after an initial khe are often considered the bane of foreign learners of Farsi. I myself felt some discontent at having to learn this silly spelling rule after initially encountering Persian in the wonderfully clear Cyrillic script used by Tajiki. However, one of those little eureka moments one encounters in historical linguistics was that these words can be traced back to Proto-Indo-European roots with initial *sw-, e.g.:

  • خواهر ‘sister’ < PIE *swésōr;
  • خوابیدن ‘to sleep’ < PIE *swep‑;
  • خویش ‘himself’ < PIE *swe‑ (I guess, but even if I guess wrong, it still helps to remember).

Thus, a little knowledge of PIE can instantly serve as a mnemonic device in some tricky aspect of a language that arose millennia later.

Increasing age may make it more challenging to learn a language to real conversational proficiency and lose that accent, but I’ve been so encouraged lately by how a decade-plus of sometimes focused and deliberate, but just as often casual and absentminded, learning provides remarkable benefits in reaching a middling level effort-free. Another example is when I recently picked up an intermediate-level reference for Japanese grammar (a language I’ve never formally studied) and realized that I know most of the words used in the example sentences purely through some kind of osmosis over the years. It is wonderful how everything out there ties together somehow. Now if I could just have these fruits of a decade’s experience and have that decade itself back…

Linguistic pseudoscience in the breakup of Serbo-Croatian

I’m all too familiar with Romania and its dacomania, and I’ve read a great deal about Albania’s insistence on a glorious Illyrian past in order to present itself as a proud and stately nation today. But reading Greenberg’s Language and Identity in the Balkans showed me that there’s similar nuttery in the land in between, that is, the former Yugoslavia:

In an interview posted on the Montenet website entitled “Does a Montenegrin Language Exist?” (“Da li postoji crnogorski jezik”) [Montenegrin nationalist Vojislav] Nikčević made the highly dubious claim that the prototype for the Montenegrin language is the Polabian language, having based these unfounded assertions on hundreds of Montenegrin place names. Even more unlikely is his assertion that the ancestors of the Serbs came from an ekavian-speaking area of southeastern Poland, and that their ekavian reflexes of jat’ are somehow linked to those found in Byelorussian. For him, the Montenegrins are the sole authentic ijekavian speakers in the Balkans, and other peoples in the area (Serbs, Croats, Bosniacs) had acquired ijekavian speech secondarily. There is no credible evidence to justify any of these claims. The Montenegrins would be as connected to the Polabians as any other Southern Slavic people, and toponyms in the Southwestern Balkans can usually be traced to substratum languages or to South Slavic influences, rather than West Slavic ones.

You’d think that any academic passionate about the distinctions between the languages of Yugoslavia as well as other Slavic languages would know better. And then there’s this:

[Bosnian language advocate Senahid] Halilović considered the term Bosna to be pre-Slavic and possibly even pre-Indo-European. Such statements on the ancient origin of a name bring to mind Fishman’s notion (1972: 7) of stressed authenticity, whereby ancient terms provide the necessary trappings of legitimacy to a linguistic revival.

That Joshua Fishman citation is Language and Nationalism (Rowley: Newbury House, 1972), which seems to have captured an especially common sort of woo-woo around smaller languages and peoples on the defensive, going well beyond the Balkans.

Inexplicably unproposed Uralic etymologies

There are resemblances between some Mari words and items in other Uralic languages that are extremely blatant and yet, to my knowledge, have gone uncommented. That’s not to say that an etymological link is tenable, but one would expect the UEW or other general references to at least note and shoot down some prior attempt to relate the given words. Why have the following not been compared before, even in the heady early 20th century when standards were relatively lax?

  • Finnish salama ‘lightning’ ~ MariE šolem ‘hail’. Yes, there’s a difference in meaning, but it’s not unusual for words denoting weather conditions to shift semantically, e.g. MariE jür ‘rain’ < Cv. yur ‘snow’. I suppose the difficulty here is that MariW šolem doesn’t show /a/ as some might have earlier expected in a word from *salama. However, it does agree with a formulation by Ante Aikio that Proto-Mari *o appears before *l if the word does not begin with a glide.
  • Finnish pakkanen ‘frost’ ~ MariE W pokšə̑m ‘frost’
  • Russian леньгас ‘loafer’, Estonian lõngus ‘lout’ ~ MariE laŋga ‘lazy’. Paul Ariste connected the first two in a 1966 paper, even mentioning some Finnish words, but he didn’t mention the Mari at all even though it’s there staring one in the face.
  • MariE W paŋga ‘lump’ ~ Udmurt pog id. One would have thought the Mari would be included in the UEW (404) under *puŋka ‘Knollen, Beule, Unebenheit’, as similar forms from across Uralic are listed there. MariE paŋga is in Paasonen’s dictionary, so it’s not like earlier researchers could have been unaware of this word. Even if one would prefer to see the Mari as a loan on account of its first-syllable *a, it’s curious that Bereczki didn’t include it in his 1992 list of Permian or Udmurt loanwords in Mari.

Amusing linguistics web searches

Maintaining a blog with musings on linguistics has brought a lot of search engine traffic my way, and I occasionally look at my server logs to see what searches bring up my website. Often these are not particularly interesting, as people either come for something very specific that I’ve written about, or conversely, for some reason one of my posts shows up for what would seem to be a completely unrelated non-linguistics search. However, occasionally I see very amusing search strings. Here are four of the most recent ones that made me chuckle:

  • is tocharian worth learning: it’s hard to imagine what position a person would have to be in to need to ask this;
  • are people poor in yoshkar-ola: well, I think bednost’ is an apt word for Mari El in many senses;
  • салфетки glagolitic: is someone making a medieval Slavic-themed restaurant?
  • navajo elders understand chinese: looks like someone has been reading Gavin Menzies.