Last updated 2008-01-16.

Introduction

LaTeX is a free typesetting program that produces high-quality manuscripts ready for printing. It is frequently used for typesetting mathematical books due to its excellent support for formatting equations, but is in fact useful for nearly any purpose. Cambridge University Press typesets many of its books on linguistics, such as The Syntax of Hungarian, with LaTeX.

LaTeX can be a powerful and economical choice for those who work with classical languages and compose texts on historical linguistics. It supports, among many other languages, Latin and Greek, and even provides the correct hyphenation of text in those languages. Specialized transcription systems, such as those commonly used for reconstructions of Proto-Indo-European, can be included with relative ease. Here I shall present a brief guide to using LaTeX for the classical philologist and historical linguist. However, I do expect the reader to be computer-savvy, able to learn on his own, and already familiar with the basics of LaTeX. A fairly clear primer on LaTeX is The Not So Short Introduction to LaTeX2e (PDF), though there are many other tutorials available. Once one has LaTeX installed and has learned how to format basic documents, one can learn to typeset Greek, Latin, and other philological or linguistic text.

Unicode Support

To enable Unicode support in a modern LaTeX distribution, one need only add \usepackage[utf8]{inputenc} to the document preamble and use an editor that can save the file in UTF-8 encoding.

Enabling Specific Language Support

LaTeX has support for many individual languages through the package “babel”, which can be activated in the document preamble with the needed languages as arguments, e.g. \usepackage[polutonikogreek,latin,english]{babel}. The option polutonikogreek sets hyphenation and spelling to ancient, polytonic Greek; typing merely greek provides support only for the modern Greek language. Latin is simply latin, though Babel's Latin support can also be configured for Medieval Latin (consult its documentation for the exact option). The language in which most of one’s document will be written should be the last argument to Babel. Note that in most LaTeX installations english configures hyphenation and spelling for American English. If your document is in British English, try british.

This only informs Babel which languages will appear in the document, with the last-named language as the default. To switch languages conveniently within the document, one should declare two commands in the document preamble: \newcommand\greektext[1]{{\localgreek{#1}}} and \newcommand\latintext[1]{\foreignlanguage{latin}{#1}}

With these declared one can simply indicate Latin or Greek text by surrounding it with the suitable markup, e.g. \latintext{vēnī, vīdī, vīcī} for Latin and \greektext{καὶ ὁ πρῶτος ἐσάλπισεν} for Greek.
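To see the pieces together, here is a minimal sketch of a complete document that combines the Unicode and Babel declarations described above. It is only an illustration under some assumptions: the command name \grk is hypothetical and is defined with \foreignlanguage, mirroring the \latintext definition, rather than reproducing the \greektext definition given earlier; it also assumes a reasonably recent distribution in which Greek fonts and the UTF-8 mapping for Greek are installed.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}                        % source file saved as UTF-8
\usepackage[polutonikogreek,latin,english]{babel}  % english last, so it is the default

% Latin switch, exactly as declared in the text above.
\newcommand\latintext[1]{\foreignlanguage{latin}{#1}}

% Hypothetical Greek counterpart (the name \grk is an assumption for this sketch).
\newcommand\grk[1]{\foreignlanguage{polutonikogreek}{#1}}

\begin{document}
Latin: \latintext{vēnī, vīdī, vīcī}.

Greek: \grk{καὶ ὁ πρῶτος ἐσάλπισεν}.
\end{document}
```

Because \foreignlanguage switches hyphenation and font encoding only for its argument, the surrounding English text is left untouched.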

Typography

Typefaces

The default Greek typeface provided by Babel is unsuitable for classical texts. Some might like it, but its odd-looking kappa and its rho with a curved descender make it rather different from the typefaces of most books. The package “psgreek” provides five Greek fonts, three of which are free and the other two shareware. To use any of them, one must include the package by declaring \usepackage[FONTNAME]{psgreek} in one’s document preamble, replacing FONTNAME with the name of the typeface one desires, which can be found in the psgreek documentation. I myself prefer the one called “oxonia”, the traditional typeface of the Classical Greek publications of Oxford University Press. However, it lacks kerning and is suitable mostly for typesetting individual words or the occasional short sentence. An author who is going to place entire Greek paragraphs in his document may wish to use the two shareware fonts provided, which do contain kerning.

Hyphenation

Once the user has a fine typeface installed, he will certainly want to move to the next step in making classical text look attractive: hyphenation. For Greek, one must first acquire the hyphenation file GRAhyph3, install it, and then edit the file $LOCALTEXMF/tex/generic/babel/language.dat and add the line greek grahyph3.tex.

The hyphenation file for Latin usually ships with LaTeX, but to make the engine aware of it one may have to edit the file $LOCALTEXMF/tex/generic/babel/language.dat and add the line latin lahyph.tex.

Note that the user should edit a local copy of language.dat, not the one originally installed by his LaTeX distribution. Once the line is added, one must update the format files. On at least the teTeX distribution, this is done with: $ fmtutil --all.
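One way to check that the patterns were actually picked up after rebuilding the formats is to ask TeX where it would break a word. The following is a small sketch, not part of the procedure above: it uses the standard \showhyphens command, and the sample word is an arbitrary choice. The candidate break points are written to the log file rather than to the typeset page.

```latex
\documentclass{article}
\usepackage[latin,english]{babel}

\begin{document}
% Switch to Latin so that the Latin patterns (if loaded) are in effect.
\selectlanguage{latin}

% Writes the word with its permitted hyphenation points to the .log file.
\showhyphens{consuetudinem}

Textus brevis.
\end{document}
```

If the log shows the word without any break points, or Babel warns that no hyphenation patterns were preloaded for the language, the language.dat entry or the format rebuild probably did not take effect. The same check can be tried for Greek by selecting polutonikogreek instead.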

Everything would have been much simpler if I had merely read Using a new language with Babel at the TeX Frequently Asked Questions.

Typesetting Indo-European Symbols

Many of these solutions for typesetting Proto-Indo-European reconstructions and the transcription of ancient IE languages require the “TIPA” package. This is a pain to install, but the effort is well worth it. A guide to Windows installation has been provided by Yoshinari Fujino. Once installed, one must enter \usepackage{tipa} in the document preamble.

TIPA has one drawback: it does not include a proper italic typeface, and in an italic environment (e.g. \textit{}) it simply slants the standard typeface. Consequently, an individual TIPA command used in an already-italicized string produces a character that looks out of place. In the example below, note how the first letter of r̥reǵeti and the last letter of swesorm̥ clash with the surrounding characters.

The solution is to keep all transcriptions in TIPA’s slanted style instead of using LaTeX's usual italic environment. Place the following code in your document preamble: \newcommand\ie[1]{\textipa{\slshape{#1}}}. Now one can achieve a harmonious appearance by giving all transcriptions to the \ie{} command, e.g. \ie{*\r*nnewos}.
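As a rough illustration, the declaration and the example just given can be placed in a minimal document as follows. This is only a sketch that assumes TIPA is installed; it adds nothing beyond the \ie definition and the sample form from the paragraph above.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{tipa}

% Transcriptions are set in TIPA's slanted shape rather than \textit.
\newcommand\ie[1]{\textipa{\slshape{#1}}}

\begin{document}
The reconstructed form \ie{*\r*nnewos} is set entirely in the slanted TIPA face,
so the ringed consonant matches the letters around it.
\end{document}
```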

Whatever slight roughness may now appear is a result of some PDF readers having problems displaying bitmap fonts. However, the document will print with a perfect appearance.

Proto-Indo-European Reconstructions

PIE Vowels

The long vowels ā (Unicode character u+0101 latin small letter a with macron), ē (u+0113 latin small letter e with macron), ī (u+012B latin small letter i with macron), ō (u+014D latin small letter o with macron), and ū (u+016B latin small letter u with macron) can be entered directly.

ə (u+0259 latin small letter schwa) may be entered directly. Using the Unicode character and adding superscripts, schwa primum can be entered as ə$^1$ and schwa secundum as ə$^2$.

Glottalic stops

The usual transcription for postulated glottalic stops must be typeset simply by typing the consonant followed by apostrophe, e.g. t' p' k'. I hope to find a solution by which the consonant can be typed followed by Unicode character u+02BC modifier letter apostrophe, which is intended to mark glottalisation, but this does not yet work.

Palatal velars

For the transcription k̑ and g̑ (k and g with inverted breve) one must issue the commands \textroundcap{k} and \textroundcap{g} respectively.

For the transcription with acute accent over the consonant, ḱ (u+1E31 latin small letter k with acute) and ǵ (u+01F5 latin small letter g with acute) may be entered directly.

Labiovelars

For transcription with velar followed by superscript w, one can directly enter kʷ or gʷ (k or g plus the Unicode character u+02B7 modifier letter small w).

For transcription with velar followed by superscript u̯ (u with inverted breve below), one may enter k\super{\textsubarch{u}} or k\super{\u*u}, and g\super{\textsubarch{u}} or g\super{\u*u}.

Syllabic resonants

For the short syllabic resonants l̥, r̥, m̥, or n̥ (l, r, m, and n with ring below) one must enter \r*l, \r*r, \r*m, or \r*n respectively.

For the long syllabic resonants l̥̄, r̥̄, m̥̄, or n̥̄ (l, r, m, and n with macron and ring below) one must enter \r*{\=l}, \r*{\=r}, \r*{\=m}, or \r*{\=n} respectively.

Glides

The transcription of the palatal glide i̯ (i with inverted breve below) can be entered as \u*i or \textsubarch{i}. The labial glide u̯ (u with inverted breve below) can be entered as \u*u or \textsubarch{u}.

Laryngeals

The first laryngeal h₁ can be typeset as h$_{1}$, the second h₂ by h$_{2}$, and the third h₃ by h$_{3}$.

The transcription of an unspecified laryngeal resonant with H̥ (capital H with ring below) must be entered as \textsubring{\*H}.
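The commands in this section can be collected into a short test document. The sketch below is illustrative only: it reuses the \ie command from the Typography section and the TIPA commands exactly as listed above, and it assumes that TIPA is installed and that these commands behave as described when wrapped in \ie.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{tipa}

% Slanted TIPA wrapper, as defined in the Typography section.
\newcommand\ie[1]{\textipa{\slshape{#1}}}

\begin{document}
Palatal velars: \ie{\textroundcap{k}}, \ie{\textroundcap{g}}

Labiovelars: \ie{k\super{\textsubarch{u}}}, \ie{g\super{\textsubarch{u}}}

Short syllabic resonants: \ie{\r*l}, \ie{\r*r}, \ie{\r*m}, \ie{\r*n}

Long syllabic resonants: \ie{\r*{\=l}}, \ie{\r*{\=r}}, \ie{\r*{\=m}}, \ie{\r*{\=n}}

Glides: \ie{\textsubarch{i}}, \ie{\textsubarch{u}}

Laryngeals: h$_{1}$, h$_{2}$, h$_{3}$, \ie{\textsubring{\*H}}
\end{document}
```

Compiling such a sheet once is a quick way to confirm that the installed TIPA fonts render every symbol before using them in a longer text.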

Indo-European Languages

Avestan

Long vowels with macrons, ə (u+0259 latin small letter schwa), and ą (u+0105 latin small letter a with ogonek) may be typed directly (see PIE Vowels).

ə̄ (schwa with macron) must be typed with the command: \={ə}

Delta must be entered as $\delta$.

Theta θ (u+03b8 greek small letter theta) may be entered directly, but a variant form of the letter can be had with $\vartheta$.

ġ (u+0121 latin small letter g with dot above) may be typed directly.

m̨ (m with ogonek) is typed as \textpolhook{m}.

ń (u+0144 latin small letter n with acute), ŋ (u+014b latin small letter eng) and ṯ (u+1e6f latin small letter t with line below) may be typed directly.

X with acute accent may be typed as \'{x}.

Hittite

For the Hittite laryngeal one can directly enter ḫ (u+1E2B latin small letter h with breve below).

Old Church Slavonic

Using the utf8x input method, most of the common symbols used in Old Church Slavonic transliteration, such as č, š, and ž (c, s, and z with caron), and ĭ and ŭ (i and u with breve) for the yers can be entered directly.

It is common for scholars to use the Latin alphabet but retain the Cyrillic symbols for the front yer and back yer (as in Schmalstieg’s Introduction to Old Church Slavic and Lunt’s Old Church Slavonic Grammar). For this the TIPA package provides the commands \textsoftsign and \texthardsign for the soft yer and hard yer respectively.

For information on typesetting Old Church Slavonic in the Cyrillic alphabet, one may consult my guide Typesetting Old Church Slavonic With LaTeX.

Sanskrit

Long vowels with macrons may be typed directly (see PIE Vowels).

ḍ (u+1E0D latin small letter d with dot below), ṃ (u+1E43 latin small letter m with dot below), ñ (u+00F1 latin small letter n with tilde), ḷ (u+1E37 latin small letter l with dot below), ṛ (u+1E5B latin small letter r with dot below), and ṣ (u+1E63 latin small letter s with dot below) may be entered directly.