Tuesday, August 22, 2017

What's wrong with the obvious analysis of waš bih واش بيه?

In the Algerian Arabic dialect I grew up speaking, "what's wrong with him?" is waš bi-h? واش بيه. (Further west, in Oran and in Morocco, it's the more classical sounding ma-leh? ما له.) When the object is a pronoun, as it usually is, waš bi-h? can readily be understood as waš "what?" and bi-, the form of "with" (otherwise b) used before pronominal suffixes (in this case, -h "him"). But substitute a noun, and this historically correct interpretation becomes synchronically untenable: we say waš bi jedd-ek? "what's wrong with you (lit. your grandfather)?" واش بي جدّك, whereas "with your grandfather" would be b-jedd-ek بجدّك. Nor can we cleft it with the relative/focus marker lli اللي: *waš lli bi jedd-ek? (*"what is it that's wrong with you?") is totally ungrammatical, while *waš lli b-jedd-ek? does not have the appropriate meaning (in fact, out of context, it makes no sense at all). This tells us that, whatever its origins, waš bi- can no longer be analysed as "what?" plus a preposition "with"; it has to be treated as a morphosyntactic unit in its own right. In particular, this bi- cannot be used to form an adverbial - it only forms a predicate - so it can hardly be treated as a preposition. Nevertheless, it continues to take the prepositional pronominal suffixes: "what's wrong with me?" is waš bi-yya? واش بيَّ, not *waš bi-ni.

The independent unity of waš bi-? becomes a lot clearer when the construction is borrowed into another language, as has happened in the Berber variety of Tamezret in southern Tunisia. The stories recorded there by Hans Stumme shortly before 1900 are a bit hard to read, but provide probably the single most extensive published corpus of material in Tunisian Berber. These texts furnish many examples of aš bi-, although Tamezret Berber neither has aš to mean "what?" (that would be matta) nor bi- to mean "with" (that would be s). Many of these look just like Arabic: aš bi-k "what's wrong with you? (m.)" (p. 14, l. 11); aš bi-kum "what's wrong with you (pl.)?" (p. 27, l. 26), aš bi-h "what's wrong with him?" (p. 14, l. 3); and even, with a noun, aš bi iryazen "what's wrong with men?" (p. 41, l. 5). But the similarity is somewhat deceptive; in some cases, this construction takes Berber rather than Arabic pronominal suffixes, as illustrated by aš bi-ṯ "what's wrong with her?" (p. 25, l. 21) instead of Arabic aš bi-ha, and by aš bi-m "what's wrong with you (f.)?" (p. 10, l. 5). Unfortunately, the texts do not provide a complete paradigm - further documentation is needed! But judging by the available data, all cells but 3m.sg. match well with the Berber paradigm:

          Algerian Arabic   Tamezret    Tamezret, direct objects   Tamezret, objects of prepositions
2m.sg.    waš bi-k          aš bi-k     -ak                        -k
2f.sg.    waš bi-k          aš bi-m     -am                        -m
2m.pl.    waš bi-kum        aš bi-kum   -akum / -awem              -kum
3m.sg.    waš bi-h          aš bi-h     -ṯ                         -s
3f.sg.    waš bi-ha         aš bi-ṯ     -ṯ                         -s

The 2m.sg. and 2m.pl. suffixes are quasi-identical between Tamezret Berber and Arabic, facilitating the borrowing; for the second person, neither language clearly distinguishes direct object forms from objects of prepositions. The third person, however, distinguishes the two in Berber but not in Arabic, and 3f.sg. suggests that the object in this construction is treated as a direct object, not as the object of a preposition, contrary to the situation seen for Arabic. This fits Berber-internal patterns; throughout Berber, nonverbal predicators (Aikhenvald's "semi-verbs") typically take the direct object pronominal paradigm, and assign absolutive case to their arguments. The perfect agreement of the most frequently used cells in this paradigm between Arabic and Berber surely facilitated the borrowing of this item, but within Berber the paradigm got rebuilt on a largely Berber basis. In morphology, etymology is not destiny!

Saturday, July 22, 2017

Can slur avoidance be taken too far?

I was rather flabbergasted to read an otherwise good post on Language Log seriously suggesting that racial slurs are so painful they should be coyly asterisked out even in careful lexicographical explanations of why they should not be used. I do not pretend to any expertise on the impact of the specific slur in question there - I'd prefer to hear more black linguists' comments on that - but much of the argument they make is general, not specific:
If you take the standard linguistic analysis of slurs, though, the word’s power does not come from mere taboo [...] The word literally has as part of its semantic content an expression of racial hate, and its history has made that content unavoidably salient. It is that content, and that history, that gives this word (and other slurs) its power over and above other taboo expressions. It is for this reason that the word is literally unutterable for many people, and why we (who are white [...]) avoid it here.

Yes, even here on Language Log. There seems to be an unfortunate attitude — even among those whose views on slurs are otherwise similar to our own — that we as linguists are somehow exceptions to the facts surrounding slurs discussed in this post. In Geoffrey Nunberg’s otherwise commendable post on July 13, for example, he continues to mention the slur (quite abundantly), despite acknowledging the hurt it can cause. We think this is a mistake. We are not special; our community includes members of oppressed groups (though not nearly enough of them), and the rest of us ought to respect and show courtesy to them.

Anglo culture has a long tradition of scrupulously avoiding certain words in order to respect and show courtesy towards, in particular, women and children - people who were thought of as weaker and more emotional than adult men, and in need of their protection. Politeness is great, but if you treat people like they're made of glass, you're not only patronizing them, you're excluding them - you're implying that there are some discussions they just can't handle. (The term "white knight" comes to mind.)

This is ironic in general - people who have made it through serious oppression tend to be pretty tough, though everyone has their vulnerabilities. It's doubly ironic within an academic context, in that a core academic skill is the ability to confront and (if necessary) rebut personally threatening arguments without getting carried away by one's immediate reactions. In order to master North African historical linguistics, I've had to read works by colonial generals and OAS terrorists who fought and killed to subjugate my ancestors, and whose attitudes often colour their work; most people working on marginalized languages will have had similar experiences. If I can deal with that, do you really expect me to be incapacitated by some professor's cautious mention of, say, the word "raghead"? Words certainly can hurt, but slurs have enough power as they stand without adding the power of absolute taboo on top.

Wednesday, June 14, 2017

Sticks and stones and value inversion

In the Western world over the past few years, freedom of speech seems to be becoming a matter not just of human rights but of cultural identity. While many threats to this principle are routinely ignored, some are singled out for a great deal of attention. In particular, legions of columnists stand firm against the efforts of ungrateful foreigners and degenerate youths – suicide bombers and special snowflakes – to undermine our liberal traditions. Such whiners, apparently, have forgotten one of the first proverbs an Anglo child learns:
Sticks and stones may break my bones, but words can never hurt me.
I am not aware of any close equivalent of this saying among the other cultures I know best; in that sense, it can indeed be seen as reflecting a distinctive characteristic of Anglo culture, if not necessarily Western culture. However, this saying is also much more recent than you might expect; its first appearance in print seems to be in mid-19th century America. This timing coincides well with the rise of classical liberalism, and its form seems to be a deliberate inversion of earlier proverbs, reversing the original meaning. Medieval Englishmen used to say precisely the opposite:
Malicious tongues, though they have no bones,
Are sharper than swords, sturdier than stones. (Skelton, Against Venemous Tongues, ed. Dyce, i. 134)
Tongue breaketh bone, all if the tongue himself have none. (Wyclif, Works, ed. Arnold, ii. 44)
Rhyming proverbs to the same effect can be found all over northern Africa, in Algerian Arabic (of Oran):
əḷḷahumma ḍəṛba bdəmmha wala kəlma bsəmmha.
اللهم ضربة بدمها ولا كلمة بسمها.
O God, better a blow drawing blood than a word dripping poison.
or Kabyle Berber:
Ljerḥ yeqqaz iḥellu, yir awal yeqqaz irennu.
A wound digs deep and heals, a bad word digs deep and keeps digging.
or even Zarma (Songhay), down in Niger:
Yaaji me ga daray, amma sanni futo me si daray.
A lance’s edge goes away, but a bad word’s edge doesn’t go away.
Both contrasting sets of proverbs are, of course, gross exaggerations, false if taken literally. Words certainly can hurt, and wounds can certainly hurt worse than words; no one in any culture is likely to deny either fact. What they represent in each case is a cultural consensus – robust, but subject to change – on how seriously to take the hurt that words can cause, and by implication on how sharp a response is justified.

The most compelling by far of the classical liberal arguments for freedom of speech is that it deepens our understanding of the truth. An opinion left unchallenged starts to seem like intuitive common sense; it becomes something people adhere to out of habit rather than out of conviction. Freedom of speech, ironically, is a case in point. Ideally, we are exposed to the arguments for its value at some point, in university if not in high school. But long before that, we’ve already had a weak version of it inculcated by elements of everyday life, like “Sticks and stones...” Such an early exposure makes it seem like universal common sense, like something that should be instinctively obvious to everyone. It’s not; even Englishmen assumed the opposite not too long ago. If you want everyone to believe it, you have to be able to make a good argument for it – and to do that effectively, you need to understand something of where they’re coming from.

How does this compare with cultures you've lived in? Are you familiar with any other proverbs on the relative harmfulness of words and weapons?


Sunday, May 21, 2017

Latin-speaking Muslims in medieval Africa

In the Middle Ages as today, Christians and Jews regularly called God "Allah" when speaking Arabic, just as Muslims did. It is perhaps not as well known that the converse was often also true: from a very early period, North African Muslims called God "Deus" when speaking Latin. This can clearly be seen on the 8th century Umayyad coins of Tunisia and Spain, which include statements such as:
  • Non deus nisi Deus solus - There is no god but God alone (لا إله إلا الله)
  • Deus magnus omnium creator - God is great, the creator of all things (الله أكبر خالق كل شيء)

I had always assumed it more or less stopped there, as Latin-speaking Muslims shifted to Arabic. But in the towns of southern Tunisia, the former Bilad ul-Jarid, Latin was still being spoken well into the 12th century. In his recent book La langue berbère au Maghreb médiéval (p. 313), Mohamed Meouak uncovers a short recorded example of spoken African Latin from between these two periods, which otherwise seems to have escaped notice so far.

In his 11th-century Ibadi history, Abu Zakariyya al-Warjlani gives a brief biography of the Rustamid governor Abu Ubayda Abd al-Hamid al-Jannawni (d. 826), who lived in the Nafusa Mountains of northwestern Libya. Before assuming his position, this future governor swore an oath:

Bi-llaahi (by God) in Arabic, and bar diyuu in town-language (بالحضرية), and abiikyush in Berber, I shall entrust the Muslims' affairs only to a person who says: "I am only a weak being, I am only a weak being."
In al-Shammakhi's later retelling, the languages are named as Arabic, Ajami, and Berber (بلغة العرب وبلغة العجم وبلغة البربر). As Mohamed Meouak correctly though hesitantly notes, diyuu must be Deo; he leaves bar uninterpreted, but it is equally clearly Latin per, making the expression an exact translation of Arabic bi-llaahi. The Berber form is probably somewhat miscopied, but seems to include the medieval Berber word for God, Yuc / Yakuc.

The earliest Romance text is the Old French part of the Oaths of Strasbourg, made in 842 and opening Pro Deo amur... "for the love of God". The Ibadi phrase recorded above curiously echoes this, although it predates it by several decades.

Saturday, May 13, 2017


In English, "re-" is a moderately productive derivational prefix - reboot, remake, redo... In French, though, it seems more like an incorporated adverb - it's practically the main way you say "again": remanger (eat again), repleuvoir (rain again), redire (say again) are all perfectly normal. It's even possible to say ravoir (have again), although it seems to be less and less frequent.

Now a number of states are expressed in French with the verb avoir "to have" plus a bare noun: avoir faim "to be hungry", avoir peur "to be afraid", avoir besoin "to need" etc. Given the preceding remarks, you would naturally assume that "need again" should be ravoir besoin - and, indeed, it is possible to find this expression at least in 19th century texts, e.g.:

Rentré dans le journalisme, cet esprit capable, mais aride et paresseux va ravoir besoin de moi. (1856)

It appears to be very little used in the 20th century, though. Instead we hear avoir rebesoin: j'ai rebesoin de ça, I need this again. The only Italian I asked said this is quite impossible in Italian, but even there ho ribisogno gets a few dozen hits on Google (though for all I know they're all second-language speakers).

The fact that besoin appears bare, with no article, already makes it unusual among nouns. The ability to take the prefix re- makes it stand out even more: you certainly can't say *revoiture (*car again) or *repain (*bread again). So maybe it's not a noun any more? It certainly looks like it's become kind of verby; but what can we label it? In an Australian context, the uninflected element of a complex verb would be called a preverb, but apart from suggesting the wrong order of elements, this term has way too many different meanings depending on which part of the world you're in. Perhaps, as in Japanese, we could call besoin a verbal noun - although that, too, is all too potentially ambiguous. Any better terminological suggestions are welcome.

Wednesday, May 03, 2017

Translating the comedy of diglossia

Even in English, you can sometimes get a laugh by inappropriately mixing high and low registers - gangster slang in blank verse*, or discussions of medieval agriculture in Cockney. In a diglossic language such as Arabic, this trick is both easier and more effective. An excellent example is provided by Message to the Parliamentarians, a recent political satire by Algerian YouTuber Anes Tina. Apart from its primary themes - the offensive meaninglessness of Algerian elections and the hopelessness of abstention - this video is a spectacular send-up of the bombastic period dramas that occupy such a significant role in Arab TV schedules. In such shows, often set in the pre-Islamic period, the characters speak intimidatingly classical Arabic, case endings and all, as a matter of course. (This is, incidentally, somewhat anachronistic: no attempt is ever made to reproduce even the substantial inter-tribal dialectal variation that early Arabic grammarians explicitly tell us about, much less the substandard non-Bedouin varieties they preferred to ignore.) In this video, the characters speak accordingly - but with carefully planted intrusions from the world of everyday speech. Consider the opening scene:
lam yabqaa lanaa 'illaa Hallun waHid.
wamaa huwa lHall?
falnaktub irrisaalah.
wayHak! ma lladhii taf3aluh?
uktub: wilaayatu banuu qaynuqaa3, firraabi3i min shubaaTi l'awwal. risaalatun min ibnu taynah, annaaTiqu rrasmiyy walmukallifu l'i3laamiyy liqabiilati shsha3b, 'ilaa lfaasiq alfaajir almunaafiqi lla3iin addaa3ir alxabiithu ssaaqiTu lmaariq azzindiiq quzaaHah 'amiiru qabiilati lxarlamaaniyyiin. ammaa ba3d. la3natu l'aalihati 3alaykum. la3natu l3uzzaa wa hubal 3alaa Hamlatikumu l'intikhaabiyya. waHaqqi 'aalihati lwaay waay, waHaqqi 'aalihati shshiita, naHnu lan nuHallibakum fil'intikhaabaat. lan nashtarii sila3akum, walan natazawwaja minkum, walla3natu 3alaykum 'ilaa yawmi ddiin.
hal bu3itha lmiisaaJ? hal hum 'on liin?
Sabran ya bna taynah, fa'inna la koneksyoona thaqiila.
tabban littiSaalaati quraysh. faltuxbirnii idhaa xarajati lvüü firrisaalah.
How on earth are we to translate this? The "letter" itself is not so hard - the inflated rhetoric is easy to render into olde English, and the occasional dialectal intrusions (bolded) correspond pretty well to English slang, producing a roughly similar effect. The allusions to pre-Islamic religion and early Islamic history are unlikely to make much sense to most English speakers, but corresponding names with appropriate resonances can be substituted without much damage; thus:

Only one solution remains before us.
What, then, is the solution?
Let us... write the letter.
Perdition! What are you doing?
Write! Province of Idumaea, on the 4th of Zivim. A letter from Taenaus, the official spokesman and media officer of the tribe of The People, to the evildoer [cymbals!], the sinner [!], the accursed hypocrite [!], the debauched [!], the malignant degraded renegade [!], the miscreant Cuzahah, prince of the tribe of the Charlamentarians. May the gods' curses be upon you. May the curses of Ashtoreth and Moloch be upon your electoral campaign. By the gods of canned applause, and the gods of brown-nosing, we shall not suck up to you in the elections. We shall not buy your goods, nor shall we marry from among you. And curses be upon you until the Day of Judgement.
But what can an English speaker possibly do to reproduce the comic effect of the dialogue that follows it?
Has the message been sent? Are they online?
Patience, O Taenaus, for the connection is slow.
Damnation unto Quraysh Telecom. Inform me when the message gets a view.
All the bolded words are from French except "slow"; but it would be a mistake to treat them as switches into French. Each of them is the normal, well-established way to refer to its referent in spoken Algerian Arabic. In daily conversations, the corresponding Standard Arabic synonyms (if known at all) would be used only by an insufferable pedant, or - more likely - as a joke. Conversely, in a school composition - almost the only context where the average Algerian child is expected to actually produce Standard Arabic - such terms would be strictly banned. No dialect of English that I know of has non-standard words for telecommunication technology (if it comes to that, I can't think of one offhand that has its own word for "slow" either.) The problem rears its head again soon after, as the protagonist attempts to buy a mobile phone in the marketplace. Suggestions are welcome, but it looks to me like this is one gag that simply can't be translated into English. Among their many other effects, it appears that sociolinguistic situations limit what kind of jokes you can make!
* I think John Cowan will have the link for this one?

Friday, April 14, 2017

Languages in 2117

Charlie Stross, a Scottish science fiction writer, recently posted some speculations on predictions for 2117 that touch rather heavily on the domain of linguistics. Linguists who like science fiction may want to consider commenting over there; he's got some good ideas, but some elements are clearly off. The basic conclusions are:
[B]y 2117, [t]here's [g]oing to be a decline in the number of languages spoken: the main world languages will be down to English, Mandarin, Spanish, and some dialect of Arabic (Arabic is highly fragmented), plus surviving secondary languages with large bodies of adherents (over a hundred million each: for example German, Russian, Japanese).

We're also going to see the widespread deployment of deep learning driven machine translation and, most importantly, near-real-time interpretation. There'll be less reason for a native speaker of an apex language to learn other tongues [...]

And the apex languages will have changed considerably [...]

I suspect that over the next century (assuming we don't lose our technological infrastructure) current mechanisms for writing will be supplanted by newer ones--e.g. the replacement of discrete mechanical keys on keyboards with multitouch keyboards and then with gestural/swipe interfaces, where each dictionary word is replaced by a directional ideogram swiped across a QWERTY keymap, until eventually the ideogram replaces the alphabetic word or is auto-replaced by a corresponding emoji.

So: gradual obsolescence of some grammatical forms, appearance of entire new writing systems, unforseen changes due to the vagaries of machine translation, assimilation of loan words from other cultures, and the 2117 equivalent of "don't drone me, bro" (new shorthand to describe stuff that has become the new normal).

What am I overlooking?

My immediate thoughts would be:
  • Actually, a lot of languages with less than 100 million speakers each will still be around 100 years from now. Even if the Netherlands decided overnight to stop teaching, broadcasting, or providing government services in Dutch - and it won't, quite the opposite - it would take more than 100 years for the language to die out. If anything, the fragmentation of mass media into social media already makes it easier to maintain small languages, and to the extent that e-learning becomes a thing, it will have similar effects. On the other hand, only a handful of Native North American or Australian Aboriginal languages seem likely to make it as far as 2117: right now most of them are already down to elderly speakers only, and revitalization efforts are not likely to succeed without a really drastic rethinking of the school system. This is because of grossly coercive educational policies inflicted on them decades earlier. Chinese educational policy has become significantly less tolerant of minority languages over the past few years, and if that trend continues, I suspect many currently viable languages of China are likely to be in a similar situation by 2117: not yet extinct, but reduced to the point that they seem doomed. More broadly, what to predict about language survival worldwide 100 years from now depends fundamentally on two factors: how compulsory education changes, and how much of the population ends up in big cities. The former, at least, is more than anything else about political decisions.
  • Adequate machine translation does seem likely - not good enough for contexts where precision counts, but easily sufficient for casual conversation or listening to speeches. I wouldn't expect this to have any really major effects on languages, but it might allow literal translations of new idiomatic expressions to spread faster between languages.
  • Emoji are basically discourse markers: they won't become ideograms, they'll become punctuation. If they really catch on, our descendants may be as puzzled by how we get by with just half a dozen punctuation marks as we are by how people used to read with no punctuation at all.
Finally, a line that's calculated to get a lot of linguists up in arms: "[L]anguages are vanishing, and to the extent that we can only reason about things we have words for, this may be a subtle but far-reaching loss." Obviously we can reason about things we don't have words for, and equally obviously not having words for them makes it more cumbersome to talk about them. But more to the point, even where languages are in rude health, words for certain things are vanishing at a rapid pace in them. Algerian Arabic isn't going anywhere, but the vocabulary it used to have for wild plants, for traditional farming technologies, for family relationships that are only relevant in a three-generation household? I don't even think most people my age know them, much less their grandchildren in 2117. Large written languages with sufficiently developed institutions can maintain such vocabulary precariously at the margins by having specialists use it - botanists, agricultural experts, historians, etc. Most languages can't.