Wednesday, November 6, 2013

In Another Head

The Happy Raccoon (a Russian? Lojbanist who goes by {gleki arxokuna}) pointed to Conlang Quest on Tumblr, which had just done a review of Lojban, following up on a half-dozen earlier reviews of conlangs. Since this is very like the project here and even covers much the same languages, I am going to be rude and comment on it more or less directly. I agree with much of what it says, but find some of the assumptions and expressions infelicitous at least.

 "My quest for the perfect constructed language." The writer's epigraph presupposes that all conlangs are for the same purpose (else how could there be a single perfect one). This either ignores the tradition of three (or so) types of conlangs or has a specific type raised above the rest as "the real conlangs." The latter is most likely, since, at least until recently, books that dealt with conlanging at all dealt with auxlangs or even replacement languages, "One Language for the World" as an old book (the first one I remember reading on the subject, so probably at least the early '50's) had it. And, indeed, the author says in his first essay

 "This blog shall serve as my review and quest to learn a second language. By “second language” I obviously mean: “Constructed Language” (aka conlang) as all natural languages manage to piss me off. Especially those (most Romance) with which the majority of nouns have an associated gender. (See French… how da fuq is evening masculine, but night is feminine?) I’m also looking for a logical conlang. Something that is a reasonable blend of brevity and unambiguity. In fact, brevity is secondary to ease of learning and communication."

 The second paragraph does suggest that he is interested in an engelang, particularly an unambiguous one, but this is apparently a secondary matter. It does, perhaps, explain why some of the languages he looks at are distinctly not auxlangs (in intention, however they may be used by their followers). The ease-of-learning criterion is also an interesting one for both auxlangs and engelangs intended for experiments -- and is one of the great imponderables in this game.

His next move is to cite one of the most misleading Whorf quotes: "Language shapes the way we think, and determines what we can think about." But the problems with that are for another time, since it does not play a role here.

The author's first review is of -- inevitably -- Esperanto. With more than a century of criticism, Esperanto's "problems" are well known and discussed, both presented and defended. From this heap the author chooses to focus on the spelling system, the declension of nouns, and syntactic ambiguity, in keeping with his declared interest.

The section on the alphabet and spelling suggests a lack of appreciation of the difference between phonetics and phonemics, a lack which becomes more apparent in later reviews. Fundamentally, he seems to want each letter of the (in this case) Esperanto alphabet to stand for exactly one IPA symbol/sound. He thus misses two important points: 1) within a language, "the same sound" (phoneme) may be pronounced in different ways in different environments, even by the same speaker, so no one IPA symbol or corresponding letter of the language's alphabet will serve as phonetically correct, and 2) whether something is a single sound or a string in a given language depends upon other factors in the language.
1) What sound should the letter r in Esperanto represent, for example, in the speech of a New Englander: retroflex or "missing"? Which occurs is predictable and totally regular, so having two symbols -- to match with the IPA -- is a waste. On the other hand, in a different language, the two sounds might need different letters, since the occurrence of the two sounds is not predictable but actually differentiates two different words.
2) Whether using c to represent the sound /ts/ is a proper choice depends on several factors. He points out that Eo. 'facila' sounds just like /fatsila/, but the point seems only to be that it should, therefore, be spelled that way. If there were an Eo. word "fatsila" that was distinct from "facila", this might be a case for arguing that the two should be spelled differently (especially if there were many such cases), but no such contrast seems to exist. The actual situation in Eo. seems to be that the single-symbol solution is a choice: there are affricates which form a group on their own, patterning neither like simple consonants nor like other consonant clusters nor like vowel-modified consonants. I don't know what considerations led to the chosen solution, but it is clearly as easily justified as any alternative (even if the practices of familiar languages are allowed to play a role in the decision).
The remarks about the complex of extra symbols (and the unmentioned underuse of the Latin alphabet) and about typewriter workarounds are just the standard stuff, with which everyone either agrees or gives the standard defenses.
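
The phonetics/phonemics point in 1) can be made concrete with a toy minimal-pair check. This is only a sketch under stated assumptions: the two "languages" and their word lists are invented for illustration (they are not Esperanto or any real language), and the function names are hypothetical. The idea is simply that two sounds call for separate letters only if swapping one for the other can turn one word into a different word.

```python
def minimal_pairs(lexicon, a, b):
    """Return pairs of words that differ only in having sound `a` where the other has `b`.

    `lexicon` is a collection of words, each word a tuple of sound symbols.
    """
    words = set(lexicon)
    pairs = []
    for word in sorted(words):
        for i, seg in enumerate(word):
            if seg == a:
                swapped = word[:i] + (b,) + word[i + 1:]
                if swapped in words:
                    pairs.append((word, swapped))
    return pairs


def need_separate_letters(lexicon, a, b):
    """Two sounds need distinct letters only if they distinguish some pair of words."""
    return bool(minimal_pairs(lexicon, a, b))


# Invented toy data (assumption): in LANG_A, "r" and "l" distinguish words.
LANG_A = {("r", "a", "t"), ("l", "a", "t"), ("t", "a", "r")}
# Invented toy data (assumption): in LANG_B, "t" and the flap never contrast;
# which one occurs is fixed by position, so one letter would do for both.
LANG_B = {("t", "a"), ("a", "ɾ", "a"), ("t", "i")}

print(need_separate_letters(LANG_A, "r", "l"))
print(need_separate_letters(LANG_B, "t", "ɾ"))
```

A real phonemic analysis weighs much more than minimal pairs (distribution, patterning with other sounds, and so on), so this shows only why "one letter per IPA symbol" is the wrong test.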

In talking about declensions, the author slips by saying that Eo. has a possessive case, presumably meaning an objective one. He does not note the advantages this has for freer word order, probably because he would not generally like free word order, which interferes with clarity (some say). The principled attack on plurals can be met in the usual ways: that not knowing whether there is one or several of a thing is practically at a different level from not knowing whether there are two or twenty, and that the differences can all be dealt with by using numbers or other quantity markers (but note that 2 on up still requires a plural in Eo.).

As for ambiguity, the author sees problems with homonyms, though these are very hard to avoid in human language (talk about all the possible words never seems to take non-combinatorial factors into account) and also even possibly desirable -- for jokes, if nothing else (the author thinks jokes are a problem, apparently, though the one about a giraffe always having a companion is pretty bad). The problem of syntactic ambiguity remains, and is, of course, central to the author's quest. The one example he cites is enough to make his case against virtually every language (though not exactly against Eo., since he gives only the English).

The review of Ido consists only of commenting on how it improves on Eo. as far as spelling (and the alphabet generally) goes, while still mixing phonetics and phonemics. He notes also that Ido avoids the sexism (mentioned by a commenter) of Eo., not forming all words with definitely feminine referents by adding "in" to the presumably masculine basic forms.

Now comes a brief interlude on a priori and a posteriori languages, which gets the facts essentially correct. What it does not mention (nor does it cover it later, when it is relevant) are cases like Logjam, which don't fit handily into either pattern. The vocabulary in these cases is based on that of natural languages but subjected to transformations which appear to be a priori (or a posteriori from something other than languages -- learning theory, say). These languages achieve something in the way of cultural neutrality by bringing in many languages (and making them unrecognizable) but claim some of the familiarity of a posteriori languages, since the familiar pieces are still there to aid language learning (this claim has never been substantiated -- or even tested, so far as I know).

Next, a brief review of Interlingua, not a language I know much about, nor, assuming this review is accurate, care to. The review makes a number of claims without giving any data to back them up (this is universal in these reviews), but they mostly seem plausible. The language is easy to learn (i.e., the vocabulary is European, as is the syntax) and it uses the Latin alphabet alone. The objections are that it is syntactically ambiguous (which follows from the above), that the spelling is not phonetic (or, apparently, even phonemic -- it is not clear), and that there are homonyms.

Back on familiar ground, a review of toki pona, which begins with the claim that no thought went into its construction because it is so thoroughly ambiguous (syntactically). Of course, this just means that the inventor did not think like the author here. It may have been (and there is some evidence for this) that the inventor spent some time thinking about testing the extent to which context could be trusted to deal with ambiguities and still present a clear message. In which case, while the result may be unsatisfying to the author, the remark is unjustified. But the examples (!) given are not about ambiguity at all, but about vagueness or generality. [rant follows] The fact that a word in one language has several different translations in another (here tp and E.) does not mean that the word is ambiguous in its own language. It may stand for a concept broader or more diverse than any of the other language's (partial) translations. Thus, the tp 'suno' presents a concept that centers on light or brightness and so gets particularized as "sun" or "day" or "shiny", as appropriate. If what is appropriate is not clear, then other words or features of context can be added or pointed to to get the message across. To be sure, there is probably a context in which 'suno pona' means "good light" (but not, by the way, "great sun" 'suno suli' nor "thanks, sun" 'suno o, pona') rather than "Good day!", but not, generally, as a greeting (and, if there were a greeting case, it would come to the same thing with the assumed interests of the interlocutors). [end] The one thing the author likes about tp is, alas, not in it at all. He takes a piece of the inventor's rhetoric and lays it on the language. tp is not inherently more honest than any other language (nor dishonest, cf. Newspeak). It is as easy to lie or obfuscate or euphemize in tp as in any other language.

And then aUI, a surprising choice, since it doesn't get much press. While I don't think tp began life as an auxlang, aUI is a mixed case. In form it is a Philosophical Language, usually thought of as a type of engelang. But, since the point of such a language is that every word should be defined by its form -- you can read the definition off if you understand how the word is constructed -- it is natural to think that everyone should use this language, as then all misunderstandings would be avoided, etc., etc. So, maybe even a replacement language rather than an auxlang.

The review likes the phonetic use of the alphabet in aUI, though it gets the number wrong -- there are 41 sounds, or at least 32; the nasalized vowels (numbers) are omitted. (He also doesn't comment on the proper aUI alphabet nor the supposed naturalness of each sound for the concept it represents, two fun weak points in the language.) The possibilities for ambiguity even in individual words are noted, but not explored (words are cited with their official meaning and said to be open to other interpretations, but none are actually suggested). There is the grouping problem, for example, where taking different clusters can lead to very different results and there is no systematic way to choose one pattern rather than another. The later problem of ordinary syntactic ambiguity isn't mentioned but can be taken as a given, I suppose, until we come to Logjam. The prior problem, that the supposed unchanging basic concepts either change or are much broader than is apparent and particularize in unspecified ways (apparently inconsistently, even -- but, since the factors involved are unspecified, it is hard to say), is not mentioned. But I don't think it is quite as bad as Marklar.

The origin story is taken seriously even though it began as a short story and became biographical only later (cf. some of Jesus' parables that became part of the story later). In any case, the obvious problem that the language is bound up with the Latin alphabet and decimal numbers and other features of human (indeed, Iowa) culture militates against the story being true.

So, while we are dealing with strange languages and conceptually precise ones, there comes a review of Ithkuil, probably the most difficult conlang with any traction at all. "Phonetic script" but far from the Latin alphabet (60 sounds!). Of course, with so many sounds it is hard to pronounce them all distinctively and correctly (the author has trouble outside a narrow range of English, apparently, from comments elsewhere), and the lack of devices that print the script or of an adequate Latin alphabet version makes it undesirable.

Precision easily obtained.  "Semantic ambiguity" dealt with but not, necessarily, solved.
It is not exactly clear what is meant by "semantic ambiguity" here. In some cases, it seems to be just referential ambiguity: does "the table" refer to this one or that one, both available in the context (cf. "Flash stowed up to Ming.  He struck him")? At other times it stands for other kinds of specificity: whether a person who fell was pushed or jumped or just lost support, for example. Ithkuil deals with the latter case in particular and many other (similar? in what way?) ones but does not present a general solution to whatever is taken to be the problem here. The various chairs case is famously a pragmatic problem, not semantic at all. The second does not seem to be ambiguity at all but appropriate specificity, again a pragmatic, not a semantic, issue. And neither has anything to do with homonymy, the earlier version of semantic ambiguity.

And so, not quite finally, to Lojban, a language which seems to come closest to the author's criteria as expressed in various places. He quite rightly notes that it comes from Loglan, but not so accurately says that Loglan is defunct. He claims Lojban has about 5k users, which is way high, but says correctly that this is a medium sized language (though it may be -- ignoring Eo. -- a fairly large one, even if it is less than 1k; claims about user size are very unreliable).

He likes the sound/letter correspondence, of course, even if he doesn't understand what it means (i.e., that it does not match with IPA).  And, of course, he likes the fact that Lojban avoids syntactic ambiguity.  He is surer of this than is strictly warranted, of course, for, while it is true that every grammatical sentence in official Lojban has a unique parse and thus is not subject to any of the problems of the sort he exemplifies in English, it is also true that a good many sentences actually presented as Lojban are either not grammatical or officially mean something quite different from what the speaker (and usually the hearer) take them to mean. At the informal level toward which the official grammar strives, to be as clear as the formulas of symbolic logic, the sentences of even official Lojban are defective, typically having problems finding the right ends of structures.  The best that can be said is that Lojban is amazingly good at this, for all that it could be better by being simpler.

The main complaint about Lojban is about its vocabulary, that it is unfamiliar. The claim that it is based on familiar languages and contains useful guides to familiar words is ignored (in fact, denied). This is probably realistic if not accurate; the relationships are so obscure and unreliable as to be useless for language learning (or no more helpful than totally casual mnemonics).

The other main complaint is that the grammar is unfamiliar.  Although the author seems to be an engineer, he does not seem familiar with either math or programming, either of which would make the grammar of Lojban more accessible (not that it is really that different from ordinary languages except for the oddity of the meanings -- rather than the grammar -- of various places).  The average Lojbanist is a computer geek, after all, rarely a humanist or even a social scientist.  That may also account for the lousy study materials (user manuals, anyone?).

And so to the last review, Ceqli (cheng'li, I gather), said to be derived from Loglan again, but not one that I know -- or at least remember. So I can't say much about it. About the comments, I note that "fowl" is not Latinate (picky, picky!) but that the use of more familiar forms does make part of Ceqli easier than Lojban, while requiring some other fussinesses later to get the clarity of compounds and structure which come naturally in Lojban. Which is ultimately simpler is just not clear.

The author's main objection seems to be to phonology/spelling again, focusing on the q for the velar nasal, which he claims he cannot distinguish from the same sound followed by a voiced velar stop. I am unsure what dialect he speaks or whether he just has not paid much attention to speech sounds, for all he talks about them. He has deleted the main remarks, but they and the retraction are not encouraging about his value when discussing phonology. (He says that q should be a digraph, going against his earlier one sound/one letter rule!)

Though he likes the familiarity of the forms of Ceqli words, he still manages not to understand them and attributes to Ceqli's creator (and Logjam's) the desire to keep everything unintelligible, rather than the well-documented goal of making things (analytically) clear.

Ceqli apparently does not do some job as well as Lojban, but just what that job is remains unclear from the brief comments:
 Ceqli does not address ambiguity in syntax as Lojban does. In Lojban, one can group a “bridi phrase” to separate it from a continuation of arguments to the previous bridi, but in Ceqli you can only group subject, verb, and object phrases on a single level.
This seems a major point, so either (as he says) the information about Ceqli is incomplete at a crucial point or the review has found a major flaw (whatever it is).

Finally, a summary, "Logical" Language Criteria.  I pass over the abuse of the word "logical" here, which not even the quotes can evade, to look at the actual cases.

1:1 match of letters and IPA. No: a 1:1 match of letters and phonemes of the language, and what those are is not something to be gotten by looking at the IPA. This muddle runs throughout these reviews. It is useful to have the match, though not essential: one cluster per sound would work as well, provided that cluster could not be generated in some other way. And phonemes may be more than one IPA character long if that works out best (the arguments for Pinyin q are pretty convincing within Chinese, though other solutions also work and could be -- have been -- worked into systems as well). (As noted earlier, c and ts don't contrast in Eo., so there is no problem there.) Note that the freedom of Lojban from ambiguity has virtually nothing to do with the 1:1 correlation, which is not a matter of grammar.

Syntax with one interpretation. Yes, important to have for some purposes (machine translation, say), useful to have in any case, provided it is not too costly in terms of other complexities. It is arguable that Lojban has gotten so complex that the value of ambiguity freedom has been overtaken, especially since it so often results in not saying what you mean (though perfectly grammatically) or missing grammaticality altogether, in either case while still being understood. If the quest is for a language with this feature, then something that simplified Lojban but kept its power would be worth pursuing. The examples given of failures of this feature are realistic but not completely convincing, since they fail to take into consideration features like juncture and prosody, but this does not totally undermine the general point.

Words have one distinct meaning. The combinatorics about the number of "words" is largely irrelevant, since there are countless factors (beyond the uselessness of "xwqc" as a word) that restrict the possibilities down from the assumed limit. "Semantic ambiguity" is used again for referential ambiguity and then for generality, and this -- whatever it finally is -- is said to be inevitable (true for generality: can't have a name for everything; less so for referential ambiguity). Lojban is surprisingly bad on referential ambiguity at the simplest -- anaphoric -- level: it can deal with it but requires complex and easily mistaken techniques.
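
To see how far the raw combinatorics can overshoot, here is a toy count. It is a sketch only: the eight-letter alphabet and the single allowed word shape (CVCV) are assumptions invented for illustration, not the inventory or phonotactics of any language reviewed above, and `shape` and `count_words` are hypothetical helper names.

```python
from itertools import product

# Assumed toy inventory: five consonants and three vowels.
CONSONANTS = "ptkms"
VOWELS = "aiu"
ALPHABET = CONSONANTS + VOWELS


def shape(word):
    """Map a word to its consonant/vowel skeleton, e.g. 'pamu' -> 'CVCV'."""
    return "".join("C" if ch in CONSONANTS else "V" for ch in word)


def count_words(length, allowed_shapes=None):
    """Count letter strings of the given length, optionally keeping only
    those whose skeleton is in `allowed_shapes`."""
    total = 0
    for letters in product(ALPHABET, repeat=length):
        if allowed_shapes is None or shape(letters) in allowed_shapes:
            total += 1
    return total


raw = count_words(4)                # every 4-letter string: 8**4
usable = count_words(4, {"CVCV"})   # only CVCV strings: 5*3*5*3
print(raw, usable)
```

Even this single, crude restriction removes most of the raw total, and real languages pile on many more (avoiding near-homophones, reserving shapes for grammatical classes, and so on), which is why the assumed upper limit says little about how many distinct words are actually available.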