Tuesday, August 4, 2009

Toki Pona -- a simple language

Toki pona (usually not capitalized) was created by Sonja Elen Kisa in 2001-2. It soon went public and now has a sizable (for a conlang) and international participant base. The vocabulary and some points of grammar have been revised over time, but the basic outline -- and most of the details -- are unchanged.

Kisa ('jan Sonja' in the community) has offered a number of goals for the language, all centered on the notion of simplicity:
  • a minimal language adequate for living
  • a language to clarify thinking by going back to basic ideas
  • a language to aid troubled thinking by dissolving complexity into simplicity
  • a controlled model for pidgin languages
  • a language to put a positive spin on life
  • a language appropriate for a simple society built more or less on a Daoist model
and probably several others along the same pattern.

The tools jan Sonja uses are a near minimal phonology, a vocabulary of about 120 words (the number and exact content have varied slightly over the years) plus proper names, and a grammar that takes very few lines to state completely.


Toki pona uses only the letters (and sounds) a, e, i, j (= y), k, l, m, n, o, p, s, t, u, w. Pronunciation is fairly free, so long as you don't encroach too far on another sound's territory. Thus, voiced variants of the stops often occur, as well as f for p, for example -- derived from the sources of the words.

The syllabic structure is (C)V(n), with the option of dropping the initial C available only in the first syllable of a word. If a syllable ends in n, the next syllable in that word cannot begin with n or m, but if the next syllable begins with p, the n is pronounced as m (though still written as n -- bad typing aside). Several of the hundred possible syllables (ninety with an initial consonant, ten without) are disallowed: wu(n), wo(n), ji(n) and ti(n).
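The syllable rules above are compact enough to check mechanically. What follows is only a rough sketch in Python, not an official tool: the function names split_syllables and spoken_form are invented for illustration, the consonant set assumes k belongs to the inventory (it turns up in words such as kulupu and ken), and wo is rejected along with wu, ji and ti, as the usual descriptions of the language have it.

```python
import re

CONSONANTS = "jklmnpstw"  # assumption: k is part of the inventory
VOWELS = "aeiou"
FORBIDDEN = ("wu", "wo", "ji", "ti")  # assumption: wo is excluded like wu

# One syllable: optional consonant, a vowel, and a final n that is only
# taken when no vowel follows (otherwise the n starts the next syllable).
SYLLABLE = re.compile(r"[jklmnpstw]?[aeiou](?:n(?![aeiou]))?")


def split_syllables(word):
    """Return the syllables of word, or None if it breaks the (C)V(n) rules."""
    word = word.lower()
    syllables = []
    pos = 0
    while pos < len(word):
        match = SYLLABLE.match(word, pos)
        if match is None:
            return None
        syllable = match.group()
        # Only the first syllable of a word may lack its initial consonant.
        if syllables and syllable[0] in VOWELS:
            return None
        # Disallowed consonant + vowel combinations.
        if syllable[:2] in FORBIDDEN:
            return None
        # After a syllable-final n, the next syllable may not begin with n or m.
        if syllables and syllables[-1].endswith("n") and syllable[0] in "nm":
            return None
        syllables.append(syllable)
        pos = match.end()
    return syllables or None


def spoken_form(word):
    """Written n before p is pronounced m; the spelling itself keeps the n."""
    return word.lower().replace("np", "mp")
```

For instance, split_syllables("tenpo") gives ["ten", "po"] and spoken_form("tenpo") gives "tempo", while split_syllables("wuwa") and split_syllables("kanma") both give None.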

Stress accent falls on the first syllable of each word (the case for names is less clear, but tends to agree).


The words of toki pona are invariant under all conditions. They are 1 to 3 syllables long. At this point, the most complete morphology would be to list the words, but I will pass on that for now. Some have seen a kind of vowel harmony in toki pona and examples are easy to find, but there are enough counterexamples to refute the claim. The extensive examples do seem to have affected name construction, though.

Names are not strictly words but are subject to the same phonological rules as words. Names are derived from names in their native language, as closely as possible given toki pona's limited phonology. Loosely, m picks up itself, as do w and j (y) (though j does tend to pick up ordinary English j as well); n picks up the other nasals (and m if followed by p); s picks up all the other tongue-tip continuants except r and l; l picks up itself and many rs; and each stop picks up everything left at its point of articulation. Then the proscribed syllables tend to come into play: Timothy becomes Simosi, for example. The rs are particularly tricky here: tapped, trilled and dental ones go to l, uvular and glottal ones go to k (so Paki for Paris), and the rest end up as w (so Mewika for America). In general, however, people get to construct their own names, so these rules are not rigorously enforced. Consonant clusters in the original language are handled in one of two ways (or a mixture, of course): spelling out all the elements as separate syllables (Elumutu for Helmut -- notice the vowel harmony) or picking the dominant elements while keeping the syllabic pattern (Kipo for Clifford). As noted, there is a tendency to place accents on names where they would fall in the original, but this is balanced by the language's habit of first-syllable stress -- no definitive solution yet. Since most discussion on the list is, as usual, about the language, the pattern of quote-names is prevalent -- quotations attached to the relevant words: nimi, word, and kulupu [nimi], phrase.


Every sentence of toki pona is a minor variant of the pattern

w/g/sentence la w/g li w/g e w/g Prep w/g

where 'w/g' stands for 'word or group' and a group is a string of words built up from the left: word + word, then group + word or word pi group or group pi group. The final word here can be a name, which may be several names long.
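The grouping rule can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions: the names bracket and group_phrase are made up here, the input is a single w/g written as space-separated words, and the sketch knows nothing about names or about the right-hand groups that prepositions and a few other words can introduce.

```python
def bracket(words):
    """Group a pi-free run of words from the left: ((w1 w2) w3) ..."""
    group = words[0]
    for word in words[1:]:
        group = f"({group} {word})"
    return group


def group_phrase(phrase):
    """Show the grouping of one w/g as a bracketed string."""
    # Cut the phrase into runs of words separated by pi.
    segments, current = [], []
    for token in phrase.split():
        if token == "pi":
            segments.append(current)
            current = []
        else:
            current.append(token)
    segments.append(current)
    if any(not segment for segment in segments):
        raise ValueError("pi must stand between two words or groups")
    # Each run is grouped on its own, then the runs are joined from the left.
    result = bracket(segments[0])
    for segment in segments[1:]:
        result = f"({result} pi {bracket(segment)})"
    return result
```

So group_phrase("tomo telo nasa") comes out as "((tomo telo) nasa)", while group_phrase("tomo pi telo nasa") comes out as "(tomo pi (telo nasa))": the pi is what lets telo nasa act as a unit.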

The la and what goes before it need not occur, nor need the Prep and all that follows it nor the e and what follows it. The word/group after la may be preceded by o (optative), followed by o (vocative) or replaced by o (imperative) (the vocative strictly can go before any sentence after the la slot; if the sentence already has an o, the two os collapse to one). If the only thing before li and after la or the beginning of the sentence is either mi (I) or sina (you), the li can be dropped.

The e and all that follows it may be repeated (with a new w/g, of course) any number of times, as may the li and all that follows it (even if the first li was lost to a personal pronoun). So may Prep and all that follows it. The w/gs in the pre-li position and after Prep may be repeated joined by en or anu.
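As a rough illustration of the sentence pattern, the sketch below cuts a sentence into its la, li and e slots. It is deliberately partial and every name in it (split_on, split_sentence, the dictionary keys) is an assumption made for the example: it does not look for the Prep slot, o, en or anu, and it cannot tell a dropped li after mi or sina from a subject like mi mute when a later li is present.

```python
def split_on(tokens, marker):
    """Split a list of tokens at every occurrence of marker."""
    parts, current = [], []
    for token in tokens:
        if token == marker:
            parts.append(current)
            current = []
        else:
            current.append(token)
    parts.append(current)
    return parts


def split_sentence(sentence):
    """Cut a sentence into la context, subject, and li/e predicate slots."""
    tokens = sentence.split()
    # Everything before the last la is context; the rest is the main clause.
    *context, main = split_on(tokens, "la")
    if "li" in main:
        cut = main.index("li")
        subject, rest = main[:cut], main[cut + 1:]
    elif main[:1] in (["mi"], ["sina"]):
        # A bare mi or sina as subject lets the li drop out.
        subject, rest = main[:1], main[1:]
    else:
        raise ValueError("no li found and the subject is not bare mi or sina")
    predicates = []
    for chunk in split_on(rest, "li"):
        # The first piece is the li group; each e introduces one more w/g.
        head, *objects = split_on(chunk, "e")
        predicates.append({
            "head": " ".join(head),
            "objects": [" ".join(obj) for obj in objects],
        })
    return {
        "context": [" ".join(part) for part in context],
        "subject": " ".join(subject),
        "predicates": predicates,
    }
```

Given "tenpo pini la jan li moku e kili e telo", this returns the context tenpo pini, the subject jan, and one predicate with head moku and the two objects kili and telo. Given "mi tawa tomo mi", the subject is mi and the whole of tawa tomo mi stays together as the li group.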

The occurrence of Prep suggests that there are various word classes in toki pona, while the use of 'group' suggests the opposite. The truth is somewhere between: all of the words of toki pona, with the probable exception of those mentioned in the sentence formula and a few other possible exceptions, can, in principle, be used in any role: as a word in any slot or as the basic word or the added word in any group in any slot. But in practice, most words occur in relatively restricted positions, as suggested by their translations. The freest ranging are the primitive prepositions: tawa, to, toward, and lon, at. There are a few others that can stand in the Prep slot, but they do not affect other places as much as these, which can affect the structure of groups by introducing groups on the right not introduced by pi, in effect bringing the whole Prep structure into the group. So, the whole Prep phrase tawa tomo mi, to my house, grouped (tawa (tomo mi)), can appear after li as in mi tawa tomo mi, I go home, with the same grouping, or even in a modal form, mi wile tawa tomo mi, I want to go home, which groups (mi (wile (tawa (tomo mi)))). This change is so far seen only in the li group, but may be possible in others as well. As just exemplified, the li group also allows a few other words: wile, must, ken, can, kama, come, and maybe pini, finish, open, start, awen, continue, to introduce a whole li expression after them as a right group. In groups other than li, nanpa, number, followed by digits also introduces a right group (the string of digits functions as a unit). There are some cases where ala, not, seems to bind closely with the preceding word to form a right group.


The small number of words, and the variety of roles each plays, means that the meaning of individual words must be very broad, even diffuse. When we try to pin these meanings down in English (say), we have to use a variety of words, depending on the context -- "what makes sense." This should not blind us to the fact that, in toki pona, each of these is a unity, with a meaning we may not be able to put into a few words, but which is simple to the speakers of the language.

Once that difficulty is over, the semantics presents few problems, developing much like an SVO/NA European language, once the special grouping problems are taken into account. Of course, as in any language, the exact relation being indicated by a modified-modifier bond may take some winkling out as will the effects of a particular form when it might be equally one sort of notion or something else. Not that these problems are novel, of course. Three items do seem to be peculiar to this language (though probably not unique).

Toki pona has only one deictic pronoun, ni, and one anaphoric, ona. As a result, back references (and forward ones) can be somewhat opaque. Various devices have been used to surmount this problem (genderizing ona by adding meli, female, or mije, male, or attaching ni to a relevant descriptive word). But, in general, in keeping with the simplicity theme, the solution seems to be (partial) "repetition is also anaphora." External deixis is rare in texts so far, but in the real world, pointing and such locutions as ni poka, that near, and ni weka, that far, might be put into service.

Toki pona has almost no provision for subordinate clauses as such. Most such are handled by separate sentences. In particular, presentations of someone's thoughts or utterances are set out as separate sentences. In print, the difference between direct quotation and paraphrase is marked by quotation marks, but in spoken form there is no difference; both are introduced by such phrases as toki e ni:, says that, or pilin e ni:, thinks that. The difference, and the resulting differences in pronoun reference, have to be worked out by context -- a common occurrence in toki pona.

The one case where subordinate clauses -- indeed, sentences -- are permitted is before la, which introduces a condition in the sentence. For the most part, such conditions are various qualifiers on the straight claim: tenpo pini la, past, and other temporal locators, tenpo suno kama la, tomorrow, verifiers like ken la, maybe, or mi la, in my opinion, rhetorical flourishes like kin la, moreover, or ante la, on the other hand (the flourish taso, but, does not require la), and attitudinals like pona la, fortunately. But sentences in this slot are genuine antecedents for conditional sentences, la serving as the 'only if' arrow. Other than position, there are no further marks of conditionality, and no distinction, then, between contrary-to-fact and other conditionals -- "context will decide." The potential for iterated subordinate sentences has not been realized and seems unlikely given the ethos of the language. But some people do repeat la phrases, e.g., ken la tenpo kama la, although this is not officially approved.

Discussed Problems

Aside from the usual "How do you say?" questions, which usually get swift answers, although a few remain, e.g. "left" and "right", there have been few topics of ongoing concern. The chief (maybe the only) one has been the problem of big numbers, which, in this case, means numbers larger than two (or maybe five). Toki pona has only two number words, wan, one, and tu, two. Larger numbers -- when not relegated to mute, many -- are expressed additively: tu wan, three, tu tu, four, and so on. The use of luka, hand, arm, for five is common but officially condemned. Many solutions have been proposed (other means of constructing new numbers to represent multiplication as well as addition, place notation in a trinary number system -- ala, not, doing for zero, adding more number words), but none has been decisively accepted. In writing, the temporary solution has been to take normal decimal number strings as names, but this cannot be carried over to oral use. The official position is that large numbers are not needed for the simple life that toki pona serves. But the pressure to date things and pay bills keeps intruding.
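To see why the additive scheme strains under real-world numbers, here is a small Python sketch of it. The function names additive_number and number_value and the use_luka switch are invented for the illustration; only the word values wan = 1, tu = 2 and the unofficial luka = 5 come from the description above.

```python
VALUES = {"wan": 1, "tu": 2, "luka": 5}  # luka is common but not official


def additive_number(n, use_luka=False):
    """Spell a positive whole number additively, largest words first."""
    if n < 1:
        raise ValueError("the additive scheme only covers 1 and up")
    words = []
    if use_luka:
        words += ["luka"] * (n // 5)
        n %= 5
    words += ["tu"] * (n // 2)
    words += ["wan"] * (n % 2)
    return " ".join(words)


def number_value(phrase):
    """Add up the values of the number words in a phrase."""
    return sum(VALUES[word] for word in phrase.split())
```

Here additive_number(3) is "tu wan" and additive_number(4) is "tu tu", as in the text, and additive_number(7, use_luka=True) is "luka tu". A year such as 2009, though, takes 1005 words without luka and still 403 with it, which is the dating and bill-paying problem in miniature.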


Toki pona is a fun language: it is easy to learn and to become fairly competent in within a short time (some say a day, some a week); it takes somewhat longer to feel comfortable in it and to manipulate the very loose meanings of its words -- and probably even longer to understand other people's manipulations regularly. It gives rise to amusing expressions almost automatically: soweli li lili, the critter is small (or the critters are few). It has a surprisingly large range of practical use, maybe not philosophy (but translations of Dao De Jing are often interesting, even insightful) nor rocket science, but everyday life. Up to a point, that point being just where numbers come in, as noted above -- and, in today's world, that point is fairly early. Perhaps it fares better as a guide away from complexity (including numbers) and to a simpler life.

As for its intended purposes, it gets a mixed score. Here are the negative notes, to be weighed against the language's charm and the possible indifference to particular goals.

It is probably not minimal, for all that it is small. NSM gets by with only 70 or so words, though with a broader range of sentence types. One can easily imagine reducing the phonology further (doing away with m, for example, or reducing the vowel inventory to i-a-u) but not much. NSM again claims to have a dodge around conditional sentences, and some fairly easy tricks would surely do for most cases ("Imagine that ... . In that case ... ."). The complexity of the words could also be reduced -- dropping three-syllabled ones, for example.

It fails as a model for pidgin languages precisely because it doesn't allow one to tend to one's pidgin. Business is about numbers in countless ways and so the lack of such numbers is a block that every real pidgin overcomes somehow (a look at how might be useful here).

As for putting a positive spin, it has to be noted that the limited vocabulary has only one word for good but two for bad, a word for disaster but none for success, one for dead but none for alive, war but no peace. Of course these concepts can be expressed, but only by non-simple forms, phrases, not words.

Nor is it very Daoist or other simple life pattern as generally understood. It has, for starters, a word for money and for shop, two of the major marks of non-simplicity, usually. On the other hand, it has no words for some tools for the simple life, a digging stick or a hoe, say (from the Daoist side). Perhaps this simple life is to be lived within the context of the modern world -- in but not of -- and so the basic tools are a computer and a fast internet connection. But then the numbers come up again (http://1.bp.blogspot.com/_RJni9o2nQno/Snwct4jraEI/AAAAAAAAR7g/msVt3zb982M/s1600-h/cut+all+ties....jpg.?).

As for clarifying ideas by restating them in simple terms, the toki-ponists have demonstrated considerable ingenuity in expressing fairly complex terms with this vocabulary, but whether this is really reduction rather than cover is difficult to say. Nor is it at all clear that the vocabulary given is up to the task when we come to more complex problems -- emotions and personal relationships, say. The vocabulary seems to be an idiosyncratic selection, not based on any kind of scientific study -- unlike NSM or even the Swadesh list, though these are designed for different purposes.


Official website: tokipona.kisa.ca
List: groups.yahoo.com/groups/tokipona
Twitter: tokilili.shoutem.com
Community: community.livejournal.com/tokipona
Textbook: bknight0.myweb.uga.edu/toki-pona

1 comment:

  1. A comment about pidgins. A real pidgin has a specific function: To get things done. The resulting language is a kind of "best we can" compromise. If the speakers of a pidgin could instantly learn another language, they would, so toki pona is something quite different.

    One way that this becomes obvious is with the number system. In every pidgin and creole language I've seen, numbers are borrowed wholesale from another language--often intact. In the cases where they're not, "sensical" compounds are constructed (maybe "wanten" instead of "eleven"). If tp were like those languages, "tri", "for", "faif", etc. would follow.