Journal of the Simplified Spelling Society, 1988/3 pp4-10
A Singaporean Corpus of Misspellings:
Analysis and Implications.

Adam Brown.

Dr. Adam Brown has researched into many areas of phonetics, and is especially interested in pronunciation models for foreign learners. The present corpus was collected while he was at the National University of Singapore (1982-85), and analysed in the Language Studies Unit of Aston University (1985-88). He is presently in the Department of Language Education of the University of Malaya, Kuala Lumpur.


The purpose of this paper is to present an analysis of a corpus of 1,392 misspellings by 360 fifteen-year-old Singaporeans. This is preceded by a discussion of the many analytical problems involved in such an analysis. In particular, it is noted that phonological explanations of misspelling phenomena have often been overlooked, and that non-native speakers have greater difficulties than native speakers in spelling English, owing to underdifferentiation of the phonological system. Implications for language teaching and spelling reform are discussed.


It is a common attitude among native speakers of English that the English language belongs to us. For example, the paradigm of English language teaching has long seemed to be one of 'us' (native speakers) teaching 'our' language to 'them' (non-native speakers). In this way, English language teaching around the world has been likened to the export of any other commodity or service. We native speakers export the language as an income-earner and vehicle for Western culture.

However, in recent years, people's attitudes have changed. The English language is no longer seen as the property of native speakers, but as something which is learnt and used by large numbers of people around the world, and is thus a part of their lives just as much as of ours. It has been estimated (Strevens, 1982) that there are around 300 million native speakers of English, but that nowadays these are outnumbered by the more than 375 million non-native speakers. Such estimates must necessarily be approximate, but it is clear that non-native speakers are in the majority.

There are also significant differences in the use made of English in non-native situations. The main distinction is between situations where English is a second language (ESL), and those where it is a foreign language (EFL). In ESL situations, English has some official status, e.g. in government, schools, by its use in the media. Fiji, Ghana, Singapore and Uganda are examples of ESL countries. In EFL situations, however, English is generally learnt only for international communication, and its use within the country is small. Most of the nations of the world fall in this category. The United Nations, for example, has 150 members, of which all but 33 are EFL (Moag, 1982). (This is a simplified picture of the situation. For example, in some situations, definition of the term native language becomes difficult. In Singapore, always referred to as an ESL country, there are many people who speak no language other than English.)

In short then, there are nowadays more non-native speakers of English than native. Problems of English spelling confronting non-native learners ought thus to be investigated in parallel to those of native English children learning the system.

Problems of analysis.

Several problems arise in the analysis of misspellings. A distinction must first be drawn between those misspellings which writers consistently make, and those which they only make on isolated occasions. In the first case, the writer either (i) does not know the correct spelling of the word, or (ii) is very unsure between alternative possibilities, or (iii) is convinced that the word is spelt in some way other than its correct form. In the second case, however, the writer does in fact know the correct spelling of the word, but for reasons of inattention, fatigue, pressure of time, etc., on a particular occasion fails to spell the word correctly; if we draw his attention to the misspelling, he will therefore be able to supply the correct form immediately and without doubt. The former are thus consistent errors of competence, while the latter are momentary errors of performance. The term slips of the pen is used for the latter kind (Hotopf, 1980), on analogy with the term slips of the tongue for the corresponding phenomenon in the spoken medium. There does not seem to be any established term for the former category; I shall use Wing & Baddeley's (1980) term convention errors.

However, it is often impossible to distinguish slips from convention errors, given the written material as the only source of data. Since I had no opportunity to check with the writers in the analysis of the corpus in this paper, I do not distinguish between slips and convention errors, but use the term misspelling to subsume both.

It is a well known phenomenon in studies of second language acquisition that students will avoid using items which they are not sure of. The same is true in studies of misspellings. Sterling (1983:355) points out that a student who is unsure, for example, of the number of <p>s, <n>s and <s>s in the word happiness may avoid the problem altogether by substituting the synonym joy, which is far simpler to spell. Given the written work as the sole source of data, there is no way of knowing if this has happened. The frequency of errors involving doubled consonants in a corpus where a student has employed such an avoidance strategy will therefore give a false picture of the extent of the problem.

In corpora of misspellings, certain examples may be misspelt in the same incorrect way on more than one occasion. This may be taken as a clear indication that the misspelling is a convention error rather than a slip. However, it is not clear on what principle an analyst should base his calculations. There seem to be three possibilities. He may (i) count the number of different kinds of misspellings in the data, or (ii) count the number of instances of misspellings, or (iii) somehow weight the calculation so that those misspellings which occur more than once are assigned greater importance than those which occur only once. That is, it seems sensible to distinguish between misspelling-types and misspelling-tokens, although how this may best be taken into account in a calculation of errors is not obvious. It is clear that calculations based solely on misspelling-tokens may lead to biassed statements of tendencies; Yannakoudakis & Fawthrop (1983a:91) admit that their figure for errors in 10-letter words (calculated by token) is deceptive, in that one subject misspelt monitoring as *monitering 47 times in their corpus.
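The three counting principles can be made concrete with a minimal sketch. The data below are hypothetical (only the *monitering example echoes the case cited above), and the logarithmic weighting used for option (iii) is merely one possible assumption, not a method proposed in the literature discussed here.

```python
from collections import Counter
import math

def summarise(misspellings):
    """Summarise a list of (misspelt, target) pairs, one pair per occurrence."""
    counts = Counter(misspellings)
    tokens = sum(counts.values())   # (ii) every instance counts
    types = len(counts)             # (i) each distinct misspelling counts once
    # (iii) assumed weighting: repeats add importance, but only logarithmically
    weighted = sum(1 + math.log(n) for n in counts.values())
    return {"tokens": tokens, "types": types, "weighted": weighted}

# Hypothetical corpus: one writer repeats *monitering 47 times.
sample = [("monitering", "monitoring")] * 47 + [
    ("eldery", "elderly"),
    ("serveral", "several"),
]
result = summarise(sample)
print(result)
```

On such data the token count is dominated by a single writer's habit, while the type count ignores repetition altogether; a weighted figure falls between the two.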

For reasons such as the above, too great importance should not be assigned to quantitative analyses of the frequency of particular kinds of error in a corpus of data, even though the quantity of such errors contributes greatly to the stigmatisation of poor spellers. Qualitative analyses, which concentrate instead on the nature of the errors rather than their relative frequencies, are in many ways more insightful as indications of writers' problems.

The analysis which the investigator performs on the corpus of data may be pitched at different linguistic levels. Various methods of analysis have been used in the literature, the choice of a particular analysis being determined largely by the analyst's purpose.

An analysis at the surface graphological level was used by Lecours (1966) in his study of the diary of Lee Harvey Oswald. Four categories are used:

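(i) addition of a letter, e.g. *serveral (several),
(ii) deletion of a letter, e.g. *eldery (elderly),
(iii) substitution of one letter for another, e.g. *mignight (midnight),
(iv) transposition of adjacent letters, e.g. *presenec (presence).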

"Nearly all of the few hundred erroneous words found in the diary, several of which contain more than one misspelling (e.g. *foriengress for foreigners), can be classified under these headings." (Lecours, 1966:221)
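A surface graphological classification of this kind is mechanical enough to be sketched in a few lines. The following is an illustration only, not Lecours' procedure: the function name and the labels 'addition', 'deletion', 'substitution' and 'transposition' are assumptions made for this sketch, and only misspellings that differ from the target by a single operation are handled.

```python
def classify_surface_error(misspelt, target):
    """Label a single-operation misspelling at the surface (letter) level."""
    if misspelt == target:
        return "correct"
    # Length of the shared prefix.
    i = 0
    while i < len(misspelt) and i < len(target) and misspelt[i] == target[i]:
        i += 1
    # Length of the shared suffix, not overlapping the prefix.
    j = 0
    while (j < len(misspelt) - i and j < len(target) - i
           and misspelt[len(misspelt) - 1 - j] == target[len(target) - 1 - j]):
        j += 1
    m_mid = misspelt[i:len(misspelt) - j]   # letters that differ in the misspelling
    t_mid = target[i:len(target) - j]       # letters that differ in the target
    if len(m_mid) == 1 and len(t_mid) == 0:
        return "addition"
    if len(m_mid) == 0 and len(t_mid) == 1:
        return "deletion"
    if len(m_mid) == 1 and len(t_mid) == 1:
        return "substitution"
    if len(m_mid) == 2 and m_mid == t_mid[::-1]:
        return "transposition"
    return "other"   # more than one operation, e.g. *foriengress

for wrong, right in [("serveral", "several"), ("eldery", "elderly"),
                     ("mignight", "midnight"), ("presenec", "presence")]:
    print(wrong, right, classify_surface_error(wrong, right))
```

Such a routine labels what happened to the letters; it makes no claim about why the writer produced them.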

Since the only conceivable examples which could not be discussed under the above four categories would be grossly incongruous misspellings (e.g. the present corpus (Siew, 1984) contains *slnight for snake), it is not surprising that these four categories handle virtually all examples. However, to say that an analytical system is descriptively adequate (i.e. that "nearly all ... erroneous words ... can be classified" somehow according to this system) does not necessarily imply that it is at all explanatory (i.e. that it explains the causes of the errors, or that the errors should be classified this way). Two cases are sufficient to illustrate this limitation.

Firstly, Lecours (1966:224) analyses the misspelling *scolls for scolds as an example of substitution: 'a letter is erroneously repeated, but ... the faulty doublet takes the place of another component of the involved sequence'. On a purely surface graphological level, this is a descriptively adequate analysis; the <d> is replaced by an <l>, and the preceding letter is also an <l>. However, it fails to capture the seemingly obvious observation that the /d/ of a final /ldz/ consonant cluster is often lost in connected speech (Temperley, 1983). That is, for many speakers the /d/ of a word like holds is often elided, making it homophonous with the word holes. Such an articulatory analysis may explain the absence of a <d> in *scolls.

The second illustration concerns Lecours' examples *promisis (promises) and *expensis (expenses). These would seem to be clear examples of the same phenomenon, namely the plural suffix being spelt <-is> instead of the correct <-es>. This substitution has a natural explanation, in that this suffix is pronounced /ιz/, and the vowel phoneme /ι/ is conceptually associated with the grapheme <i>. However, Lecours assigns them different analyses; *promisis is called a type I error, since it creates a pair of identical letters (i.e. there is an <i> earlier in the word which is implicitly considered to be an interference factor), whereas *expensis is a type II error, destroying a pair of identical letters (i.e. there is an <e> earlier in the word). A surface graphological analysis which ignores such obvious morphophonological explanations is thus restricted in its usefulness, but may be of importance in certain fields, notably in the devising of spelling-checking devices for word-processors (Yannakoudakis & Fawthrop, 1983b).

Other writers have used analyses at different levels. Wing & Baddeley's (1980) study of university entrance examination scripts investigated, among other factors, the importance of the position of the error within the word, and of the word within the sentence, and of the line within the script. They concluded that errors are most common word-medially, rather than -initially or -finally, and that the position of the word within the sentence and of the line within the script is not statistically significant. Levels of general fatigue do not therefore seem to affect the incidence of misspellings.

Sterling's (1983) work includes an analysis of the role of various factors in the spelling of inflected words, among them morphological structure, syllable structure, and other features of phonology. In terms of phonology, he notes (1983:359) that certain errors such as *probally and *samwiches "are not incorrect spellings of the correct sounds but rather correct spellings of the incorrect sounds" (by "incorrect sounds" is meant that the subject relies on a colloquial or regional pronunciation rather than a more standard or deliberate articulation). This neat formulation of the cause of these errors is not without its problems, however, in that it implies that English orthography corresponds to the correct spellings of the correct sounds. This is patently not the case, as witnessed by the many-to-one and one-to-many relationship between English graphemes and phonemes, and by the fact that English spelling does not represent any particular accent of English better than the rest.

Similar phonological considerations are appealed to by Ibrahim (1977) and G. Abbott (1979). However, there is an important difference, namely that these works deal with non-native speakers (writers) of English. When foreigners' problems are under examination, an extra category of misspelling becomes apparent, namely those errors which reflect the writer's phonology of English, which contains interference features from the writer's native language phonology. For example, Ibrahim (1977:208) points out that English has two separate phonemes /p/ and /b/ while Arabic has only one (/b/). Misspellings involving substitution of <b> for <p> (e.g. *Jaban, *bombous) as well as hypercorrections (e.g. *compination, *distripution) are common in his Jordanian corpus. Such misspellings, which one would not expect from native English speakers, occur in addition to those caused by the lack of a close graphemic-phonemic fit in English, which one would expect from native speakers.

Four hypotheses concerning misspellings by non-native speakers were investigated by Tesdell (1987), with groups of Arabic, Chinese, Malay and Spanish speakers attending EFL courses at Iowa State University. His conclusions are as follows. Firstly, non-native speakers make more errors than native speakers; results ranged from 1.13% word error rate for the Malay speakers to 2.55% for the Arabic speakers, compared with the 1.1% found for native speakers by Chédru & Geschwind (1972). "Second, non-native speakers at this proficiency level make more habitual errors than slips [although no indication is given how the two are distinguished]. Third, there may be no significant difference in error percentage between non-Roman [Arabic and Chinese] and Roman [Malay and Spanish] alphabet language speakers" (Tesdell, 1987:83). Finally, Wing & Baddeley's (1980) finding that native speaker misspellings occur most frequently word-medially was replicated with these non-native speaker groups.

E. Abbott (1976), following Haas (1970), uses an analytical system pitched entirely at the phonological level. Misspellings are analysed in terms of the graphemic-phonemic correspondence between the correct written form, the RP phonemic transcription of the intended word, and the incorrect written form. Misspellings are then classified according to the relationship between (i) the pronunciation of the intended word and (ii) a plausible pronunciation of the misspelling. For example, the misspellings *cot and *throt (for caught and throat) are analysed as follows:

Correct written form:        caught    throat
RP phonemic transcription:   /kɔt/     /θrəʋt/
Misspelt form:               *cot      *throt


Misspellings can thus be categorised as substitutions, omissions, insertions and transpositions of the graphemic representation of phonemes (cf. Lecours' surface graphemic system discussed above). *cot and *throt are therefore substitutions of representations of /ɒ/ for /ɔ/ and /əʋ/ respectively (assuming pronunciations of /kɒt/ and /θrɒt/ for the misspelt forms).

E. Abbott (1976) stresses that the graphemic-phonemic relationships can be used as a system for classifying types of misspelling, but that the subsequent explanation of the causes of misspellings may be found at other non-phonological levels. One situation where this system leads to counter-intuitive classifications is in examples such as *striper, *liking (stripper, licking). Since misspellings are categorised by reference to a plausible pronunciation of the misspelt form, these examples are both analysed as substitutions of an /aι/ representation (/straιpə, laιkιŋ/) for an /ι/ representation (/strιpə, lιkιŋ/). However, the error has clearly been caused solely at the graphemic level, by failure to double the <p>, and to use <ck> instead of <k>, after the short /ι/ vowel.

The potential importance of phonological factors in explaining misspellings has been underestimated by some writers. Lecours (1966:223) found that 13% of all errors involved purely phonological or lexical factors. However, since his analysis avoids plausible phonological explanations for certain examples (e.g. see *scolls, *promisis, *expensis discussed above), this figure may be questioned; he calls it "a relatively small proportion", and considers phonological factors to be only "a reinforcing element" (p.237) rather than the root cause of many misspellings.

From the above discussion, it should be clear that there are many possible ways of analysing misspellings, just as there are many different reasons for wanting to analyse them. The investigator should therefore select his analytical system to match his purpose. A surface graphological analysis, although criticised above as failing to be explanatory of the causes of misspellings, nevertheless is appropriate for someone devising an automatic spelling checker. However, any analysis which purports to be explanatory should be pitched at as many levels as are necessary, since spellers' errors do not lie at only one linguistic level. Rather, misspellings "are intimately connected with a number of representations, structures and processes involved in writing and spelling" (Sterling, 1983:364).

Even so, it is not always possible to categorise with certainty the cause of a misspelling. E. Abbott (1976:126) notes that, in the preliminary analysis of her Ugandan data,
"the following had been classed as spelling errors:

a *fructured jaw (fractured)
*tear-gus was used (tear-gas)

the following as grammatical (morphological) errors:

they *drunk the water (drank)
they *begun buying books (began)

and the following as lexical errors:

the car *crushed into the wall (crashed)
dressed in *rugs (rags)

In some cases the substitution of <u> for <a> has 'produced' a form which, although inappropriate in the context, is actually another English word, and in other cases the substitution has produced a 'non-word', but this might be merely fortuitous".

If a speller in the present (Siew, 1984) corpus writes *grapped for grabbed, this may be analysed as a case of phoneme confusion (of the sound /p/ and its voiced counterpart /b/), or of grapheme confusion (of the letter-shapes <p> and <b>). Similarly, the example *your for yours may represent a phonological omission of final /z/, or may manifest a grammatical confusion. The misspelling *principle (for principal) may be considered a matter of phonology or of lexis. The use of analogy with other observed errors may not always help to disambiguate the cause; further examples of all the above competing causes may be found in the corpus.

The corpus.

The present corpus was collected by Siew Sook Yee (1984). It consists of 1,392 misspelling-tokens of 870 types, made by 360 fifteen-year-old Chinese Singaporeans in classwork essays. The corpus has been added to the collection of misspelling corpora compiled by Mitton (1985); it is available in computer-readable form from the Oxford University Computing Service, Text Archive No.643. If we define idiosyncrasies as features which do not clearly correlate with other features of the language-producing process, then the corpus contains much in the way of idiosyncratic data. And, as I have just pointed out above, many examples admit of more than one explanation. The following analysis therefore presents those misspelling types which occur with sufficient regularity for them to be considered as general categories; these are then of use to language teachers, spelling reformers and other language experts.

The occurrence figures given below can be taken as rough indications of the relative importance of the different misspelling categories. It should be clear, though, that misspelt words may contain more than one instance of misspelling. For instance, the example *serouding (surrounding) in the present corpus contains three errors: (i) wrong graphemic representation of the unstressed schwa vowel, (ii) failure to double the <r>, and (iii) omission (probably phonemic in origin) of <n>.

1. Phonemic conflations.

I have elsewhere (Brown, 1986, 1988) described the phonemic system typical of Singaporean English. It is sufficient here to note that many of the phonemic vowel and consonant distinctions of RP and other native accents of English are conflated (technically known as underdifferentiation).

In general, consonant phonemes are represented more regularly than vowels in English spelling. For this reason, consonant conflations can be analysed in the data with greater confidence than vowels.

The main consonant conflations in the corpus are as follows:
/t, d/    *intented (intended)
/p, b/    *blank (plank)
/f, v/    *grief (grieve)
/t, θ/    *Baltazar (Balthazar)
/s, z/    *noice (noise)
/l, r/    *breeze (breeze)
/s, ʃ/    *finised (finished)
/m, n/    *noon (moon)

The main vowel conflations are as follows:
/ɛ, æ/    *demage (damage)
/i, ι/    *leaving (living)
/ɔ, ɒ/    *boll (ball)
/ʌ, ɒ/    *botton (button)
/ι, ə/    *accept (except)
/æ, ʌ/    *crashed (crushed)
/əʋ, u/   *stoove (stove)
/əʋ, ɔ/   *deport (depot)

With regard to E. Abbott's (1976) Ugandan data, G. Abbott (1979:174) notes that "the indeterminacy of pronunciation ... is echoed in the results of the analysis by what the researcher calls 'pairing'. Here is one example:

/æ/ for /ʌ/        /ʌ/ for /æ/
etc. (n=60)        etc. (n=65)

Not only do the mistakes occur 'in reverse', as it were; but the 'reverse' mistakes actually tend to balance the others numerically".

Similar 'pairing' is found in the Singaporean data.

/æ/ for /ɛ/        /ɛ/ for /æ/
etc. (n=28)
/i/ for /ι/        /ι/ for /i/
etc. (n=20)        etc. (n=7)

So, if a Singaporean does not distinguish /i/ and /ι/ as in seat and sit, then these two words are in effect homophones for that speaker, and he cannot use any phonological basis for deciding on the correct spelling for the intended word. Instead, the two spellings must be learnt individually by rote on the basis of semantic and syntactic features.
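The way in which conflation creates homophones can be illustrated with a small sketch. The transcriptions below are simplified toy forms using the symbols of this paper, the two merged pairs are taken from the vowel conflations listed above, and the function name is a hypothetical one chosen for illustration; the sketch is not a model of any individual speaker.

```python
def conflate(transcription, merged_pairs):
    """Rewrite a phoneme sequence so that both members of a merged pair share one symbol."""
    mapping = {second: first for first, second in merged_pairs}
    return tuple(mapping.get(phoneme, phoneme) for phoneme in transcription)

# Toy lexicon: each word as a tuple of phoneme symbols (simplified).
lexicon = {
    "seat": ("s", "i", "t"),
    "sit": ("s", "ι", "t"),
    "set": ("s", "ɛ", "t"),
    "sat": ("s", "æ", "t"),
}
merged = [("i", "ι"), ("ɛ", "æ")]   # two of the vowel conflations in the corpus

# Group words by their pronunciation after conflation.
groups = {}
for word, phonemes in lexicon.items():
    groups.setdefault(conflate(phonemes, merged), []).append(word)

homophone_sets = [sorted(words) for words in groups.values() if len(words) > 1]
print(homophone_sets)
```

Words that are distinct in RP fall together once the merged pairs are applied, so that pronunciation alone cannot tell the writer which spelling is intended.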

2. Homophones.

While on the subject of homophones, we may note that these are a problem for non-native speakers (as indeed for natives). The Singaporean corpus contains 40 occurrences of 13 types, including *strait (straight), *weather (whether), *principle (principal), *here (hear) and *soul (sole).

3. Suffixes.

It is appropriate, when discussing omission and insertion of consonant graphemes/phonemes, to treat the English suffix morphemes as a separate category. The English inflectional suffixes for past tense/past participle, and plurals/3rd person singular present tense verbs/possessives account for the majority of (although not, of course, all) cases of omission/insertion of word-final /t, d, s, z/. Morphemic and non-morphemic examples are given below:

Omission:
*fine (find)
*simile (smiled)
*crowed (crowded)
*strait (straits)
*respon (response)
*other (others)
*alway (always)
*banded (bandages)

Insertion:
*felt (fell)
*influenced (influence [noun])
*replied (reply [noun])
*importanted (important)
*sports (spot)
*sicks (sick)
*others (other)
*expensives (expensive)
*difficulties (difficult)

4. Other consonantal omissions & insertions.

Of all the other consonant phonemes of English, the problems created by three (/l, r, n/) far outweigh all the others.

/l/ and /r/ were often substituted for each other, as seen in section 1 above. This confusion is a common feature of Chinese learners of low proficiency. These two phonemes were also often omitted and inserted:
Omitted, word-medially (tokens/types): /l/ 10/10, e.g. *softy (softly); /r/ 13/12, e.g. *childen (children)
Omitted, word-finally (tokens/types): /l/ 8/6, e.g. *cancer (cancel); /r/ -
Inserted, word-medially (tokens/types): /l/ 15/12, e.g. *accordling (according); /r/ 33/6, e.g. *elephrant (elephant)
Inserted, word-finally (tokens/types): /l/ 7/6, e.g. *ful (fur); /r/ -
No examples are given for word-final /r/ since Singaporean English, like RP, is non-rhotic, i.e. syllable-finally /r/ is not pronounced in words like quarter. Altogether, there are 76 tokens of 61 types where <r> is inserted or omitted in potentially rhotic position, e.g. *surpport (support), *suprised (surprised), *merlingerer (malingerer), *Mecedes (Mercedes), *humoursexual (homosexual), *hazad (hazard).

Instances where <l> and <r> are involved, either as phonemic /l, r/ or graphemic <l, r> (or both), and whether as part of a substitution, transposition, omission or insertion, total 90 tokens of 65 types for <l>, and 193 tokens of 130 types for <r>.

Misspellings involving <n> (indeed all 3 nasals /m, n, ŋ/) were also very common.
Omitted, word-medially (tokens/types): /m/ 1/1, e.g. *remeber (remember); /n/ 24/19, e.g. *covert (convert); /ŋ/ 2/2, e.g. *back (bank)
Omitted, word-finally (tokens/types): /m/ 1/1, e.g. *for (form); /n/ 3/3, e.g. *garder (garden); /ŋ/ -
Inserted, word-medially (tokens/types): /m/ -; /n/ 16/11, e.g. *throwning (throwing); /ŋ/ 1/1, e.g. *linking (leaking)
Inserted, word-finally (tokens/types): /m/ -; /n/ 3/2, e.g. *own (owe); /ŋ/ -
The grand total of cases involving graphemic/phonemic <m, n> in any capacity was 23 tokens of 18 types for <m>, and 129 tokens of 90 types for <n> (including 12 tokens of 9 types where <n> represented /ŋ/).

An interesting parallel is seen with a specific spelling problem of native speakers discovered in some adults attending literacy courses, some schoolchildren and three neurological patients by Marcel (1980). "It concerns liquids (/l/ and /r/) when preceded in initial consonant clusters by a stop, and liquids and nasals (/m/ and /n/) when followed by a stop or fricative in terminal consonant clusters" (Marcel, 1980:376). Omissions, insertions and transpositions involving these consonants are taken to be caused by difficulties in phonetic segmentation, since it has been argued "that the consonant further from the vowel in 2-consonant clusters is the basic one and the one nearer the vowel is the affix" (1980:395-6). That is, the /n/ of men is more basic (and therefore more obviously present to the speaker/listener) than that of meant or mend (similarly the /l/ of coal vs. colt, cold).

A further complication is added, in that many Singaporeans do not pronounce syllable-final /l/ as a voiced alveolar lateral (Brown, 1986, and forthcoming). Instead, one of three things may happen:

(i) The alveolar tongue contact is lost, leaving a vocalic articulation of the [ ] type.

(ii) Where this follows a back vowel such as [ɔ, o, ʋ, u] the vocalic articulation may be absorbed by the vowel, giving rise to misspellings such as *aways (always), *pour (pool) and hypercorrections like *all (or), *scole (score), *wool (woo).

(iii) The articulation may be dropped following other vowels, leading to omissions as in *chid (child), *weath (wealth), and unnecessary insertions such as *oval (over), *fomel (former).

Mention should also be made in this section of the widespread use in Singaporean English of the glottal stop as a replacement for syllable-final /p, b, t, d, k, g/ and rarely /tʃ, dʒ/. Since the glottal stop is not a phoneme of English, and therefore has no regular written representation, confusion will arise in Singaporean spelling of final stops and affricates. The glottal stop is a plausible contributory factor in many of the examples of /p, b; t, d; k, g/ conflation, e.g. *jumb (jump), *graid (great), *beg (pack), as well as numerous others, e.g. *acept (accept), *suceed (succeed), *pinic (picnic), *basis (basics), *destrution (destruction), *bombarment (bombardment), *din't (didn't), *part (park), *blandly (blankly), *breadfast (breakfast), as well as possibly *speech (speed), *snapped (snatched).

5. Glides.

Several misspellings involved glides. Certain variation is possible in the phonological interpretation of these examples. I will treat them in 3 categories.

(i) The majority of glide misspellings involved the palatal glide transcribable as /i, ι,j/. In this category are included /ju/ examples such as *continised (continued), *unsual (usual), *suitation (situation), *humulate (humiliate). There were 35 tokens of 32 types in this category. Most involved omission of the glide, e.g. *curosity (curiosity), *victorous (victorious), *testmimonal (testimonial), *strenous (strenuous), *unniversity (university), although some involved insertion, e.g. *toliet (toilet), *disadventiage (disadvantage).

(ii) As a sub-category of the above phenomenon, 15 tokens of 12 types involved palatalisation, i.e. the process whereby palato-alveolar consonants /ʃ, ʒ, ʧ, ʤ/ are created, usually from historical sequences of alveolar consonants /s, z, t, d/ plus /i, ι, j/. For many words, the two pronunciations are alternatives, the sequence being considered perhaps more precise or archaic, e.g. Christian /krιstjən ~ krιstʃən/. All but 2 of these examples involved deletion of the palatalisation element, e.g. *christain/*christan (Christian), *efficently (efficiently), *Venetain (Venetian), *compassinate (compassionate), *solider (soldier). The 2 examples of insertion of palatalisation were *prision (prison) and *sprange (sprang). Some of the above examples could be analysed simply as graphemic transpositions; my point is that the effect of this is to destroy the phonological palatalisation element.

(iii) The final category involves the velar glide transcribable as /u, ʋ, w/. There were only 5 tokens of 5 types, mostly involving the word language as the target or as the interfering factor, e.g. *langesage (language), *languges (languages), *laguage (luggage).

6. Syllable structure.

a) Stressed vowel omission.

In a number of misspellings (16 tokens of 14 types), a (primarily or secondarily) stressed vowel was omitted. This was surprising, since stressed vowels are thought to play an important part in the way words are stored and retrieved from a speaker's memory. Certain of these errors can be explained in that stress is sometimes placed differently in Singaporean English from RP, e.g. *devloping (developing), *exmination (examination), *graunto (guarantor), where the stress is shifted or given far less prominence than in RP.

Other examples cannot be explained in this way, though: *alrm (alarm), *aplogise (apologise), *avarcious (avaricious), *brigde (brigade), *reprimded (reprimanded), *scond (second), *very (every).

b) Unstressed vowels.

A larger number of examples involved misspelling of unstressed vowels. One would expect this, because the commonest unstressed vowel, schwa, may be represented by a wide variety of graphemes. Such errors are also common, therefore, among native speakers.

57 tokens of 40 types contained a substitution of the wrong vowel grapheme, e.g. *appearence, *referance, *passangers, *pleasently, *handsame, *scenary, *discribed, *inspecter, *oppurtunity, *buffolo, *envolope (noun).

18 tokens of 14 types omitted the unstressed vowel grapheme. In many cases, this occurred where the unstressed vowel might well be lost (elided) in fluent connected speech; the misspelling thus represented an acute observation on the actual pronunciation of the word, e.g. *beautful, *displine, *monastry, *opptunity, *restraunt, *sevral. However, not all cases can be explained in this way, e.g. *civilzation, *everwhere, *interst, *vist (visit).

19 tokens of 7 types contained an <l> which, as a consequence of the above omission of an unstressed vowel grapheme, might be considered to have become syllabic. For example, buffaloes is misspelt as *buffloes. On analogy with shuffling, which may be thought of as containing 2 or 3 syllables, a 3-syllable interpretation of *buffloes is still possible. Further examples include *accidently, *happly and *luckly.

In total, a whole syllable (stressed or unstressed) was omitted in 56 tokens of 37 types. That is, a plausible pronunciation of the misspellings contained fewer syllables than the target word.

7. Doubled consonant graphemes.

The graphemic phenomenon of doubling consonants is a well-known difficulty for native speakers. It is thus not unexpected that the present corpus from Singaporean writers also contained many such errors. In 85 tokens of 40 types, a doubled consonant was made single. Many of these involved failure to double with suffixes, e.g. *begining, *grabed, *unforgetable, *normaly, while others involved different structures, e.g. *asuming, *atitudes, *corupt, *embarasing, *inteligent, *rabit.

An unnecessary doubling of consonants was found in 50 tokens of 34 types. Most involved suffixation, e.g. *arrangging, *hangged, *listenning, *bidding, *writting, *morallity. Others included *appologise, *bananna, *bannana and *fillial.

5 tokens in this category were misspellings of the word cigarette, as *cigerrette, *ciggarette and *ciggerette.
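
The distinction drawn in this section between doubled consonants made single and single consonants unnecessarily doubled can be illustrated with a short sketch. It is offered only as an illustration and is not the procedure used to analyse the corpus. The function names and labels (collapse_doubles, doubling_error, "undoubled", "overdoubled", "mixed") are hypothetical, and the sketch assumes that a pure doubling error is one where misspelling and target become identical once every doubled consonant letter is reduced to a single letter.

```python
import re

# A run of two or more identical consonant letters (assumption: a, e, i, o, u
# are the only vowel letters; y is treated as a consonant letter).
CONSONANT_RUN = re.compile(r"([b-df-hj-np-tv-z])\1+")


def collapse_doubles(word):
    """Reduce every run of a repeated consonant letter to a single letter."""
    return CONSONANT_RUN.sub(r"\1", word.lower())


def doubling_error(misspelling, target):
    """Classify a misspelling that differs from its target only in doubling.

    Returns "undoubled" if a doubled consonant was made single,
    "overdoubled" if a consonant was unnecessarily doubled,
    "mixed" if both happened and the lengths cancel out,
    and None if the two words differ in some other way (or not at all).
    """
    misspelling, target = misspelling.lower(), target.lower()
    if misspelling == target:
        return None
    if collapse_doubles(misspelling) != collapse_doubles(target):
        return None  # some other kind of error is involved
    if len(misspelling) < len(target):
        return "undoubled"
    if len(misspelling) > len(target):
        return "overdoubled"
    return "mixed"


for wrong, right in [("begining", "beginning"),
                     ("writting", "writing"),
                     ("cigerrette", "cigarette")]:
    print(wrong, right, doubling_error(wrong, right))
```

Under this assumption *begining and *writting are classified as pure doubling errors, whereas *cigerrette is not, because it also substitutes a vowel grapheme. Such words would need to be counted under more than one category.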

8. Silent <e>.

A graphemic phenomenon of similar notoriety is the silent <e>. Examples in the present corpus were common. In 60 tokens of 35 types, the <e> was omitted. Most of these occurred in situations where the <e> performs an easily specifiable role, e.g. *amusment, *arrangment, *cloths (clothes), *extremly, *practic, *prepard, *reptils, *sincerly. For others, the role of the <e> is not so clear, e.g. *advertisment, *heros, *mor, *unfortunatly.

Hypercorrection by unnecessarily inserting an <e> occurred in 14 tokens of 10 types. In 3 types, this constituted failure to delete the <e> in appropriate circumstances - *arguement, *changeing, *rescueing. Other examples included *punishement, *slowely and *stomaches.

Observations and proposals.

Of the above 8 categories of major causes of misspellings by Singaporeans, a reasonably clear line can be drawn between those problems which are caused by anomalies inherent in the English spelling system, and those relating to features specific to Singaporean pronunciation. The former kind are therefore to be found in the spelling of native as well as non-native speakers, whereas the latter category will be unique to Singaporeans.

Problems inherent in the writing system clearly include consonant doubling and silent <e> (which are in fact often related phenomena, both dealing with the graphemic representation of long vs. short vowels). These should therefore be a major concern of any reformed spelling proposal. In the present corpus, far more mistakes are made by making double consonants single and omitting the silent <e> than by hypercorrections of these; this would therefore seem to be the preferable solution (as in Cut Speling).
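
The comparison just made can be checked against the figures reported in sections 7 and 8. The following is a minimal sketch of that arithmetic only. The dictionary layout and the function name token_ratio are hypothetical; the counts themselves are those given above.

```python
# (tokens, types) as reported in sections 7 and 8 of this paper.
counts = {
    "consonant doubling": {"simplified": (85, 40), "hypercorrected": (50, 34)},
    "silent <e>": {"simplified": (60, 35), "hypercorrected": (14, 10)},
}


def token_ratio(category):
    """Ratio of simplifying errors to hypercorrections, counted in tokens."""
    simplified_tokens = counts[category]["simplified"][0]
    hypercorrected_tokens = counts[category]["hypercorrected"][0]
    return simplified_tokens / hypercorrected_tokens


for category in counts:
    print(f"{category}: {token_ratio(category):.1f} to 1")
```

On these figures, single-for-double consonant errors outnumber unnecessary doublings by about 1.7 to 1 in tokens, and omissions of silent <e> outnumber unnecessary insertions by about 4.3 to 1.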

A writing system with a perfect one-to-one correspondence between graphemes and phonemes would contain no homophones or homographs, although it might have total homonyms (where both spelling and pronunciation were the same). The existence of homophones and homographs may be taken to indicate the extent of this lack of fit, and they are therefore a source of misspellings for native and non-native speakers alike.

The difficulties associated with /l, r, m, n, ŋ/ may originate in higher-level language processes, and relate to difficulties in phonetic segmentation. Indeed, Marcel (1980) raises doubts about the traditional view of phonemic-graphemic representation, i.e. that speech is composed of basic phonemic units, of which speakers are consciously aware, and that spelling corresponds to the graphemic representation of these phonemes. Rather, it is much more of a 'chicken and egg' situation: "although the alphabet is the most efficient way of reading and writing, [it has been suggested] that it has been invented only once in all history. This would imply that the representation of speech on which it relies (the phoneme) is rather unnatural. In whatever way the alphabet was first invented, it is possible that for each learner today, the concept of the phoneme (tacit if not explicit) comes from rather than leads to the particular alphabetic system, with which he or she is confronted" (Marcel, 1980:401-2).

The remaining four categories of misspelling are specific to Singaporean speakers. Suffixation is a widespread problem but may be thought of as a grammatical (morphological) phenomenon as much as a phonological one. In the corpus there were 46 tokens of 23 types of omission/insertion of the <-s> suffix, and 79 tokens of 54 types for <-ed>. 19 tokens of 18 types involved other affixes, all but one (*unconsiderate [inconsiderate]) being suffixes.

Nevertheless, similar confusion in spelling may sometimes be found among native speakers, owing to the process of elision. Syllable-final /d/ is commonly elided in native speech where it is surrounded by other consonants, and this may lead to confusion over the morphology (and thus the spelling) of certain phrases. For instance, should one talk about a one-arm bandit or a one-armed bandit? The comparison between native and non-native confusions cannot be drawn too far, though, since suffix-dropping is far more extensive for non-native speakers than the limited native possibilities just mentioned.

The importance of stress and other suprasegmental features (rhythm, intonation, voice quality) is increasingly being emphasised by English language teachers. The stress system of English is viewed as the basic framework of the spoken form of the language, within the bounds of which the individual segmental vowel and consonant articulations are performed; it plays a major role in sounding like an English speaker. The surprisingly large number of misspellings relating to stressed vowels shows that stress commands far less importance in Singaporean English than it does for native accents.

At segmental level, teachers of Singaporeans should pay particular attention to the following features of Singaporean pronunciation (roughly in descending order of importance):

1. /e, æ/
2. /i, ι/
3. The voiced/voiceless distinction, in particular /t, d; p, b; f, v; s, z/, and the widespread use of the glottal stop.
4. Glides, including palatalisation.
5. All nasals.
6. /l, r/
7. /t, θ/
8. /ɔ, ɑ/

Christopher Upward has pointed out (personal communication) that "one might conclude that no reformed English orthography can cater for interference from other languages, but that reforms designed specifically for native speakers will also benefit foreign learners. Therefore, there is no point in taking the needs of specific foreign learners into account" [in any spelling reform].

The above proposals for Singaporeans are based on analysis of the corpus of misspellings, and therefore are directly relevant to minimising problems of spelling. They should also improve the intelligibility of spoken communication. The two media cannot, of course, be divorced for foreign learners but, whereas language teachers are usually quick to rectify misspellings, they often allow unacceptably large variation in students' pronunciation to go uncorrected. Following G. Abbott (1979:175), we might therefore conclude that "an 'adequate' pronunciation is one which facilitates accurate spelling".


