Journal of the Simplified Spelling Society, J32, 2003, pp12-14.

How phonemic is English spelling?

Godfrey Dewey, author of
Relative Frequency of English Spellings.

Reprinted from: Spelling Reform, ed. Newell Tune, 140-142
& Spelling Progress Bulletin. Spring, 1969 pp10-12.

Foreword by S. Bett: In English Today, Don Hook [2002] said, "English spelling is very regular and not particularly hard to learn." Similar claims are repeated by those who write on the topics of phonics and English spelling. The figure of 85% regular is often quoted as if it were based on solid research [Crystal, 1999]. The original research was done by Dewey in the 1940s and repeated by Paul Hanna in the 1960s. Hanna noted that you can guess the dictionary spelling for each phoneme with 75% accuracy in 4 guesses [see]. Predicting phoneme spelling is not the same as predicting syllable spelling [see chart]. When people such as Flesch [1956, 1983] say that English has a highly regular orthography or is 97% phonemic, they have something other than predictability in mind. Spaulding [1964] uses 70+ phonograms and 26 exception rules to arrive at her high estimate for English regularity. With around 200 sequentially applied exception rules and two spellings per sound, traditionally spelled words can be shown to have a high degree of predictability. Memorizing 200 rules, however, might prove to be more difficult for humans than memorizing the dictionary. I know of no spelling champion who relies on this strategy. It might be interesting to survey spelling champions to determine the extent to which they rely on rules. One could ask a question such as: Given a sound, what are the possible spellings that you consider, in sequential order?

How Phonemic depends on the unit of analysis

Phonemes: 75% regular
Syllables: 50% regular
Words: 40% regular

These statistics are based on the regularity found in the traditional spelling of the Gettysburg Address. They may overstate dictionary regularity but are indicative of the regularity one generally encounters. If the average syllable has 2 phonemes, each of which is 75% regular, the syllable would be .75 x .75 = .56, or roughly 50% regular.
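
The multiplication in the note above can be checked in a few lines of Python. This is only a sketch: it assumes that each phoneme in a unit is spelled regularly independently of its neighbours, and the function name `unit_regularity` is invented for the illustration.

```python
def unit_regularity(phoneme_regularity: float, phonemes_per_unit: float) -> float:
    """Chance that every phoneme in a unit is spelled regularly.

    Assumes (for illustration only) that the phonemes are independent.
    """
    return phoneme_regularity ** phonemes_per_unit


# A two-phoneme syllable whose phonemes are each 75% regular.
syllable = unit_regularity(0.75, 2)
print(f"{syllable:.4f}")  # 0.5625
```

The product is a little above one half, which is why the chart rounds the syllable figure to 50%. Longer units, such as whole words, contain more phonemes and so come out lower under the same assumption.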

How phonemic is English spelling? For a variety of reasons, no simple direct answer to our question is possible, and statements which fail to define their terms clearly are meaningless or misleading - usually both. First, therefore, let us define our terms.

A completely phonemic spelling of English would have a 1 to 1 phoneme - grapheme correspondence; that is, only one grapheme for each phoneme and only one phoneme for each grapheme. Several symbols for one sound are an obstruction to writing (that is, spelling); several sounds for one symbol are an obstruction to reading. Both factors are present in our traditional orthography (T.O.) to a high degree. Thus, the current edition of How we spell!, [1] formerly English Heterography, identifies in a single abridged dictionary [72,000 words] 530 spellings of 41 sounds, employing 273 different symbols; that is, 12.9 graphemes per phoneme and 1.9 phonemes per grapheme.
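
The two ratios quoted from How we spell! follow directly from the three counts given above. A minimal check in Python (the variable names are chosen for this sketch only):

```python
spellings = 530  # distinct spellings of sounds found in the dictionary
phonemes = 41    # sounds distinguished
graphemes = 273  # different symbols employed

# Average number of ways each sound is written.
graphemes_per_phoneme = spellings / phonemes
# Average number of sounds each symbol stands for.
phonemes_per_grapheme = spellings / graphemes

print(round(graphemes_per_phoneme, 1))  # 12.9
print(round(phonemes_per_grapheme, 1))  # 1.9
```

In a completely phonemic spelling both ratios would be exactly 1.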

Consider the principal factors involved in determining the degree to which English spelling is phonemic.

Measurement may be based on running words (connected matter, or weighted word frequency lists); on unweighted lists of frequent words; or on a dictionary. The first is the more important for the teaching of reading, especially where a phonemic initial teaching medium such as i.t.a. is involved; the second is more useful for the teaching of writing (more particularly, spelling); the third is least valuable except as a matter of linguistic research. The basis of any pronouncement should be clearly stated, always.

Whatever the corpus of the study, results may be stated in terms of the spelling of phonemes, of syllables, or of words. Again, the basis should be clearly stated. A measurement in terms of words will be more immediately intelligible to the average layman.

In addition to the foregoing, the number of phonemes distinguished will quite obviously affect any measurement. For the untrained ear of the general public, the most practical number is somewhere between 39 and 44, probably 41: the traditional 40 sounds of Pitman shorthand, commonly classed as 24 consonants, 12 vowels, and 4 diphthongs, plus schwa, as in the Simpler Spelling Association Phonemic Alphabet. The treatment of the weak, unstressed vowels, in particular, will markedly affect the statistical outcome.

As an example of the influence of the number of phonemes distinguished, Hanna [5] analysed an unweighted list of some 17,000 frequent words on a 52-phoneme basis reduced from the 62 phonemes distinguished by Merriam-Webster's New Collegiate Dictionary (6th edition 1956) on which he relied. On that 52-phoneme basis, he found 334 different spellings, employing 170-odd different graphemes, or about 63% of the 530 spellings employing 273 different graphemes reported by How we spell!, as above. If, however, Hanna's results be restated on a 41-phoneme basis, his findings become only about 281 different spellings, employing substantially the same 170-odd graphemes, or only 53% of the dictionary basis total. My own study [2] of speech sounds (not spelling) analysed its corpus of 100,000 words of diversified connected reading matter on the 48-phoneme basis of the Revised Scientific Alphabet (Key 1 of the Funk & Wagnalls Unabridged New Standard Dictionary), but reported most of its results on the 41-phoneme basis noted above.

Answers by others to our question, how phonemic (phonetic, regular) is English spelling, range all the way from Hotson, [7] "At present we use 500 symbols for 40 sounds, so that English is 8% phonetic," to Spaulding, [10] "if properly studied and taught, our language is, in fact, almost completely phonetic or regular," based on her statement that 94% of the most used 1,000 words may be spelled correctly by 70 phonograms, manipulated according to 26 rules! In between, Hanna, [6] in the most comprehensive and thoroly researched study to date, arbitrarily assumes 80% (that is, that a particular phoneme will correspond to a particular grapheme in 80% of the different words in which it occurs) as a criterion of consistent correspondence to the alphabetic principle; and his findings, in terms of phonemes, approximate that figure, provided that further factors such as the position of the phoneme in its syllable are taken into account. When, however, a computer was programmed with an algorithm or rule of procedure, based on the findings of that study, which manipulated 77 graphemes according to 203 rules, it was able to spell just under 50% of the investigated words correctly, and an additional 36% with only one error!
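
The 80% criterion described above can be illustrated with a short Python sketch. The word counts below are invented purely for the example and are not taken from Hanna's study; the function names are likewise hypothetical.

```python
from collections import Counter


def dominant_share(spelling_counts):
    """Return (grapheme, share) for the most frequent spelling of one phoneme."""
    total = sum(spelling_counts.values())
    grapheme, count = Counter(spelling_counts).most_common(1)[0]
    return grapheme, count / total


def is_consistent(spelling_counts, threshold=0.80):
    """True if one grapheme spells the phoneme in at least `threshold` of words."""
    return dominant_share(spelling_counts)[1] >= threshold


# Hypothetical numbers of different words in which each spelling occurs.
counts_b = {"b": 97, "bb": 3}
counts_long_e = {"e": 40, "ee": 25, "ea": 25, "ie": 10}

print(is_consistent(counts_b))       # True
print(is_consistent(counts_long_e))  # False
```

Note that a phoneme can pass this test while the words containing it still fail to be spelled correctly as wholes, which is consistent with the much lower word-level result reported for the computer algorithm.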

Most statements regarding the phonemic or non-phonemic character of English spelling are based, implicitly at least, on whole words (whether on a running word, word list, or dictionary basis), and usually evade the phonemic issue by substituting the terms regular or irregular; words which, like charity, can be stretched to cover a multitude of sins. Thus, Laubach, [8] whose extraordinary achievements, "Each one teach one," in promoting literacy in over 300 languages thruout the world are well-known, employs for English a notation of 96 symbols [9] - actually, counting 4 recent additions and 18 doubled consonants, 118 symbols - several of them involving a diacritic, the macron; and describes as "regular" all spellings within the compass of that notation. Parenthetically, this method, which retains the precise T.O. forms of less than 50% of running words, has just achieved highly impressive results in teaching English to Chinese students in Hong Kong.

The farthest-out example of such "regularity" is Wijk, [11] who, on the basis of an exhaustive and erudite examination of present-day English orthography, admits to his Regularized English 172 graphemes for 50 phonemes (actually 43 phonemes, since 7 are consonant clusters, not single sounds). Some of the graphemes are used for two or three different phonemes; many are supplemented by considerable lists of exceptions; and the problem of unstressed vowels and diphthongs is treated separately. The result is a notation, easy to read, of course, because it preserves so many of the familiar irregularities of T.O., but so complex to apply that it would take a linguistic Ph.D. with an encyclopedic memory to write it according to specifications. Nevertheless, on the basis that this notation preserves the T.O. forms of just over 70% of running words, Wijk implicitly finds T.O. to be 70% "regular."

So far as I am aware, there exist no dependable data on the relative frequency of occurrence of the different spellings of the phonemes of English on a running words basis. With such data, the question, how phonemic is English spelling, may be answered with some assurance in terms of the occurrence of particular spellings of sounds in running words. This, however, is an answer to only one facet of the problem.

Since T.O. provides a maximum of 26 letters (three of which - c, q, x - are redundant and contribute nothing to the problem) for a minimum of 39 phonemes, a phonemic standard by which to measure T.O. must obviously, in addition to assigning one explicit phonemic value to each letter, supplement them by a sufficient number of equally explicit letter combinations. Substantially this is done by the spelling reform version of WES [World English Spelling], which, for the basic 40 sounds, assigns a single phonemic value (the same values as in i.t.a.) to each of the 23 useful single letters, and assigns equally explicit phonemic values to 16 digraphs and one trigraph (the majority closely resembling the corresponding i.t.a. characters). To these WES adds 4 vowel-plus-r digraphs, to make the notation equally acceptable to r-keepers and r-droppers; and 2 consonant digraphs (wh for /hw/ and nk for /ngk/) for the sake of compatibility. The WES treatment of the weak unstressed vowels, usually schwa, by retaining, in general, any single vowel of T.O., is one of its strongest features; for a specific character for schwa, if it could be made available, would change, unnecessarily, what might otherwise be the exact T.O. forms of perhaps 1 word in 6 on the printed page.

Ed. Note: Not having a single symbol for schwa often obscures stress and spelling. Misspelling the mid lax vowel is one of the most common spelling errors. T.O. generally spells the terminal r-combination as er [her] and the initial schwa (@) as a [ago]. One exception rule can thus preserve systematic regularity, easy spelling, and a high correspondence with T.O.

This notation serves as an adequate standard of measurement for approximating an answer to our question, how phonemic is English spelling, by determining what proportion of the words, syllables, or phonemes of T.O. remain the same when transliterated into WES. For such a qualified answer to our question, let us apply this standard to a significant word list, both unweighted and weighted, and to a representative selection of connected matter.
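
The measurement just described, counting the T.O. forms that survive transliteration unchanged, can be sketched in Python. The four-word list, its respellings, and its frequencies are illustrative assumptions made for this sketch; they are not taken from the study and should not be read as authoritative WES forms.

```python
def phonemic_share(words):
    """Fraction of words whose T.O. form equals its transliterated form.

    `words` is a list of (to_form, respelled_form, frequency) tuples.
    Returns (unweighted, weighted): the first counts each different word
    once, the second counts every running occurrence.
    """
    unchanged = [(to, freq) for to, respelled, freq in words if to == respelled]
    unweighted = len(unchanged) / len(words)
    weighted = sum(freq for _, freq in unchanged) / sum(freq for _, _, freq in words)
    return unweighted, weighted


# Toy data, invented for illustration only.
sample = [
    ("and", "and", 30),
    ("not", "not", 10),
    ("have", "hav", 8),
    ("said", "sed", 2),
]

unweighted, weighted = phonemic_share(sample)
print(unweighted)  # 0.5
print(weighted)    # 0.8
```

The weighted figure exceeds the unweighted one here only because the unchanged words in the toy list happen to be the frequent ones; the same function could be applied to syllables or phonemes by changing what each tuple represents.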

Table 3 of my study of speech sounds [3] lists 1027 particular words (as distinct from root words, Table 4) which occurred over 10 times in 100,000 words of well-diversified connected matter, representative of English as written and spoken today, and which made up 78,633 of the 100,000 words. Of these, the T.O. forms which are fully phonemic by our standard are:
Unweighted: 229 different words out of 1027 different words, or 22.3% phonemic.

Weighted: 36,436 total words out of 78,633 total words, or 46.3% phonemic.

Lincoln's Gettysburg Address, a masterpiece of English literature, which includes most of the 41 phonemes in fairly typical proportions, contains (excluding the title) 267 words, 364 syllables, 958 phonemes (1,149 letters). By our standard, the words, syllables, or phonemes which are fully phonemic are:
106 total words out of 267, or 39.7% phonemic - roughly 40%
173 syllables out of 364, or 47.5% phonemic - roughly 50%
712 phonemes out of 958, or 74.3% phonemic - roughly 75%.
[see the chart at the beginning of this article]
That is, 106 of the complete words, 173 of the syllables, or 712 of the phonemes were spelt uniformly, according to the WES symbols, exactly as they would be if the whole selection were transliterated into WES.
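
The percentages reported above for the word list and for the Gettysburg Address can be recomputed from the raw counts. A minimal Python check (the helper name `percent_phonemic` is chosen for this sketch):

```python
def percent_phonemic(phonemic: int, total: int) -> float:
    """Share of units whose T.O. form is fully phonemic, as a percentage."""
    return round(100 * phonemic / total, 1)


# (phonemic units, total units), as given in the text.
figures = {
    "word list, unweighted": (229, 1027),
    "word list, weighted": (36436, 78633),
    "Gettysburg words": (106, 267),
    "Gettysburg syllables": (173, 364),
    "Gettysburg phonemes": (712, 958),
}

for label, (phonemic, total) in figures.items():
    print(f"{label}: {percent_phonemic(phonemic, total)}%")
# word list, unweighted: 22.3%
# word list, weighted: 46.3%
# Gettysburg words: 39.7%
# Gettysburg syllables: 47.5%
# Gettysburg phonemes: 74.3%
```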

The last figure, which will vary only slightly for longer specimens of connected matter, is probably the most significant single answer presently available, out of the various possible answers, to our original question: How phonemic is English spelling?


[1] Dewey, Godfrey. How we spell! or English Heterography. Lake Placid Club, N.Y., Lake Placid Club Education Foundation, 1968, p.32, 76.

[2] Dewey, Godfrey. Relativ Frequency of English Speech Sounds, 2nd ed. Cambridge, Harvard Univ. Press, 1950.

[3] Op. cit., p.19-29.

[4] Dewey, Godfrey. Relative Frequency of English Spellings. Columbia, Teachers College Press, 1970.

[5] Hanna, Paul R., Jean S. Hanna, Richard E. Hodges, Edwin H. Rudorf, Jr. Phoneme-grapheme correspondences as cues to spelling improvement. Washington, U.S. Government Printing Office, 1966 (Doc. OE-32008).

[6] Op. cit., p.34.

[7] Hotson, Clarence, Ph.D. Ryt Ryting. Romulus, N.Y., privately printed, 1965, p.6.

[8] Laubach, Frank S. Learn English the new way. Syracuse, N.Y., Laubach Literacy, Inc., 1967.

[9] Laubach, Frank S. Key to correct regular "New Spellingsz" (1 page). Syracuse, N.Y., Laubach Literacy, Inc., 1967.

[10] Spaulding, Romalda B. In Reading Reform Foundation Conference Proceedings, New York, 1964, p.31.

[11] Wijk, Axel. Regularized English. Stockholm, Almqvist & Wiksell, 1959.

Notes: Godfrey Dewey [1890-1976], the son of Melvil Dewey, the inventor of the Dewey Decimal System for library organization, was Vice-president of the Lake Placid Club Education Foundation. He was the founder of the SSA [Simpler Spelling Association], and the author of two important statistical studies: English Spelling: Roadblock to Reading and Relative Frequency of English Spellings.
