
The Number of Phonemes in English.

Steve Bett.

Counting the number of phonemes is like counting the number of colors in a rainbow.

When we try to break up a continuous spectrum into discrete units, we move to the realm of fuzzy logic.[1]

Abstract. Phonemes are categories or conceptual constructs, not unlike our notions of color. Color is a little simpler, since "blue" could be defined as falling between two specified frequencies. It is easier to take a slice out of the visual spectrum than it is to take a slice out of the speech sound spectrum. However, the same problems remain when we get too far removed from the primary colors. Adam Brown's argument is analogous to "if we can't count the number of colors in the rainbow, then we should stop using color names." Most categories and concepts have fuzzy boundaries, particularly when we are talking about how a category is perceived by more than one person. Just as we can talk about the primary colors and be understood, we can talk about clear instances of pure or uncombined phonemes. We cannot expect agreement to extend to the blends. The minimum number of unblended phonemes in all varieties of English is 36 [14v-22c]. Agreement on this much is enough to retain the goal of one symbol-one sound. Dr. Bett is the moderator of the phonology forum. [See saundspel on the Links paje.]

1. Are pronunciation guides possible?

After reading the eloquent arguments of linguist Adam Brown [JSSS J27/2000/1], I came away almost convinced that a dictionary pronunciation guide was impossible. Brown's intention was to explain why it was impossible to specify the exact number of phonemes [2] in English. He expanded on my [3/1999] article, where I gave examples of phoneme estimates from 36 to 62. I suggested that, while 46 [21v 25c] was an adequate number of phonemes to describe English speech, the only number that people would be likely to agree on would be the number of uncombined phonemes. I thought there could be agreement on 14 pure vowels and 22 pure consonants. [2] As shown in the analysis of dialects below, the exact distribution of these vowels and their allophones can depend on the dialect. [see Northern & Southern English]

Brown doubted the possibility of even limited agreement. He added that if the exact number of vowels cannot be established, then speech sounds cannot be visualized ... and the alphabetical principle becomes an unrealistic ideal.

Brown's presentation reminded me of the following definition of an alphabet:
Alphabet ... meaningless marks arbitrarily associated with meaningless sounds.

After such a definition, one might conclude that written communication is impossible. By defining the goal as a strict one-to-one correspondence to an exact number of phonemes, Brown is able to build a similarly strong case against the possibility of a phonemic transcription. Almost every argument that Brown brings up is valid. A strict one-to-one correspondence requires an exact number of phonemes. If we cannot agree on the exact number of phonemes, then we cannot have a corresponding alphabet or phonemic transcription.

Fig. 1. In addition to the IPA symbols, this chart shows 4 ASCII or KWERTY representations.
The International Phonetic Alphabet typically isolates 14 pure vowels and 7 combinations.


VOWELS: Graphemes for 21 English speech sounds (phonemes)
[- = no entry]

IPA | SAMPA | SS | Keyword | SAMPA | Unifon | Englik | Saxon (stressed) | Spanglish (unstressed)

7 short [checked] vowels
i | I | i. | pit | pIt | - | pit | pitt | pit
ɛ | E | e. | pet | pEt | - | pet | pett | pet
æ | ( | a. | pat | p(t | pat | paet | patt | paet
a | A | o. | pot | pAt | - | pot paat | pott | pot
ɔ | Q | o | cost | kQst | kxst | koost | cost | cost
ʌ | V | u. | cut | kVt | kut | cat cvt | cutt | cat
ʊ | U | w u | put | pUt | pCt | put | put | pwt

7 free [long] vowels
i: | i | ie | ease | iz | Ez | iiz | iez | yz
e | e | ei | raise | rez | rAz | reiz | reiz | reyz
u: | u | uu | lose | luz | lUz | luuz | luuz | luz
- | ɜ' | urr | furs | f3'z | fcrz | f'erz | furrz | ferz
ɔ: | O | o aw | cause | kOz | kxz | kooz | cawz | coz
ə | θ | a | allow | θ"laU | clq | alau | alau | aloud
o | O | oa | nose | noz | nOz | nouz | noaz | noz

14 uncombined

8 combinations in minimum set
ɚ | θ' | er | corner | "kOrnθ' | - | korner | corner | cornr
- | aI | ai | rise | raIz | rIz | raiz | raiz | -
- | OI | oi | noise | nOIz | nQz | noiz | noiz | noyz
- | aU | ou | rouse | raUZ | rqz | rauz | rauz | ræwz
- | - | aa | are | - | or | aar | aar | ar
- | - | ir | ear | - | Er | ir | ier | ir
- | - | err | air arrow | - | Ar | er | eir arro | err
- | - | or | ore | - | Or | or | oar | or

21 total - Jones
- | - | - | herder | "h3'dθ' | - | h'erder | hurrder | herder

Listed above are the 21 vowels isolated by Daniel Jones and used by most scholars [e.g., Wilk and Wells] and many dictionary pronunciation guides. 14 of these vowels can be considered to be uncombined or pure vowels. [see next chart] The IPA special symbols are listed [column 1] when they differ from SAMPA. Wells's SAMPA notation, a machine-readable ASCII-IPA, is listed in the second column. SAMPA uses upper-case letters for the short vowels; O for 'awe' is the one exception. Unifon uses big letters for the long vowels. Spanglish, a digraphic solution, uses double letters [digraphs] for the extended vowels. Spanglish, RITE, and the traditional orthography [with German words] use trailing double consonants to mark stressed short vowels.

In a [04 June 2000] letter written to the saundspel phonology forum, Michael Avinor put it this way, "Speech is an analog signal and writing is a digital signal. To talk about a phoneme we have to cut up continuous speech into discrete units. Digitizing speech can preserve only a limited part of the speech information."

Even more information is lost when speech is visualized or represented graphically. Nevertheless, the fragment of the original that remains can be enough to accurately convey information.

IPA, 14 vowels: 5 Front, 4 Central, 5 Back

Height | Front unrounded | Front rounded | Central (unrounded / rounded) | Back unrounded | Back rounded
High | i (Ie) | beat | hɜ'də' | boot | u
Lower high | ɪ (I.) | bit | ago herder | book | ʊ
Higher mid | e (ei) | bait | ə (a) (3) ɚ (er) | boat | o (əu ou oa)
Lower mid | ɛ (e.) | bet | but (v) ʌ (v.) | bought | ɔ (o)
Low | æ' (ae) | bat | ɑ (aa) father | bottle | ɒ (o)

Daniel Jones' IPA for RP had 21 vowels. This chart lists the 14 pure or uncombined vowels. Diphthongs include ai au oi and iə eə oa uə

Fig. 2. [i:] tense unrounded, [ɪ] lax unrounded, [ə] mid lax unrounded unstressed. The other dimensions of speech sound go by several names. One reference is to the jaw position: open, half open, or closed. Another is to tongue position during vocalization [front and back]. Typically 3 to 6 levels of openness and backness are isolated. All front vowels are unrounded and all back vowels are rounded according to the chart above. Example words are added to the vacant columns. The vowel symbols are in the same position as in Jones' vowel diagram or quadrilateral, a more accurate representation of the mouth.

This chart does not show combined vowels such as the glided blends of two sounds [diphthongs] that start in one position or cell of the chart above and end up in another. For example, ai is a combination of a low central vowel [ɑ or a:] and a high front vowel [i or I]; au begins as a low mid vowel and ends as a high back vowel.

Linguists such as Daniel Jones broke the sound spectrum down into two parallel segments: tense - lax and rounded - unrounded. The same sound may have two different expressions depending on muscle tension and rounding. Place of articulation was also important.

2. Sounds are not the only perceptions with fuzzy boundaries.

Sounds are not the only things in our perceptual world with fuzzy boundaries. Has anyone ever claimed that a name for a discrete segment of the sound spectrum was any more exact than a color name for a discrete segment of the color spectrum? The flaw in Brown's argument is the implied insistence on a high level of precision. If we raise the bar of precision high enough, then most ideals can be characterized as unobtainable and unrealistic.

We cannot see a particular color, e.g., blue, any more than we can hear a particular phoneme. This does not mean we cannot discriminate or sort blue and yellow objects. We can be presented with an instance and then asked to judge whether or not it is a member of a category or class of 'blue' things. As we get near the boundaries, the judgments become more uncertain [e.g., Should a green ball be sorted into the pile of blue objects or yellow objects?]. However, there are some modal or mid-range instances that nearly everyone will agree are a particular primary color. There are even instances that nearly everyone will agree are blue-green. When the blends become more complex, however, agreement becomes harder to achieve.
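The sorting task just described can be sketched as a fuzzy membership function. This is only an illustration: the function names trapezoid and describe are hypothetical, and the wavelength bands are rough assumed values chosen so that the two categories overlap, not measurements. A clear instance gets full membership in one category, while a boundary instance gets partial membership in two.

```python
def trapezoid(x, a, b, c, d):
    """Degree of membership in [0, 1].

    0 at or outside a and d, 1 on the plateau from b to c,
    and a straight-line ramp on either side of the plateau.
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Assumed, illustrative bands in nanometres (a, b, c, d).
# The overlap between 480 and 510 is the fuzzy blue-green boundary.
BLUE = (430.0, 450.0, 480.0, 510.0)
GREEN = (480.0, 510.0, 550.0, 580.0)


def describe(wavelength_nm):
    """Return the degree to which a wavelength counts as blue and as green."""
    return {
        "blue": trapezoid(wavelength_nm, *BLUE),
        "green": trapezoid(wavelength_nm, *GREEN),
    }


if __name__ == "__main__":
    for nm in (465, 495, 530):
        print(nm, describe(nm))
```

With these assumed bands, 465 nm is a clear instance of blue, 530 nm is a clear instance of green, and 495 nm belongs half to each. Moving the band edges changes the numbers at the boundary but not the verdict on the clear instances, which is the point of the color analogy.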

3. The Phoneme - fuzzy by definition:

a range of sounds treated as identical

The same is true for the abstract sounds we call phonemes. Phonemes have been defined as the smallest unit of sound capable of changing the meaning of a word. The substitution of b for p in [pit] changes the meaning. Therefore [b] is a phoneme and the p:b distinction is phonemic. [pit & bit] are called a minimal pair.
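The minimal pair test can be sketched in code. The sketch below assumes a tiny hand-made lexicon in which each word is already segmented into SAMPA-style symbols like those in Fig. 1; the lexicon, the entry for bit, and the function name minimal_pairs are illustrative assumptions. Two words form a minimal pair when their transcriptions have the same length and differ in exactly one position.

```python
from itertools import combinations

# Hypothetical mini-lexicon: word -> tuple of phoneme symbols (SAMPA-style).
LEXICON = {
    "pit": ("p", "I", "t"),
    "bit": ("b", "I", "t"),
    "pet": ("p", "E", "t"),
    "put": ("p", "U", "t"),
    "cut": ("k", "V", "t"),
}


def minimal_pairs(lexicon):
    """Return (word1, word2, (sound1, sound2)) for every minimal pair."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2):
        if len(p1) != len(p2):
            continue  # different lengths cannot differ in just one segment
        diffs = [(a, b) for a, b in zip(p1, p2) if a != b]
        if len(diffs) == 1:
            pairs.append((w1, w2, diffs[0]))
    return pairs


if __name__ == "__main__":
    for w1, w2, (a, b) in minimal_pairs(LEXICON):
        print(f"{w1} : {w2}  contrast {a} : {b}")
```

On this lexicon the sketch reports bit : pit for the b : p contrast and three vowel contrasts among pet, pit, and put. The word cut pairs with nothing because it always differs from the other words in two positions.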

Voiced and unvoiced pairs such as [p: b] are not distinguished in all languages. It is difficult to distinguish the two in English if they are whispered.

[pit-bit] is just one of many examples of the [p:b] minimal pair. Examples of other minimal pairs can be much more difficult to find. Is there an [ɪ:ə] minimal pair? Distinguishing minimal pairs for schwa [unstressed mid lax v] is much harder than for other vowels. The substitution of [ɪ] for schwa [ə] in the word [accept /ək 'sept/] changes its meaning from [to take something offered] to that of [except /ik 'sept/] [to not include something]. Other notations: Spanglish [acseppt : icseppt], WS fonetic [yksept : iksept].

4. A phoneme is a category, an abstraction, not a physical thing.

A phoneme is not really a unit in the sense of a single sound. It is a range of related but acoustically distinct sounds treated as a unit or category - treated "as if" they are the same sound. What we hear are sounds or phones. A phoneme is an abstraction or interpretation. The same person will pronounce the same vowel in acoustically different ways in association with different consonants.

People from different speech communities will rarely pronounce the same vowel the same way. [see Roger Brown [1988] for more examples of categories].

5. 100% agreement is possible for 14 clear instances of uncombined [primary or pure] vowel phonemes.

Any time you try to break up a continuum or spectrum into discrete units, there will be problems at the boundaries. Not everyone will slice the continuum at precisely the same point. However, just as it is possible to achieve nearly 100% agreement on instances of the primary colors, it is possible to get nearly 100% agreement on instances of the primary or uncombined phonemes.

Many words are not pronounced the same in different dialects of English. This means that a phonemic representation of one dialect [e.g., /ə/ or /U/] may not correspond to the speech in a different region. A phonemic script always presumes a base dialect and such alphabetical writing systems developed for one dialect will not always be a reliable guide to the pronunciation of another dialect.

Thus, some people will have to learn a spelling dialect in addition to their local dialect. This situation is also true in Spain and Italy. The base dialect, Castilian, for instance, does not always correspond to the local dialect.

Not everyone will agree where the i sound stops and the e sound begins. The disagreements will increase when the vowel is unstressed. The last syllable of <vegetable> can be transcribed as vejtəbl, vejtə bəl, or vejtə bUl, where [ə] represents the unstressed mid lax vowel or schwa. The first syllable of <because> can be transcribed as bekoz, bikoz, or 'koz. The less stress or the smaller the discrete unit, the less agreement there will be.

Breaking up the vowel sound continuum into discrete units is analogous to breaking up the color spectrum into discrete colors. Adding gray to a color mix is analogous to removing stress in speech. Gray [or the loss of brightness and contrast] reduces the ability to discriminate colors just as the loss of stress reduces the ability to discriminate vowels.

6. One phonographic representation cannot cover all the dialects of English.

This is a different issue than the one just discussed. It is not a question of not being able to recognize clear instances of a phoneme. Rather, it is a question of which phonemes should be used with particular words. Brown is right: strictly speaking, BBC-English and NBC-English are phonologically distinct. This, however, does not mean that they cannot be represented with one orthography. All it means is that the pan-dialect solution will not be 100% phonemic.

If the symbol represents a speech sound, then when that sound changes the symbol has to change. If this is the case, then dialects that have unique pronunciations will have unique spellings. How, then, does one come up with a standardized spelling for English that can match every dialect of English?

How do broadcasters determine what pronunciation to use on the air? It is the same problem. If a broadcaster can pronounce it, then it can be spelled in a phonemic notation. The dialect used by broadcasters is designed to be the easiest one for a widespread audience to understand. The spelling system would follow suit. The base dialect would be the broadcaster's dialect.

There are two broadcast dialects that can be described as BBC-English and NBC-English. Since these two dialects are not the same, their pronunciation guide spelling would also differ. Let's assume that both broadcast dialects pronounce /ei/ and /ai/ as /ei/ and /ai/. Many of their listeners do not. Words containing /ei/ can be pronounced /e:/ in northern English and /ai/ in Cockney. /ai/ is pronounced /a:/ in parts of the Southern U.S. and also in parts of England.

The spelling system has an easier time of it than the dialects and pronunciations used by broadcasters. The reason for this is that people will reinterpret word pronunciations to match their regional dialect. If asked, they might say, "I know it is spelled [greit] but around here we pronounce it [gret]."
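This reinterpretation can be pictured as a lookup from base-dialect phonograms to regional realizations. The sketch below is a toy model, not a description of any dialect: the substitutions are the three mentioned above, while the dialect labels, the table layout, and the function name pronounce are assumptions made for illustration.

```python
# Regional realizations of base-dialect phonograms.
# Only the substitutions mentioned in the text are listed; anything
# not listed is assumed to be pronounced as spelled.
REALIZATIONS = {
    "broadcast": {},
    "northern_english": {"ei": "e:"},
    "cockney": {"ei": "ai"},
    "southern_us": {"ai": "a:"},
}


def pronounce(phonograms, dialect):
    """Map a base-dialect spelling (a list of phonograms) to a regional reading."""
    table = REALIZATIONS[dialect]
    return "".join(table.get(p, p) for p in phonograms)


# Base-dialect spellings, already split into phonograms.
GREAT = ["g", "r", "ei", "t"]
RISE = ["r", "ai", "z"]

if __name__ == "__main__":
    for dialect in REALIZATIONS:
        print(dialect, pronounce(GREAT, dialect), pronounce(RISE, dialect))
```

The spelling stays fixed as greit and raiz. Only the reader's table changes, so a northern English reader gets gre:t and a reader from parts of the Southern U.S. gets ra:z without any change to the written form.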

I have no problem with using broadcast English as a base dialect for the spelling system, but there are two other proposals for dealing with such discrepancies between dialects of English.

Conclusion.

We cannot count the number of phonemes in English speech any better than we can count the number of colors in a rainbow. However, just as we can identify the primary colors in the rainbow, we can identify 36 clear instances of the primary or uncombined phonemes in speech [14 pure vowels, 22 pure consonants]. As we try to make finer distinctions, unanimity of opinion declines. There will never be much agreement on the exact number of combined vowels. Most people will want unique phonograms for [ch-tsh], [j-dzh], and [ai] - as in Saigon and aisle - bringing the total to 39. Many will want to grant phonogram status to [oi], [au] and a few r-combinations bringing the total to about [46]. An alphabet with [46] phonemes would be more than enough to represent the significant sounds in both British and American dialects of English. [see the IPA chart below]

We can get by with the 42 phonograms or sound signs shown in the grapheme-phoneme chart above. Actually, we can get by with 36 by removing the redundant letters c, q, and x and the combinations [or diphthongs ai, oi, ou]. This particular chart includes Ch [tS] and J [dZ] but combines [Dh] and [Th]. The one below has a unigraphic symbol for /tS/ and /dZ/. It also has symbols for both NBC and BBC English. The IPA alphabet is simpler than Spanglish, which has incorporated many traditional features.

I.P.A. Alphabet for English (BETA)
Symbol | Transcription | Keyword
ɑ | ɑmz | alms
ɒ | ɒd | odd
æ | ænd | and
ʌ | ʌp | up
b | bæt | bat
č | čɪn | chin
d | dɪn | din
ð | ðe | they
e | ep | ape
ɛ | ɛg | egg
ɜ | ɜθ | earth
ɝ | ɝθ | earth
ə | ə'wɛə | aware
ɚ | ə'wɛɚ | aware
f | fæn | fan
g | get | gate
h | hæt | hat
i | it | eat
ɪ | ɪt | it
j | jist | yeast
ǰ | ǰɔ | jaw
k | kɪn | kin
l | lɔ | law
m | mun | moon
n | not | note
ŋ | sʊŋ | sung
o | old | old
ɔ | ɔl | all
p | pip | peep
ɹ | ɹun | run
s | si | sea
š | ši | she
t | tɪn | tin
θ | θɪn | thin
u | uz | ooze
ʊ | bʊk | book
v | vɛst | vest
w | wʊd | wood
z | zɪp | zip
ž | vižon | vision
Received Pronunciation General American Common Pronunciation

People will continue to argue about such things as

[1] What would be the best way to represent unstressed sounds? [to schwa or not to schwa...].
[2] Should the redundant letters kqx be included in the alphabet? [they are in Spanglish]
[3] Should both the voiced and unvoiced [th] be included?
[4] Should the schwa [ago] and schwi [very] be represented with a unique phonogram?
[5] Stress is phonemic in English but should it be represented in the writing system?

Saxon Spanglish Alfabet
Letter(s) | Keyword(s)
A | AGO
AA | CAAR
AE | CAET
AI | AIS AIL
B | BIBB
C (KS) | CANCEL
Ch | CHERCH
D | DIDD
UR ER | SURRFER
E, EA | BREAD
EI EY | VEIN THEY
F | FAIV
G | GIGGL
H | HORS
I. | IZ TIPPY
IE Y | FIELD
J | JUDJ
K Q | KICK
L 'L | LITTL
M 'M | MOUND
N 'N | NUNN
NG | SINGL
O. | OTTER
O AO | AWE DOG
OA | OAT
OI OY | OIL BOY
OU AU | OUT CAU
P | PICK
R 'R | ROAR
S | SISTER
Sh | SHIPP
T | TOT TOTT
Th Thh | THY THAI
U. v | UPPER
U .W | HUK HWK
UU u | GURU
V V | VALV
W Wh | WINNER
X KS | TAX TAKS
Y | YES YU
Z | ZIPPERS
Zh | MEZHER

There will be words that continue to be pronounced uniquely in a particular dialect. Thus a transcription system based on General American [or NBC English] and BBC English may not always precisely represent the pronunciation of some words in other dialects of English. It is not that these dialects have any more phonemes [although this is a remote possibility]; they just apply the 36 identified pure phonemes differently in a few words.

The goal is to develop a workable writing system for English that is as good as the Italian writing system is for the Italian language. Unlike the goal of a perfect one-to-one correspondence between graphemes and phonemes, this goal is attainable. What prevents its realization is not the elusive nature of isomorphism as a goal but the fact that any consistent system will respell 60% of the words in English and that most of these respellings look odd to those adept in the traditional writing system. Some respellings will "offend the eye." For those who have acquired a high level of word pattern recognition, respelling will nearly obliterate certain distinctions isolated by heterographic homophones such as [know, no], [dough, doe], [I, eye, aye].

There is certainly no need to abandon this idealistic goal of isomorphism or one-to-one correspondence at this point. It clearly defines the correct direction. One symbol per sound can remain the stated goal without the expectation that it is the kind of ideal that can ever be fully attained. When a practical English writing system becomes nearly as good as one of the systems used as a pronunciation guide in a dictionary, then the quest can be abandoned.

A broad pronunciation guide spelling that is nearly 100% predictable and easy to type is about as close as one can expect to get. Beyond this we quickly reach a point of diminishing returns. The goal is not to be better than today's dictionary pronunciation guides but to approximate them with a practical everyday writing system devoid of unsupported special characters and complicated diacritics.

The goal is to come up with the best possible visual representation of the abstract phonemes that people have in their heads. The goal is to achieve a system or representation that is nearly isomorphic with the phonological structure of English speech. We will never quite reach this goal. Fortunately, a system that is less than ideal will be "good enough". A writing system for English that is as good as the writing systems for Italian and Spanish will be fine.

There is no perfect graphic representation of speech sounds. Since the writing system is not designed to capture subtle differences between different dialects, the system does not have to be as detailed as IPA. As good as Spanish is quite adequate for English.

Applied linguistics works in a realm of fuzzy logic [1] not Aristotelian logic where everything is either black or white ... true or false.

Brown points out all of the limitations of phonemic spelling and then concludes that, since the goal of one and only one symbol per sound is elusive and unobtainable, it should be abandoned.

In building a better system for a broad transcription of English, there is a point of diminishing returns. This point will be reached long before we have to become overly concerned about the precision of phonemes or the suitability of a particular base dialect.

One and only one symbol per sound should remain as the simplest expression of our goal.

Notes.

[1] fuzzy logic - in classical logic everything was black or white, true or false. Fuzzy logic recognizes a middle ground, e.g., usually true. Fuzzy logic is a superset of conventional (Boolean) logic that has been extended to handle the concept of partial truth - truth values between "completely true" and "completely false". It was introduced by Dr. Lotfi Zadeh of UC Berkeley in the 1960s as a means to model the uncertainty of natural language. The Sony PalmTop apparently uses a fuzzy logic decision tree algorithm to perform handwritten (i.e., computer lightpen) Kanji character recognition.

[2] phoneme - a difference in sound that makes a difference in meaning - a range of sounds treated as the same sound. A phoneme is an abstract concept or category - you cannot see, touch, or hear a phoneme but you can point to instances. A phoneme is not one sound but a family of sounds, especially when more than one speaker is involved. A phoneme is an area. All instances in that area are referred to as allophones or diaphones.

Phonemes are language specific. Where English speakers distinguish two phonemes [lid/rid], speakers of other languages may hear only one. R is not distinct from L in Japanese.

Phonemes are called the smallest unit of meaningful sound within a language.

[3] phonemic - All languages are 100% phonemic. Differences in sound make up the code. To the extent that a writing system represents the important sound categories of a language, it is also said to be phonemic. Most writing systems are mixes. Pictographic and logographic elements are also included.

Writing in 1891 E.V. Graff presented a phonetic alphabet for 37 elementary sounds. This is the same as the one above except for the addition of hw as in when and where.

QUIZ:

[1] How many phonemes are there in the word brought?

To answer the question, look it up in a pronunciation guide and count the phonograms: IPA b-r-ɔ-t, Unifon brxt, Spanglish brawt, Truespel braut. Spanglish and Truespel are not unigraphic and this can distort the count. Answer: 4.

[2] How many phonemes in the word thorough?

The dictionary says / 'θ e r ou /. This looks like 5. This is a tougher question than the first because it is uncertain whether /ou/ or /əu/ represents one or two phonemes. It is one in Spanish and many other writing systems. English speakers generally pronounce the "long O" as a diphthong but they would understand a monophthong. Unifon TcrO, Spanglish thurro, Truespel thheroe, WS thyyrou. Unifon's one sound per symbol design would suggest that it is probably the best transcription system for easy phoneme counting, were it not for the fact that it uses single letters for diphthongs [I=ai, O=ou, q=au, Q=oi]. Answer: 4 or 5 [with an explanation].
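The counting procedure in both quiz answers can be sketched as a greedy longest-match segmentation. The function name count_phonemes and the idea of passing in a set of multi-letter units are assumptions made for illustration. The count depends entirely on which letter groups the counter agrees to treat as a single phonogram, which is exactly where the disagreement over /ou/ comes from.

```python
def count_phonemes(transcription, multigraphs=()):
    """Count phonograms in a transcription string.

    Any string listed in `multigraphs` is counted as one unit;
    every other character counts as one unit on its own.
    Longer units are tried first (greedy longest match).
    """
    units = sorted(multigraphs, key=len, reverse=True)
    count = 0
    i = 0
    while i < len(transcription):
        for unit in units:
            if transcription.startswith(unit, i):
                i += len(unit)  # consume the whole multi-letter unit
                break
        else:
            i += 1  # no unit matched here: consume a single character
        count += 1
    return count


if __name__ == "__main__":
    print(count_phonemes("brɔt"))              # unigraphic IPA
    print(count_phonemes("braut", ("au",)))    # Truespel, au as one unit
    print(count_phonemes("braut"))             # Truespel, letters counted singly
    print(count_phonemes("θerou", ("ou",)))    # ou as one phoneme
    print(count_phonemes("θerou"))             # ou as two phonemes
```

Counted this way, brought comes out as 4 in the unigraphic IPA form and in Truespel once au is declared a unit, but as 5 if the Truespel letters are counted one by one. Thorough comes out as 4 or 5 depending on whether ou is treated as one unit.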

References.

Avinor, Michael. 2001. Message dated June 4, 2001. Saundspel: The Phonology Forum. [See Links paje.]

Bett, Steve. 1999. Can we pin down the number of phonemes in English? Simpl Speling Newsletter, March 1999, p. 7.

Bett, Steve. 2002. The number of phonemes in English. Longer versions of this article.

Brown, Adam. 2000. The number of phonemes in English: not a simple answer to a simple question. JSSS J27/2000/1, p. 11-13.

Brown, Roger. 1988. Words and Things. Glencoe Free Press.

Wells, John. 1987. SAMPA - speech assessment methods phonetic alphabet. [See Links paje.]


Valuable suggestions and comments received from members of the saundspel forum: particularly Charles Paulson and Drs. Valerie Yule and David Kelley.

