The Extended Speech Assessment Methods Phonetic Alphabet (X-SAMPA)
transcription scheme was created by John C. Wells in 1995 as an
extension to the various language-specific SAMPA phoneme sets available
at that time. The intention was to encode the 1993 IPA chart in ASCII
while preserving the symbols already established by the
language-specific phoneme sets wherever an ASCII character has no
conflicting use.
This document uses a set of phoneme features based on the features
originally developed by Evan Kirshenbaum. These features are described
in the phonemes document.
The _ character is used as a tie bar to join two
phonemes, as well as the first character in several diacritics. It can
be used for affricates, double articulations and diphthongs, but can
lead to ambiguity with the corresponding diacritic. The X-SAMPA PDF
document advises using _ only for affricates and double
articulations. In that case, the _^ non-syllabic marker can be used
for the second vowel in diphthongs.
The PDF version of X-SAMPA describes the - character as a separator
that distinguishes affricates, double articulations, and diphthongs
such as ts from consonant clusters such as t-s.
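
The difference between ts and t-s can be illustrated with a minimal sketch. The helper name split_clusters is hypothetical, and the rule it applies is an assumption made for illustration: a - counts as a separator unless it is part of the retracted diacritic _- or the linking mark -\ listed later in this document.

```python
import re

# A '-' is treated as a separator unless it belongs to the retracted
# diacritic '_-' or to the linking mark (a '-' followed by a backslash).
SEPARATOR = re.compile(r"(?<!_)-(?!\\)")


def split_clusters(transcription: str) -> list[str]:
    """Split a transcription into units at '-' separators (hypothetical helper)."""
    return SEPARATOR.split(transcription)


print(split_clusters("ts"))   # one unit: an affricate
print(split_clusters("t-s"))  # two units: a consonant cluster
```

Under these assumptions, ts stays a single unit while t-s is split into t and s.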
foncxs and fonkirsh are private use extensions defined in the
bcp47-extensions file, so they have the x- private use specifier
before their subtag names.
Consonants
Place columns: blb  lbd  dnt  alv  pla  rfx  alp  pal  vel  uvl  phr  glt
Each place column is split into a vls and a vcd sub-column.
Manner     Symbols
nas        m  F  n  n`  J  N  N\
stp        p  b  t  d  t`  d`  c  J\  k  g  q  G\  >\  ?
sibafr
afr
latafr
sibfrc     s  z  S  Z  s`  z`  s\  z\
frc        p\  B  f  v  T  D  C  j\  x  G  X  R  X\  ?\  h  h\
latfrc     K  K\
apr        v\  r\  r\`  j  M\
latapr     l  l`  L  L\
flp        4  r`
latflp     l\
trl        B\  r  R\  H\  <\
clk        O\  |\  !\  =\
latclk     |\|\
imp        b_<  d_<  J\_<  g_<  G\_<
ejc        p_>  t_>  t`_>  c_>  k_>  q_>  >\_>
ejcfrc     f_>  T_>  s_>  S_>  s`_>  x_>  X_>
latejcfrc  K_>
J\ and ? are defined in the PDF version
of X-SAMPA, but not the HTML version.
G\ is only mentioned in the opening description of
the HTML version of X-SAMPA, but is defined properly in the PDF
version.
>\ is defined as an epiglottal plosive in
X-SAMPA. In this document it is defined as a pharyngeal plosive to be
consistent with the IPA transcription described in the phonemes document.
P can also be used for the labio-dental approximant
v\.
H\ and <\ are defined in X-SAMPA as
epiglottal fricatives. In this document they are defined as pharyngeal
trills to be consistent with the IPA transcription described in the phonemes document.
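
Because several symbols share a prefix (for example r, r\ and r\`), a reader of X-SAMPA text has to prefer the longest symbol that matches. The following is a minimal sketch of that idea, not a description of any particular implementation. The name tokenize is hypothetical, and BASE_SYMBOLS holds only a small illustrative subset of the chart plus the vowel a.

```python
# A few base symbols from the charts in this document (illustrative subset).
BASE_SYMBOLS = {
    "t", "s", "n", "n`", "N", "N\\", "r", "r\\", "r\\`", "|\\", "|\\|\\", "a",
}


def tokenize(text: str) -> list[str]:
    """Split text into base symbols, always taking the longest match."""
    candidates = sorted(BASE_SYMBOLS, key=len, reverse=True)
    tokens = []
    pos = 0
    while pos < len(text):
        for symbol in candidates:
            if text.startswith(symbol, pos):
                tokens.append(symbol)
                pos += len(symbol)
                break
        else:
            # No known symbol starts here (assumption: treat this as an error).
            raise ValueError(f"unknown symbol at position {pos}: {text[pos:]!r}")
    return tokens
```

With this subset, the input r\`an is read as the three symbols r\`, a and n rather than as r followed by unknown characters.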
Other Symbols
Place columns: bld  alv  pla  pal  lbv  vel
Each place column is split into a vls and a vcd sub-column.
Manner     Symbols
nas
stp
afr
vzdfrc     x\
ptrapr     H  W  w
fzdlatapr  5
5 is supported as an alternative to l_e
due to its use in some language-specific SAMPA phoneme sets.
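
The two alternative spellings mentioned in this document (P for v\ and 5 for l_e) could be folded into their primary forms before further processing. This is a minimal sketch assuming the input has already been split into symbols; the names ALIASES and normalize_aliases are hypothetical.

```python
# Alternative symbol -> primary transcription, as described in this document.
ALIASES = {
    "P": "v\\",  # labio-dental approximant
    "5": "l_e",  # alternative used by some language-specific SAMPA sets
}


def normalize_aliases(tokens: list[str]) -> list[str]:
    """Replace each alias with its primary form; other tokens pass through."""
    return [ALIASES.get(token, token) for token in tokens]
```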
Manner of Articulation
Feature  Symbol  Name
ejc      _>      ejective
imp      _<      implosive
Vowels
Backness columns: fnt  cnt  bck
Each backness column is split into an unr and a rnd sub-column.
Height  Symbols
hgh     i  y  1  }  M  u
smh     I  Y  U
umd     e  2  @\  8  7  o
mid     @
lmd     E  9  3  3\  V  O
sml     {  6
low     a  &  A  Q
Other Symbols
Symbol  Features
@`      unr mid cnt rzd vwl
3`      unr lmd cnt rzd vwl
@` and 3` are not explicitly listed in
X-SAMPA. The rhoticized diacritic is specified instead.
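
Since @` and 3` are a base vowel followed by the rhoticized diacritic, their features can be derived instead of listed separately. The sketch below assumes that the base vowels @ and 3 carry the features shown for @` and 3` minus rzd. The names BASE_VOWELS and vowel_features are hypothetical.

```python
# Assumed features of the base vowels (the table entries without rzd).
BASE_VOWELS = {
    "@": {"unr", "mid", "cnt", "vwl"},
    "3": {"unr", "lmd", "cnt", "vwl"},
}


def vowel_features(token: str) -> set[str]:
    """Return the feature set of a vowel, adding rzd for a trailing '`'."""
    if token.endswith("`"):
        return BASE_VOWELS[token[:-1]] | {"rzd"}
    return set(BASE_VOWELS[token])
```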
Diacritics
Articulation
Feature  Symbol  Name
lgl      ◌_N     linguolabial
idt              interdental
         ◌_d     dental
apc      ◌_a     apical
lmn      ◌_m     laminal
         ◌_+     advanced
         ◌_-     retracted
         ◌_"     centralized
                 mid-centralized
         ◌_r     raised
         ◌_o     lowered
The articulations that do not have a corresponding feature name are
recorded using the features of their new location in the consonant or
vowel charts, not using the features of the base phoneme.
Phonation
Feature  Symbol  Name
brv      ◌_t     breathy voice
slv      ◌_0     slack voice
stv      ◌_v     stiff voice
crv      ◌_k     creaky voice
glc      ?_◌     glottal closure
The _0 diacritic is also used to fill the vls spaces in the IPA
consonant charts. Thus, when _0 is used with a vcd consonant that
does not have an equivalent vls consonant, the resulting consonant
is vls, not slv.
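
The rule for _0 can be sketched as a lookup on whether the voiced consonant has a voiceless counterpart. This is only an illustration: VOICELESS_COUNTERPART lists a few assumed pairs rather than the full chart, and the name phonation_with_0 is hypothetical.

```python
# Voiced consonants with a voiceless counterpart (assumed, illustrative subset).
VOICELESS_COUNTERPART = {"b": "p", "d": "t", "g": "k", "z": "s"}


def phonation_with_0(voiced_symbol: str) -> str:
    """Phonation feature of a vcd consonant written with the _0 diacritic."""
    if voiced_symbol in VOICELESS_COUNTERPART:
        return "slv"  # a vls symbol already exists, so _0 marks slack voice
    return "vls"      # no vls symbol exists, so _0 fills the vls slot
```

Under these assumptions b_0 is slv because p exists, while m_0 is vls because the chart has no voiceless nasal symbol.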
Rounding and Labialization
Feature  Symbol  Name
ptr      ◌_w     protruded
cmp              compressed
The ptr (protruded) feature is described as
labialized in X-SAMPA.
The degree of rounding/labialization can be specified using the
following symbols:
Feature  Symbol  Name
mrd      ◌_O     more rounded
lrd      ◌_c     less rounded
Syllabicity
Feature  Symbol  Name
syl      ◌=      syllabic
nsy      ◌_^     non-syllabic
Consonant Release
Feature  Symbol  Name
asp      ◌_h     aspirated
nrs      ◌_n     nasal release
lrs      ◌_l     lateral release
unx      ◌_}     no audible release (unexploded)
Co-articulation
Feature  Symbol     Name
pzd      ◌', ◌_j    palatalized
vzd      ◌_G, ◌_e   velarized
fzd      ◌_?\, ◌_e  pharyngealized
nzd      ◌~, ◌_~    nasalized
rzd      ◌`         rhoticized
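
Several co-articulation features accept more than one spelling, and _e is listed under both vzd and fzd. A lookup keyed by diacritic makes that overlap explicit. This is a minimal sketch; the names COARTICULATION and coarticulation_features are hypothetical, and returning both candidates for _e is an assumption about how a consumer might handle the ambiguity.

```python
# Diacritic -> candidate features, taken from the co-articulation table.
COARTICULATION = {
    "'": {"pzd"},
    "_j": {"pzd"},
    "_G": {"vzd"},
    "_?\\": {"fzd"},
    "_e": {"vzd", "fzd"},  # listed under both features
    "~": {"nzd"},
    "_~": {"nzd"},
    "`": {"rzd"},
}


def coarticulation_features(diacritic: str) -> set[str]:
    """Return the candidate features for a diacritic (empty set if unknown)."""
    return set(COARTICULATION.get(diacritic, set()))
```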
Tongue Root
The tongue root position can be specified using the following
features:
Feature  Symbol  Name
atr      ◌_A     advanced tongue root
rtr      ◌_q     retracted tongue root
Suprasegmentals
Stress
Symbol  Name
"◌      primary stress
%◌      secondary stress
Length
Feature  Symbol  Name
est      ◌_X     extra short
hlg      ◌:\     half-long
lng      ◌:      long
elg      ◌::     extra long
The :: symbol for elg is not listed in
X-SAMPA, but is derived from the transcription for
lng.
Rhythm
Symbol  Name
◌-\◌    linking (no break)
Tones
Symbol  Name
◌_T     extra high tone
◌_H     high tone
◌_M     mid tone
◌_L     low tone
◌_B     extra low tone
!◌      downstep
^◌      upstep
X-SAMPA additionally defines various symbols for contour tones that
can be defined from the composite tone marks.
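
One way to read composite tone marks is to treat a run of level tone marks at the end of a token as a sequence, so that a contour is expressed by its component levels. The sketch below takes that approach as an assumption; it does not cover the dedicated contour symbols that X-SAMPA defines, and the names TONE_MARKS and split_tones are hypothetical.

```python
import re

# Level tone mark -> name, from the tone table in this document.
TONE_MARKS = {
    "_T": "extra high",
    "_H": "high",
    "_M": "mid",
    "_L": "low",
    "_B": "extra low",
}

# One or more level tone marks at the end of the token.
TONE_RUN = re.compile(r"(?:_[THMLB])+$")


def split_tones(token: str) -> tuple[str, list[str]]:
    """Split trailing level tone marks off a token, in order of appearance."""
    match = TONE_RUN.search(token)
    if match is None:
        return token, []
    marks = re.findall(r"_[THMLB]", match.group())
    return token[: match.start()], [TONE_MARKS[mark] for mark in marks]
```

Under this reading, a_H carries a single high tone, while a_B_L is read as extra low followed by low.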