The Arabic Alphabet: A Guide to the Phonology and Orthography of MSA and Lebanese Arabic


In this post, we introduce the Arabic alphabet as well as the phonemes (sounds) and orthography (writing conventions) of the Arabic language. This covers everything you need to know to get started with reading and pronouncing Arabic correctly. Our point of reference is Modern Standard Arabic (MSA), i.e. the standardized literary language common to the whole Arab world, but we also discuss variations in colloquial Lebanese Arabic.

The transcription conventions used on this website are indicated in the course of our discussion. If you are looking for a concise presentation of this information, please see the key to our transcription system.

Arabic Script

The Arabic language, along with a number of other languages (e.g. Farsi, Pashto, Sorani, Urdu), is written with the Arabic script. This script is written from right to left in a cursive style, in print as well as in handwriting. Letter case does not exist, i.e. there is no distinction between upper-case and lower-case letters.

An Arabic edition of One Thousand and One Nights, showing the beginning of the story “The Fisherman and the Jinni”

In Arabic script, letters take different shapes depending upon their position in the word and whether they are connected to a preceding letter. All letters can connect from the right side (i.e. to the preceding letter), but some do not connect from the left side (i.e. to the subsequent letter). Therefore, every letter may be classified either as a connector, i.e. a letter that connects from both sides, or as a non-connector, i.e. a letter that does not connect to the subsequent letter. Most letters are connectors; there are only six non-connectors.

Connectors have four shapes:

  • Independent: not connected to any other letter
  • Initial: connected to the subsequent letter only
  • Medial: connected to the preceding and subsequent letters
  • Final: connected to the preceding letter only

Shown below, as an example, are the four shapes of the Arabic letter bāĀ.

Final Medial Initial Independent
ـب ـبـ بـ ب

Non-connectors have two shapes:

  • Independent: not connected to any other letter
  • Final: connected to the preceding letter only

Shown below are the two shapes of the Arabic letter Āalif, which is one of the six non-connectors.

Final Medial Initial Independent
ـا ـ ـ ا

It is important to distinguish between the shape a letter takes (independent, initial, medial, final) and the position of that letter in a word (word-initial, word-medial, word-final), as the two will not necessarily coincide. For example, a connector will take an initial shape in word-medial position if it follows a non-connector, and a non-connector will take an independent shape in word-final position if it follows another non-connector.
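
The connection rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not a rendering engine: the function name `letter_shapes` and the constants are hypothetical. It assumes that the six non-connectors are Āalif, dāl, ḏāl, rāĀ, zāy and wāw, that hamzaŧ-bearing variants of Āalif and wāw join in the same way, and that a hamzaŧ written on the line (introduced later in this guide) joins to nothing.

```python
import unicodedata

# Assumption: the six non-connectors, plus hamza-bearing variants of alif and waw.
NON_CONNECTORS = set("ادذرزو") | set("أإآؤ")
HAMZA_ON_LINE = "ء"  # assumed to join to nothing on either side

SHAPE_NAMES = {
    (False, False): "independent",
    (False, True): "initial",
    (True, True): "medial",
    (True, False): "final",
}


def letter_shapes(word):
    """Return (letter, shape) pairs for one Arabic word."""
    # Drop diacritics (combining marks) so that only letters remain.
    letters = [ch for ch in word if not unicodedata.combining(ch)]
    result = []
    for i, ch in enumerate(letters):
        prev = letters[i - 1] if i > 0 else None
        nxt = letters[i + 1] if i + 1 < len(letters) else None
        # A letter joins backwards only if the previous letter connects forwards.
        joins_prev = (prev is not None and prev not in NON_CONNECTORS
                      and HAMZA_ON_LINE not in (prev, ch))
        # A letter joins forwards only if it is itself a connector.
        joins_next = (nxt is not None and ch not in NON_CONNECTORS
                      and HAMZA_ON_LINE not in (ch, nxt))
        result.append((ch, SHAPE_NAMES[(joins_prev, joins_next)]))
    return result


for letter, shape in letter_shapes("دبس"):
    print(letter, shape)
```

In the sample word, the connector ب sits in word-medial position but takes its initial shape, because it follows the non-connector د.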

The Arabic Alphabet

Overview

There are 28 letters in the Arabic alphabet, all of which represent consonants. Three letters can also represent long vowels in certain contexts, namely Āalif (ا), wāw (و), and yāĀ (ي). Short vowels are not part of the alphabet. Most of the letters are arranged in groups of two or three with similar shapes, and are distinguished only by the presence and placement of small dots above or below the basic structure of the letter.

The Arabic language is written not only with the letters of the alphabet, but also with a number of characters which are not considered part of the alphabet. These include variant spellings, short vowel markers, various other markers of pronunciation and grammar, and a peculiar shape-shifting consonant called hamzaŧ. Some of these characters are required for correct writing, while others are optional in most texts and seldom written.

Arabic writing is highly phonemic, i.e. there is a high degree of consistency between the letters and characters of the language and their corresponding sounds.

Arabic Alphabet Chart

The chart below shows the names and shapes of the letters of the Arabic alphabet, along with the symbols used on this website to represent these letters in transcription.


Arabic Alphabet
Name Symbol Final Medial Initial Independent
Āalif Ā (for hamzaŧ*) or long vowel ā ـا ـ ـ ا
bāĀ b ـب ـبـ بـ ب
tāĀ t ـت ـتـ تـ ت
θāĀ θ ـث ـثـ ثـ ث
jīm j, g ـج ـجـ جـ ج
HāĀ H ـح ـحـ حـ ح
ЌāĀ Ќ ـخ ـخـ خـ خ
dāl d ـد ـ ـ د
ḏāl ḏ ـذ ـ ـ ذ
rāĀ r ـر ـ ـ ر
zāyn z ـز ـ ـ ز
sīn s ـس ـسـ سـ س
šīn š ـش ـشـ شـ ش
Sād S ـص ـصـ صـ ص
Dād D ـض ـضـ ضـ ض
TāĀ T ـط ـطـ طـ ط
ẒāĀ Ẓ ـظ ـظـ ظـ ظ
3ayn 3 ـع ـعـ عـ ع
ğayn ğ ـغ ـغـ غـ غ
fāĀ f ـف ـفـ فـ ف
qāf q ـق ـقـ قـ ق
kāf k ـك ـكـ كـ ك
lām l ـل ـلـ لـ ل
mīm m ـم ـمـ مـ م
nūn n ـن ـنـ نـ ن
hāĀ h ـه/ه ـهـ هـ هـ
wāw w or long vowel ū ـو ـ ـ و
yāĀ y or long vowel ī **ـي ـيـ يـ ي

* Note that Āalif is not an independent consonant in itself, but is often used to represent the consonant hamzaŧ (ء), which is not traditionally included in the alphabet. We introduce the hamzaŧ below.

** In Egypt, final yāĀ is generally written without the dots.


Arabic Consonants

Pronunciation Chart

The chart below indicates how each Arabic letter is pronounced in MSA when used as a consonant. (Remember that Āalif (ا), wāw (و), and yāĀ (ي) can also be used as vowels; this will be discussed later.)

Some of these letters (in particular ظ, ذ, ث and ق) are regularly pronounced differently in spoken Lebanese Arabic than they are in MSA. These pronunciation variations will be discussed below.

Symbol Pronunciation Example Letter
Ā glottal stop; like “-” in “uh-oh”, “t” in the informal pronunciation of “football”, or the sound preceding a word beginning with a vowel, as in “at”, “in” or “out” أَسْماء أ/ء
b like “b” in “band” بَلَد ب
t like “t” in “tan” تَلّ ت
θ like “th” in “thin” ثَمَن ث
j, g like “g” in “regime”, or “s” in “leisure” (like “j” in “jam” in Gulf countries, and like “g” in “gate” in Egypt) جَبَل ج
H no English equivalent; produced by saying “ha” while constricting throat muscles حَسَب ح
Ќ like “ch” in German “nacht” or Scottish “loch” خَطّ خ
d like “d” in “dance” دَرْس د
ḏ like “th” in “that” ذَنَب ذ
r ranges in quality from a trill, like “r” in Spanish “carro”, to a tongue flap against the roof of the mouth, similar to the “t” in “metal” or “butter” in informal American English رَبّ ر
z like “z” in “zoo” زَميل ز
s like “s” in “sat” سَفَر س
š like “sh” in “shine” شَرْشَف ش
S emphatic counterpart of س, similar to the “s” in “sauce”, “sauna”, or “sob” صَيْف ص
D emphatic counterpart of د, similar to the initial “d” in “dawdle” or “dawn” ضَمير ض
T emphatic counterpart of ت, similar to the “t” in “taught” طَويل ط
Ẓ emphatic counterpart of ذ, similar to the “th” in “though” ظَريف ظ
3 no English equivalent; produced by saying “ah” while constricting throat muscles عَرَب ع
ğ similar to the French “r” in “Paris”, or to the sound made when gargling غَريب غ
f like “f” in “fish” فِلفُل ف
q similar to the “c” in “caught” but articulated even further back in the mouth قَريب ق
k like “k” in “kite” كُتُب ك
l like “l” in “land” لَبَن ل
m like “m” in “man” مَكْتَب م
n like “n” in “now” نَجاح ن
h like “h” in “hat” هَدَف هـ
w like “w” in “win” وَلَد و
y like “y” in “yes” يَمين ي

Additional Consonants

There are two additional consonants which are not part of the alphabet proper: hamzaŧ (ء) and tāĀ marbūTaŧ (ة).

hamzaŧ (ء)

The hamzaŧ is a character representing the glottal stop, a sound which is produced by blocking and then releasing the airflow in the vocal tract. There is no letter representing the glottal stop in English, although the sound itself is common. For example, it occurs between the two syllables of “uh-oh”, and it often replaces “t” in the informal pronunciation of words like “fountain”, “cotton”, “network”, “football”, “caught”, “right”, “cat”, etc. It is also articulated at the beginning of any English word starting with a vowel, such as “at”, “in” or “out”.

The glottal stop is a distinct consonant phoneme in Arabic; however, the hamzaŧ itself is not an independent letter. It is written in various ways, depending upon its position in the word as well as upon the surrounding vowels. The rules for writing the hamzaŧ in Arabic script are complex, and will not be discussed here in any detail. The important thing for now is to learn to recognize the hamzaŧ in all of its various forms.

The basic shape of the hamzaŧ resembles the letter “c” with a short tail sloping down to the left. When it is not part of a word, this symbol is always written by itself “on the line”: ء. It is sometimes also written this way when it is part of a word, although never in word-initial position.

Examples:

questioning musāĀalaŧ
مُساءَلة
chivalry murūĀaŧ
مُروءَة
innocent barīĀ
بَريء
filling milĀ
مِلْء

Note that ء does not connect to any letter, whether before or after. Thus, a connector immediately preceding ء will take its independent or final shape, and not its initial or medial shape. (For example, yāĀ takes its independent rather than initial shape in بَريء, and lām takes its final rather than medial shape in مِلْء.) This is the only time this happens with connectors which are not at the end of a word.

More often than not, a hamzaŧ which is part of a word is not written on the line, but instead appears in miniature on one of three letters: Āalif (أ or إ), wāw (ؤ), or yāĀ without the dots (ئ). At the beginning of a word, hamzaŧ cannot be written on the line nor can it take any seat other than Āalif. It sits either above or below the Āalif (أ or إ), depending upon the following vowel. It sits above if it is followed by either of the vowels “a” (fatHaŧ) or “u” (Dammaŧ), and below if followed by “i” (kasraŧ).

Examples of word-initial hamzaŧ:

rabbit Āarnab
أَرْنَب
week Āusbū3
أُسْبوع
gospel Āinjīl
إِنْجيل
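
The rule for word-initial hamzaŧ can be written as a small lookup. The sketch below is only an illustration (the function name `initial_hamza_seat` is hypothetical), and it assumes the following vowel is given as one of the short-vowel symbols used in this guide.

```python
def initial_hamza_seat(vowel):
    """Return the written form of a word-initial hamza for the short vowel after it."""
    if vowel in ("a", "u"):   # fatHa or Damma: hamza sits above the alif
        return "\u0623"       # أ
    if vowel == "i":          # kasra: hamza sits below the alif
        return "\u0625"       # إ
    raise ValueError(f"expected 'a', 'u' or 'i', got {vowel!r}")


# The three example words above begin with "a", "u" and "i" respectively.
for vowel in ("a", "u", "i"):
    print(vowel, initial_hamza_seat(vowel))
```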

In the middle and at the end of words, hamzaŧ is either written on the line (ء) or it sits on an Āalif, wāw, or yāĀ. When it sits on an Āalif, it is always above (أ), never below. The following table shows all the possible seats of hamzaŧ in word-medial and word-final position:

Final Medial Initial Independent
ـأ ـ ـ أ
ـؤ ـ ـ ؤ
ـئ ـئـ ئـ ئ

Note that the same rules of connection apply to these letters whether they are long vowels, independent consonants, or seats for hamzaŧ.

Examples of word-medial and word-final hamzaŧ seats:

head raĀs
رَأْس
matter šaĀn
شَأْن
issue masĀalaŧ
مَسْأَلَة
to arise našaĀa
نَشَأَ
vision ruĀyā
رُؤْيا
question suĀāl
سُؤال
responsible masĀūl
مَسْؤول
slowness tabāTuĀ
تَباطُؤ
chief, boss raĀīs
رَئيس
heating tadfiĀaŧ
تَدْفِئَة
bad sayyiĀ
سَيِّئ
emergencies TawāriĀ
طَوارِئ
seashore šāTiĀ
شاطِئ

In certain situations, an Āalif takes a hamzaŧ but the miniature hamzaŧ symbol does not actually appear on the Āalif. This occurs with Āalif maddaŧ (آ), which will be discussed later. It also occurs with an Āalif which takes a type of hamzaŧ known as hamzat lwaSl. This hamzaŧ, which is only present in word-initial position, is unwritten and is pronounced only when the word begins a sentence or utterance.

Regardless of how hamzaŧ is written in Arabic script, we always transcribe it (when pronounced) as Ā. We do not transcribe hamzat lwaSl when it is not pronounced.

Learn more: Writing and Pronouncing the Hamza (ء): A Guide for the Perplexed

tāĀ marbūTaŧ (ة)

Literally meaning the “tied tāĀ”, the tāĀ marbūTaŧ is a form of the letter ت that exists only at the end of words, and is always preceded by a vowel. In MSA, this vowel is usually a short “a” (fatHaŧ) but occasionally a long “ā” (Āalif). The tāĀ marbūTaŧ usually serves to mark a feminine noun or adjective.

The tāĀ marbūTaŧ is written like the letter ه with two dots above. Since it appears only at the end of words, however, it has just two shapes: an independent shape if it follows a non-connector, and a final shape if it follows a connector.

Final Independent
ـة ة

The tāĀ marbūTaŧ is sometimes silent and sometimes pronounced, depending upon several factors including the location and function of the word in the sentence, as well as whether the speaker is using a dialect or MSA. In fully vocalized MSA, it is pronounced like ت unless followed by a pause, such as at the end of a sentence, in which case it is not pronounced. Note, however, that the standard in classical Arabic and Quranic recitation is for pre-pausal tāĀ marbūTaŧ to be pronounced like a soft ه.

When it is not pronounced, we transcribe tāĀ marbūTaŧ as ŧ.

Bear in mind that tāĀ marbūTaŧ changes to a regular tāĀ (ت) if a suffix is added to the word.

Examples (MSA pronunciation):

school madrasaŧ
مَدْرَسَة
our school madrasatnā
مَدْرَسَتْنا
picture Sūraŧ
صورَة
my picture Sūratī
صورَتي
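
The spelling change before a suffix is mechanical, so it can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `attach_suffix` is hypothetical, and both the word and the suffix are assumed to be unvocalized (written without short vowel marks).

```python
TA_MARBUTA = "\u0629"  # ة
TA = "\u062a"          # ت


def attach_suffix(word, suffix):
    """Attach a suffix, turning a final ta marbuta into a regular ta first."""
    if word.endswith(TA_MARBUTA):
        word = word[:-1] + TA
    return word + suffix


print(attach_suffix("صورة", "ي"))    # my picture
print(attach_suffix("مدرسة", "نا"))  # our school
```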

In Lebanese Arabic, tāĀ marbūTaŧ is generally not pronounced at all unless the word is in the construct state, in which case it is pronounced ت. There are some exceptions, however. When the tāĀ marbūTaŧ is preceded by the long vowel Āalif, it is usually pronounced.

Examples (Lebanese Arabic):

life Hayēt
حَياة
pedestrians mušēt
مُشاة
channel qanāt
قَناة

As mentioned above, the tāĀ marbūTaŧ is always preceded by a vowel. This vowel – usually short, but occasionally long (as in the above examples) – is always pronounced, even when the tāĀ marbūTaŧ itself is not. In MSA, the short vowel is always “a” (fatHaŧ) and the long vowel is always “ā” (Āalif). In Lebanese Arabic, the short vowel is either “a” (fatHaŧ) or “e” (kasraŧ), while the long vowel is always Āalif but with its pronunciation varying between “ā” and “ē”, depending on the preceding consonant. With the notable exception of ر, consonants which are produced in the front of the mouth (ي ,و ,ن ,م ,ل ,ك ,ف ,ش ,س ,ز ,ذ ,د ,ج ,ث ,ت ,ب) take the front vowel sound, i.e. “e” or “ē”, before the tāĀ marbūTaŧ, while consonants which are produced in the back of the mouth and in the throat (هـ ,ق ,غ ,ع ,ظ ,ط ,ض ,ص ,خ ,ح ,ء) take the back vowel sound, i.e. “a” or “ā”, before the tāĀ marbūTaŧ.

Examples (Lebanese Arabic pronunciation):

Front Vowel: building binēyeŧ
بِنايِة
Front Vowel: school madraseŧ
مَدْرَسِة
Front Vowel: clean (f.) nDīfeŧ
نْضيفِة
Front Vowel: doctor (f.) Tabībeŧ
طَبيبِة
Back Vowel: language luğaŧ
لُغَة
Back Vowel: university jēm3aŧ
جامْعَة
Back Vowel: bag šanTaŧ
شَنْطَة
Back Vowel: fine (f.) mnīHaŧ
مْنيحَة

Note that ر is usually followed by the back vowel sound. There are, however, some common exceptions, two of which are shown in the following examples:

Examples (Lebanese Arabic pronunciation):

Back Vowel: car siyyāraŧ
سِيّارَة
Back Vowel: picture Sūraŧ
صورَة
Front Vowel: big (f.) kbīreŧ
كْبيرِة
Front Vowel: small (f.) Sğīreŧ
صْغيرِة
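
The front/back pattern for the short vowel before tāĀ marbūTaŧ in Lebanese Arabic can be sketched as a lookup on the preceding consonant. This is a rough illustration only: the function name `ta_marbuta_vowel` is hypothetical, and ر is treated here as taking the back vowel by default, even though common exceptions exist (as in the last two examples above).

```python
# Consonant groups as listed in the discussion above.
FRONT = set("بتثجدذزسشفكلمنوي")
BACK = set("ءحخصضطظعغقه")


def ta_marbuta_vowel(preceding):
    """Return the usual Lebanese short vowel ('e' or 'a') before ta marbuta."""
    if preceding in FRONT:
        return "e"
    if preceding in BACK:
        return "a"
    if preceding == "ر":
        # Usually the back vowel; some common words take "e" instead.
        return "a"
    raise ValueError(f"unexpected consonant: {preceding!r}")


print(ta_marbuta_vowel("س"))  # as in madraseŧ
print(ta_marbuta_vowel("غ"))  # as in luğaŧ
```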

 

Difficult Consonants

Most Arabic consonants have English equivalents and their pronunciation should not pose any problem for English speakers. The others will require some practice to learn correctly. Particular attention should be paid to the proper pronunciation of consonants which may sound similar to a non-native ear but nevertheless constitute distinct phonemes in Arabic.

Emphatic Consonants

It is important to distinguish between emphatic consonants (also known as velarized or pharyngealized consonants) and their non-emphatic counterparts. The point of articulation is the same for both, but the emphatic consonants are produced with the edges of the tongue tensed and raised at the sides of the mouth while the middle part of the tongue is pulled back, thus forming a “u” shape in the middle of the mouth. Emphatic consonants deepen the sound of surrounding vowels and even other consonants. Beginning students of Arabic are advised to pay close attention to these sounds in order to learn how to recognize and pronounce the emphatic consonants properly.

Examples:

ص س ض د
صَعيد سَعيد ضَرّ دَرّ
حاصِد حاسِد تَـحَضُّر تَـحَدُّر
حَرَص حَرَس رُضوض رُدود

ط ت ظ ذ
طاء تاء ظَليل ذَليل
يُرَطِّب يُرَتِّب مَحْظور مَحْذور
حاط حات بَظّ بَذّ

ح and ه

ه is the same as the English “h” in words like “hat” and “hand”. ح is a much sharper sound. It is produced by pushing air from the throat while tightly constricting the throat muscles.

ه ح
هُروب حُروب
تَهْديد تَحْديد
نَبيه نَبيح

ء and ع

ع is articulated while the throat muscles are tightly constricted. In this respect it is like ح; the two differ, however, in that ح is voiceless (like a whisper) and ع is voiced.

Arabic learners sometimes have difficulty at first distinguishing between ع and ء. Keep in mind that with ء there is a complete stop of airflow in the vocal tract, which results in an absence of sound, while with ع the vocal cords are vibrating as the throat muscles are constricted, which results in the “strangled” sound characteristic of this consonant.

Examples:

ع ء
عَمَل أَمَل
سُعال سُؤال
فاجِع فاجِئ

ث and ذ

Both of these consonant sounds exist in English, but unlike Arabic they are not distinguished orthographically. ث is voiceless, pronounced like the “th” in “thin” and “thought”, and ذ is voiced, pronounced like the “th” in “that” and “then”.

Examples:

ذ ث
ذَوْب ثَوْب
مُتَعَذِّر مُتَعَثِّر
غَذّ غَثّ

ق and ك

ك is the same as the English “k” or “c” in words like “kite” and “cat.” ق is a similar clicking sound, but produced with the tongue in the very back of the mouth.

Examples:

ك ق
كَلْب قَلْب
تَكْرير تَقْرير
حَكّ حَقّ

Lebanese Arabic Consonant Variations

Four Arabic consonants – ظ, ذ, ث and ق – are regularly pronounced differently in spoken Lebanese than they are in MSA. As it is relatively simple to adapt to their Lebanese pronunciations, we generally preserve the original spelling of these consonants in Arabic script when representing the Lebanese pronunciation variants. However, there are exceptions in the case of ث and ذ, as noted below. Our aim is to preserve the connection to MSA spelling as much as possible while ensuring that the Lebanese pronunciation can be elicited from a reading of the Arabic script alone.

The Pronunciation and Spelling of ث in Lebanese Arabic

In Lebanese Arabic, ث is usually pronounced like س, i.e. “s”. For example, ثَوْرَة is usually pronounced sawraŧ in Lebanese Arabic. In a limited number of words, this consonant is regularly pronounced like ت, i.e. “t”. For example, ثوم is pronounced tūm in Lebanese Arabic.

We preserve the ث spelling in Arabic script when this letter is pronounced like س (as it is in most words). However, when it is pronounced like ت, we write this letter as it is pronounced. Thus, for example, we preserve the ث in ثَوْرَة regardless of whether this word is pronounced θawraŧ (MSA) or sawraŧ (Lebanese Arabic). However, the ث in ثوم is replaced with ت when we are representing Lebanese pronunciation (i.e. tūm); ث is preserved only if we are representing MSA pronunciation (i.e. θūm).

Examples:

MSA: revolution θawraŧ
ثَوْرَة
Lebanese: sawraŧ
MSA: chandelier θurayyaŧ
ثُرَيّا
Lebanese: surayyaŧ
MSA: clothing θiyāb
ثِياب
Lebanese: tyēb
تْياب
MSA: garlic θūm
ثوم
Lebanese: tūm
توم

The Pronunciation and Spelling of ذ in Lebanese Arabic

In Lebanese Arabic, ذ is usually pronounced like ز, i.e. “z”. For example, تَذْكَرَة is usually pronounced tazkaraŧ in Lebanese Arabic. In a limited number of words, this consonant is regularly pronounced like د, i.e. “d”. For example, ذَهَب is pronounced dahab in Lebanese Arabic.

We preserve the ذ spelling in Arabic script when this letter is pronounced like ز (as it is in most words). However, when it is pronounced like د, we write this letter as it is pronounced. Thus, for example, we preserve the ذ in تَذْكَرَة regardless of whether this word is pronounced taḏkaraŧ (MSA) or tazkaraŧ (Lebanese Arabic). However, the ذ in ذَهَب is replaced with د when we are representing Lebanese pronunciation (i.e. dahab); ذ is preserved only if we are representing MSA pronunciation (i.e. ḏahab).

Examples:

MSA: ticket taḏkaraŧ
تَذْكَرَة
Lebanese: tazkaraŧ
MSA: intelligent ḏakī
ذَكي
Lebanese: zaké
MSA: gold ḏahab
ذَهَب
Lebanese: dahab
دَهَب
MSA: to take ĀaЌaḏ
أَخَذ
Lebanese: ĀaЌad
أَخَد

The Pronunciation and Spelling of ظ in Lebanese Arabic

In Lebanese Arabic, ظ is usually pronounced as an emphatic ز, like the “z” in “zone”. However, we preserve the ظ spelling in Arabic script.

Examples:

MSA: luck HuẒūẒ
حُظوظ
Lebanese: HuZūZ
MSA: envelope Ẓarf
ظَرْف
Lebanese: Zarf
MSA: shrapnel šaẒāyā
شَظايا
Lebanese: šaZāyā
MSA: view manẒar
مَنْظَر
Lebanese: manZar

The Pronunciation and Spelling of ق in Lebanese Arabic

In Lebanese Arabic, ق is usually pronounced like ء, i.e. the glottal stop. However, we preserve the ق spelling in Arabic script.

Examples:

MSA: rights Huqūq
حُقوق
Lebanese: HuĀūĀ
MSA: report taqrīr
تَقْرير
Lebanese: taĀrīr
MSA: blond Āašqar
أَشْقَر
Lebanese: ĀašĀar
MSA: investigation taHqīq
تَـحْقيق
Lebanese: taHĀīĀ

A notable exception to the usual Lebanese pronunciation of qāf as a glottal stop is found among the Druze, particularly those living in rural areas (in Lebanon, this is primarily in the Chouf Mountains), who tend to preserve the MSA pronunciation of ق, i.e. “q”.

Even those who normally pronounce qāf as a glottal stop will pronounce it as “q” when it occurs in certain proper names or religious words, or some words that contain ا and ق together.

Examples:

Quran ĀalqurĀān
القُرْآن
Qatar qaTar
قَطَر
Qadisha qādīšā
قاديشا
friends ĀaSdiqāĀ
أَصْدِقاء
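
The four consonant shifts described in this section can be summarized as a mapping over the transcription symbols used in this guide. The sketch below is illustrative only. The function name `lebanese_consonants` and its two flags are hypothetical, and the caller is assumed to know whether a given word belongs to the limited set that takes “t”/“d”, or to the set that keeps “q”. Only consonants are converted; vowel changes (e.g. θiyāb becoming tyēb) are outside its scope.

```python
import unicodedata

# MSA transcription symbol -> usual Lebanese value.
USUAL = {
    "\u03b8": "s",   # θ (ث) -> s
    "\u1e0f": "z",   # ḏ (ذ) -> z
    "\u1e92": "Z",   # Ẓ (ظ) -> emphatic z
    "q": "\u0100",   # q (ق) -> Ā (glottal stop)
}


def lebanese_consonants(word, stop_variant=False, keep_q=False):
    """Convert the consonants of an MSA transcription to their Lebanese values.

    stop_variant: the word belongs to the limited set pronounced with t/d.
    keep_q: the word keeps q (certain proper names, religious words, etc.).
    """
    word = unicodedata.normalize("NFC", word)
    table = dict(USUAL)
    if stop_variant:
        table["\u03b8"] = "t"
        table["\u1e0f"] = "d"
    if keep_q:
        del table["q"]
    return word.translate(str.maketrans(table))


print(lebanese_consonants("taqr\u012br"))                  # report
print(lebanese_consonants("\u1e0fahab", stop_variant=True))  # gold
print(lebanese_consonants("qaTar", keep_q=True))           # Qatar
```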

Additional Consonant Sounds in Lebanese Arabic

When pronouncing certain words of foreign origin, speakers of Lebanese Arabic often use consonant sounds which are not typical of the dialect. For example, “p” is used in pāspōr (“passport”) and kōmpyūtar (“computer”), “g” in Āinglīzé (“English”) and sigārā (“cigarette”), and “v” in brāvō (“bravo”) and sayyav (“to save”). We transcribe these sounds as they are pronounced, while writing them with the Arabic letter to which they approximate.

Symbol Pronunciation (Lebanese) Letter
p like “p” in “passport” ب
g like “g” in “English” ج
v like “v” in “bravo” ف

Silent Consonants in Lebanese Arabic

Aside from the previously discussed tāĀ marbūTaŧ, there are two consonants which are frequently silent in Lebanese Arabic: hamzaŧ (ء) and hāĀ (ه).

Word-final ء preceded by Āalif is often silent in Lebanese Arabic, in which case we transcribe it as å. Note also that the Āalif in such words is pronounced with a short vowel sound instead of its usual long vowel sound (i.e. “a” instead of “ā”); nevertheless, we transcribe it as a long vowel.

Examples:

lunch ğadāå
غَداء
evening masāå
مَساء

ه is sometimes silent in Lebanese Arabic, in which case we transcribe it as ħ.

Examples:

his book ktēbuħ
كْتابُه
they have 3indħun
عِنْدْهُن

Arabic Vowels

Overview

Arabic has only three vowels (“a”, “i”, “u”), each of which can be either short or long. Long vowels are pronounced for about twice as long as short vowels.

Although there are relatively few distinct vowels in Arabic, the quality of their pronunciation does admit of a considerable amount of variation, especially in the case of the “a” sound. One significant factor is the proximity of one or more of the emphatic consonants (ظ ,ط ,ض ,ص) or ق, which has the effect of deepening the surrounding vowel sounds. Pronunciation of vowels can also vary from one dialect region to another, and this can affect vowel quality even in MSA. These variations are not generally of any semantic significance, however. They are what linguists call allophones, i.e. different sounds which may be used interchangeably without changing the meaning of a word, as opposed to phonemes, which are the most basic units of sound capable of producing a distinctive meaning.

We make no attempt to catalogue the full array of allophonic vowel variations in our discussion below. However, we do highlight some vowel sounds which do not exist in MSA but are regularly produced by speakers of Lebanese Arabic in certain phonetic contexts. These vowel sounds are represented in our transcription system, and should be adopted by learners of Lebanese Arabic.

Short Vowels and sukūn

Short vowels are not represented in Arabic script with letters, but with diacritical marks called Harakāt / حَرَكات (literally: “motions”), which are placed above or below the letter they follow. Words or text containing these marks are said to be vocalized. The marks are omitted in most written material except where they are needed to resolve ambiguity; they are regularly written only in the Quran (where they are required) and in children’s books, religious books, and grammar books, in order to ensure that words are pronounced correctly. Therefore, if you intend to read Arabic, you must learn to recognize and pronounce words without relying upon them.

An excerpt from a vocalized edition of the Arabic classic Kalila wa-Dimna (Cairo: Bulaq Press, 1937)
Name Symbol Pronunciation Diacritical Mark
fatHaŧ a varies from “a” in “tap” or “bat” to “o” in “top” or “not” ـَ
Dammaŧ u varies from “o” in “to” or “move” to “u” in “put” or “oo” in “good” ـُ
kasraŧ i varies from “e” in “me” or “evil” to “i” in “tip” or “bit”* ـِ
sukūn (no symbol) the sukūn indicates that the consonant is not followed by a vowel but forms a consonant cluster with the subsequent letter, like “s” in “stem” or “r” in “burn” ـْ

* In Lebanese Arabic, kasraŧ is sometimes also pronounced like the French “é” in “résumé”, as discussed below.
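
To see what a vocalized word looks like once the marks are left out, as they are in most texts, the four marks in the table above can be stripped programmatically. This is a minimal sketch: the name `strip_harakat` is hypothetical, and it removes only fatHaŧ, Dammaŧ, kasraŧ and sukūn, leaving every other character untouched.

```python
# Unicode code points of the four marks in the table above.
HARAKAT = {
    "\u064e",  # fatHa
    "\u064f",  # Damma
    "\u0650",  # kasra
    "\u0652",  # sukun
}


def strip_harakat(text):
    """Return the text with short vowel marks and sukun removed."""
    return "".join(ch for ch in text if ch not in HARAKAT)


vocalized = "\u0645\u064e\u0643\u0652\u062a\u064e\u0628"  # maktab (office), fully vocalized
print(vocalized, strip_harakat(vocalized))
```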

Long Vowels

Long vowels are represented by the letters Āalif (ا), wāw (و), and yāĀ (ي).

Name Symbol Pronunciation Letter
Āalif ā varies from a lengthened “a” in “magic” or “sadness” to a lengthened “a” in “father” or “talking”* ا
wāw ū like a lengthened “oo” in “food” or “u” in “rule”** و
yāĀ ī like a lengthened “ee” in “seem” or “i” in “machine”*** ي

* In Lebanese Arabic, Āalif is sometimes also pronounced like a lengthened “e” in “best”, as discussed below.
** In Lebanese Arabic, wāw is sometimes also pronounced like “o” in “hole”, as discussed below.
*** In Lebanese Arabic, yāĀ is sometimes also pronounced like the French “é” in “résumé”, as discussed below.

Lebanese Arabic Vowel Variations

The Pronunciation and Transcription of word-final kasraŧ in Lebanese Arabic

In Lebanese Arabic, kasraŧ at the end of a word is pronounced like the French “é” in “résumé”. Note that this is essentially the same as the usual Lebanese pronunciation of the yāĀ (ي) in certain phonetic contexts (see below). However, we transcribe word-final kasraŧ as e and the homophonous yāĀ as é. We use two different symbols to represent one and the same sound because it is important for grammatical reasons to distinguish between kasraŧ and yāĀ.

Notice the difference in sound quality between word-medial and word-final kasraŧ in the following words.

مِن مِش بِنْت
min miš bint

فيكِ هِنِّ هُوِّ
fīke hinne huwwe

The Pronunciation and Transcription of ا in Lebanese Arabic

In Lebanese Arabic, the pronunciation of Āalif varies between “ā” (like a lengthened “a” in “father”) and “ē” (like a lengthened “e” in “best”), depending primarily upon the surrounding consonants but to some extent also the accent of the speaker. By following the principles outlined below, it is possible to predict with a fairly high degree of accuracy how Āalif will be pronounced in a given word in Lebanese Arabic. Unfortunately, these cannot be taken as hard and fast rules (except where indicated otherwise); however, they will apply in most cases.

It should also be noted that word-final Āalif is typically given the length of a short vowel (i.e. “a”), and thus sounds much the same as a fatHaŧ.

Āalif is pronounced “ā”:

  • when it is followed by a terminal hamzaŧ (ء), or when it is itself the final letter. There are no exceptions to this rule.

    Examples:

    friends ĀaSdiqāĀ
    أَصْدِقاء
    doctors ĀaTibbāĀ
    أطِبّاء
    we niHnā
    نِحْنا
  • when it is preceded or followed by an emphatic consonant (ظ ,ط ,ض ,ص). There are no exceptions to this rule.

    Examples:

    student Tālib
    طالِب
    concierge nāTūr
    ناطور
    friend; boyfriend SāHib
    صاحِب
  • when it is preceded by any other consonant produced in the back of the mouth and in the throat (i.e. ه ,ك ,ق ,غ ,ع ,خ ,ح ,ء) and often ر as well.

    Examples:

    condition Hāl
    حال
    dictionary Āāmūs
    قاموس
    prices Āas3ār
    أسْعار

Āalif is pronounced “ē”:

  • when it is preceded by a consonant produced in the front of the mouth – with the frequent exception of ر (i.e. ي ,و ,ن ,م ,ل ,ف ,ش ,س ,ز ,د ,ج ,ت ,ب) – and not followed by an emphatic consonant.

    Examples:

    teacher Āistēz
    إسْتاذ
    public square sēHaŧ
    ساحَة
    walking mēšé
    ماشي

Notice that ر does not fit neatly into these rules. It is produced in the front of the mouth and it does indeed behave like a front consonant in some words, giving the Āalif an “ē” sound (e.g. rēkib / راكِب, šēri3 / شارِع); however, in other words it gives the Āalif an “ā” sound even though there are no back consonants in the word (e.g. rāyiH / رايِح, jār / جار).

Examples:

riding rēkib
راكِب
street šēri3
شارِع
going rāyiH
رايِح
neighbour jār
جار
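
The principles above can be combined into a rough predictor. This is only a sketch of those tendencies, not a definitive rule set. The function name `alif_quality` is hypothetical, the word is assumed to be written without diacritics, and `None` is returned whenever ر is adjacent and no stronger principle applies, since ر does not behave predictably.

```python
LONG_A = "\u0101"  # ā
LONG_E = "\u0113"  # ē

EMPHATIC = set("صضطظ")
BACK = set("ءحخعغقكه")  # other back consonants, as listed above


def alif_quality(word, i):
    """Predict the Lebanese value of the alif at index i of an unvocalized word."""
    prev = word[i - 1] if i > 0 else ""
    nxt = word[i + 1] if i + 1 < len(word) else ""
    is_final = nxt == ""
    before_terminal_hamza = nxt == "ء" and i + 2 == len(word)
    if is_final or before_terminal_hamza:
        return LONG_A
    if prev in EMPHATIC or nxt in EMPHATIC:
        return LONG_A
    if prev in BACK:
        return LONG_A
    if "ر" in (prev, nxt):
        return None  # unpredictable: compare rēkib with rāyiH
    return LONG_E


print(alif_quality("طالب", 1), alif_quality("ماشي", 1), alif_quality("جار", 1))
```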

The Pronunciation and Transcription of و in Lebanese Arabic

In Lebanese Arabic, “ō” is the usual pronunciation of the MSA diphthong “aw” in one-syllable words. For example, MSA yawm becomes yōm in Lebanese Arabic. It is also the usual Lebanese pronunciation of final و in words with more than one syllable, e.g. MSA fataHū becomes fataHō in Lebanese Arabic. Many words of foreign origin also contain “ō”, e.g. Āōrōpā, dōlār, šōkōlātaŧ, tilfizyōn, duktōr, bōnjūr. Although “ō” technically represents a long vowel, it is generally given the same length as a short vowel, particularly in word-final position.

MSA: day yawm
يَوْم
Lebanese: yōm
يوم
MSA: they opened fataHū
فَتَحوا
Lebanese: fataHō
فَتَحو

The Pronunciation and Transcription of ي in Lebanese Arabic

In Lebanese Arabic, “é” is the usual pronunciation of the MSA diphthong “ay” in one-syllable words or in the final syllable of a multi-syllable Arabic word. For example, MSA bayt becomes bét in Lebanese Arabic, and MSA maktabayn becomes maktabén in Lebanese Arabic. “é” is also the usual Lebanese pronunciation of final ي in words with more than one syllable, e.g. MSA ma3ī becomes ma3é in Lebanese Arabic. Many words of foreign origin also contain “é”, e.g. Āōtél, mōtér, sikritér, vītés. Although “é” technically represents a long vowel, it is generally given the same length as a short vowel, particularly in word-final position, and sounds much the same as the terminal kasraŧ, i.e. “e”.

 

MSA: house bayt
بَيْت
Lebanese: bét
بيت
MSA: two offices maktabayn
مَكْتَبَيْن
Lebanese: maktabén
مَكْتَبين
MSA: with me ma3ī
مَعي
Lebanese: ma3é
مَعي

Variant Spellings of Āalif in Arabic Script

There are a few additional characters which are used to represent Āalif in Arabic script. These are not optional spellings, but are required with certain words and letter combinations.

Āalif maqSūraŧ (ى)

Āalif maqSūraŧ is written like ي but without the dots. It appears only at the end of words, and therefore has just two shapes: an independent shape if it is preceded by a non-connector, and a final shape if it follows a connector.

Final Independent
ـى ى

Āalif maqSūraŧ literally means “shortened Āalif”, and is pronounced like the final “a” in “Santa”, i.e. like a short vowel (fatHaŧ) rather than a long vowel. Nevertheless, we represent Āalif maqSūraŧ in our transcription system as ā since it is important for grammatical reasons that it be distinguished from a final fatHaŧ. If a suffix is added, Āalif maqSūraŧ will change both in pronunciation and spelling either to a regular Āalif or to a yāĀ which forms a diphthong with the preceding fatHaŧ (“ay”).

Examples:

on 3alā
عَلى
on us 3alaynā
عَلَيْنا
he gave 3aTā
عَطى
he gave us 3aTānā
عَطانا

lām Āalif (لا)

lām followed by Āalif forms a ligature, i.e. the two letters are joined into one shape: لا. Bear in mind that lām Āalif does not connect to the following letter, since Āalif is a non-connector.

Medial & Final Independent & Initial
ـلا لا

Āalif maddaŧ (آ)

When hamzaŧ written on an Āalif (أ) takes a fatHaŧ and is followed by a second Āalif, the two are generally combined into one shape, called Āalif maddaŧ (“the Āalif of prolongation”). This takes the form of a short, slightly wavy line written above an Āalif.

آ أَ + ا
آ أَ + أْ

Notice that the second Āalif may have originally been a long vowel (as in the first example) or another carrier of hamzaŧ (as in the second example). In either case, the single Āalif maddaŧ is written in place of the two consecutive Āalifs, and is pronounced and transcribed as a hamzaŧ followed by a single Āalif: “Āā”.
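
For readers who handle Arabic text on a computer, the combination rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name merge_alif_madda is our own invention, and the sketch assumes fully vocalized input in which the fatHaŧ (and, in the second pattern, the sukūn) is actually written.

```python
# Unicode code points for the characters involved (written as escapes
# so that the order of the marks is unambiguous).
ALIF = "\u0627"              # ا  plain Āalif
ALIF_HAMZA_ABOVE = "\u0623"  # أ  hamzaŧ written on an Āalif
ALIF_MADDA = "\u0622"        # آ  Āalif maddaŧ
FATHA = "\u064E"             # fatHaŧ
SUKUN = "\u0652"             # sukūn


def merge_alif_madda(text: str) -> str:
    """Hypothetical helper: replace the two sequences described above
    with a single Āalif maddaŧ."""
    # Pattern 2: hamzaŧ + fatHaŧ followed by a second hamzaŧ carrier with sukūn
    text = text.replace(ALIF_HAMZA_ABOVE + FATHA + ALIF_HAMZA_ABOVE + SUKUN, ALIF_MADDA)
    # Pattern 1: hamzaŧ + fatHaŧ followed by a long-vowel Āalif
    text = text.replace(ALIF_HAMZA_ABOVE + FATHA + ALIF, ALIF_MADDA)
    return text


# "Adam" spelled out with two separate Āalifs, then merged
spelled_out = ALIF_HAMZA_ABOVE + FATHA + ALIF + "\u062F\u064E\u0645"
print(merge_alif_madda(spelled_out))
```

Real text is often unvocalized, in which case the fatHaŧ would be missing and this simple pattern match would not apply.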

Āalif maddaŧ, like a regular Āalif, is a non-connector and therefore has only two shapes:

Final Independent
ـآ آ

Examples:

Adam Āādam
آدَم
he supported, backed Āāzara
آزَرَ
Quran ĀalqurĀān
القُرْآن
they (dual) grew up našaĀā
نَشَآ

Dagger Āalif

The “dagger Āalif” (Āalif Ќanjariyyaŧ / أَلِف خَنْجَرِية) is an archaic spelling of Āalif which still exists in a few common words, and is pronounced in exactly the same way as a regular Āalif. When written, it appears as a miniature Āalif above the letter rather than on the line. However, it is usually omitted, in printed texts and handwriting alike. One notable exception is the word الله, meaning “God”, which often retains the dagger Āalif in print.

Examples:

God Āallāh
this (m.) hādā
this (f.) hādihi
but lākin
therefore lidālik

Remember that these words must be pronounced with the long vowel sound of the Āalif, even when (as is usually the case) the dagger Āalif is not written on them.

Doubled Consonants

Arabic consonants are sometimes doubled, which means that the same consonant occurs twice in a row without an intervening vowel. (The Arabic word for this phenomenon is Āattašdīd / التَّشْديد.) A doubled consonant is pronounced for about twice as long as a single consonant. Something similar sometimes occurs in English pronunciation when the last letter of one word is the same as the first letter of the next word, causing a sort of doubling of this letter, e.g. “great time”, “bad day”, “big gate”.

One and the same consonant written consecutively in transcription indicates doubling. In Arabic script, the two consonants are written only once, and the doubling action is indicated by the placement of a small symbol known as the šaddaŧ above the consonant ( ـّ ).

Examples:

carpenter najjār
نَجّار
teacher, master mu3allim
مُعَلِّم
strictness tašaddud
تَشَدُّد
bitter murr
مُرّ
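
To make the relationship between the šaddaŧ and the underlying doubled consonant concrete, here is a minimal Python sketch that expands each šaddaŧ into two copies of the consonant. The function name expand_shadda is hypothetical, and the expanded spelling (first copy marked with sukūn) is purely illustrative; it is not how Arabic is actually written. Unicode text may store the šaddaŧ either before or after the vowel mark, so the sketch collects all marks on a letter before deciding.

```python
import unicodedata

SHADDA = "\u0651"  # šaddaŧ
SUKUN = "\u0652"   # sukūn (no vowel)


def expand_shadda(word: str) -> str:
    """Hypothetical helper: write a consonant carrying šaddaŧ twice,
    the first copy with sukūn, the second with any remaining vowel mark."""
    out = []
    i = 0
    while i < len(word):
        base = word[i]
        i += 1
        # Gather the combining marks (vowels, šaddaŧ, ...) attached to this letter
        marks = []
        while i < len(word) and unicodedata.combining(word[i]):
            marks.append(word[i])
            i += 1
        if SHADDA in marks:
            others = "".join(m for m in marks if m != SHADDA)
            out.append(base + SUKUN + base + others)
        else:
            out.append(base + "".join(marks))
    return "".join(out)


# murr ("bitter"): m + u + r + šaddaŧ  becomes  m + u + r + sukūn + r
print(expand_shadda("\u0645\u064F\u0631\u0651"))
```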

Doubled consonants are never followed directly by another consonant, but require an intervening long or short vowel. Note that kasraŧ is normally written below the šaddaŧ rather than below the letter itself.

Nunation

In fully vocalized MSA, short vowels are sometimes written twice at the end of nouns and adjectives. When this occurs, the second vowel is not pronounced as such, but is instead replaced with an “n”. This phenomenon is called nunation, or Āattanwīn / التَّنْوين (literally: “adding an n”), and it usually – but not always – serves as a marker of indefiniteness. The type of tanwīn a word takes depends upon its grammatical role in the sentence.

As shown in the table below, tanwīn occurs in three grammatical cases: nominative, accusative, and genitive. For any word not ending in hamzaŧ (ء) or tāĀ marbūTaŧ (ة), an Āalif is appended to the word when it takes accusative tanwīn. This Āalif is not pronounced, but it must be written. The accusative tanwīn marking should be placed on the letter before the Āalif, although many people actually write this tanwīn on the Āalif.

Case | Transcription Symbol | Diacritical Mark
Nominative | un | ـٌ
Accusative | an | ـً / ـاً
Genitive | in | ـٍ

Examples:

مُعَلِّمٍ
مُعَلِّمًا
مُعَلِّمٌ
mu3allimin mu3alliman mu3allimun
كِتابٍ
كِتابًا
كِتابٌ
kitābin kitāban kitābun

In the case of words ending in tāĀ marbūTaŧ, the accusative tanwīn marking is placed atop the tāĀ marbūTaŧ.

Examples:

مَدْرَسَةٍ
مَدْرَسَةً
مَدْرَسَةٌ
madrasatin madrasatan madrasatun
صورَةٍ
صورَةً
صورَةٌ
Sūratin Sūratan Sūratun
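
The spelling rules for tanwīn described in this section can be summarized in a short Python sketch. It is an illustration only: the function name add_tanwin and the case labels are our own, and the sketch assumes a non-empty stem whose last character is a bare letter with no case vowel already attached. It follows the simplified rule given above and does not cover the finer points of word-final hamzaŧ.

```python
ALIF = "\u0627"        # ا
HAMZA = "\u0621"       # ء
TA_MARBUTA = "\u0629"  # ة

# The three tanwīn marks, keyed by grammatical case
TANWIN = {
    "nominative": "\u064C",  # un
    "accusative": "\u064B",  # an
    "genitive": "\u064D",    # in
}


def add_tanwin(stem: str, case: str) -> str:
    """Hypothetical helper: attach the tanwīn mark for the given case."""
    mark = TANWIN[case]
    # Accusative tanwīn takes a silent Āalif unless the word ends in
    # hamzaŧ or tāĀ marbūTaŧ; the mark sits on the letter before the Āalif.
    if case == "accusative" and stem[-1] not in (HAMZA, TA_MARBUTA):
        return stem + mark + ALIF
    return stem + mark


kitab = "\u0643\u0650\u062A\u0627\u0628"                        # kitāb
madrasa = "\u0645\u064E\u062F\u0652\u0631\u064E\u0633\u064E\u0629"  # madrasaŧ
for case in ("nominative", "accusative", "genitive"):
    print(case, add_tanwin(kitab, case), add_tanwin(madrasa, case))
```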

See also: Word-final hamzaŧ and accusative tanwīn

Nunation does not usually occur in less formal MSA or colloquial Arabic, with the exception of “an”, which is a fixed adverbial ending on a number of common words.

Examples (Lebanese Arabic):

Thank you šukran
شُكْراً
Excuse me; You’re Welcome 3afwan
عَفْواً
never Āabadan
أَبَداً
always dēyman
دايْماً
usually 3ādatan
عادَةً
sometimes ĀaHyēnan
أَحْياناً
Of course Tab3an
طَبْعاً
approximately taĀrīban
تَقْريباً

6 thoughts on “The Arabic Alphabet: A Guide to the Phonology and Orthography of MSA and Lebanese Arabic”

  1. What about the accusative tanwin on the hamza? Why is it often written with an additional alif, as in جزءاً, but sometimes on the hamza itself, as in بناءً?

  2. Based on what I’ve read here, it seems that kūsā كُوسَا / kūsa كُوسَة‎ would be pronounced kūsē / kūse in Lebanese Arabic, but it is not. What am I missing?

    1. The word “kūsā” is normally written with Āalif (كوسا or كوسى), and not with tāĀ marbūTaŧ (كوسة). As such, the word is pronounced “kūsā”, as Āalif is always pronounced “ā” when it is the final letter (although typically given the length of a short vowel, i.e. “a”).

      كوسَة would be an unusual spelling and, following the rules concerning the usual Lebanese pronunciation of the vowel before the tāĀ marbūTaŧ, would suggest the pronunciation “kūseŧ.” But of course, as you point out, the word is normally pronounced “kūsā.” (It is worth noting, however, that in some areas there is a final vowel shift from a to e.) There are indeed words that do not always fit the rules of pronunciation. For instance, “sūryā” is variously spelled سوريا or سورية, although its pronunciation does not change. Compare also مُديرَة, pronounced “mudīraŧ”, with كْبيرِة, pronounced “kbīreŧ”.

  3. Hi, Just a novice (real novice) here, but shouldn’t the final form of ḰāĀ have a connecting line in it extending down from the top horizontal stroke, then out to the right (as is the case for jīm and HāĀ)? I think your “final form ḰāĀ” in the Arabic Alphabet Pronunciation Table above actually shows the isolated form by mistake … is that possible? Many apologies if I’ve misread something … Great website by the way, and I love the pronunciation guides. This is the best website I’ve found on the Arabic alphabet and its pronunciation so far – thanks guys!

    1. Indeed, it should have been as you indicated, and we’ve now made the correction. Well spotted, and thank you!
