The pronunciation of Japanese is very regular; for the most part, Japanese words sound as they are written in hiragana and katakana. Altogether, there are around 110 different sounds (syllables) in Japanese, a walk in the park compared to the thousands of possible syllables that we have in English.
In Japanese, the order of the vowels is ‘a, i, u, e, o’; their sound is pure and sharp, similar to the vowels in Spanish, except for the ‘u’, which is sharper in Spanish than in Japanese.
- the ‘a’ (あ, ア) sounds like the ‘a’ in ‘axe’
anata – あなた
atama – あたま
sakana – さかな
- the ‘i’ (い, イ) sounds like the ‘i’ in ‘ink’
migi – みぎ
kimi – きみ
ichi – いち
- the ‘u’ (う, ウ) sounds like the ‘o’ in ‘who’, less sharp than the ‘u’ in the name ‘Uma’
uta – うた
umi – うみ
kuruma – くるま
- the ‘e’ (え, エ) sounds like the ‘e’ in ‘pen’
me – め
eki – えき
te – て
- the ‘o’ (お, オ) sounds like the ‘o’ in ‘ox’
kodomo – こども
tokoro – ところ
otoko no ko – おとこのこ
In English, two vowels often form a sound in a single syllable, but in Japanese the additional vowel is considered an additional syllable. For example, the English word ‘too’ (also) is one syllable long, while the Japanese word ‘too’ (とお – the number 10) has two syllables, and is pronounced in two beats: ‘to-o’.
- in hiragana, doubling the vowel doubles its length:
- in hiragana, an ‘i’ after an ‘e’ sound repeats the ‘e’ sound
the English lang.
- in hiragana, a ‘u’ after an ‘o’ sound repeats the ‘o’ sound
- in katakana, a ‘ー’ (dash) repeats the previous vowel
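The lengthening rules above can be sketched in a few lines of Python. This is only an illustration: the function name `spoken_form` is made up for this example, the input is assumed to be plain romaji (with the katakana mark ‘ー’ left in place), and the sample words ‘toukyou’ and ‘koーhiー’ are our own. Real Japanese has words where ‘ei’ or ‘ou’ are not long vowels, so treat this as a rule of thumb, not a complete converter.

```python
import re


def spoken_form(romaji: str) -> str:
    """Apply the lengthening rules from the list above to a romaji string."""
    # An 'i' after an 'e' sound repeats the 'e' (sensei -> sensee).
    text = romaji.replace("ei", "ee")
    # A 'u' after an 'o' sound repeats the 'o' (toukyou -> tookyoo).
    text = text.replace("ou", "oo")
    # The katakana mark 'ー' repeats the previous vowel (koーhiー -> koohii).
    text = re.sub(r"([aiueo])ー", r"\1\1", text)
    return text


print(spoken_form("sensei"))
print(spoken_form("toukyou"))
print(spoken_form("koーhiー"))
```

A doubled vowel such as ‘too’ is left unchanged, because it is already written the way it is pronounced.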
Vowel special cases
For the most part, every vowel is pronounced. However, it has become the norm to whisper or drop the ‘u’ and the ‘i’ in some cases; this is called devoicing:
- sometimes the ‘u’ (う) sound is faint or omitted, especially in ‘ku’, ‘tsu’ and ‘su’:
am, is, are
formal verb form
west; waist; waste
- sometimes the ‘i’ (い) sound is faint or omitted, especially in ‘shi’ (し) and ‘chi’ (ち):
Most Japanese sounds match an English sound. Here are a few unusual ones.
- the ‘r’ is like the Spanish ‘r’ in ‘cara’ or ‘toro’, not like the English ‘r’ in ‘ram’ or ‘car’.
- fu (hir. ふ, kat. フ) sounds like the English word ‘who’, especially at the beginning of a word:
- the ‘n’ (hir. ん, kat. ン) is a separate syllable, so it takes an additional ‘beat’ to pronounce it:
romaji – kana: pronunciation
sensei – せんせい: se-n-se-e (not ‘sen-se-e’)
sannin – さんにん: sa-n-ni-n (not ‘san-nin’)
honya – ほんや: ho-n-ya (not ‘hon-ya’)
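To make the idea of ‘beats’ concrete, here is a minimal Python sketch that splits a kana word into beats. It assumes a simplified rule: every kana character is one beat, except the small や, ゆ, よ (and small vowels), which attach to the previous kana. The name `split_beats` is hypothetical and chosen only for this example.

```python
# Small kana that merge with the previous kana instead of forming their own beat.
SMALL_KANA = set("ゃゅょャュョぁぃぅぇぉァィゥェォ")


def split_beats(kana: str) -> list[str]:
    """Split a kana string into beats, keeping ん as a beat of its own."""
    beats: list[str] = []
    for ch in kana:
        if ch in SMALL_KANA and beats:
            # e.g. ぎ + ょ form the single beat ぎょ
            beats[-1] += ch
        else:
            beats.append(ch)
    return beats


for word in ["せんせい", "さんにん", "ほんや"]:
    beats = split_beats(word)
    print(word, "-".join(beats), len(beats))
```

With this rule, せんせい and さんにん each have four beats and ほんや has three, which matches the hyphenated pronunciations in the examples.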
- the ‘tsu’ sound (hir. つ, kat. ツ) didn’t exist in English, but now we find it in some Japanese-borrowed words:
Japanese martial art
- when speaking casually, some ‘m’ and ‘n’ disappear:
The small ‘tsu’
A small ‘tsu’ (hir. っ, kat. ッ) before a consonant indicates a consonant doubling or a pause; if the ‘tsu’ ends a word or sentence, it indicates a sudden stop. Finally, it can act as a word connector.
‘tsu’ as a consonant doubler
A ‘tsu’ before some consonants, like ‘s’ and ‘sh’, doubles their length:
‘tsu’ as a pause
We cannot extend consonants like ‘k’, ‘p’, ‘b’, or ‘ch’, because they have explosive sounds. In this case, a ‘tsu’ before any of them indicates a small pause, which is technically called a glottal stop. In romaji we indicate this pause by doubling the consonant that follows the ‘tsu’, e.g., っこ becomes ‘kko’, except in the case of ‘ch’, where っち becomes ‘tchi’.
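The romaji convention for the small ‘tsu’ can be sketched in Python. This is only an illustration under stated assumptions: the input is a list of romaji syllables in which the small ‘tsu’ is kept as the kana っ (or ッ), and the function name `romanize_with_small_tsu` is made up for this example.

```python
def romanize_with_small_tsu(syllables: list[str]) -> str:
    """Join romaji syllables, doubling the consonant that follows a small tsu."""
    out: list[str] = []
    double_next = False
    for syl in syllables:
        if syl in ("っ", "ッ"):
            # Remember the small tsu and apply it to the next syllable.
            double_next = True
            continue
        if double_next:
            # 'ch' is the exception: a 't' is added instead of doubling the 'c'.
            prefix = "t" if syl.startswith("ch") else syl[0]
            out.append(prefix + syl)
            double_next = False
        else:
            out.append(syl)
    # A small tsu at the very end has nothing to double, so it is dropped here.
    return "".join(out)


print(romanize_with_small_tsu(["っ", "ko"]))
print(romanize_with_small_tsu(["っ", "chi"]))
print(romanize_with_small_tsu(["za", "っ", "shi"]))
```

The same function covers both uses described so far: the doubled ‘s’ in ‘zasshi’ and the pause written as ‘kk’ or ‘tch’.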
‘tsu’ as a sudden stop
We can also have an abrupt pause, i.e., a glottal stop, at the end of a word. In English, we use an ellipsis (…) to indicate a suspended, dragged-on word or thought, as in “Do you really think so…?”, but we do not have a way to indicate the opposite, when a word finishes abruptly. In Japanese we also use an ellipsis to indicate a suspended word or thought, and we use the small ‘tsu’ to indicate a word or thought that stops abruptly. This happens often in dialogue, so we find it often in manga.
In the scene, both the words ‘kudasai’ (‘Please, do for me’) and ‘hayaku’ (‘fast!’ or ‘hurry up!’) are finished abruptly, so they end with a small ‘tsu’: 「くださいっ」 and 「早くっ」. In this case, the woman said the words as requests, so in English we could have expressed them as ‘kudasai!’ and ‘hayaku!’, even though they are not actually exclamations. If the woman had been interrupted mid-word while she was saying ‘kudasai’, we would have written it in Japanese as 「くだっ」, while in English we would have written it as ‘kuda…’, hoping that the situation makes clear that this is not a suspended, dragged-on word, but an interrupted one.
‘tsu’ as a word connector
Japanese speakers are masters of abbreviation; many words are abbreviated using ‘tsu’ as a bridge to connect them to the next word. A common example is 「いち」 (‘ichi’, one), which is often replaced by 「いっ」, but the abbreviation is common for many other words too:
ichi-shuukan → is-shuukan (one + ‘week span’)
ichi-sai → is-sai (one + ‘years old’)
ichi-pai → ip-pai (one + ‘cup counter’)
zatsu-shi → zas-shi (miscellaneous + magazine)
Consonant special cases
- ha (は) is pronounced ‘wa’ when used as a particle
- he (へ) is pronounced ‘e’ when used as a particle
- wo (を) is pronounced ‘o’ when used as a particle
- Some English sounds, like ‘ing’, ‘ti’, and ‘si’, don’t exist in Japanese, while some Japanese sounds, like ‘tsu’, don’t exist in English. The few English words that use ‘tsu’, like ‘tsunami’, are borrowed from Japanese, and English speakers usually replace the ‘tsu’ with a ‘su’, pronouncing the word as ‘sunami’ instead of ‘tsunami’:
romaji – kana: pronunciation
kingyo – きんぎょ: ki-n-gyo (not ‘king-gyo’, nor ‘king-yo’)
atsui – あつい: a-tsu-i (not ‘at-su-i’, nor ‘at-tsu-i’)
- the ‘n’ (ん) before a ‘b’, ‘m’, or ‘p’ sounds like an ‘m’, so in these cases the romaji for ん is ‘m’ rather than ‘n’; this is an example of euphony, i.e., changing a sound to make it both more pleasing to the ear and easier to pronounce:
3 flat things
Here are some examples of this special case:
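As a rough illustration, the ‘n’ to ‘m’ rule can also be written as a one-line substitution in Python. The function name `apply_n_euphony` and the sample words (‘senpai’, ‘shinbun’) are assumptions chosen for this sketch; the input is assumed to be romaji in which every ん is written as ‘n’.

```python
import re


def apply_n_euphony(romaji: str) -> str:
    """Write 'n' as 'm' whenever it comes directly before 'b', 'm', or 'p'."""
    # The lookahead (?=[bmp]) checks the next letter without consuming it.
    return re.sub(r"n(?=[bmp])", "m", romaji)


for word in ["senpai", "shinbun", "sannin"]:
    print(word, "->", apply_n_euphony(word))
```

Words such as ‘sannin’ are left untouched, because none of their ‘n’ sounds comes before a ‘b’, ‘m’, or ‘p’.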
Words with a pitch
Most Japanese words truly have no pre-defined pitch, e.g., the word ‘ichi’ (one) is normally pronounced flat, but either of its syllables, ‘i’ or ‘chi’, might be raised depending on the context or the dialect. However, some words do have a specific pitch [wikipedia]. For example:
god, deity, spirit
The kana do not have accents that indicate pitch, and the kanji do not give a clue either; thus, there is no alternative but to listen to a native speaker and memorize the pitch, if any. Still, there are a few hints that can help in certain cases.
In English, when we put together two or more words to form a compound word, the compound word preserves the pitches of its component words, e.g.:
belly + button → belly-button
carry + over → carry-over
Even though these compound words are now single words, we still pronounce each component with its original pitch, as if we were pronouncing two different words. Japanese does the same, i.e., the components of compound words are pronounced as if they were individual words:
kami (God) + sama (lord) → kami-sama (God)
hachi (8) + hyaku (100) → hachi-hyaku (800)
ashi (foot) + kubi (neck) → ashi-kubi (ankle)
mizu (water) + umi (sea) → mizu-umi (lake)
If the component words happen to be one syllable long, then we might end up with what appear to be different pronunciations of the same word, when in reality all we are doing is stressing one of the component words. In English, suppose that we have the word ‘twenty-five’. We could stress ‘twenty’ or ‘five’ to draw attention to that particular component of the word, or pronounce them flat. This is more difficult to see in Japanese, where the compound words can be so small that we tend to think of them as single words (e.g., ‘gohan’) instead of multiple words (e.g., ‘go-han’):
han (cooked rice)
However, Japanese takes this a bit further. If we have a single word that is being modified, say, conjugated, both the word and the modifier keep their pitches:
nomi + masu → nomi-masu (I drink)
nomi + masen → nomi-masen (I don’t drink)
nomi + tai → nomi-tai (I want to drink)
nomi + taku + nai → nomi-taku-nai (I don’t want to drink)
Hence, the pronunciation tends to be correct when we treat the components of a word as separate words (e.g., nomi-masu), each with its own pitch (if any), instead of considering the word as a single unit (e.g., nomimasu) and attempting to single out a particular syllable.
Finally, as if Japanese pitch wasn’t already difficult enough, native speakers from different regions of Japan often pronounce words in different ways. For example, the Japanese spoken in Tokyo, which is considered the ‘standard’ Japanese, tends to stress the first syllable, while the Kansai dialect (e.g., Kyoto, Osaka) tends to stress the last one:
Other dialects, like those of Hokkaido and Okinawa, have their own idiosyncrasies.