These pages use TTS (text-to-speech): highlight a word to hear it in Japanese. The TTS works better on PCs than on mobile devices, where the highlighting is sometimes mistaken for a command gesture.
The pronunciation of Japanese is very regular; for the most part, Japanese words sound as they are written in hiragana and katakana. Altogether, there are about 110 distinct native syllable sounds in Japanese, a walk in the park compared to the thousands of syllables that we have in English.
Vowel sounds
In Japanese, the order of the vowels is ‘a, i, u, e, o’; their sound is pure and sharp, similar to the vowels in Spanish, except for the ‘u’, which is sharper in Spanish than in Japanese.
- the ‘a’ (hir. あ, kat. ア) sounds like the ‘a’ in ‘axe’
  anata – あなた – formal ‘you’
  atama – あたま – head
  sakana – さかな – fish
- the ‘i’ (hir. い, kat. イ) sounds like the ‘i’ in ‘ink’
  migi – みぎ – right (direction)
  kimi – きみ – casual ‘you’
  nichi – にち – day
- the ‘u’ (hir. う, kat. ウ) sounds like the ‘o’ in ‘who’, or the ‘u’ in the name ‘Uma’
  uta – うた – song
  umi – うみ – sea
  kuruma – くるま – car
- the ‘e’ (hir. え, kat. エ) sounds like the ‘e’ in ‘pen’, or ‘elf’
  kesa – けさ – this morning
  eki – えき – train station
  ebi – えび – shrimp
- the ‘o’ (hir. お, kat. オ) sounds like the ‘o’ in ‘ox’
  kodomo – こども – child
  tokoro – ところ – place
  otoko no ko – おとこのこ – boy
Doubling vowels
There are no diphthongs in Japanese, so each vowel is pronounced as part of a different syllable, e.g., ‘tooi’ (とおい, ‘far’) has three syllables and is pronounced in three beats: ‘to-o-i’.
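The beat counting described above can be sketched in a few lines of Python. This is only an illustration, not a full phonological model: the names `split_beats` and `SMALL_KANA` are made up for this example, and the sketch assumes that every kana is one beat except the small ゃ, ゅ, ょ (and the small vowels used in katakana loanwords), which merge with the kana before them.

```python
# Small kana that merge with the preceding kana into a single beat
# (assumption for this sketch; the small っ is NOT included, so it
# counts as a beat of its own).
SMALL_KANA = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")


def split_beats(kana: str) -> list[str]:
    """Split a kana string into beats: one per kana, small kana attached."""
    beats: list[str] = []
    for ch in kana:
        if ch in SMALL_KANA and beats:
            beats[-1] += ch  # e.g. し + ゅ -> しゅ
        else:
            beats.append(ch)
    return beats


print("-".join(split_beats("とおい")))      # と-お-い
print("-".join(split_beats("しゅう")))      # しゅ-う
print("-".join(split_beats("おかあさん")))  # お-か-あ-さ-ん
```

Working on kana rather than romaji keeps the sketch short, because each kana already corresponds to one beat.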
- in hiragana, doubling the vowel doubles its length:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | your mother | o-kaa-san | おかあさん | o-ka-a-sa-n |
  | your brother | o-nii-san | おにいさん | o-ni-i-sa-n |
  | week | shuu | しゅう | shu-u |
  | your sister | o-nee-san | おねえさん | o-ne-e-sa-n |
  | ice | koori | こおり | ko-o-ri |

- in hiragana, an ‘i’ after an ‘e’ sound repeats the ‘e’ sound:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | the English lang. | eigo | えいご | e-e-go |
  | movie | eiga | えいが | e-e-ga |
  | teacher | sensei | せんせい | se-n-se-e |

  A few exceptions are:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | ray fish | ei | えい | e-i |
  | sigh | tame-iki | ためいき (ため息) | ta-me-i-ki |

  ‘tame-iki’ (ため息) is a word composed of two words that we pronounce separately: ‘tame’ (ため, to collect) and ‘iki’ (いき, breath).

- in hiragana, a ‘u’ after an ‘o’ sound repeats the ‘o’ sound:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | good morning | ohayou | おはよう | o-ha-yo-o |
  | very | doumo | どうも | do-o-mo |
  | thanks | arigatou | ありがとう | a-ri-ga-to-o |

  A few exceptions are:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | to think | omou | おもう | o-mo-u |
  | to get lost | mayou | まよう | ma-yo-u |

- in katakana, a ‘ー’ (dash) repeats the previous vowel:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | ramen | raamen | ラーメン | ra-a-me-n |
  | beer | biiru | ビール | bi-i-ru |
  | news | nyuusu | ニュース | nyu-u-su |
  | cake | keeki | ケーキ | ke-e-ki |
  | cola | koora | コーラ | ko-o-ra |
  | coffee | koohii | コーヒー | ko-o-hi-i |

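The lengthening rules above are regular enough to sketch in code. The following Python is a rough illustration under stated assumptions, not a complete converter: `spoken_romaji`, `expand_long_mark`, `EXCEPTIONS` and `VOWEL_KANA` are hypothetical names, the exception list holds only the words mentioned in this section, and the katakana helper relies on the fact that Unicode character names such as `KATAKANA LETTER KO` end in the vowel of the kana.

```python
import unicodedata

# Words listed in this section where the two vowels stay separate.
EXCEPTIONS = {"ei", "omou", "mayou"}


def spoken_romaji(romaji: str) -> str:
    """Apply the 'ei' -> 'ee' and 'ou' -> 'oo' rules to a romaji word."""
    if romaji in EXCEPTIONS:
        return romaji
    return romaji.replace("ei", "ee").replace("ou", "oo")


# Katakana vowel for each final letter of a Unicode kana name.
VOWEL_KANA = {"A": "ア", "I": "イ", "U": "ウ", "E": "エ", "O": "オ"}


def expand_long_mark(katakana: str) -> str:
    """Replace each 'ー' with the vowel of the kana before it."""
    out: list[str] = []
    for ch in katakana:
        if ch == "ー" and out:
            # e.g. 'KATAKANA LETTER KO' -> 'O'; unknown names give ''.
            vowel = unicodedata.name(out[-1], "")[-1:]
            # Keep the mark unchanged if no vowel can be determined.
            out.append(VOWEL_KANA.get(vowel, ch))
        else:
            out.append(ch)
    return "".join(out)


print(spoken_romaji("sensei"))       # sensee
print(spoken_romaji("arigatou"))     # arigatoo
print(spoken_romaji("omou"))         # omou (listed exception)
print(expand_long_mark("コーヒー"))  # コオヒイ
```

A plain string replacement cannot tell a long vowel from two vowels that merely sit next to each other, which is why the exceptions have to be listed by hand in this sketch.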
Vowel special cases
For the most part, every vowel is pronounced. However, it has become the norm to whisper or drop the ‘u’ and the ‘i’ in some cases; this is called devoicing:
- sometimes the ‘u’ (う) sound is faint or omitted, especially in ‘ku’, ‘tsu’ and ‘su’:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | taxi | takushii | タクシー | ta-k-shi-i |
  | your wife | okusan | おくさん | o-k-sa-n |
  | many | takusan | たくさん | ta-k-sa-n |
  | moon | tsuki | つき | ts-ki |
  | desk | tsukue | つくえ | ts-ku-e |
  | to hold | motsu | もつ | mo-ts |
  | a little | sukoshi | すこし | s-ko-shi |
  | am, is, are | desu | です | de-s |
  | formal verb form | masu | ます | ma-s |
  | west; waist; waste | uesuto | ウエスト | u-e-s-to |

- sometimes the ‘i’ (い) sound is faint or omitted, especially in ‘shi’ (し) and ‘chi’ (ち):

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | we | watashitachi | わたしたち | wa-ta-sh-ta-ch |
  | tomorrow | ashita | あした | a-sh-ta |
  | why | doushite | どうして | do-o-sh-te |

  Another example is the disappearance of the い from the えい combination that forms when we follow a ‘te’ form verb, i.e., a verb that ends in て, って or んで, with いる/います or any of its conjugations, e.g., -ている becomes -てる, -っています becomes -ってます, -んでいた becomes -んでた, etc. The following vanishing acts of い are courtesy of the manga ふらいんぐうぃっち:
  …しっている ⇒ …しってる (I know …)
  …そらとんでいる (flying …)
  …とどいています (reported …)
  …みている ⇒ …みてる (watching)
  みていた ⇒ みてた (I saw)
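The い-dropping contraction just described can be approximated with a single regular expression. This is a naive sketch under stated assumptions: `contract_te_iru` and `TE_IRU` are names invented for this example, the input is assumed to be written in hiragana, and the pattern will also fire on unrelated words that happen to contain the same kana sequence.

```python
import re

# て or で, then い, then the first kana of a conjugated いる
# (る, た, ま, な, て as in いる, いた, います, いない, いて).
TE_IRU = re.compile(r"([てで])い(?=[るたまなて])")


def contract_te_iru(text: str) -> str:
    """Drop the い that follows a 'te' form: -ている -> -てる, etc."""
    return TE_IRU.sub(r"\1", text)


print(contract_te_iru("しっている"))  # しってる
print(contract_te_iru("みていた"))    # みてた
print(contract_te_iru("っています"))  # ってます
```

The lookahead leaves the conjugated ending untouched, so the same pattern covers the plain, past, and polite forms.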
Consonant sounds
Most Japanese sounds approximate an English sound. Here are a few unusual ones.
- the ‘r’ is like the Spanish ‘r’ in ‘cara’ or ‘toro’, not like the English ‘r’ in ‘ram’ or ‘car’:

  | English | romaji | kana |
  |---|---|---|
  | color | iro | いろ |
  | noon | hiru | ひる |
  | six | roku | ろく |

- fu (hir. ふ, kat. フ) sounds like a mix of ‘fu’ and ‘hu’, like the English word ‘who’ spoken by just blowing air, without changing the shape of the mouth:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | boat | fune | ふね | ‘who’-ne |
  | futon | futon | ふとん | ‘who’-to-n |
  | bath | furo | ふろ | ‘who’-ro |

- the ‘n’ (hir. ん, kat. ン) is a separate syllable, so it takes an additional ‘beat’ to pronounce it:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | teacher | sensei | せんせい | se-n-se-e |
  | three people | sannin | さんにん | sa-n-ni-n |
  | bookstore | honya | ほんや | ho-n-ya |

- the ‘tsu’ sound (hir. つ, kat. ツ) didn’t exist in English, but now we find it in some Japanese-borrowed words:

  | English | meaning | kana | sounds… |
  |---|---|---|---|
  | tsunami | tidal wave | つなみ | tsu-na-mi |
  | ju-jutsu | martial art | じゅじゅつ | ju-ju-tsu |
  | shiatsu | acupressure | しあつ | shi-a-tsu |

- when speaking casually, some ‘m’ and ‘n’ sounds disappear:

  | English | Japanese | casual | sounds… |
  |---|---|---|---|
  | father | o-to-o-sa-n | o-to-o-sa | おとおさ |
  | mother | o-ka-a-sa-n | o-ka-a-sa | おかあさ |
  | excuse me | su-mi-ma-se-n | su-i-ma-se-n | すいません |

Consonant special cases
- ha (は) is always pronounced ‘wa’ when used as a particle
- he (へ) is always pronounced ‘e’ when used as a particle
- wo (を) is often pronounced ‘o’ when used as a particle
- We might think that ‘kingyo’ is pronounced ‘king-yo’, or ‘atsui’ is ‘at-sui’, but the sounds ‘ing’ and ‘at’, as well as many others, don’t exist in Japanese:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | goldfish | kingyo | きんぎょ | ki-n-gyo |
  | hot | atsui | あつい | a-tsu-i |

- the ‘n’ (ん) before a ‘b’, ‘m’, or ‘p’ sounds like an ‘m’, so in these cases the ん is often romanized as ‘m’ rather than ‘n’; this is an example of euphony, i.e., changing a sound so that it is both more pleasing to the ear and easier to pronounce:

  | English | romaji | kana | sounds… |
  |---|---|---|---|
  | dragonfly | tonbo | とんぼ | to-m-bo |
  | stroll | sanpo | さんぽ | sa-m-po |
  | 3 flat things | sanmai | さんまい | sa-m-ma-i |

  Here are some examples of this special case:
  なんば (nanba) sounds ‘namba’ (src: JPRail)
  かんばら (kanbara) sounds ‘kambara’
  てんま (tenma) sounds ‘temma’
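The ‘n’ to ‘m’ change above is mechanical enough to express as a one-line substitution on romaji. The sketch below is a minimal illustration; the function name `euphonic_n` is made up for this example, and it assumes the input is lowercase romaji in which ん is always written ‘n’.

```python
import re


def euphonic_n(romaji: str) -> str:
    """Turn an 'n' into 'm' when the next letter is 'b', 'm', or 'p'."""
    return re.sub(r"n(?=[bmp])", "m", romaji)


for word in ["tonbo", "sanpo", "sanmai", "nanba", "sannin"]:
    print(word, "->", euphonic_n(word))
# tonbo -> tombo
# sanpo -> sampo
# sanmai -> sammai
# nanba -> namba
# sannin -> sannin
```

An ‘n’ followed by a vowel or by another ‘n’ is left alone, so words such as ‘sannin’ pass through unchanged.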
Pitch accent
Many Japanese words truly have no pre-defined pitch accent, e.g., the word ‘ichi’ (one) is normally pronounced flat, but either of its syllables might be stressed depending on the context or the dialect. However, some words do have a specific pitch [wikipedia]. For example:
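| romaji | English | kana | kanji |
|---|---|---|---|
| kami (sama) | god, deity, spirit | かみ | 神 |
| kami | hair | かみ | 髪 |
| ame | rain | あめ | 雨 |
| ame | hard candy | あめ | 飴 |
| hashi | chopsticks | はし | 箸 |
| hashi | bridge | はし | 橋 |
| kaki | oyster | かき | 牡蠣 |
| kaki | persimmon | かき | 柿 |
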
The kana do not have accents that indicate pitch, and the kanji do not give a clue either; thus, there is no alternative but to listen to a native speaker and memorize the pitch, if any. Still, there are a few hints that can help in certain cases.
Compound words
In English, when we put together two or more words to form a compound word, the compound word preserves the pitches of its component words, e.g.,
belly + button → belly-button
carry + over → carry-over
Even though these compound words are now single words, we still pronounce each of their components with its original pitch, as if we were pronouncing two different words. Japanese does the same, i.e., the components of compound words are pronounced as if they were individual words:
kami (God) + sama (lord) → kami-sama (God)
ashi (foot) + kubi (neck) → ashi-kubi (ankle)
mizu (water) + umi (sea) → mizu-umi (lake)
If the component words happen to be one syllable long, then we might end up with what appear to be different pronunciations of the same word, when in reality all we are doing is stressing one of the component words. In English, suppose that we have the word ‘twenty-five’; we could stress ‘twenty’ or ‘five’ to draw attention to that component of the word, or pronounce them flat. This is more difficult to see in Japanese, where the compound words can be so small that we tend to think of them as single words (e.g., ‘gohan’) instead of multiple words (e.g., ‘go-han’):
| English | 1st syllable | 2nd syllable | compound word |
|---|---|---|---|
| meal | go (honorific) | han (cooked rice) | go-han |
| tonight | kon (this) | ban (evening) | kon-ban |
| weather | ten (sky) | ki (atmosphere) | ten-ki |
| telephone | den (electric) | wa (talk) | den-wa |

However, Japanese takes this a bit further. If we have a single word that is being modified, say, with a suffix, both the word and the suffix keep their pitches:
nomi + masu → nomi-masu (I drink)
nomi + masen → nomi-masen (I don’t drink)
nomi + tai → nomi-tai (I want to drink)
nomi + taku + nai → nomi-taku-nai (I don’t want to drink)
Hence, the pronunciation tends to be correct when we treat the components of a word as separate words (e.g., nomi-masu), each with its own pitch (if any), instead of considering the word as a single unit (e.g., nomimasu) and attempting to single out a particular syllable.
Dialects
Finally, native speakers from different regions of Japan might pronounce words in different ways. For example, the Japanese spoken in Tokyo, which is considered the ‘standard’ Japanese, tends to stress the first syllable, while the Kansai dialect (e.g., Kyoto, Osaka) tends to stress the last one:
| region | thanks |
|---|---|
| Tokyo | arigatou |
| Kansai region | arigatou |

The differences between dialects go way beyond pitch, though. A Kansai-dialect speaker would pronounce ‘arigatou’ differently from a Tokyoite but, actually, he or she is more likely to give thanks using the local dialect word, i.e., 大きに (ookini). Even different regions that share a dialect speak in different ways, e.g., we could say that the Kansai dialect covers Osaka, Hyogo, and Kyoto, but there are marked differences among their speech. Dialects like those of Hokkaido, Okinawa, and many others have their own idiosyncrasies.







