Japanese Romanization

Japanese Romanization is the way that Japanese text gets transliterated into the Roman alphabet. The romanized text is often referred to as "Rōmaji", from Roman alphabet + "ji" meaning "characters" (much the way "Kanji" literally means "Chinese characters"). The word Romanization can actually refer to using any Latin-based alphabet (French, German, Polish, ...) to write words originally written with any non-Latin script, but in English-speaking fandom it almost universally refers to Japanese-to-English transliteration. See Romanization.

Japanese has three writing systems, two syllabaries and one logography. Katakana and hiragana (both types of kana) are two syllabic systems which are both used to write the same set of syllables; hiragana is the "everyday" system and katakana is mainly used for foreign words and for emphasis. (Technically they represent not syllables but morae. The difference probably won't matter to you.) Kanji are logographic Chinese characters, often with multiple pronunciations depending on context, and their pronunciation must be memorized individually. Small kana (furigana or "rubi") can be written above the kanji to show how they are pronounced; this happens in works meant for younger readers (who may not yet know many kanji) and is very common for names (which typically have multiple possible pronunciations), even on business cards.

Looking at the kana tables, you may notice that there is no "ye" sound, but the unit of currency of Japan is the yen. This is a holdover from the Meiji era, when え (e) was romanized as "ye" (if spelled today, it would be "en"). The "ye" spelling is now usually limited to older things that keep it for old times' sake, such as Yebisu Beer from the Ebisu district of Tokyo.

Japanese to English

Japanese has a few quirks that don't exist in English. Although in general pronouncing kana is simpler, there are challenges in representing it in Roman letters. There are several systems to do this:

Hepburn romanization and its revised variants are the most widely used methods of transcription of Japanese, especially for formal and academic writing. The Hepburn system is intended for use by English speakers and is based on English phonology, so a native speaker of English with no knowledge of Japanese will be more likely to pronounce Hepburn-romanized words correctly than if a different system were used. (For this reason, Hepburn is the preferred system for the English-language version of All The Tropes.) Some linguists dislike the Hepburn method, as it can make the origins of Japanese phonetic structures unclear, but those in favour of it say that the Hepburn system isn't supposed to be used as a linguistic tool anyway.
Kunrei-shiki Rōmaji (literally: Cabinet-ordered romanization system, romanized as "Kunrei-siki" in its own system) is based on the older Nihon-shiki system, and was modified for modern standard Japanese, essentially meaning words are romanized not as they appear, but as they sound in modern spoken Japanese.
Nihon-shiki or Nippon-shiki Rōmaji ("Japan-style"; romanized as Nihon-siki or Nippon-siki in its own system) is the most regular out of all the major romanization systems for Japanese, and has a one-to-one relation to the kana writing systems. The intention of this system was to completely replace kanji and kana with a romanized system, which, its creator believed, would make it easier for Japanese people to compete with Western countries. Since the system was intended for Japanese people to use to write their own language, it is not designed to be easy to pronounce for English speakers (and isn't for the most part).
There's also "word processor romanization" or "wāpuro", which is technically a workaround for inputting Japanese with a QWERTY keyboard but is also used for informal writing, especially on the web. It tends to ignore all the difficulties below and just give a direct transcription of the "standard" kana reading; as such, the spelling may not match the actual pronunciation of words.

R vs L

In the Japanese language, there is technically no "l" or "r" sound; instead, there is a single sound halfway between the two, kind of like a partly rolled "r". In native Japanese words this is romanized as "r" in all systems. With loan words written in katakana, whether it is romanized as an "l" or "r" depends on the source word.

Chi/Ti, Tsu/Tu, Shi/Si, Fu/Hu, Zu/Du/Dzu

One difference between the major romanization systems has to do with how certain consonants are written. Certain consonant/vowel pairs sound more like what an English speaker would consider different consonants. Hepburn writes this as the sound (chi, tsu, shi, fu) and Nihon-shiki and Kunrei-shiki write this using the same consonant even if it doesn't match the English sound (ti, tu, si, hu). These romanizations are still taught in Japan, largely because beginning students of English in Japan have difficulties with the concept of letters as single sounds and consonant clusters are too much for them.

The voiced tsu (tsu written with a voicing mark), which sounds like zu, deserves its own mention. Kunrei-shiki joins modern Hepburn in using the phonetic zu. Nihon-shiki sticks with the same consonant row and writes du. Old Hepburn broke its phonetic scheme to use dzu, which is where the "d" in "kudzu" comes from.
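The differences described in this section are small enough to fit in a lookup table. The sketch below is only an illustration: the table name, the function name and the system labels are made up for this example, and it covers only the five kana discussed above rather than a full kana chart.

```python
# Hypothetical lookup table for the kana discussed above:
# kana -> (Hepburn, Kunrei-shiki, Nihon-shiki)
DIFFERING_KANA = {
    "し": ("shi", "si", "si"),
    "ち": ("chi", "ti", "ti"),
    "つ": ("tsu", "tu", "tu"),
    "ふ": ("fu", "hu", "hu"),
    "づ": ("zu", "zu", "du"),  # voiced tsu; old Hepburn wrote "dzu"
}

# Column order of the tuples in the table (labels are illustrative).
SYSTEMS = ("hepburn", "kunrei", "nihon")


def romanize_kana(kana, system):
    """Return the spelling of one kana in the requested system."""
    return DIFFERING_KANA[kana][SYSTEMS.index(system)]


print(romanize_kana("つ", "hepburn"))  # tsu
print(romanize_kana("つ", "kunrei"))   # tu
print(romanize_kana("づ", "nihon"))    # du
```

Every other kana would get the same spelling in all three columns, which is why the systems agree most of the time.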

The Long Vowel Issue

In Japanese, vowels can be short or long. A long vowel (which just means that the syllable is held for slightly longer, not that the pronunciation is changed) is written in Japanese as two of the vowels in a row - except in the case of long o (which is usually written with a "u" character, as "ou", instead of "oo") and long e (which is usually written with an "i" character, as "ei", instead of "ee").

For example, the name of the city of Tokyo contains two long o vowels, and the Japanese kana (script) would be most directly transcribed as to-u-kyo-u.
There are several ways of presenting the long o:

Hepburn technically requires a bar (macron) over the o (ō): Tōkyō. This can be hard to type, and may cause formatting issues when text is copied between different systems.
A double vowel (oo): Tookyoo. The problem with this is that in English this represents an entirely different sound - a long u, as in "spoon".
The pair spelled the way they are in hiragana (ou): Toukyou. Again, in English this is a different sound, a diphthong as in the word "sound".
Rarely, an h after the vowel (oh): Tohkyoh. This can look unnatural, as no English words have this combination in the middle of a word.
The long/short distinction omitted entirely, as is the case with Tokyo. Most English speakers wouldn't really know the difference between a short and long vowel unless it was pointed out to them, so this is probably the most common way to write it. The downside is that if you want to turn it back into Japanese, you would lose the extra information of long syllables.

Note: There are a few cases where the doubled spelling for long "o" actually is "oo". Ooki (big) and ooi (many) are two such words. There are a few rare cases of "ee" as well.
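The five ways of writing the long o can be sketched as simple string substitutions on the direct kana transcription. This is a toy under stated assumptions: the style names and the function are invented for this example, the input is assumed to be an all-lowercase kana-style spelling such as "toukyou", and every "ou" is assumed to be a long o (which is not true of every Japanese word). Words that write the long o as "oo", like the ones in the note above, are not handled.

```python
# Hypothetical style names mapped to the spelling each one uses for long o.
LONG_O_STYLES = {
    "macron": "ō",    # Hepburn with macron
    "doubled": "oo",  # double vowel
    "kana": "ou",     # as spelled in hiragana
    "h": "oh",        # h after the vowel
    "omitted": "o",   # length distinction dropped
}


def render_long_o(kana_spelling, style):
    """Rewrite every 'ou' in a kana-style spelling using the chosen style."""
    return kana_spelling.replace("ou", LONG_O_STYLES[style])


for style in LONG_O_STYLES:
    print(style, render_long_o("toukyou", style).capitalize())
# macron Tōkyō
# doubled Tookyoo
# kana Toukyou
# h Tohkyoh
# omitted Tokyo
```

Note that only the "omitted" style is irreversible: once the text says "Tokyo", nothing in the spelling tells you which vowels were long.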

In katakana, long syllables are shown with a dash-mark, which is also the stand-in for the English ending R sound. This is why so many Japanese people will accidentally say "ice cream corn" instead of "cone."

Long Consonants

Similarly, there is such a thing as a "long consonant", which is usually written by a small "tsu" character before the syllable; this indicates that the consonant part of the syllable is held for longer. This is generally easier to deal with, as the English consonant is just doubled (e.g. "kappa"). It does get confusing when the character to be doubled is a "ch" or "sh" sound, though.

The main exception to the spelling rule is a double "m" or "n", which is written by an additional "n" character rather than a "tsu". R and H cannot be doubled in Japanese, but H can be doubled in katakana to represent the German "ch" sound (e.g. Heinrich or Ludwig would be spelled he-i-n-ri-(small-tsu)-hi and ru-do-u(small i)-(small tsu)-hi).
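As a rough sketch of the doubling rule, the function below takes a word already split into Hepburn-style syllables, with the small tsu left as a marker, and doubles the consonant that follows each marker. The function name and the input format are assumptions made for this example. The "ch" case follows the usual Hepburn convention of writing the doubled sound as "tch" (as in "matcha"), and a doubled "sh" comes out as "ssh".

```python
SOKUON = "っ"  # the small tsu character, used here as a marker in the input


def apply_long_consonants(syllables):
    """Join romanized syllables, doubling the consonant after each small tsu."""
    out = []
    double_next = False
    for syllable in syllables:
        if syllable == SOKUON:
            # Remember to lengthen the consonant of the next syllable.
            double_next = True
            continue
        if double_next:
            # Hepburn convention: doubled "ch" is written "tch";
            # everything else repeats its first letter ("pp", "ss", "kk"...).
            prefix = "t" if syllable.startswith("ch") else syllable[0]
            syllable = prefix + syllable
            double_next = False
        out.append(syllable)
    return "".join(out)


print(apply_long_consonants(["ka", "っ", "pa"]))   # kappa
print(apply_long_consonants(["za", "っ", "shi"]))  # zasshi
print(apply_long_consonants(["ma", "っ", "cha"]))  # matcha
```

The double "m" and "n" case never reaches the doubling branch, since those are written with an "n" character rather than a small tsu.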

Multi-syllables

There are a few syllables that turn into combinations, like "ji-ya", "chi-yo", etc., with the second syllable written smaller. In Hepburn this is turned into "ja" and "cho", but you can also see "jya"; Nihon-shiki and Kunrei-shiki actually use "zya".
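The combining rule can be sketched in a few lines. The function names are invented for this example, and the inputs are assumed to be an "i"-row syllable followed by a small "ya", "yu" or "yo", each already romanized in the system being used. In Hepburn the "y" disappears after "sh", "ch" and "j"; in Kunrei-shiki it is always kept.

```python
def combine_hepburn(base, small):
    """Combine e.g. 'ji' + 'ya' -> 'ja', 'ki' + 'ya' -> 'kya' (Hepburn)."""
    stem = base[:-1]  # drop the "i" of the first syllable
    if stem in ("sh", "ch", "j"):
        return stem + small[1:]  # the "y" is absorbed: sha, cho, ja
    return stem + small          # the "y" is kept: kya, ryo, nyu


def combine_kunrei(base, small):
    """Combine e.g. 'zi' + 'ya' -> 'zya', 'ti' + 'yo' -> 'tyo' (Kunrei-shiki)."""
    return base[:-1] + small  # always consonant + y + vowel


print(combine_hepburn("ji", "ya"))   # ja
print(combine_hepburn("chi", "yo"))  # cho
print(combine_hepburn("ki", "ya"))   # kya
print(combine_kunrei("zi", "ya"))    # zya
```

The informal "jya" spelling is what you get if you apply the keep-the-"y" rule to Hepburn's "ji".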

The "n" apostrophe

One more issue is how to treat "n" followed by a vowel. Since "n", unlike other consonants, does not have to have a vowel sound after it, it's ambiguous whether "ni", for instance, refers to a single syllable or to a "n" followed by a separate "i". Some systems use an apostrophe to indicate this. (Example: ren'ai, "romantic love", vs. re'nai, "no re".)
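Here is a minimal sketch of the apostrophe rule, assuming the word has already been split into romanized units with the standalone "n" as its own item. The function name is made up for this example. Treating "y" like a vowel is an added assumption based on common Hepburn practice; the rule as described above only mentions vowels.

```python
# Letters that make a preceding standalone "n" ambiguous.
# Including "y" is an assumption beyond the rule described in the text.
AMBIGUOUS_AFTER_N = {"a", "i", "u", "e", "o", "y"}


def join_with_n_apostrophe(units):
    """Join romanized units, marking a standalone 'n' before a vowel or 'y'."""
    out = []
    for index, unit in enumerate(units):
        out.append(unit)
        next_unit = units[index + 1] if index + 1 < len(units) else ""
        if unit == "n" and next_unit[:1] in AMBIGUOUS_AFTER_N:
            out.append("'")
    return "".join(out)


print(join_with_n_apostrophe(["re", "n", "a", "i"]))  # ren'ai
print(join_with_n_apostrophe(["re", "na", "i"]))      # renai
print(join_with_n_apostrophe(["ko", "n", "bo"]))      # konbo
```

Without the apostrophe the first two inputs would both come out as "renai", which is exactly the ambiguity the apostrophe exists to remove.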

English to Japanese to English

Japanese is a language of syllables. Very few words can end in a consonant; most end in vowels. There are also fewer sounds in Japanese than in English. When an English word is presented in Japanese (generally in katakana, the script used for foreign words), information is invariably lost. When it then gets transliterated back into English, the missing information often leads to mistranslations. This is a common malaise when Video Games get brought to English-speaking countries; many names and words are meant to be English, but the translators sometimes get wrong what the original was actually saying.

Common transliteration problems from English to Japanese include:

The lack of a differentiated "R" and "L" sound in Japanese. Japanese has only one sound, which is somewhere between the two. This is probably the most common challenge in romanization: figuring out whether a Japanese syllable is meant to be an R or an L. This is where the term "Engrish" comes from.
Similarly, Japanese doesn't really have an "f" sound; "f" is basically a somewhat stronger version of "h"; the -u syllable is usually written in English as "fu" but the others are "ha, he, hi, ho". This makes for weird combinations like "fu-(small a)" to stand in for "fa". Sometimes the two are interchangeable; for example, "hu" in Japanese would still be spelled with the "fu" syllable.
The lack of ending consonants. "n" is the only consonant that Japanese allows to end a syllable, although "r" is also simulated by a horizontal dash. For everything else, an existing syllable is used, meaning there is an ending vowel (usually "u") that has to get chopped off when romanizing.
Japanese is not written with spaces or capitals. Translators have to figure out where the spaces go, which can be challenging. (Although there is a special dot symbol which can be used to separate words when necessary, e.g. to separate personal name from surname.)
Missing sounds. Japanese has fewer sounds than English. Examples include:
- "th" is turned into "s" when it's not voiced, like in the name "Smith" (su-mi-su). When it's voiced, the "s" is, too: "the" becomes "za" (although "z" is pronounced more like "dz" in most cases).
- "v" can be written as "u" with a voicing mark (dakuten) on it, followed by a small vowel, but more often is just rendered with a "b" (e.g. "violin" would be "ba-i-o-ri-n").
- the "tee" sound doesn't exist in Japanese. It can be written using "te-(small i)", but it's often replaced by "chi". So "steal" gets turned into "su-chi-ru".
For some reason, Japanese sometimes treats an ending "m" like an "n", leading to words like "combo" being turned into "ko-n-bo".
- This is because "n" is assimilated so it's pronounced "m" before labials (i.e. "b", "p", and "m" in Japanese), so writing it "ko-mu-bo" is unnecessary. This assimilation also happens in other languages, whether the speakers are aware of it or not.
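To see why this loses information, here is a deliberately crude toy that applies three of the substitutions listed above to English spellings. Everything about it is an assumption for illustration: the rule list and function name are invented, it works on letters rather than actual sounds, it only handles the unvoiced "th", and it does not add the extra ending vowels.

```python
# Hypothetical substitution rules taken from the list above:
# unvoiced "th" -> "s", "l" -> "r", "v" -> "b".
LOSSY_RULES = [("th", "s"), ("l", "r"), ("v", "b")]


def collapse_to_japanese_sounds(word):
    """Merge English letters that Japanese does not distinguish (toy model)."""
    result = word.lower()
    for english, japanese in LOSSY_RULES:
        result = result.replace(english, japanese)
    return result


print(collapse_to_japanese_sounds("glass"))  # grass
print(collapse_to_japanese_sounds("grass"))  # grass
print(collapse_to_japanese_sounds("Smith"))  # smis
```

Since "glass" and "grass" collapse to the same thing, someone going back from the Japanese spelling has to guess which English word was meant, and that guess is where the mistranslations come from.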

Sounds that don't fit nicely into English or are unusual can be even more confusing.

Some fun examples of missed Romanization:

Star Ocean the Second Story: Scylla -> su-ku-ra -> Scewer
Final Fantasy VIII: Thamasa Soul -> sa-ma-sa -> Samantha Soul
Tales of Phantasia: Stirge -> su-te-i-ji -> Stage
Shadow Hearts: From The New World: Shub Niggurath -> she-bu-ni-gu-ra-su -> Jeb Niglas
Final Fantasy Tactics: Breath -> bu-re-su -> Bracelet
Wild ARMs: Jack Vambrace (a vambrace is an arm guard) -> va-n-bu-re-i-su -> Jack Van Burace

Note that some names were originally in Japanese but meant to sound English. These names have no "real" translation, and can result in all kinds of arguments. A good example is the town called "ri-ze-n-bu-r" from Fullmetal Alchemist, which has been variously translated as Resembool, Risembul, Riesenburgh, and Liesenburgh.