Converting Japanese to Unicode/HTML? Thread poster: Paul Cohen
Paul Cohen | Greenland | Local time: 07:56 | English to German + ...
I have a question concerning placing Japanese texts on websites.
Recently a longstanding client asked my wife to add Japanese texts to an existing website (which my wife programmed herself using an HTML editor). The client said that he would have the translations done by a qualified translator and then forwarded to her.
Well, now she has received a Japanese text in a Word file but she can't figure out how to convert it into HTML code and put it on the website. She has found a website with Unicode code points for Hiragana and Katakana characters, but that seems to cover just a small proportion of the characters.
Does anyone out there have any experience in this area?
Does an application exist that can convert Japanese characters into Unicode/HTML?
Any comments or ideas would be greatly appreciated.
Thanks in advance for your help!
Paul
What is the problem exactly? | Jun 22, 2010
If she received a Word file, the Japanese text itself is most likely in Unicode.
I mean - does it display on her computer correctly?
As to how to put it into HTML - well, you just have to use Unicode for the encoding, and specify the language as Japanese. html lang="ja"
Use meta-tags for Unicode: content="text/html; charset=utf-8"
Or, if you want/need to use Shift-JIS encoding, then charset=Shift_JIS
So, there is no need to replace every character with a code number, if that's what you are thinking. That's not the way to go.
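For anyone who prefers to see the idea in code, here is a minimal Python sketch of the approach described above: the Japanese characters go into the page as they are, and the page only declares its language and encoding. It assumes the text has already been copied out of Word as an ordinary Unicode string; the names `build_page` and `save_page` are made up for this illustration and are not part of any existing tool.

```python
import html
from pathlib import Path


def build_page(title: str, body_text: str) -> str:
    """Return a minimal HTML page declaring Japanese content in UTF-8.

    Hypothetical helper for illustration: each non-empty line of
    body_text becomes one <p> element, and the Japanese characters are
    kept as-is rather than replaced by code numbers.
    """
    paragraphs = "\n".join(
        f"<p>{html.escape(line)}</p>"
        for line in body_text.splitlines()
        if line.strip()
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="ja">\n'
        "<head>\n"
        '<meta http-equiv="content-type" content="text/html; charset=utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"{paragraphs}\n"
        "</body>\n"
        "</html>\n"
    )


def save_page(path: Path, page: str) -> None:
    """Write the page as UTF-8 so the bytes match the declared charset."""
    path.write_text(page, encoding="utf-8")
```

Calling `save_page(Path("index.html"), build_page("テスト", "こんにちは"))` would write a small page; the file name and sample text are placeholders. The important part is that the encoding used when saving agrees with the charset named in the meta tag.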
Maybe I am not clear about the question, but perhaps it would help to take a look at the source code of any Japanese webpage.
http://www.nikon.co.jp/
http://www.toyota.co.jp/
http://www.nissan.co.jp/
Katalin
From an amateur with an interest in character issues and web design | Jun 22, 2010
Might be a silly question, but has the web designer/developer remembered to add support for Unicode in the header? Like:
If not, the best thing would probably be to do so. The next alternative is to use a converter; there are many free ones on the net. This should convert actual characters into HTML numbers or codes. Just remember, if you're using a free online tool you have to be careful with confidential information as well as:
"It's generally better, however, to use the characters themselves rather than their Unicode NCRs in cases where a Web page has a lot of Chinese text, because Chinese characters take up less file space than their NCRs."
Reference: http://pinyin.info/tools/converter/chars2uninumbers.html
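As a rough illustration of what such a converter does, the Python sketch below turns every non-ASCII character into a decimal numeric character reference (NCR) using the standard `xmlcharrefreplace` error handler. The function name `to_ncr` and the sample word are only assumptions for the example. It also compares sizes, which is the point the quotation makes about file space.

```python
def to_ncr(text: str) -> str:
    """Replace each non-ASCII character with a decimal NCR such as &#26085;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    sample = "日本語"  # illustrative sample text
    converted = to_ncr(sample)
    print(converted)
    # Compare the size of the raw UTF-8 text with the size of its NCR form.
    print(len(sample.encode("utf-8")), "bytes as UTF-8")
    print(len(converted.encode("ascii")), "bytes as NCRs")
```

Plain ASCII text passes through unchanged, so the function can be run over a whole paragraph that mixes English and Japanese.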
If none of this is helpful, maybe your site developer might want to read this:
http://www.joelonsoftware.com/articles/Unicode.html
BTW - it's usually safer to throw text into Notepad or similar before adding to a CMS to remove Word's (unnecessary) formatting.
Edited to add that the missing "bit" above (forgot to add spaces around tags) is the same as mentioned by Katalin.
[Edited at 2010-06-22 21:54 GMT]
RieM | United States | Local time: 04:56 | Japanese to English + ...
good ol' native2ascii | Jun 22, 2010
I still use it. It's part of the Java SDK.
Of course, there are text editors that support such conversion. But then, the file should be in text format first.
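For readers who have not seen that kind of output, here is a small Python sketch of the conversion, assuming the tool's usual behaviour of leaving ASCII alone and writing every other character as a `\uXXXX` escape per UTF-16 code unit. The function name `to_java_escapes` is hypothetical, and this is an imitation for illustration, not the tool itself. Note that these are Java-style escapes rather than HTML character references, so they would still need a further step before being pasted into a web page.

```python
def to_java_escapes(text: str) -> str:
    """Imitate native2ascii-style output: non-ASCII becomes \\uXXXX escapes."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)  # plain ASCII is kept unchanged
            continue
        # Characters outside the Basic Multilingual Plane occupy two
        # UTF-16 code units (a surrogate pair), hence the inner loop.
        data = ch.encode("utf-16-be")
        for i in range(0, len(data), 2):
            unit = int.from_bytes(data[i:i + 2], "big")
            out.append(f"\\u{unit:04x}")
    return "".join(out)


if __name__ == "__main__":
    print(to_java_escapes("Nikon 日本"))
```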
I will be happy to take a look and convert it as you like. Just send the file via my profile page.
Rie
Paul Cohen | Greenland | Local time: 07:56 | English to German + ... | TOPIC STARTER
Excellent advice | Jun 23, 2010
Thanks, Katalin, Madeleine and Rie.
Excellent advice! We'll look into it and let you know how things turn out.
Thanks again,
Paul (& Monika)
esperantisto | Local time: 12:56 | Member (2006) | Russian to English + ... | SITE LOCALIZER
Some remarks | Jun 23, 2010
Katalin Horvath McClure wrote:
If she received a Word file, the Japanese text itself is most likely in Unicode.
Theoretically, it could be in the so-called Far Eastern Word 6.0/95 format. But if it is opened in Word 97 to 2010, it is converted to Unicode on the fly.
As to how to put it into HTML - well, you just have to use Unicode for the encoding, and specify the language as Japanese. html lang="ja"
Use meta-tags for Unicode: content="text/html; charset=utf-8"
…and make sure you’re saving your HTML file in the matching encoding, Unicode UTF-8 (with or without BOM, that’s immaterial). I would suggest using a text/HTML editor with explicit encoding control, such as jEdit.
Paul Cohen | Greenland | Local time: 07:56 | English to German + ... | TOPIC STARTER
Not that it matters much at this point, but if anyone wants to post tags in the forum, remember to use character entities, not actual angle brackets. I.e. write &lt; instead of <, because otherwise the forum software misinterprets your tags as, well, tags.
This post shows how things go wrong:
Madeleine MacRae Klintebo wrote:
Might be a silly question, but has the web designer/developer remembered to add support for Unicode in the header? Like:
This is how it looks - as intended - if you use lt and gt:
Madeleine MacRae Klintebo wrote:
Might be a silly question, but has the web designer/developer remembered to add support for Unicode in the header? Like:
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
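If typing the entities by hand is tedious, the same escaping can be sketched in a few lines of Python with the standard `html` module. The helper name `escape_for_forum` is made up for this example, and whether a given forum needs anything beyond the angle brackets and ampersands escaped is an assumption.

```python
import html


def escape_for_forum(markup: str) -> str:
    """Escape &, < and > so a tag is shown as text instead of being parsed."""
    return html.escape(markup, quote=False)


if __name__ == "__main__":
    tag = '<meta http-equiv="content-type" content="text/html;charset=utf-8" />'
    print(escape_for_forum(tag))
```

Passing the escaped string through `html.unescape` gives back the original tag, which is a quick way to check that nothing was lost.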
MS Word text format | Oct 1, 2010
Since the texts in question are in MS Word, one of the easiest ways to convert them is to save the file as plain text and select a Unicode encoding (UTF-8, UTF-7, etc.). The characters are then shown correctly in an HTML file that declares a Unicode encoding in its header.
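If the text file ends up in a different Unicode encoding than the web page uses, it can be re-saved as UTF-8. The Python sketch below assumes the file was exported with Word's plain "Unicode" option, which I take to mean UTF-16 with a byte order mark; that is an assumption, so the source encoding is a parameter. The function name `convert_to_utf8` and any file names are placeholders.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, src_encoding: str = "utf-16") -> str:
    """Read a text file in src_encoding and write the same text as UTF-8.

    The default "utf-16" codec uses the byte order mark at the start of
    the file to work out the byte order. Returns the decoded text so the
    caller can inspect it.
    """
    text = src.read_text(encoding=src_encoding)
    dst.write_text(text, encoding="utf-8")
    return text
```

If the export was already UTF-8, no conversion is needed, and the text can be pasted into the HTML file directly.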
Soonthon Lupkitaro