Chinese and Japanese character support in python
February 1st, 2013, 12:39 AM
Registered User
Join Date: Jan 2013
Posts: 4
Time spent in forums: 49 m 59 sec
Reputation Power: 0
I'm reading text from a textbox that contains the path "E\7.4\日本国" and writing it to a text file (.txt). The problem is that the Japanese characters are not written to the text file; question marks appear in their place. I tried writing it as UTF-8 from Python, but it doesn't work. I tried the codecs module and unicode, but I can't get it to work.
After writing to the text file, I read the same path back to use in my code.
I'm using ranorexpython and Python to test a GUI written in C#.
Here is the code:
#Writing path in pathfile.txt file
writePath = Path1 + "\n" + Path2
tempfile = open("C:\\Temp\\pathfile.txt", "w")
tempfile.writelines(writePath)
tempfile.close()

February 1st, 2013, 03:25 AM
Contributing User
Join Date: Jul 2007
Location: Joensuu, Finland
Quote: | Originally Posted by dx_generation25 I'm reading text from a textbox that contains the path "E\7.4\日本国" and writing it to a text file (.txt). The problem is that the Japanese characters are not written to the text file; question marks appear in their place. |
Do you use Python 2 or Python 3? Handling character encodings can be extremely tricky in Python 2 even to the point that nothing seems to work.
Still, I tried your code snippet in Python 2 and it worked when the path was given as a literal. I guess the problem is that what comes out of the textbox is encoded in a legacy encoding, and you should use strings’ .decode() and .encode() methods to get it to UTF-8.

February 1st, 2013, 06:18 AM
Registered User
Quote: | Originally Posted by SuperOscar Do you use Python 2 or Python 3? Handling character encodings can be extremely tricky in Python 2 even to the point that nothing seems to work.
Still, I tried your code snippet in Python 2 and it worked when the path was given as a literal. I guess the problem is that what comes out of the textbox is encoded in a legacy encoding, and you should use strings’ .decode() and .encode() methods to get it to UTF-8. |
I use Python 2.5. The problem I'm facing is that if I write
tempfile = open("C:\\Temp\\pathfile.txt", "w", "utf-8")
it does not work.
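A likely reason the call above fails: the third positional argument of the built-in open() is a buffer size, not an encoding, so "utf-8" is rejected with a TypeError. The following is a minimal sketch of one alternative, codecs.open(), which does accept an encoding in that position. It assumes the textbox values arrive as unicode strings. The names path1, path2 and out_name are hypothetical stand-ins, and a temporary directory replaces C:\Temp so the sketch runs anywhere.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Hypothetical stand-ins for the two paths read from the GUI textboxes.
path1 = u"E:\\7.4\\日本国"
path2 = u"E:\\7.4\\は最高のプログラマ"

# Assumption: a temporary directory is used instead of C:\Temp.
out_name = os.path.join(tempfile.mkdtemp(), "pathfile.txt")

# The built-in open() expects a buffer size as its third positional
# argument, so passing "utf-8" there raises TypeError.
try:
    open(out_name, "w", "utf-8")
    builtin_open_failed = False
except TypeError:
    builtin_open_failed = True

# codecs.open() takes the encoding as its third argument and encodes
# unicode strings to UTF-8 bytes on write.
out = codecs.open(out_name, "w", "utf-8")
out.write(path1 + u"\n" + path2)
out.close()

# Reading back through the same codec yields unicode strings again.
inp = codecs.open(out_name, "r", "utf-8")
lines = inp.read().split(u"\n")
inp.close()
```

Note that the file object is called out rather than tempfile here, because that name would otherwise shadow the standard tempfile module used in the sketch.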
I checked the text file after writing the Japanese path to it, and the characters are displayed as question marks.
I have tried encode and decode, but that doesn't work either.
Let me know if you have a solution for it.
e.g.
>>> path = r"E:\7.4\は最高のプログラマ"
>>> t = path.encode()
>>> print t
E:\7.4\?????????
>>> t = path.decode()
>>> print t
E:\7.4\?????????
>>> t = path.encode("utf-8")
>>> print t
E:\7.4\?????????
>>> t = path.decode("utf-8")
>>> print t
E:\7.4\?????????
>>>
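One plausible reading of the session above is that the Japanese characters were already replaced by literal "?" characters before any of these calls ran, for example by a console code page that cannot represent them. This is an assumption, since the console setup is not shown. If so, no later encode() or decode() call can bring them back. The sketch below reproduces the same kind of loss on purpose, using the ASCII codec with the "replace" error handler, and contrasts it with a lossless UTF-8 round trip.

```python
# -*- coding: utf-8 -*-
# The path from the session above, written as a unicode literal.
text = u"E:\\7.4\\は最高のプログラマ"

# Encoding to a codec that lacks these characters, with errors="replace",
# substitutes one "?" per character that cannot be encoded.
lossy = text.encode("ascii", "replace")

# The bytes now hold literal question marks, so decoding them
# cannot recover the original characters.
roundtrip = lossy.decode("ascii")

# UTF-8 can represent every character, so this round trip is lossless.
utf8_bytes = text.encode("utf-8")
restored = utf8_bytes.decode("utf-8")
```

If the string held by the program looks like roundtrip rather than text, the fix has to happen where the text enters the program, not where it is written out.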

February 1st, 2013, 06:41 AM
Contributing User
Maybe your program is correct but the program you use to view the information won't display utf8?
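One way to check this without trusting any viewer is to look at the raw bytes of the file. The sketch below is only an illustration: describe_file_bytes and write_bytes are hypothetical helper names, and the two sample files are created in a temporary directory. A file that contains "?" bytes and nothing outside ASCII has already lost its characters. A file with non-ASCII bytes that decode cleanly as UTF-8 is probably fine, and the viewer is the more likely culprit.

```python
# -*- coding: utf-8 -*-
import os
import tempfile


def write_bytes(file_name, data):
    """Write raw bytes to a file (hypothetical helper for this sketch)."""
    f = open(file_name, "wb")
    try:
        f.write(data)
    finally:
        f.close()


def describe_file_bytes(file_name):
    """Classify a file's raw bytes as 'lossy', 'ascii', 'utf-8' or 'other'."""
    f = open(file_name, "rb")
    try:
        data = f.read()
    finally:
        f.close()
    try:
        data.decode("ascii")
        only_ascii = True
    except UnicodeDecodeError:
        only_ascii = False
    if only_ascii:
        # Pure ASCII with "?" bytes suggests the characters were replaced.
        if "?".encode("ascii") in data:
            return "lossy"
        return "ascii"
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"


# Two sample files: one written as UTF-8, one with characters replaced.
text = u"E:\\7.4\\日本国"
work_dir = tempfile.mkdtemp()
good_name = os.path.join(work_dir, "good.txt")
lossy_name = os.path.join(work_dir, "lossy.txt")
write_bytes(good_name, text.encode("utf-8"))
write_bytes(lossy_name, text.encode("ascii", "replace"))

good_result = describe_file_bytes(good_name)
lossy_result = describe_file_bytes(lossy_name)
```

The "lossy" label is a heuristic, since a path could legitimately contain a question mark on some systems.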

February 1st, 2013, 07:13 AM
Registered User
Quote: | Originally Posted by b49P23TIvg Maybe your program is correct but the program you use to view the information won't display utf8? |
That is not the issue with UTF-8. If I manually type any Japanese or Chinese characters in Notepad and save the file with UTF-8 encoding, no data is lost.
But the same thing does not work from Python.

February 1st, 2013, 07:43 AM
Contributing User
I wish I knew something about Chinese and Japanese encodings (other than Unicode, that is)...
My own problem has usually been that GUIs written in Python 2 assume, say, Latin-1, and I want to write the output files in UTF-8. The solution is to:
Code:
s = s.decode('Latin-1').encode('UTF-8')
(where “s” is a string obtained from a textbox in a GUI). I think that in your case you would just replace Latin-1 with the correct coding.
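As a sketch of the same decode-then-encode step for Japanese text, the example below assumes the textbox hands back bytes in cp932, the Windows code page commonly used for Japanese. That is a guess, not something the thread confirms, and the real source encoding has to be found out first. The helper name to_utf8 is hypothetical, and the legacy bytes are simulated rather than read from a GUI.

```python
# -*- coding: utf-8 -*-
def to_utf8(raw_bytes, source_encoding):
    """Decode bytes from a legacy encoding and re-encode them as UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")


# Simulated textbox output. Assumption: the GUI returns cp932 bytes.
original = u"E:\\7.4\\日本国"
legacy_bytes = original.encode("cp932")

# Same pattern as the Latin-1 line above, with the encoding swapped.
utf8_bytes = to_utf8(legacy_bytes, "cp932")

# Decoding the result as UTF-8 gives back the original text.
recovered = utf8_bytes.decode("utf-8")
```

If the wrong source encoding is named, decode() may raise UnicodeDecodeError or silently produce the wrong characters, so it is worth testing against a known string.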

February 1st, 2013, 08:14 AM
Registered User
Quote: | Originally Posted by SuperOscar I wish I knew something about Chinese and Japanese encodings (other than Unicode, that is)...
My own problem has usually been that GUIs written in Python 2 assume, say, Latin-1, and I want to write the output files in UTF-8. The solution is to:
Code:
s = s.decode('Latin-1').encode('UTF-8')
(where “s” is a string obtained from a textbox in a GUI). I think that in your case you would just replace Latin-1 with the correct coding. |
OK, I will try that out.