Python Programming
 
Dev Shed Forums > Programming Languages > Python Programming

  #1  
February 1st, 2013, 12:39 AM
dx_generation25
Chinese and Japanese character support in Python

I'm reading text from a textbox that contains the path "E\7.4\日本国" and writing it to a text file (.txt). The problem is that the Japanese characters are not written to the text file; question marks are written in their place. I tried writing it as UTF-8 from Python, but it doesn't work. I have tried codecs and unicode, but I can't get it to work.

After writing to the text file, I read the same path back to use in my code.
I'm using ranorexpython and Python to test code written in C# (GUI).

Here is the code:
# Write the paths to pathfile.txt
writePath = Path1 + "\n" + Path2
tempfile = open("C:\\Temp\\pathfile.txt", "w")
tempfile.writelines(writePath)
tempfile.close()

  #2  
February 1st, 2013, 03:25 AM
SuperOscar
Quote:
Originally Posted by dx_generation25
I'm reading text from a textbox that contains the path "E\7.4\日本国" and writing it to a text file (.txt). The problem is that the Japanese characters are not written to the text file; question marks are written in their place.


Do you use Python 2 or Python 3? Handling character encodings can be extremely tricky in Python 2 even to the point that nothing seems to work.

Still, I tried your code snippet in Python 2 and it worked when the path was given as a literal. I guess the problem is that what comes out of the textbox is encoded in a legacy encoding, and you should use strings’ .decode() and .encode() methods to get it to UTF-8.
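
A minimal sketch of that decode-then-encode step. The source encoding is an assumption: "cp932" (the Japanese Windows code page) is used purely for illustration, since the thread does not say what the textbox actually returns, and to_utf8 is a hypothetical helper name. The text is written with \u escapes so the source file stays ASCII.

```python
# Sketch only: "cp932" is an assumed legacy encoding, not something the
# thread confirms. to_utf8 is a hypothetical helper name.

def to_utf8(raw, source_encoding):
    """Decode bytes from a legacy encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)   # legacy bytes -> unicode text
    return text.encode("utf-8")          # unicode text -> UTF-8 bytes


# The three characters of the folder name from the first post, as escapes.
folder = u"\u65e5\u672c\u56fd"

legacy_bytes = folder.encode("cp932")    # stand-in for the textbox value
utf8_bytes = to_utf8(legacy_bytes, "cp932")
print(repr(utf8_bytes))
```

If the textbox already hands back unicode objects rather than byte strings, the decode step would not be needed and only the final .encode("utf-8") applies.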
  #3  
February 1st, 2013, 06:18 AM
dx_generation25
Quote:
Originally Posted by SuperOscar
Do you use Python 2 or Python 3? Handling character encodings can be extremely tricky in Python 2 even to the point that nothing seems to work.

Still, I tried your code snippet in Python 2 and it worked when the path was given as a literal. I guess the problem is that what comes out of the textbox is encoded in a legacy encoding, and you should use strings’ .decode() and .encode() methods to get it to UTF-8.


I use Python 2.5. The problem I'm facing is that if I write
tempfile = open("C:\\Temp\\pathfile.txt", "w", "utf-8")
it does not work.
I checked the text file after writing the Japanese path to it, and the characters are displayed as question marks.
I have tried encode and decode, but that doesn't work either.
Let me know if you have a solution for this.

e.g.
>>> path = r"E:\7.4\は最高のプログラマ"
>>> t = path.encode()
>>> print t
E:\7.4\?????????
>>> t = path.decode()
>>> print t
E:\7.4\?????????
>>> t = path.encode("utf-8")
>>> print t
E:\7.4\?????????
>>> t = path.decode("utf-8")
>>> print t
E:\7.4\?????????
>>>
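
One way question marks like these can arise (an illustration, not a diagnosis of this exact session): when text is converted to an encoding that cannot represent a character and the conversion uses a "replace" policy, each such character becomes a literal "?". After that the original characters are gone, so no later .encode() or .decode() call can bring them back. The sketch below uses ASCII as the assumed target encoding.

```python
# Sketch: a lossy conversion turns unrepresentable characters into "?".
# ASCII is used here only as an example of an encoding without kanji.

text = u"E:\\7.4\\\u65e5\u672c\u56fd"      # path with three kanji (as escapes)

lossy = text.encode("ascii", "replace")    # each kanji becomes one "?"
print(repr(lossy))

# Decoding or re-encoding the lossy bytes only shuffles the "?" around.
round_trip = lossy.decode("ascii").encode("utf-8")
print(round_trip == lossy)
print(lossy.decode("ascii") == text)
```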

  #4  
February 1st, 2013, 06:41 AM
b49P23TIvg
Maybe your program is correct, but the program you use to view the file can't display UTF-8?
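
One way to tell the two cases apart is to look at the raw bytes instead of trusting a viewer. A rough sketch follows; classify is a hypothetical helper name, and the sample byte strings stand in for the contents of the real file (which could be read with open(filename, "rb").read()).

```python
# Sketch: inspect raw bytes to see whether non-ASCII text actually reached
# the file. classify is a hypothetical helper, not a library function.

def classify(data):
    """Roughly describe what a byte string contains."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"
    for ch in text:
        if ord(ch) > 127:
            return "UTF-8 with non-ASCII characters"
    return "ASCII only"


# Stand-ins for file contents: one with real kanji, one with literal "?".
sample_ok = u"E:\\7.4\\\u65e5\u672c\u56fd".encode("utf-8")
sample_lost = u"E:\\7.4\\???".encode("utf-8")

print(classify(sample_ok))
print(classify(sample_lost))
```

If the file turns out to be ASCII only where Japanese text was expected, the characters were lost before or during the write, and the viewer is not to blame.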
  #5  
February 1st, 2013, 07:13 AM
dx_generation25
Quote:
Originally Posted by b49P23TIvg
Maybe your program is correct, but the program you use to view the file can't display UTF-8?


The problem is not with displaying UTF-8. If I manually type Japanese or Chinese characters in Notepad and save the file with UTF-8 encoding, no data is lost.
But the same does not work from Python.

  #6  
February 1st, 2013, 07:43 AM
SuperOscar
I wish I knew something about Chinese and Japanese encodings (other than Unicode, that is)...

My own problem has usually been that GUIs written in Python 2 assume, say, Latin-1, and I want to write the output files in UTF-8. The solution is to:
Code:
s = s.decode('Latin-1').encode('UTF-8')

(where “s” is a string obtained from a textbox in a GUI). I think that in your case you would just replace Latin-1 with the correct coding.
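
To tie this back to the file-writing code from the first post, here is a minimal sketch that writes the paths and reads them back through codecs.open, which accepts an encoding argument. (In Python 2 the third positional argument of the built-in open() is the buffer size, so passing "utf-8" there does not select an encoding.) The legacy encoding "cp932", the helper names, the second sample path, and the temporary file are all assumptions for illustration.

```python
import codecs
import os
import tempfile

# Sketch only: "cp932" is an assumed textbox encoding; save_paths and
# load_paths are hypothetical helper names.


def save_paths(raw_paths, source_encoding, filename):
    """Decode each byte-string path and write them as UTF-8, one per line."""
    lines = [p.decode(source_encoding) for p in raw_paths]
    out = codecs.open(filename, "w", encoding="utf-8")
    try:
        out.write(u"\n".join(lines))
    finally:
        out.close()


def load_paths(filename):
    """Read the UTF-8 file back and return the paths as unicode strings."""
    inp = codecs.open(filename, "r", encoding="utf-8")
    try:
        return inp.read().split(u"\n")
    finally:
        inp.close()


# Stand-ins for the two textbox values (kanji written as escapes).
path1 = u"E:\\7.4\\\u65e5\u672c\u56fd".encode("cp932")
path2 = u"E:\\7.4\\other".encode("cp932")

fd, filename = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    save_paths([path1, path2], "cp932", filename)
    paths = load_paths(filename)
finally:
    os.remove(filename)

print(paths == [u"E:\\7.4\\\u65e5\u672c\u56fd", u"E:\\7.4\\other"])
```

Reading the file back through the same codec returns unicode objects, so the path can be used later without another manual decode.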

  #7  
February 1st, 2013, 08:14 AM
dx_generation25
Quote:
Originally Posted by SuperOscar
I wish I knew something about Chinese and Japanese encodings (other than Unicode, that is)...

My own problem has usually been that GUIs written in Python 2 assume, say, Latin-1, and I want to write the output files in UTF-8. The solution is to:
Code:
s = s.decode('Latin-1').encode('UTF-8')

(where “s” is a string obtained from a textbox in a GUI). I think that in your case you would just replace Latin-1 with the correct coding.


OK, I will try it out.
