October 18th, 2003, 08:56 PM
special characters. unicode utf8
I need some guidance with the following problem.
I'm building a file named convert.py which reads Swedish docs and converts all ó to o and all â to a. In general, what I need is to get rid of the special characters and convert them to regular ones.
I've tried using encode, decode, utf8 and unicode, but with no success.
Can someone help me?
October 19th, 2003, 08:13 AM
I suspect the reason encode and decode aren't working is that Swedish is a language, not an encoding. If you could post your code I'd be glad to take a look. Also, do you have a list of the special chars you need converting? (By special chars I'm assuming you mean characters like â and ó.)
I can't imagine this would be too hard to pull off if we're talking plain .txt docs, not Microsoft Word, right?
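To illustrate the language-versus-encoding point, here is a rough sketch. It assumes a current Python 3 interpreter (not the Python of this thread's date) and assumes the docs are stored as UTF-8, which nobody has confirmed yet. The variable names are made up for the example.

```python
# Assumption: Python 3, and the Swedish text is stored as UTF-8.
text = "åäö"

# Encoding turns characters into bytes; the codec name is an encoding, not a language.
raw = text.encode("utf-8")          # b'\xc3\xa5\xc3\xa4\xc3\xb6'

# Decoding with the same encoding gives the original characters back.
round_trip = raw.decode("utf-8")    # 'åäö'

# Decoding with a different encoding does not fail here, it silently
# produces the wrong characters (each byte becomes one Latin-1 character).
garbled = raw.decode("latin-1")     # 'Ã¥Ã¤Ã¶'
```

So the first thing to pin down is which encoding the files are actually saved in.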
October 19th, 2003, 09:25 AM
It depends on how flexible you want to be. Doing the simplest thing, I read in a UTF-8-encoded file containing just "åäö", and the characters showed up as "\xc3\xa5\xc3\xa4\xc3\xb6". Each character is a fixed two-byte sequence, so a simple replace on those sequences will change the characters.
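As a minimal sketch of both ends of that flexibility range (assuming Python 3 and UTF-8 input; the function names and the replacement table are hypothetical, not taken from convert.py): the first function does the simple byte replace described above, and the second is a more general alternative that decomposes each character with the standard unicodedata module and drops the accent marks.

```python
import unicodedata

# UTF-8 bytes of "åäö", as quoted above.
raw = b"\xc3\xa5\xc3\xa4\xc3\xb6"

# Hypothetical table: UTF-8 byte sequence -> plain ASCII replacement.
# It only covers the characters listed here; extend it as needed.
REPLACEMENTS = {
    "å".encode("utf-8"): b"a",
    "ä".encode("utf-8"): b"a",
    "ö".encode("utf-8"): b"o",
}


def replace_bytes(data):
    """Simple approach: replace each known byte sequence in the raw data."""
    for old, new in REPLACEMENTS.items():
        data = data.replace(old, new)
    return data


def strip_accents(text):
    """More general approach: split each character into its base letter plus
    combining marks (NFD), then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(replace_bytes(raw))                   # b'aao'
print(strip_accents(raw.decode("utf-8")))   # aao
print(strip_accents("óâ"))                  # oa
```

The byte replace only touches the characters in the table, while the second version handles any accented letter that has a decomposition, at the cost of having to decode the file first. Letters with no decomposition (ø, for example) pass through unchanged.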