#1
  1. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2003
    Posts
    11
    Rep Power
    0

    special characters. unicode utf8


    Hi there,


    I need some guidance with the following problem.

    I'm building a file named convert.py which reads Swedish docs and converts all ö to o and all å/ä to a. In general, what I need is to get rid of the special characters and convert them to regular ones.

    I've tried using encode, decode, utf8 and unicode, but with no success.

    Can someone help me?

    Cheers,
    I
  2. #2
  3. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
    I suspect the reason encode and decode aren't working is that Swedish is a language, not an encoding. If you could post your code I'd be glad to take a look. Also, do you have a list of the special chars you need converting? (By special chars I'm assuming you mean ones like å, ä and ö.)

    I can't imagine this would be too hard to pull off if we're talking plain .txt docs and not Microsoft Word, right?
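
    To illustrate the language-versus-encoding point, here is a minimal sketch, assuming Python 3 and using UTF-8 and Latin-1 purely as example codecs (the thread does not say which encoding the docs actually use). The same Swedish text turns into different bytes under each codec, so decode has to be given the codec that produced the bytes:

```python
# Assumption: the sample text and both codecs are illustrative only.
text = "\u00e5\u00e4\u00f6"  # the three letters å, ä, ö

utf8_bytes = text.encode("utf-8")      # two bytes per letter
latin1_bytes = text.encode("latin-1")  # one byte per letter

# Decoding round-trips only when the codec matches the one used to encode.
assert utf8_bytes.decode("utf-8") == text
assert latin1_bytes.decode("latin-1") == text

# Decoding UTF-8 bytes as Latin-1 raises no error, but the result is garbled.
garbled = utf8_bytes.decode("latin-1")
print(repr(garbled))
```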

    Mark
    programming language development: www.netytan.com Hula

  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2003
    Posts
    133
    Rep Power
    12
    It depends on how flexible you want to be. Doing the simplest thing, I read in a UTF-8-encoded file containing just "åäö", and the characters showed up as "\xc3\xa5\xc3\xa4\xc3\xb6". This means a simple replace on those byte sequences will change the characters.
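
    The replace approach could look something like the sketch below. It assumes Python 3 and UTF-8 input, and it uses str.translate with a lookup table instead of chained replace calls. The names REPLACEMENTS and strip_swedish are made up for this example, and the mapping covers only the letters mentioned in this thread plus their capitals.

```python
# Hypothetical mapping: extend it with whatever characters the docs contain.
REPLACEMENTS = str.maketrans({
    "\u00e5": "a",  # å
    "\u00e4": "a",  # ä
    "\u00f6": "o",  # ö
    "\u00c5": "A",  # Å
    "\u00c4": "A",  # Ä
    "\u00d6": "O",  # Ö
})


def strip_swedish(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw bytes, then swap the mapped letters for plain ASCII ones."""
    return raw.decode(encoding).translate(REPLACEMENTS)


# The byte string quoted in the post above is "åäö" encoded as UTF-8.
print(strip_swedish(b"\xc3\xa5\xc3\xa4\xc3\xb6"))  # prints "aao"
```

    For a real file, the bytes could come from open(path, "rb").read(), with the encoding argument changed if the docs turn out not to be UTF-8.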
