#1
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2004
    Posts
    1
    Rep Power
    0

    removing unwanted characters from a string


    I was wondering how to remove a character from a string that was retrieved from a text file (.txt).
  2. #2
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    London, England
    Posts
    1,585
    Rep Power
    1373
    If you only want to remove all instances of a single character, then you can use the string replace method:

    Code:
    >>> txt = 'hello world'
    >>> txt.replace('l', '')
    'heo word'
    To remove a set of characters, you could use the filter function (in Python 2, calling filter on a string returns a string):

    Code:
    >>> filter(lambda x: x not in 'aeiou', txt)
    'hll wrld'
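    On Python 3, filter returns a lazy iterator rather than a string, so the result has to be joined back together. A minimal sketch of the same idea, assuming Python 3; the helper name remove_chars is made up for illustration:

```python
def remove_chars(text, unwanted):
    """Return text with every character that appears in unwanted removed."""
    # Keep only the characters not listed in unwanted, then join them into a str.
    return ''.join(ch for ch in text if ch not in unwanted)


txt = 'hello world'
result = remove_chars(txt, 'aeiou')
print(result)  # hll wrld
```

    The original filter call still works on Python 3 if it is wrapped as ''.join(filter(...)).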
    Alternatively, you could use the string module's translate and maketrans functions:

    Code:
    >>> import string
    >>> trans = string.maketrans('', '')
    >>> string.translate(txt, trans, 'aeiou')
    'hll wrld'
    The translate function is more useful if you wish to replace some characters and remove others.
    EDIT: translate is also much faster than filter, since it is implemented entirely in C, while filter has to call a Python function for each character. This probably does not matter for scripts that occasionally process short strings, but it can be critical when performance is an issue.
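    To illustrate replacing some characters while removing others in one pass, here is a short sketch. It assumes Python 3, where the module-level string.maketrans and string.translate functions are gone and the equivalents are str.maketrans and the str.translate method:

```python
txt = 'hello world'

# str.maketrans(x, y, z): each character in x maps to the character at the
# same position in y, and every character in z is deleted.
table = str.maketrans('lw', 'LW', 'aeiou')
replaced = txt.translate(table)
print(replaced)  # hLL WrLd

# Removal only: empty from/to strings, with the unwanted characters as z.
removed = txt.translate(str.maketrans('', '', 'aeiou'))
print(removed)  # hll wrld
```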

    Dave - The Developers' Coach
    Last edited by DevCoach; May 8th, 2004 at 06:04 PM.
  4. #3
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Nov 2003
    Posts
    624
    Rep Power
    35
    It all depends on which characters you want to remove...

    To remove whitespace characters (spaces, tabs, newlines, etc.) from the left end of the string:

    Code:
    >>> txt = "  Hello World  "
    >>> txt.lstrip()
    'Hello World  '
    From the right of the string:

    Code:
    >>> txt = "  Hello World  "
    >>> txt.rstrip()
    '  Hello World'
    From both sides:

    Code:
    >>> txt = "  Hello World  "
    >>> txt.strip()
    'Hello World'
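    The strip methods also accept an optional string of characters to remove instead of whitespace, which can help with lines read from a text file. A small sketch; the sample strings are made up for illustration:

```python
raw = '  **Hello World**\n'

# With no argument, strip() removes whitespace, including the trailing newline.
line = raw.strip()                  # '**Hello World**'

# With an argument, strip() removes any of the listed characters from both ends.
clean = line.strip('*')             # 'Hello World'

# The argument is a set of characters, not a prefix or suffix to match.
mixed = '-=-Hello-=-'.strip('-=')   # 'Hello'

print(clean, mixed)
```

    lstrip and rstrip take the same optional argument for one-sided stripping.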
    To remove the first character:

    Code:
    >>> txt = "Hello World"
    >>> txt[1:]
    'ello World'
    The last character:

    Code:
    >>> txt = "Hello World"
    >>> txt[:-1]
    'Hello Worl'
    The character at index 5 (the 6th character, here the space), or more generally the character at index n:

    Code:
    >>> txt = "Hello World"
    >>> txt[:5] + txt[6:]
    'HelloWorld'
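    The slicing examples above can be generalised into a small helper. This is only a sketch: the name remove_at and its count parameter are hypothetical, and it assumes a non-negative index:

```python
def remove_at(text, index, count=1):
    """Return text without the count characters starting at index.

    Assumes index >= 0; negative indices are not handled here.
    """
    # Everything before index, plus everything after the removed span.
    return text[:index] + text[index + count:]


txt = 'Hello World'
print(remove_at(txt, 5))     # HelloWorld
print(remove_at(txt, 0))     # ello World
print(remove_at(txt, 5, 3))  # Hellorld
```

    Strings are immutable in Python, so each call builds a new string and leaves txt unchanged.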
