#1
    Regex Split Lookahead Python


    Hi,

    I have a string similar to:
    Code:
    text:hello,text:world,text:sentence, with commas
    What I need to do is split the whole string into:
    Code:
    ['text:hello', 'text:world', 'text:sentence, with commas']
    without splitting on a comma that is followed by a space.

    I understand that one can use a lookahead, but I'm not quite sure how to implement it.

    Any ideas?

    Freddy
#2
    Originally Posted by FreddoT
    Hi,

    I have a string similar to:
    Code:
    text:hello,text:world,text:sentence, with commas
    What I need to do is split the whole string into:
    Code:
    ['text:hello', 'text:world', 'text:sentence, with commas']
    without splitting on a comma that is followed by a space.

    I understand that one can use a lookahead, but I'm not quite sure how to implement it.

    Any ideas?


    Freddy
    This doesn't use a lookahead and isn't particularly elegant, but it works as long as "||" never appears in the input:
    Code:
    >>> s="text:hello,text:world,text:sentence, with commas"
    >>> s=s.replace(", ","||")
    >>> s
    'text:hello,text:world,text:sentence||with commas'
    >>> s2=s.split(',')
    >>> s3=[a.replace("||",", ") for a in s2]
    >>> s3
    ['text:hello', 'text:world', 'text:sentence, with commas']
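
    The same placeholder trick can be wrapped in a small helper. This is only a sketch: the function name split_keep_comma_space and the default placeholder "\x00" are made up for illustration, and it assumes the placeholder never occurs in real input (it raises an error if it does).

```python
def split_keep_comma_space(s, placeholder="\x00"):
    """Split on ',' while leaving ', ' (comma + space) intact."""
    # Hypothetical safety check: the placeholder must not already be in the data.
    if placeholder in s:
        raise ValueError("placeholder already occurs in the input")
    # Hide every ', ' so that only the separator commas are left to split on.
    protected = s.replace(", ", placeholder)
    # Split on the remaining commas, then restore the hidden ', ' sequences.
    return [part.replace(placeholder, ", ") for part in protected.split(",")]


s = "text:hello,text:world,text:sentence, with commas"
print(split_keep_comma_space(s))
# ['text:hello', 'text:world', 'text:sentence, with commas']
```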
#3
    With lookahead:

    Code:
    >>> import re
    >>> x = 'text:hello,text:world,text:sentence, with commas'
    >>> re.split(',(?=t)', x)
    ['text:hello', 'text:world', 'text:sentence, with commas']
    >>>
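
    A slightly stricter variant of the same idea, as a sketch: it assumes every field really starts with the literal prefix "text:" (taken from the sample string), and the names FIELD_SEP and split_fields are made up here. Looking ahead for the whole prefix avoids splitting on a stray comma that merely happens to be followed by a "t".

```python
import re

# A comma counts as a separator only when the next field prefix follows it.
# The lookahead (?=text:) checks for the prefix without consuming it.
FIELD_SEP = re.compile(r",(?=text:)")


def split_fields(s):
    """Split on commas that are immediately followed by 'text:'."""
    return FIELD_SEP.split(s)


x = "text:hello,text:world,text:sentence, with commas,too"
print(split_fields(x))
# ['text:hello', 'text:world', 'text:sentence, with commas,too']

# For comparison, ',(?=t)' also splits before 'too':
print(re.split(r",(?=t)", x))
# ['text:hello', 'text:world', 'text:sentence, with commas', 'too']
```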
#4
    Originally Posted by noobie1000
    With lookahead:

    Code:
    >>> import re
    >>> x = 'text:hello,text:world,text:sentence, with commas'
    >>> re.split(',(?=t)', x)
    ['text:hello', 'text:world', 'text:sentence, with commas']
    >>>
    Beware that this only splits on a comma that is followed by "t". To split on every comma except those followed by a space, use a negative lookahead:
    Code:
    re.split(',(?! )',x)
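
    For reference, a negative lookahead is written (?!...), so "a comma not followed by a space" is ,(?! ). Any "=" placed after "(?!" is matched as a literal character, which is an easy slip when switching from the positive form (?=...). The short check below is a sketch that shows the difference on the sample string; the variable names are arbitrary.

```python
import re

x = "text:hello,text:world,text:sentence, with commas"

# ',(?! )'  : a comma NOT followed by a space.
correct = re.split(r",(?! )", x)
print(correct)
# ['text:hello', 'text:world', 'text:sentence, with commas']

# ',(?!= )' : a comma NOT followed by the two characters '= '.
# No comma here is followed by '= ', so every comma is a split point.
with_equals = re.split(r",(?!= )", x)
print(with_equals)
# ['text:hello', 'text:world', 'text:sentence', ' with commas']
```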
#5
    Cheers


    Originally Posted by rrashkin
    Beware that this only splits on a comma that is followed by "t". To split on every comma except those followed by a space, use a negative lookahead:
    Code:
    re.split(',(?! )',x)
    Cheers, rrashkin. Worked a treat.

    Freddy
