#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2005
    Posts
    227
    Rep Power
    16

    Preg_replace regex to replace "rogue" apostrophes


    Hi,

    Following up on a previous question posted to the wrong forum...

    I want to search for any "rogue" apostrophes in a long $string that don't match common usage, and replace them with a space.

    For example, they're, I've and he'll will remain as-is. But in they'vve, he'lele and big'ss the ' will be replaced by a space, effectively splitting the word in two within the string.

    This is the regex I've got, which seems to find the matches OK; I just don't know how to get it to replace the unwanted apostrophes.

    Can anyone help? Thanks, Regan.


    Code:
    $word = "they'll";
    
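    // Match a word, an apostrophe, then a common contraction suffix followed by a space or the end of the string.
    if ( preg_match("/([A-Za-z0-9]+)(\')(s|t|d|ll|re|ve|m)( |$)/i",$word,$matches) ) {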
    	die($matches[0]);
    }
    
    die('nomatch');
    Last edited by ryel01; January 28th, 2009 at 07:19 PM.
  2. #2
    righto, I seem to have this working to some degree...

    Code:
    $string = "they'll do it a'ere he'dd be that's ok";
    
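    // Replace any apostrophe that is NOT followed by a known suffix plus a space or the end of the string.
    $string = preg_replace("/\'(?!(s|t|d|ll|re|ve|m|nt)( |$))/i"," ",$string);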
    
    die($string);

    If anyone can see any obvious mistakes, your comments would be welcome.

    regan
  4. #3
  5. Did you steal it?
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,069
    Rep Power
    9398
    That's basically what I would have done, though instead of ( |$) I might have used \b (=word boundary).
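
    As a rough sketch of that \b variant (the suffix list is copied from the pattern in post #2; the function name and sample string are made up for illustration):

```php
<?php
// Hypothetical helper: replace every apostrophe that is NOT followed by a
// common contraction suffix ending at a word boundary.
function replace_rogue_apostrophes(string $string): string
{
    // \b takes the place of ( |$), so the suffix may also be followed by
    // punctuation such as a comma or a full stop.
    return preg_replace("/'(?!(?:s|t|d|ll|re|ve|m|nt)\b)/i", " ", $string);
}

echo replace_rogue_apostrophes("they'll, he'dd that's."), "\n";
// prints: they'll, he dd that's.
```

    With ( |$) the same input would also lose the apostrophes in they'll, and that's. because the suffix is followed by a comma or full stop rather than a space.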
  6. #4
    Originally Posted by requinix
    instead of ( |$) I might have used \b (=word boundary).
    hi requinix.

    can you change my regex to show me how the \b works?

    it should also check for some leading characters before the ' but I'm still working on that one... (I'm not exactly the expert when it comes to regex!)

    regan
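
    For the leading-character check, one possible approach is a negative lookbehind, so that an apostrophe with no letter or digit directly in front of it also counts as rogue. This is only a guess at what is wanted, based on the [A-Za-z0-9]+ prefix in the first post. The function name and sample string are hypothetical, and the suffix list is the one from post #2:

```php
<?php
// Hypothetical helper: an apostrophe is kept only if it has a letter or digit
// before it AND a known contraction suffix ending at a word boundary after it.
function replace_rogue_apostrophes_strict(string $string): string
{
    $pattern = "/(?<![A-Za-z0-9])'"                   // nothing alphanumeric before the '
             . "|'(?!(?:s|t|d|ll|re|ve|m|nt)\b)/i";   // or no valid suffix after it
    return preg_replace($pattern, " ", $string);
}

// Brackets make the leading space in the result visible.
echo "[", replace_rogue_apostrophes_strict("'s fine, they'll go, he'dd stay"), "]\n";
// prints: [ s fine, they'll go, he dd stay]
```

    The lookbehind is fixed-length (a single character), which PCRE requires, and it consumes nothing, so only the apostrophe itself is replaced.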
