#1
    Using preg_replace to remove excess lines of text


    Hi, this is my first post here. I'm trying to use preg_replace to remove all lines of text after the first 50. In other words, if there are 54 lines, it would remove the last 4, but leave the rest unchanged.

    The wiki software I'm using (pmwiki) uses this syntax for an "ROS" (replace on save) function:

    PHP Code:
    $ROSPatterns["text to search for"] = "text to replace with";
    The "text to search for" uses the preg_replace syntax. Now, I've never written PHP before, so try not to laugh, but here's what I have so far:

    PHP Code:
    $ROSPatterns["/\n.*/"] = " ";
    This removes everything after the first line. So I thought something like this:

    PHP Code:
    $ROSPatterns["/\n.*{50}/"] = " ";
    would remove everything after the 50th line, but the {50} thing doesn't seem to work anywhere. As I said, I've never written anything like this before, so I'm just feeling around in the dark and using php.net, but everything I try either fails or has mysterious and erratic results.

    If someone could show me how this is done, I'd really appreciate it. Also, if it's not too much trouble, if you could explain what each character you're using is for, that would be great.

    Thanks!
  2. #2
    Note: I haven't tested this in PHP:
    PHP Code:
    $ROSPatterns["/^(([^\n]*\n?){0,50}).*/s"] = "$1";
    Rather than replacing all of the characters after the 50th line break, you are instead replacing the whole string with its first 50 lines. Matching only the characters after the 50th line break with a lookbehind would need a positive lookbehind of indefinite length, which I don't believe PHP supports.
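As a side note, PCRE's `\K` escape (which resets the start of the reported match) is one way to replace only the text after the 50th line break without any lookbehind. This is a different technique from the one in the post, shown here only as a sketch; the function name `dropAfter50thLineBreak` is made up for illustration, and it assumes the pattern reaches preg_replace unchanged.

```php
<?php
// Hypothetical helper: deletes only the text after the 50th line break.
function dropAfter50thLineBreak(string $text): string
{
    // \A anchors at the start of the string, (?:...){50} consumes exactly
    // 50 lines together with their newlines, \K drops that part from the
    // reported match, and .* (with the s modifier) matches what is left.
    return preg_replace('/\A(?:[^\n]*\n){50}\K.*/s', '', $text);
}

// Build 54 lines: "line 1" ... "line 54", separated by newlines.
$text = implode("\n", array_map(fn($i) => "line $i", range(1, 54)));
$kept = explode("\n", rtrim(dropAfter50thLineBreak($text), "\n"));

echo count($kept), "\n"; // 50
```

A string with fewer than 50 line breaks does not match at all, so it comes back unchanged.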

    PHP will replace the $1 in the value with whatever the first capturing group of the regular expression matched, which in this case is the first 50 lines.
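A quick way to check the pattern outside the wiki is to call preg_replace directly. The sketch below assumes pmwiki hands the key and value to preg_replace as-is; `keepFirst50Lines` is a made-up helper name.

```php
<?php
// Hypothetical helper: applies the pattern from the post with preg_replace.
function keepFirst50Lines(string $text): string
{
    // The replacement is single-quoted here so "$1" can never be mistaken
    // for PHP variable interpolation; it refers to capturing group 1.
    return preg_replace("/^(([^\n]*\n?){0,50}).*/s", '$1', $text);
}

// Build 54 lines: "line 1" ... "line 54", separated by newlines.
$text = implode("\n", array_map(fn($i) => "line $i", range(1, 54)));

$result = keepFirst50Lines($text);
$lines  = explode("\n", rtrim($result, "\n"));

echo count($lines), "\n"; // 50
echo end($lines), "\n";   // line 50
```

The result keeps the newline that ends line 50, which is why the example trims it before counting.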

    The two /'s in the pattern are just delimiters. They can be any character you choose, as long as it is not alphanumeric, a backslash, or whitespace.
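A small sketch of the delimiter point, using `#` in place of `/` (the variable names are made up for illustration):

```php
<?php
// The same pattern written with two different delimiters.
$withSlash = "/^[^\n]*/";
$withHash  = "#^[^\n]*#";

$text = "first line\nsecond line";

// Both patterns match the first line only.
preg_match($withSlash, $text, $a);
preg_match($withHash, $text, $b);

echo $a[0], "\n"; // first line
echo $b[0], "\n"; // first line
```

A different delimiter is mostly useful when the pattern itself contains slashes, since those would otherwise need escaping.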

    Parentheses are used to group patterns together, and each pair also captures what it matched (which is what makes $1 available).

    The ^ immediately after the first / anchors the pattern to the start of the string.

    [^\n] matches any character that is NOT a newline. [\n] would match any character that is a newline, but the ^ at the start of the character class inverts it.

    The * after [^\n] means match zero or more times. This matches all of the characters on the line except the newline.

    The \n matches a single newline (the end of the line).

    The ? after the \n makes the ending newline optional (for example, when the last line of the string has no trailing newline, as can happen if your string is shorter than 50 lines).

    {0,50} means to match the preceding group 0 to 50 times (the first 50 lines).
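To see the pieces above working together, here is a sketch that makes the line limit a parameter. The function `keepFirstLines` and its `$limit` argument are hypothetical, and the inner group is written as non-capturing `(?:...)` since only the outer group is used.

```php
<?php
// Hypothetical helper: keeps at most $limit lines of $text.
function keepFirstLines(string $text, int $limit): string
{
    // [^\n]* matches one line's characters, \n? its optional newline,
    // and {0,$limit} repeats that line pattern up to $limit times.
    $pattern = '/^((?:[^\n]*\n?){0,' . $limit . '}).*/s';
    return preg_replace($pattern, '$1', $text);
}

// Four lines cut down to two: the newline ending line 2 is kept.
var_dump(keepFirstLines("a\nb\nc\nd", 2) === "a\nb\n"); // bool(true)

// Fewer lines than the limit: the text comes back unchanged.
var_dump(keepFirstLines("a\nb", 50) === "a\nb");        // bool(true)
```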

    The . matches any single character, and the * causes it to match zero or more times. This matches the entire remainder of the string.

    The s after the last / is a pattern modifier which makes the . meta-character match newlines.
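The effect of the s modifier is easy to see on a two-line string. This is a minimal sketch with a shortened pattern that keeps only the first line:

```php
<?php
$text = "first\nsecond";

// Without s, . stops at the newline, so only "first" is matched and
// replaced by itself; the second line is left in place.
$withoutS = preg_replace('/^([^\n]*).*/', '$1', $text);

// With s, .* runs through the newline to the end of the string,
// so everything after the first line is dropped.
$withS = preg_replace('/^([^\n]*).*/s', '$1', $text);

var_dump($withoutS === "first\nsecond"); // bool(true)
var_dump($withS === "first");            // bool(true)
```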
  4. #3
    Thank you very much, that works perfectly. And thanks for explaining how it works. It was a lot more complex than I'd imagined.
