#1
  1. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,373
    Rep Power
    631

    Replacing Space with New Line Based on Length


    I don't know if this is possible with just a regexp but I know the right magic can do a lot of things. I can parse the string programmatically but will a regexp do it? I have a table cell that can fit 'n' characters on a line. I am looking for a way to replace a space with a new line character. The space to be replaced needs to be the last one such that the number of characters before it is <=n. The number of characters after that space may also be >n which means multiple replacements. TIA.
    There are 10 kinds of people in the world. Those that understand binary and those that don't.
  2. #2
  3. Maddening Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,459
    Rep Power
    9645
    Yup, this is definitely regex-able, though it isn't the nicest thing.

    Code:
    /(?=[^\r\n]{n+1})
    First you need a bit that searches for a long line, meaning more than n non-newline characters. (Replace "n+1" with the actual value of n+1; there's no literal "+1" in the pattern.)
    Code:
    ([^\r\n]{1,n})
    Then the part that finds the longest (because by default regexes are greedy and will try to match as much as possible) span of characters up to 'n', with capturing.
    Code:
    [ ]/
    Finally the required space, which I put in brackets for readability here on the forum - you don't need them. The engine will match the <=n characters from before then backtrack until it finds a space.

    Note that if you have a long line without spaces, probably because someone's testing your system, then those won't be split. But they could be...
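For anyone who wants to try the three pieces glued together, here is a minimal sketch in Python. The thread doesn't say which language the regex will run in, so Python is an assumption, and the names `wrap_pattern` and `wrap` are made up for illustration.

```python
import re

def wrap_pattern(n: int) -> re.Pattern:
    # (?=[^\r\n]{n+1})  only act where more than n characters remain on the line
    # ([^\r\n]{1,n})    greedily capture up to n characters, backtracking as needed
    # [ ]               the space that gets turned into the line break
    return re.compile(r"(?=[^\r\n]{%d})([^\r\n]{1,%d})[ ]" % (n + 1, n))

def wrap(text: str, n: int) -> str:
    # Keep the captured text and put a newline where the space was.
    return wrap_pattern(n).sub(lambda m: m.group(1) + "\n", text)

wrapped = wrap("the quick brown fox jumps over the lazy dog", 10)
print(wrapped)

# A long run without spaces is left alone, as noted above.
unsplit = wrap("aaaaaaaaaaaaaaa", 10)
```

With n = 10 every resulting line should come out at 10 characters or fewer, and the space-free string should pass through unchanged.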

    Comments on this post

    • gw1500se agrees
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,373
    Rep Power
    631
    Thanks. Your explanation is a real bonus for me. I control the content of the string so there will be no embedded new lines anywhere. Thus, I think you have it a bit backwards, if I understand it (which I'm not sure I do yet). It looks to me like you are replacing new lines with spaces rather than spaces with new lines. Is this the right way around?
    Code:
     /(?=[^ ]{n+1})([^ ]{1,n})[\n]/
    Last edited by gw1500se; December 5th, 2017 at 08:36 AM.
  6. #4
  7. Maddening Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,459
    Rep Power
    9645
    If you know there aren't newlines then we can simplify the regex a bit. The point of that initial assertion was to make sure it only replaces on lines that are long enough, but if your input doesn't have multiple lines then it's not needed.

    However [^ ]{1,n} in the next part won't work: it will match a word and then stop. So "123 567 901" would match "123" and then try to replace after it, when it should continue on through the 567.
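To see the "matches a word and then stops" behaviour, here is a small Python check. It is only a sketch: Python is an assumed host language, n = 7 is an arbitrary choice, and a trailing space is used in place of the `[\n]` from the attempt above so the pattern can actually match the sample string.

```python
import re

n = 7
text = "123 567 901"

# [^ ] cannot match a space, so the capture can never span more than one word.
word_only = re.compile(r"([^ ]{1,%d}) " % n)

captures = [m.group(1) for m in word_only.finditer(text)]
result = word_only.sub(lambda m: m.group(1) + "\n", text)

print(captures)
print(repr(result))
```

Each capture is a single word, so every space gets replaced even though "123 567" would have fit in 7 characters.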
    Code:
    /(.{1,n})[ ]/
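    That will take up to 'n' characters of anything and then attempt to match a space. If it can't then it will backtrack in the first group until it finds one. One caveat: with the assertion gone, a final stretch that already fits within 'n' characters will still be broken at its last space, so keep a (?=.{n+1}) at the front if that matters.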
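A rough Python comparison of the simplified pattern with and without the leading assertion, in case it helps. Python and the names `wrap_simple` and `wrap_guarded` are assumptions for illustration, not something from the thread.

```python
import re

def wrap_simple(text: str, n: int) -> str:
    # (.{1,n})[ ] : up to n characters of anything, then a space.
    pattern = re.compile(r"(.{1,%d}) " % n)
    return pattern.sub(lambda m: m.group(1) + "\n", text)

def wrap_guarded(text: str, n: int) -> str:
    # Same pattern, but only where at least n+1 characters still remain.
    pattern = re.compile(r"(?=.{%d})(.{1,%d}) " % (n + 1, n))
    return pattern.sub(lambda m: m.group(1) + "\n", text)

simple_exact = wrap_simple("123 567 901", 7)
simple_tail = wrap_simple("123 567 901 ab", 7)
guarded_tail = wrap_guarded("123 567 901 ab", 7)

print(repr(simple_exact))
print(repr(simple_tail))
print(repr(guarded_tail))
```

The two versions agree whenever the leftover text has no spaces. They differ only when the remainder is short enough to fit but still contains a space, which the unguarded pattern splits anyway.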
