#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2017
    Posts
    2
    Rep Power
    0

    Delete page numbers


    Hi All

    I have a text document of more than fourteen thousand pages, with page numbers running from 1 to 14562. I want to delete the page numbers. The document is in this sort of format:

    some text
    334
    some text

    The following regex seems to work OK for the above example: ^\d{1,5}\r\n

    However, once the page numbers reach 1,000 they contain a comma, like 1,001 or 10,456. Can anyone help with amending my regex to cover these cases, please?

    Rgds

    Bob
  2. #2
  3. Lazy Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,396
    Rep Power
    9645
    Can you guarantee that the page number line has only digits and commas? Can you guarantee that no other line will look like a page number?

    If so, then don't worry about the length of the number: just look for an entire line (anchored with ^ and $) that contains only digits and commas (using a character class, []).
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2017
    Posts
    2
    Rep Power
    0
    Thanks for that. I used ^[0-9,]+$ which seems to work a treat. I will need to double-check that it did not delete anything other than page numbers, but I don't think it did.
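
For anyone wanting to script the same clean-up and the double check, here is a minimal Python sketch. The thread does not say which tool was used, so Python and the helper names `strip_page_numbers` and `out_of_sequence` are assumptions made for illustration. The pattern is the same digits-and-commas character class, but it consumes the line ending (`\r?\n`) instead of ending on `$`, so no empty line is left behind. In Python's `re` module, `$` in multiline mode matches just before `\n`, so on `\r\n` files `^[0-9,]+$` would not match at all. The second helper assumes the page numbers are consecutive and flags any removed line that breaks the sequence.

```python
import re

# A whole line made up only of digits and commas, plus its line ending.
# Matching \r?\n handles both Windows (\r\n) and Unix (\n) files.
PAGE_NUMBER_LINE = re.compile(r"^[0-9,]+\r?\n", re.MULTILINE)


def strip_page_numbers(text):
    """Return (cleaned_text, removed_lines) so the removals can be reviewed."""
    removed = [m.group().strip() for m in PAGE_NUMBER_LINE.finditer(text)]
    cleaned = PAGE_NUMBER_LINE.sub("", text)
    return cleaned, removed


def out_of_sequence(removed):
    """Return removed lines that do not continue a consecutive page count.

    Assumption: page numbers increase by one each time. Anything flagged
    here may be body text that merely looked like a page number.
    """
    flagged = []
    expected = None  # unknown until the first number has been seen
    for raw in removed:
        digits = raw.replace(",", "")
        value = int(digits) if digits else None
        if value is None or (expected is not None and value != expected):
            flagged.append(raw)
        else:
            expected = value + 1
    return flagged


sample = "some text\r\n334\r\nmore text\r\n1,001\r\nlast line\r\n"
cleaned, removed = strip_page_numbers(sample)
```

A page number on the very last line of the file, with no trailing newline, would be missed by this pattern. Running `out_of_sequence(removed)` on the real document gives a short list to inspect by hand rather than 14,000 lines.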
