#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    39
    Rep Power
    2

    Couple of questions regarding RE


    Hello,
    I have a couple of doubts regarding Regular Expressions.

    1.I am trying to write a perl RE for unix time (number of seconds since Epoch). So it should check if :
    a.The string contains digits ONLY.
    b.It should allow unlimited number of digits

    I have come up with this, but it only checks whether the string contains at least one digit. An input like 9021902910a would pass even though it contains the 'a'.
    Code:
    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    my $string;
    
    while(1)
    {
     $string = <>;
     chomp($string);
     print "The entered string is [",$string,"]\n";
    
     #if ($string =~  m/^(\d+)$/)
     if ($string =~  m/(\d+)/)
     {
       print 'match';
     }
     else
     {
      print 'no match';
     }
    
     print "\n\n";
    }
    2.See the commented line above : #if ($string =~ m/^(\d+)$/)
    I have seen references on the net where they enclose the RE as ^RE$ . Is this necessary ?

    3.Lastly, is there a way I can save the RE pattern in the if-condition as a variable? As in, can
    if ($string =~ m/(\d+)/) be replaced by something like

    $RE = (\d+);
    if ($string =~ m/$RE/)
    {
    print "pass";
    }


    Please advise.
    Thanks

    PS : I understand the first question especially is a very naive one, but the internet only points me to \d+. I'm guessing I have to replace the =~ with something else, but my google searches don't get me results.
    Last edited by IAMTubby; July 1st, 2013 at 05:50 AM.
  2. #2
  3. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,259
    Rep Power
    1810
    There are probably many time modules to do the type of thing you are after, but I'm not really clear of your requirements.

    ---

    Regular expressions:

    1.I am trying to write a perl RE for unix time (number of seconds since Epoch). So it should check if :
    a.The string contains digits ONLY.
    b.It should allow unlimited number of digits
    I have come up with this, but it only checks whether the string contains at least one digit. An input like 9021902910a would pass even though it contains the 'a'.
    Code:
    if ($string =~  m/(\d+)/) {
    The leading m is optional when the delimiter is / (a bare /.../ is a match by default). The parentheses aren't necessary either.

    The parentheses are a capture group. Captures are useful, you just aren't using this one here.

    As written, your regex only asks whether the string contains a run of digits somewhere; it places no restriction on the rest of the string.

    If the string contains one or more digits, it will match. If you want to make sure the string has only digits, then you do need to check for other conditions.

    Code:
    if ($string =~  /^\d+$/) {
    The caret (^) anchors the match at the beginning of the string, and the dollar sign ($) at the end. You chomped the string, so there is no newline character in the way. If there are only digits from beginning to end, one or more, the string matches.
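
    To make the difference concrete, here is a small sketch that runs the unanchored and the anchored pattern over a few inputs. The sub names and sample strings are made up for illustration and are not part of the original script.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helpers, for illustration only.
    sub has_digits  { my ($s) = @_; return $s =~ /\d+/   ? 1 : 0 }  # digits anywhere
    sub only_digits { my ($s) = @_; return $s =~ /^\d+$/ ? 1 : 0 }  # digits from start to end

    for my $s ('1372658400', '9021902910a', 'foo2bar', '') {
        printf "%-14s contains digits: %d, only digits: %d\n",
            "[$s]", has_digits($s), only_digits($s);
    }
    ```

    Only the first sample passes the anchored check; the empty string fails both because + requires at least one digit.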

    Now, about the capturing parentheses. You can extract the matching part of a string with them and still get reasonable input, even if the user flubs it accidentally or deliberately. Extracting data through a capture like this is also how values are untainted under Perl's 'taint' mode.

    Code:
    if ($string =~  m/(\d+)/) {
       print "Found: $1\n";
    } else {
       print "no match\n";
    }
    The part that matched is automatically placed in the variable $1. If you had a second set of parentheses, that result would be in $2, etc.

    Or you can use the list form to return the data to other variables.

    Code:
    my ($match) = $string =~  m/(\d+)/;
    You can use an array to capture the returned data, or a list of scalars as I did here (a list of one item).
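
    As a rough sketch of both forms side by side, assuming a made-up input line with two numbers in it (the variable names are illustrative too):

    ```perl
    use strict;
    use warnings;

    # Sample input is invented; two capture groups fill $1 and $2.
    my $line = 'start=1372658400 end=1372662000';

    if ($line =~ /start=(\d+)\s+end=(\d+)/) {
        print "via \$1/\$2: $1 .. $2\n";
    }

    # In list context the match returns the captures directly.
    my ($start, $end) = $line =~ /start=(\d+)\s+end=(\d+)/;
    print "via list assignment: $start .. $end\n" if defined $start;

    # An array collects every capture; it is empty when the match fails.
    my @parts = 'no numbers here' =~ /(\d+)\D+(\d+)/;
    print "failed match returned ", scalar(@parts), " items\n";
    ```

    The list form has the advantage that the result lands in a named variable right away, instead of in $1, which the next successful match overwrites.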

    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    
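    # Read until end of input; <> returns undef at EOF.
    while (defined(my $string = <>)) {
        chomp($string);
        print "The entered string is [",$string,"]\n";
        my ($userinput) = $string =~  m/(\d+)/;

        # Test definedness, not truth: an input of "0" is a valid match but false.
        if (defined $userinput) {
            print "Found: $userinput\n";
        } else {
            print "no match\n";
        }
    }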
    If the user provides any number in the string, the script will continue. If he provides several numbers separated by the other characters, only the first will be accepted and other input ignored. You get to choose how to deal with errant users.

    Lastly, is there a way I can save the RE pattern in the if-condition as a variable? As in, can
    if ($string =~ m/(\d+)/) be replaced by something like
    Yeah:

    Code:
    my $re = qr/^\d+$/;
    
    if ($string =~ $re) {
        # ...
    }
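
    A compiled pattern can also be reused inside a larger one. The following is only a sketch with invented variable names and sample strings:

    ```perl
    use strict;
    use warnings;

    # qr// returns a compiled pattern that can be stored in a scalar.
    my $digits    = qr/\d+/;
    my $only_time = qr/^$digits$/;   # reuse it inside a bigger pattern

    for my $s ('1372658400', '12:34') {
        print "[$s] ", ($s =~ $only_time ? "pass" : "fail"), "\n";
    }

    # Interpolating into a normal match works too, with captures added around it.
    if ('uptime 4259 seconds' =~ /($digits) seconds/) {
        print "captured $1\n";
    }
    ```

    A plain single-quoted string such as '\d+' interpolated into m/$re/ also works, but qr// is the form intended for storing patterns.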
    Last edited by keath; July 1st, 2013 at 06:33 AM.
  4. #3
    Originally Posted by keath
    but I'm not really clear of your requirements
    keath, firstly thank you so much for your detailed reply. I really appreciate it. Basically, I want to check if the entered string contains ONLY digits

    The leading m isn't necessary (matches by default). The parentheses aren't necessary.
    So I assume that my RE can be written as
    1.if ($string =~ m/(\d)+/) OR
    2.if ($string =~ /\d+/) OR
    3.if ($string =~ /^\d+$/)
    But if ($string =~ /^\d+$/) seems to be exactly what I want. While 1 and 2 above would pass an input string like, say, 1234abc, 3 does not. Just restating that my requirement is that my RE passes an input string only if it contains ONLY digits.
    Can you please tell me why 3 works and 1 and 2 don't?

    If you want to make sure the string has only digits, then you do need to check for other conditions.
    Hmm, isn't there a 1-liner which checks if the string contains only digits ? I think if ($string =~ /^\d+$/) is working.

    Yeah:
    my $re = qr/^\d+$/;

    if ($string =~ $re) {
    # ...
    }
    Thanks a lot keath, tried it out and it works
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    832
    Rep Power
    496
    Yes,

    Code:
    if ($string =~ /^\d+$/) { # ...
    will check if the string contains only digits (at least one).
  8. #5
    Originally Posted by Laurent_R
    Yes,

    Code:
    if ($string =~ /^\d+$/) { # ...
    will check if the string contains only digits (at least one).
    Laurent_R, could you tell me how these 3 are different? Why is it that only 3 gives me what I want? They all look so similar.
    1.if ($string =~ m/(\d)+/)
    2.if ($string =~ /\d+/)
    3.if ($string =~ /^\d+$/)
  10. #6
    Perl Code:
    if ($string =~ m/(\d)+/)


    This checks whether there are some digits in the string (at least one), but the string can also contain characters other than digits. It will match, for example, "foo2bar", because there is at least one digit in that string. The parens around the \d are useless in this context. Because the + sits outside the group, they capture only a single digit (the last one of the run), which does not make much sense, as I understand you are not trying to capture the numbers. If you wanted to capture the whole group of digits (which is much more common), the parens would need to go around both the \d and the +: /(\d+)/


    So,
    Perl Code:
    if ($string =~ m/(\d)+/)

    could be rewritten:
    Perl Code:
    if ($string =~ m/\d+/)
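
    The capture difference between (\d)+ and (\d+) mentioned above is easy to see on a small example. This is only a sketch; the sample string and variable names are made up:

    ```perl
    use strict;
    use warnings;

    my $s = 'foo2013bar';   # sample input, invented for illustration

    my ($last_digit) = $s =~ /(\d)+/;   # group repeats; it keeps its final iteration
    my ($all_digits) = $s =~ /(\d+)/;   # group wraps the whole run of digits

    print "(\\d)+ captured: $last_digit\n";
    print "(\\d+) captured: $all_digits\n";
    ```

    Both patterns match the same run of digits, 2013. The first capture ends up holding only the 3, the second holds all of 2013.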


    To tell the truth, the m at the beginning is also useless (but some people still prefer to have it as they feel it shows more explicitly what is being done). It is only really useful if you want to use a character other than / as a delimiter for the regex:

    Perl Code:
    if ($string =~ m{\d+})

    or
    Perl Code:
    if ($string =~ m[\d+])

    or
    Perl Code:
    if ($string =~ m!\d+!)


    The three pattern matches above are valid only if you keep the m at the start to tell the compiler that a regex pattern follows, and they all match a string that has at least one digit in it. If you use the / delimiter, as in /\d+/, you don't need the initial m.
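
    Where another delimiter pays off is a pattern that itself contains slashes. A minimal sketch, assuming a made-up path string:

    ```perl
    use strict;
    use warnings;

    my $path = '/var/log/2013/07';   # invented sample

    # With / as the delimiter, every slash in the pattern needs escaping.
    my $slash_form = $path =~ /^\/var\/log\/\d+\// ? 1 : 0;

    # With braces the same pattern needs no escapes, but the m is required.
    my $brace_form = $path =~ m{^/var/log/\d+/} ? 1 : 0;

    print "slash form matched: $slash_form\n";
    print "brace form matched: $brace_form\n";
    ```

    Both forms are the same pattern; the brace form is just easier to read.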

    Perl Code:
    if ($string =~ /\d+/)

    If you have followed what I said above, this is just equivalent to the very first one above in this post (so long as you don't want capture), i.e. this checks whether there are some digits in the string (at least one). It is just a shorter way of saying the same thing.

    Perl Code:
    if ($string =~ /^\d+$/)


    This is quite different. It says literally: start of string (^), followed by one or more digits, followed by end of string ($). In other words, it matches a string containing only digits (at least one) and nothing else (well, one small exception: a newline character at the end of the string will not prevent it from matching the pattern).
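
    The newline exception can be checked directly. In this sketch, which assumes an un-chomped sample line, \z is the anchor for the very end of the string, so it rejects the trailing newline that $ lets through:

    ```perl
    use strict;
    use warnings;

    my $raw = "1372658400\n";   # as read from <> before chomp; sample value

    my $dollar_ok = $raw =~ /^\d+$/  ? 1 : 0;   # $ also matches before a final newline
    my $z_ok      = $raw =~ /^\d+\z/ ? 1 : 0;   # \z matches only at the true end

    print "with \$:  ", ($dollar_ok ? "match" : "no match"), "\n";
    print "with \\z: ", ($z_ok      ? "match" : "no match"), "\n";
    ```

    With a chomped string, as in the original script, the two anchors behave the same.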

    I hope that this is clear.
  12. #7
    Originally Posted by Laurent_R
    I hope that this is clear.
    Yes Laurent_R, it's crystal clear . Thanks a lot.
