#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    15
    Rep Power
    0

    Combining two regular expressions into one


    Hi,
    I would like to combine the following two expressions into one:

    //1. expression
    $regex = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/'; //verify email

    //2. expression
    preg_match_all('~<(td)>(?<content>\w+)</\1>~Uis', $gesamteDatei, $result, PREG_SET_ORDER); //get values of the data set, which are
    // the 2nd expression doesn't match any email address

    I tried it, but it failed and I don't know why. Maybe you can help me?


    preg_match_all('~<(td)>(?<content>/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/)</\1>~Uis', $gesamteDatei, $result, PREG_SET_ORDER);

    Thanks for your help (:
  2. #2
  3. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    15
    Rep Power
    0
    Thanks for your help so far,

    I have a file in which each data set looks like this:

    ...<tr><td>value 1</td><td>value 1</td><td>value 2</td><td>value 3</td><td>value 4</td><td>value 5</td><td>value 6</td><td>email</td><td>value 8</td></tr> ... //1 dataset

    The problem is that the email is usually at a different position, for example:

    ...<tr><td>value 1</td><td>value 1</td><td>value 2</td><td>value 3</td><td>value 4</td><td>email</td><td>value 6</td><td>value 7</td><td>value 8</td></tr> ... //1 dataset


    or:


    ...<tr><td>value 1</td><td>value 1</td><td>value 2</td><td>value 3</td><td>value 4</td><td>value 5</td><td>value 6</td><td>value 7</td><td>email</td></tr> ... //1 dataset

    and I need some of the values to the left or right of that field.

    I tried to solve it like this:

    preg_match_all('/<td>(.*?)<\/td>/',$gesamteDatei,$InhaltTdElemente); //find values

    $gesamteDatei = the whole stream (1 table and many data sets)
    $InhaltTdElemente = all values between <td> and </td>


    I also need to search for other things (ZIP, country, ...):

    preg_match_all('~<(td)>(?<content>\w+)</\1>~Uis', $gesamteDatei, $result, PREG_SET_ORDER);

    // that doesn't give me the email address

    code:


    <?php
    print "<html>";
    print "<head>";
    print "</head>";
    print "<body>";
    print "<nobr>";

    $datei = 'new_folders42000.html'; // input file
    $pfad = '/xampp/htdocs/xampp/new/'; // path to the input file (only used by the commented-out fopen further down)
    if ($gesamteDatei = file_get_contents($datei)) { // check that the file can be read
        printf(" <div style='background:lime'>File can be read!</div><br>\n"); // reading ok

        preg_match_all('/<td>(.*?)<\/td>/', $gesamteDatei, $InhaltTdElemente); // content of every <td> cell
        preg_match_all('~<(td)>(?<content>\w+)</\1>~Uis', $gesamteDatei, $result, PREG_SET_ORDER); // cells whose content matches \w+

        /* check every cell for an email address */
        $regex = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';
        $lange = count($InhaltTdElemente[1]);
        for ($i = 0; $i < $lange; $i++) {
            $email = $InhaltTdElemente[1][$i]; // all fields
            if (preg_match($regex, $email)) {
                echo $email.'<br/>';
            } else {
                //echo $email ." is an invalid email. Please try again.";
            }
        }

        echo '<br/>';
        echo '<br/>';



        $country_code = "DE"; // country whose pattern is used (Germany)

        // zip code patterns per country
        $ZIPREG = array(
            "US"=>"^\d{5}([\-]?\d{4})?$",
            "UK"=>"^(GIR|[A-Z]\d[A-Z\d]??|[A-Z]{2}\d[A-Z\d]??)[ ]??(\d[A-Z]{2})$",
            "DE"=>"\b(?!01000)(?!99999)(0[1-9]\d{3}|[1-9]\d{4})\b",
            "CA"=>"^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ])\ {0,1}(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$",
            "FR"=>"^(F-)?((2[A|B])|[0-9]{2})[0-9]{3}$",
            "IT"=>"^(V-|I-)?[0-9]{5}$",
            "AU"=>"^(0[289][0-9]{2})|([1345689][0-9]{3})|(2[0-8][0-9]{2})|(290[0-9])|(291[0-4])|(7[0-4][0-9]{2})|(7[8-9][0-9]{2})$",
            "NL"=>"^[1-9][0-9]{3}\s?([a-zA-Z]{2})?$",
            "ES"=>"^([1-9]{2}|[0-9][1-9]|[1-9][0-9])[0-9]{3}$",
            "DK"=>"^([D-d][K-k])?( |-)?[1-9]{1}[0-9]{3}$",
            "SE"=>"^(s-|S-){0,1}[0-9]{3}\s?[0-9]{2}$",
            "BE"=>"^[1-9]{1}[0-9]{3}$"
        );






        for ($i = 0, $max = count($result); $i < $max; $i++) {
            $zip_postal = $result[$i]['content'];

            if (isset($ZIPREG[$country_code])) { // at the moment Germany
                if (!preg_match("/".$ZIPREG[$country_code]."/i", $zip_postal)) { // check German zip code
                    // Validation failed, provided zip/postal code is not valid.
                } else {
                    // Validation passed, provided zip/postal code is valid.
                    if ($zip_postal >= 50000 && $zip_postal <= 60000) { // zip code range
                        //$datei=fopen($pfad.'56000.html','w+');
                        // try to read the neighbouring fields: the match itself,
                        // then up to 9 fields before it and up to 4 fields after it
                        foreach (array_merge(range(0, -9), range(1, 4)) as $offset) {
                            if (isset($result[$i + $offset])) { // skip offsets outside the result array
                                echo $result[$i + $offset]['content'].'<br/>';
                            }
                        }
                        echo '<br>';
                        //fclose($datei);
                    }
                }
            } else {
                // Validation not available
            }
        }

        // if ( preg_match('/^\d{5}$/', $input) && (int) $input > 1000 && (int) $input < 99999 ) {}
        //fclose($datei);
    } else {
        printf(" <div style='background:red'>File cannot be read!</div><br>\n");
    }


    print "</nobr>";
    print "</body>";
    print "</html>";

    ?>
  4. #3
  5. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,911
    Rep Power
    1045
    Hi,

    my eyes are bleeding -- and my heart as well.

    First of all: Why is your data stored in such a terrible format, with the values randomly sprinkled over an HTML(!) table? That's pretty much the worst-case scenario. Even a f*cking Excel table has more structure.

    Secondly, you need to stop trying to solve everything with a regex. I know it's a common belief that regexes are some kind of all-powerful tool for processing anything text-related. That's a lie. Regular expressions are the most primitive parsers available. They're fine for very simple patterns like a date or a telephone number. But they're already stretched to the limit when parsing email addresses. And they're totally and absolutely unsuitable for parsing actual languages like HTML, XML etc. Do not try to parse HTML with a regex.

    You need to choose the right tool for the concrete problem. Wanna parse HTML? Well, how about an HTML parser? However, before you do start digging into your HTML tables, you should answer the first question: Who did this to you and why?
    Last edited by Jacques1; May 15th, 2013 at 03:45 PM.
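
To make the "use an HTML parser" suggestion concrete, here is a minimal sketch using PHP's bundled DOM extension. It is not code from the thread: the sample HTML, the function names `extractRows` and `findEmailIndex`, and the choice of `filter_var` to spot the email cell are all assumptions made for illustration. The idea is to read each `<tr>` as a list of cells, find the email cell wherever it sits, and address its neighbours relative to it.

```php
<?php
// Hypothetical sample input, modelled on the rows shown earlier in the thread.
$html = '<table>'
      . '<tr><td>value 1</td><td>54290</td><td>someone@example.com</td><td>value 4</td></tr>'
      . '<tr><td>other@example.org</td><td>value 2</td><td>value 3</td><td>value 4</td></tr>'
      . '</table>';

/**
 * Returns one array of trimmed cell texts per <tr>, using DOMDocument
 * instead of a regex. (Hypothetical helper, not from the thread.)
 */
function extractRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

/**
 * Returns the index of the first cell that validates as an email address,
 * or null if the row has none. (Hypothetical helper, not from the thread.)
 */
function findEmailIndex(array $cells): ?int
{
    foreach ($cells as $index => $cell) {
        if (filter_var($cell, FILTER_VALIDATE_EMAIL) !== false) {
            return $index;
        }
    }
    return null;
}

foreach (extractRows($html) as $cells) {
    $emailIndex = findEmailIndex($cells);
    if ($emailIndex === null) {
        continue; // no email in this row
    }
    // Neighbours are addressed relative to the email cell; ?? guards the row edges.
    $left  = $cells[$emailIndex - 1] ?? '(none)';
    $right = $cells[$emailIndex + 1] ?? '(none)';
    echo "email: {$cells[$emailIndex]}, left: $left, right: $right\n";
}
```

Because each row is handled separately, the offsets never reach into a neighbouring data set, which the flat `$result[$i-1]` indexing in the posted code cannot guarantee.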
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2013
    Posts
    5
    Rep Power
    0
    ^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$

    I tested your regex and it returns true. I checked it at php.toolregex.com
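
A pattern that passes in an online tester can still reject addresses that are valid. The sketch below compares the thread's regex with PHP's built-in `filter_var(..., FILTER_VALIDATE_EMAIL)`. The test addresses and the two wrapper functions are hypothetical and chosen only to show where the checks differ: the regex allows at most three letters in the top-level domain and, without the `i` modifier, no upper-case letters.

```php
<?php
// Hypothetical wrapper around the email regex from the first post.
function matchesThreadRegex(string $candidate): bool
{
    $regex = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';
    return preg_match($regex, $candidate) === 1;
}

// Hypothetical wrapper around PHP's built-in email validation filter.
function matchesFilter(string $candidate): bool
{
    return filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false;
}

// Hypothetical test addresses.
$candidates = [
    'someone@example.com',
    'someone@example.info',  // top-level domain longer than three letters
    'Someone@Example.com',   // upper-case letters
    'not an email',
];

foreach ($candidates as $candidate) {
    printf(
        "%-22s regex: %-3s filter_var: %s\n",
        $candidate,
        matchesThreadRegex($candidate) ? 'yes' : 'no',
        matchesFilter($candidate) ? 'yes' : 'no'
    );
}
```

The two checks agree on the first and last address. For the two in the middle the regex says no while `filter_var` says yes.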

    Originally Posted by computeruser13
    Hi,
    I would like to have one expression:

    //1. expression
    $regex = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/'; //verify email

    //2. expression
    preg_match_all('~<(td)>(?<content>\w+)</\1>~Uis', $gesamteDatei, $result, PREG_SET_ORDER); //get values of the data set, which are
    // the 2. expression doesnt match any email adress

    I tried it, but I failed and I dont know why... may be you can help me ?


    preg_match_all('~<(td)>(?<content>/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/)</\1>~Uis', $gesamteDatei, $result, PREG_SET_ORDER);

    Thanks for your help (:
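
The combined attempt quoted above fails for two reasons. Inside the `~...~` pattern, the inner `/` characters are matched as literal slashes, not treated as delimiters. And without the `m` modifier, `^` and `$` only match at the start and end of the whole input, never in the middle of a table. A minimal sketch of a working combination follows, assuming the email cell contains nothing but the address. The sample input is hypothetical, and the `U` and `s` modifiers are dropped because this pattern does not need them.

```php
<?php
// Hypothetical sample input; in the thread the data comes from $gesamteDatei.
$gesamteDatei = '<tr><td>value 1</td><td>54290</td>'
              . '<td>someone@example.com</td><td>value 4</td></tr>';

// The email pattern from the first post, without its own delimiters and
// without the ^ and $ anchors. The groups are made non-capturing.
$emailPattern = '[_a-z0-9-]+(?:\.[_a-z0-9-]+)*@[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,3}';

// The literal <td> and </td> take over the role of the anchors.
$combined = '~<td>(?<content>' . $emailPattern . ')</td>~i';

preg_match_all($combined, $gesamteDatei, $result, PREG_SET_ORDER);

foreach ($result as $match) {
    echo $match['content'], "\n";
}
```

This only removes the syntax problem. It is still a regex run over HTML, so the objections raised in post #3 apply.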
