#1

    How to Weed Out Empty Values From an Array?


    PHP folks,

    How do I weed out empty array values?

    PHP Code:
    print_r(array_filter($keywords_array, 'strlen'));
    The above example, taken from the following link, did not work for me:
    https://stackoverflow.com/questions/...array-elements

    Here is my code so far. I am building a web crawler: it crawls a page, notes the keywords and links, and counts them. It is not finished yet.
    If you look at the attached image, you will notice blank values in the "keywords" column. That is because some of the array values are empty.
    So I need to weed out the empty values from the array before inserting the values into the MySQL table.
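
One possible reason the `array_filter()` call above seems not to work: `array_filter()` returns a new array instead of modifying its argument, and `strlen` only drops zero-length strings, so tokens that consist of whitespace (such as "\n") survive. A minimal sketch, assuming the words come from a plain string; the sample text and variable names are invented for illustration:

```php
<?php
// Hypothetical sample text standing in for the fetched page content.
$text = "php  web \n crawler ";

// Splitting on a single space leaves empty strings and stray newlines behind.
$raw = explode(" ", $text);

// Trim every token, keep only the non-empty ones, and ASSIGN the result:
// array_filter() does not change its input array in place.
$trimmed  = array_map('trim', $raw);
$keywords = array_values(array_filter($trimmed, 'strlen'));

// Alternative: split on runs of whitespace and skip empty pieces in one call.
$keywords_alt = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($keywords);
print_r($keywords_alt);
```

`array_values()` is only there to renumber the keys from 0 after filtering; it can be dropped if the original keys do not matter.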

    PHP Code:
    <?php

    //Required PHP Files.
    include 'config.php';
    include 'header.php';

    //1). Set Banned Words.
    $banned_words = array("asshole", "nut", "bull****");

    $url = 'https://www.york.ac.uk/teaching/cws/wws/webpage1.html';
    // 2). $curl is going to be data type curl resource.
    $curl = curl_init();

    // 3). Set cURL options.
    curl_setopt($curl, CURLOPT_URL, "$url");
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

    // 4). Run cURL (execute http request).
    $result = curl_exec($curl);

    if (curl_errno($curl))
    {
        echo 'Error:' . curl_error($curl);
    }

    $response = curl_getinfo( $curl );

    //If page is fetched then replace banned words found on page.
    if($response['http_code'] == '200' )
    {
        $regex = '/\b';
        $regex .= implode('\b|\b', $banned_words);
        $regex .= '\b/i';
        $substitute = 'BANNED WORD REPLACED';
        $clean_result = preg_replace($regex, $substitute, $result);
        //Present the banned words filtered webpage.
        echo $clean_result;
    }
    else
    {
        //Show error if page fetching fails.
        echo "Page fetching problem!";
        echo "$response[http_code]";
        exit();
    }

    curl_close($curl);

    //Define Variables
    $keywords_number = "0";
    $keywords_count = "0";
    $links_count = "0";
    $keywords_links_count = "0";
    $images_count = "0";
    $keywords_images_count = "0";
    $keywords_internal_links_count = "0";
    $keywords_external_links_count = "0";

    //Link Extractor starts here. It will extract all links present on the page.
    function linkExtractor($clean_result)
    {
        $linkArray = array();
        if(preg_match_all('/<a\s+.*?href=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(.*?)<\/a>/i', $clean_result, $link_matches, PREG_SET_ORDER))
        {
            foreach ($link_matches as $link_match)
            {
                GLOBAL $url,$links_count,$keywords_links_count,$images_count,$keywords_images_count,$keywords_internal_links_count,$keywords_external_links_count;

                echo "url: $url<br>";
                echo "link_match: $link_match[links_count]<br>";
                $links_count++;
                echo "links_count: $links_count++<br>";
                $keywords_links_count++;
                echo "keywords_links_count: $keywords_links_count++<br>";
                $images_count++;
                echo "images_count: $images_count++<br>";
                $keywords_images_count++;
                echo "keywords_images_count: $keywords_images_count++<br>";
                $keywords_internal_links_count++;
                echo "keywords_internal_links_count: $keywords_internal_links_count++<br>";
                $keywords_external_links_count++;
                echo "keywords_external_links_count: $keywords_external_links_count++<br>";
            }
        }
        return $linkArray;
    }
    echo '<pre>' . print_r(linkExtractor($clean_result), true) . '<pre>';


    //Content Filter starts here to check for banned words present on the page.
    $keywords_array = explode(" ", $clean_result);

    $keywords_count = "0";
    foreach($keywords_array as $keyword)
    {
        echo $keyword."\n";
        echo "keyword: $keyword<br>";
        $keywords_count++;
        echo "Keywords_count: $keywords_count++<br>";

        print_r(array_filter($keywords_array, 'strlen'));
    }



    foreach($keywords_array as $keyword)
    {
        $keywords_number++;

        //Insert the user's inputs into Mysql database using php's sql injection prevention method "Prepared Statements".
        $stmt = mysqli_prepare($conn, "INSERT INTO searchengine_index(url,keywords,keywords_number,keywords_count,links,links_count,keywords_links_count,images_count,keywords_images_count,keywords_internal_links_count,keywords_external_links_count) VALUES (?,?,?,?,?,?,?,?,?,?,?)");

        GLOBAL $url,$keywords_number,$links_count,$keywords_links_count,$images_count,$keywords_images_count,$keywords_internal_links_count,$keywords_external_links_count;

        mysqli_stmt_bind_param($stmt, 'ssisiiiiiii', $url,$keyword,$keywords_number,$keywords_count,$link_match[$keywords_links_count],$links_count,$keywords_links_count,$images_count,$keywords_images_count,$keywords_internal_links_count,$keywords_external_links_count);
        mysqli_stmt_execute($stmt);

        //Check if data was successfully submitted or not.
        if(!$stmt)
        {
            echo "Sorry! Our system is currently experiencing a problem indexing your website. We will try some other time!";
            exit();
        }
    }

    ?>
    And I get this error:

    Notice: Undefined index: links_count in C:\xampp\htdocs\test\crawler.php on line 71

    How do I get rid of this error? I want to echo each array value in the foreach loop.
    Line 71:
    PHP Code:
    echo "link_match: $link_match[links_count]<br>";
    Also, I don't know why the "url_indexing_date" column is showing zero values. I have another table that shows dates in the same kind of column.

    I will also need to find a regex to weed out the HTML tags, so that only the keywords from the page content the visitor actually sees end up in the "keywords" column of the table.
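
For the tag removal, a regex may not be necessary: PHP's built-in `strip_tags()` removes HTML tags from a string. A rough sketch, assuming the page markup is reasonably well formed; the sample HTML and variable names are invented for illustration. Note that `strip_tags()` keeps the text inside `<script>` and `<style>` elements, so those would need separate handling on a real page.

```php
<?php
// Hypothetical HTML standing in for $clean_result from the crawler.
$html = '<html><body><h1>My  Page</h1><p>Hello <a href="/about">world</a></p></body></html>';

// Put a space in front of every tag first, so words from adjacent
// elements are not glued together once the tags are gone.
$spaced = str_replace('<', ' <', $html);

// Remove the tags, then decode entities such as &amp;.
$visible_text = html_entity_decode(strip_tags($spaced));

// Split on runs of whitespace and drop empty pieces in one step.
$keywords = preg_split('/\s+/', $visible_text, -1, PREG_SPLIT_NO_EMPTY);

print_r($keywords);
```

Because `PREG_SPLIT_NO_EMPTY` already discards empty pieces, no extra `array_filter()` pass is needed before the insert loop.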
    Last edited by UniqueIdeaMan; January 24th, 2018 at 12:40 PM.
#2
    Originally Posted by UniqueIdeaMan
    How do I get rid of this error? I want to echo each array value in the foreach loop.
    Make sure that the values for $link_match contain a key for 'links_count'. You don't have that there to start with, so you get that notice.

    Again, read up on basic debugging. It's not that hard when you actually take a few seconds to understand the message that you're getting. In this case it's totally obvious.
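
To illustrate the point about the missing key: with `PREG_SET_ORDER`, each `$link_match` is a numerically indexed array (0 is the whole match, 1 and 2 are the capture groups), so there is no 'links_count' key unless the code adds one. A small sketch with made-up HTML, reusing the pattern from the first post; the `$links` array and its 'href'/'text' keys are hypothetical names:

```php
<?php
// Hypothetical HTML snippet for illustration.
$html = '<a href="https://example.com/a">First</a> <a href="/b">Second</a>';

$pattern = '/<a\s+.*?href=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(.*?)<\/a>/i';
$links = array();

if (preg_match_all($pattern, $html, $link_matches, PREG_SET_ORDER)) {
    foreach ($link_matches as $link_match) {
        // Keys are 0 (full match), 1 (href value) and 2 (link text).
        // isset() guards against reading a key that does not exist.
        if (isset($link_match[1], $link_match[2])) {
            $links[] = array('href' => $link_match[1], 'text' => $link_match[2]);
        }
    }
}

print_r($links);
echo 'links_count: ' . count($links) . "\n";
```

Running `print_r($link_match)` inside the loop is a quick way to see which keys are really available before echoing any of them.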

    Comments on this post

    • UniqueIdeaMan agrees
#3
    That's not an error, it's a notice. There's a difference. The message tells you exactly what you need to know.

    As for why your datetime column is zero, I imagine you didn't do something mentioned here: https://dev.mysql.com/doc/refman/5.7...alization.html

    Comments on this post

    • UniqueIdeaMan agrees
#4
    Thanks, guys (Catacaustic & Sepodati). I gave you 1 rep each. Not much.
    I do not really understand this forum's rep system. Usually, when I give rep I can only give 0. A few days ago 8 points were available; I gave someone 1, and when I tried to give to someone else I could only give 0 again. I should still have had 7 left.
    Anyway, this time I see I can give 8 again. I gave Catacaustic 1 and expected to have 7 left, but I still had 8 left when I gave Sepodati 1. Puzzling.

    Comments on this post

    • Sepodati disagrees : No one cares about Rep. It's a useless holdover feature. Here's -4554 points for ya.
    Last edited by UniqueIdeaMan; January 26th, 2018 at 04:48 AM.
