Thread: Strstr question

    #1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2012
    Posts
    40
    Rep Power
    3


    I used file_get_contents on a link, and I'm trying to extract only a table from that page. If I just echo the $string that file_get_contents returns, it shows the website. When I passed the $string through htmlentities instead, it showed the website as HTML source.

    example of one of my results
    Code:
         <td align="center" >10 
         </td> 
    </tr> 
    <tr > 
         <td align="center" >2</td> 
         <td align="center" >2</td> 
         <td > Roberts, Gregory </td> 
         <td align="center" >&nbsp;</td>
         <td > Army WCAP </td> 
         <td align="center" >2.10m </td> 
         <td align="center" > .....etc
    I wanted to parse the $string with strstr and then after that with substr. If I parsed the $string using "center"
    PHP Code:
      $newstring = strstr ($html, "center"); 
    it would show center and everything after it, including all the html. If I tried to parse it with
    PHP Code:
      $newstring = strstr ($html, " center">10</td></tr"); 
    the result is empty. How can I parse with the special characters as the delimiter?

    Here is the set up of my php
    PHP Code:
    $url =$link';
    $str = file_get_contents($url);
    $html = htmlentities($str);
    $newstring = strstr ($html, "center");
    echo $newstring; 
  2. #2
  3. Sarcky
    Devshed Supreme Being (6500+ posts)

    Join Date
    Oct 2006
    Location
    Pennsylvania, USA
    Posts
    10,908
    Rep Power
    6351
    Of your three code samples, only 1 is valid PHP. It's very difficult to help you if you don't include your real code.

    What are you actually looking to get out of this string? Tell us the problem, don't vaguely describe what you've decided is the solution.

    The new user guide may be of some help.
    HEY! YOU! Read the New User Guide and Forum Rules

    "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin

    "The greatest tragedy of this changing society is that people who never knew what it was like before will simply assume that this is the way things are supposed to be." -2600 Magazine, Fall 2002

    Think we're being rude? Maybe you asked a bad question or you're a Help Vampire. Trying to argue intelligently? Please read this.
  4. #3
    Well I acknowledged that
    PHP Code:
      $newstring = strstr ($html, " center">10</td></tr"); 
    was incorrect by saying the result is empty. So that only leaves two things to be incorrect. This:
    PHP Code:
    $url = $link;
    $str = file_get_contents($url);
    $html = htmlentities($str);
    $newstring = strstr ($html, "center");
    echo $newstring;
    is correct because I know it works and the third sample is just a copy of this.

    This is the only PHP that I need to include and it is the code I'm using, not just a snippet or example. The $link is in place of the actual url, but that's not the problem either.

    Let me repeat myself but go a bit further. If I echo $newstring I get this:
    Code:
    Score</td> </tr> <tr > <td align="center" >1</td> <td align="center" >1</td> <td > Hoffman, Dave </td> <td align="center" >&nbsp;</td> <td > unat </td> <td align="center" >2.20m </td> <td align="center" > 10 </td> </tr> <tr > <td align="center" >2</td> <td align="center" >2</td> <td > Roberts, Gregory </td> <td align="center" >&nbsp;</td> <td > Army WCAP </td> <td align="center" >2.10m </td> <td align="center" > - </td> </tr> <tr > <td align="center" >3</td> <td align="center" >3</td> <td >
    or a long string. If I then try to parse it with strstr ($html, "Hoffman"); then Hoffman and everything after it is echoed. My question is this: I want to parse by specific html code. If I tried to parse the same code with strstr ($html, "Hoffman, Dave </td>"); there would be a blank result. How can I parse while using those types of characters? I guess I just repeated myself but I don't know how else to ask it. It's not just an example, I tried to do these things. I want to retrieve an entire table from many webpages. I just want to parse out the code for the table. Is there a better way to go about this? I was seeing something about spiders but I couldn't get that to work either. I tried downloading Inspyder, but it wasn't what I was looking for. Thanks for being patient, I'm trying to be as clear as possible.
  6. #4
  7. Sarcky
    The result wasn't empty, that is invalid PHP code which will never run. I'm not trying to say that it's simply buggy, it will not run at all, ever. Pasting that line into a completely working document will break the entire program. Look at it, it's colored wrong.

    Now that you've removed an errant single quote from your other example, it has become valid code. We can only assume you copy and paste, and random misplaced quotes in the middle of your code are, in fact, in your code.

    Are you viewing the source to view the actual makeup of this HTML? Whitespace and newlines matter. Your first post seems to imply that you were ignoring whitespace. You can't do that.
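
    To illustrate the whitespace point, here is a minimal sketch. The sample markup in $raw is hypothetical, modeled on the output quoted earlier in the thread, and the variable names are made up for the example. It also shows a second thing that can produce a blank result: htmlentities() rewrites <, > and " as entities, so a needle containing real tags will not be found in the escaped string. When strstr() finds nothing it returns false, and echoing false prints an empty string.

```php
<?php
// Hypothetical sample markup, shaped like the output quoted earlier in the thread.
$raw = '<td > Hoffman, Dave </td> <td align="center" >2.20m </td>';

// strstr() matches byte for byte, so every space in the needle has to be right.
$noSpace = strstr($raw, 'Hoffman, Dave</td>');    // missing space before </td>
$exact   = strstr($raw, 'Hoffman, Dave </td>');   // same spacing as the source

// htmlentities() turns < > and " into entities, so a needle with real tags
// no longer appears in the escaped string.
$escaped      = htmlentities($raw);
$tagNeedle    = strstr($escaped, 'Hoffman, Dave </td>');
$entityNeedle = strstr($escaped, 'Hoffman, Dave &lt;/td&gt;');

// Each expression reports whether the search behaved as described above.
var_dump($noSpace === false, $exact !== false, $tagNeedle === false, $entityNeedle !== false);
```

    One way to avoid the second problem is to search the raw string from file_get_contents() and call htmlentities() only on whatever is finally echoed.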

    If you're looking for a table, use strpos to find the first occurrence of "<table". Then use strpos again to find the first occurrence of "</table". That gives you the beginning and ending of the table.
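
    A rough sketch of that strpos approach might look like the following. The page source in $str is a made-up stand-in for what file_get_contents($url) would return, and $start, $end and $table are names chosen for the example. It assumes the page has no nested tables, since the first "</table" found would otherwise close the wrong one.

```php
<?php
// Hypothetical page source; in the thread this would come from file_get_contents($url).
$str = '<html><body><p>Results</p>'
     . '<table border="1"><tr><td align="center" >1</td><td > Hoffman, Dave </td></tr></table>'
     . '<p>Footer</p></body></html>';

$table = null;
$start = strpos($str, '<table');                // offset of the opening tag
if ($start !== false) {
    $end = strpos($str, '</table', $start);     // first closing tag after it
    if ($end !== false) {
        $end = strpos($str, '>', $end);         // extend to the end of "</table>"
        if ($end !== false) {
            $table = substr($str, $start, $end - $start + 1);
        }
    }
}

// Search the raw markup; escape only when displaying the result.
echo htmlentities((string) $table);
```

    Note that strpos is case-sensitive, so a page that writes <TABLE> would need stripos instead.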

    The DOMDocument library is probably better for you.
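
    As a minimal sketch of the DOMDocument route, assuming the first table on the page is the one wanted: the markup in $str is hypothetical, and $rows is just a name picked for the example.

```php
<?php
// Hypothetical page source standing in for file_get_contents($url).
$str = '<html><body><table><tr><td align="center" >1</td><td > Hoffman, Dave </td>'
     . '<td align="center" >2.20m </td></tr></table></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real pages are rarely valid HTML; keep warnings quiet
$doc->loadHTML($str);
libxml_clear_errors();

$rows = [];
$table = $doc->getElementsByTagName('table')->item(0);   // first table in the document
if ($table !== null) {
    foreach ($table->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);            // cell text without the tags
        }
        $rows[] = $cells;
    }
}

print_r($rows);
```

    If the table's markup is wanted rather than its cell text, $doc->saveHTML($table) returns the HTML for just that node.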
