Thread: File Parsing...

  1. #1
    fantom
    Guest
    Devshed Newbie (0 - 499 posts)
    Hey people, I have a problem... I have this huge 9MB file (a database of my CGI search spider with tons of http:// links and all kinds of other information) that I want to import into my PHP search.
    Problem is, the database file contains not only links but also descriptions and some weird characters that I don't need. So I'm looking to separate the http:// links from the rest of the information, and I need a script that will do that.
    To make it easier, it's a database of pictures and all links start with "http://" and end with ".jpg", so what script would read this file and print out all the links to a separate file?

    P.S.: These links are not one per line; they are all over the place, sometimes even 3-4 per line.
  2. #2
    Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2000
    Posts
    6
    Rep Power
    0
    This will do the trick, and very fast too:
    [code]
    links.dat content:
    http://www.bla.com/test.jpg http://www.blie.nm/teste.gif http://www.erm.cs/testie.kwam http://www.YO.org/bore.jpg http://www.hellowworld.za/phprulez.jpg

    viewlinks.php content:
    <?php
    $fp = fopen ("links.dat","r");
    while (list ($line) = fscanf ($fp, "%s\n")) // %s captures one whitespace-delimited token
    {
    if(($beg=strpos($line,"http:"))===false||($end=strpos($line,".jpg"))===false); // not a .jpg link: skip
    else print substr($line,$beg,($end-$beg)+4)."<BR>"; // +4 keeps the ".jpg" itself
    }
    fclose($fp);
    ?>
    [/code]

    You can do it with a regular expression, but that's slow in my opinion.
    You should be able to solve such easy problems yourself, btw.
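
For comparison, the regular-expression route mentioned above could look like the minimal sketch below. The function name extract_jpg_links and the sample string are made up for illustration, and the pattern assumes a link never contains whitespace. Whether this is actually slower than strpos() on a 9MB file would have to be measured.

```php
<?php
// Hypothetical helper: returns every "http://" ... ".jpg" link found in one line.
function extract_jpg_links(string $line): array
{
    // \S+? matches non-whitespace characters lazily, so each match
    // stops at the first ".jpg" that follows its "http://".
    preg_match_all('~http://\S+?\.jpg~', $line, $matches);
    return $matches[0];
}

// Illustrative input only; the real data would come from the database file.
$sample = 'junk http://www.bla.com/test.jpg x http://www.blie.nm/teste.gif http://www.YO.org/bore.jpg';
foreach (extract_jpg_links($sample) as $link) {
    print $link . "\n";
}
```

Unlike the fscanf() loop, this picks up several links on the same line, but two links glued together with no whitespace between them would still come out as one match.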

    ------------------
    Greetings lewi
  3. #3
    fantom
    Guest
    Devshed Newbie (0 - 499 posts)
    I would try to do it myself, but I'm just a newbie to all this PHP concept, and that would require a couple of days of reading some resources on PHP...

    But anyway, I tried this code:
    [code]
    <?php
    $fp = fopen ("links.dat","r");
    while (list ($line) = fscanf ($fp, "%sn"))
    {
    if(($beg=strpos($line,"http:"))===false| |($end=strpos($line,".jpg"))===false);
    else print substr($line,$beg,($end-$beg)+4)."<BR>";
    }
    fclose($fp);
    php?>
    [/code]

    and I'm getting a parse error on line 5.
    I'm using it on a system that supports PHP4, so do you have any idea what might be wrong?
  4. #4
    Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2000
    Posts
    6
    Rep Power
    0
    I forgot that you had to read multiple links per line of the file.
    This will work then; I tested the code as is, and it worked.

    Note: it only works if the links are separated by ' '.

    [code]
    <?php
    $fp = fopen("test.dat", "r");
    // read the file line by line (at most 4095 bytes per call)
    while (($fileline = fgets($fp, 4096)) !== false)
    {
        // split the line into space-separated tokens
        $links = explode(" ", $fileline);
        for ($i = 0; $i < count($links); $i++)
        {
            // keep a token only if it contains both "http:" and ".jpg"
            if (!(($beg = strpos($links[$i], "http:")) === false) && !(($end = strpos($links[$i], ".jpg")) === false)) {
                print substr($links[$i], $beg, ($end - $beg) + 4)."<BR>";
            }
        }
    }
    fclose($fp);
    ?>
    [/code]
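
The original question asked for the links to end up in a separate file, and the note above says the script depends on a ' ' between links. A rough sketch that drops the space requirement and writes one link per line could look like this. The function names find_jpg_links and extract_links_to_file and the output file name are assumptions for illustration, not part of the tested script.

```php
<?php
// Hypothetical helper: collects every "http://" ... ".jpg" span in a line,
// whether or not the links are separated by spaces.
function find_jpg_links(string $line): array
{
    $links = [];
    $offset = 0;
    while (($beg = strpos($line, 'http://', $offset)) !== false) {
        $end = strpos($line, '.jpg', $beg);
        if ($end === false) {
            break; // no ".jpg" left on this line
        }
        $candidate = substr($line, $beg, $end - $beg + 4);
        // If another "http://" sits inside the candidate, the earlier link
        // was not a .jpg, so keep only the part from the last "http://".
        $links[] = substr($candidate, strrpos($candidate, 'http://'));
        $offset = $end + 4;
    }
    return $links;
}

// Hypothetical helper: streams the input line by line, so the whole 9MB file
// never has to sit in memory, and returns the number of links written.
function extract_links_to_file(string $inPath, string $outPath): int
{
    $in = fopen($inPath, 'r');
    $out = fopen($outPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('could not open input or output file');
    }
    $count = 0;
    while (($line = fgets($in)) !== false) {
        foreach (find_jpg_links($line) as $link) {
            fwrite($out, $link . "\n");
            $count++;
        }
    }
    fclose($in);
    fclose($out);
    return $count;
}

// Example call (file names are placeholders):
// extract_links_to_file('test.dat', 'links.txt');
```

This still treats any text between "http://" and the next ".jpg" as part of a link, so descriptions that happen to contain ".jpg" could produce odd results.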

    And by the way, on line 5 (the long if statement) the or operator (||) must stand
    together, not separated by a ' '.
    Somehow it isn't displayed right, or I copied it wrong.
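
The condition on that line depends on two details: "||" written as a single token, and the strict "=== false" check. A small sketch of why the strict check matters, using a made-up sample string:

```php
<?php
// Illustrative input: the link starts at the very first character.
$line = 'http://www.bla.com/test.jpg';

$beg = strpos($line, 'http:');  // 0, because the match is at position 0

var_dump($beg == false);   // bool(true):  loose comparison mistakes 0 for "not found"
var_dump($beg === false);  // bool(false): strict comparison sees a real match

// "||" is one operator; writing it as "| |" is a parse error.
$end = strpos($line, '.jpg');
if ($beg === false || $end === false) {
    print "no link\n";
} else {
    print substr($line, $beg, $end - $beg + 4) . "\n";
}
```

With "==" instead of "===", any line that begins with a link would be skipped.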

    ------------------
    Greetings lewi
  5. #5
    fantom
    Guest
    Devshed Newbie (0 - 499 posts)
    Goddamn, Lewi, you are the man!

    That code worked exactly the way I needed it.
    Sweet... like a candy!

    Thanks a zillion man... you need anything, just let me know, cuz I owe you big time for this!
