#1
    miguelgarcia
    Guest
    Devshed Newbie (0 - 499 posts)
    Does anyone know where I can find a Perl module or program that will strip the text out of an HTML file?
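
    What the question asks for can be roughly sketched with a few regular expressions. This is only a naive illustration, assuming simple, well-formed pages; the sub name strip_html is made up for this example, and a real HTML parser module from CPAN would be more robust.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Naive tag stripper (hypothetical helper). Regexes cannot handle every
    # HTML edge case, e.g. a ">" inside an attribute value.
    sub strip_html {
        my ($html) = @_;
        $html =~ s/<!--.*?-->/ /gs;                       # drop comments
        $html =~ s/<(script|style)\b.*?<\/\1\s*>/ /gis;   # drop script/style blocks
        $html =~ s/<[^>]*>/ /g;                           # replace remaining tags with a space
        $html =~ s/&nbsp;/ /gi;                           # decode a few common entities
        $html =~ s/&lt;/</g;
        $html =~ s/&gt;/>/g;
        $html =~ s/&amp;/&/g;                             # decode &amp; last to avoid double decoding
        $html =~ s/\s+/ /g;                               # collapse runs of whitespace
        $html =~ s/^\s+|\s+$//g;                          # trim both ends
        return $html;
    }

    # Run as: perl strip.pl page.html
    if (@ARGV) {
        open(my $in, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
        local $/;                                         # slurp the whole file at once
        my $html = <$in>;
        close $in;
        print strip_html($html), "\n";
    }
    ```

    Because every tag becomes a space, inline markup in the middle of a word ("<b>bo</b>ld") will split that word; whether that matters depends on the pages involved.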
  2. #2
    dlamb
    Guest
    Devshed Newbie (0 - 499 posts)
    Here's a small one I just did for a buddy. This particular one grabs a table from an HTML file (it has a nested table in it) and prints the output to a file. Then he just included the file with PHP3.

    Look at it closely, it's a simple script, and should be easy to customize. Here it is:

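    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file     = "playlist.code";    # output file, used by the loop at the end
    my $original = "playlist.html";    # HTML file to read

    open(my $in, '<', $original) or die "Cannot open $original: $!";
    my @lines = <$in>;
    close $in;
    #print @lines;

    # First line
    my @playlist = ("<br>");

    my $found  = 0;                    # opening table tags seen so far
    my $count  = 0;                    # closing table tags seen while capturing
    my $webdog = "<table";
    foreach my $line (@lines) {
        $found++ if $line =~ /$webdog/i;
        if ($found > 1) {              # start capturing at the second <table
            push @playlist, $line;
            if ($line =~ m{</table>}i) {
                $count++;
                last if $count == 2;   # stop after the second </table>
            }
        }
    }
    print @playlist;
    #open(my $out, '>', $file) or die "Cannot open $file: $!";
    #foreach my $entry (@playlist) {
    #    print $out $entry;
    #}
    #close $out;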

    The script could probably be a bit simpler, but here you go. Note that this only grabs whole lines. If you need to grab, say, just half of the last line, you'll have to monkey with it (most likely using $' and $`).
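
    The partial-line case mentioned above might look something like the following. This is only a sketch: the two sub names and the sample lines are invented for illustration. After a successful match, $` holds the text before the match, $& the match itself, and $' the text after it.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Return the line from the first "<table" to the end of the line,
    # or undef if the line has no opening table tag.
    sub from_table_open {
        my ($line) = @_;
        return $line =~ /<table/i ? ($& . $') : undef;
    }

    # Return the line up to and including the first "</table>",
    # or undef if the line has no closing table tag.
    sub through_table_close {
        my ($line) = @_;
        return $line =~ m{</table>}i ? ($` . $&) : undef;
    }

    # Made-up sample lines where the table starts and ends mid-line
    my $first_line = '<p>intro</p><table border="1"><tr>';
    my $last_line  = '</tr></table><p>footer</p>';

    print from_table_open($first_line), "\n";      # <table border="1"><tr>
    print through_table_close($last_line), "\n";   # </tr></table>
    ```

    In the script above, the first captured line would go through from_table_open and the last one through through_table_close before being pushed onto @playlist.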
  4. #3
    dlamb
    Guest
    Devshed Newbie (0 - 499 posts)
    Oops, that script actually prints the results to stdout. To print to a file, uncomment that small loop at the end.
  6. #4
    curtdog
    Guest
    Devshed Newbie (0 - 499 posts)
    I would like to grab HTML (news headlines) from a remote site once a day via cron and write it to a file that I can include in my page. I have written a PHP script that grabs headlines from a site and writes them to my file, but it is slow.

    ------------------
    Christopher Curtis
    C Double Web Development
    http://c-double.com
  8. #5
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2000
    Posts
    22
    Rep Power
    0
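    use LWP::Simple;

    # get() returns undef if the request fails
    my $html = get("http://www.mysite.com");
    die "Could not fetch the page\n" unless defined $html;
    print $html;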

    Works for me - I learnt it from a book about two hours ago!
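
    For the once-a-day cron case asked about earlier, the same get() call could be wrapped so that the page is written to a file instead of printed. The following is a rough sketch, not a tested production script: the sub name save_page, its optional $fetch argument, the script name fetch_headlines.pl and the paths in the crontab comment are all made up for illustration.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Fetch $url and write the page to $file. Returns 1 on success and 0 if
    # the fetch failed (the existing file is then left untouched).
    # $fetch is an optional code ref (hypothetical, so the fetcher can be
    # swapped out); by default LWP::Simple's get() is loaded and used.
    sub save_page {
        my ($url, $file, $fetch) = @_;
        $fetch ||= do { require LWP::Simple; \&LWP::Simple::get };

        my $html = $fetch->($url);
        return 0 unless defined $html;    # get() returns undef on failure

        open(my $out, '>', $file) or die "Cannot write $file: $!";
        print $out $html;
        close $out;
        return 1;
    }

    # Run as: perl fetch_headlines.pl URL FILE
    # Example crontab entry, 06:00 every day (paths are placeholders):
    #   0 6 * * * /usr/bin/perl /path/to/fetch_headlines.pl http://www.mysite.com /path/to/headlines.html
    if (@ARGV == 2) {
        save_page($ARGV[0], $ARGV[1]) or die "Could not fetch $ARGV[0]\n";
    }
    ```

    Since the fetch happens in the cron job, the page itself only has to include a local file, which avoids a remote request on every page view.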

  10. #6
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2000
    Location
    Lorain, OH, USA
    Posts
    0
    Rep Power
    0
    Okay, I haven't tried that out yet, but I bet dollars to nickels it works. Now I have a little question of my own, since I am a newbie.

    I want to know how to do this and, instead of printing to a new file, print directly from the CGI.

    I would also like to filter for certain things and replace them with other things, like putting <P> where the <BR> tags are.
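
    One possible way to do both is sketched below, with assumptions: the sub names br_to_p and cgi_page are invented, and the sample string stands in for whatever HTML was fetched (for example with get() from LWP::Simple). A CGI script has to print a Content-type header and a blank line before the page body.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Replace every <br>, <BR> or <br /> tag with <p>
    sub br_to_p {
        my ($html) = @_;
        $html =~ s/<br\s*\/?>/<p>/gi;
        return $html;
    }

    # Build the full CGI response: header, blank line, filtered body
    sub cgi_page {
        my ($html) = @_;
        return "Content-type: text/html\n\n" . br_to_p($html);
    }

    # Sample input; in a real script this would be the fetched page
    print cgi_page("line one<BR>line two<br />\n");
    ```

    Other replacements follow the same pattern: one s/// substitution per thing to be swapped.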

    ---

    This also brings up another problem I am having. I want to embed an external CGI in an internal CGI my friend and I are working on.

    Many of them are things like a counter or a clock. They are in the same directory as our sidebar generator (the CGI we use to get the data for our fields); we just want to implement these CGI files in the sidebar CGI so it will be seamless.

    Any quick code snippets, suggestions or such, please e-mail or reply. Thanks!
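
    One approach that might work here, sketched under assumptions: run the other CGI with backticks, drop the HTTP header it prints, and paste the rest into the sidebar output. The sub names and the script name ./counter.cgi are hypothetical, and this only suits scripts that need no form input of their own.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Remove leading "Name: value" header lines plus the blank line after
    # them. Output that has no header block is returned unchanged.
    sub strip_cgi_header {
        my ($output) = @_;
        $output =~ s/\A(?:[\w-]+:[^\n]*\r?\n)+\r?\n//;
        return $output;
    }

    # Run another CGI script and return its body, or '' if it failed.
    sub embed_cgi {
        my ($script) = @_;
        my $output = `$script`;
        return '' unless defined $output && $? == 0;
        return strip_cgi_header($output);
    }

    # Usage inside the sidebar CGI (script name is a placeholder):
    #   print "Content-type: text/html\n\n";
    #   print "<div>", embed_cgi("./counter.cgi"), "</div>\n";
    ```

    Only the outer script should print the Content-type header; that is why the header of the embedded script is stripped.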

    Chris
