August 29th, 2000, 11:57 AM
Does anyone know of any scripts that will render a web page as text only (replacing images with their alt text, etc.)?
Our university insists on having text equivalents of its web pages. A Perl script to automate this would be much easier than rewriting all of them in text form!
August 29th, 2000, 01:57 PM
So you're saying you want to remove all HTML tags? What about hyperlinks?
August 29th, 2000, 02:25 PM
I want to keep it as a web page; I just want to set it to a white background and black text (to make it easy to read for people with impaired eyesight, etc.).
Have a look at:
When a link is clicked, the new page is sent to the script to be formatted the same way, and so on. That way you could move through the entire site in a text-only format.
[This message has been edited by hotatom (edited August 29, 2000).]
August 29th, 2000, 02:55 PM
Sounds like a lot of fun (the scripting, that is).
Here are some suggestions to get you going:
Use libwww-perl to grab the HTML contents into a variable.
Once it's in that variable, you can use some regular expressions to strip things like images, etc.
Then use some more regular expressions to change all links to point to that script again. Example:
then in that page all links would be:
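As a rough sketch of the two regular-expression steps described above (swapping images for their alt text, then pointing every link back at the script), something like the following might work. It is written in Python purely for illustration; the same patterns should carry over to Perl more or less unchanged. The script address `/cgi-bin/textonly.cgi`, the `url` query parameter, and all the function names are made up for this example and do not come from any existing script. Fetching the page is left out, and regular expressions are a fragile way to handle HTML, so treat this as a starting point only.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical address of the text-only script (an assumption, not a real script).
SCRIPT_URL = "/cgi-bin/textonly.cgi"

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
ALT_ATTR = re.compile(r"""\balt\s*=\s*(?:"([^"]*)"|'([^']*)')""", re.IGNORECASE)
HREF_ATTR = re.compile(
    r"""(<a\b[^>]*?\bhref\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE | re.DOTALL
)


def images_to_alt_text(html):
    """Replace each <img> tag with its alt text, or drop it if it has none."""
    def replace(match):
        alt = ALT_ATTR.search(match.group(0))
        if not alt:
            return ""
        # Group 1 is a double-quoted value, group 2 a single-quoted one.
        return alt.group(1) if alt.group(1) is not None else alt.group(2)
    return IMG_TAG.sub(replace, html)


def links_through_script(html, base_url):
    """Rewrite quoted href values so each link is requested via the script."""
    def replace(match):
        prefix, quote_char, target = match.groups()
        # Leave in-page anchors and mail links alone.
        if target.startswith("#") or target.lower().startswith("mailto:"):
            return match.group(0)
        absolute = urljoin(base_url, target)  # resolve relative links
        wrapped = f"{SCRIPT_URL}?url={quote(absolute, safe='')}"
        return f"{prefix}{quote_char}{wrapped}{quote_char}"
    return HREF_ATTR.sub(replace, html)


def to_text_only(html, base_url):
    """Apply both steps to a page that has already been fetched."""
    return links_through_script(images_to_alt_text(html), base_url)
```

For instance, calling `to_text_only(page, "http://www.example.edu/index.html")` on a page containing `<a href="about.html">` should turn that link into one that goes through the script with the full address of `about.html` as its `url` parameter (the example.edu address is just a placeholder).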
August 29th, 2000, 03:12 PM
Cool, but are there any pre-written scripts out there that do this? I'm a PHP man and don't have time to learn Perl. Also, the project's server I'm working with doesn't have PHP support, only Perl.
August 29th, 2000, 05:14 PM
Er... isn't that what Lynx does? A text-only web browser?