January 18th, 2011, 01:30 PM
Extracting data from an XML website
Sorry, this may be a FAQ.
Anyway, I didn't find an answer using the search function.
What would be the best way to extract dynamically changing information from a web page that holds XML?
I have to extract some data from multiple web pages.
These sites contain XML code that dynamically reloads (and changes) parts of the page.
Since only small parts get refreshed (perhaps 200 bytes every 5 seconds), reloading the whole page (about 40k) every time would be inefficient and might also provoke a negative reaction from the server.
How could I best detect these updated contents and pass them to the program that processes them further?
Please point me to the best language (Perl?) or tool for this task.
I am familiar with several programming languages but unfortunately have very little knowledge of internet programming.
January 18th, 2011, 04:01 PM
January 19th, 2011, 01:07 AM
Sure, I have been imprecise in my description.
Any suggestions on how the extraction of the XML content could be performed?
January 19th, 2011, 06:24 PM
Moved from XML.
Depending on how you got the XML (AJAX, I assume) you can inspect it just like you would an HTML document.
What does the XML look like and what are you trying to get?
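Without seeing the actual XML, here is only a rough sketch in Python of the general idea: parse each small XML update with the standard library's xml.etree.ElementTree and compare it with the previous one, so that only the changed values are passed on. The element names (update, price, volume) and the helper names extract_fields and changed_fields are made up for illustration. It also assumes you can fetch the small XML fragment directly (for example with urllib.request.urlopen on whatever URL the page's script requests) rather than reloading the whole page.

```python
import xml.etree.ElementTree as ET


def extract_fields(xml_text):
    """Map each leaf element's path to its stripped text content."""
    root = ET.fromstring(xml_text)
    fields = {}

    def walk(node, path):
        children = list(node)
        if not children:
            # Leaf element: record its text (empty string if none).
            fields[path] = (node.text or "").strip()
        for index, child in enumerate(children):
            # The index keeps repeated sibling tags distinct.
            walk(child, f"{path}/{child.tag}[{index}]")

    walk(root, root.tag)
    return fields


def changed_fields(previous, current):
    """Return the entries of `current` that are new or differ from `previous`."""
    return {key: value for key, value in current.items()
            if previous.get(key) != value}


# Hypothetical sample updates; the real XML will look different.
old_xml = "<update><price>10.5</price><volume>200</volume></update>"
new_xml = "<update><price>10.7</price><volume>200</volume></update>"

delta = changed_fields(extract_fields(old_xml), extract_fields(new_xml))
print(delta)
```

In a polling loop you would keep the last result of extract_fields, fetch the fragment again every few seconds, and hand only the delta to the program that does the further processing.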