Using a primitive string search to extract info from HTML is generally a very poor approach. If the markup changes just a little bit (different formatting, additional whitespace, additional classes, whatever), then your whole "solution" falls apart, and you need to fumble with your code again -- until the next change. It also doesn't make a lot of sense, because you do not even want a string. What you want is an HTML element.
Looking for the "days" keyword to get recent offers also isn't very sensible. What if the text says "1 hour"? Is that not recent? Do you really wanna wait until the offer is at least 2 days old so that your tool recognizes it?
I mean, if you're just playing around, and if this whole thing isn't really important, then this might be "good enough" as a quick and dirty hack. But if you're serious, you'll need to take a different approach.
What I would do is parse the HTML and then look for all divs with the class "offer" but without the class "hidden" (you can use XPath). And then I'd parse the datetime attribute of the time element to see if it falls within the given time limit (whatever that is).
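
To make that concrete, here is a rough sketch in Python. It's only an illustration under a few assumptions: the sample markup, the names OfferTimeParser and recent_offer_times, and the two-day limit are all made up, and it uses the standard library's html.parser rather than XPath so it runs without extra packages. Your real page and your language of choice will differ.

```python
from datetime import datetime, timedelta, timezone
from html.parser import HTMLParser

# Hypothetical markup; the structure of the real page may differ.
SAMPLE_HTML = """
<div class="offer featured">
  <h2>Offer A</h2>
  <time datetime="2024-05-01T10:00:00+00:00">1 hour ago</time>
</div>
<div class="offer hidden">
  <h2>Offer B</h2>
  <time datetime="2024-05-01T09:00:00+00:00">2 hours ago</time>
</div>
<div class="offer">
  <h2>Offer C</h2>
  <time datetime="2024-04-20T09:00:00+00:00">11 days ago</time>
</div>
"""


class OfferTimeParser(HTMLParser):
    """Collects the datetime attribute of <time> elements inside visible offers."""

    def __init__(self):
        super().__init__()
        # One flag per currently open <div>: True if it has the class
        # "offer" and does not have the class "hidden".
        self.div_stack = []
        self.timestamps = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            # Split the class attribute so extra classes or whitespace don't matter.
            classes = (attrs.get("class") or "").split()
            self.div_stack.append("offer" in classes and "hidden" not in classes)
        elif tag == "time" and any(self.div_stack) and attrs.get("datetime"):
            # Read the machine-readable value, not the "1 hour ago" text.
            self.timestamps.append(datetime.fromisoformat(attrs["datetime"]))

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()


def recent_offer_times(html, now, max_age):
    """Return the timestamps of visible offers that are at most max_age old.

    Assumes `now` is timezone-aware, like the timestamps in the markup.
    """
    parser = OfferTimeParser()
    parser.feed(html)
    return [t for t in parser.timestamps if timedelta(0) <= now - t <= max_age]


# Fixed "now" so the example is reproducible; the limit of 2 days is arbitrary.
now = datetime(2024, 5, 1, 11, 0, tzinfo=timezone.utc)
recent = recent_offer_times(SAMPLE_HTML, now, timedelta(days=2))
print(recent)
```

With this sample, only Offer A comes back: Offer B is hidden and Offer C is too old. Note that the "1 hour ago" offer is found without any keyword matching on the visible text.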
I mean, c'mon, this is nice semantic HTML. They're making it easy to parse the data. Use that!