ColdFusion Development
 
#1
March 16th, 2005, 09:15 AM
samb1
Contributing User
Join Date: Sep 2004
Posts: 67
CFHTTP looping problems

I have parsed an index page for specific job links and stored those links in a one-dimensional array.
Using cfhttp, I then loop over the links in the array so that I can do more detailed parsing of the pages that the links point to. However, when I output the results of the cfhttp calls, I get the index page several times instead of the pages that the links point to.
Am I using cfhttp correctly here?

#2
March 16th, 2005, 10:40 AM
kiteless
Moderator
Join Date: Jun 2002
Location: Raleigh, NC
Posts: 3,682
So is this where the problem is happening?

<!--- Store the list of FoundLinks into the Links Array --->
<cfset LinksArray = ListToArray(StripLinks)>
<cfloop from="1" to="#arrayLen(LinksArray)#" index="i">
    <cfhttp method="get" url="#LinksArray[i]#" resolveurl="yes">
    <cfoutput>#cfhttp.FileContent#</cfoutput>
</cfloop>

Can you confirm that the listToArray() call is generating an array with the elements you are expecting?

#3
March 16th, 2005, 10:54 AM
samb1
Yes, that is where the problem is happening. I ran <cfdump var="#LinksArray#"> and each URL appears in the array as an element.

#4
March 16th, 2005, 11:42 AM
kiteless
If you output each element of the array as you loop over it (before the cfhttp call), are you seeing the correct URLs? And if so, are you saying that the immediately following cfhttp call's output is not for that URL?
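
A minimal sketch of that check, assuming two placeholder URLs in place of the real list (the example.com addresses are hypothetical, not from the original code). It prints each URL just before the request, then a short summary of the response, so each response can be matched to the URL that produced it:

```cfm
<!--- Hypothetical sample data; in the thread, LinksArray comes from ListToArray(StripLinks) --->
<cfset LinksArray = ListToArray("http://www.example.com/job1.html,http://www.example.com/job2.html")>

<cfloop from="1" to="#ArrayLen(LinksArray)#" index="i">
    <!--- Print the URL first, so the request can be matched to its response --->
    <cfoutput><p>Requesting #i# of #ArrayLen(LinksArray)#: #HTMLEditFormat(LinksArray[i])#</p></cfoutput>
    <cfhttp method="get" url="#LinksArray[i]#" resolveurl="yes">
    <!--- The status code and content length are usually enough to tell responses apart --->
    <cfoutput><p>Status: #cfhttp.StatusCode#, length: #Len(cfhttp.FileContent)#</p></cfoutput>
</cfloop>
```

Printing a summary instead of the full FileContent keeps the fetched pages from burying the debug lines.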

#5
March 16th, 2005, 12:02 PM
samb1
I output each element of the array as I looped over it, and the results were strange. From website 1, only 1 of 27 links was output.
From website 2, all 7 of 7 links were output.
From website 3, 1 of 1 links was output.

Therefore I am not seeing all the correct URLs.

#6
March 16th, 2005, 12:12 PM
kiteless
Then the problem must be in the regular expression somewhere in here:

<cfset StartPos = 1>
<cfloop condition ="True">

<!---Parse the site index pages for job links--->
<cfset Match = REFindNoCase(#Trim(xmlObj.xmlRoot.site[i].parse.xmlAttributes.re)#, cfhttp.FileContent, StartPos, True)>

<cfif Match.pos[1] EQ 0>
<cfbreak>

<cfelse>
<cfset StartPos = Match.pos[1] + Match.len[1]>
<!---<cfset Foundlinks = Mid(cfhttp.FileContent, Match.pos[1], Match.len[1])>--->
<cfset StripLinks = #REReplaceNoCase(#Mid(cfhttp.FileContent, Match.pos[1], Match.len[1])#,"\s*HREF=\W", "", "ALL")#>

Unfortunately, I don't know much about regular expressions (I don't need to use them much). If this is indeed where the problem is, you might ask in a regex forum or search the net for regular expressions that do what you need.
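
For comparison, here is a sketch of the same kind of loop that collects every match into an array. The page content and the pattern are simplified, hypothetical stand-ins for cfhttp.FileContent and the pattern read from the XML config. Whether the original code overwrote StripLinks on each pass cannot be told from the fragment above, so treat this as one thing to check rather than a diagnosis:

```cfm
<!--- Hypothetical page content and pattern, for illustration only --->
<cfset PageContent = '<a href="/jobs/1.html">One</a> <a HREF="/jobs/2.html">Two</a>'>
<cfset LinkPattern = 'href="[^"]*"'>

<cfset LinksArray = ArrayNew(1)>
<cfset StartPos = 1>
<cfloop condition="true">
    <!--- With the fourth argument true, REFindNoCase returns a struct of pos and len arrays --->
    <cfset Match = REFindNoCase(LinkPattern, PageContent, StartPos, true)>
    <cfif Match.pos[1] EQ 0>
        <cfbreak>
    </cfif>
    <!--- Append each match; assigning to one plain variable would keep only the latest match --->
    <cfset Found = Mid(PageContent, Match.pos[1], Match.len[1])>
    <cfset ArrayAppend(LinksArray, REReplaceNoCase(Found, '^href="|"$', '', 'ALL'))>
    <cfset StartPos = Match.pos[1] + Match.len[1]>
</cfloop>

<cfdump var="#LinksArray#">
```

Appending inside the loop means the array grows by one element per match, and the dump at the end shows how many links the pattern actually found.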

#7
March 16th, 2005, 12:18 PM
samb1
You're probably right, Kiteless. I'll have a look around the web. Thanks for your time, though.

#8
March 16th, 2005, 01:50 PM
samb1
Kiteless, just a general question: do you think there is a more efficient way of holding all the URLs for three different sites? At present they are all stored in a one-dimensional array.

#9
March 16th, 2005, 02:54 PM
kiteless
I'd store them in a two-dimensional array, where the first dimension is the site (so right now you'd have 3 elements in the first dimension) and the second dimension holds the links within each site (so each of the 3 sites could have 1 or more links within it).
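
A small sketch of that layout, using made-up URLs (the SiteLinks name and the example.com and example.org addresses are assumptions for illustration). The outer index selects the site and the inner index selects a link within that site:

```cfm
<!--- Hypothetical URLs; site 1 has two links, site 2 has one --->
<cfset SiteLinks = ArrayNew(2)>
<cfset SiteLinks[1][1] = "http://www.example.com/jobs/1.html">
<cfset SiteLinks[1][2] = "http://www.example.com/jobs/2.html">
<cfset SiteLinks[2][1] = "http://www.example.org/vacancies/a.html">

<!--- The outer loop walks the sites, the inner loop walks the links of one site --->
<cfloop from="1" to="#ArrayLen(SiteLinks)#" index="s">
    <cfloop from="1" to="#ArrayLen(SiteLinks[s])#" index="l">
        <cfoutput>Site #s#, link #l#: #SiteLinks[s][l]#<br></cfoutput>
    </cfloop>
</cfloop>
```

Each site's row can have a different length, so a site with 27 links and a site with 1 link fit in the same structure.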

#10
March 17th, 2005, 04:45 PM
samb1
Kiteless, sorry to bother you, but I need you to tell me where I'm going wrong.
I've rewritten some of my code, and this is my problem:
I fixed the regex and it works fine. I get the links from the first website, store them in an array, and when I output the links they appear as the correct URLs (fine so far). I then loop over the array and cfhttp each URL in it. When I output cfhttp.FileContent, only one web page is displayed, the one for the first URL in the array (there should be 16). Are there limits on how many pages can be output?
I have then tried to run one of my four detailed page parsers (regex) against cfhttp.FileContent to extract the specific info I need from each URL. The code I'm using to do this is below; will I have to repeat this code for each detailed page parser?


I hope I'm making sense.

#11
March 17th, 2005, 10:27 PM
kiteless
No, I'm not understanding what you're trying to do.

#12
March 18th, 2005, 03:01 AM
samb1
What I'm trying to say is that only the first URL from the array is output when I perform a cfhttp on the array elements; there should be several. My loop looks fine, but there is obviously something wrong.

#13
March 18th, 2005, 08:14 AM
kiteless
The first thing I would do is make sure that the URLs in the array are what you expect. If you dump the array do you see URL values that look correct?

Then make sure that the inner CFHTTP call is working. No, there is no limit on the number of pages you can output. Instead of all that regex stuff, can you just dump the result of each inner CFHTTP call and see if it is doing what you expect?

<cfset LinksArray = ListToArray(StripLinks)>
<cfloop from="1" to="#arrayLen(LinksArray)#" index="i">
    <cfhttp method="get" url="#LinksArray[i]#">
    <cfdump var="#cfhttp#">
</cfloop>

#14
March 21st, 2005, 08:22 AM
samb1
The URLs are what I expected: they are all complete, and each can be fetched with cfhttp manually. I can see the URLs in the array when I dump it, and they are correct.

The inner cfhttp only calls the first URL in the array and then stops. I have tried it without the regex stuff and it's still the same.

It looks as though there is a problem with the loop. I haven't had any error messages either.

#15
March 21st, 2005, 01:07 PM
kiteless
So this is where it is failing?

<cfset StripLinks = #REReplaceNoCase(#Mid(cfhttp.FileContent, Match.pos[1], Match.len[1])#,"=[""]", "http://www.rspb.org.uk/vacancies/index.asp", "ALL")#>
<cfset LinksArray = ListToArray(StripLinks)>
<cfloop from="1" to="#arrayLen(LinksArray)#" index="i">
<cfhttp method="get" url="#LinksArray[i]#">
...

?

If so, then the problem is that the LinksArray doesn't have the URLs that you need. More to the point, I think this is not doing what you think it should:

<cfset StripLinks = #REReplaceNoCase(#Mid(cfhttp.FileContent, Match.pos[1], Match.len[1])#,"=[""]", "http://www.rspb.org.uk/vacancies/index.asp", "ALL")#>

REReplaceNoCase just replaces each match of the regular expression with the replacement text inside the string, and returns that one string. I think if you do a <cfdump var="#StripLinks#"> you'll see that it doesn't have the list of links that you are expecting.
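
A short sketch of that point, using a made-up fragment and a placeholder base URL (neither is from the original code). The replacement produces one string, and because ListToArray splits on commas by default, a string without commas becomes a one-element array:

```cfm
<!--- Hypothetical matched fragment, standing in for Mid(cfhttp.FileContent, ...) --->
<cfset Fragment = 'href="job1.asp" href="job2.asp"'>
<cfset StripLinks = REReplaceNoCase(Fragment, '="', '="http://www.example.com/', "ALL")>

<!--- StripLinks is still one string, and it contains no commas --->
<cfdump var="#StripLinks#">

<!--- ListToArray splits on commas by default, so this array has a single element --->
<cfset LinksArray = ListToArray(StripLinks)>
<cfoutput>Elements: #ArrayLen(LinksArray)#</cfoutput>
```

If the real StripLinks value looks like this, a loop over LinksArray would run only once, which would match the symptom described earlier in the thread.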
