ColdFusion Development
 
  #1  
November 28th, 2011, 04:38 PM
akdwalker
Registered User
Join Date: Nov 2011
Posts: 2
Solr errors when indexing custom file extensions.

Greetings!

I am working on my company's website and need to be able to index the web pages with Solr. The site is configured to treat files with the .ak extension as .cfm files, but Solr throws errors when trying to index them.

While testing, I found that if I remove the <head> tags from the documents, there are no errors. I've looked through the Solr config files for a place to tell Solr that .ak files should be parsed as .cfm files, but I have been unable to find such a setting. Does one exist? Is there maybe another way to resolve this issue?



Thanks for your help,
Dave

  #2  
November 28th, 2011, 07:51 PM
kiteless
Moderator
Join Date: Jun 2002
Location: Raleigh, NC
Posts: 5,091
Well, if I follow you correctly, I'm not sure this would do what you think it will. When you point Solr at a directory, it indexes the file content. Solr has no idea what "ColdFusion" means; it just parses the raw text of the files. That probably isn't going to do much good if your CF templates are actually showing dynamic data at runtime.

Consider a CF template named product.cfm. At runtime you might pass a URL variable like product.cfm?id=20, which would show the information for the product with the ID of 20. But when Solr indexes product.cfm, it has no idea about product IDs or anything else; it's just going to index the actual text in the product.cfm file.
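
A minimal Python sketch of this point, with a made-up template and product record standing in for product.cfm and its database lookup. The tokenizer is a crude stand-in for an indexer, not how Solr actually analyzes text:

```python
import re

# Hypothetical template source, standing in for product.cfm on disk.
TEMPLATE_SOURCE = "<h1>#product.name#</h1><p>Price: #product.price#</p>"

# Hypothetical product data that would only be looked up at request time.
PRODUCTS = {20: {"name": "Garden Shed", "price": "499.00"}}


def render(template, product_id):
    """Fill in #product.key# placeholders, roughly what happens per request."""
    product = PRODUCTS[product_id]
    return re.sub(r"#product\.(\w+)#", lambda m: product[m.group(1)], template)


def tokens(text):
    """Crude stand-in for an indexer: the set of lowercase word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


file_terms = tokens(TEMPLATE_SOURCE)              # what a file-based index sees
page_terms = tokens(render(TEMPLATE_SOURCE, 20))  # what a visitor to ?id=20 sees

print("garden" in file_terms)  # False: the product name is not in the file
print("garden" in page_terms)  # True: it only exists in the rendered page
```

A search for the product name would find nothing in an index built from the file on disk, because that text only appears once the template has run.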

Make sense?

  #3  
November 29th, 2011, 02:54 PM
akdwalker
Kiteless,
Thanks for your response. I understand what you are saying. The pages I am trying to index have some static content placed for the indexing.

I am indexing a directory that holds duplicate files with both the .cfm and .ak extensions. If I index just the .cfm files, I have no problems. But when I index the .ak versions, the indexing errors out and finds 0 files. (The only difference between the files is the filename extension.) This happens when indexing through the Administrator window as well as with cfindex. If I remove the <head> tags, the indexing returns no errors and indexes the files properly.

  #4  
November 29th, 2011, 04:45 PM
kiteless
Hmm, I'm not sure then, as I haven't needed to dig into the guts of Solr myself. My guess is that since the Solr instance doesn't know how to process that extension, it's treating it in some default way: maybe as XML, or maybe it is grabbing the content and trying to force it into an XML CDATA block. That would mean anything in the file that would be interpreted as invalid XML could make it blow up.
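
A small Python sketch of that guess, nothing more: it does not show what Solr or ColdFusion actually do with an unknown extension. It only shows that a made-up page with an ordinary HTML <head> (an unclosed <meta> tag) is not well-formed XML, while the same page without the <head> is, which would be consistent with the errors going away when the <head> is removed.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text):
    """Return True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical page content. <meta> is never closed, which is fine in HTML
# but makes the document ill-formed when read strictly as XML.
with_head = """<html>
<head>
<meta charset="utf-8">
<title>Products</title>
</head>
<body><p>Static text for the index.</p></body>
</html>"""

# The same hypothetical page with the <head> section removed.
without_head = """<html>
<body><p>Static text for the index.</p></body>
</html>"""

print(parses_as_xml(with_head))     # False
print(parses_as_xml(without_head))  # True
```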

That said, a quick look at the Solr docs doesn't help much. Once again, the CF engineers have done an amazing job of taking something really complicated and making it easy to use. So my guess is you'll need to pore over the Solr docs or grab one of the Solr books to figure out what Solr config or setting will make it handle that extension the way you want it to. :-/
