Dev Shed Forums > Programming Languages - More > ColdFusion Development
Solr Search

December 18th, 2012, 03:11 PM
Registered User
Join Date: Dec 2012
Posts: 6
Time spent in forums: 33 m 38 sec
Reputation Power: 0
Solr Search
Great site, this is my first post.
I am running CF9.
I have a folder of .doc and .docx files, and I want to search the file contents. I don't need the metadata, just the viewable content.
I have these same files listed in a MS SQL 2005 table.
What I'd like is to be able to search the contents of the .doc and .docx and return the primary key and extension that are listed in the table.
Am I making this more complicated than it needs to be? I thought this would be easier, but it's making me crazy.
Thanks for the help

December 18th, 2012, 05:37 PM
Moderator
Join Date: Jun 2002
Location: Raleigh, NC
If you think about what you're asking, this isn't easy at all. You have two separate sets of data (a directory of files, and a database table), and you want to search the content of the files but map the search results to a database row.
That said, I believe Example #5 (toward the bottom) of the cfindex page in the docs shows one way to blend database results and file path indexing.
Last edited by kiteless : December 18th, 2012 at 05:39 PM.

December 18th, 2012, 07:46 PM
Registered User
Thank you very much for your reply.
I had seen that reference, but I didn't see where to tell it where the files are.
OK, maybe I can make it easier. When I upload the files, I name them with the primary key from the same database table I want to later reference. So in the folder it might look like:
1.doc
2.docx
7.docx
9.doc
...
Is there a way to index the folder and use the filename as the key? Then I can use what the search returns to query the database based on the key which will match the primary key from the table.
Thanks again

December 18th, 2012, 08:21 PM
Moderator
I've actually only used Solr to index database data OR documents, but not both of them at once. What you're talking about is probably possible, but it's an unusual case and is not the normal way indexing is done. Other than going over the docs and doing some trial and error, I'm afraid I can't offer much on this specific scenario.

December 18th, 2012, 08:24 PM
Registered User
OK, thanks, but in the second proposal I think I was only asking how to index the docs, all of the docs in a single folder.
I'll do some more searching.

December 19th, 2012, 08:51 AM
Moderator
Ah, Example #2 on the cfindex docs page shows indexing a file path.

December 19th, 2012, 02:38 PM
Registered User
It worked!!!
In case anyone else tries something similar (and I understand this is pretty basic), here is what I did.
Here is the index I'll run nightly to keep the search updated.
<!--- Refresh the "docs" collection from the .doc and .docx files in the folder. --->
<cfindex
    collection="docs"
    action="refresh"
    type="path"
    key="my path"
    extensions=".doc, .docx"
    URLpath="my URL">
Then from the Key I can get the file name:
<cfset FileName=GetFileFromPath(Key)>
I'll use that variable in a query to get a list of documents from the database.
The one hiccup I had is getting the number from the key. I used something similar in another search, but all of the documents were .pdf so I could simply strip off the 4 characters on the right. I don't have anything like that this time. The file might be 1.doc or 1342.docx.
So I used this to get the number portion of the Key:
<cfset SearchFile = ReReplaceNoCase(FileName,"[^0-9,]","","ALL")>
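Another way to pull the number out, offered only as a sketch: because the files are named with the primary key followed by the extension, the file name can be treated as a dot-delimited list. `ListFirst`, `ListLast`, and `Val` are built-in CFML functions; the sample file name below is made up for illustration.

```cfml
<!--- Hypothetical sample value; in practice this comes from GetFileFromPath(Key). --->
<cfset FileName = "1342.docx">

<!--- Everything before the first dot is the primary key,
      whether the extension is .doc or .docx. --->
<cfset SearchFile = ListFirst(FileName, ".")>

<!--- The last dot-delimited element is the extension, if it is needed too. --->
<cfset FileExt = ListLast(FileName, ".")>

<!--- Val() returns 0 when the string does not start with a number,
      which makes it easy to skip an unexpected file name. --->
<cfset DocID = Val(SearchFile)>
```

This relies on the naming convention described earlier in the thread, so it would not suit file names that contain extra dots before the extension.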
Then I'll query the database to find all of those stripped Keys that will match the primary key.
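For anyone following along, here is a minimal sketch of how the search and the database lookup might fit together. It is not the exact code from this thread: the form field `form.criteria`, the datasource `myDSN`, and the table and column names (`Documents`, `DocID`, `Extension`) are all assumptions for illustration, and it uses `ListFirst` rather than the regex above to get the number from the file name.

```cfml
<!--- Search the collection that cfindex maintains.
      form.criteria is a hypothetical search box value. --->
<cfsearch name="SearchResults" collection="docs" criteria="#form.criteria#">

<!--- Collect the numeric IDs from the file names in the Key column. --->
<cfset IdList = "">
<cfloop query="SearchResults">
    <cfset DocID = Val(ListFirst(GetFileFromPath(SearchResults.Key), "."))>
    <cfif DocID GT 0>
        <cfset IdList = ListAppend(IdList, DocID)>
    </cfif>
</cfloop>

<!--- Only query the database when the search matched something.
      Datasource, table, and column names are placeholders. --->
<cfif ListLen(IdList)>
    <cfquery name="GetDocs" datasource="myDSN">
        SELECT DocID, Extension
        FROM Documents
        WHERE DocID IN (
            <cfqueryparam value="#IdList#" cfsqltype="cf_sql_integer" list="true">
        )
    </cfquery>
</cfif>
```

Passing the list through `cfqueryparam` with `list="true"` keeps the IN clause parameterized, and the `ListLen` check avoids running the query with an empty list.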
Thank you for your patience and guidance.
Cliff