    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    29
    Rep Power
    0

    Microsoft Word Document Manipulation


    I was wondering if there is a better way of working with .doc/.docx Microsoft Word documents in ColdFusion 9. I want to be able to manipulate wildcards (placeholders) inside the documents and also fill in checkboxes and such. I know all of this can be done from Microsoft Access using VBA.

    I know how to read an .rtf document as a file, find/replace the wildcards within it, and then output the document via cfheader/cfcontent. But this is extremely tedious and a pain to do. Also, converting the documents from .doc to .rtf makes the files three times as big.

    I've seen stuff on Google with using ColdFusion with Apache POI... can that do what I want? I don't know Java at all, just ColdFusion, JavaScript and HTML, so if it requires doing heavy Java programming that might not work... any ideas?

    Also, does anybody know if ColdFusion X (10) will have better integration with Microsoft Office?
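
For reference, the .rtf find/replace approach described above can be sketched roughly as follows. This is only an illustration in Python, not ColdFusion code: the `%%NAME%%` placeholder syntax and the function names `rtf_escape` and `fill_rtf` are assumptions made up for this example. It also assumes each placeholder survives as one contiguous run of text in the .rtf file, which Word does not guarantee.

```python
import re


def rtf_escape(value: str) -> str:
    """Escape a plain string so it can be pasted into RTF body text."""
    out = []
    for ch in value:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit (N is signed 16-bit).
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


def fill_rtf(rtf_text: str, values: dict) -> str:
    """Replace %%KEY%% placeholders (hypothetical syntax) with escaped values."""
    def repl(match):
        key = match.group(1)
        if key not in values:
            return match.group(0)  # leave unknown placeholders untouched
        return rtf_escape(values[key])

    return re.sub(r"%%(\w+)%%", repl, rtf_text)


template = r"{\rtf1\ansi Dear %%NAME%%, your total is %%TOTAL%%.}"
print(fill_rtf(template, {"NAME": "J{o}", "TOTAL": "10"}))
```

The escaping step is the part that is easy to forget: a value containing a brace or backslash would otherwise corrupt the document.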
  2. #2
  3. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,265
    Rep Power
    968
    ColdFusion Zeus is still in closed alpha, so I'm afraid I can't comment on what may be coming in the next version yet (partly because it isn't complete yet). But I don't believe CF 9 has this built in. As you mentioned, you can use the Apache POI library to do things like this. Java experience would help, but most of the effort would be in learning and using the POI API. There are also a number of .NET libraries that will do this, but in the same vein as POI, it would probably be difficult without at least a bit of C# experience and the ability to use the library APIs.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2008
    Posts
    131
    Rep Power
    7
    ... and aside from the learning curve, POI's Word package is not nearly as mature as the Excel package. So I probably would not recommend it for this task.
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    29
    Rep Power
    0
    I think what I am going to end up doing is converting all of these Word docs into HTML (trying to mimic their original format as best I can), then manipulating them with ColdFusion and outputting them as PDF documents. This will probably take me a long time, but I don't see any better alternative.

    I don't think this is ColdFusion's fault, since I don't think any server-side language really has prebuilt features for Office document manipulation besides the .NET languages (since Office is in their family of products, like PDF is in Adobe's).
  8. #5
  9. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,265
    Rep Power
    968
    Just to expand on that, even in .NET you'll likely end up falling back to third-party libraries for this. I've actually done exactly what you're talking about in a .NET web application (using C#) and used Aspose.Words.
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    29
    Rep Power
    0
    Looks like converting doc/docx into HTML is destroying the format. So can Apache POI manipulate Word docs (such as marking a checkbox as checked)? I was looking on their website and they do not have much information regarding HWPF/XWPF... even their project plan link is dated 2003... are there any other open source libraries/APIs out there?

    Aspose.Words looks like it can do what I want, but no way would my company pay $3,000 for a Java library to be used on one project.

    I wonder if it is possible to read/write to placeholders if I convert the documents into PDF via CFDocument/CFPDF?
  12. #7
  13. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,265
    Rep Power
    968
    I don't think CFPDF goes to that level of detail but I believe something like iText probably would.
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    29
    Rep Power
    0
    Looks like the open source Java library Docx4J can do what I need (it can even convert to PDF as well).

    Does anybody know how I would go about using this Java library in ColdFusion? Or would I need to do most of it in Java and then load it up in ColdFusion? I'm completely new to using external Java libraries, and I'm willing to learn enough Java to get this thing to work.
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2008
    Posts
    131
    Rep Power
    7
    I tried docx4J a while back and was never able to get it to work from CF. Nothing against the library. However, it unfortunately uses the JAXB library (which is already built into CF), and I could never get past all the class loader conflicts.
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    29
    Rep Power
    0
    Was this with ColdFusion 9?

    If ColdFusion can't play well with Docx4J, and Docx4J is based on Microsoft's OpenXML SDK (which I'm guessing is .Net), is there a way to use OpenXML instead with ColdFusion?
  20. #11
  21. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2008
    Posts
    131
    Rep Power
    7
    No, it was with CF8. I never bothered re-testing under CF9 because it includes JAXB too, and I decided I was not up for another round of jar hell.

    I have only used OpenXML tangentially (old document viewer). So I do not know if it supports what you need to do.
  22. #12
  23. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,265
    Rep Power
    968
    I'd try using Mark Mandel's JavaLoader component, which I think lets you set up a separate parent classloader from the one CF uses.

    Regardless, when using a Java API from CF you shouldn't normally have to write any Java. You just make API calls to the library from your CFML code.
  24. #13
  25. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2008
    Posts
    131
    Rep Power
    7
    Oh, believe me, I tried JavaLoader too, but no joy. If memory serves, the tricky part was that there were multiple versions of JAXB (1.x/2.x) in the class path to begin with. Something about it being pre-bundled with Sun's JVM after a certain point. I cannot remember all the details, just that it was not the typical class path conflict I was used to dealing with.
  26. #14
  27. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    29
    Rep Power
    0
    Yeah that sounds like no fun. I think I'll go with using the Open XML SDK. I've been reading MSDN's website and it looks like it can do everything I need (find/replace placeholders, and the ability to check checkboxes). Even Aspose.Words is based off it (albeit, it makes it simpler).

    CFObject can interact with .NET, but I'm not sure if I'll have to do all the coding in C# and then load up the class files inside ColdFusion. I've never done anything like this before, but I'm out of options.
  28. #15
  29. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2008
    Posts
    131
    Rep Power
    7
    I am not sure. Depends on the code. In theory you can invoke most .net code from CF. But there are some exceptions.

    "Even Aspose.Words is based off it (albeit, it makes it simpler)."
    Yep. Since the later *.docx format is basically just a zip file, theoretically you could do it yourself with just file, string, and XML functions. But since the overall schema is so incredibly complicated, using a wrapper library makes more sense.
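
To make the "it's just a zip file" point concrete, here is a minimal sketch of the do-it-yourself route using only zip and string handling. It is written in Python for illustration, not CFML. The `%%NAME%%` placeholder syntax and the names `fill_docx` and `make_sample` are assumptions invented for this example, and `make_sample` builds only a stand-in archive with a single `word/document.xml` part, not a complete document that Word would open. It also assumes each placeholder sits inside a single text run; Word frequently splits text across runs, which is one reason a wrapper library is usually the safer choice.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# The main body of a .docx lives in this part of the zip package.
DOCUMENT_PART = "word/document.xml"


def fill_docx(docx_bytes: bytes, values: dict) -> bytes:
    """Copy a .docx zip, replacing %%KEY%% placeholders in the main part."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == DOCUMENT_PART:
                xml = data.decode("utf-8")
                for key, value in values.items():
                    # Values must be XML-escaped before going into the markup.
                    xml = xml.replace("%%" + key + "%%", escape(value))
                data = xml.encode("utf-8")
            dst.writestr(item, data)  # all other parts are copied unchanged
    return out.getvalue()


def make_sample() -> bytes:
    """Build a stand-in archive with one paragraph (not a full .docx)."""
    ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    body = ('<w:document xmlns:w="%s"><w:body><w:p><w:r>'
            '<w:t>Dear %%%%NAME%%%%</w:t>'
            '</w:r></w:p></w:body></w:document>') % ns
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr(DOCUMENT_PART, body)
    return buf.getvalue()


filled = fill_docx(make_sample(), {"NAME": "Smith & Co"})
with zipfile.ZipFile(io.BytesIO(filled)) as z:
    print(z.read(DOCUMENT_PART).decode("utf-8"))
```

Checking a checkbox would mean editing specific elements in the same XML part rather than swapping text, which is where the schema complexity mentioned above starts to bite.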