#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    18
    Rep Power
    0

    Quickbooks to .XLS/.XML, then to SQL.


    Hello Dev Shed, I will try to keep this as brief as possible. I want to export a Standard Profit & Loss from QuickBooks Pro 2012 into a .XLS file, then convert it to .XML so CF(vers 8) can parse the data, extract the relevant fields, and put them into a row in a SQL database. It would be IDEAL to have all of this set up in a batch file (for Windows Server 08) and then to run the batch file from Windows Scheduled Tasks on the first day of each month but at this point simply getting the data into a usable format is my primary goal.

    So after extensive searching, I found this website, and reproduced the code almost verbatim w/ a few minor changes to reflect what I was trying to do. I also deleted the first sheet in the spreadsheet (.XLS format) as it was a graphic inserted by the QB export function.

    XL.cfm:

    Code:
    <cfoutput>
    <cffile action="read" file="C:\inetpub\wwwroot\kcinvtest\PL.xml" variable="xmlImport" />
    
    <cfset xmlImport = XmlParse(xmlImport) />
    
    <cfset aRowNodes = xmlImport.Workbook.Worksheet.Table.XmlChildren />
    
    <!--- Set up the Column names in the Workbook --->
    <cfset lstColumns = 'NA,TITLE,SECTION,CATEGORY,SUBCAT,AMOUNT' />
    
    <cfset qryExcel = QueryNew(lstColumns) />
    
    <cfloop from="1" to="#ArrayLen(aRowNodes)#" index="index">
      <cfif aRowNodes[index].XmlName eq "Row">
        <cfif ArrayLen(aRowNodes[index].XmlChildren) eq ListLen(lstColumns)>
          <cfset QueryAddRow(qryExcel) />
    
          <cfloop from="1" to="#ListLen(lstColumns)#" index="colindex">
            <cfset colname = ListGetAt(lstColumns, colindex) />
            <cfif ArrayLen(aRowNodes[index].XmlChildren) gte colindex And ArrayLen(aRowNodes[index].XmlChildren[colindex].XmlChildren) neq 0>
              <cftry>
                <cfset QuerySetCell(qryExcel, colname, aRowNodes[index].XmlChildren[colindex].XmlChildren[1].XmlText) />
                <cfcatch>
                  <cfdump var="#aRowNodes[index]#" />
                  <cfdump var="#cfcatch#" />
                  <cfset QuerySetCell(qryExcel, colname, "#XmlText#") />
                </cfcatch>
              </cftry>
            <cfelse>
              <!--- blank node --->
              <cfset QuerySetCell(qryExcel, colname, "") />
            </cfif>
          </cfloop>    
        </cfif>
      </cfif>
      
    </cfloop>
    </cfoutput>
    I convert the .XLS file to .XML using OpenOffice 3 and upon opening it w/ OpenOffice, all looks well, I can see all the tags and their children. I put it in the same directory as the XL.cfm page (which is the path in the "file" attrib in the cffile tag) and execute the page. Here's where it gets weird: In FireFox and Chrome v.30, I get a timeout error, in IE8, the page 'loads' but there is nothing there...absolutely nothing, just a blank white page. I even went as far as trying to comment out the cfcatch/try sections and just tried to print aRowNodes[index].XmlChildren[colindex].XmlChildren[1].XmlText at each iteration of the lower-level loop, hoping to get (in whatever raw format) a few pieces of data from the spreadsheet, but to no avail...however a cfdump of xmlImport yields (in IE8 ONLY) a huge struct with all of the xml data accounted for.

    What gives? Did I miss something when I changed the column names (lstColumns)? Any guesses as to why I cannot pull data by an individual key in the lower-level loop? Also any pointers on XML and what I may need to change to adapt this code to my needs? FYI the spreadsheet ends up being about 6 columns wide and 25-50 rows deep, and I have no concern about the formatting attribs in the .XML file, just the data itself. Yes, I know I can export to CSV but then I'll be in the same boat and I'd really like to start learning more about XML and getting data from XML to SQL using CF. Thank you in advance for your help, DevShed!
  2. #2
  3. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,242
    Rep Power
    967
    You can check the CF logs to see if any exceptions are showing up there. You can also add some cfdump and cfabort tags at different spots in the code to try and determine how far it gets before it gets hung up.

    How large is the XML file you're parsing?
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    18
    Rep Power
    0
    I get this message in CF Admin logs no matter which browser I use:

    "The request has exceeded the allowable time limit Tag: cfoutput The specific sequence of files included or processed is: C:\inetpub\wwwroot\KCINV\xl2.cfm''

    I tried putting a cfdump of "XmlImport" in various spots in the code and got only one (the first one) to show up (it was after the "<cfset aRowNodes = xmlImport.Workbook...." line 8ish). Will try again commenting out all but one at a time and see what that yields as I got the timeout error again, i.e. "The request has exceeded the allowable time limit Tag: cfoutput" 500 error.

    The file size of PL.xml is 281 KB.

    Quick Edit: I just added "#XmlText#" inside the bottom-level loop hoping to at least get data output but to no avail...but this time both IE8 and FF(v.21) show a blank page. Also can all these "SS" references (style sheets, I assume?) and other attribs in the XML file be the cause of the timeouts? And if so, how would I negate that? Wipe out all design/formatting in OpenOffice before exporting to XML? Thanks again!
  6. #4
  7. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,242
    Rep Power
    967
    What is the timeout currently set at in the CF administrator? You can try bumping it up to see if it finishes. But it shouldn't take all that long to parse a 200k XML file.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    18
    Rep Power
    0
    Timeout was @ 60s, I bumped it up to 360 and it's still timing out...an interesting thing, I changed the cfdumps to dump the url struct (empty) and put all ten instances of it back in the code and it crashed Firefox. So thinking I was using too much of the webservers resources (we have @ 30-35 users on this webserver at any given time) I changed this part of the code to include counters:

    Code:
    <cfloop from="1" to="#ArrayLen(aRowNodes)#" index="index">
      <cfset topctr = topctr + 1>
      <cfif aRowNodes[index].XmlName eq "Row">
        <cfset btmctr = btmctr + 1>
    Then I go to the page and I show 4101 iterations for "topctr" and 4096 iterations for "btmctr." Does this jive w/ the filesize? Maybe there is some sort of corruption occurring during the conversion from .XLS to .XML? I am admittedly a novice at best when it comes to .XML.

    Eventually, I would add code (if I can get it working correctly) to this that would create a struct before the loops, at the lowest-level loop say something like:

    Code:
    <cfif left(aRowNodes[index].XmlChildren[colindex].XmlChildren[1].XmlText,5) EQ '51010'>
      <cfset myStruct.51010 = val(XmlText)>
    </cfif>
    ...then at the bottom of the code, loop thru the struct keys and insert into SQL the appropriate dollar amounts from the .XML. The '51010' in case you're wondering is an index we use internally in QB to indicate that this money goes to a certain liability, such as company car gas, repair, or building insurance, etc. So the relevant fields in SQL (per record) are date, id#, and then all these indices, 51010, 51011, 51020, etc. Hopefully that is not just making this more confusing as to what I want to do and actually clears it up a little. As always thank you all for your help with this, it's really driving me batty.
  10. #6
  11. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,242
    Rep Power
    967
    Are you saying you output the counters after XML is looped over and the query is built up? If, so that means the timeout is happening after the query is populated.

    You absolutely should not be testing this on a live server, especially one with 35 other concurrent users. Install CF locally and test it that way.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    18
    Rep Power
    0
    Okay, here's the latest update to the XL.cfm page, and I've moved it to a VM w/ CF dev ed installed so I'm not taxing the production server:

    Code:
    <cfoutput>
    <cffile action="read" file="C:\inetpub\wwwroot\kcinv\PL.xml" variable="xmlImport" />
    
    <cfset xmlImport = XmlParse(xmlImport) />
    <cfset myStruct = StructNew() />
    <cfset aRowNodes = xmlImport.Workbook.Worksheet.Table.XmlChildren />
    
    <!--- Set up the Column names in the Workbook --->
    <cfset lstColumns = 'NA,TITLE,SECTION,CATEGORY,SUBCAT,AMOUNT' />
    
    <cfset qryExcel = QueryNew(lstColumns) />
    
    <cfset topctr = 0>
    <cfset btmctr = 0>
    <cfloop from="1" to="#ArrayLen(aRowNodes)#" index="index">
      <cfset topctr = topctr + 1>
      <cfif aRowNodes[index].XmlName eq "Row">
        <cfset btmctr = btmctr + 1>
        <cfif ArrayLen(aRowNodes[index].XmlChildren) eq ListLen(lstColumns)>
          
          <cfset QueryAddRow(qryExcel) />
          
          <cfloop from="1" to="#ListLen(lstColumns)#" index="colindex">
            
            <cfset colname = ListGetAt(lstColumns, colindex) />
            
            <cfif ArrayLen(aRowNodes[index].XmlChildren) gte colindex And ArrayLen(aRowNodes[index].XmlChildren[colindex].XmlChildren) neq 0>
              
              <cfif left(trim(aRowNodes[index].XmlChildren[colindex].XmlChildren[1].XmlText),5) EQ '51010'>
    		        <cfset myStruct.51010 = trim(aRowNodes[index].XmlChildren[colindex].XmlChildren[1].XmlText)>
    	        </cfif>
    
              <cftry>
                <cfset QuerySetCell(qryExcel, colname, aRowNodes[index].XmlChildren[colindex].XmlChildren[1].XmlText) />
              <cfcatch>
                  <cfdump var="#aRowNodes[index]#" />
                  <cfdump var="#cfcatch#" />
                  <cfset QuerySetCell(qryExcel, colname, "#XmlText#") />
              </cfcatch>
              </cftry>
            <cfelse>
              <!--- blank node --->
              <cfset QuerySetCell(qryExcel, colname, "") />
            </cfif>
            #XmlText#
          </cfloop>    
        </cfif>
      </cfif>
      
    </cfloop>
    #topctr#
    #btmctr#
    <cfdump var="#myStruct#">
    </cfoutput>
    My output is:
    "4101 4096
    |struct[empty]|"

    I must be missing something here because I'm expecting at least one key in this structure, namely .51010 with a value of "51010 - BOB'S VEHICLE REPAIR AND MAINTENANCE" or the like. I understand the depth of the XML struct and the [index] and [colindex] indices, though I'm assuming that I don't necessarily need those in there because they are in a
    loop (I will leave them unless otherwise advised as I believe it to be 'correct' programming). The one thing that confuses me is the call for the first index at the "XmlChildren[1]" level. Is this why I'm not getting the XmlText string? I know for a fact this value is in the XML file, I opened it in a text editor and did a search for '51010' and found it.

    Again, I just copied this code verbatim from the aforementioned site, then changed the column names and the file attrib in the cffile tag. I get the gist of what it's supposed to be doing but I'd be remiss not to mention that while I'm familiar with using the cfdump tag, I don't really know how to use cftry/catch...isn't it supposed to be throwing the cfdumps w/ that bit of code in there? Let me know what you think, thanks again Kiteless.
  14. #8
  15. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,242
    Rep Power
    967
    If it makes it through the loops and shows the counters, then I don't really understand what the problem is as far as the timeout goes. If it's outputting the counters, then the XML processing is finishing. Which means the timeout would have to be happening somewhere further along that you're not showing.

    As far as why a specific key wouldn't be in your results, I can't really say. It would mean the processing logic isn't doing exactly what you want. But I can't tell you what the specific problem might be...that's something you'd need to debug to figure out. There's always this danger when using code you didn't write, since the original author wasn't doing exactly what you're doing or using exactly the same data.
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    18
    Rep Power
    0
    I think I'm understanding now that the timeouts were due to the interspersed <cfdump var="#xmlImport#"> lines in the code when I was debugging it. But now I'm at a bit of an impasse and I suppose my question has changed a bit, while my understanding of XML is still limited at best.

    I'm going to revise the code to obtain just the "SUBCAT" and "AMT" columns and will post here if it doesn't yield the expected output, otherwise I'll consider the thread resolved. Thank you again for your help, Kiteless.
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2008
    Posts
    131
    Rep Power
    6
    Originally Posted by BrandonG
    export a Standard Profit & Loss from QuickBooks Pro 2012 into a .XLS file
    I do not have an answer for you about the XML, but just a thought ... are you exporting to XLS (binary) or XLSX (ooxml)? If it is the former, have you looked into using Apache POI to read the file? An older version is built into CF8 and can be used with createObject.

    (It is also possible to parse the newer xlsx format as well, but that is a little more complicated. Requires updating the POI jars or using the JavaLoader.cfc)

IMN logo majestic logo threadwatch logo seochat tools logo