#1
  1. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Location
    Southern New Jersey, USA
    Posts
    205
    Rep Power
    102

    Trying to Optimize a huge Jasper report


    Hello everyone, I am looking for ideas how I might attack this issue or what I might do. I have a Jasper report that runs many many sub reports that also have sub selects.

    Here is the main report with one line of sub reports...what or how I attack this issue. BTW the SQL Report Developer that wrote this stuff is gone...so I can not ask him. The problem now is that this report takes about 6 minutes to run and there is hope that it might be be able to spead this up...some. Any thoughts here please?

    Note: Any text that looks like $P{...} is a placeholder that gets filled in by the report engine.

    1: Main Report Dataset:

    SELECT
    GROUP_CONCAT( DISTINCT CASE WHEN ParentType_2='PARENT' THEN TRIM(BaseCurrencyCode_2)
    WHEN ParentType_3='PARENT' THEN TRIM(BaseCurrencyCode_3) ELSE '' END SEPARATOR "" ) AS 'ParentCurrencyCode',
    CASE
    WHEN TRIM( GROUP_CONCAT( DISTINCT CASE ParentType_3 WHEN 'PARENT' THEN BaseCurrencyCode_3 ELSE '' END SEPARATOR "" ) ) = 0 THEN
    ( SELECT cur1.CurrencyName FROM Q2DataMart.Currency cur1
    WHERE cur1.ConversionCode = 0 AND cur1.RateFormatCode = 0 AND cur1.CurrencyCode != TRIM( GROUP_CONCAT( DISTINCT CASE ParentType_2 WHEN 'PARENT' THEN BaseCurrencyCode_2 ELSE '' END ORDER BY ParentFundID_2 SEPARATOR "" ) ) )
    ELSE
    ( SELECT cur1.CurrencyName FROM Q2DataMart.Currency cur1
    WHERE cur1.CurrencyCode = TRIM( GROUP_CONCAT( DISTINCT CASE ParentType_2 WHEN 'PARENT' THEN BaseCurrencyCode_2 ELSE '' END ORDER BY ParentFundID_2 SEPARATOR "" ) ) )
    END AS 'ParentCurrencyName',
    -- GROUP_CONCAT( DISTINCT CASE ParentType_2 WHEN 'PARENT' THEN TRIM(FundID_2) ELSE '' END SEPARATOR "" ) AS 'ParentCode',
    GROUP_CONCAT( DISTINCT CASE WHEN ParentType_2='PARENT' THEN TRIM(FundID_2)
    WHEN ParentType_3='PARENT' THEN TRIM(FundID_3) ELSE '' END SEPARATOR "" ) AS 'ParentCode',
    GROUP_CONCAT( DISTINCT CASE ParentType_2 WHEN 'SUBFUND' THEN TRIM(FundID_2) ELSE '' END ORDER BY ParentFundID_2 SEPARATOR "/" ) AS 'SubFundParentCode',
    REPLACE( GROUP_CONCAT( DISTINCT CASE ParentType_1 WHEN 'CHILD' THEN TRIM(FundID_1) ELSE '' END SEPARATOR "/" ),'//','/' ) AS 'FundID',
    GROUP_CONCAT( DISTINCT CASE ParentType_2 WHEN 'PARENT' OR 'SUBFUND' THEN TRIM(FundName_3) ELSE '' END SEPARATOR "" ) AS 'ParentName',
    GROUP_CONCAT( DISTINCT CASE ParentType_2 WHEN 'SUBFUND' THEN TRIM(FundName_2) ELSE '' END ORDER BY ParentFundID_2 SEPARATOR "/" ) AS 'SubFundParentName',
    REPLACE( GROUP_CONCAT( DISTINCT CASE ParentType_1 WHEN 'CHILD' THEN TRIM(FundName_1) ELSE '' END SEPARATOR "/" ),'//','/' ) AS 'FundName'
    FROM (
    SELECT
    fnd.BaseCurrencyCode AS 'BaseCurrencyCode_1',
    fnd.FundName AS 'FundName_1',
    fnd.FundCode AS 'FundID_1',
    fnd.ParentID AS 'ParentFundID_1',
    fnd.EntityType AS 'FundEntityType_1',
    fnd.ParentType AS 'ParentType_1' ,
    fnd2.BaseCurrencyCode AS 'BaseCurrencyCode_2',
    CASE WHEN TRIM(fnd2.FundName)=TRIM(fnd3.FundName) THEN ' ' ELSE fnd2.FundName END AS 'FundName_2',
    fnd2.FundCode AS 'FundID_2',
    fnd2.ParentID AS 'ParentFundID_2',
    fnd2.EntityType AS 'FundEntityType_2',
    fnd2.ParentType AS 'ParentType_2',
    CASE WHEN fnd2.ParentType='PARENT' THEN fnd2.FundName ELSE fnd3.FundName END AS 'FundName_3',
    fnd3.FundCode AS 'FundID_3',
    fnd3.ParentID AS 'ParentFundID_3',
    fnd3.EntityType AS 'FundEntityType_3',
    fnd3.ParentType AS 'ParentType_3',
    fnd3.BaseCurrencyCode AS 'BaseCurrencyCode_3'
    FROM Q2DataMart.Fund fnd, Q2DataMart.Fund fnd2 LEFT JOIN Q2DataMart.Fund fnd3 ON (fnd2.ParentID = fnd3.FundCode)
    WHERE
    (fnd.FundCode IN (select fnd.FundCode AS 'FundID' from Q2DataMart.Fund fnd where fnd.FundCode = ($P{qparam_ParentFundCode_STRING}) OR fnd.ParentID = ($P{qparam_ParentFundCode_STRING}) )
    OR fnd.ParentID IN (select fnd.FundCode AS 'FundID' from Q2DataMart.Fund fnd where fnd.FundCode = ($P{qparam_ParentFundCode_STRING}) OR fnd.ParentID = ($P{qparam_ParentFundCode_STRING}) ) )
    AND
    CASE WHEN fnd.ParentType='CHILD' THEN fnd.ParentID=fnd2.FundCode
    WHEN fnd.ParentType='SUBFUND' THEN fnd.ParentID=fnd2.FundCode
    ELSE fnd.FundCode=fnd2.ParentID END
    AND fnd.ParentID != 0) A
    GROUP BY
    CASE WHEN ParentType_2='PARENT' THEN TRIM(FundID_2)
    WHEN ParentType_3='PARENT' THEN TRIM(FundID_3) END
    ;


    2: Summary Asset Allocation by Asset Class

    <snipped the rest as it was too long to look into>
  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,146
    Rep Power
    4274
    Originally Posted by ByGoneYrs
    Hello everyone, I am looking for ideas how I might attack this issue or what I might do.
    hire someone

    the stuff you posted is a classic example of TL;DR

    not to mention the insane complexity of it all
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  4. #3
  5. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Location
    Southern New Jersey, USA
    Posts
    205
    Rep Power
    102
    Originally Posted by r937
    hire someone

    the stuff you posted is a classic example of TL;DR

    not to mention the insane complexity of it all
    Ok first off what does TL;DR stand for?

    Secondly I did not think that someone was going to answer the question of how to fix it...but I was hoping to have at least some ideas how one might attack the whole process?

    Now I was thinking maybe seeing if adding some indexes on some of the tables might help speed it up? Also maybe run Explain on each querry might also help?

    Now too I guess I would like someone to explain in theory what needs to be done here and how realistically this might be approached / attacked to figure it all out.

    My thought is a Data Architect is needed along with a Sql Developer whom knows the business needs of the Financial data would be extremely helpful. along with someone that also knows Jasper reports. Now this is only my guess, but I am hoping to get a little bit more details of how to explain to upper management what and why said other resources are needed to figure this out so we can push to get what ever is needed.

    Thnaks...
  6. #4
  7. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,146
    Rep Power
    4274
    Originally Posted by ByGoneYrs
    Ok first off what does TL;DR stand for?
    too long; didn't read

    it means your queries are far too complex to even begin to analyze in a forum environment


    Originally Posted by ByGoneYrs
    My thought is a Data Architect is needed along with a Sql Developer whom knows the business needs of the Financial data would be extremely helpful.
    a senior sql developer will do, and he/she will need to pick up the financial needs of the business

    data architect at this point wouldn't help (not unless you plan to design and build new a completely database)

    knowing jasper reports can probably be picked up by the senior sql developer

    like i said, you'll probably want to hire someone
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  8. #5
  9. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Location
    Southern New Jersey, USA
    Posts
    205
    Rep Power
    102
    Ok I have been asked to try to flow chart this tangled mess, and it would appear that each report then calls sub-reports, that then call other sub reports and the tree goes on.

    That being said for Jasper reports, is that the correct way to write things or should it be a single long report, not reports nested in other reports nested still in others...all nneding to work to get the main report to work?
  10. #6
  11. No Profile Picture
    Problem Solver
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jan 2001
    Location
    Stockholm, Sweden
    Posts
    4,495
    Rep Power
    534
    Originally Posted by ByGoneYrs
    Ok I have been asked to try to flow chart this tangled mess, and it would appear that each report then calls sub-reports, that then call other sub reports and the tree goes on.

    That being said for Jasper reports, is that the correct way to write things or should it be a single long report, not reports nested in other reports nested still in others...all nneding to work to get the main report to work?
    If you by reports mean queries then no nested queries are prone to be slow by nature. BUT they are usually easier to write since you can break down a complex problem into smaller ones and be able to test each query independently so I guess that is why they are so commonly (miss)used.

    But regarding your performance problem here are some quick thoughts.
    Unless you are missing some majorly important indexes (have you check index usage with EXPLAIN [your query]) the hotspots in your query are probably:
    1. The query is using a strange join condition
    Code:
    FROM Q2DataMart.Fund fnd, Q2DataMart.Fund fnd2 LEFT JOIN Q2DataMart.Fund fnd3 ON (fnd2.ParentID = fnd3.FundCode)
    ...
    AND
    CASE WHEN fnd.ParentType='CHILD' THEN fnd.ParentID=fnd2.FundCode
    WHEN fnd.ParentType='SUBFUND' THEN fnd.ParentID=fnd2.FundCode
    ELSE fnd.FundCode=fnd2.ParentID END
    Where you are using the older join syntax table1, table2 with a strange CASE WHEN .... THEN ... ELSE ... END.
    If this Fund table contains a decent amount of records then that condition is probably going to slow things down a lot since it needs to evaluated the CASE ... for a lot of rows in the join.

    2. The query is using a bunch of sub-queries in the WHERE clause which can also be very slow since they are evaluated for each record for the outer query. A better approach is to write queries like this as a derived table instead. That way the query to create the derived table is only evaluated once and then the join is usually quick to perform.


    Now the question for you is: -"What is your goal for the execution time?". How much time and money are you willing to put down to rewrite this query? Compared to if you possibly upgrade the hardware instead? Have you checked how much load the server is under when you execute this query? And is it I/O or CPU? Because if it is IO and all RAM on the server is used then you can usually add some more RAM to avoid disk reads and speed things up. Which can be a quick and easy fix if you have a small amount of RAM on the server. I'm just mentioning it as it can sometimes be an easy and pretty cheap fix compared to how much time a rewrite can cost.

    And then we have the rewrite option which must be combined with checking indexes etc to see what can be optimized.
    /Stefan
  12. #7
  13. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Location
    Southern New Jersey, USA
    Posts
    205
    Rep Power
    102
    Thank you very much sir...

    We are planning on increasing the memory on the box/Zone for Solaris. We will see how that does to begin with and then long term the reports need to be rebuilt as well too.

IMN logo majestic logo threadwatch logo seochat tools logo