    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2013
    Posts
    12
    Rep Power
    0

    Postgres performance tuning


    Server configuration:
    Processor: 2 x Intel Quad core E5620 @ 2.40GHz
    RAM: 16 GB

    Postgres configuration:
    Effective cache size = 10 GB
    shared Buffer = 1250 MB
    random page cost = 4

    Table size = 60 GB
    Number of records = 44 million
    Index, multi column, 3 columns, matches exactly with query where condition.

    Query:
EXPLAIN (analyze, buffers) SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC;
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------
    Sort (cost=3422863.06..3422868.69 rows=2254 width=44) (actual time=137852.474..137852.474 rows=5 loops=1)
    Sort Key: (sum(call_amount_recv))
    Sort Method: quicksort Memory: 25kB
    Buffers: shared read=2491664
    -> HashAggregate (cost=3422664.28..3422737.53 rows=2254 width=44) (actual time=137852.402..137852.454 rows=5 loops=1)
    Buffers: shared read=2491664
    -> Bitmap Heap Scan on detailed_report (cost=644828.11..3399506.87 rows=1029218 width=44) (actual time=4499.558..125443.122 rows=5248227 loops=1)
Recheck Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time zone) AND ((group_id)::text = 'admin'::text) AND ((client)::text = 'CHOICE'::text))
    Buffers: shared read=2491664
    -> Bitmap Index Scan on endtime_groupid_client_tsidx_detail_report (cost=0.00..644570.81 rows=1029218 width=0) (actual time=3418.754..3418.754 rows=5248227 loops=1)
Index Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time zone) AND ((group_id)::text = 'admin'::text) AND ((client)::text = 'CHOICE'::text))
    Buffers: shared read=95055
    Total runtime: 137868.946 ms
    (13 rows)

Can someone guide me on bringing the time down from hundreds of seconds to a few seconds?
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    158
    Rep Power
    10
What do your indexes look like?
Are your planner statistics up to date? (run VACUUM ANALYZE)
Does the query perform better without the ORDER BY?
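For reference, a minimal way to do both of those checks (table name taken from the query above; pg_indexes is the standard catalog view):

```sql
-- List every index defined on the table, with its definition:
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'detailed_report';

-- Reclaim dead space and refresh the planner statistics for this table:
VACUUM ANALYZE detailed_report;
```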
  4. #3

    postgres performance tuning


    thanks for the response:
    1. Indexes:
"endtime_groupid_client_tsidx_detail_report" btree (end_time DESC, group_id, client), tablespace "indexspace", which exactly matches the WHERE condition:
"WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway;"
The index is in a separate tablespace on another hard disk.

2. The above results were taken after running VACUUM ANALYZE.

3. There was no improvement after removing the ORDER BY.

The above is with six months of records; around 10 million records are added every month. This table is used only for reporting/reads.

Writes happen through bulk insertion, once a day.

4. After increasing random_page_cost to 25, the planner switches to a sequential scan (Seq Scan on detailed_report), and the time taken is comparatively better than with the indexed scan. But I would prefer not to rely on this, because the sequential-scan performance will degrade as the data grows.

    Please let me know if you require any other details.
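As an aside, the random_page_cost experiment described in point 4 can be confined to a single session, so the server-wide setting stays untouched; a sketch:

```sql
-- Applies only to the current connection:
SET random_page_cost = 25;

-- ... re-run the report query / EXPLAIN here to compare plans ...

-- Return to the value from postgresql.conf:
RESET random_page_cost;
```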
  6. #4
Databases use indexes to find records and limit the number of results as quickly as they can, using statistics. The order in which you put the conditions in the WHERE clause has practically no effect on how the database evaluates them. If the database decides that filtering on the date is more work than filtering on the client, it will filter on the client first.

So, adding all the columns of the WHERE clause to the same index may not even be the fastest solution; it depends very much on how many rows can be eliminated by the first column, and then how many of those can be eliminated by the second, etc. If your date range leaves 2 million rows and group_id='admin' reduces that to 400k, then putting group_id first in the index may be significantly faster.

Also, gateway is not included in the index, but it is being used in the GROUP BY; it may be worth a try to put client and gateway in a separate index, or both together at the beginning of the index.

Another option you can definitely try is partial indexes. Instead of creating one index to rule them all, you could create indexes that only contain the rows where group_id='admin' and client='CHOICE'. That means the database only has to apply the date range, and you could even create indexes per week or something.

I don't think it's worth messing about with the rest of the config; those settings should reflect real-world values. My guess is that this issue comes down to the index and how the query is handled by the database.
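A partial index of the kind suggested above might look like this (the index name is made up for illustration):

```sql
-- Only rows matching the fixed predicates are stored, so the index
-- stays small and the date range is the only condition left to scan:
CREATE INDEX choice_admin_endtime_idx
    ON detailed_report (end_time)
    WHERE group_id = 'admin' AND client = 'CHOICE';
```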
  8. #5

    Postgres Performance tuning


hi,
thank you so much for going through the details and providing suggestions. As per your suggestion, I carried out three sets of tests:
1. Single-column indexes
2. A multi-column index with two columns
3. A multi-column index with three columns
The response time is still very high; please suggest how to bring it down. The details are attached.
    Test Case 1:
    ************
    indexes :
    1)client
    2)group_id
    3)gateway
    4)end_time

    "det_rep_pkey" PRIMARY KEY, btree (group_id, call_id, start_time)
    "client_detailed_report_idx" btree (client), tablespace "indexspace"
    "end_time_detailed_report_idx" btree (end_time), tablespace "indexspace"
    "gateway_detailed_report_idx" btree (gateway), tablespace "indexspace"
    "group_id_detailed_report_idx" btree (group_id), tablespace "indexspace"

    testdb=# EXPLAIN (analyze,buffers,verbose)SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC;
    QUERY PLAN

    Sort (cost=3510106.93..3510112.25 rows=2127 width=44) (actual time=148557.599..148557.599 rows=5 loops=1)
Output: ((client)::text), ((gateway)::text), (count(*)), (((avg(duration) / 1000.0))::numeric(10,2)), ((avg(pdd))::numeric(10,2)), (((sum(call_duration_recv))::numeric / 1000.0)), (((sum(call_duration_pay))::numeric / 1000.0)), (sum(call_amount_recv)), (sum(call_amount_pay)), client, gateway
    Sort Key: (sum(detailed_report.call_amount_recv))
    Sort Method: quicksort Memory: 25kB
    Buffers: shared hit=69 read=2505035
    -> HashAggregate (cost=3509920.24..3509989.37 rows=2127 width=44) (actual time=148557.556..148557.581 rows=5 loops=1)
Output: (client)::text, (gateway)::text, count(*), ((avg(duration) / 1000.0))::numeric(10,2), (avg(pdd))::numeric(10,2), ((sum(call_duration_recv))::numeric / 1000.0), ((sum(call_duration_pay))::numeric / 1000.0), sum(call_amount_recv), sum(call_amount_pay), client, gateway
    Buffers: shared hit=69 read=2505035
    -> Bitmap Heap Scan on public.detailed_report (cost=832774.93..3487872.62 rows=979894 width=44) (actual time=14257.148..135355.676 rows=5248227 loops=1)
    Output: group_id, client, gateway, call_id, parent_call_id, start_time, connect_time, end_time, duration, source, source_alias, dest_in_number, dest_out_number, bp_code_pay, bi
    lled_duration_pay, rate_pay, rate_effective_date_pay, type_value_pay, slab_time_pay, pin_pay, amount_pay, adjusted_pin_pay, adjusted_amount_pay, call_amount_pay, country_code_pay, country_des
    c_pay, master_country_code, master_country_desc, bp_code_recv, billed_duration_recv, rate_recv, rate_effective_date_recv, type_value_recv, slab_time_recv, pin_recv, amount_recv, adjusted_pin_
    recv, adjusted_amount_recv, call_amount_recv, country_code_recv, country_desc_recv, subscriber_type, pdd, disconnect_reason, source_ip, dest_ip, caller_hop, callee_hop, caller_received_from_h
    op, callee_sent_to_hop, caller_media_ip_port, callee_media_ip_port, caller_original_media_ip_port, callee_original_media_ip_port, switch_ip, call_shop_amount_paid, version, call_duration_pay,
    call_duration_recv, audio_codec, video_codec, shadow_amount_recv, shadow_amount_pay, pulse_applied_recv, pulse_applied_pay
Recheck Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
    Buffers: shared hit=69 read=2505035
    -> BitmapAnd (cost=832774.93..832774.93 rows=979894 width=0) (actual time=13007.643..13007.643 rows=0 loops=1)
    Buffers: shared read=108495
    -> Bitmap Index Scan on client_detailed_report_idx (cost=0.00..172876.66 rows=7862413 width=0) (actual time=2546.204..2546.204 rows=7840766 loops=1)
    Index Cond: ((detailed_report.client)::text = 'CHOICE'::text)
    Buffers: shared read=21427
    -> Bitmap Index Scan on group_id_detailed_report_idx (cost=0.00..307105.20 rows=14971818 width=0) (actual time=4265.728..4265.728 rows=14945965 loops=1)
    Index Cond: ((detailed_report.group_id)::text = 'admin'::text)
    Buffers: shared read=40840
    -> Bitmap Index Scan on end_time_detailed_report_idx (cost=0.00..352057.65 rows=16790108 width=0) (actual time=3489.106..3489.106 rows=16917795 loops=1)
Index Cond: ((detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
    Buffers: shared read=46228
    Total runtime: 148558.070 ms
    (24 rows)

====================================================================================================

    Test Case 2:
    ************
    Indexes :
    1)client
    2)group_id
    3)gateway
    4)end_time
    5)client,group_id

    "det_rep_pkey" PRIMARY KEY, btree (group_id, call_id, start_time)
    "client_detailed_report_idx" btree (client), tablespace "indexspace"
    "clientgroupid_detailed_report_idx" btree (client, group_id), tablespace "indexspace"
    "end_time_detailed_report_idx" btree (end_time), tablespace "indexspace"
    "gateway_detailed_report_idx" btree (gateway), tablespace "indexspace"
    "group_id_detailed_report_idx" btree (group_id), tablespace "indexspace"


    testdb=# EXPLAIN (analyze,buffers,verbose)SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC;

    QUERY PLAN
    Sort (cost=3172381.37..3172387.11 rows=2297 width=44) (actual time=132725.901..132725.901 rows=5 loops=1)
Output: ((client)::text), ((gateway)::text), (count(*)), (((avg(duration) / 1000.0))::numeric(10,2)), ((avg(pdd))::numeric(10,2)), (((sum(call_duration_recv))::numeric / 1000.0)), (((sum(call_duration_pay))::numeric / 1000.0)), (sum(call_amount_recv)), (sum(call_amount_pay)), client, gateway
    Sort Key: (sum(detailed_report.call_amount_recv))
    Sort Method: quicksort Memory: 25kB
    Buffers: shared read=2472883
    -> HashAggregate (cost=3172178.48..3172253.13 rows=2297 width=44) (actual time=132725.861..132725.881 rows=5 loops=1)
Output: (client)::text, (gateway)::text, count(*), ((avg(duration) / 1000.0))::numeric(10,2), (avg(pdd))::numeric(10,2), ((sum(call_duration_recv))::numeric / 1000.0), ((sum(call_duration_pay))::numeric / 1000.0), sum(call_amount_recv), sum(call_amount_pay), client, gateway
    Buffers: shared read=2472883
    -> Bitmap Heap Scan on public.detailed_report (cost=434121.21..3149462.57 rows=1009596 width=44) (actual time=8257.581..120311.450 rows=5248227 loops=1)
    Output: group_id, client, gateway, call_id, parent_call_id, start_time, connect_time, end_time, duration, source, source_alias, dest_in_number, dest_out_number, bp_code_pay, bi
    lled_duration_pay, rate_pay, rate_effective_date_pay, type_value_pay, slab_time_pay, pin_pay, amount_pay, adjusted_pin_pay, adjusted_amount_pay, call_amount_pay, country_code_pay, country_des
    c_pay, master_country_code, master_country_desc, bp_code_recv, billed_duration_recv, rate_recv, rate_effective_date_recv, type_value_recv, slab_time_recv, pin_recv, amount_recv, adjusted_pin_
    recv, adjusted_amount_recv, call_amount_recv, country_code_recv, country_desc_recv, subscriber_type, pdd, disconnect_reason, source_ip, dest_ip, caller_hop, callee_hop, caller_received_from_h
    op, callee_sent_to_hop, caller_media_ip_port, callee_media_ip_port, caller_original_media_ip_port, callee_original_media_ip_port, switch_ip, call_shop_amount_paid, version, call_duration_pay,
    call_duration_recv, audio_codec, video_codec, shadow_amount_recv, shadow_amount_pay, pulse_applied_recv, pulse_applied_pay
Recheck Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
    Buffers: shared read=2472883
    -> BitmapAnd (cost=434121.21..434121.21 rows=1009596 width=0) (actual time=7101.419..7101.419 rows=0 loops=1)
    Buffers: shared read=76274
    -> Bitmap Index Scan on clientgroupid_detailed_report_idx (cost=0.00..74766.52 rows=2649396 width=0) (actual time=3066.346..3066.346 rows=7840766 loops=1)
    Index Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text))
    Buffers: shared read=30046
    -> Bitmap Index Scan on end_time_detailed_report_idx (cost=0.00..358849.64 rows=17114107 width=0) (actual time=2969.577..2969.577 rows=16917795 loops=1)
Index Cond: ((detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
    Buffers: shared read=46228
    Total runtime: 132726.073 ms
    (21 rows)



====================================================================================================
    Test Case 3:
    ************
    Indexes:
    Index :
    1)client
    2)group_id
    3)gateway
    4)end_time
    5)client,group_id
    6)client,group_id,end_time

    "det_rep_pkey" PRIMARY KEY, btree (group_id, call_id, start_time)
    "client_detailed_report_idx" btree (client), tablespace "indexspace"
    "clientgroupid_detailed_report_idx" btree (client, group_id), tablespace "indexspace"
    "clientgroupidendtime_detailed_report_idx" btree (client, group_id, end_time), tablespace "indexspace"
    "end_time_detailed_report_idx" btree (end_time), tablespace "indexspace"
    "gateway_detailed_report_idx" btree (gateway), tablespace "indexspace"
    "group_id_detailed_report_idx" btree (group_id), tablespace "indexspace"


testdb=# EXPLAIN (analyze, verbose) SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC;

    QUERY PLAN

    Sort (cost=2725603.99..2725609.46 rows=2188 width=44) (actual time=137713.264..137713.265 rows=5 loops=1)
Output: ((client)::text), ((gateway)::text), (count(*)), (((avg(duration) / 1000.0))::numeric(10,2)), ((avg(pdd))::numeric(10,2)), (((sum(call_duration_recv))::numeric / 1000.0)), (((sum(call_duration_pay))::numeric / 1000.0)), (sum(call_amount_recv)), (sum(call_amount_pay)), client, gateway
    Sort Key: (sum(detailed_report.call_amount_recv))
    Sort Method: quicksort Memory: 25kB
    -> HashAggregate (cost=2725411.50..2725482.61 rows=2188 width=44) (actual time=137713.192..137713.215 rows=5 loops=1)
Output: (client)::text, (gateway)::text, count(*), ((avg(duration) / 1000.0))::numeric(10,2), (avg(pdd))::numeric(10,2), ((sum(call_duration_recv))::numeric / 1000.0), ((sum(call_duration_pay))::numeric / 1000.0), sum(call_amount_recv), sum(call_amount_pay), client, gateway
    -> Bitmap Heap Scan on public.detailed_report (cost=37356.61..2703244.88 rows=985183 width=44) (actual time=3925.850..124647.660 rows=5248227 loops=1)
    Output: group_id, client, gateway, call_id, parent_call_id, start_time, connect_time, end_time, duration, source, source_alias, dest_in_number, dest_out_number, bp_code_pay, bi
    lled_duration_pay, rate_pay, rate_effective_date_pay, type_value_pay, slab_time_pay, pin_pay, amount_pay, adjusted_pin_pay, adjusted_amount_pay, call_amount_pay, country_code_pay, country_des
    c_pay, master_country_code, master_country_desc, bp_code_recv, billed_duration_recv, rate_recv, rate_effective_date_recv, type_value_recv, slab_time_recv, pin_recv, amount_recv, adjusted_pin_
    recv, adjusted_amount_recv, call_amount_recv, country_code_recv, country_desc_recv, subscriber_type, pdd, disconnect_reason, source_ip, dest_ip, caller_hop, callee_hop, caller_received_from_h
    op, callee_sent_to_hop, caller_media_ip_port, callee_media_ip_port, caller_original_media_ip_port, callee_original_media_ip_port, switch_ip, call_shop_amount_paid, version, call_duration_pay,
    call_duration_recv, audio_codec, video_codec, shadow_amount_recv, shadow_amount_pay, pulse_applied_recv, pulse_applied_pay
Recheck Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
    -> Bitmap Index Scan on clientgroupidendtime_detailed_report_idx (cost=0.00..37110.31 rows=985183 width=0) (actual time=2820.150..2820.150 rows=5248227 loops=1)
Index Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND ((detailed_report.group_id)::text = 'admin'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-07-01 00:00:00+00'::timestamp with time zone))
    Total runtime: 137728.029 ms
    (12 rows)
  10. #6
Interesting: the plans all indicate that the database tries the date range first, so apparently that reduces the number of rows the most.

    Also:

    Bitmap Heap Scan on public.detailed_report (cost=832774.93..3487872.62 rows=979894 width=44) (actual time=14257.148..135355.676 rows=5248227 loops=1)
A bitmap heap scan is when the database has found rows using an index and is now fetching them from the table for further processing. Apparently it wants to do something that requires having the row in memory.

Which version of PostgreSQL are you using? The newer versions can do index-only scans, which may remove the need for this very slow operation.

I see two things you can try. One: create a partial index on just the range of dates you are looking for. If that helps, you can create more partial indexes for similar date ranges.
Two: take your question to the gurus of the mailing lists: http://www.postgresql.org/list/
The actual contributors to the source code are very active there and can tell you a lot more about this than I can.
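Two quick follow-ups to the above, sketched under the assumption that the covering-index idea is worth testing (the index name is hypothetical, and whether the planner actually chooses an index-only scan also depends on the visibility map, i.e. on VACUUM having run recently):

```sql
-- Check the server version:
SELECT version();

-- An index containing every column the report reads could, in principle,
-- be answered without visiting the heap (an index-only scan):
CREATE INDEX choice_report_covering_idx
    ON detailed_report (client, group_id, end_time, gateway,
                        duration, pdd,
                        call_duration_recv, call_duration_pay,
                        call_amount_recv, call_amount_pay);
```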
  12. #7

    Postgres Performance Tuning


Thanks for the suggestion.
We are using Postgres 9.3.1.

As per your suggestion yesterday, I created a partial index for group_id = 'admin', but it did not improve the performance.

As you suggested, I will create a partial index on the date range and update you.
  14. #8
9.3.1 should be fine. Please take note that an important fix was released yesterday: http://www.postgresql.org/about/news/1492/

Did you include the other columns in the partial indexes too?

What you are looking for in the explain is as few steps as possible. The explain tells you which indexes it is using to solve which parts of the query; this is very clear in your previous post, where the explain says it is trying to solve the date clause using the only index it has that mentions the date, but then uses a different step to solve the rest of the query.

It also does a recheck, which means the index result is so large that the database can no longer identify every record uniquely and has to re-check against the table to find out which records the index actually filtered. Hence my suggestion to use partial indexes.

It is definitely worth a try to post your question on the mailing lists too.
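Including the other columns in the partial index, as asked above, could be sketched as (index name hypothetical):

```sql
-- group_id and client are fixed by the WHERE clause of the index itself,
-- so only the date range is evaluated, and gateway is at hand for the
-- GROUP BY without extra lookups:
CREATE INDEX choice_admin_endtime_gateway_idx
    ON detailed_report (end_time, gateway)
    WHERE group_id = 'admin' AND client = 'CHOICE';
```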
  16. #9

    Postgres Performance tuning


thanks for the suggestion.
As you suggested, I created a partial index on the date range, but the response time is still high.
  18. #10
    What indexes did you try, and what was the explain plan?

    And how long does the query take?
  20. #11

    Postgres Performance Tuning


hi,
I created two partial indexes:
1. On group_id = 'admin', because this category has the largest number of records.
2. On end_time for a one-month date range, because we read by date range.

I executed the query for one month, because the partial index was created for a month; the response time was still high.

After increasing work_mem to 2 GB, the response time improved.
The query plan provided below was taken with work_mem = 2 GB.


testdb=# EXPLAIN (analyze, verbose) SELECT text(client) as client, text(gateway) as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000.0 as duration_recv, sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and end_time<'2013-06-01 00:00' and group_id='admin' and client ='CHOICE' GROUP by client, gateway ORDER BY call_amount_recv DESC;
    QUERY PLAN
    Sort (cost=2394805.34..2394808.42 rows=1233 width=44) (actual time=28567.127..28567.128 rows=5 loops=1)
Output: ((client)::text), ((gateway)::text), (count(*)), (((avg(duration) / 1000.0))::numeric(10,2)), ((avg(pdd))::numeric(10,2)), (((sum(call_duration_recv))::numeric / 1000.0)), (((sum(call_duration_pay))::numeric / 1000.0)), (sum(call_amount_recv)), (sum(call_amount_pay)), client, gateway
    Sort Key: (sum(detailed_report.call_amount_recv))
    Sort Method: quicksort Memory: 25kB
    -> HashAggregate (cost=2394701.96..2394742.04 rows=1233 width=44) (actual time=28567.094..28567.110 rows=5 loops=1)
Output: (client)::text, (gateway)::text, count(*), ((avg(duration) / 1000.0))::numeric(10,2), (avg(pdd))::numeric(10,2), ((sum(call_duration_recv))::numeric / 1000.0), ((sum(call_duration_pay))::numeric / 1000.0), sum(call_amount_recv), sum(call_amount_pay), client, gateway
    -> Bitmap Heap Scan on public.detailed_report (cost=679779.50..2382099.69 rows=560101 width=44) (actual time=10868.007..24412.561 rows=2958493 loops=1)
    Output: group_id, client, gateway, call_id, parent_call_id, start_time, connect_time, end_time, duration, source, source_alias, dest_in_number, dest_out_number, bp_code_pay, bi
    lled_duration_pay, rate_pay, rate_effective_date_pay, type_value_pay, slab_time_pay, pin_pay, amount_pay, adjusted_pin_pay, adjusted_amount_pay, call_amount_pay, country_code_pay, country_des
    c_pay, master_country_code, master_country_desc, bp_code_recv, billed_duration_recv, rate_recv, rate_effective_date_recv, type_value_recv, slab_time_recv, pin_recv, amount_recv, adjusted_pin_
    recv, adjusted_amount_recv, call_amount_recv, country_code_recv, country_desc_recv, subscriber_type, pdd, disconnect_reason, source_ip, dest_ip, caller_hop, callee_hop, caller_received_from_h
    op, callee_sent_to_hop, caller_media_ip_port, callee_media_ip_port, caller_original_media_ip_port, callee_original_media_ip_port, switch_ip, call_shop_amount_paid, version, call_duration_pay,
    call_duration_recv, audio_codec, video_codec, shadow_amount_recv, shadow_amount_pay, pulse_applied_recv, pulse_applied_pay
Recheck Cond: (((detailed_report.client)::text = 'CHOICE'::text) AND (detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-06-01 00:00:00+00'::timestamp with time zone) AND ((detailed_report.group_id)::text = 'admin'::text))
    -> BitmapAnd (cost=679779.50..679779.50 rows=560101 width=0) (actual time=10310.273..10310.273 rows=0 loops=1)
    -> Bitmap Index Scan on client_index (cost=0.00..173568.67 rows=7893881 width=0) (actual time=2473.213..2473.213 rows=7840766 loops=1)
    Index Cond: ((detailed_report.client)::text = 'CHOICE'::text)
-> Bitmap Index Scan on partial_endtime_index (cost=0.00..199145.02 rows=9573259 width=0) (actual time=1834.884..1834.884 rows=9476589 loops=1)
Index Cond: ((detailed_report.end_time >= '2013-05-01 00:00:00+00'::timestamp with time zone) AND (detailed_report.end_time < '2013-06-01 00:00:00+00'::timestamp with time zone))
    -> Bitmap Index Scan on groupid_index (cost=0.00..306645.23 rows=14949422 width=0) (actual time=4393.754..4393.754 rows=14945965 loops=1)
    Index Cond: ((detailed_report.group_id)::text = 'admin'::text)
    Total runtime: 28567.277 ms

Basically, I have the following doubts:
1. Why don't the inner BitmapAnd row counts match the outer Bitmap Heap Scan node? That is, in the Bitmap Heap Scan there is a huge difference between the estimated rows and the actual rows.
2. Why does the recheck happen, and how can it be reduced?

thanks
  22. #12
There is always a difference between the estimated rows and the actual rows, depending on how recent the statistics are and whether the planner can use indexes to estimate how many rows it can eliminate.

The size of the difference is probably due to the recheck condition itself. Read: http://postgresql.1045698.n5.nabble.com/Recheck-condition-td1892653.html

Increasing work_mem may indeed help with the grouping and sorting; it probably won't help with fetching the records, though. You may want to tune effective_cache_size for that.
Be careful with work_mem, though: the configured amount can be used once per sort operation, and you are both grouping and ordering, so you are probably already using twice that amount per query. Don't just allocate gigabytes of it on a server that runs more queries than the ones you are testing, or you may use up your memory very quickly indeed.
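One way to follow that advice is to raise work_mem only for the reporting session or transaction rather than globally; a sketch:

```sql
-- Per session: affects only this connection.
SET work_mem = '256MB';

-- Or per transaction: SET LOCAL reverts automatically at COMMIT/ROLLBACK.
BEGIN;
SET LOCAL work_mem = '2GB';
-- ... run the report query here ...
COMMIT;
```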
  24. #13

    Postgres Performance Tuning


hi,
thanks for your support, feedback and suggestions. They helped me analyze and experiment further.
To reduce the time for the Bitmap Heap Scan and recheck, I clustered the table on a multi-column index (group_id, end_time), which really helped reduce the response time. Please check and provide your comments.

As per the query plan below, let me know what more optimization we can do, especially for the time taken by the HashAggregate.

    INDEX:
    "groupid_endtime_detailed_report_idx" btree (group_id, end_time) CLUSTER, tablespace "indexspace"

    explain analyze SELECT text(client) as client, text(gateway) as gateway, text(country_code_pay[1] || ' - ' || country_desc_pay[1]) as cntry,count(*)::bigint as total_calls, (avg(duration)/1000.0) ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd, sum(call_duration_recv)/1000 ::bigint as duration_recv, sum(call_duration_pay)/1000 ::bigint as duration_pay, sum(call_amount_recv) as call_amount_recv, sum(call_amount_pay) as call_amount_pay FROM detailed_report WHERE end_time>='2013-04-01 00:00' and end_time<'2013-06-01 00:00' and group_id ='admin' and client = 'CHOICE' and gateway = 'DivXN' GROUP BY client, gateway, cntry ORDER BY call_amount_recv desc;
    QUERY PLAN
    --------------------------------------------------
    Sort (cost=2687561.38..2687740.44 rows=71625 width=120) (actual time=19862.346..19862.641 rows=4041 loops=1)
    Sort Key: (sum(call_amount_recv))
    Sort Method: quicksort Memory: 1042kB
    -> HashAggregate (cost=2679457.66..2681785.47 rows=71625 width=120) (actual time=19850.844..19859.021 rows=4041 loops=1)
    -> Bitmap Heap Scan on detailed_report (cost=314424.63..2661551.49 rows=716247 width=120) (actual time=3791.514..15565.001 rows=4180419 loops=1)
Recheck Cond: (((group_id)::text = 'admin'::text) AND (end_time >= '2013-04-01 00:00:00+00'::timestamp with time zone) AND (end_time < '2013-06-01 00:00:00+00'::timestamp with time zone) AND ((client)::text = 'CHOICE'::text))
    Filter: ((gateway)::text = 'DivXN'::text)
    Rows Removed by Filter: 314051
    -> BitmapAnd (cost=314424.63..314424.63 rows=826884 width=0) (actual time=3464.895..3464.895 rows=0 loops=1)
    -> Bitmap Index Scan on groupid_endtime_detailed_report_idx (cost=0.00..142573.19 rows=4761330 width=0) (actual time=1373.094..1373.094 rows=4755008 loops=1)
Index Cond: (((group_id)::text = 'admin'::text) AND (end_time >= '2013-04-01 00:00:00+00'::timestamp with time zone) AND (end_time < '2013-06-01 00:00:00+00'::timestamp with time zone))
    -> Bitmap Index Scan on client_index (cost=0.00..171493.06 rows=7799533 width=0) (actual time=1895.164..1895.164 rows=7840766 loops=1)
    Index Cond: ((client)::text = 'CHOICE'::text)
    Total runtime: 19864.231 ms
    (14 rows)

    Time: 19869.295 ms
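For completeness, the clustering described above has to be repeated by hand as new data arrives; a maintenance sketch (CLUSTER takes an exclusive lock and rewrites the table, so it belongs right after the daily bulk load):

```sql
-- Remember which index the table is clustered on:
ALTER TABLE detailed_report CLUSTER ON groupid_endtime_detailed_report_idx;

-- Rewrite the table in index order (exclusive lock for the duration):
CLUSTER detailed_report;

-- Refresh statistics after the rewrite:
ANALYZE detailed_report;
```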
  26. #14
Clustering re-orders the records, so it reduces seek times on disk. Just remember that you will have to re-cluster once in a while, which will probably take some time.

The fact that this helps indicates that your cache does not contain these records, or not enough of their content to be useful. Hmm...

As for the aggregation: well, you are aggregating the lot. I don't think there is much you can do to speed that up; the database needs to fetch the appropriate fields from disk and run the calculations on them.

The only thing I can think of to help is to create smaller, cached reports, so that instead of aggregating the lot every time, you aggregate smaller fragments once every X hours or so.
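Since the thread mentions 9.3, the "smaller, cached reports" idea could be sketched with a materialized view that pre-aggregates per day (names and column choice are illustrative; note that sums and counts are stored rather than averages, so averages can still be derived, and that plain REFRESH locks out readers in 9.3 — REFRESH ... CONCURRENTLY arrived in 9.4):

```sql
CREATE MATERIALIZED VIEW daily_report AS
SELECT client, gateway,
       date_trunc('day', end_time) AS day,
       count(*)                    AS total_calls,
       sum(duration)               AS duration_sum,   -- keep sums, not avgs
       sum(pdd)                    AS pdd_sum,
       sum(call_amount_recv)       AS call_amount_recv,
       sum(call_amount_pay)        AS call_amount_pay
FROM detailed_report
GROUP BY client, gateway, date_trunc('day', end_time);

-- Re-run after each daily bulk insert:
REFRESH MATERIALIZED VIEW daily_report;
```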
  28. #15

    Postgres Performance Tuning


hi,
thanks for your comments.
Regarding the cluster: as I mentioned earlier, I am not going to make any updates; records are only inserted through daily bulk inserts. Given these details, let me know whether periodic re-clustering is still required.

Moreover, from your analysis and observations, it appears you are not in favour of relying on clustering for performance; perhaps you can suggest an alternative.