#1
  1. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2002
    Posts
    56
    Rep Power
    12

    Unhappy postgres and cursors


    After half a day of hammering away at it, I believe the postgres module to php does not support cursors. I had envisioned getting my clunky queries out of the php and into nice, overloadable functions....

    Does anyone know of another abstraction layer that does work with postgres cursors - adodb mentions only 'client cursors'.

    My failed cursor test
    Casuistry Agnostic
  2. #2
  3. Modding: Oracle MsSQL Firebird
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jun 2001
    Location
    Outside US
    Posts
    8,527
    Rep Power
    538
    I'm not shure, but you might try unixODBC, there are (unix)ODBC drivers for PostgreSQL, MySQL ... , and I think that php odbc functions support cursors.
    Hope this helps
  4. #3
  5. No Profile Picture
    Gödelian monster
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Jul 1999
    Location
    Central Florida, USA
    Posts
    2,307
    Rep Power
    61
    As I said in earlier posts, PHP does not directly support cursors under postgres (as it does in the oci8 library). You can sort of "trick" PHP into giving you cursor results, under the conditions I described, but I haven't spent much time examining it beyond that. Honestly, it seems that you want stored procedures to accomplish what views were intended for.

    One of the reasons PHP dosn't use cursors for PostgreSQL is that cursors are generally not considered as critical for performance in PostgreSQL as they are in Oracle, which has considerably more overhead. For much of what you would want cursors to handle, you could accomplish the same thing with the LIMIT .. OFFSET ... syntax, which is lacking in Oracle.

    I'm sure that eventually PHP will have more support for cursors in PostgreSQL, but you have to remember that web applications are not really a prime place to use cursors, because you are non maintaining stateful connections to the server. (but I could see some reasons to use cursors in backend, clustered application servers, maybe tied in with the PHP VL-SRM server)

    For more background on all this, visit http://archives.postgresql.org/ and search the PHP section for "cursor".
    The real n-tier system:

    FreeBSD -> PostgreSQL -> [any_language] -> Apache -> Mozilla/XUL

    Amazon wishlist -- rycamor (at) gmail.com
  6. #4
  7. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2002
    Posts
    56
    Rep Power
    12
    Thanks for the info!

    Being able to use functions in a stored procedure fashion was my end - returning cursors was the means. Stored procedures are much easier to change than going into the php, and I hoped to be able to share functions between the website and the data entry app.

    I've tried views, the problem is they give disturbingly worse performance than using the straight SELECT, even when I break down the query into a join involving several views. In a functon I could have inserted additional search criteria into the JOIN .. ON clauses so as to reduce the number of rows in the earlier joins, but I have no so option within the view definition. Maybe I am missing something here, but I have only used them for the simpler queries.
    Casuistry Agnostic
  8. #5
  9. No Profile Picture
    Gödelian monster
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Jul 1999
    Location
    Central Florida, USA
    Posts
    2,307
    Rep Power
    61
    Originally posted by ultraslacker
    Thanks for the info!

    Being able to use functions in a stored procedure fashion was my end - returning cursors was the means. Stored procedures are much easier to change than going into the php, and I hoped to be able to share functions between the website and the data entry app.
    Well, I was able to convert the second cursor example from the PostgreSQL documentation to work with PHP, so I can query on a named cursor, instead of the "<unnamed cursur 1>" stuff, and I can pass multiple parameters to the function. Now, as I said, cursors within PHP will only be available within one instance of a PHP script running. If this would be enough to satisfy your needs, why don't you post your attempts at creating the function and we'll take a look.
    I've tried views, the problem is they give disturbingly worse performance than using the straight SELECT, even when I break down the query into a join involving several views.
    Give me a for-instance. So far, the only time I have had performance problems with views was when they were doing some horrendous self-joins with calculated fields, which would slow down any query.
    In a functon I could have inserted additional search criteria into the JOIN .. ON clauses so as to reduce the number of rows in the earlier joins, but I have no so option within the view definition. Maybe I am missing something here, but I have only used them for the simpler queries.
    Yes, the only way of passing extra parameters to a view is by performing a query on that view. However, you could probably do something clever with a query rewrite RULE, and maybe having a function in the WHERE or JOIN ON clause of the query that defines the view, or in the query that you make against the view. PostgreSQL can index functions too, AFAIK.

    You might benefit from reading some of the PostgreSQL performance tuning articles at http://techdocs.postgresql.org/. I found that when I did just a bit of tuning, as well as running the query analyzer regularly, performance went way up.
    The real n-tier system:

    FreeBSD -> PostgreSQL -> [any_language] -> Apache -> Mozilla/XUL

    Amazon wishlist -- rycamor (at) gmail.com
  10. #6
  11. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2002
    Posts
    56
    Rep Power
    12
    The link to my broken php code is on my first posting in this thread - I posted it in the PHP forum. I started using stored procedures religiously with mssql, and even though postgres doesn't precompile them like sql server, it still sounds like the best route.

    Concerning the views, here's a (rather lengthy) example from a query that returns candidates per districts, using the full zipcode:

    SELECT c.candidate_id, candidate_lastname, office_name, office_title, state_lowername, state_name, state_abbreviation, district_name, party_name
    FROM zip2district z
    JOIN district d ON (zip5 = '59858' AND zip4start <= '0092' AND zip4end >= '0092' AND (d.state_id = 0 OR (z.state_id = d.state_id AND ((z.house = d.zip2district_code AND d.district_level = 'H') OR (z.upperdistrict = d.zip2district_code AND d.district_level = 'U') OR (z.lowerdistrict = d.zip2district_code AND d.district_level = 'L') OR d.district_level = 'S' OR d.district_level = '0'))))
    JOIN office_candidate oc ON (oc.office_candidate_active AND oc.district_id = d.district_id)
    JOIN office o ON (o.office_id = oc.office_id)
    JOIN candidate c ON (oc.candidate_id = c.candidate_id)
    JOIN state s ON (oc.state_id = s.state_id)
    JOIN party_candidate pc ON (c.candidate_id = pc.candidate_id)
    JOIN party p ON (pc.party_id = p.party_id)
    ORDER BY o.office_id, d.district_code;

    NOTICE: QUERY PLAN:

    Sort (cost=8771.98..8771.98 rows=1 width=196) (actual time=185.90..185.92 rows=14 loops=1)
    -> Nested Loop (cost=8755.85..8771.97 rows=1 width=196) (actual time=145.20..185.52 rows=14 loops=1)
    -> Nested Loop (cost=8755.85..8763.92 rows=1 width=173) (actual time=142.16..146.44 rows=14 loops=1)
    -> Hash Join (cost=8755.85..8759.71 rows=1 width=165) (actual time=141.66..142.60 rows=14 loops=1)
    -> Seq Scan on state s (cost=0.00..3.56 rows=56 width=33) (actual time=0.12..0.83 rows=56 loops=1)
    -> Hash (cost=8755.85..8755.85 rows=1 width=132) (actual time=140.73..140.73 rows=0 loops=1)
    -> Nested Loop (cost=8748.07..8755.85 rows=1 width=132) (actual time=136.42..140.63 rows=14 loops=1)
    -> Hash Join (cost=8748.07..8749.61 rows=1 width=104) (actual time=135.94..136.87 rows=14 loops=1)
    -> Seq Scan on office o (cost=0.00..1.35 rows=35 width=67) (actual time=0.12..0.49 rows=35 loops=1)
    -> Hash (cost=8748.07..8748.07 rows=1 width=37) (actual time=135.64..135.64 rows=0 loops=1)
    -> Nested Loop (cost=0.00..8748.07 rows=1 width=37) (actual time=1.42..135.55 rows=14 loops=1)
    -> Nested Loop (cost=0.00..8736.42 rows=1 width=27) (actual time=0.94..132.30 rows=9 loops=1)
    -> Index Scan using zip2district_zip5_idx on zip2district z (cost=0.00..19.07 rows=9 width=8) (actual time=0.65..1.09 rows=1 loops=1)
    -> Seq Scan on district d (cost=0.00..241.53 rows=5671 width=19) (actual time=0.14..89.78 rows=7421 loops=1)
    -> Index Scan using office_candidate_district_idx on office_candidate oc (cost=0.00..11.63 rows=2 width=10) (actual time=0.23..0.31 rows=2 loops=9)
    -> Index Scan using candidate_pk on candidate c (cost=0.00..5.78 rows=1 width=28) (actual time=0.21..0.22 rows=1 loops=14)
    -> Index Scan using party_candidate_candidate_idx on party_candidate pc (cost=0.00..4.20 rows=1 width=8) (actual time=0.20..0.20 rows=1 loops=14)
    -> Seq Scan on party p (cost=0.00..5.13 rows=213 width=23) (actual time=0.02..1.36 rows=213 loops=14)
    Total runtime: 188.56 msec

    Now the view

    vw_office_candidate definition:
    SELECT c.candidate_id, c.candidate_lastname, o.office_id, o.office_name, o.office_title, oc.office_candidate_id, oc.state_id, oc.district_id
    FROM office_candidate oc
    JOIN office o ON (oc.office_candidate_active AND o.office_id = oc.office_id)
    JOIN candidate c ON (c.candidate_active AND oc.candidate_id = c.candidate_id);


    SELECT c.candidate_id, candidate_lastname, office_name, office_title, state_lowername, state_name, state_abbreviation, district_name, party_name
    FROM zip2district z
    JOIN district d ON (zip5 = '59858' AND zip4start <= '0092' AND zip4end >= '0092' AND (d.state_id = 0 OR (z.state_id = d.state_id AND ((z.house = d.zip2district_code AND d.district_level = 'H') OR (z.upperdistrict = d.zip2district_code AND d.district_level = 'U') OR (z.lowerdistrict = d.zip2district_code AND d.district_level = 'L') OR d.district_level = 'S' OR d.district_level = '0'))))
    JOIN vw_office_candidate c ON (c.district_id = d.district_id)
    JOIN state s ON (c.state_id = s.state_id)
    JOIN party_candidate pc ON (c.candidate_id = pc.candidate_id)
    JOIN party p ON (pc.party_id = p.party_id)
    ORDER BY c.office_id, d.district_code;

    NOTICE: QUERY PLAN:

    Sort (cost=10279.85..10279.85 rows=1 width=159) (actual time=725.00..725.02 rows=14 loops=1)
    -> Nested Loop (cost=10263.73..10279.84 rows=1 width=159) (actual time=684.55..724.72 rows=14 loops=1)
    -> Nested Loop (cost=10263.73..10271.79 rows=1 width=136) (actual time=683.93..687.98 rows=14 loops=1)
    -> Hash Join (cost=10263.73..10267.58 rows=1 width=128) (actual time=683.44..684.30 rows=14 loops=1)
    -> Seq Scan on state s (cost=0.00..3.56 rows=56 width=33) (actual time=0.13..0.83 rows=56 loops=1)
    -> Hash (cost=10263.72..10263.72 rows=1 width=95) (actual time=682.41..682.41 rows=0 loops=1)
    -> Hash Join (cost=9680.34..10263.72 rows=1 width=95) (actual time=321.19..682.29 rows=14 loops=1)
    -> Hash Join (cost=943.91..1495.28 rows=6401 width=68) (actual time=187.63..526.21 rows=8210 loops=1)
    -> Hash Join (cost=1.44..412.82 rows=7998 width=54) (actual time=1.00..201.01 rows=8292 loops=1)
    -> Seq Scan on office_candidate oc (cost=0.00..271.42 rows=7998 width=10) (actual time=0.12..97.50 rows=8292 loops=1)
    -> Hash (cost=1.35..1.35 rows=35 width=44) (actual time=0.56..0.56 rows=0 loops=1)
    -> Seq Scan on office o (cost=0.00..1.35 rows=35 width=44) (actual time=0.13..0.41 rows=35 loops=1)
    -> Hash (cost=917.38..917.38 rows=10038 width=14) (actual time=186.22..186.22 rows=0 loops=1)
    -> Seq Scan on candidate c (cost=0.00..917.38 rows=10038 width=14) (actual time=0.13..146.94 rows=10038 loops=1)
    -> Hash (cost=8736.42..8736.42 rows=1 width=27) (actual time=132.90..132.90 rows=0 loops=1)
    -> Nested Loop (cost=0.00..8736.42 rows=1 width=27) (actual time=0.99..132.82 rows=9 loops=1)
    -> Index Scan using zip2district_zip5_idx on zip2district z (cost=0.00..19.07 rows=9 width=8) (actual time=0.67..1.13 rows=1 loops=1)
    -> Seq Scan on district d (cost=0.00..241.53 rows=5671 width=19) (actual time=0.15..91.30 rows=7421 loops=1)
    -> Index Scan using party_candidate_candidate_idx on party_candidate pc (cost=0.00..4.20 rows=1 width=8) (actual time=0.19..0.20 rows=1 loops=14)
    -> Seq Scan on party p (cost=0.00..5.13 rows=213 width=23) (actual time=0.02..1.37 rows=213 loops=14)
    Total runtime: 726.16 msec

    So all my care goes out the window, as the query ignores the ON clause for the view, initially - 8292 rows are all the active seats in office_candidate. Now, if I cheat on the view and specify the actual district_ids for the rows in office_candidate (as I do in the first query), I get a much better plan, but it still takes twice as long as the first.

    As far as optimization, I have tried my darnedest, indices where needed and vacuum analyze.

    Rules are brand new to me and I did not know that indexing postgres functions was possible, which sounds ideal for the static data - I'll check all that out after I get some sleep!!

    Thanks for your time and help!
    Last edited by ultraslacker; August 4th, 2002 at 11:12 AM.
    Casuistry Agnostic
  12. #7
  13. No Profile Picture
    Gödelian monster
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Jul 1999
    Location
    Central Florida, USA
    Posts
    2,307
    Rep Power
    61
    Maybe I'm missing something, but the view doesn't seem to use the same query as the top query. In fact, I see two distinct queries where you are talking about the view. Please clarify.

    ... though postgres doesn't precompile them like sql server, it still sounds like the best route.
    I don't know about that. Where did you find documentation suggesting that PostgreSQL doesn't precompile functions? Also, were you aware that you can use the keyword 'iscacheable' after the function definition, and Postgres will cache the function for next use?

    By the way, here is the (sparse) info I could find on functional indexes for Postgres. (Man, I hope someone writes a heavy-duty book on Postgres internals soon).
    The real n-tier system:

    FreeBSD -> PostgreSQL -> [any_language] -> Apache -> Mozilla/XUL

    Amazon wishlist -- rycamor (at) gmail.com
  14. #8
  15. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2002
    Posts
    56
    Rep Power
    12
    Yes, my view only includes office_candidate, office, and candidate. So in my second query, I replaced those three tables with the view. The only difference in the query involving the view is I cannot specify the district_id in office_candidate UNTIL the view is processed. I also tried the whole table in the view, but due to the size of zip2district (about 6.5 million rows) it performs quite horribly.

    Iscacheable - I am using that switch for my lookups on data that I know will not change. I'm also using it on areas that most likely will not change - I should make a trigger now to recreate those functions before I forget!

    Precompiled - Reading between the lines on the Benefits of Postgres functions. This sounds like Postgres sends the script to an interpretor and gets a new query plan each time - iscacheable is an exception to this.
    Last edited by ultraslacker; August 5th, 2002 at 04:58 PM.
    Casuistry Agnostic

IMN logo majestic logo threadwatch logo seochat tools logo