#1
  1. flit, flit, flit...
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2003
    Location
    New York City
    Posts
    167
    Rep Power
    13

    Speeding up DBI connect()


    Hi there.

    Currently, I have a client-server setup for accessing PostgreSQL through Perl DBI, and it looks like this:

    Server:
    Dual Pentium Pro 200MHz
    160MB RAM
    PostgreSQL 7.3.3, all Debian Linux default settings

    Client:
    Dual Pentium II 200MHz
    192MB RAM
    Perl 5.8.0

    My problem is that a DBI->connect() call from Perl takes anywhere from 600 to 700ms. I plan to upgrade the database server soon, but for now, can any of you recommend some way of tweaking the machines to expedite the connect() process? Also, am I correct in assuming the bottleneck is on the server side and not the client side?

    Thank you very much.

    Colin
    Colin Wetherbee
  2. #2
  3. No Profile Picture
    Gödelian monster
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Jul 1999
    Location
    Central Florida, USA
    Posts
    2,307
    Rep Power
    62
    That sounds more like a networking problem than a PostgreSQL problem. Have you done any benchmarking of your network connection itself? What speed is the network? Are you sure there is no packet loss somewhere, such as a faulty hub, network card, etc...?

    Also, I have noticed that sometimes certain Unix services lag in response because they do a reverse hostname lookup, and if the connecting machine doesn't have a recognized hostname, the response takes longer. (Simple answer: If you don't run your own DNS, then have each machine refer to the other in the /etc/hosts file)
    The real n-tier system:

    FreeBSD -> PostgreSQL -> [any_language] -> Apache -> Mozilla/XUL

    Amazon wishlist -- rycamor (at) gmail.com
  4. #3
  5. flit, flit, flit...
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2003
    Location
    New York City
    Posts
    167
    Rep Power
    13
    Thanks for your response.

    The network is my home 100Mbps LAN with only nine computers; one of which does serve the LAN's DNS. The SQL client and server are on adjacent ports on a 100Mbps Intel hub. Most of the rest of the network is switched, so there's actually very little traffic on the hub. I doubt the connection lag is due to network misconfiguration.

    I've done a bit more testing and found out that when I use the Perl DBI to connect to two databases sequentially, it takes between 900 and 1000ms.

    Even the psql client takes about that long to connect, though it was a bit more difficult to time that.

    Any thoughts? Thanks a lot.

    Colin
    Colin Wetherbee
  6. #4
  7. No Profile Picture
    Gödelian monster
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Jul 1999
    Location
    Central Florida, USA
    Posts
    2,307
    Rep Power
    62
    Well, the next thing to do is to remove all doubt . Actually test your network with some sort of standard test utility, connectiong from one machine to the other, to make sure. And then on the PostgreSQL side, set your postgresql.conf file to a higher level of verbosity than the default, and see what shows up in serverlog as you connect (tail -f serverlog). Also, while you are connecting, keep an eye on what shows up on /etc/messages for both machines.
    The real n-tier system:

    FreeBSD -> PostgreSQL -> [any_language] -> Apache -> Mozilla/XUL

    Amazon wishlist -- rycamor (at) gmail.com
  8. #5
  9. flit, flit, flit...
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2003
    Location
    New York City
    Posts
    167
    Rep Power
    13
    I've decided to use netperf to benchmark, and I've come up with the following results.

    Client to server:
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time      Throughput
    bytes  bytes   bytes    secs.    10 ^6bits/sec

     87380  16384  16384    10.00    &n bsp; 67.65


    Server to client:
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time      Throughput
    bytes  bytes   bytes    secs.    10 ^6bits/sec

     87380  16384  16384    10.01    &n bsp; 68.24


    I'll check log output later tonight or tomorrow. Thanks again for the help.
    Colin Wetherbee
  10. #6
  11. flit, flit, flit...
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2003
    Location
    New York City
    Posts
    167
    Rep Power
    13
    Well, my electric company ended up forcibly rebooting my PostgreSQL server (along with most other electronics in downtown Portland), and the lag fixed itself.

    If it comes back, I'll try to keep a closer eye on what's causing it. Thanks for your help.
    Colin Wetherbee
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2003
    Posts
    373
    Rep Power
    12
    I think in the postgres documentation there's something about changing the buffer size, that can help performance. they said it should be like 50% of your memory? I know your machines are already set up, but this could be a contributing factor. Also, I am not really sure what kind of programming you're doing here, like how many connections you're using, but when I was doing some JDBC stuff, it says that for that type of connection (not all server-side processing) then you should create a pool of connections, and then hand them out and return them to the pool so you don't have to worry about the overhead except at the beginning (which seems sensible to me). I am also wondering if you have to use perl and whatnot to do your project, maybe you could use a procedural language to execute your code, there is a plperl available in the 7.2 and up I am almost certain. that way, when the procedure is called, it is already sitting connected inside the database space.
  14. #8
  15. flit, flit, flit...
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2003
    Location
    New York City
    Posts
    167
    Rep Power
    13
    I think it's probably safe to assume changing the buffer size is a compile-time option. At this point, that would be kind of difficult to change.

    As for pooling database handles, that's a good idea. Writing a Perl daemon to handle that would be quick and easy. This is probably what I'll do, anyway, even if the lag doesn't come back.

    At the moment, I've done no pl/anything, but pl/Perl is something I look forward to checking out.

    My application is a series of web pages, all handled by the same script (and various Perl modules). When the script begins, it makes two network connections to the database, and when it ends, it disconnects them. The application will eventually have about 40 simultaneous users, so the database connections could happen as quickly as 80 times per second. Well, maybe not, but the application should be able to easily handle that. I think pooling database handles in a daemon is the way to go.

    Thanks for the ideas.

    Colin
    Colin Wetherbee
  16. #9
  17. No Profile Picture
    Gödelian monster
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Jul 1999
    Location
    Central Florida, USA
    Posts
    2,307
    Rep Power
    62
    There are plenty of things you can do to tweak PostgreSQL for performance, but really they would have no effect on the lag time of a single connection. From the fact that a reboot cleared it up, I would say there was some sort of fluke or abnormality with either the network or the network cards of one of those machines.

    While you cannot change the buffer size (default 8K) without recompiling, in postgresql.conf you can change the number of shared buffers available, which should be set to at least twice your max_connections setting.

    One of the most important settings to change is sort_mem, which gives PostgreSQL more memory for handling complex queries. The default is quite low, so I usually set it to at least 4096 (4 MB) on a machine with 256 MB of RAM. Experiment accordingly.

    You might want to also experiment with the following:

    effective_cache_size
    random_page_cost
    cpu_tuple_cost
    cpu_index_tuple_cost
    cpu_operator_cost

    Enjoy
    The real n-tier system:

    FreeBSD -> PostgreSQL -> [any_language] -> Apache -> Mozilla/XUL

    Amazon wishlist -- rycamor (at) gmail.com
  18. #10
  19. flit, flit, flit...
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2003
    Location
    New York City
    Posts
    167
    Rep Power
    13
    Ooh, shiny. Thanks a bunch.

    Colin
    Colin Wetherbee

IMN logo majestic logo threadwatch logo seochat tools logo