October 21st, 2003, 10:28 PM
Poor SQL Server Programming, Part 4
It is hard to believe there are so many injudicious DBAs and developers out there who let something this simple slide by--not to mention the Q/A and implementation folks. It's inexcusable.
I was supporting a banking product for a company that wrote client/server account processing software. The software ran on AIX Unix and was written in Mumps. It was essentially a mainframe application that clients would originally access via telnet; later it supported a rich client application.
The entire design of the system was flawed and was clearly band-aided together to make all the disjointed pieces fit.
The mainframe would create extracts that were crudely copied to a directory on the Unix box. A homegrown service (which required a user to be logged on locally at all times because it had a form displaying status information--which defeats the purpose of having a service to begin with) would then do a proprietary file transfer. Once the text files from the mainframe were copied to the local machine, the service would proceed to INSERT all the data into SQL Server.
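To picture that load step: here's a minimal sketch of parsing a delimited extract and building one INSERT per row. The pipe-delimited layout, the column-less table name `dbo.AccountExtract`, and the row-at-a-time style are all my assumptions--the vendor's actual format wasn't documented anywhere we could find.

```python
import csv
import io

def load_extract(text):
    """Parse a pipe-delimited mainframe extract into row tuples.
    The delimiter and layout are assumed, not the vendor's real format."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    return [tuple(row) for row in reader]

def insert_statements(table, rows):
    """Build one INSERT per row -- the slow, fragile style this service
    appeared to use, rather than BULK INSERT or bcp."""
    stmts = []
    for row in rows:
        values = ", ".join("'%s'" % v.replace("'", "''") for v in row)
        stmts.append("INSERT INTO %s VALUES (%s)" % (table, values))
    return stmts

extract = "1001|CHECKING|2500.00\n1002|SAVINGS|13750.50\n"
rows = load_extract(extract)
stmts = insert_statements("dbo.AccountExtract", rows)
print(stmts[0])
```

On a large nightly extract, one round-trip per row is a big part of why a design like this falls over; SQL Server's bulk-load paths exist precisely for this case.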
The SA password was blank. And our stupid implementation folks rolled it out this way to all our customers' branches without saying a word. Yes, these were MCSEs, in fact. Paper certs, as far as I'm concerned.
The application also used an application SQL Server account that was a member of sysadmin. Of course, the password for this account was the same at every branch and every bank we installed. Crazy!
There was never any maintenance plan or set of best practices given when the servers were rolled out. The systems basically ran until the transaction log filled the hard drive or performance suffered. That is, when a customer was initially set up, response time between the client application, the mainframe, and SQL Server was about 3-5 seconds. After a month or more, it would take over a minute just to perform a transaction within the banking application.
The performance issue got so bad that our customers were irate at the amount of time it took to do their business at the teller window. Tellers were furious because they were at the mercy of the application.
After about 6 banks complained about poor performance, a special team was created. It consisted of a few people from our home office (which controlled most of the support and development of this application, licensed from another company--name withheld) and two gentlemen from the company we leased the software from.
Everyone on the team went on-site at a few customers to try to identify the cause of the slow response. They made various cosmetic changes at the customer sites that temporarily improved response time. For example, many clients at the banks had virus scanners scanning all incoming/outgoing packets, which slowed their network applications down. The scanning was turned off and the customers were pleased that performance was better--however, not great. The team thought they did something special. Yeah, right!
After about another month... performance was back to, or worse than, where it originally was, and the team was re-assembled. There were various conference calls, etc. No one had any ideas, but everyone pointed their fingers at the network. Isn't that always the easy thing to do? No matter how much bandwidth an application had, it was always a network problem.
The team later flew out to a test lab at the home office. They purchased $30K worth of load-testing software to simulate simultaneous access from the application to the host system and SQL Server. Two other folks from my office and I were on the team. Essentially, nothing happened in the 3 days we were there. The original team spent most of their time writing RoadRunner scripts that would modify a TCP/IP frame and change it around so it looked like the transaction was coming from another machine. Nothing was accomplished and we all went home without figuring out anything. My buds and I just hung around, since the home office folks and the other company's folks acted like they ran the show--it was their lab servers, etc., and we couldn't touch them.
After about 1-2 more weeks of irate customers threatening to sue us and demanding the application work, my two coworkers and I decided we would research the problem ourselves. We built our own lab with two clients and one SQL Server. We purchased a small 16K PVC between our test lab and the remote location where the host server was located. Our PVC was much smaller than what anyone was using in production--but we just needed connectivity.
It took us about 2 days to get the software installed and configured so we could run a test. After monitoring response time between transactions and network bandwidth... response appeared acceptable. The next day, we decided to have the customer send us a copy of their branch SQL Server database. We attached their database and ran the test again. Response time went from 3-4 seconds on the empty database to 60+ seconds on the restored production database.
After a couple minutes of running SQL Profiler, we determined that after every transaction, 19,000+ rows were being returned to the client. We knew then that this was our problem!!! I found the SQL statement and ran it manually within Query Analyzer. Sure enough... 19,000+ rows came back.
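A back-of-the-envelope calculation shows why that row count alone explains the minute-long transactions. The ~100 bytes per row is my guess (we never measured row width), but even with generous assumptions the numbers land right where we observed:

```python
def transfer_seconds(rows, bytes_per_row, link_bits_per_sec):
    """Time to push a result set over a link, ignoring protocol
    overhead and latency -- so these are best-case numbers."""
    total_bits = rows * bytes_per_row * 8
    return total_bits / link_bits_per_sec

# 19,000 rows at an assumed 100 bytes each is ~1.9 MB per transaction.
lan = transfer_seconds(19_000, 100, 10_000_000)  # 10 Mbps office LAN
wan = transfer_seconds(19_000, 100, 128_000)     # 128 kbps branch circuit
print(lan)  # ~1.5 seconds
print(wan)  # ~119 seconds
```

On a fast LAN the extra result set hides inside "3-5 second" responses; over a thin branch circuit it becomes a minute or two, which is exactly the degradation pattern the banks reported.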
We notified the vendor the next day... and it turned out they had a bug: one of their columns had been created as VARCHAR instead of TEXT. The application cached the SQL statement that the mainframe ran against the Mumps database in a SQL Server table, and simply passed the hash # back to the mainframe so it would not have to recompile the SQL. But the statement was getting truncated in the VARCHAR column, so the lookup never found a match for an existing statement, and the mainframe created a new SQL statement every time a transaction went through.
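Here's a minimal sketch (in Python, not Mumps, and using MD5 where the vendor's actual hash function is unknown to me) of why truncating the cached text breaks a scheme like this: the lookup compares against the stored text, the truncated text never equals the full statement, so every transaction misses and the cache table grows without bound--which is the ever-growing result set we saw in Profiler.

```python
import hashlib

cache = []  # rows of (hash, sql_text), standing in for the SQL Server table

def lookup_or_cache(sql, column_limit=None):
    """Return the statement hash, caching the statement if unseen.
    column_limit simulates a too-small VARCHAR column truncating
    the stored text; None simulates a TEXT column (no truncation)."""
    stored = sql if column_limit is None else sql[:column_limit]
    for h, text in cache:              # scanned on every transaction
        if text == sql:                # truncated text never matches full text
            return h
    h = hashlib.md5(sql.encode()).hexdigest()
    cache.append((h, stored))
    return h

long_sql = "SELECT ... " + "X" * 500   # statement longer than the VARCHAR column

# TEXT-sized storage: second call finds the cached entry, table stays at 1 row.
lookup_or_cache(long_sql)
lookup_or_cache(long_sql)
print(len(cache))   # 1

# VARCHAR-style truncation at 255 chars: every call misses and adds a row.
cache.clear()
for _ in range(3):
    lookup_or_cache(long_sql, column_limit=255)
print(len(cache))   # 3 -- one new "cached" statement per transaction
```

Run that loop a few hundred thousand times over a month of teller transactions and you get a 19,000-row cache table being churned on every single transaction.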
After the field was changed, response went back to normal, and my boys and I got a bunch of atta-boys and essentially saved the company!!! We even got a Big Dog award, etc.
There were no apologies or explanation from the company that made the bug. What made it worse, their own folks went to our customer sites numerous times and to our main office to run tests, etc., and could never find anything. I swear we spent $100K in travel alone between all these folks trying to resolve this problem. Not to mention the home office bought a $30K application that they could never get to work and ended up never using...
Our little lab experiment only cost us about $150. We got a couple of 1U servers shipped to us, ordered a PVC, used our own routers and hardware, and spent about 15 hours of our time.
It was clear we understood how the application worked better than the company that sold it to us. Clearly, they did not know SQL Server, NT, or any of the other client/server technologies. Their mainframe programmers should have stuck with mainframe programming and left the client/server tools for the newbies....
Another success story!