April 5th, 2012, 10:00 AM
(Possible) Asynchronous commit Oracle 10g X Delphi desktop
We have a Delphi application (using dbexpress/dbexpora.dll) accessing a RAC 10g with five nodes. The clients machines use 9i clients.
We have clear evidences that a series of (simple) DML statements were done (one commit for each statement) a report was generated by the application showing the changes just done and hours later, queryng the database (tables with the original data and the auditing tables populated by triggers - application cannot change them), changes were not there anymore.
According to logs generated by Oracle, a series of rollbacks (around 70), with no other DMLS among them, were done quickly after a period of time that, we have clues to hint, was very stressfull to the RAC (database structure).
Everything points to a possible asynchronous commit, generated, either by a (possible?) driver configuration (changing session parameters) or by a failure in comunication between client's driver and the database itself due to (who knows?) incompatibilities. between client's driver versioin and oracle 10g database.
Database instance "commit_write" parameter is null (did not verify all nodes, but I can do it if necessary).
Isolating one of the cases...
The user ordered four inserts and commited then. The window of the aplication does not allow the user to exit it before getting the database answer saying the commit was successfull. Then the user printed a report and the changes were there (there are documents proving it). Hours later the data were not there anymore, as if a rollback had been ordered.
In fact, the database log of DMLs shows that a rollback happened in a strange way :a lot of them at the same time. This king of thing could not happen in an application like that once that for each DML there was a commit linked to it (DML made manually by the programmer, which eliminates the possibility that a grid component, for example, could have cached the results).
I have already received the DMLS logged and, as I said, everything points to the idea that:
1) application requested the commit;
2) the database could not do it at that time and answered: "Ok. Keep going that I will commit it as soon as I can."
3) It happened many times, the report was printed.
4) database crashed someway, lost the scheduled commits and rolled back the DMLs (as the log shows).
Asynchronous commit can be set in Oracle 10g changing the commit_write parameter to, for example, "BATCH,NOWAIT" (Cannot post the link here, but, if you don't mind, you can google "commit_write oracle" and you findout the desired documentation about it) .
Another possibility points to a possible problem in the comunication between oracle driver and oracle database itself:
1) application requested the commit;
2) the database could not do it at that time and answered: "Ok. I'm having problems here, wait a little more and I will commit. Don't give up!."
3) The drive understand it as an "Ok. Go ahead!", possibly because this is a warning that did not exist in older versions of Oracle.
4) Application keep inserting records and commiting the same way. (But nothing was commited, in fact.)
5) User prints the report (in the same session he can see the uncommited data) with all the changes.
6) User aborts the session cause to some network delay.
7) Database (hadn't commited yet; it had just asked the application to wait)sees thar the user aborted the session and rollsback all DMLs done.
I know that there is a lot of imagination in these sequences of events, but If I do not consider them I'll have to accept the hypoteses that oracle does not have the comfortable ACID properties we expect from a robust database.
Thanks a lot if any of you can enlighten my search for an explication.
(ACID characteristics of Oracle, to us, seem not to be true anymore until we can identify the problem and reproduce it.)