#1
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Posts
    6
    Rep Power
    0

    Insertion from .txt files using host arrays


    I'm writing a Pro*C program in order to insert the data from .txt files into the database.

    So, I have the following table:

    CREATE TABLE IDIOKTITIS
    ( ARTAYT CHAR(7) NOT NULL,
    ONOMA VARCHAR(12),
    EPITHETO VARCHAR(20),
    POLI VARCHAR(10),
    PRIMARY KEY (ARTAYT));

    and the format of the .txt file is as follows:

    M111111 PETER SNOW ATHENS
    A111111 PETER SNOW LARISA
    C111111 JOHN DAVIES HALKIDA
    N111111 PETER PAPPAS BOSTON .................

    I'm using C host arrays like artayt[200][8],
    onoma[200][13],
    epitheto[200][21],
    poli[200][11].
    So, in every iteration until EOF is reached,
    I read 200 rows from the file and insert them into the database using
    EXEC SQL
    INSERT INTO IDIOKTITIS VALUES (:artayt, :onoma, :epitheto, :poli);
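
    Roughly, the read-and-insert loop looks like this (a simplified sketch, not my exact code: no CONNECT, no error handling, and the fscanf formats are only meant to match the field widths above):

    #include <stdio.h>

    EXEC SQL BEGIN DECLARE SECTION;
    char artayt[200][8];
    char onoma[200][13];
    char epitheto[200][21];
    char poli[200][11];
    int  n;                              /* rows actually read into the arrays */
    EXEC SQL END DECLARE SECTION;

    void load_file(FILE *fp)
    {
        do
        {
            /* fill up to 200 rows from the text file */
            for (n = 0; n < 200; n++)
                if (fscanf(fp, "%7s %12s %20s %10s",
                           artayt[n], onoma[n], epitheto[n], poli[n]) != 4)
                    break;

            if (n > 0)
            {
                /* one round trip inserts the whole batch; the FOR clause
                   limits it to the rows actually read, which matters for
                   the last, short batch */
                EXEC SQL FOR :n
                    INSERT INTO IDIOKTITIS VALUES (:artayt, :onoma, :epitheto, :poli);
            }
        } while (n == 200);              /* a short batch means EOF was reached */

        EXEC SQL COMMIT;
    }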

    The problem is that when an "invalid" row exists in the file
    (say it has the same primary key as a previous row) my INSERT exits with an error.
    If that row is the 80th, for example, my program will insert 79 rows and then exit.

    Can I overcome this problem? If yes, how?
    And if not, what is the right way to do it?

    I'm working on Oracle 8i (Unix environment).
  #2
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Jan 2003
    Location
    Not important
    Posts
    2,065
    Rep Power
    14
    You can define what action should happen when this error occurs using the exception block of your PL/SQL block.
    For example:

    BEGIN
        ...                      -- the INSERT goes here
    EXCEPTION
        WHEN DUP_VAL_ON_INDEX THEN
            NULL;                -- perform some action here (or just ignore the duplicate)
    END;
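
    If it helps to see this from the Pro*C side, here is a rough, untested sketch: wrap each single-row insert in a small anonymous block so that a duplicate key is skipped instead of stopping the run (the v_... scalar host variables are only placeholders for one row of your data):

    EXEC SQL EXECUTE
        BEGIN
            INSERT INTO IDIOKTITIS
            VALUES (:v_artayt, :v_onoma, :v_epitheto, :v_poli);
        EXCEPTION
            WHEN DUP_VAL_ON_INDEX THEN
                NULL;   -- duplicate key: just skip this row
        END;
    END-EXEC;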
    Success consists of a series of small victories, day after day

    MySQL, MS SQL, MS ACCESS, Oracle Database Manager - http://victorpendleton.net/products/psdviewer.html
  #3
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Posts
    6
    Rep Power
    0
    Thank you for your interest, victorpendleton!
    I really appreciate it, but I think I need more help.

    I am not familiar with PL/SQL, and I need you to show me an example of a PL/SQL block if you can.

    The INSERT command will be inside the PL/SQL block, I suppose.
    So if I have an array test[200][8] already filled with data, and I want to discard (not insert) the "invalid" rows that might exist in it, what is the PL/SQL block going to look like?

    This array must communicate somehow with the PL/SQL block, right?

    I would be really grateful if you could show me how to do that.

    Sorry for my extravagant demands.
  #4
    Senior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2003
    Location
    Canada
    Posts
    305
    Rep Power
    11
    As I understand it, you want to keep the process going until EOF is reached, whether an error occurs or not. I am not a Pro*C programmer, but I would solve this problem by creating a new function in Pro*C that checks whether each value already exists in the table: if it is found, it returns true so that the record is not inserted; if it is not found, it returns false and the value gets inserted.

    For example, something like this:

    LOOP

        IF (!isDuplicate(:empno)) THEN
            INSERT INTO emp VALUES (:empno, :ename, .....);
        END IF;

        IF EOF() THEN
            EXIT;
        END IF;

    END LOOP;

    The above code is pseudo code, not the actual implementation; you can convert this logic into Pro*C (if you find it easy).

    I am wondering: why don't you use SQL*Loader to overcome this problem?

    Regards
  #5
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Posts
    6
    Rep Power
    0
    "You are inserting the data one row at a time, is that correct?"
    The whole point is NOT to insert the rows one-by-one.

    I want to insert them by using arrays, for performance reasons.
    (At least, that's what I have read in the Oracle manuals.)
    I mean, if there are 500 rows in my file and I'm using an array of dimension 100, I'll have to call INSERT only 5 times, and each time 100 rows will be inserted.

    The problem, as I've mentioned above, is that if in one batch of 100 rows there is an "invalid" row (let's say row 80), the first 79 rows will be inserted but the rest won't.

    The PL/SQL block that was mentioned in the previous answer might do the trick, but I cannot program it myself.

    Does anyone have any idea?

    I don't know how to use SQL*Loader.

    Also, I want to take the names of the files as parameters to my program. Does SQL*Loader support this?
  #6
    Senior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2003
    Location
    Canada
    Posts
    305
    Rep Power
    11
    It is very difficult to filter duplicate records out of a group of records extracted from a text file. You can only validate each record in your Pro*C code before inserting it into the table. The other option is a little bit trickier again (if you only need to work with one (1) or two (2) files):

    - Disable the primary key of your table for the time being.
    - Insert the records from the text file, duplicates included.
    - Drop all duplicate records from the table.
    - Enable the primary key again.

    Does it make sense to you?
  #7
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Posts
    6
    Rep Power
    0
    Originally posted by shafique:
    It is very difficult to filter duplicate records out of a group of records extracted from a text file. You can only validate each record in your Pro*C code before inserting it into the table.

    Wouldn't it be incredibly time-consuming to perform this check, especially if my .txt file is large (e.g. 500000 rows)? I mean, you would have to check every row's primary key against all the previous ones. For the last row alone you would have to do 499999 checks. Am I not right?

    Originally posted by shafique:
    The other option is a little bit trickier again (if you only need to work with one (1) or two (2) files):

    - Disable the primary key of your table for the time being.
    - Insert the records from the text file, duplicates included.
    - Drop all duplicate records from the table.
    - Enable the primary key again.

    I don't think that I get the idea. I don't know how to disable the primary key.

    Anyway, I managed to accomplish what I wanted with C programming.
    I call the 1st INSERT, and when it fails (because it encountered a duplicate key) I enter a loop where I read the sqlca.sqlerrd[2] value and call a new INSERT which reads the array from just past the index where the duplicate key was found, and onward.
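
    Roughly, the retry loop looks like this (a simplified sketch rather than my exact code; it assumes <stdio.h> and <string.h> are included and that the host arrays plus the int host variables nrows and done are declared in the DECLARE SECTION; here I shift the remaining rows to the front of the arrays before retrying, but there are other ways to do that part):

    EXEC SQL WHENEVER SQLERROR CONTINUE;    /* check sqlca.sqlcode by hand */

    while (nrows > 0)                       /* nrows = rows in the current batch */
    {
        EXEC SQL FOR :nrows
            INSERT INTO IDIOKTITIS VALUES (:artayt, :onoma, :epitheto, :poli);

        if (sqlca.sqlcode == 0)
            break;                          /* the whole batch went in */

        if (sqlca.sqlcode != -1)            /* not ORA-00001 (unique constraint) */
        {
            printf("insert failed: %.*s\n",
                   sqlca.sqlerrm.sqlerrml, sqlca.sqlerrm.sqlerrmc);
            break;
        }

        /* sqlca.sqlerrd[2] = rows processed before the duplicate key.
           Skip the offending row, shift the rest to the front of the
           arrays, and retry with the shorter batch. */
        done  = sqlca.sqlerrd[2];
        nrows = nrows - done - 1;
        if (nrows > 0)
        {
            memmove(artayt[0],   artayt[done + 1],   nrows * sizeof artayt[0]);
            memmove(onoma[0],    onoma[done + 1],    nrows * sizeof onoma[0]);
            memmove(epitheto[0], epitheto[done + 1], nrows * sizeof epitheto[0]);
            memmove(poli[0],     poli[done + 1],     nrows * sizeof poli[0]);
        }
    }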

    I would really like to hear what you have to say about my method.

    P.S. I compared my method with the one that inserts the rows one-by-one on a small file (1000 rows), and the difference in speed was quite evident.

  #8
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Posts
    6
    Rep Power
    0
    Sorry for the misuse of the quote/post reply "feature".

    And one more thing, obvious I think:
    the loop is terminated when the array that I'm using is full.
  #9
    Senior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2003
    Location
    Canada
    Posts
    305
    Rep Power
    11
    That's fine, go ahead.
  #10
    Senior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2003
    Location
    Canada
    Posts
    305
    Rep Power
    11
    OK, see my first example:

    /* pseudo code: in real Pro*C you would copy the i-th element into
       scalar host variables first, since host arrays cannot be
       subscripted inside an EXEC SQL statement */
    for (i = 0; i < 100; i++)
    {
        if (!isDuplicate(empno[i]))
        {
            INSERT INTO emp VALUES (empno[i], ename[i], .....);
        }

        if (eof())
            break;
    }


    How do you write the 'isDuplicate' function in Pro*C? The pseudo code is given below:

    function isDuplicate(f_empno char)
    {
        SELECT COUNT(*) INTO :num_found FROM emp
         WHERE empno = :f_empno;

        IF num_found > 0 THEN
            return TRUE;     /* the key already exists, so it is a duplicate */
        ELSE
            return FALSE;
        END IF;
    }

    You must define 'num_found' as an integer host variable in your Pro*C program.
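
    A sketch of that declaration (the declare section normally sits near the top of the .pc file):

    EXEC SQL BEGIN DECLARE SECTION;
        int num_found;
    EXEC SQL END DECLARE SECTION;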


    The second option is enabling and disabling the primary key for the time being; let's take a look:

    SQL> CREATE TABLE test (roll_num NUMBER(2) PRIMARY KEY);

    SQL> INSERT INTO test VALUES (1);
    SQL> INSERT INTO test VALUES (2);

    Now try to insert a duplicate value:
    SQL> INSERT INTO test VALUES (1);

    This gives you a constraint error message because of the primary key; all values in that column are supposed to be unique.

    Now disable the primary key:
    SQL> ALTER TABLE test DISABLE PRIMARY KEY;

    Now insert the duplicate record; this time Oracle will not raise any error message:
    SQL> INSERT INTO test VALUES (1);

    Delete the duplicate records from the table:
    SQL> DELETE FROM test a
          WHERE rowid = (SELECT MAX(rowid) FROM test b WHERE b.roll_num = a.roll_num);

    Now enable the primary key again:
    SQL> ALTER TABLE test ENABLE PRIMARY KEY;


    Regards,
  #11
    Senior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2003
    Location
    Canada
    Posts
    305
    Rep Power
    11
    Please use the following query to delete the duplicate records from the table:

    SQL> DELETE FROM test
          WHERE (roll_num, rowid) NOT IN (SELECT roll_num, MIN(rowid) FROM test GROUP BY roll_num);

    Thanks,
