Hi, I'm trying to do a high-level parsing of Oracle SQL scripts to isolate single commands to execute, in an Oracle compilation plugin I'm writing for notepad++
E.g.
Code:
CREATE TABLE x (column_a data_type(length), ...);
ALTER TABLE y (column_a data_type)
/
CREATE OR REPLACE PACKAGE z IS ...
...
END;
/
CREATE OR REPLACE PACKAGE BODY z AS ...
-- remark...
-- END;
END;
/
INSERT INTO TABLE z (column_a) VALUES ('123');
This is a kind of script you usually can give to sqlplus for execution, with / or ; ending commands (usually ddl with / and dml with ; )
Now... I've this regex that correctly identifies each command with object owner, name and type, and I'm using it to check system tables for error after compilation:
Code:
(CREATE(\s+OR\s+REPLACE)?|ALTER)\s+(?<object_type>PACKAGE(?!\s+BODY)|PACKAGEs+BODY|VIEW|PROCEDURE|FUNCTION|TABLE|VIEW|TRIGGER)\s+((?<owner_name>[^\s\.]+?)\.)*(?<object_name>\w+)(\s+\w+|\s+IS|\s+AS|\s*\()
but I need something more reliable to isolate each command, considering remarks and dml instructions inside packages and procedures, that are to be ignored.
Right now, I'm using regex only to identify each object and I parse the script looking for a single / on a line, and it's working, but I can't handle single dml commands as the first and last in the example above. I was wondering if I'm using the right tool or if maybe it's better to learn antlr and build a grammar.