Parse through ~21,000 Database DDL statements -- Fastest way to perform search, replace and insert
Hello All:
We need to process 2000 files containing around 21,000 statements, and search, replace and insert text based on the following rules:
1) Parse through the file and look for CREATE MULTISET TABLE or CREATE SET TABLE statements. They always end with ON COMMIT PRESERVE ROWS;
Code:
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3)
)
WITH DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
2) Replace WITH DATA with WITH NO DATA. If the statement already says NO DATA, leave it unchanged.
Code:
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3)
)
WITH NO DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
3) Add an INSERT statement right after it, using the same table name. Basically, take the SELECT part from above and end it with a semicolon. The challenge is that the code is not formatted: a statement can be on a single line, and keywords can be in any mix of upper and lower case (matching must be case insensitive). The final code should look like this:
Code:
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3)
)
WITH NO DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
INSERT INTO vt_test
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3);
4) Write the result to a new file in another directory.
I am comfortable doing the pattern replacement with awk, but I am not well versed in the additional INSERT step. Please see below and let me know if you can help.
Code:
#!/usr/bin/ksh
#|------------------------------------------------------------------|
#| Split the CREATE TABLE AS into DDL and DML Step
#|------------------------------------------------------------------|
usage ()
{
echo " Usage: $0 <SRC_DIR> <TGT_DIR>"
}
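if [ $# -lt 2 ]; then
usage
exit 1
fi
SRC_DIR=$1
TGT_DIR=$2
# Read each .prc file from SRC_DIR and write the result under the same name in TGT_DIR.
# IGNORECASE is a GNU awk (gawk) extension; other awk implementations ignore it.
for i in "$SRC_DIR"/*.prc
do
awk 'BEGIN{IGNORECASE=1} {gsub(/WITH DATA/,"WITH NO DATA");print}' "$i" > "$TGT_DIR/$(basename "$i")"
done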
awk ' {TMP = $5
sub ("WITH DATA", "WITH NO DATA");
}
NF>1 {print
print "\nINSERT INTO " TMP "\nselect col1,max(col2) as col2\nfrom " TMP "\nwhere col3 in (1,2,3)"
}
END {printf "\n"}
' RS=";" ORS=";" file
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3)
)
WITH NO DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
INSERT INTO vt_test
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3);
Thanks Rudi, much appreciated. I tested this. The challenge is that each statement is different, and this snippet also touches other parts of the code.
Before:
Code:
BEGIN
DECLARE v_1 INTEGER;
DECLARE v_2 BIGINT;
DECLARE v_3 VARCHAR(16);
CREATE MULTISET VOLATILE TABLE vt_test2
(
col1 VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
col2 BIGINT NOT NULL,
col3 BYTEINT NOT NULL,
col4 BYTEINT NOT NULL
)
PRIMARY INDEX ( col1 ) ON COMMIT PRESERVE ROWS;
After:
Code:
BEGIN
DECLARE v_1INTEGER;
INSERT INTO Date
select
from Date );
DECLARE v_2 BIGINT;
INSERT INTO
select
from );
DECLARE v_3 VARCHAR(16);
INSERT INTO
select
from );
CREATE MULTISET VOLATILE TABLE vt_test2
(
col1 VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
col2 BIGINT NOT NULL,
col3 BYTEINT NOT NULL,
col4 BYTEINT NOT NULL
)
PRIMARY INDEX ( col1 ) ON COMMIT PRESERVE ROWS;
INSERT INTO vt_test2
select
from vt_test2 );
The code should not touch these statements at all, and that is the biggest challenge.
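One way to check this before changing anything is a dry run that only reports which statements would qualify. The following is a minimal sketch, not a tested solution for the real files: `classify_statements` is a hypothetical helper name, and it assumes that statements are separated by `;` and that no `;` occurs inside string literals or comments. It uses `tolower()` rather than `IGNORECASE`, so it does not depend on GNU awk.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints "<statement number> REWRITE|KEEP"
# for every ";"-terminated statement, without changing anything.
classify_statements() {
    awk '
    BEGIN { RS = ";" }                       # one record per SQL statement
    {
        stmt = tolower($0)                   # case-insensitive without gawk IGNORECASE
        gsub(/[ \t\r\n]+/, " ", stmt)        # collapse all whitespace runs
        sub(/^ /, "", stmt); sub(/ $/, "", stmt)
        if (stmt == "") next                 # text after the last ";"
        ok = (stmt ~ /(^| )create (multiset |set )?(volatile )?table / &&
              stmt ~ /\) ?with data/ &&
              stmt ~ /on commit preserve rows$/)
        print NR, (ok ? "REWRITE" : "KEEP")
    }' "$@"
}
```

Run as `classify_statements yourinputfile`. DECLARE statements and plain CREATE TABLE statements with column definitions are reported as KEEP, because they have no `) WITH DATA` clause.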
BEGIN {
IGNORECASE=1
}
/^\(/,/^\)/ { # lets find out the table name and fill array.
if ( $1 !~ /\(|\)/ ) {
if ( match($0,/(from [a-z_]+)/)) {
table=substr($0,RSTART+5,RLENGTH-5)
}
a[$0]
}
}
{
if (sub("WITH DATA","WITH NO DATA")) { rtn = 1 }
} 1 # do END (change) or print same file.
END {
if ( rtn > 0 ) {
print "INSERT INTO " table
for ( i in a ) {
printf i
}
printf "; \n"
}
}
Save as job.awk and run it in your scripts as awk -f job.awk yourinputfile (IGNORECASE only has an effect in GNU awk).
Thanks Peasant. I tried this option. Take, for example, a file with the following contents.
Code:
CREATE MULTISET VOLATILE TABLE vt_test2
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
SELECT a.col1 ,
a.col2,
a.col3,
a.col4,
a.col5 ,
1 AS rule
FROM table1 a
INNER JOIN vt_test1 b
ON a.col1 = b.col1
WHERE a.col1 = 1
AND a.col2 =
(SELECT MAX(c.col3) AS max_col3
FROM talble1 c
WHERE b.col2=c.col2
)
)
WITH DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3)
)
WITH DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
Interestingly, the program read the first statement, changed WITH DATA to WITH NO DATA, and skipped adding the INSERT statement. It then read the next statement and added the INSERT, but changed the order and formatting of the SELECT.
Code:
CREATE MULTISET VOLATILE TABLE vt_test2
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
SELECT a.col1 ,
a.col2,
a.col3,
a.col4,
a.col5 ,
1 AS rule
FROM table1 a
INNER JOIN vt_test1 b
ON a.col1 = b.col1
WHERE a.col1 = 1
AND a.col2 =
(SELECT MAX(c.col3) AS max_col3
FROM talble1 c
WHERE b.col2=c.col2
)
)
WITH NO DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3)
)
WITH NO DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
INSERT INTO vt_test
where col3 in (1,2,3) from vt_test select col1,max(col2) as col2 ;
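A likely cause of the reordered SELECT is that awk does not define the traversal order of `for (i in a)`, and that an array keyed by line content merges identical lines. The sketch below, with the hypothetical helper name `collect_in_order`, shows the usual alternative: store lines under a running number and replay them with a counted loop. It also passes the data as an argument to `printf` instead of using it as the format string.

```shell
#!/bin/sh
# Hypothetical helper: replay collected lines in their original order.
collect_in_order() {
    awk '
    { line[++n] = $0 }                # numeric index records arrival order
    END {
        for (i = 1; i <= n; i++)      # counted loop instead of "for (i in line)"
            printf "%s\n", line[i]    # data is an argument, never the format string
    }' "$@"
}
```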
Why didn't you (or even: why don't you) post a meaningful, comprehensive example of your file? It is no surprise that the proposals fail. On the sample you supplied, the proposal works satisfactorily.
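For the fuller sample above, a possible approach is to treat each `;`-terminated statement as one record, locate the parenthesis that follows AS, and count nesting depth to find its matching close. This is only a sketch under stated assumptions: `split_ctas` is a hypothetical name, no `;` or unbalanced parenthesis appears inside string literals or comments, the input ends with a newline after the last `;`, and the table name contains no spaces. Statements that do not have the shape CREATE ... TABLE name ... AS ( ... ) WITH DATA ... ON COMMIT PRESERVE ROWS are printed unchanged.

```shell
#!/bin/sh
# Hypothetical helper: rewrite one file (or stdin), result on stdout.
split_ctas() {
    awk '
    BEGIN {
        RS = ";"                      # one record per SQL statement
        ws = "[ \t\r\n]"
        create_re = "create" ws "+((multiset|set)" ws "+)?(volatile" ws "+)?table" ws "+[^ \t\r\n,(]+"
        as_re     = ws "as" ws "*[(]"
        data_re   = "with" ws "+data"
        end_re    = "on" ws "+commit" ws "+preserve" ws "+rows" ws "*$"
    }
    # Returns the rewritten statement and sets the global "insert";
    # returns the statement untouched when it does not qualify.
    function rewrite(stmt,    low, name_end, n, parts, lp, rp, i, c, depth, rest, s, l, sel) {
        insert = ""
        low = tolower(stmt)           # case-insensitive matching on a lowercase copy
        if (low !~ end_re || !match(low, create_re)) return stmt
        n = split(substr(stmt, RSTART, RLENGTH), parts, ws "+")   # parts[n] is the table name
        name_end = RSTART + RLENGTH
        if (!match(substr(low, name_end), as_re)) return stmt
        lp = name_end + RSTART + RLENGTH - 2          # position of the opening "("
        for (i = lp; i <= length(stmt); i++) {        # find its matching ")"
            c = substr(stmt, i, 1)
            if (c == "(") depth++
            else if (c == ")") { depth--; if (depth == 0) { rp = i; break } }
        }
        if (!rp) return stmt
        rest = substr(stmt, rp + 1)
        if (!match(tolower(rest), data_re)) return stmt           # also skips WITH NO DATA
        s = RSTART; l = RLENGTH
        if (substr(rest, 1, s - 1) !~ ("^" ws "*$")) return stmt  # WITH DATA must follow ")" directly
        sel = substr(stmt, lp + 1, rp - lp - 1)       # SELECT text, original formatting kept
        sub("^" ws "+", "", sel); sub(ws "+$", "", sel)
        insert = "\nINSERT INTO " parts[n] "\n" sel ";"
        return substr(stmt, 1, rp) substr(rest, 1, s - 1) "WITH NO DATA" substr(rest, s + l)
    }
    {
        if (NR > 1) printf ";%s", pending   # the ";" that ended the previous statement
        out = rewrite($0)
        pending = insert
        printf "%s", out
    }
    END { if (pending != "") printf ";%s\n", pending }
    ' "$@"
}
```

Inside a loop over the source directory it could be called as `split_ctas "$i" > "$TGT_DIR/$(basename "$i")"`. The SELECT is copied as one block of text, so nested subqueries and line breaks stay as they were written. Results should be spot-checked on real files, since only the samples in this thread were considered.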