Parse through ~21,000 Database DDL statements -- Fastest way to perform search, replace and insert


 
# 1  
Old 05-10-2013

Hello All:

We need to process about 2,000 files containing around 21,000 statements, and perform a search, replace and insert according to the following rules:

1) Parse each file and find the CREATE MULTISET TABLE or CREATE SET TABLE statements. They always end with ON COMMIT PRESERVE ROWS;

Code:
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
                select col1,max(col2) as col2 
                from vt_test
                where col3 in (1,2,3)
)
WITH DATA 
PRIMARY INDEX (col1) 
ON COMMIT PRESERVE ROWS;

2) Replace WITH DATA with WITH NO DATA. If the statement already says WITH NO DATA, leave it unchanged.

Code:
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
                select col1,max(col2) as col2 
                from vt_test
                where col3 in (1,2,3)
)
WITH NO DATA 
PRIMARY INDEX (col1) 
ON COMMIT PRESERVE ROWS;

3) Add an INSERT statement right after the CREATE statement, using the same table name. Basically, take the SELECT part from above and end it with a semicolon. The challenge is that the code is not consistently formatted: a statement can be on a single line, and keywords can be in any mix of upper and lower case, so matching must be case insensitive. The final code should look like this:

Code:
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
                select col1,max(col2) as col2 
                from vt_test
                where col3 in (1,2,3)
)
WITH NO DATA 
PRIMARY INDEX (col1) 
ON COMMIT PRESERVE ROWS;

INSERT INTO vt_test
select col1,max(col2) as col2 
from vt_test
where col3 in (1,2,3);

4) Write the result to a new file in another directory.

I am comfortable doing the pattern replacement with awk, but I am not well versed in the additional INSERT step. Please take a look at my script below and see if you can help.

Code:
#!/usr/bin/ksh
#|------------------------------------------------------------------|
#|  Split the CREATE TABLE AS into DDL and DML Step
#|------------------------------------------------------------------|

usage ()
{
     echo " Usage: $0 <SRC_DIR> <TGT_DIR>"
}

if [ $# -lt 2 ]; then
        usage
        exit 1
fi

SRC_DIR=$1
TGT_DIR=$2

# Read every .prc file in SRC_DIR and write the result under the same name in TGT_DIR.
# IGNORECASE is a gawk extension; other awks ignore it and match case-sensitively.
for i in "$SRC_DIR"/*.prc
do
        awk 'BEGIN{IGNORECASE=1} {gsub(/WITH DATA/,"WITH NO DATA");print}' "$i" > "$TGT_DIR/$(basename "$i")"
done

# 2  
Old 05-10-2013
You may want to use this as a starting point:
Code:
awk     '       {TMP = $5
                 sub ("WITH DATA", "WITH NO DATA");
                }
         NF>1   {print
                 print "\nINSERT INTO " TMP "\nselect col1,max(col2) as col2\nfrom " TMP "\nwhere col3 in (1,2,3)"
                }
         END    {printf "\n"}
        ' RS=";" ORS=";" file
CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
                select col1,max(col2) as col2 
                from vt_test
                where col3 in (1,2,3)
)
WITH NO DATA 
PRIMARY INDEX (col1) 
ON COMMIT PRESERVE ROWS;
INSERT INTO vt_test
select col1,max(col2) as col2
from vt_test
where col3 in (1,2,3);

# 3  
Old 05-10-2013
Thanks Rudi, much appreciated. I tested this. The challenge is that each statement is different. In addition, this snippet modifies other parts of the code.

Before:

Code:
BEGIN

   DECLARE v_1 INTEGER;
   DECLARE v_2 BIGINT;
   DECLARE v_3 VARCHAR(16);

CREATE MULTISET VOLATILE TABLE vt_test2
(
      col1 VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
      col2 BIGINT NOT NULL,
      col3 BYTEINT NOT NULL,
      col4 BYTEINT NOT NULL
)
PRIMARY INDEX ( col1 ) ON COMMIT PRESERVE ROWS;

After:

Code:
BEGIN

   DECLARE v_1INTEGER;
INSERT INTO Date
select
from Date );
   DECLARE v_2 BIGINT;
INSERT INTO
select
from  );
   DECLARE v_3 VARCHAR(16);
INSERT INTO
select
from  );

CREATE MULTISET VOLATILE TABLE vt_test2
(
      col1 VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
      col2 BIGINT NOT NULL,
      col3 BYTEINT NOT NULL,
      col4 BYTEINT NOT NULL
)
PRIMARY INDEX ( col1 ) ON COMMIT PRESERVE ROWS;

INSERT INTO vt_test2
select
from vt_test2 );

The code should not touch these other statements at all, and that is the biggest challenge.
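
One way to leave the other statements alone is to process the file one statement at a time and rewrite only statements of the form CREATE [MULTISET|SET] [VOLATILE] TABLE name ... AS ( query ) WITH DATA. The following is a minimal sketch in portable awk wrapped in a shell function, not a solution verified against the full code base. The names split_ctas and convert are made up for illustration. It assumes that every statement ends with a semicolon, that no semicolon or unbalanced parenthesis appears inside a string literal or comment, and that a statement which already says WITH NO DATA needs no INSERT.

```shell
# split_ctas: hypothetical helper; reads SQL on stdin, writes the result to stdout.
split_ctas() {
    awk '
    # Return stmt unchanged unless it is CREATE ... TABLE name ... AS ( query ) WITH DATA.
    function convert(stmt,    low, off, rest, name, lp, rp, i, n, c, depth, tail, ws, wl, pre, query) {
        low = tolower(stmt)                        # lower-cased copy for case-insensitive tests
        if (!match(low, /create[ \t\n]+(multiset|set)[ \t\n]+(volatile[ \t\n]+)?table[ \t\n]+/))
            return stmt
        off = RSTART + RLENGTH - 1                 # last character before the table name
        rest = substr(stmt, off + 1)
        if (!match(rest, /^[A-Za-z0-9_.]+/))
            return stmt
        name = substr(rest, 1, RLENGTH)
        off += RLENGTH                             # last character of the table name
        rest = tolower(substr(stmt, off + 1))
        if (!match(rest, /[ \t\n]as[ \t\n]*\(/))   # plain CREATE TABLE without AS: leave alone
            return stmt
        lp = off + RSTART + RLENGTH - 1            # position of the opening parenthesis
        n = length(stmt); depth = 0; rp = 0
        for (i = lp; i <= n; i++) {                # find the matching closing parenthesis
            c = substr(stmt, i, 1)
            if (c == "(") depth++
            else if (c == ")") { depth--; if (depth == 0) { rp = i; break } }
        }
        if (!rp) return stmt
        tail = substr(stmt, rp + 1)
        if (!match(tolower(tail), /with[ \t\n]+data/))   # already WITH NO DATA: leave alone
            return stmt
        ws = RSTART; wl = RLENGTH
        pre = substr(tail, 1, ws - 1)
        if (pre !~ /^[ \t\n]*$/) return stmt       # WITH DATA must directly follow the query
        query = substr(stmt, lp + 1, rp - lp - 1)
        sub(/^[ \t\n]+/, "", query); sub(/[ \t\n]+$/, "", query)
        gsub(/[ \t]*\n[ \t]*/, "\n", query)        # flatten the indentation of the query
        return substr(stmt, 1, rp) pre "WITH NO DATA" substr(tail, ws + wl) \
               ";\n\nINSERT INTO " name "\n" query
    }
    { text = text $0 "\n" }                        # slurp the whole file
    END {
        while ((p = index(text, ";")) > 0) {       # one statement per semicolon
            printf "%s;", convert(substr(text, 1, p - 1))
            text = substr(text, p + 1)
        }
        printf "%s", text                          # whatever follows the last semicolon
    }'
}
```

It could be run once per file, for example split_ctas < "$SRC_DIR/file.prc" > "$TGT_DIR/file.prc". DECLARE statements and CREATE TABLE statements with a column list pass through untouched because they never match the AS ( query ) WITH DATA shape. The copied query loses its original indentation, which does not change its meaning.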
# 4  
Old 05-11-2013
Code:
BEGIN {
    IGNORECASE=1
}
/^\(/,/^\)/ { # lets find out the table name and fill array.
    if ( $1 !~ /\(|\)/ ) {
        if ( match($0,/(from [a-z_]+)/)) {
            table=substr($0,RSTART+5,RLENGTH-5)
        }
        a[$0]
    }
}

{
    if (sub("WITH DATA","WITH NO DATA")) { rtn = 1 }
} 1 # do END (change) or print same file.

END {
    if ( rtn > 0 ) {
        print "INSERT INTO " table
        for ( i in a ) {
            printf i
        }
        printf "; \n"
    }
}

Save as job.awk and run in scripts as awk -f job.awk yourinputfile

Hope that helps
Regards
Peasant.
# 5  
Old 05-11-2013
Thanks Peasant. I tried this option. Take, for example, a file with the following contents:

Code:
   CREATE MULTISET VOLATILE TABLE vt_test2
   , NO FALLBACK, NO JOURNAL, NO LOG AS
   (
      SELECT a.col1                   ,
             a.col2,
             a.col3,
             a.col4,
             a.col5                ,
             1 AS rule
      FROM   table1 a
      INNER JOIN vt_test1 b
             ON     a.col1 = b.col1
      WHERE  a.col1     = 1
      AND    a.col2    =
             (SELECT MAX(c.col3) AS max_col3
             FROM    talble1 c
             WHERE   b.col2=c.col2
             )
   )
   WITH DATA
   PRIMARY INDEX (col1)
   ON COMMIT PRESERVE ROWS;

CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
                select col1,max(col2) as col2
                from vt_test
                where col3 in (1,2,3)
)
WITH DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;

Interestingly, for the first statement the program changed WITH DATA to WITH NO DATA but did not add an INSERT statement. For the second statement it added the INSERT, but the lines of the SELECT came out in the wrong order and joined on a single line:

Code:
   CREATE MULTISET VOLATILE TABLE vt_test2
   , NO FALLBACK, NO JOURNAL, NO LOG AS
   (
      SELECT a.col1                   ,
             a.col2,
             a.col3,
             a.col4,
             a.col5                ,
             1 AS rule
      FROM   table1 a
      INNER JOIN vt_test1 b
             ON     a.col1 = b.col1
      WHERE  a.col1     = 1
      AND    a.col2    =
             (SELECT MAX(c.col3) AS max_col3
             FROM    talble1 c
             WHERE   b.col2=c.col2
             )
   )
   WITH NO DATA
   PRIMARY INDEX (col1)
   ON COMMIT PRESERVE ROWS;

CREATE MULTISET VOLATILE TABLE vt_test
, NO FALLBACK, NO JOURNAL, NO LOG AS
(
                select col1,max(col2) as col2
                from vt_test
                where col3 in (1,2,3)
)
WITH NO DATA
PRIMARY INDEX (col1)
ON COMMIT PRESERVE ROWS;
INSERT INTO vt_test
                where col3 in (1,2,3)                from vt_test                select col1,max(col2) as col2 ;
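
The scrambled INSERT is consistent with how the script stores and prints the SELECT lines: it uses each line as an array key and prints the keys with for ( i in a ), whose iteration order awk leaves unspecified, and it prints them with printf i, which adds no newline. The first statement probably gets no INSERT because its parentheses are indented, so the range pattern anchored at the start of the line never matches, and because the END block runs only once per file. A minimal sketch of the ordering fix alone, assuming the lines to keep have already been selected (the function name ordered_lines is made up for illustration):

```shell
# ordered_lines: hypothetical helper; echoes stdin line by line in arrival order.
ordered_lines() {
    awk '
    { line[++n] = $0 }                  # numeric keys record the arrival order
    END {
        for (i = 1; i <= n; i++)        # walk the keys in ascending numeric order
            printf "%s\n", line[i]      # explicit format, so a % in the data is harmless
    }'
}
```

Counting from 1 to n instead of using for ( i in a ) keeps the lines in their original sequence, and passing the line as an argument to a fixed "%s\n" format avoids treating SQL text as a printf format string.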

# 6  
Old 05-11-2013
Why didn't you (or even: why don't you) post a meaningful, comprehensive example of your file? It is no surprise that the proposals fail. On the sample you supplied, the proposal works satisfactorily.