Regex to Parse data


 
# 1  
06-26-2012

Experts and Informed folks,

I need some help parsing a log file.
HTML Code:
1389675	Opera_ShirtCatalog INSERT INTO Opera_ShirtCatalog(COL1, COL2) VALUES (1, 'TEST1'), (2,'TEST2');
1389685	Opera_ShirtCatlog_Wom INSERT INTO Opera_ShirtCatlog_Wom(col1, col2, col3) VALUES (9,'Siz12, FormFit', 'Test');
Now I need to print the rest of each line, i.e. from INSERT INTO through the ";" that ends the line, and add the new column name to the column list. That part is easy; all I have to do is:
HTML Code:
sed 's/) VALUES/, NEWCOLUMN) VALUES/; s/.*INSERT/INSERT/' file
The tricky part is that I also need to append the value of the first column as the last value in every tuple of the VALUES clause. The final output I want is:
HTML Code:
INSERT INTO Opera_ShirtCatalog(COL1, COL2, NEWCOLUMN) VALUES (1, 'TEST1', 1389675), (2,'TEST2', 1389675);
INSERT INTO Opera_ShirtCatlog_Wom(col1, col2, col3, NEWCOLUMN) VALUES (9,'Siz12, FormFit', 'Test', 1389685);
I can't seem to come up with a regex that not only captures the value but also moves it inside each tuple, just before the closing parenthesis.

So to make it clearer,

Input file -
HTML Code:
1389675	Opera_ShirtCatalog INSERT INTO Opera_ShirtCatalog(COL1, COL2) VALUES (1, 'TEST1'), (2,'TEST2');
1389685	Opera_ShirtCatlog_Wom INSERT INTO Opera_ShirtCatlog_Wom(col1, col2, col3) VALUES (9,'Siz12, FormFit', 'Test');
Output Expected -
HTML Code:
INSERT INTO Opera_ShirtCatalog(COL1, COL2, NEWCOLUMN) VALUES (1, 'TEST1', 1389675), (2,'TEST2', 1389675);
INSERT INTO Opera_ShirtCatlog_Wom(col1, col2, col3, NEWCOLUMN) VALUES (9,'Siz12, FormFit', 'Test', 1389685);
Any and all help is duly appreciated.
# 2  
06-26-2012
Using awk:

Code:
awk '{
   gsub(/)/, ", "$1")");
   gsub(", "$1") VALUES", ", NEWCOLUMN) VALUES");
}
gsub("^"$1".* INSERT","INSERT")' file

# 3  
06-27-2012
Chubler_XL,

At the outset, let me express my heartfelt thanks for addressing my issue.

Unfortunately, this is erroring out as follows. I tried to tinker with it, but I don't understand what you wrote, so I am a bit helpless. I am running this in bash, if that provides any context.

Code:
/Users/ManoharChandran 07:49:39 $cat NEWOUTPUT | grep 'INSERT' | cut  -f 5- | awk '{print $2=$3=$4=""}1' | awk '{
gsub(/)/, ", "$1")");
gsub(", "$1") VALUES", ", NEWCOLUMN) VALUES");
}
gsub("^"$1".* INSERT","INSERT")'
awk: illegal primary in regular expression ) at 
 source line number 2
 context is
    gsub(/)/, ", >>>  "$1")") <<< 
/Users/ManoharChandran 07:50:04 $


Last edited by radoulov; 06-28-2012 at 06:18 PM..
# 4  
06-27-2012
Might have to escape the ) characters:

Code:
awk '{
gsub(/\)/, ", "$1")");
gsub(", "$1"\) VALUES", ", NEWCOLUMN) VALUES");
}
gsub("^"$1".* INSERT","INSERT")'
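
For reference, the same three substitutions can probably be written without any backslashes: the bracket expression `[)]` matches a literal closing parenthesis both in a regex literal and in a regex given as a string. The sketch below is only checked against the two sample lines from post #1. The helper name `add_id` is made up, and the code assumes the ID is the first whitespace-separated field and that no quoted value contains a `)`.

```shell
# add_id: hypothetical helper; reads log lines on stdin, writes INSERT statements.
add_id() {
  awk '{
    id = $1                                           # save the leading ID before $0 changes
    sub(/^.*INSERT INTO/, "INSERT INTO")              # drop everything before the statement
    gsub(/[)]/, ", " id ")")                          # append the ID before every ")"
    sub(", " id "[)] VALUES", ", NEWCOLUMN) VALUES")  # the column list gets the name instead
    print
  }'
}
```

It could be run as `add_id < file`. Saving `$1` in a variable first keeps the ID available even after the line has been rewritten.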

# 5  
06-28-2012
Chubler_XL,

Sorry, even escaping the parentheses didn't do the trick. Here it is:

Code:
/Users/ManoharChandran 10:33:43 $cat NEWOUTPUT | grep 'INSERT' | cut  -f 5- | awk '{print $2=$3=$4=""}1' | awk '{
> gsub(/\)/, ", "$1")");
> gsub(", "$1"\) VALUES", ", NEWCOLUMN) VALUES");
> }
> gsub("^"$1".* INSERT","INSERT")'
awk: syntax error in regular expression , ) VALUES at  VALUES
 input record number 1, file 
 source line number 3

Please advise !

regards,
Manohar.

Last edited by radoulov; 06-28-2012 at 06:17 PM..
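
The error message quotes the regular expression as `, ) VALUES`, which suggests two things. First, `$1` was empty for input record 1: the earlier pipeline stage `print $2=$3=$4=""` prints an empty line before each record. Second, the backslash in `"\)"` apparently did not reach the regex engine. A regex written as an awk string is processed once as a string and then again as a regex, so one backslash may not be enough. A minimal sketch of two spellings that should keep the parenthesis literal (the sample text `x) VALUES` and the function name are made up for illustration):

```shell
# literal_paren_demo: hypothetical name; each command should print "ok".
literal_paren_demo() {
  # Doubled backslash: the string "x\\) VALUES" becomes the regex x\) VALUES
  echo 'x) VALUES' | awk '{ sub("x\\) VALUES", "ok"); print }'
  # Bracket expression: no backslash involved at all
  echo 'x) VALUES' | awk '{ sub("x[)] VALUES", "ok"); print }'
}
literal_paren_demo
```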
# 6  
06-28-2012
Something on these lines:

Code:
awk -F\( 'BEGIN{OFS=FS} {for(i=2;i<=NF;i++) sub(/)/,", NEWCOLUMN)",$i);print}' inputfile
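
Read against the expected output in post #1, this one-liner appears to add the literal text NEWCOLUMN to every parenthesised group and to leave the ID and table name at the front of the line. Below is a sketch of the same split-on-`(` idea adapted to the stated goal. The helper name `add_id_by_field` is hypothetical, `[(]` and `[)]` are used so that nothing needs escaping, and each group is assumed to contain exactly one `)` with no nested parentheses.

```shell
# add_id_by_field: hypothetical helper; splits each line on "(" and edits the pieces.
add_id_by_field() {
  awk 'BEGIN { FS = "[(]"; OFS = "(" }
  {
    split($1, head, /[ \t]+/)                 # head[1] is the leading ID
    id = head[1]
    sub(/^.*INSERT INTO/, "INSERT INTO", $1)  # keep only the statement prefix
    sub(/[)]/, ", NEWCOLUMN)", $2)            # column list: add the new column name
    for (i = 3; i <= NF; i++)                 # each VALUES tuple: add the ID
      sub(/[)]/, ", " id ")", $i)
    print
  }'
}
```

Treating the column list (field 2) separately from the tuples (fields 3 and up) avoids the second pass that replaces the ID with NEWCOLUMN.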

# 7  
06-28-2012
elixir_sinari,

Thanks for taking time and addressing my issue.

I don't like to say this, but it is still not working, and I am not sure why.

regards,
Manohar.



Code:
/Users/ManoharChandran 14:07:33 $cat NEWOUTPUT | grep 'INSERT' | cut  -f 5- | awk '{print $2=$3=$4=""}1' | awk -F\( 'BEGIN{OFS=FS} {for(i=2;i<=NF;i++) sub(/)/,", NEWCOLUMN)",$i);print}' 
awk: illegal primary in regular expression ) at 
 source line number 1
 context is
    BEGIN{OFS=FS} {for(i=2;i<=NF;i++) sub(/)/,", >>>  NEWCOLUMN)",$i) <<<

Moderator's Comments:
Please use code tags next time for your code and data.

Last edited by radoulov; 06-28-2012 at 06:17 PM..