Pull Intermediate Strings


 
# 1  
Old 04-13-2011
Pull Intermediate Strings

Experts,

You have all been very supportive of me so far, and I'm thankful for it.

I need to extract data between two sets of parentheses and also between quotes.
Code:
cat LOGFILE | grep 'number wasnt' | head -2

I. 2011/04/14 01:12:03. process(130) Deleting Text on line 11 (ESN:27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database
I. 2011/04/14 01:12:03. process(130) Deleting Text on line 12 (ESN:27723211634ATADJ68AK) because a number wasnt 'AVAILABLE'  and is not found in the database

What I need is "27723211621B01DJ68AG" & "AVAILABLE".

So here is what I do -
Code:
cat LOGFILE | grep -i 'number wasnt' | cut -d'(' -f3 | sed -e 's/[a-z].//g' | sed -e 's/SN://g' -e 's/)//g' | tr -d "'" | head -2

E27723211621B01DJ68AG  AVAILABLE    
E27723211634ATADJ68AK  AVAILABLE

The solution I'm using right now works, but it does so by eliminating all the unnecessary characters instead of extracting what I need (which is definitely not elegant).
Owing to my limited understanding of regex, I coded it this way.

However, there is a new change: we now need to pull in the "130" as well, which is in the first set of parentheses at the beginning, and I'm not sure how to go about this.

Simply stated, here is what I have -
Code:
I. 2011/04/14 01:12:03. process(130) Deleting Text on line 11 (ESN:27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database
I. 2011/04/14 01:12:03. process(130) Deleting Text on line 12 (ESN:27723211634ATADJ68AK) because a number wasnt 'AVAILABLE'  and is not found in the database

and I need

Code:
130  27723211621B01DJ68AG AVAILABLE.

How do I get this?

Please help,

regards,
Lee.

Moderator's Comments:
Please use [code] and [/code] tags when posting code, data or logs etc. to preserve formatting and enhance readability, thanks.

Last edited by zaxxon; 04-14-2011 at 03:12 AM.. Reason: code tags
# 2  
Old 04-14-2011
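Had to divide it into 2 parts since the separators were different:
Code:
grep "number wasnt" LOGFILE | head -2 | while read -r LINE
do
    # Records end at ")" and fields split at "(", so field 2 is the text inside each pair of parentheses.
    # Note: the second value keeps its "ESN:" prefix.
    BRACKET_TXT=$(echo "$LINE" | awk '$0=$2' FS='(' RS=')' | tr -s "\n" " ")
    # Fields split at single quotes, so field 2 is the quoted word.
    QUOTE_TXT=$(echo "$LINE" | awk -F"'" '{print $2}')

    echo "$BRACKET_TXT $QUOTE_TXT"
done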

This User Gave Thanks to aster007 For This Post:
# 3  
Old 04-14-2011
Through sed..
Code:
grep 'number wasnt' logfile.txt |head -2| sed "s/^.*(\([^)]*\)).*(....\(.*\)).*'\(.*\)'.*/\1 \2 \3/"
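
Since an explanation of this sed expression is requested later in the thread, here is one reading of it, piece by piece. This is only a sketch: the sample line is copied from the first post, and the variable names line and result are illustrative, not part of the original command.

```shell
# Sample line copied from the first post.
line="I. 2011/04/14 01:12:03. process(130) Deleting Text on line 11 (ESN:27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database"

# In sed's basic regex syntax a plain ( is a literal parenthesis,
# while \( ... \) marks a group that can be recalled as \1, \2, \3.
#   ^.*(           everything up to the first literal "("
#   \([^)]*\)      group 1: characters that are not ")"     -> 130
#   ).*(           the closing ")", then text up to the next "("
#   ....           skip four characters, here "ESN:"
#   \(.*\))        group 2: text up to the closing ")"      -> 27723211621B01DJ68AG
#   .*'\(.*\)'     group 3: text between the single quotes  -> AVAILABLE
#   .*             the rest of the line
result=$(printf '%s\n' "$line" | sed "s/^.*(\([^)]*\)).*(....\(.*\)).*'\(.*\)'.*/\1 \2 \3/")
echo "$result"
```

The whole line is matched and replaced by the three groups, which is why nothing else from the line survives in the output.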


Last edited by michaelrozar17; 04-14-2011 at 03:19 AM.. Reason: removed reading lines 1,2 in sed command as we have head -2 already
These 2 Users Gave Thanks to michaelrozar17 For This Post:
# 4  
Old 04-14-2011
michaelrozar17 & aster007,

~ wow ~

I can't believe you guys whipped it out in no time! Makes me feel so "small" & "trivial"...

Anyway, here is a simple question for michaelrozar17.

Can you please explain your sed? I can't seem to understand it.

aster007, your awk is simple and easy to read and assimilate.

I stand up and say "thank you" to both of you.

regards,
Lee
# 5  
Old 04-14-2011
Hi OMLEELA,

Another with awk:

Code:
awk '{print gensub(/.*\((.*)\).*\(.*:(.*)\).*'\''(.*)'\''.*/,"\\1 \\2 \\3","g")}' inputfile
130 27723211621B01DJ68AG AVAILABLE
130 27723211634ATADJ68AK AVAILABLE
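
Note that gensub is a GNU awk (gawk) extension, so the command above may not run under other awk implementations. As a hedged alternative, the sketch below avoids back references altogether by splitting each line on the characters (, ) and '. The field numbers assume the log lines always have the layout shown in the first post; line and result are illustrative names.

```shell
# Sample line copied from the first post.
line="I. 2011/04/14 01:12:03. process(130) Deleting Text on line 11 (ESN:27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database"

# With ( ) and ' as field separators the line splits so that
#   field 2 = 130, field 4 = ESN:27723211621B01DJ68AG, field 6 = AVAILABLE
# sub() strips the leading "ESN:" from field 4 before printing.
result=$(printf '%s\n' "$line" | awk -F"[()']" '/number wasnt/ { sub(/^ESN:/, "", $4); print $2, $4, $6 }')
echo "$result"
```

If a line ever contained extra parentheses or quotes, the field numbers would shift, so the regexp approach is the more robust of the two.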


Regards
# 6  
Old 04-14-2011
cgkmal,

Thank you.

Yours works wonderfully as well.

By the way, since you took the regular expression approach, can you please explain what you are doing here? Your explanation would certainly give me a better sense of how to approach this kind of problem next time.

regards,
Lee.
# 7  
Old 04-14-2011
Sure OMLEELA,

I'll try to explain it well enough :)

I'm using the gensub function with the regexp back reference feature. Within gensub, this feature works something like this:

gensub(/regexp/, replacement, how [, target]), where
Code:
regexp: the pattern you want to search for
replacement: the text that replaces each match of the pattern
how="g": indicates that all matches of regexp are replaced with replacement
target: the string to search; if no target is supplied, $0 is used

To use a back reference, you need to surround the part of the regexp you want to remember with parentheses "(" and ")".
Anything outside the back reference parentheses won't be buffered or remembered.


Code:
# back reference parentheses are numbered 1, 2, 3 below; the regexp to be remembered is inside them
gensub(/.*\((.*)\).*\(.*:(.*)\).*'\''(.*)'\''.*/ ....
              1            2           3

Explaining first part of regexp:
Code:
.*\((.*)\)=.* plus \( plus (.*) plus \) 

Where,
Code:
.* is to match: I. 2011/04/14 01:12:03. process
\( is to match: (  the opening parenthesis  <<the literal "(" is escaped as \( >>
(.*) is to match: 130 = the content within the literal parentheses; to use a back reference we surround it with ()
\) is to match: )  the closing parenthesis <<the literal ")" is escaped as \) >>

and so on.

Then the 2nd back reference matches and remembers the string after ":" and before ")" in (ESN:27723211621B01DJ68AG).

The 3rd back reference matches the substring between the single quotes. Because the whole awk program is itself enclosed in single quotes,
a literal single quote cannot be written as just \'; it has to be written as '\''. That is \' with single quotes ' .. ' around it:
the first ' closes the quoted program, \' is the literal quote, and the last ' reopens the quoted program.
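
The quoting rule can be tried on its own in the shell. This is a small sketch with a made-up awk program (not the one from the thread), and quoted is just an illustrative variable name:

```shell
# The awk program is wrapped in single quotes, so a literal ' inside it
# is written as '\'' : close the quotes, add an escaped quote, reopen the quotes.
quoted=$(awk 'BEGIN { print "wasnt '\''AVAILABLE'\''" }')
echo "$quoted"
```

awk itself never sees the '\'' sequences; the shell has already turned each of them into a single quote before awk starts.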

The same process is followed to match the rest of the complete string. To recall the stored substrings,
we use \\i, where i is the number of the back reference, in whatever order is needed. I've used 3 pairs of back reference
parentheses, so I recall 3 back references in the order 1,2,3, but it could be 3,1,2 or 3,3,1 or 3,2,1 etc., according to your needs.
In this case they are separated by a space: "\\1 \\2 \\3".

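To see the reordering in action without requiring gawk, the sketch below feeds the same regexp to sed -E (extended regex syntax; that the local sed supports -E is an assumption) and recalls the groups in the order 3,1,2. In sed the back references are written \1, \2, \3 rather than \\1, \\2, \\3. The names line and result are illustrative.

```shell
# Sample line copied from the first post.
line="I. 2011/04/14 01:12:03. process(130) Deleting Text on line 11 (ESN:27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database"

# Same three groups as in the gensub call, recalled in a different order.
result=$(printf '%s\n' "$line" | sed -E "s/.*\((.*)\).*\(.*:(.*)\).*'(.*)'.*/\3 \1 \2/")
echo "$result"
```

Changing the replacement to \1 \2 \3 gives the order requested in the first post.
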

Hope this helps.

Regards
These 2 Users Gave Thanks to cgkmal For This Post: