Find matched patterns and print them with other patterns not the whole line


 
# 1  
Old 09-24-2014

Hi,

I am trying to extract some patterns from each line. The input file is space-delimited, and I cannot use a column number to get the value after the "IN" or "OUT" pattern, because there can be multiple spaces before the digits that I need to print in the output file. I need to print three patterns per line: the ID in the third column (e.g. DSU17281T6), the IN or OUT keyword, and the first number after it:

inputfile
Code:
>RP: 123 DSU17281T6 DSU17281  Dressrossa crassa PT7T0 hypo prot (124 aa) OUT   0 
>RP: 286 DSU17282T0 DSU17282  Dressrossa crassa PT7T0 hypo prot (287 aa) OUT   5   51   70  111  130  170  189  204  223  234  253 
>RP: 110 DSU17283T0 DSU17283  Dressrossa crassa PT7T0 hypo prot (111 aa) OUT   0 
>RP: 230 DSU17284T2 DSU17284  Dressrossa crassa PT7T0 hypo prot (231 aa)  IN   1   18   35 
>RP: 54 DSU16024T3 DSU16024  Dressrossa crassa PT7T0 mo ATP unit 8 (55 aa) OUT   1   13   32 
>RP: 261 DSU16025T2 DSU16025  Dressrossa crassa PT7T0 mo ATP unit 6 (262 aa) OUT   7   41   60   96  118  127  146  153  172  183  206  213  231  236  254 
>RP: 480 DSU16026T0 DSU16026  Dressrossa crassa PT7T0 mo (481 aa)  IN   3   41   58   96  113  120  137 
>RP: 74 DSU16027T1 DSU16027  Dressrossa crassa PT7T0 mo ATP unit 9 (75 aa)  IN   2   11   35   48   72 
>RP: 250 DSU16028T0 DSU16028  Dressrossa crassa PT7T0 mo cytochrome c oxidase subunit 2 (251 aa) OUT   2   40   59   78   97

Expected output (tab-delimited):
Code:
DSU17281T6	OUT	0 
DSU17282T0	OUT	5
DSU17283T0	OUT	0 
DSU17284T2	IN	1 
DSU16024T3	OUT	1 
DSU16025T2	OUT	7 
DSU16026T0	IN	3 
DSU16027T1	IN	2 
DSU16028T0	OUT	2

I have tried many things, but none of them gave me what I want. The best I could do is below:

Code:
grep -wE "DSU.*T[0-9]|IN[[:space:]]*[0-9]|OUT[[:space:]]*[0-9]"

It shows that the patterns I want are matched, but it still prints the whole line. Then I tried changing "grep -wE" to "grep -oE", and the matches are no longer printed on the same line, as shown below. I need them on the same line, as in my expected output above:
Code:
DSU17281T6	
OUT	0 
DSU17282T0	
OUT	5
DSU17283T0	
OUT	0 
DSU17284T2	
IN	1 
DSU16024T3	
OUT	1 
DSU16025T2	
OUT	7 
DSU16026T0	
IN	3 
DSU16027T1	
IN	2 
DSU16028T0	
OUT	2
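The split output above is how `grep -o` behaves: each match goes on its own line. One possible way to stay with grep is to glue every two lines back together with `paste`. This is only a sketch: the function name `pair_matches` is made up for illustration, the ID pattern is tightened to `DSU[0-9]+T[0-9]+`, and it assumes every input line yields exactly two matches (otherwise the pairing drifts).

```shell
# Hypothetical helper: reads the input on stdin, writes ID<TAB>IN|OUT<TAB>number.
pair_matches() {
    # -o prints each match on its own line: first the ID, then "IN|OUT <n>"
    grep -oE 'DSU[0-9]+T[0-9]+|(IN|OUT)[[:space:]]+[0-9]+' |
        paste - - |      # join every two consecutive lines with a tab
        tr -s ' ' '\t'   # turn the run of spaces after IN/OUT into one tab
}

printf '%s\n' '>RP: 123 DSU17281T6 DSU17281  Dressrossa crassa PT7T0 hypo prot (124 aa) OUT   0 ' |
    pair_matches
```

With a real file, the call would be `pair_matches < inputfile`.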

I tried sed and awk, but I always get the whole line printed. Can anyone show me what I need to change? Also, may I know how to do it in sed and awk? Thanks.
# 2  
Old 09-24-2014
This should work, albeit untested...
Code:
awk '{for(i=1;i<=NF;i++) if($i ~ "^(IN|OUT)$") print $3,$i,$(i+1)}' file
# 3  
Old 09-24-2014
Code:
$ awk  'function p(regex){match($0,regex);return substr($0,RSTART,RLENGTH)}{print p("DSU[0-9]+T[0-9]"),p("(IN|OUT)[[:space:]]+[0-9]")}' file

DSU17281T6 OUT   0
DSU17282T0 OUT   5
DSU17283T0 OUT   0
DSU17284T2 IN   1
DSU16024T3 OUT   1
DSU16025T2 OUT   7
DSU16026T0 IN   3
DSU16027T1 IN   2
DSU16028T0 OUT   2

---------- Post updated at 11:39 PM ---------- Previous update was at 11:34 PM ----------

---

For tab-separated fields:

Code:
$ awk  'function p(regex){match($0,regex); return substr($0,RSTART,RLENGTH)}{s = p("DSU[0-9]+T[0-9]") FS p("(IN|OUT)[[:space:]]+[0-9]"); gsub(/[[:space:]]+/,OFS,s); print s}' OFS='\t'  file

# 4  
Old 09-24-2014
Quote:
Originally Posted by shamrock
This should work albeit untested... awk '{for(i=1;i<=NF;i++) if($i ~ "^(IN|OUT)$") print $3,$i,$(i+1)}' file
Hi shamrock,

It worked as expected. I just needed to add OFS="\t" at the end. Thanks a lot!
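For reference, the field-loop approach with the tab separator folded in might look like the sketch below. The wrapper name `in_out_fields` is hypothetical, and two details differ from the posted one-liner: the loop stops one field early so that `$(i+1)` always exists, and a `break` ends the scan at the first IN/OUT on a line.

```shell
# Hypothetical wrapper: reads the input on stdin.
in_out_fields() {
    awk -v OFS='\t' '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^(IN|OUT)$/) {      # field is exactly IN or OUT
                print $3, $i, $(i + 1)    # ID, keyword, first number after it
                break                     # only the first hit per line
            }
    }'
}

printf '%s\n' '>RP: 230 DSU17284T2 DSU17284  Dressrossa crassa PT7T0 hypo prot (231 aa)  IN   1   18   35 ' |
    in_out_fields
```

Because awk splits on runs of whitespace by default, the varying number of spaces before and after IN/OUT does not matter here.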

---------- Post updated at 01:12 PM ---------- Previous update was at 01:11 PM ----------

Hi Akshay Hegde,

It worked perfectly. Thanks a lot!
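The first post also asked about sed, which none of the replies covered. A minimal sketch, assuming every line starts with ">RP:", a number, and then the ID, as in the sample input; the function name `sed_extract` is made up for illustration:

```shell
# Hypothetical helper: reads the input on stdin.
sed_extract() {
    tab=$(printf '\t')   # literal tab; \t in a sed replacement is not portable
    # Capture the ID, the IN/OUT keyword and the first number after it,
    # then print only the lines where the substitution succeeded.
    sed -nE "s/^>RP:[[:space:]]+[0-9]+[[:space:]]+(DSU[0-9]+T[0-9]+)[[:space:]].*[[:space:]](IN|OUT)[[:space:]]+([0-9]+).*/\1${tab}\2${tab}\3/p"
}

printf '%s\n' '>RP: 286 DSU17282T0 DSU17282  Dressrossa crassa PT7T0 hypo prot (287 aa) OUT   5   51   70 ' |
    sed_extract
```

The whole line is replaced by the three captured groups, which is why nothing else from the line is printed. Lines that do not match the pattern are dropped because of `-n` combined with the `p` flag.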