To print certain patterns in a column


 
# 1  
Old 08-20-2012

Hi,

From my input file, I want to print $1, $2, and only a certain pattern from $4 (the EC number). I used this code, but it prints everything in $4:
Code:
awk -F"\t" '$4 {print $1,$2,$4}' input-file

I just want EC followed by the numbers in $4.

The input file is as follows:

Code:
Entry     Entry name    Status     Names
Q01284   2NPD_NEUCR     R        Nitronate monooxygenase (EC 1.13.12.16) (2-nitropropane dioxygenase) (2-NPD) (Nitroalkane oxidase)
Q99VF6   2NPD_STAAN     U        Probable nitronate monooxygenase (EC 1.13.12.16) (Nitroalkane oxidase)
Q9F131   3HBH1_PSEAC    R        3-hydroxybenzoate 6-hydroxylase 1 (EC 1.14.13.24) (Constitutive 3-hydroxybenzoate 6-hydroxylase)
Q5EXK1   3HBH_KLEOX     R        3-hydroxybenzoate 6-hydroxylase (EC 1.14.13.24)
P07046   3SHD_NEUCR     R        3-dehydroshikimate dehydratase (DHS dehydratase) (DHSase) (EC 4.2.1.-)

The output should be:

Code:
Entry     Entry name         Names
Q01284   2NPD_NEUCR        EC 1.13.12.16
Q99VF6   2NPD_STAAN        EC 1.13.12.16
Q9F131   3HBH1_PSEAC       EC 1.14.13.24
Q5EXK1   3HBH_KLEOX        EC 1.14.13.24
P07046   3SHD_NEUCR        EC 4.2.1.-

Would appreciate your kind help on this. Thanks

Last edited by redse171; 08-20-2012 at 10:37 PM.. Reason: typo
# 2  
Old 08-20-2012
Have a go with this:

Code:
awk -F "\t"  '
    NR == 1 { printf( "%s\t%s\t%s\n", $1, $2, $4 ); next; }
    NR > 1 {
        gsub( ".*EC", "EC", $4 );
        gsub( "\\).*", "", $4 );
        printf( "%s\t%s\t%s\n", $1, $2, $4 );
    }
'  input-file >output-file


Last edited by agama; 08-20-2012 at 10:58 PM.. Reason: oops, header wrong.
# 3  
Old 08-20-2012
Hi agama,

I tried it, but nothing changed. It still prints everything in $4.
# 4  
Old 08-20-2012
What operating system and version of awk are you using? If you are on Sun/Solaris, try nawk instead of awk.
# 5  
Old 08-20-2012
If you have gawk, please try this:

Code:
awk -F"\t" 'match($0, /(EC [0-9\-\.]+)/, p) { print $1,$2,p[1]}'
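
The three-argument form of match() is a gawk extension, so the one-liner above may fail on other awks such as mawk (which some Debian/Ubuntu installs use as the default awk). Below is a minimal portable sketch, assuming tab-separated input. It relies only on the standard RSTART and RLENGTH variables that match() sets. The two data rows are copied from the question, and the `result` variable is just a hypothetical name for illustration:

```shell
# Feed a small tab-separated sample (header plus two rows from the question) to awk.
result=$(
    {
        printf 'Entry\tEntry name\tStatus\tNames\n'
        printf 'Q01284\t2NPD_NEUCR\tR\tNitronate monooxygenase (EC 1.13.12.16) (2-nitropropane dioxygenase) (2-NPD) (Nitroalkane oxidase)\n'
        printf 'P07046\t3SHD_NEUCR\tR\t3-dehydroshikimate dehydratase (DHS dehydratase) (DHSase) (EC 4.2.1.-)\n'
    } | awk -F '\t' -v OFS='\t' '
        NR == 1 { print $1, $2, $4; next }      # header: keep columns 1, 2 and 4
        match($4, /EC [0-9.-]+/) {              # on a hit, match() sets RSTART and RLENGTH
            print $1, $2, substr($4, RSTART, RLENGTH)
        }
    '
)
printf '%s\n' "$result"
```

On a real file, the same awk program would be run as `awk -F '\t' -v OFS='\t' '...' input-file > output-file`. Rows without an EC number are skipped because the match() pattern fails for them.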

This User Gave Thanks to leafei For This Post:
# 6  
Old 08-20-2012
Hi,

I am using Ubuntu 10.04.
# 7  
Old 08-20-2012
What is the output of awk --version?

Both the program that I posted and the one that leafei posted generate the expected results with awk version 4.0.0.

Are you sure that your columns are tab separated? If they are not, that would cause my solution to fail; leafei's match() works against the whole record and thus wouldn't be subject to that issue if the file is not tab separated.
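
One quick way to check the separator is to let awk count fields with a tab as FS. This is only a sketch: the two one-line samples are hypothetical stand-ins for a real line of the file, and `tabbed` and `spaced` are illustrative variable names. A line that really is tab separated reports 4 fields, while a space-padded line reports 1, in which case $1 is the whole line and $2 and $4 are empty:

```shell
# Count fields using a tab as the field separator.
tabbed=$(printf 'Q01284\t2NPD_NEUCR\tR\tNitronate monooxygenase (EC 1.13.12.16)\n' |
    awk -F '\t' '{ print NF }')
spaced=$(printf 'Q01284   2NPD_NEUCR     R        Nitronate monooxygenase (EC 1.13.12.16)\n' |
    awk -F '\t' '{ print NF }')

echo "tab-separated line: $tabbed field(s)"
echo "space-padded line:  $spaced field(s)"
```

Against the real data, `awk -F '\t' '{ print NF }' input-file | sort -u` would show the distinct field counts across the whole file.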