How to remove lines without a particular string in either column?


 
# 1  
Old 10-19-2015

I have a file that looks like this:

Code:
DIP-27772N       DIP-18408N refseq:NP_523941
DIP-23436N|refseq:NP_536784       DIP-23130N|refseq:NP_652017
DIP-22958N|refseq:NP_651195       DIP-20072N|refseq:NP_724597
DIP-22928N|refseq:NP_569972       DIP-22042N|refseq:NP_536744|uniprotkb:P54622
DIP-20065N|refseq:NP_731331       DIP-17103N

I want to remove the lines that do not contain "refseq:NP" in both columns (the first and last lines in the example above).

Required output:

Code:
DIP-23436N|refseq:NP_536784       DIP-23130N|refseq:NP_652017
DIP-22958N|refseq:NP_651195       DIP-20072N|refseq:NP_724597
DIP-22928N|refseq:NP_569972       DIP-22042N|refseq:NP_536744|uniprotkb:P54622

How can I do it using grep? Any help would be highly appreciated.
# 2  
Old 10-19-2015
Hello Syeda,

Could you please try the following and let me know if it helps?
Code:
awk '{count=gsub(/refseq:NP/,"refseq:NP",$0);if(count==NF){print}}'  Input_file

Output will be as follows.
Code:
DIP-23436N|refseq:NP_536784       DIP-23130N|refseq:NP_652017
DIP-22958N|refseq:NP_651195       DIP-20072N|refseq:NP_724597
DIP-22928N|refseq:NP_569972       DIP-22042N|refseq:NP_536744|uniprotkb:P54622

Thanks,
R. Singh
# 3  
Old 10-19-2015
Try also
Code:
awk '2==gsub(/refseq:NP/,"&")' file
DIP-23436N|refseq:NP_536784       DIP-23130N|refseq:NP_652017
DIP-22958N|refseq:NP_651195       DIP-20072N|refseq:NP_724597
DIP-22928N|refseq:NP_569972       DIP-22042N|refseq:NP_536744|uniprotkb:P54622

---------- Post updated at 12:45 ---------- Previous update was at 12:43 ----------

If there are more than two columns, compare the count against NF instead of 2, as RavinderSingh13 does.
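Both awk variants above count matches across the whole line, so a field that contains the string twice would be counted twice. A minimal sketch of a per-field check follows, assuming whitespace-separated columns; the file name sample.txt is made up for illustration, and the data is copied from post #1.

```shell
# Recreate the sample data from post #1 (the file name is an assumption).
cat > sample.txt <<'EOF'
DIP-27772N       DIP-18408N refseq:NP_523941
DIP-23436N|refseq:NP_536784       DIP-23130N|refseq:NP_652017
DIP-22958N|refseq:NP_651195       DIP-20072N|refseq:NP_724597
DIP-22928N|refseq:NP_569972       DIP-22042N|refseq:NP_536744|uniprotkb:P54622
DIP-20065N|refseq:NP_731331       DIP-17103N
EOF

# Skip the line as soon as one field lacks the string; otherwise print it.
awk '{ for (i = 1; i <= NF; i++) if ($i !~ /refseq:NP/) next; print }' sample.txt
```

On the sample data this should print the same three lines as the required output. Note that an empty line has no fields, so this sketch would print it unchanged.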
# 4  
Old 10-19-2015
Thanks a lot R. Singh
# 5  
Old 10-19-2015
With grep:
Code:
grep -v 'refseq:NP.*refseq:NP' file

Not asked here, but I want to mention that sed can delete the nth occurrence, here the 2nd:
Code:
sed 's/|refseq:NP[_0-9]*//2' file
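To illustrate what the numeric flag does, here is a small sketch that feeds one line from the sample data through the same sed command. Only the second "|refseq:NP..." token on the line should be removed; the first one is left alone.

```shell
# One line taken from the sample data in post #1.
printf '%s\n' 'DIP-22928N|refseq:NP_569972       DIP-22042N|refseq:NP_536744|uniprotkb:P54622' |
  sed 's/|refseq:NP[_0-9]*//2'
# prints: DIP-22928N|refseq:NP_569972       DIP-22042N|uniprotkb:P54622
```

Lines with fewer than two matches pass through unchanged.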

---------- Post updated at 09:22 AM ---------- Previous update was at 08:55 AM ----------

Thanks to RavinderSingh, I see that you want the opposite. In that case it's
Code:
grep 'refseq:NP.*refseq:NP' file

BTW you can use a back reference as follows
Code:
grep '\(refseq:NP\).*\1' file
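One caveat: the patterns above match any line where the string occurs twice, even if both occurrences sit in the same column. A possible stricter variant is sketched below, assuming exactly two whitespace-separated columns; the second test line and the file name two_lines.txt are made up for illustration.

```shell
# Line 1 has the string in both columns. Line 2 (hypothetical, not from the
# thread) has it twice in column 1 and not at all in column 2.
printf '%s\n' \
  'DIP-23436N|refseq:NP_536784       DIP-23130N|refseq:NP_652017' \
  'DIP-00000N|refseq:NP_000001|refseq:NP_000002       DIP-17103N' > two_lines.txt

# The simple pattern counts both lines as matches.
grep -c 'refseq:NP.*refseq:NP' two_lines.txt
# prints: 2

# Stricter: each of the two columns must contain the string.
grep -E '^[^[:space:]]*refseq:NP[^[:space:]]*[[:space:]]+[^[:space:]]*refseq:NP[^[:space:]]*[[:space:]]*$' two_lines.txt
```

The stricter pattern should print only the first line. With more than two columns it no longer applies, and checking the fields in awk is probably easier.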
