Extract lines with min value, using two field separators.


 
# 1  
Old 11-09-2013

I have a file with two ID columns followed by five columns of counts in fraction form. I'd like to print lines that have a count of at least 4 (so at least 4 in the numerator, e.g. 4/17) in at least one of the five columns.

Input file:
Code:
comp51820_c1_seq1 693 0/29 0/50 0/69 0/36 0/31
comp51820_c1_seq1 694 0/29 0/54 1/67 0/34 0/30
comp51820_c1_seq1 710 0/11 0/36 0/14 0/25 4/17
comp51820_c1_seq1 711 1/11 2/35 6/14 5/25 6/17

Desired output:
Code:
comp51820_c1_seq1 710 0/11 0/36 0/14 0/25 4/17
comp51820_c1_seq1 711 1/11 2/35 6/14 5/25 6/17

I'm still new but I'm thinking awk could help, using "/" as the field separator. But I'm not sure how to keep the spaces as field separators as well. I have been looking into split, but it's not clear (to me) it will help.

Any ideas out there? I'm not great at parsing files yet and it's quite a bottleneck in my work (which obviously is not programming)...

Last edited by Don Cragun; 11-10-2013 at 04:59 AM.. Reason: Change QUOTE tags to CODE tags.
# 2  
Old 11-09-2013
Code:
awk '/[4-9]\//' filename

# 3  
Old 11-10-2013
This works for the example I gave, but would not work if the counts were 10, 11, 12, or 13. How could I specify minimum 4 rather than 4-9?
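To show what I mean, here is a quick check with two made-up lines (the IDs and counts are hypothetical, not from my real file). The line with a numerator of 12 is skipped because no digit from 4 to 9 sits directly before the slash:

```shell
# Hypothetical test lines: the first has a numerator of 12, the second of 4.
out=$(printf '%s\n' 'id1 1 12/30 0/5' 'id2 2 4/30 0/5' | awk '/[4-9]\//')
# Only the second line is printed, even though 12 is also at least 4.
echo "$out"
```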

---------- Post updated at 11:33 PM ---------- Previous update was at 11:21 PM ----------

I just realized I used a bad input example--all lines had a value in the numerator of at least 4. I have changed it now so that the input and output are correct. Sorry to waste your time.
# 4  
Old 11-10-2013
Assuming that your original condition is still true:

Code:
awk '{for(i=3; i<=NF; ++i){ split($i,f, "/"); if(f[1] > 3){ print $0; break }}}' file
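In case split() is unfamiliar: it breaks a string on the given separator, stores the pieces in an array, and returns how many pieces there were. A minimal sketch on a single value taken from the sample input (the variable names n and out are just for illustration):

```shell
# split() fills array f with the pieces of $1 and returns the piece count.
out=$(echo '6/14' | awk '{ n = split($1, f, "/"); print n, f[1], f[2] }')
# f[1] is the numerator, f[2] the denominator.
echo "$out"
```

For whole-number counts, testing f[1] > 3 is the same as testing f[1] >= 4.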

# 5  
Old 11-10-2013
You could also try this slightly simpler approach, which uses both space and slash as field separators:
Code:
awk -F '[ /]' '
{       for(i = 3; i < NF; i += 2)
                if($i >= 4) {
                        print
                        next
                }
}' input

or if you insist on a 1-line solution:
Code:
awk -F'[ /]' '{for(i=3;i<NF;i+=2) if($i>=4){print;next}}' input
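To see why the loop starts at 3 and steps by 2: with both space and slash as separators, every fraction turns into two fields, so the numerators land in $3, $5, $7, and so on, with the denominators in between. A rough sketch that prints the field count and the numerator fields for one line of the sample input (the line and out variables are only for illustration):

```shell
line='comp51820_c1_seq1 711 1/11 2/35 6/14 5/25 6/17'
# Print NF, then every second field starting at $3 (the numerators).
out=$(echo "$line" | awk -F'[ /]' '{ printf "%d:", NF; for (i = 3; i < NF; i += 2) printf " %s", $i; print "" }')
echo "$out"
```

This assumes exactly one space between columns. If the real file has runs of spaces, each extra space produces an empty field and the numbering shifts; a separator such as '[ /]+' may be needed in that case.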


Last edited by Don Cragun; 11-10-2013 at 05:20 AM.. Reason: Add note.
# 6  
Old 11-10-2013
If there is never a / in an ID field, you can also try
Code:
egrep '([4-9]|[1-9][0-9]+)/' file
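The pattern reads as: a single digit from 4 to 9, or any number of two or more digits not starting with 0, directly before a slash. A quick way to check it is to feed it a few made-up numerators (the test lines are hypothetical); grep -E is the POSIX spelling of egrep:

```shell
# Hypothetical lines with numerators 0, 3, 4, 10 and 123.
out=$(printf '%s\n' 'a 1 0/9' 'a 2 3/9' 'a 3 4/9' 'a 4 10/9' 'a 5 123/9' |
      grep -E '([4-9]|[1-9][0-9]+)/')
# Only the lines with numerators 4, 10 and 123 are printed.
echo "$out"
```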

# 7  
Old 11-10-2013
And the same with awk
Code:
awk '/([4-9]|[1-9][0-9]+)\//' file
comp51820_c1_seq1 710 0/11 0/36 0/14 0/25 4/17
comp51820_c1_seq1 711 1/11 2/35 6/14 5/25 6/17
