Post 302976593 by cmccabe on Friday 1st of July 2016 09:26:28 AM
Remove lines from output in files using awk

I have two large files (~250GB) and I am trying to remove the lines where GT is 0/0, 1/1, or 2/2 from both files. I was going to use a bash script with the awk below, which I think will find each such line, but how do I remove a line if that condition is found? Thank you.

Input
Code:
20      60055   .       A       .       35      PASS    DP=25;PF=20;MF=5;MQ=60;SB=0.800 GT:AD:DP:GQ:FL  0/0:25:25:99:PASS
20      60056   .       G      A.       35      PASS    DP=25;PF=20;MF=5;MQ=60;SB=0.800 GT:AD:DP:GQ:FL  0/1:12,13:25:99:PASS,PASS
20      60057   .       T       .       35      PASS    DP=26;PF=20;MF=6;MQ=60;SB=0.769 GT:AD:DP:GQ:FL  0/0:26:26:99:PASS
20      60058   .       C      T       35      PASS    DP=25;PF=20;MF=5;MQ=60;SB=0.800 GT:AD:DP:GQ:FL  1/1:25:25:99:PASS

Code:
awk '$9~"^[012]"{$0=$0($9~"^(0/0|1/1|2/2)"?" hom":" het")}1' input

Desired output
Code:
20      60056   .       G      A.       35      PASS    DP=25;PF=20;MF=5;MQ=60;SB=0.800 GT:AD:DP:GQ:FL  0/1:12,13:25:99:PASS,PASS


Last edited by RudiC; 07-01-2016 at 12:38 PM.. Reason: corrected icode tags.
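
One possible approach, offered as a sketch rather than a tested solution for files of this size: instead of tagging each line as hom or het, let awk print only the lines whose genotype is not one of 0/0, 1/1, or 2/2. In the sample input shown, field 9 is the format string (GT:AD:DP:GQ:FL) and the genotype itself starts field 10, so the test below uses $10. The function name filter_het and the file names in the usage comments are hypothetical, and the field number is an assumption based only on the sample lines above.

```shell
# Hypothetical helper: drop records whose genotype is 0/0, 1/1 or 2/2.
# Assumes whitespace-separated columns with the genotype at the start of field 10,
# as in the sample input; adjust the field number if the real files differ.
filter_het() {
    awk '
        # A pattern with no action prints the line when the pattern is true.
        # The genotype must be followed by ":" or be the whole field.
        $10 !~ /^(0\/0|1\/1|2\/2)(:|$)/
    ' "$@"
}

# Usage (file names are placeholders):
#   filter_het input > output
# For both files:
#   for f in file1 file2; do filter_het "$f" > "$f.filtered"; done
```

Because awk streams the input line by line and writes to a new file, nothing is edited in place; the original files stay untouched until the filtered output has been checked.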
 
