Sponsored Content
Top Forums Shell Programming and Scripting awk to output match and mismatch with count using specific fields Post 302987464 by cmccabe on Friday 9th of December 2016 06:35:16 PM
Old 12-09-2016
awk to output match and mismatch with count using specific fields

In the below awk I am trying output to one file those lines that match between $2,$3,$4 of file1 and file2 with the count in (). I am also trying to output those lines that are missing between $2,$3,$4 of file1 and file2 with the count of in () each. Both input files are tab-delimited, but the output is not. I am not sure where to put the counter to get the desired output. Thank you Smilie.

file1
Code:
1    955597    G    G
1    9773306    T    C
1    981931    A    G
1    982994    T    C
1    984302    T    C

file2
Code:
1    955597    G    G
1    9773306    T    C
1    981939    A    G
1    982978    T    C
1    984302    T    C

desired output
Code:
Match: (3)
1    955597    G    G
1    9773306    T    C
1    984302    T    C
Missing from file1: (2)
1    981939    A    G
1    982978    T    C
Missing from file2:
1    981931    A    G
1    982994    T    C

awk
Code:
awk -F'\t' 'FNR==1 { next }
        FNR == NR { file1[$2,$3,$4] = $2 " " $3 " " $4 }
        FNR != NR { file2[$2,$3,$4] = $2 " " $3 " " $4 }
        END { print "Match:"; for (k in file1) if (k in file2) print file1[k] # Or file2[k]
              print "Missing in file1:"; for (k in file2) if (!(k in file1)) print file2[k]
              print "Missing in file2:"; for (k in file1) if (!(k in file2)) print file1[k]
}' file1 file2  > NA12878_match

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk - count character count of fields

Hello All, I got a requirement when I was working with a file. Say the file has unloads of data from a table in the form 1|121|asda|434|thesi|2012|05|24| 1|343|unit|09|best|2012|11|5| I was put into a scenario where I need the field count in all the lines in that file. It was simply... (6 Replies)
Discussion started by: PikK45
6 Replies

2. Shell Programming and Scripting

awk help: Match data fields from 2 files & output results from both into 1 file

I need to take 2 input files and create 1 output based on matches from each file. I am looking to match field #1 in both files (Userid) and create an output file that will be a combination of fields from both file1 and file2 if there are any differences in the fields 2,3,4,5,or 6. Below is an... (5 Replies)
Discussion started by: ambroze
5 Replies

3. Shell Programming and Scripting

awk count fields not working

Hi, i am trying to count the fields in a file. Input: 100,1000,,2000,3000,10/26/2012 12:12:30 200,3000,,1000,01/28/2012 17:12:30 300,5000,,5000,7000,09/06/2012 16:12:30 output: Cout of the fileds for each row 6 5 6 awk -F"," '{print $NF}' file1.txt When i try with above awk... (3 Replies)
Discussion started by: onesuri
3 Replies

4. Shell Programming and Scripting

Using to perl to output specific fields to one file

Trying to use perl to output specific fields from all text files in a directory to one new file. Each text file on a new line. The below seems to work for one text file but not more. Thank you :). perl -ne 's/^#//; @n = (6, 7, 8, 16); print if $. ~~ @n' *.txt > out.txt format of all text... (2 Replies)
Discussion started by: cmccabe
2 Replies

5. Shell Programming and Scripting

awk partial string match and add specific fields

Trying to combine strings that are a partial match to another in $1 (usually below it). If a match is found than the $2 value is added to the $2 value of the match and the $3 value is added to the $3 value of the match. I am not sure how to do this and need some expert help. Thank you :). file ... (2 Replies)
Discussion started by: cmccabe
2 Replies

6. UNIX for Beginners Questions & Answers

How to count lines of CSV file where 2 fields match variables?

I'm trying to use awk to count the occurrences of two matching fields of a CSV file. For instance, for data that looks like this... Joe,Blue,Yes,No,High Mike,Blue,Yes,Yes,Low Joe,Red,No,No,Low Joe,Red,Yes,Yes,Low I've been trying to use code like this... countvar=`awk ' $2~/$color/... (4 Replies)
Discussion started by: nmoore2843
4 Replies

7. Shell Programming and Scripting

awk to update specific value in file with match and add +1 to specific digit

I am trying to use awk to match the NM_ in file with $1 of id which is tab-delimited. The NM_ will always be in the line of file that starts with > and be after the second _. When there is a match between each NM_ and id, then the value of $2 in id is substituted or used to update the NM_. Each NM_... (3 Replies)
Discussion started by: cmccabe
3 Replies

8. Shell Programming and Scripting

awk to print match or non-match and select fields/patterns for non-matches

In the awk below I am trying to output those lines that Match between file1 and file2, those Missing in file1, and those missing in file2. Using each $1,$2,$4,$5 value as a key to match on, that is if those 4 fields are found in both files the match, but if those 4 fields are not found then missing... (0 Replies)
Discussion started by: cmccabe
0 Replies

9. UNIX for Beginners Questions & Answers

awk match two fields in two files

Hi, I have two TEST files t.xyz and a.xyz which have three columns each. a.xyz have more rows than t.xyz. I will like to output rows at which $1 and $2 of t.xyz match $1 and $2 of a.xyz. Total number of output rows should be equal to that of t.xyz. It works fine, but when I apply it to large... (6 Replies)
Discussion started by: geomarine
6 Replies

10. Shell Programming and Scripting

Match output fields agains two patterns

I need to print field and the next one if field matches 'patternA' and also print 'patternB' fields. echo "some output" | awk '{for(i=1;i<=NF;i++){if($i ~ /patternA/){print $i, $(i+1)}elif($i ~ /patternB/){print $i}}}' This code returnes me 'syntax error'. Pls advise how to do properly. (2 Replies)
Discussion started by: urello
2 Replies
IGAWK(1)							 Utility Commands							  IGAWK(1)

NAME
igawk - gawk with include files SYNOPSIS
igawk [ all gawk options ] -f program-file [ -- ] file ... igawk [ all gawk options ] [ -- ] program-text file ... DESCRIPTION
Igawk is a simple shell script that adds the ability to have ``include files'' to gawk(1). AWK programs for igawk are the same as for gawk, except that, in addition, you may have lines like @include getopt.awk in your program to include the file getopt.awk from either the current directory or one of the other directories in the search path. OPTIONS
See gawk(1) for a full description of the AWK language and the options that gawk supports. EXAMPLES
cat << EOF > test.awk @include getopt.awk BEGIN { while (getopt(ARGC, ARGV, "am:q") != -1) ... } EOF igawk -f test.awk SEE ALSO
gawk(1) Effective AWK Programming, Edition 1.0, published by the Free Software Foundation, 1995. AUTHOR
Arnold Robbins (arnold@skeeve.com). ATTRIBUTES
See attributes(5) for descriptions of the following attributes: +--------------------+-----------------+ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | +--------------------+-----------------+ |Availability | SUNWgawk | +--------------------+-----------------+ |Interface Stability | Volatile | +--------------------+-----------------+ NOTES
Source for gawk is available on http://opensolaris.org. Free Software Foundation Nov 3 1999 IGAWK(1)
All times are GMT -4. The time now is 10:22 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy