awk to print missing and keep sequential ordering if not found


 
# 1  
Old 06-23-2017

The awk below looks up the ids from file1 in $2 of file2 and prints each line of file2 that matches. If an id is missing from file2 (like BMPR2 on line 4 of file1), I cannot figure out how to add it to the output as missing while following the same format: the next sequential number in $1, the id from file1 in $2, and the word missing in $3. My attempt is the modified awk, which does execute, but the output is all of file2 rather than the desired output, and I am not sure why. There may be multiple missing ids in my actual data, but the files always have the same format as below. Thank you.

file1 tab-delimited
Code:
ABCA3
ACVRL1
BMPR1B
BMPR2
CAV1

file2 tab-delimited
Code:
20	ABCA3	100.00
101	ACVRL1	100.00
596	BMPR1B	100.00
597	BMPR3	100.00
733	CAV1	100.00
734	CAV3	100.00
735	CBFB	100.00
736	CBL	100.00
737	CBLB	100.00
738	CBR1	100.00

awk
Code:
awk -F'\t' 'NR==FNR{A[$1];next}$2 in A' file1 file2

output of command (BMPR2 is not found, so it is not printed)
Code:
20	ABCA3	100.00
101	ACVRL1	100.00
596	BMPR1B	100.00
733	CAV1	100.00

modified awk
Code:
awk -F'\t' 'NR==FNR{a[$1]; next}$2 in a{delete a[$2]}
     END{for(i in a) print ++FNR,i,"missing"}1' file1 OFS='\t' file2

desired output tab-delimited
Code:
1	ABCA3	100.00
2	ACVRL1	100.00
3	BMPR1B	100.00
4	CAV1	100.00
5	BMPR2	missing

# 2  
Old 06-23-2017
Code:
awk -F"\t" '
NR==FNR {a[$1]=$1; b[$1]=$1; next; }
a[$2] {$1=++l; print $0; delete b[$2]; }
END {for(i in b) {$0=++l OFS b[i] OFS "missing"; print $0; }}
' file1 OFS="\t" file2
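
On the "not sure why" in post #1: the trailing `1` in the modified awk is a pattern that is always true, so every file2 record is printed unchanged, and the `END` block then numbers the leftover ids starting from the last `FNR` of file2 (10 for the sample) instead of from a fresh counter. Below is a rough sketch of one possible minimal repair of that attempt, not a definitive answer. The function name `renumber_with_missing` and the counter `n` are my own choices, and the sample data is a shortened copy of the files in post #1.

```shell
# Hypothetical helper name; takes the id list and the data file as arguments.
renumber_with_missing() {
  awk -F'\t' -v OFS='\t' '
    NR==FNR { a[$1]; next }                          # first file: remember every id
    $2 in a { print ++n, $2, $3; delete a[$2] }      # second file: print matches, renumbered
    END     { for (i in a) print ++n, i, "missing" } # ids never seen in the second file
  ' "$1" "$2"
}

# Sample data from post #1 (shortened), written to a scratch directory.
dir=$(mktemp -d)
printf '%s\n' ABCA3 ACVRL1 BMPR1B BMPR2 CAV1 > "$dir/file1"
printf '%s\t%s\t%s\n' 20 ABCA3 100.00 101 ACVRL1 100.00 596 BMPR1B 100.00 \
  597 BMPR3 100.00 733 CAV1 100.00 > "$dir/file2"
renumber_with_missing "$dir/file1" "$dir/file2"
```

With the sample data this should give the four matching lines numbered 1 to 4, followed by `5 BMPR2 missing`. Nothing is printed by a bare `1`, so unmatched file2 lines such as BMPR3 no longer show up.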

This User Gave Thanks to rdrtx1 For This Post:
# 3  
Old 06-23-2017
The one array can be used for both lookups (index) and counting (value):
Code:
awk '
 BEGIN {FS=OFS="\t"}
 (NR==FNR){A[$1]=0; next}
 ($2 in A){A[$2]++; print ++cnt,$2,$3}
 END {for (i in A) if (A[i]==0) print ++cnt,i,"missing"}
' file1 file2
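
One caveat when several ids are missing: awk does not specify the order in which `for (i in A)` visits the indices, so the relative order of the "missing" lines can differ between awk implementations. A possible sketch that lists them in file1 order instead, assuming each id appears only once in file1. The function name `list_in_file1_order`, the arrays `order` and `seen`, and the id `XYZ1` in the made-up sample are all hypothetical and not taken from the posts above.

```shell
# Hypothetical helper name; first argument is the id list, second is the data file.
list_in_file1_order() {
  awk 'BEGIN { FS = OFS = "\t" }
    NR==FNR { order[++n] = $1; seen[$1] = 0; next }  # first file: keep ids in input order
    $2 in seen { seen[$2]++; print ++cnt, $2, $3 }    # second file: matches, renumbered
    END {
      for (k = 1; k <= n; k++)                        # walk ids in first-file order
        if (seen[order[k]] == 0) print ++cnt, order[k], "missing"
    }
  ' "$1" "$2"
}

# Made-up sample with two absent ids (BMPR2 and the hypothetical XYZ1).
dir=$(mktemp -d)
printf '%s\n' ABCA3 BMPR2 CAV1 XYZ1 > "$dir/ids"
printf '%s\t%s\t%s\n' 20 ABCA3 100.00 733 CAV1 100.00 > "$dir/data"
list_in_file1_order "$dir/ids" "$dir/data"
```

Here the two matches come out first as lines 1 and 2, then BMPR2 and XYZ1 as lines 3 and 4, in the order they appear in the id list. The cost is a second array, which should not matter for id lists of this size.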

This User Gave Thanks to MadeInGermany For This Post:
# 4  
Old 06-24-2017
Another variation:
Code:
awk '
  NR==FNR{
    A[$2]=$3
    next
  } 
  $1 in A {
    print ++c, $1, A[$1]
    next
  }
  {
    M[$1]
  }
  END {
    for(i in M) print ++c, i, "missing"
  }
' FS='\t' OFS='\t' file2 file1
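
Because this variation reads file2 first and then walks file1, the output follows the order of file1. If the missing ids should appear at their file1 position rather than at the end (which is not the desired output shown in post #1), the same idea might be reduced to a single action. This is only a sketch under that assumption; the function name `inline_missing` is hypothetical.

```shell
# Hypothetical helper name; first argument is the data file, second is the id list.
inline_missing() {
  awk 'BEGIN { FS = OFS = "\t" }
    NR==FNR { A[$2] = $3; next }                              # data file: id -> value
    { print ++c, $1, (($1 in A) ? A[$1] : "missing") }        # id list: one line per id
  ' "$1" "$2"
}

# Sample data from post #1 (shortened), written to a scratch directory.
dir=$(mktemp -d)
printf '%s\n' ABCA3 ACVRL1 BMPR1B BMPR2 CAV1 > "$dir/file1"
printf '%s\t%s\t%s\n' 20 ABCA3 100.00 101 ACVRL1 100.00 596 BMPR1B 100.00 \
  597 BMPR3 100.00 733 CAV1 100.00 > "$dir/file2"
inline_missing "$dir/file2" "$dir/file1"
```

With the sample data BMPR2 is reported as line 4 and CAV1 becomes line 5, so the numbering stays sequential but the missing id is no longer moved to the end.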

This User Gave Thanks to Scrutinizer For This Post:
# 5  
Old 06-26-2017
Thank you all.