awk common between files


 
# 1  
Old 01-16-2018

Hello there:

I want to find the entries that are common to several files. Each file has a single column.

Format for data:

CEU_snp_CHR21.txt
Code:
	
21:10758305
21:10827533
21:10913441
21:10920098
21:10952160
21:10966322
21:10985991

NAT_CHR21_variants.txt
Code:
	
21:10971951
21:14601415
21:14640400
21:14687571
21:14768343
21:14771811

variants_YRI_CHR21.txt
Code:
	
21:10758305
21:10827533
21:10913441
21:10920098
21:10952160
21:10966322

21_common_batches
Code:
	
21:14449943
21:14586958
21:14600044
21:14600045
21:14603751


Code:
	
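awk '
BEGIN { TF = ARGC - 1 }              # TF = number of input files
{
    if (!($1 in LINE)) {             # first time this value is seen
        SEQ[++SN] = $1               # remember first-seen order
        LINE[$1] = $1
    }
    CNT[$1]++                        # total occurrences across all files
}
END {
    for (s = 1; s <= SN; s++)
        if (CNT[SEQ[s]] == TF)       # seen as many times as there are files
            print LINE[SEQ[s]]
}' CEU_snp_CHR21.txt NAT_CHR21_variants.txt variants_YRI_CHR21.txt 21_common_batches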

This code does not give me correct or convincing output, and I am unable to figure out why.
Does the order of the files matter for this awk code?

I took this code from one of my earlier threads:
Paste columns based on common column: multiple files
# 2  
Old 01-16-2018
So what exactly do you want for output? There's not a single entry that is common to all four files. The most any entry appears in is two...
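
One way to check this is to count, for every entry, the number of distinct files that contain it. The following is only a sketch: the scratch directory and the shortened sample files are assumptions made so that it runs on its own, and the array names `seen` and `nfiles` are made up for illustration.

```shell
# Work in a scratch directory with shortened copies of the sample data
# (assumption: the real files have the same one-column layout).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 21:10758305 21:10827533 21:10985991 > CEU_snp_CHR21.txt
printf '%s\n' 21:10971951 21:14601415             > NAT_CHR21_variants.txt
printf '%s\n' 21:10758305 21:10827533             > variants_YRI_CHR21.txt
printf '%s\n' 21:14449943 21:14586958             > 21_common_batches

# For each entry, count the distinct files it occurs in, then report
# the largest count. An entry common to all files would score 4 here.
max=$(awk '
    !(($1, FILENAME) in seen) {      # first time this entry shows up in this file
        seen[$1, FILENAME]
        nfiles[$1]++
    }
    END {
        for (k in nfiles)
            if (nfiles[k] > m) m = nfiles[k]
        print m
    }' CEU_snp_CHR21.txt NAT_CHR21_variants.txt variants_YRI_CHR21.txt 21_common_batches)
echo "an entry appears in at most $max of the 4 files"
```

Because each entry is counted once per file, a value repeated inside one file does not inflate its score, and the order of the file names on the command line does not change the result.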
# 3  
Old 01-16-2018
Code has a bug somewhere. Pasting 20,000 wouldn't be possible.

Code:
FNR == 1

Should I be using this?

Last edited by genome; 01-16-2018 at 04:22 PM.. Reason: FNR
# 4  
Old 01-16-2018
Quote:
Originally Posted by genome
Code has a bug somewhere. Pasting 20,000 wouldn't be possible.

Code:
FNR == 1

Should I be using this?
Posting 20,000 what wouldn't be possible?

We have no idea. You usually use the condition FNR == 1 to cause the associated action to be executed on the first line read from each input file. Is there any reason why you need to care about which file contained an input record?
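
A small illustration of that behaviour, as a sketch only: the files a.txt and b.txt and their contents are hypothetical and have nothing to do with the data in post #1.

```shell
# Two throwaway files (hypothetical names a.txt and b.txt).
dir=$(mktemp -d) || exit 1
printf '%s\n' 21:1 21:2 > "$dir/a.txt"
printf '%s\n' 21:3 21:4 > "$dir/b.txt"

# FNR restarts at 1 for every input file, NR keeps counting across files,
# so this action runs once per file, on its first line.
first_lines=$(cd "$dir" && awk 'FNR == 1 { print FILENAME ": NR=" NR ", first line " $0 }' a.txt b.txt)
printf '%s\n' "$first_lines"
# a.txt: NR=1, first line 21:1
# b.txt: NR=3, first line 21:3
```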

Start by telling us what you are trying to accomplish. Then tell us what is wrong with the output being produced by the code you've shown us in post #1. Then, maybe, we can suggest ways to fix your code to get what you want.

The code you have shown us seems, at first glance, to be a slightly complicated way of removing all lines with a duplicate field #1 value from the set of input files you give your awk script, preserving the order in which the non-duplicated values were seen. It also uses more memory than is needed to get the job done.
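
If the goal is in fact to print the values present in every file, a single counter array may be enough. This is a sketch under one loud assumption: no value is repeated inside any one file (otherwise the counts are inflated, just as they are in the code from post #1). The file names f1, f2 and f3 and the array name `cnt` are hypothetical.

```shell
dir=$(mktemp -d) || exit 1
printf '%s\n' 21:100 21:200 21:300 > "$dir/f1"
printf '%s\n' 21:200 21:300 21:400 > "$dir/f2"
printf '%s\n' 21:300 21:200 21:500 > "$dir/f3"

# ARGC - 1 is the number of file operands. A line is printed at the moment
# its value has been seen that many times, i.e. once per file.
common=$(cd "$dir" && awk '++cnt[$1] == ARGC - 1' f1 f2 f3)
printf '%s\n' "$common"
# 21:300
# 21:200
```

The output follows the order of the last file rather than the order in which values were first seen, which is the price of dropping the SEQ and LINE arrays.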

How do you know that the output you have received is not correct? What would make the output convincing?