Common records using AWK


 
# 8  
Old 02-01-2012
What exactly is the output when you run the commands? Can you please post it? Also, what is your system?
It works on Linux with GNU awk and cut:
Code:
$ cat 1.txt 
x y z 1 2 3 4 5 6 7 8 9
a b c d e f 9 7 8 9 90 1
$ cat 2.txt 
x y z 2 4 5 6 7 8 1 0 0
a b c g h i 9 3 1 4 5 6
$ awk 'NR==FNR{a[$1,$2,$3]=$0; next} {printf "%s", a[$1,$2,$3]; $1=$2=$3=""; gsub(/^ */," ",$0); print $0}' 1.txt 2.txt
x y z 1 2 3 4 5 6 7 8 9 2 4 5 6 7 8 1 0 0
a b c d e f 9 7 8 9 90 1 g h i 9 3 1 4 5 6
$ cut -d" " -f4- 2.txt | paste -d" " 1.txt -
x y z 1 2 3 4 5 6 7 8 9 2 4 5 6 7 8 1 0 0
a b c d e f 9 7 8 9 90 1 g h i 9 3 1 4 5 6

# 9  
Old 02-01-2012
Thank You.

Works like a charm.

---------- Post updated at 04:27 PM ---------- Previous update was at 03:07 PM ----------

I was wondering what if there are more than two files.

Any thoughts?
# 10  
Old 02-02-2012
Yes. You could do it like this:
Code:
awk 'NR==FNR{a[NR]=$0; next} {$1=$2=$3=""; gsub(/^ */," ",$0);  a[FNR]=a[FNR] $0}END{for(i=1; i<=FNR; i++) print a[i];}' *.txt

or a shell loop:
Code:
cp first.txt out.txt
 for i in *.txt ; do 
   # skip the seed file and the accumulating output, which also matches *.txt
   [[ "$i" = first.txt || "$i" = out.txt ]] && continue
   cut -d" " -f4- "$i" | paste -d" " out.txt - > tmpFile
   mv tmpFile out.txt
 done

But this is the most fun way of doing it, the recursive way:
Code:
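#!/bin/bash
 
 function myPaste {
     # base case: a single file (or none) is printed unchanged
     if [[ $# -le 1 ]] ; then
        cat "$@"
        return
     fi
     local arg=$1   # keep the file name out of the caller's scope
     shift
 
     # merge the remaining files first, then strip the first three columns
     myPaste "$@" | cut -d" " -f4- | paste -d" " "$arg" -
 }
 
 myPaste "$@"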

You would save this as myPaste.sh, make it executable and then call it:
Code:
./myPaste.sh *.txt

Note that all of the above are naive: they do not check whether the first three columns are the same, they just strip the first three columns from every file after the first and join lines by position. The awk solution on 2 files with [$1,$2,$3] also works on unsorted files, because it actually matches on the first three fields. This could be extended to n files; it would just need one extra loop in the END block to sort the associative array.
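A minimal sketch of that n-file extension, under some assumptions of my own: the sample data and the names row, order and merged are made up for illustration, and instead of sorting in END it prints keys in the order they were first seen. A key that is missing from one file simply gets no columns from that file.

```shell
#!/bin/sh
# Hypothetical sample data: three files with the same keys (fields 1-3)
# appearing in different orders.
dir=$(mktemp -d)
printf '%s\n' 'x y z 1 2' 'a b c 3 4' > "$dir/1.txt"
printf '%s\n' 'a b c 5 6' 'x y z 7 8' > "$dir/2.txt"
printf '%s\n' 'x y z 9 0' 'a b c d e' > "$dir/3.txt"

merged=$(awk '
{
    key = $1 SUBSEP $2 SUBSEP $3
    if (!(key in row)) {              # first time this key is seen
        order[++n] = key              # remember first-seen order
        row[key] = $1 " " $2 " " $3   # start the output row with the key
    }
    # append every field after the third one
    for (i = 4; i <= NF; i++) row[key] = row[key] " " $i
}
END {
    for (j = 1; j <= n; j++) print row[order[j]]
}' "$dir"/1.txt "$dir"/2.txt "$dir"/3.txt)

printf '%s\n' "$merged"
rm -rf "$dir"
```

With the sample data this prints "x y z 1 2 7 8 9 0" and "a b c 3 4 5 6 d e". If sorted output is wanted, piping the result through sort is probably the simplest option.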