Compare one file and get the count of multiple file


 
# 1  
Old 03-31-2018

Hi All,

I need to take each record in one file and count its matches across two other files.

This is the master file, CUST.dat:
Code:
CUST_ID
9998
10000
10004
10005

DATAFILE1
Code:
9998;80000091;4;687582837443;;;;;;;;;
9998;80000091;4;687582841003;;;;;;;;;
9998;80000091;4;797582801705;;;;;;;;;
10000;80000091;4;85033400411;;;;;;;;;
10000;80000091;4;9648830021;;;;;;;;;
10000;80000091;4;9648830022;;;;;;;;;
10005;80000091;4;687582832052;;;;;;;;;
10005;80000091;4;687582842566;;;;;;;;;
10005;80000091;4;687582843915;;;;;;;;;

DATAFILE2
Code:
687582832052
687582842566

Expected output
Code:
9998;3;0
10000;3;0
10004;0;0
10005;3;2

Basically, I need to take each key from CUST.dat and count its matching rows in DATAFILE1 (second output column), then take column 4 of those rows and count how many of those values appear in DATAFILE2 (third output column).


I am currently doing this with separate commands: one grep per key, then one grep per extracted value, counting each result individually. Since I have 10,000 comparisons to make, it takes a long time.

Code:
 grep ^9998 DATAFILE1
9998;80000091;4;687582837443;;;;;;;;;
9998;80000091;4;687582841003;;;;;;;;;
9998;80000091;4;797582801705;;;;;;;;;

grep 687582837443  DATAFILE2
grep 687582841003 DATAFILE2
grep 797582801705 DATAFILE2


Moderator's Comments:
Please use CODE (not HTML) tags as required by forum rules!

Last edited by RudiC; 03-31-2018 at 10:10 AM.. Reason: Changed HTML to CODE tags.
# 2  
Old 03-31-2018
How about
Code:
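awk -F\; '
# Bump the file counter at the first line of each input file.
FNR==1          {FILE++
                }
# File 1 (DATAFILE1): count rows per customer, map column 4 to its customer.
FILE == 1       {CNT1[$1]++
                 GR[$4] = $1
                 next
                }
# File 2 (DATAFILE2): credit each value to the customer it belongs to.
FILE == 2       {CNT2[GR[$1]]++
                 next
                }
# File 3 (CUST.dat): skip the header line, print both counts (0 if none).
FNR > 1         {print $1, CNT1[$1]+0, CNT2[$1]+0
                }
' OFS=";" DATAFILE[12] CUST.dat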
9998;3;0
10000;3;0
10004;0;0
10005;3;2
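
One caveat with counting files via FNR==1: an empty input file never reaches FNR==1, so the files after it would be treated as the wrong one. A possible variant, as a sketch only, compares FILENAME against ARGV instead. The function name count_matches is made up for illustration, and OFS is set with -v so that ARGV[1] to ARGV[3] are the three file names. It also skips DATAFILE2 values that never occur in column 4 of DATAFILE1.

```shell
# Hypothetical wrapper, not part of the original post.
# $1 = DATAFILE1, $2 = DATAFILE2, $3 = master file with a header line
count_matches() {
    awk -F';' -v OFS=';' '
    # First file: rows per customer, and which customer each column 4 value belongs to.
    FILENAME == ARGV[1] { cnt1[$1]++; owner[$4] = $1; next }
    # Second file: count only values that were seen in the first file.
    FILENAME == ARGV[2] { if ($1 in owner) cnt2[owner[$1]]++; next }
    # Third file: skip the header, print the key and both counts (0 if none).
    FNR > 1             { print $1, cnt1[$1] + 0, cnt2[$1] + 0 }
    ' "$1" "$2" "$3"
}
```

It would be called as `count_matches DATAFILE1 DATAFILE2 CUST.dat`. Like the original, it assumes each column 4 value belongs to a single customer; if a value repeats under several customers, only the last one seen gets the credit.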

# 3  
Old 03-31-2018

Do my files need to be in sorted order? It worked for the sample records, but with my big file it is not working.
# 4  
Old 03-31-2018
As always: without sufficient data and info, nobody can give you any meaningful help.
Post the failure mode and error (message) verbatim, and the context that led to it.
If you post a non-representative sample, you will get a not-necessarily working solution...
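
On the sorted-order question: as far as I can tell, the awk approach in post #2 does not depend on row order, because awk arrays are keyed lookups. What matters is the order of the files on the command line, and the output follows the order of CUST.dat. A small sketch to check this, using a shortened and deliberately unsorted DATAFILE1 (the temp directory and the run_counts helper name are made up for the illustration):

```shell
# Sketch only: demo_dir and run_counts are hypothetical names.
demo_dir=$(mktemp -d)
printf '%s\n' CUST_ID 9998 10000 10004 10005 > "$demo_dir/CUST.dat"
printf '%s\n' \
    '10005;80000091;4;687582842566;;;;;;;;;' \
    '9998;80000091;4;687582837443;;;;;;;;;' \
    '10000;80000091;4;85033400411;;;;;;;;;' \
    '10005;80000091;4;687582832052;;;;;;;;;' > "$demo_dir/DATAFILE1"
printf '%s\n' 687582832052 687582842566 > "$demo_dir/DATAFILE2"

# Same logic as the awk program in post #2, wrapped in a function.
run_counts() {
    awk -F';' -v OFS=';' '
    FNR == 1  { file++ }
    file == 1 { cnt1[$1]++; owner[$4] = $1; next }
    file == 2 { cnt2[owner[$1]]++; next }
    FNR > 1   { print $1, cnt1[$1] + 0, cnt2[$1] + 0 }
    ' "$1" "$2" "$3"
}

# Run once on the unsorted file and once on a sorted copy, then compare.
as_given=$(run_counts "$demo_dir/DATAFILE1" "$demo_dir/DATAFILE2" "$demo_dir/CUST.dat")
sort "$demo_dir/DATAFILE1" > "$demo_dir/DATAFILE1.sorted"
sorted=$(run_counts "$demo_dir/DATAFILE1.sorted" "$demo_dir/DATAFILE2" "$demo_dir/CUST.dat")
[ "$as_given" = "$sorted" ] && echo "same result"
```

If the two results differ on real data, row order is probably not the cause; stray carriage returns, a different delimiter, or a missing header line would be the first things to check.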
# 5  
Old 04-01-2018
My mistake, my input file was wrong. The above awk solution worked perfectly.