File comparisons for huge data files (around 60G) - Need the best optimized way to do this

10-25-2018

Registered User

13, 0

Join Date: Dec 2017

Last Activity: 12 December 2018, 7:37 AM EST

Posts: 13

Thanks Given: 4

Thanked 0 Times in 0 Posts

I have 2 large files (.dat), each around 60 GB with 15 columns, and the data is not sorted in either file. I need your input on the best optimized method/command to compare them and redirect the non-matching lines to a third file (diff.dat).

File 1 - 15 columns
File 2 - 15 columns

Data is not in sorted order.

kartikirans

10-25-2018

Moderator

8,825, 1,112

Join Date: Feb 2005

Last Activity: 23 August 2021, 11:26 AM EDT

Location: Foxborough, MA

Posts: 8,825

Thanks Given: 579

Thanked 1,112 Times in 1,003 Posts

What is "this" in "method/command to achieve this"?
Sample files and the desired output would help as well...

vgersh99

10-25-2018

Sample line:

Code:

2036|001|021|92|570|2|422|1|0|0|0|570|0|0|12

Field separator - "|"

File 1 size (60 G)
File 2 size (61 G)
Note - the data is not in sorted order (file1 and file2)

Requirement: I need to find the non-matching lines and redirect them to a new file, "differnce.dat"

Last edited by vgersh99; 10-25-2018 at 11:19 AM..

kartikirans

10-25-2018

What constitutes "non-matching" lines?
The entire line, or some key fields in file1 and file2 to match on?
You have to be clearer with your requirement statements.

Also, please use code tags when posting code/data samples.

vgersh99

10-25-2018

Thanks for the quick reply. The entire line...

kartikirans

10-25-2018

Look into man grep, specifically the -F and -f options.
Or man fgrep.
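
As a rough illustration of what those options do, here is a minimal sketch on two tiny made-up files (the file contents and the names only_in_file1.dat and only_in_file2.dat are hypothetical stand-ins, not the real data). The -x and -v options are not part of the suggestion above; they are added on the assumption that the entire line must match and that only the non-matching lines are wanted.

```shell
#!/bin/sh
# Tiny made-up stand-ins for the real 60 GB files (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '2036|001|021|92' '2036|002|021|93' '2036|003|021|94' > file1
printf '%s\n' '2036|003|021|94' '2036|001|021|92' '2036|004|021|95' > file2

# -F  treat every pattern as a fixed string, not a regular expression
# -f  read the patterns (one per line) from the named file
# -x  only count a match when the whole line matches (assumption: entire line)
# -v  print the lines that do NOT match any pattern
grep -F -x -v -f file2 file1 > only_in_file1.dat

# Same comparison in the other direction.
grep -F -x -v -f file1 file2 > only_in_file2.dat

# Collect both directions in the output file named in the first post.
cat only_in_file1.dat only_in_file2.dat > diff.dat
cat diff.dat
```

One thing to keep in mind: grep has to read every pattern from the -f file before it starts scanning, so with a pattern file of around 60 GB this approach is likely to need a very large amount of memory.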

vgersh99

10-25-2018

grep -F -x -v -f file2 file1 ?? Or is there any other, more optimized command?
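
One possible alternative, which is not something proposed in this thread, is to sort both files first and then compare them with comm. The sketch below assumes the standard sort and comm utilities, uses tiny made-up files in place of the real data, and uses the hypothetical names file1.sorted and file2.sorted for the intermediate files.

```shell
#!/bin/sh
# Tiny made-up stand-ins for the real 60 GB files (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '2036|001|021|92' '2036|002|021|93' '2036|003|021|94' > file1
printf '%s\n' '2036|003|021|94' '2036|001|021|92' '2036|004|021|95' > file2

# comm needs both inputs sorted in the same collation order.
# LC_ALL=C forces plain byte order for sort and comm alike.
LC_ALL=C sort file1 > file1.sorted
LC_ALL=C sort file2 > file2.sorted

# comm -23 prints the lines found only in the first file,
# comm -13 prints the lines found only in the second file.
{
  LC_ALL=C comm -23 file1.sorted file2.sorted
  LC_ALL=C comm -13 file1.sorted file2.sorted
} > diff.dat
cat diff.dat
```

Note that comm compares the sorted files line by line, so repeated lines are counted: a line that occurs twice in file1 and once in file2 is reported once as unique to file1, whereas the grep command above would not report it at all. The sorted copies also need roughly as much extra disk space as the original files.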

kartikirans
