Sponsored Content
Top Forums UNIX for Advanced & Expert Users File comaprsons for the Huge data files ( around 60G) - Need optimized and teh best way to do this Post 303025156 by vgersh99 on Thursday 25th of October 2018 10:35:26 AM
Old 10-25-2018
Quote:
Originally Posted by kartikirans
grep -F -x -v -f file2 file1 ?? or any other optimization command
sounds about right.
Just remember - whatever you do, comparing 60G files will be slow...
Test this on a smaller chunks to see if you're getting the desired results first.
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

search and grab data from a huge file

folks, In my working directory, there a multiple large files which only contain one line in the file. The line is too long to use "grep", so any help? For example, if I want to find if these files contain a string like "93849", what command I should use? Also, there is oder_id number... (1 Reply)
Discussion started by: ting123
1 Replies

2. Shell Programming and Scripting

How to extract data from a huge file?

Hi, I have a huge file of bibliographic records in some standard format.I need a script to do some repeatable task as follows: 1. Needs to create folders as the strings starts with "item_*" from the input file 2. Create a file "contents" in each folders having "license.txt(tab... (5 Replies)
Discussion started by: srsahu75
5 Replies

3. Shell Programming and Scripting

insert a header in a huge data file without using an intermediate file

I have a file with data extracted, and need to insert a header with a constant string, say: H|PayerDataExtract if i use sed, i have to redirect the output to a seperate file like sed ' sed commands' ExtractDataFile.dat > ExtractDataFileWithHeader.dat the same is true for awk and... (10 Replies)
Discussion started by: deepaktanna
10 Replies

4. Shell Programming and Scripting

Split a huge data into few different files?!

Input file data contents: >seq_1 MSNQSPPQSQRPGHSHSHSHSHAGLASSTSSHSNPSANASYNLNGPRTGGDQRYRASVDA >seq_2 AGAAGRGWGRDVTAAASPNPRNGGGRPASDLLSVGNAGGQASFASPETIDRWFEDLQHYE >seq_3 ATLEEMAAASLDANFKEELSAIEQWFRVLSEAERTAALYSLLQSSTQVQMRFFVTVLQQM ARADPITALLSPANPGQASMEAQMDAKLAAMGLKSPASPAVRQYARQSLSGDTYLSPHSA... (7 Replies)
Discussion started by: patrick87
7 Replies

5. Shell Programming and Scripting

Splitting the Huge file into several files...

Hi I have to write a script to split the huge file into several pieces. The file columns is | pipe delimited. The data sample is as: 6625060|1420215|07308806|N|20100120|5572477081|+0002.79|+0000.00|0004|0001|......... (3 Replies)
Discussion started by: lakteja
3 Replies

6. Shell Programming and Scripting

Problem running Perl Script with huge data files

Hello Everyone, I have a perl script that reads two types of data files (txt and XML). These data files are huge and large in number. I am using something like this : foreach my $t (@text) { open TEXT, $t or die "Cannot open $t for reading: $!\n"; while(my $line=<TEXT>){ ... (4 Replies)
Discussion started by: ad23
4 Replies

7. Shell Programming and Scripting

Three Difference File Huge Data Comparison Problem.

I got three different file: Part of File 1 ARTPHDFGAA . . Part of File 2 ARTGHHYESA . . Part of File 3 ARTPOLYWEA . . (4 Replies)
Discussion started by: patrick87
4 Replies

8. Shell Programming and Scripting

Help- counting delimiter in a huge file and split data into 2 files

I’m new to Linux script and not sure how to filter out bad records from huge flat files (over 1.3GB each). The delimiter is a semi colon “;” Here is the sample of 5 lines in the file: Name1;phone1;address1;city1;state1;zipcode1 Name2;phone2;address2;city2;state2;zipcode2;comment... (7 Replies)
Discussion started by: lv99
7 Replies

9. UNIX for Dummies Questions & Answers

File comparison of huge files

Hi all, I hope you are well. I am very happy to see your contribution. I am eager to become part of it. I have the following question. I have two huge files to compare (almost 3GB each). The files are simulation outputs. The format of the files are as below For clear picture, please see... (9 Replies)
Discussion started by: kaaliakahn
9 Replies

10. UNIX for Advanced & Expert Users

Need Optimization shell/awk script to aggreagte (sum) for all the columns of Huge data file

Optimization shell/awk script to aggregate (sum) for all the columns of Huge data file File delimiter "|" Need to have Sum of all columns, with column number : aggregation (summation) for each column File not having the header Like below - Column 1 "Total Column 2 : "Total ... ...... (2 Replies)
Discussion started by: kartikirans
2 Replies
diff3(1)						      General Commands Manual							  diff3(1)

Name
       diff3 - 3-way differential file comparison

Syntax
       diff3 [-ex3] file1 file2 file3

Description
       The command compares three versions of a file, and publishes the ranges of text that disagree, flagged with the following codes:

	  ====	      all three files differ

	  ====1       file1 is different

	  ====2       file2 is different

	  ====3       file3 is different

       The type of change needed to convert a given range of a given file to some other is indicated in one of these ways:

	  f : n1 a    Text is to be appended after line number n1 in file f, where f = 1, 2, or 3.

	  f : n1 , n2 c
		      Text is to be changed in the range line n1 to line n2.  If n1 = n2, the range may be abbreviated to n1.

       The original contents of the range follows immediately after a c indication.  When the contents of two files are identical, the contents of
       the lower-numbered file is suppressed.

Options
       -3   Produces an editor script containing the changes between file1 and file2 that are to be incorporated into file3.

       -e	   Produces an editor script containing the changes between file2 and file3 that are to be incorporated into file1.

       -x	   Produces an editor script containing the changes among all three files.

Examples
       Under the -e option, publishes a script for the editor that incorporates into file1 all changes between file2 and  file3  -  that  is,  the
       changes	that would normally be flagged ==== and ====3.	Option -x (-3) produces a script to incorporate only changes flagged ==== (====3).
       The following command applies the resulting script to `file1':
       (cat script; echo '1,$p') | ed - file1

Restrictions
       Text lines that consist of a single `.'	defeat -e.

Files
       /tmp/d3?????
       /usr/lib/diff3

See Also
       cmp(1), comm(1), diff(1), dffmk(1), join(1), sccsdiff(1), uniq(1)

																	  diff3(1)
All times are GMT -4. The time now is 04:04 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy