Difference between two huge .csv files Post: 302711935

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Difference between two huge files

Hi, As per my requirement, I need to take the difference between two big files (around 6.5 GB) and write the difference to an output file without any line numbers or '<' or '>' in front of each new line. As the diff command won't work for big files, I tried to use bdiff instead. I am getting incorrect...

2. AIX

Huge difference in reported Disk usage between ls,df and du

IBM RS6000 F50, AIX 4.3.2. I am having trouble calculating the actual size of a set of directories and reconciling the results with the actual hard disk space used. I have a 33GB disk which is showing 7.8GB used; a byte count of the files in the directory/sub-dirs I'm interested in is 48GB,...

3. Programming

Huge difference between _POSIX_OPEN_MAX and sysconf(_SC_OPEN_MAX).

On my Linux system there seems to be a massive difference between the value of _POSIX_OPEN_MAX and what sysconf(_SC_OPEN_MAX) returns and also what I'd expect from the table of examples of configuration limits from Advanced Programming In The UNIX Environment, 2nd Ed. _POSIX_OPEN_MAX: 16...

4. Shell Programming and Scripting

Counting difference in two CSV files

Hi, I am new to awk and trying to count the difference between the first columns of two CSV files. -------- Sample input (header is:name, id1,id2): file1.csv name, id1,id2 sss,34,56 yyy,3,56 www,56,78 pppp,43,12 file2.csv name,id1,id2 sss,32,56 yyy,12,7 ttt,4,8 uuu,7,9

5. Shell Programming and Scripting

Three Difference File Huge Data Comparison Problem.

I got three different file: Part of File 1 ARTPHDFGAA . . Part of File 2 ARTGHHYESA . . Part of File 3 ARTPOLYWEA . .

6. Shell Programming and Scripting

Format & Compare two huge CSV files

I have two CSV files with 90K records each, and each row has around 50 columns. Let's say the file names are FILE1 and FILE2. I have to compare both files and generate a new file that has the rows from FILE2 that differ. FILE1 ----- 2001,"John",25,19901130,21211.41,Unix Forum...

7. Shell Programming and Scripting

Comparing 2 difference csv files

Hello, I have about 10 csv files which range from csv1 - csv10. Each csv file has same type/set of tabs and we have around 5-6 tabs for each of the csv file which have slightly different content(data). A sample of CSV1 is shown below: Joins: Data related to Joins, it can be any number of...

8. Shell Programming and Scripting

Compare two CSV files and put the difference in third file with line no,field no and diff value.

I am having two csv files i need to compare these files and the output file should have the information of the differences at the field level. For Example, File 1: A,B,C,D,E,F 1,2,3,4,5,6 File 2: A,C,B,D,E,F 1,2,4,5,5,6 out put file:

9. Shell Programming and Scripting

Comparing 2 CSV files and sending the difference to a new csv file

(say) I have 2 csv files - file1.csv & file2.csv as mentioned below: file1.csv ID,version,cost 1000,1,30 2000,2,40 3000,3,50 4000,4,60 file2.csv ID,version,cost 1000,1,30 2000,2,45 3000,4,55 6000,5,70 ...

10. Shell Programming and Scripting

Compare 2 csv files in ksh and o/p the difference in a new csv file

LEARN ABOUT OSF1

comm

comm(1) 						      General Commands Manual							   comm(1)

NAME

       comm - Compares two sorted files.

SYNOPSIS

       comm [-123] file1 file2

STANDARDS

       Interfaces documented on this reference page conform to industry standards as follows:

       command: XCU5.0

       Refer to the standards(5) reference page for more information about industry standards and associated tags.

OPTIONS

       -1  Suppresses output of the first column (lines in file1 only).
       -2  Suppresses output of the second column (lines in file2 only).
       -3  Suppresses output of the third column (lines common to file1 and file2).

       The command comm -123 produces no output.

OPERANDS

       file1  A pathname of the first file to be compared. If file1 is a hyphen (-), the standard input is used.
       file2  A pathname of the second file to be compared. If file2 is a hyphen (-), the standard input is used.

       If both file1 and file2 refer to standard input or to the same FIFO special, block special, or character special file, the results are
       undefined.
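
       As a small sketch of the hyphen operand, the following sorts a second list on the fly and pipes it to comm as standard input. The
       temporary directory, file names, and city data are made up for illustration; LC_ALL=C is set on both commands only so that sort and
       comm agree on the collating sequence.

```shell
# Hypothetical sample data in a scratch directory (names are illustrative only).
workdir=$(mktemp -d)
printf '%s\n' Boston Chicago Dallas > "$workdir/file1"            # already sorted
printf '%s\n' 'New York' Chicago Atlanta > "$workdir/unsorted2"   # not sorted

# Sort the second list and hand it to comm through standard input ("-").
# -12 suppresses columns 1 and 2, leaving only the lines common to both inputs.
common=$(LC_ALL=C sort "$workdir/unsorted2" | LC_ALL=C comm -12 "$workdir/file1" -)
printf '%s\n' "$common"
```

       Only one of the two operands can usefully be a hyphen, since the results are undefined when both refer to standard input.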

DESCRIPTION

       The comm command reads file1 and file2 and writes three columns to standard output, showing which lines are common to the files	and  which
       are unique to each.

       The  leftmost  column  of  standard output includes lines that are in file1 only.  The middle column includes lines that are in file2 only.
       The rightmost column includes lines that are in both file1 and file2.

       If you specify a hyphen (-) in place of one of the file names, comm reads standard input.

       Generally, file1 and file2 should be sorted according to the collating sequence specified by  the  LC_COLLATE  environment  variable.  (See
       sort(1).)  If the input files are not sorted properly, the output of comm might not be useful.
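
       The sorting requirement suggests one possible way to get a plain difference between two large line-oriented files without the '<' and
       '>' markers that diff adds: sort both files, then let comm print the lines unique to each. The sketch below assumes every record fits
       on one line; the file names and CSV rows are hypothetical sample data, and LC_ALL=C is an assumption used to make sort and comm use
       the same collating sequence.

```shell
# Hypothetical sample data in a scratch directory (names are illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'id,cost' '2000,40' '1000,30' '3000,50' > "$workdir/file1.csv"
printf '%s\n' 'id,cost' '3000,55' '1000,30' '2000,40' > "$workdir/file2.csv"

# Sort both inputs with the same collating sequence that comm will use.
LC_ALL=C sort "$workdir/file1.csv" > "$workdir/file1.sorted"
LC_ALL=C sort "$workdir/file2.csv" > "$workdir/file2.sorted"

# -23 keeps only column 1: lines that appear in file1 but not in file2.
LC_ALL=C comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted" > "$workdir/only_in_file1.txt"

# -13 keeps only column 2: lines that appear in file2 but not in file1.
LC_ALL=C comm -13 "$workdir/file1.sorted" "$workdir/file2.sorted" > "$workdir/only_in_file2.txt"

cat "$workdir/only_in_file1.txt" "$workdir/only_in_file2.txt"
```

       Because comm reads the two sorted inputs sequentially, the comparison step itself does not need to hold either file in memory. The
       header line is treated like any other line here, so it drops out only because it is identical in both files.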

EXIT STATUS

       0   Successful completion.
       >0  Error occurred.
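
       A script can branch on this status. The following is a minimal sketch that assumes a missing input file is reported as an error; the
       scratch directory and file names are hypothetical.

```shell
# Hypothetical file names, used only for illustration.
workdir=$(mktemp -d)
printf '%s\n' Boston Chicago > "$workdir/file1"

# The second operand does not exist, so comm is expected to report an error.
if comm "$workdir/file1" "$workdir/missing" > "$workdir/out.txt" 2> "$workdir/err.txt"; then
    status=0
else
    status=$?   # exit status of the failed comm command
fi
echo "comm exit status: $status"
```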

EXAMPLES

       In the following examples, file1 contains the following sorted list of North American cities:

              Anaheim
              Baltimore
              Boston
              Chicago
              Cleveland
              Dallas
              Detroit
              Kansas City
              Milwaukee
              Minneapolis
              New York
              Oakland
              Seattle
              Toronto

	      The second file, file2, contains this sorted list:

              Atlanta
              Chicago
              Cincinnati
              Houston
              Los Angeles
              Montreal
              New York
              Philadelphia
              Pittsburgh
              San Diego
              San Francisco
              St. Louis

              To display the lines unique to each file and common to the two files, enter:

              comm file1 file2

              This command results in the following output:

              Anaheim
                      Atlanta
              Baltimore
              Boston
                              Chicago
                      Cincinnati
              Cleveland
              Dallas
              Detroit
                      Houston
              Kansas City
                      Los Angeles
              Milwaukee
              Minneapolis
                      Montreal
                              New York
              Oakland
                      Philadelphia
                      Pittsburgh
                      San Diego
                      San Francisco
              Seattle
                      St. Louis
              Toronto

              The leftmost column contains lines in file1 only, the middle column contains lines in file2 only, and the rightmost column contains
              lines common to both files.  To display any one or two of the three output columns, include the appropriate flags to suppress the
              columns you do not want.  For example, the following command displays columns 1 and 2 only:

              comm -3 file1 file2

              Anaheim
                      Atlanta
              Baltimore
              Boston
                      Cincinnati
              Cleveland
              Dallas
              Detroit
                      Houston
              Kansas City
                      Los Angeles
              Milwaukee
              Minneapolis
                      Montreal
              Oakland
                      Philadelphia
                      Pittsburgh
                      San Diego
                      San Francisco
              Seattle
                      St. Louis
              Toronto

              The following command displays output from only the second column:

              comm -13 file1 file2

              Atlanta
              Cincinnati
              Houston
              Los Angeles
              Montreal
              Philadelphia
              Pittsburgh
              San Diego
              San Francisco
              St. Louis

              The following command displays output from only the third column:

              comm -12 file1 file2

              Chicago
              New York

SEE ALSO

       Commands:  cmp(1), diff(1), sdiff(1), sort(1), uniq(1)

																	   comm(1)